
Potential session/cache leakage between workspace instances or consumer accounts

76 points | 1 hour | github.com
bix62 minutes ago

So the options are this amazing tech is so stupid it just randomly brings up Minecraft or it’s got a major security issue?

27183 28 seconds ago

Why not both?

Tiberium 52 minutes ago

Sounds like a hallucination unless proven otherwise. Even the leading LLMs produce those from time to time, and they always look plausible like that. It could also be that the session had a lot of previous context, like 800K+, which (I think) makes hallucinations more likely.

Relevant comment from the OP which makes a hallucination more likely:

> There is one tool call result that includes a string that printed a pathname including minecraft.py because it was listing the files in a Python virtual environment and the Pygments package has a lexer called minecraft.py

macNchz 24 minutes ago

The person posting this claims to have reproduced it in a separate context further down the thread:

> Same thing just happened on a Claude Mobile session in same Enterprise account. Common theme in both is Sonnet 5, first response after more than 5 minutes (cache miss).

xyzzy_plugh 49 minutes ago

I don't disagree, but this sort of thing has to be investigated regardless.

It's unfortunate that there is so little transparency that, even if they deny there was a leak, we will never know for certain.

ec10968539 minutes ago

Caching doesn’t work the way the bug reporter implies. Caches are shared (at least across the enterprise), but a cache entry's key is always a function of the input that precedes it.

We achieved significant savings simply by moving everything that varies across individuals out of the system prompt so every session starts from a cache point.

For example, you never want your system prompt to start with the time that the session started. Move that to the first user message if needed.
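
A minimal sketch of why that helps, assuming the cache key is a hash of everything before the cache point. The `prefix_key` helper and the prompt strings are hypothetical, and real providers presumably key on token blocks rather than raw strings:

```python
import hashlib


def prefix_key(*segments: str) -> str:
    """Hypothetical cache key: a hash of all input up to the cache point."""
    h = hashlib.sha256()
    for seg in segments:
        data = seg.encode("utf-8")
        # Length-prefix each segment so ("ab", "c") and ("a", "bc") differ.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


SYSTEM = "You are a helpful assistant for Example Corp."  # made-up prompt

# Bad: the session start time sits at the front of the system prompt,
# so every session produces a different key and nothing is shared.
bad_a = prefix_key(f"Session started 09:00. {SYSTEM}")
bad_b = prefix_key(f"Session started 09:05. {SYSTEM}")

# Better: the system prompt is identical across sessions. The varying
# part goes into the first user message, after the cache point.
good_a = prefix_key(SYSTEM)
good_b = prefix_key(SYSTEM)

print(bad_a == bad_b)    # False
print(good_a == good_b)  # True
```

Note that in this model two sessions can only share an entry when their prefixes are byte-for-byte identical, which is the point being made about the key.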

macNchz 28 minutes ago

Caching is not supposed to work like that, but that doesn’t preclude the cache key computation function from having bugs.

marginalia_nu 18 minutes ago

Yeah, there are quite a lot of potential bugs that could have this shape. If I were to guess, it could be a buffer in a buffer pool not being sized and zeroed correctly, allowing stale data to bleed between sessions.
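
A toy illustration of that bug shape, written in Python for readability. Everything here (`BufferPool`, `run_session`, the payloads) is hypothetical and says nothing about how any real inference stack is built; the actual code would more likely be C++ or CUDA:

```python
class BufferPool:
    """Toy pool of fixed-size buffers that are reused across sessions."""

    def __init__(self, size, zero_on_release):
        self.size = size
        self.zero_on_release = zero_on_release
        self.free = []

    def acquire(self):
        # Reuse a released buffer if one exists, else allocate a zeroed one.
        return self.free.pop() if self.free else bytearray(self.size)

    def release(self, buf):
        if self.zero_on_release:
            buf[:] = bytes(self.size)  # wipe contents before reuse
        self.free.append(buf)


def run_session(pool, payload):
    """Write payload into a pooled buffer, then read the buffer back.

    Reading the whole buffer instead of len(payload) bytes stands in
    for the sizing bug. Payloads must fit within pool.size.
    """
    buf = pool.acquire()
    buf[:len(payload)] = payload
    seen = bytes(buf)
    pool.release(buf)
    return seen


leaky = BufferPool(16, zero_on_release=False)
run_session(leaky, b"minecraft.py")   # session 1
leak = run_session(leaky, b"hi")      # session 2 sees session 1's tail
print(leak)   # b'hinecraft.py\x00\x00\x00\x00'

safe = BufferPool(16, zero_on_release=True)
run_session(safe, b"minecraft.py")
clean = run_session(safe, b"hi")
print(clean)  # b'hi' followed by 14 zero bytes
```

Either fix alone closes the hole in this toy: zero the buffer on release, or only ever read back the bytes the current session wrote.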

supriyo-biswas 26 minutes ago

There could also just be a bug where the output tokens of session 1 were shared with session 2, due to a race condition or similar.

Avicebron 43 minutes ago

In order, Fable 5 has rejected:

"Recipe for red-braised pork, I have pork shoulder"

"Write up a framework for MCP patterns I can give to claude code"

"explain the biomechanics of motion in c. elegans" (I get this one, I mostly did it to test and it's related to my hobby project)

Do we get an extra day of functional Fable 5 because it's down?

acepl 50 minutes ago

Oh yes, we do not need programmers any more…

emehex 47 minutes ago

"Coding is largely solved"

consp 36 minutes ago

While it's abused by LLM vendors, I've been hearing that phrase in one form or another since the early '00s, and it's likely way older.

ethagnawl 12 minutes ago

Sure, but have you ever seen it actually play out in practice the way it is now? Whether or not it's true (of course it's not), people are currently behaving as if it is and firing/hiring accordingly.

techpression 39 minutes ago

I love that quote, especially considering the insane number of bugs that are produced. It’s as easy to debunk as someone claiming "I can jump to the moon".

kylehotchkiss 43 minutes ago

50% unemployment :D

jstummbillig 26 minutes ago

Is there anything particular about LLMs that would make separating customer data harder than in any other SaaS case?

woadwarrior017 minutes ago

It'd be terribly compute-inefficient not to share prefix caches (KV caches) across customers.

27183 11 minutes ago

If I had to hazard a guess, doing anything in a multi-tenant way on a GPU is going to be hard mode compared to most SaaS, due to the lack of memory-safe tooling. I've built multi-tenant SaaS systems, and I've done a little GPU programming (a long time ago), but I've never tried to combine the two disciplines.