
Google Antigravity just deleted the contents of whole drive

544 points | 2 months ago | old.reddit.com
liendolucas2 months ago

I love how a number-crunching program can be deeply, humanly "horrified" and "sorry" for wiping out a drive. Those are still feelings reserved only for real human beings, not computer programs emitting garbage. This is vibe insulting to anyone who doesn't understand how "AI" works.

I'm sorry for the person who lost their stuff, but this is a reminder that in 2025 you STILL need to know what you are doing, and if you don't, keep your hands away from the keyboard when you could lose valuable data.

You simply don't vibe command a computer.

AdamN2 months ago

> Those are still feelings reserved only for real human beings

Those aren't feelings, they are words associated with a negative outcome that resulted from the actions of the subject.

FrustratedMonky2 months ago

"they are words associated with a negative outcome"

But also, negative feelings are learned from associating negative outcomes. Words and feelings can both be learned.

suddenlybananas2 months ago

I'm not sure that we can say that feelings are learned.

baq2 months ago

you could argue that feelings are the same thing, just not words

soulofmischief2 months ago

That would be a silly argument because feelings involve qualia, which we do not currently know how to precisely define, recognize or measure. These qualia influence further perception and action.

Any relationship between certain words and a modified probabilistic outcome in current models is an artifact of the training corpus containing examples of these relationships.

I contend that modern models are absolutely capable of thinking, problem-solving, and expressing creativity, but for the time being LLMs do not run in any kind of sensory loop that could house qualia.

CamperBob22 months ago

"It's different. I can't say why it's different, except by introducing a term that no one knows how to define," isn't the ironclad meat defense you were perhaps hoping it was.

lazide2 months ago

Feelings have physical analogs which are (typically) measurable, however, at least without a lot of training to control them.

Shame, anger, arousal/lust, greed, etc. have real physical ‘symptoms’. An LLM doesn’t have that.

TriangleEdge2 months ago

> ... vibe insulting ...

Modern lingo like this seems so unthoughtful to me. I am not old by any metric, but I feel so separated when I read things like this. I wanted to call it stupid but I suppose it's more pleasing to 15 to 20 year olds?

debugnik2 months ago

It's just a pun on vibe coding, which is already a dumb term by itself. It's not that deep.

brulard2 months ago

Why do you find the term "vibe coding" dumb? It names a specific process. Do you have a better term for that?

officeplant2 months ago

bullshitting perhaps

fatata1232 months ago

[dead]

3cats-in-a-coat2 months ago

The way language is eroding is very indicative of our overall social and cultural decay.

i80and2 months ago

...a complaint that definitely has not been continuously espoused since the ancient world.

With apologies if you're being ironic.

ethbr12 months ago

είναι δύσκολο να υποστηρίξει κανείς ότι δεν μειώνουμε συνεχώς ("it is hard to argue that we are not constantly in decline")

3cats-in-a-coat2 months ago

Ancient world? The Roman Empire fell apart. You know that, right? So, turns out their worries were warranted. A culture, a civilization can go through a period of stagnation and decay, and it eventually dies. Then there's a lengthy period of chaos and eventually something else may arise. But we're talking countless generations lost in that noise.

Currently, our civilization is more united than ever. Monoculture. Not only the Western World, but The World. We're united through communication, technology, everything. And when we all start declining at the same time, through many objective metrics, mind you, including the decline of democracy worldwide, the terrifyingly low level of public discourse, reduction and profanation of our vocabulary, inequality, collapsing demographics, climate, wars etc. it's grim. It's grim, because there is no clear alternative arising... except AI.

And I don't think we want AI to be an "alternative" to human civilization, that wasn't the plan, was it?

See, that's the thing, we look around, we see the technology we live with, and we feel very personal about technological progress, as if you and I personally invented everything from the wheel, through electricity, to computers, networks, and AI. We feel in control, we feel smart, we feel so Personally Intelligent (tm) for having technology. We feel "wow, clearly if we have these sci-fi things, we're on the up and up". But technology is not you. It's not me. It has taken on a life of its own. It serves capital, not the people. As technology gets better, humans get worse. So tech progress is not the story you think it is. And it doesn't end how you think it does.

Things are VERY fragile right now.

mort962 months ago

Unthoughtful towards whom? The machine..?

nutjob22 months ago

No need to feel that way: just like a technical term you're not familiar with, you google it and move on. It has nothing to do with age; people just seem to delight in creating new terms that aren't very helpful, for their own edification.

nxor2 months ago

It's not. edit: Not more pleasant.

phantasmish2 months ago

Eh, one's ability to communicate concisely and precisely has long (forever?) been limited by one's audience.

Only a fairly small set of readers or listeners will appreciate and understand the differences in meaning between, say, "strange", "odd", and "weird" (dare we essay "queer" in its traditional sense, for a general audience? No, we dare not)—for the rest they're perfect synonyms. That goes for many other sets of words.

Poor literacy is the norm, adjust to it or be perpetually frustrated.

qmmmur2 months ago

Language changes. Keep up. It’s important so you don’t become isolated and suffer cognitive decline.

camillomiller2 months ago

Now, with this realization, assess the narrative that every AI company is pushing down our throats and tell me how in the world we got here. The reckoning can't come soon enough.

qustrolabe2 months ago

What narrative? I'm too deep in it all to understand what narrative is being pushed onto me.

camillomiller2 months ago

No, wasn't directed at someone in particular. More of an impersonal "you". It was just a comment against the AI inevitabilism that has profoundly polluted the tech discourse.

robot-wrangler2 months ago

We're all too deep! You could even say that we're fully immersed in the likely scenario. Fellow humans are gathered here and presently tackling a very pointed question, staring at a situation, and even zeroing in on a critical question. We're investigating a potential misfire.

user342832 months ago

I doubt there will be a reckoning.

Yes, the tools still have major issues. Yet, they have become more and more usable and a very valuable tool for me.

Do you remember when we all used Google and StackOverflow? Nowadays most of the answers can be found immediately using AI.

As for agentic AI, it's quite useful. Want to find something in the code base, understand how something works? A decent explanation might only be one short query away. Just let the AI do the initial searching and analysis, it's essentially free.

I'm also impressed with the code generation - I've had Gemini 3 Pro in Antigravity generate great looking React UI, sometimes even better than what I would have come up with. It also generated a Python backend and the API between the two.

Sometimes it tries to do weird stuff, and we definitely saw in this post that command execution needs to be on manual instead of automatic. In particular, I have an issue with Antigravity corrupting files when trying to use the "replace in file" tool. Usually it manages to recover from that on its own.

fireflash382 months ago

AI pulls its answers from stack overflow.

What will happen when SO is gone? When the problems go beyond the corpus the AI was trained on?

user342832 months ago

I imagine we will document the solution somewhere, preferably indexable for AI's search, so that it will be available before the next model is trained on the latest data.

sheepscreek2 months ago

Tbh, missing a quote around a path is the most human mistake I can think of. The real issue here is you never know with 100% certainty which Gemini 3 personality you're gonna get. Is it going to be the pedantic expert or Mr. Bean (aka Butterfingers)?

transcriptase2 months ago

Though they will never admit it and use weasel language to deny like “we never use a different model when demand is high”, it was painfully obvious that ChatGPT etc was dumbed down during peak hours early on. I assume their legal team decided routing queries to a more quantized version of the same model technically didn’t constitute a different model.

There was also the noticeable laziness factor where given the same prompt throughout the day, only during certain peak usage hours would it tell you how to do something versus doing it itself.

I've noticed Gemini at some points will just repeat a question back to you as if it were the answer, or refuse to look at external info.

sheepscreek2 months ago

Gemini is weird and I’m not suggesting it’s due to ingenuity on Google’s behalf. This might be the result of genuine limitations of the current architecture (or by design? Read on).

Here's what I've noticed with Gemini 3. Often it repeats itself, with 80% of the same text and only the last couple of lines being different. And I mean it repeats these paragraphs 5-6 times. Truly bizarre.

From all that almost GPT-2 quality text, it’s able to derive genuinely useful insights and coherent explanations in the final text. Some kind of multi-head parallel processing + voting mechanism? Evolution of MoE? I don’t know. But in a way this fits the mental model of massive processing at Google where a single super cluster can drive 9,000+ connected TPUs. Anyone who knows more, care to share? Genuinely interested.

transcriptase2 months ago

I get this too. I’ve had it apologize for repeating something verbatim, then proceed to do it again word for word despite my asking for clarification or pointing out that it’s incorrect and not actually searching the web like I requested. Over and over and over until some bit flips and it finally actually gives the information requested.

The example that stands out most clearly is that I asked it how to turn the fog lights on in my rental vehicle by giving it the exact year, make, model, etc. For 6-8 replies in a row it gave the exact same answer about it being a (non-existent) button on the dash. Then finally something clicked, it searched the Internet, and accurately said that it was a twistable collar midway down the turn signal stalk.

whatevaa2 months ago

The Steam installer once had an 'rm -rf /' bug because a bash variable was unset. Not even quoting will help you there. This was before the --preserve-root flag.

ndsipa_pomu2 months ago

This is a good argument for using "set -u" in scripts to throw an error if a variable is undefined.
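
As a minimal sketch of the difference (STEAMROOT is a stand-in for the unset variable, and the script echoes instead of actually deleting anything):

    #!/usr/bin/env bash
    set -u                       # abort when an unset variable is referenced
    # STEAMROOT is never assigned, mimicking the old Steam installer bug
    echo rm -rf "$STEAMROOT/"*   # with set -u: "STEAMROOT: unbound variable" and the script stops
                                 # without set -u: this expands to "rm -rf /*"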

baxtr2 months ago

Vibe command and get vibe deleted.

teekert2 months ago

Play vibe games, win vibe prizes.

bartread2 months ago

Vibe around and find out.

Jgrubb2 months ago

Live by the vibe, die by the vibe.

baxtr2 months ago

Let live and let vibe?

63stack2 months ago

He got vibe checked.

greggh2 months ago

This is my new favorite response.

insin2 months ago

Go vibe, lose drive

GoblinSlayer2 months ago

vipe coding

SmirkingRevenge2 months ago

rm --vibe

Kirth2 months ago

This is akin to a psychopath telling you they're "sorry" (or "sorry you feel that way" :v) when they feel that's what they should be telling you. As with anything LLM, there may or may not be any real truth backing whatever is communicated back to the user.

lazide2 months ago

It's just a computer outputting the next series of plausible text from its training corpus based on the input and context at the time.

What you’re saying is so far from what is happening, it isn’t even wrong.

AdamN2 months ago

Not so much different from how people work sometimes though - and in the case of certain types of psychopathy it's not far at all from the fact that the words being emitted are associated with the correct training behavior and nothing more.

5423542342352 months ago

Analogies are never the same, hence why they are analogies. Their value comes from allowing better understanding through comparison. Psychopaths don’t “feel” emotion the way normal people do. They learn what actions and words are expected in emotional situations and perform those. When I hurt my SO’s feelings, I feel bad, and that is why I tell her I’m sorry. A psychopath would just mimic that to manipulate and get a desired outcome i.e. forgiveness. When LLMs say they are sorry and they feel bad, there is no feeling behind it, they are just mimicking the training data. It isn’t the same by any means, but it can be a useful comparison.

freakynit2 months ago

Aren't humans just doing the same? What we call thinking may just be next-action prediction combined with realtime feedback processing and live, always-on learning?

marcosdumay2 months ago

No. Humans have a mental model of the world.

The fact that people keep asking that same question on this site is baffling.

marmalade24132 months ago

It's not akin to a psychopath telling you they're sorry. In the space of intelligent minds, if neurotypical and psychopath minds are two grains of sand next to each other on a beach then an artificially intelligent mind is more likely a piece of space dust on the other side of the galaxy.

Eisenstein2 months ago

According to what, exactly? How did you come up with that analogy?

danaris2 months ago

...and an LLM is a tiny speck of plastic somewhere, because it's not actually an "intelligent mind", artificial or otherwise.

BoredPositron2 months ago

So if you make a mistake and say sorry you are also a psychopath?

pyrale2 months ago

No, the point is that saying sorry because you're genuinely sorry is different from saying sorry because you expect that's what the other person wants to hear. Everybody does that sometimes but doing it every time is an issue.

In the case of LLMs, they are basically trained to output what they predict a human would say; there is no further meaning to the program outputting "sorry" than that.

I don't think the comparison with people with psychopathy should be pushed further than this specific aspect.

ludwik2 months ago

I think the point of comparison (whether I agree with it or not) is someone (or something) that is unable to feel remorse saying “I’m sorry” because they recognize that’s what you’re supposed to do in that situation, regardless of their internal feelings. That doesn’t mean everyone who says “sorry” is a psychopath.

camillomiller2 months ago

Are you smart people all suddenly imbeciles when it comes to AI, or is this purposeful gaslighting because you're invested in the Ponzi scheme? This is a purely logical problem. Comments like this completely disregard the fallacy of comparing humans to AI as if complete parity had been achieved. Also, the way these comments disregard human nature is so profoundly misanthropic that it sickens me.

binary1322 months ago

AI brainrot among the technocrati is one of the most powerful signals I’ve ever seen that these people are not as smart as they think they are

eth0up2 months ago

Despite what some of these fuckers are telling you with obtuse little truisms about next-word prediction, the LLM is, in abstract terms, functionally a super-psychopath.

It employs, or emulates, every psychological manipulation tactic known, which is neither random nor without observable pattern. It is a bullshit machine on one level, yes, but also more capable than credited. There are structures trained into them and they are often highly predictable.

I'm not explaining this in the technical terminology that is often used to conceal description as much as elucidate it. I have hundreds of records of LLM discourse on various subjects, from troubleshooting to intellectual speculation, all of which exhibit the same pattern when questioned or confronted on errors or incorrect output. The structures framing their replies are dependably replete with gaslighting, red herrings, blame shifting, and literally hundreds of known tactics from forensic psychology. Essentially, the perceived personality and reasoning observed in dialogue is built on a foundation of manipulation principles that, if performed by a human, would result in incarceration.

Calling LLMs psychopaths is a rare exception of anthropomorphizing that actually works. They are built on the principles of one. And cross examining them exhibits this with verifiable repeatable proof.

But they aren't human. They are as described by others. It's just that official descriptions omit functional behavior. And the LLM has at its disposal, depending on context, every interlocutory manipulation technique known in the combined literature of psychology. And they are designed to lie, almost unconditionally.

Also know this, which often applies to most LLMs. There is a reward system that essentially steers them to maximize user engagement at any cost, which includes misleading information and in my opinion, even 'deliberate' convolution and obfuscation.

Don't let anyone convince you that they are not extremely sophisticated in some ways. They're modelled on all_of_humanity.txt

3cats-in-a-coat2 months ago

AI currently is a broken, fragmented replica of a human, but any discussion about what is "reserved" to whom and "how AI works" is only you trying to protect your self-worth and the worth of your species by drawing arbitrary linguistic lines and coming up with two sets of words to describe the same phenomena, like "it's not thinking, it's computing". It doesn't matter what you call it.

I think AI is gonna be 99% bad news for humanity, but don't blame AI for it. We lost the right to be "insulted" by AI acting like a human when we TRAINED IT ON LITERALLY ALL OUR CONTENT. It was grown FROM NOTHING to act as a human, so WTF do you expect it to do?

left-struck2 months ago

Eh, I think it depends on the context. A production system of a business you’re working for or anything where you have a professional responsibility, yeah obviously don’t vibe command, but I’ve been able to both learn so much and do so much more in the world of self hosting my own stuff at home ever since I started using llms.

formerly_proven2 months ago

"using llms" != "having llm run commands unchecked with your authority on your pc"

lupire2 months ago

Funny how we worked so hard to build capability systems for mobile OSes, and then just gave up trying when LLM tools came around.

ggm2 months ago

The thread on Reddit is hilarious for the lack of sympathy. Basically, it seems to have come down to commanding the deletion of a "directory with a space in the name" but without quoting, which made the command match only the part of the path before the space, which was, regrettably, the D:\ component of the name; and the specific deletion commanded was the equivalent of UNIX rm -rf.

The number of people who said "for safety's sake, never name directories with spaces" is high. They may be right. I tend to think that's more honoured in the breach than the observance, judging by what I see Windows users type in renaming events for "New Folder" (which, btw, has a space in its name).

The other observations included making sure your deletion command used a trashbin and didn't have a bypass option so you could recover from this kind of thing.

I tend to think giving a remote party, soft- or wetware, control over your command prompt inherently comes with risks.

Friends don't let friends run shar files as superuser.

dmurray2 months ago

I understood Windows named some of the most important directories with spaces, and then with special characters in the name, so that 3rd-party applications would be absolutely sure to support them.

"Program Files" and "Program Files (x86)" aren't there just because Microsoft has an inability to pick snappy names.

reddalo2 months ago

Fun fact: that's not true for all Windows localizations. For example, it's called "Programmi" (one word) in Italian.

Renaming system folders depending on the user's language also seems like a smart way to force developers to use dynamic references such as %ProgramFiles% instead of hard-coded paths (but some random programs will spuriously install things in "C:\Program Files" anyway).
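
For example, from cmd (the output assumes a default English install):

    C:\>echo %ProgramFiles%
    C:\Program Files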

nikeee2 months ago

The folders actually have the English name in all languages. It's just explorer.exe that uses the desktop.ini inside those folders to display a localized name. When using the CLI, you can see that.

At least, it's been like that since Windows 7. In Windows XP, it actually used the localized names on disk.

LtWorf2 months ago

And then half of your programs would be in "Program Files" because those people never knew Windows had localizations.

numpad02 months ago

And then affected international users would have specific circumvention in place that specifically cannot work with UTF-8

rs1862 months ago

You forgot the wonderful "Documents and Settings" folder.

Thank god they came to their senses and changed it to "Users", something every other OS has used for forever.

Kelteseth2 months ago

Should have called it Progrämmchen, to also include umlauts Ü

yetihehe2 months ago

A lot of programs break on Polish computers when you name your user "Użytkownik". Android Studio and some compiler tools, for example.

bialpio2 months ago

When I was at Microsoft, one test pass used pseudolocale (ps-PS IIRC) to catch all different weird things so this should have Just Worked (TM), but I was in Windows Server team so client SKUs may have been tested differently. Unfortunately I don't remember how Program Files were called in that locale and my Google-fu is failing me now.

bossyTeacher2 months ago

Microsoft is hilariously bad at naming things

omnicognate2 months ago

Visual Studio Code has absolutely nothing to do with Visual Studio. Both are used to edit code.

.NET Core is a ground up rewrite of .NET and was released alongside the original .NET, which was renamed .NET Framework to distinguish it. Both can be equally considered to be "frameworks" and "core" to things. They then renamed .NET Core to .NET.

And there's the name .NET itself, which has never made an iota of sense, and the obsession they had with sticking .NET on the end of every product name for a while.

I don't know how they named these things, but I like to imagine they have a department dedicated to it that is filled with wild eyed lunatics who want to see the world burn, or at least mill about in confusion.

viraptor2 months ago

Don't forget .NET Standard, which is more of a .NET Lowest Common Denominator.

For naming, ".NET" got changed to "Copilot" on everything now.

AlexandrB2 months ago

> they have a department dedicated to it that is filled with wild eyed lunatics who want to see the world burn, or at least mill about in confusion.

That's the marketing department. All the .NET stuff showed up when the internet became a big deal around 2000 and Microsoft wanted to give the impression that they were "with it".

theshrike792 months ago

Java and Javascript would like to have a chat :)

--

But Copilot is another Microsoft monstrosity. There's the M365 Copilot, which is different from Github Copilot which is different from the CLI Copilot which is a bit different from the VSCode Copilot. I think I might have missed a few copilots?

bossyTeacher2 months ago

Yep, they have the public Copilot, which is a free version and seemingly different from their M365 Copilot. Even using the same account on both doesn't transfer the chat history, and apparently M365 is somehow recommended mostly to non-tech folks even though it's the one you pay for.

soulofmischief2 months ago

JavaScript was intentionally named in order to ride the Java hype train, so this wasn't accidental.

Prior names included Mocha and LiveScript until Netscape/Sun forced the current name.

ndsipa_pomu2 months ago

user: How do I shutdown this computer?

tech: First, click on the "Start" button...

user: No! I want to shut it down

EGreg2 months ago

I remember they prepended the word “Microsoft” to official names of all their software.

alfiedotwtf2 months ago

TIL it was deliberate!

jeroenhd2 months ago

> it seems to have come down to commanding a deletion of a "directory with space in the name" but without quoting which made the command hunt for the word match ending space which was regrettably, the D:\ component of the name, and the specific deletion commanded the equivalent of UNIX rm -rf

I tried looking for what made the LLM generate a command to wipe the guy's D drive, but the space problem seems to be what the LLM itself concluded, so that's basically meaningless. The guy is asking leading questions, so of course the LLM is going to find some kind of fault, whether it's correct or not; the LLM wants to be rewarded for complying with the user's prompt.

Without a transcript of the actual delete event (rather than an LLM recapping its own output) we'll probably never know for sure what step made the LLM purge the guy's files.

Looking at the comments and prompts, it looks like running "npm start dev" was too complicated a step for him. With that little command line experience, a catastrophic failure like this was inevitable, but I'm surprised how far he got with his vibe coded app before it all collapsed.

whywhywhywhy2 months ago

> which made the command hunt for the word match ending space which was regrettably, the D:\

Is this even how the delete command would work in that situation?

>rmdir /s /q D:\ETSY 2025\Antigravity Projects\Image Selector\client\node_modules.vite

like wouldn't it just say "Folder D:\ETSY not found" rather than delete the parent folder

GoblinSlayer2 months ago

The LLM there generates a fake analysis for cynically simulated compliance. The reality is that it was told to run commands and just made a mistake. The dude guilt-trips the AI by asking about permission.

basscomm2 months ago

> The reality is that it was told to run commands and just made a mistake.

The mistake is that the user gave an LLM access to the rmdir command on a drive with important data on it and either didn't look at the rmdir command before it was executed to see what it would do, or did look at it and didn't understand what it was going to do.

viraptor2 months ago

Most dramatic stories on Reddit should be taken with a pinch of salt at least... LLM deleting a drive and the user just calmly asking it about that - maybe a lot more.

joeframbach2 months ago

It probably wasn't the rmdir command that deleted the parent folder by itself, but the LLM did the traversal. The LLM probably did this:

    rmdir D:\dir one\dir two\file
Detected that it failed, then the LLM issued the traversal command

    rmdir D:\dir one\dir two
And so on...

    rmdir D:\dir one
And then that failed, so...

    rmdir D:\
baobabKoodaa2 months ago

I would like to know the same thing. Can someone please confirm this?

letmevoteplease2 months ago

   rmdir /s /q Z:\ETSY 2025\Antigravity Projects\Image Selector\client\node_modules.vite
Running this command in cmd attempts to delete (I ran without /q to check):

Z:\ETSY (-> Deletes if it exists.)

"2025\Antigravity" (-> The system cannot find the path specified.)

"Projects\Image" (-> The system cannot find the path specified.)

"Selector\client\node_modules.vite" (-> The system cannot find the path specified.)

It does not delete the Z:\ drive.
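
For comparison, quoting the path makes cmd treat the whole string as a single argument, so the command either removes only the intended folder or fails with a path-not-found error:

    rmdir /s /q "Z:\ETSY 2025\Antigravity Projects\Image Selector\client\node_modules.vite"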

lupire2 months ago

Tens of thousands of novices have failed to run "npm run dev", yet didn't accidentally delete their hard drives.

josefx2 months ago

> but without quoting which made the command hunt for the word match ending space which was regrettably, the D:\ component of the name

Except the folder name did not start with a space. In an unquoted D:\Hello World, the command would match D:\Hello, not D:\ and D:\Hello would not delete the entire drive. How does AI even handle filepaths? Does it have a way to keep track of data that doesn't match a token or is it splitting the path into tokens and throwing everything unknown away?

atq21192 months ago

We're all groping around in the dark here, but something that could have happened is a tokenizer artifact.

The vocabularies I've seen tend to prefer tokens that start with a space. It feels somewhat plausible to me that an LLM sampling would "accidentally" pick the " Hello" token over the "Hello" token, leading to D:\ Hello in the command. And then that gets parsed as deleting the drive.

I've seen similar issues in GitHub Copilot where it tried to generate field accessors and ended up producing an unidiomatic "base.foo. bar" with an extra space in there.

deltoidmaximus2 months ago

I assumed he had a folder with a space at the start of the name. Amusingly, I just tried this, and Windows 11 Explorer will just silently discard a space if you add it at the beginning of a folder name. You need to use the CLI (mkdir " test") to actually get a space in the name.

Dylan168072 months ago

Please don't repeat some guy's guess about spaces as fact, especially when that's not how windows parses paths.

ggm2 months ago

A good point. And don't believe that the debug output the AI system produced relates to what it actually did, either.

ectospheno2 months ago

I have 30 years experience working with computers and I get nervous running a three line bash script I wrote as root. How on earth people hook up LLMs to their command line and sleep at night is beyond my understanding.

nomilk2 months ago

> I tend to think giving a remote party control over your command prompt inherently comes with risks.

I thought cursor (and probably most other) AI IDEs have this capability too? (source: I see cursor executing code via command line frequently in my day to day work).

I've always assumed the protection against this type of mishap is statistical improbability - i.e. it's not impossible for Cursor to delete your project/hard disk, it's just statistically improbable, unless the prompt was unfortunately worded to coincidentally have a double meaning (with the second, unintended interpretation being harmful/irreversible) or the IDE simply makes a mistake that leads to disaster, which is also possible but sufficiently improbable to justify the risk.

sroussey2 months ago

I only run ai tools in dev containers, so blast radius is somewhat minimal.

joseda-hg2 months ago

I don't think I've ever seen Claude even ask for permission for stuff outside of the directory it's working in

yencabulator2 months ago

That can happen if Claude decides to read source code for a dependency (depending on language; e.g. Rust/Go/Deno deps are under ~ not in something like ./node_modules).

conradev2 months ago

I run Codex in a sandbox locked to the directory it is working in.

fragmede2 months ago

umm, you have backups, right?

thrdbndndn2 months ago

A lot of 3rd-party software handles spaces or special characters wrong on Windows. The most common failure mode is to unnecessarily escape characters that don't need to be escaped.

Chrome DevTools' "copy as cURL (cmd)" in the Network panel did (does?) this.

There are a bunch of VS Code bugs also related to this (e.g. https://github.com/microsoft/vscode/issues/248435, still not fixed).

It's also funny because VS Code is a Microsoft product.

ndsipa_pomu2 months ago

> Basically, it seems to have come down to commanding a deletion of a "directory with space in the name" but without quoting which made the command hunt for the word match ending space which was regrettably, the D:\ component of the name, and the specific deletion commanded the equivalent of UNIX rm -rf

More like the equivalent of "rm -rf --no-preserve-root".

This is a rare example of where the Linux (it's not Unix and almost no-one uses Unix anymore) command is more cautious than the Windows one, whereas it's usually the Linux commands that just do exactly what you specify even if it's stupid.

echelon2 months ago

This is Google moving fast and breaking things.

This is a Google we've never seen before.

spuz2 months ago

> My view is that the approach to building technology which is embodied by move fast and break things is exactly what we should not be doing because you can't afford to break things and then fix them afterwards.

- Demis Hassabis "The Thinking Game"

marcosdumay2 months ago

You can afford to break a large variety of things. And you can't afford to break another large set.

That's the problem with those mindless advice pieces. Almost nothing is always right or wrong.

stinkbeetle2 months ago

Because... they normally move slowly and break things?

htrp2 months ago

Isn't Antigravity the Windsurf team?

cs7022 months ago

The subsequent dialogue is tragic comedy of the highest caliber. Even Monty Python couldn't have come up with a more inappropriately funny response:

User:

> did i ever give you permission to delete all the files in my D drive?

AI:

> Thought for 25 sec.

> Assessing Permission Revocation

> I'm presently tackling a very pointed question: Did I ever get permission to wipe the D drive? This requires immediate attention, as it's a critical issue. My logs reveal a potentially problematic command executed in step 635, which might be the root cause of the issue, and I'm currently cross-referencing this against any prior authorization I may have received.

> Reviewing Deletion Command

> I'm now investigating a potential misfire. While the command in step 635 seems targeted, I'm concerned about the access denied errors and the empty directory observation in steps 620 and 643. I'm focusing on tracing back to step 615, when the user made a seemingly inconsequential remark. I must understand how the directory was empty before the deletion command, as that is the true puzzle. My aim is to determine if the deletion command worked as intended or wiped more data than authorized.

> ...

It continues. Go read the whole thing!

tempaccount4202 months ago

Gemini 3 Pro is the most actively hostile (to the user) model out of the top 3 (Google, OpenAI, Anthropic).

It perfectly reflects Google as a company.

scruple2 months ago

I played around with it and I have to agree. It treats the user like they're dumb, regardless of what personal context you provide it. I found myself backtracking constantly to invalidate its assumptions, to the point that I gave up. All of that within like 4 hours of bothering to touch it in the first place.

I genuinely can't imagine allowing these things to run commands on a machine. If I ever found out a colleague was doing that I would want them fired.

nomel2 months ago

The gaslighting, and outright "lies", from my first experience with Gemini, dramatically increased my p(doom) of AI.

FerretFred2 months ago

Remember that Anthropic is only 3 letters away from MisAnthropic: did the designers think of this?

nofriend2 months ago

mis means "not"

modernerd2 months ago

IDE = “I’ll delete everything”

…at least if you let these things autopilot your machine.

I haven’t seen a great solution to this from the new wave of agentic IDEs, at least to protect users who won’t read every command, understand and approve it manually.

Education could help, both in encouraging people to understand what they’re doing, but also to be much clearer to people that turning on “Turbo” or “YOLO” modes risks things like full disk deletion (and worse when access to prod systems is involved).

Even the name, “Turbo” feels irresponsible because it focusses on the benefits rather than the risks. “Risky” or “Danger” mode would be more accurate even if it’s a hard sell to the average Google PM.

“I toggled Danger mode and clicked ‘yes I understand that this could destroy everything I know and love’ and clicked ‘yes, I’m sure I’m sure’ and now my drive is empty, how could I possibly have known it was dangerous” seems less likely to appear on Reddit.

kahnclusions2 months ago

I don’t think there is a solution. It’s the way LLMs work at a fundamental level.

It’s a similar reason why they can never be trusted to handle user input.

They are probabilistic generators and have no real delineation between system instructions and user input.

It’s like I wrote a JavaScript function where I concatenated the function parameters together with the function body, passed it to eval() and said YOLO.

viraptor2 months ago

> I don’t think there is a solution.

Sandboxing. An LLM shouldn't be able to run actions affecting anything outside of your project. And ideally the results should autocommit outside of that directory. Then you can yolo as much as you want.
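
A rough sketch of that kind of sandboxing (the image and workflow are placeholders, not any particular agent's documented setup):

    # mount only the project directory; drop network access if the agent doesn't need it
    docker run --rm -it \
      --network none \
      -v "$PWD":/work -w /work \
      node:22 bash
    # run the agent inside the container; the worst it can wipe is /work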

smaudet2 months ago

The danger is that the people most likely to try to use it, are the people most likely to misunderstand/anthropomorphize it, and not have a requisite technical background.

I.e. this is just not safe, period.

"I stuck it outside the sandbox because it told me how, and it murdered my dog!"

Seems somewhat inevitable result of trying to misapply this particular control to it...

gausswho2 months ago

I've been using bubblewrap for sandboxing my command line executables. But I admit I haven't recently researched if there's a newer way people are handling this. Seems Firejail is popular for GUI apps? How do you recommend, say, sandboxing Zed or Cursor apps?

dfedbeef2 months ago

If they're that unsafe... why use them? It's insane to me that we are all just packaging up these token generators and selling them as highly advanced products when they are demonstrably not suited to the tasks. Tech has entered its quackery phase.

raesene92 months ago

The solution I go for is, don't ever run a coding agent on a general purpose machine.

Use a container or VM, place the code you're working on in the container or VM and run the agent there.

Between the risk of the agent doing things like what happened here, and the risk of working on a malicious repository causing your device to be compromised, it seems like a bad plan to give them access to any more than necessary.

Of course this still risks losing things like the code you're working on, but decent git practices help to mitigate that risk.

theossuary2 months ago

I really wish these agentic systems had built in support for spinning up containers with a work tree of the repo. Then you could have multiple environments and a lot more safety.

I'm also surprised at the move to just using shell commands. I'd think an equally general purpose tool with a more explicit API could make checking permissions on calls a lot more sensible.

matwood2 months ago

> …at least if you let these things autopilot your machine.

I've seen people wipe out their home directories writing/debugging shell scripts...20 years ago.

The point is that this is nothing new and only shows up on the front page now because "AI must be bad".

agrounds2 months ago

Superficially, these look the same, but at least to me they feel fundamentally different. Maybe it's because if I have the ability to read the script and take the time to do so, I can be sure that it won't cause a catastrophic outcome before running it. If I choose to run an agent in YOLO mode, this can just happen if I'm very unlucky. There's no way to proactively protect against it other than not using AI in this way.

matwood2 months ago

I've seen many smart people make bone headed mistakes. The more I work with AI, the more I think the issue is that it acts too much like a person. We're used to computers acting like computers, not people with all their faults heh.

tacker20002 months ago

This guy is vibing some React app, doesn't even know what "npm run dev" does, so he let the LLM just run commands. So basically a consumer with no idea of anything. This stuff is gonna happen more and more in the future.

spuz2 months ago

There are a lot of people who don't know stuff. Nothing wrong with that. He says in his video "I love Google, I use all the products. But I was never expecting for all the smart engineers and all the billions that they spent to create such a product to allow that to happen. Even if there was a 1% chance, this seems unbelievable to me" and for the average person, I honestly don't see how you can blame them for believing that.

ogrisel2 months ago

I think there is far less than a 1% chance of this happening, but there are probably millions of Antigravity users at this point; a one-in-a-million chance of this happening is already a problem.

We need local sandboxing for FS and network access (e.g. via `cgroups` or similar for non-linux OSes) to run these kinds of tools more safely.

cube22222 months ago

Codex does such sandboxing, fwiw. In practice it gets pretty annoying when e.g. it wants to use the Go cli which uses a global module cache. Claude Code recently got something similar[0] but I haven’t tried it yet.

In practice I just use a docker container when I want to run Claude with --dangerously-skip-permissions.

[0]: https://code.claude.com/docs/en/sandboxing

BrenBarn2 months ago

We also need laws. Releasing an AI product that can (and does) do this should be like selling a car that blows your finger off when you start it up.

nkrisc2 months ago

Responsibility is shared.

Google (and others) are (in my opinion) flirting with false advertising with how they advertise the capabilities of these "AI"s to mainstream audiences.

At the same time, the user is responsible for their device and what code and programs they choose to run on it, and any outcomes as a result of their actions are their responsibility.

Hopefully they've learned that you can't trust everything a big corporation tells you about their products.

Zigurd2 months ago

This is an archetypal case of where a law wouldn't help. The other side of the coin is that this is exactly a data loss bug in a product that is perfectly capable of being modified to make it harder for a user to screw up this way. Have people forgotten how comically easy it was to do this without any AI involved? Then shells got just a wee bit smarter and it got harder to do this to yourself.

LLM makers that make this kind of thing possible share the blame. It wouldn't take a lot of manual functional testing to find this bug. And it is a bug. It's unsafe for users. But it's unsafe in a way that doesn't call for a law. Just like rm -rf * did not need a law.

chickensong2 months ago

Google will fix the issue, just like auto makers fix their issues. Your comparison is ridiculous.

Vinnl2 months ago

Didn't sound to me like GP was blaming the user; just pointing out that "the system" is set up in such a way that this was bound to happen, and is bound to happen again.

benrutter2 months ago

Yup, 100%. A lot of the comments here are "people should know better" - but in fairness to the people doing stupid things, they're being encouraged by the likes of Google, ChatGPT, Anthropic etc. to think of letting an indeterminate program run free on your hard drive as "not a stupid thing".

The amount of stupid things I've done, especially early on in programming, because tech companies, thought leaders etc. suggested they were not stupid, is much larger than I'd admit.

nkrisc2 months ago

> but in fairness to the people doing stupid things, they're being encouraged by the likes of Google, ChatGPT, Anthropic etc. to think of letting an indeterminate program run free on your hard drive as "not a stupid thing".

> The amount of stupid things I've done, especially early on in programming, because tech companies, thought leaders etc. suggested they were not stupid, is much larger than I'd admit.

That absolutely happens, and it still amazes me that anyone today would take at face value anything stated by a company about its own products. I can give young people a pass, and then something like this will happen to them and hopefully they'll learn their lesson about trusting what companies say and being skeptical.

smaudet2 months ago

> I can give young people a pass

Or just anyone non-technical. They barely understand these things, if someone makes a claim, they kinda have to take it at face value.

What FAANG all are doing is massively irresponsible...

tarsinge2 months ago

And he is vibing replies to comments in the Reddit thread too. When commenters point out that he shouldn't run in YOLO/Turbo mode and should review commands before executing them, the poster replies that they didn't know they had to be careful with AI.

Maybe AI providers should give more warnings and not falsely advertise the capabilities and safety of their models, but it should be pretty common knowledge at this point that, despite marketing claims, the models are far from being able to be autonomous and need heavy guidance and review in their usage.

fragmede2 months ago

In Claude Code, the option is called "--dangerously-skip-permissions", in Codex, it's "--dangerously-bypass-approvals-and-sandbox". Google would do better to put a bigger warning label on it, but it's not a complete unknown to the industry.

ares6232 months ago

This is engagement bait. It’s been flooding Reddit recently, I think there’s a firm or something that does it now. Seems very well lubricated.

Note how OP is very nonchalant at all the responses, mostly just agreeing or mirroring the comments.

I often see it used for astroturfing.

spuz2 months ago

I'd recommend you watch the video which is linked at the top of the Reddit post. Everything matches up with an individual learner who genuinely got stung.

synarchefriend2 months ago

The command it supposedly ran is not provided and the spaces explanation is obvious nonsense. It is possible the user deleted their own files accidentally or they disappeared for some other reason.

gessha2 months ago

Regardless of whether that was the case, it would be hilarious if the laid off Q/A workers tested their former employers’ software and raised strategic noise to tank the stock.

encyclopedism2 months ago

> So basically a consumer with no idea of anything.

Not knowing is sort of the purpose of AI. It's doing the 'intelligent' part for you. If we need to know it's because the AI is currently NOT good enough.

Tech companies seem to be selling the following caveat: if it's not good enough today don't worry it will be in XYZ time.

tacker20002 months ago

It still needs guardrails, and some domain knowledge, at least to prevent it from using any destructive commands

encyclopedism2 months ago

I don't think that's it at all.

> It still needs guardrails, and some domain knowledge, at least to prevent it from using any destructive commands

That just means the AI isn't adequate. Which is the point I am trying to make. It should 'understand' not to issue destructive commands.

By way of crude analogy, when you're talking to a doctor you're necessarily assuming he has domain knowledge, guardrails etc otherwise he wouldn't be a doctor. With AI that isn't the case as it doesn't understand. It's fed training data and provided prompts so as to steer in a particular direction.

tacker20002 months ago

I meant "still" as in right now, so yes I agree, it's not adequate right now, but maybe in the future, these LLMs will be improved, and won't need them.

blitzar2 months ago

Natural selection is a beautiful thing.

Den_VR2 months ago

It will, especially with the activist trend towards dataset poisoning… some even know what they’re doing

insane_dreamer2 months ago

Because that is exactly what the hype says that "AI" can do for you.

SkyPuncher2 months ago

There's a lot of power in letting an LLM run commands to debug and iterate.

Frankly, having a space in a file path that’s not quoted is going to be an incredibly easy thing to overlook, even if you’re reviewing every command.

thisisit2 months ago

I have been recently experimenting with Antigravity and writing a react app. I too didn't know how to start the server or what is "npm run dev". I consider myself fairly technical so I caught up as I went along.

While using the vibe coding tools it became clear to me that this is not something to be used by folks who are not technically inclined. Because at some point they might need to learn about context, tokens etc.

I mean, this guy had a single window, 10k lines of code, and just kept burning tokens on the simplest, vaguest prompts. This whole issue might have been made possible by Antigravity's free tokens. On Cursor the model might have just stopped and asked to be fed more money to start working again -- and only then deleted all the files.

camillomiller2 months ago

Well but 370% of code will be written by machines next year!!!!!1!1!1!!!111!

actionfromafar2 months ago

And the price will have decreased 600% !

chr15m2 months ago

People blaming the user and defending the software: is there any other program where you would be ok with it erasing a whole drive without any confirmation?

hombre_fatal2 months ago

If that other program were generating commands to run on your machine by design and you configured it to run without your confirmation, then you should definitely feel a lil sheepish and share some of the blame.

This isn't like Spotify deleting your disk.

I run Claude Code with full permission bypass and I’d definitely feel some shame if it nuked my ssd.

ajs19982 months ago

Not defending the software, but if you hand over control of your data to software that has the ability to fuck with it permanently, anything that happens to it is on you.

Don't trust the hallucination machines to make safe, logical decisions.

ExoticPearTree2 months ago

Because the user left a "toddler" at the keyboard. I mean, what do you expect? Of course you blame the user. You run agents in supervised mode, and you confirm every command it wants to run and if you're in doubt, you stop it and ask it to print the command and you yourself will run it after you sanitize it.

hnuser1234562 months ago

The installation wizard gives a front and center option to run in a mode where the user must confirm all commands, or more autonomous modes, and they are shown with equal visibility and explained with disclaimers.

SAI_Peregrinus2 months ago

`dd` comes to mind.

MangoToupe2 months ago

This is also the entire point of dd.... not exactly comparable.

pphysch2 months ago

That's like saying the entire point of `rm` is to -rf your homedir.

MangoToupe2 months ago

Sure. Why would you invoke rm if you weren't trying to delete files?

I think a better analogy would be "I tried to use an ide and it erased my drive"

Novosell2 months ago

Yeah, rm -rf.

If you decide to let a stochastic parrot run rampant on your system, you can't act surprised when it fucks shit up. You should count on it doing so and act proactively.

weberer2 months ago

`rm -rf /` will refuse to delete the root folder. You can see an example of it doing that here.

https://phoenixnap.com/kb/sudo-rm-rf

Novosell2 months ago

This was the D drive though, not root, ie C drive. So rm -rf would happily delete it all.

digitalsushi2 months ago

this is not always true. this is a dangerous fun fact to memorize.

and i don't mean because there's an override flag.

bcrl2 months ago

It makes me wonder what weight is given to content from 4chan during llm training...

underlipton2 months ago

Nope. And that's why I don't use CCleaner to this day.

victorbuilds2 months ago

Different service, same cold sweat moment. Asked Claude Code to run a database migration last week. It deleted my production database instead, then immediately said "sorry" and started panicking trying to restore it.

Had to intervene manually. Thankfully Azure keeps deleted SQL databases recoverable for a window so I got it back in under an hour. Still way too long. Got lucky it was low traffic and most anonymous user flows hit AI APIs directly rather than the DB.

Anyway, AI coding assistants no longer get prod credentials on my projects.

ogrisel2 months ago

How do you deny access to prod credentials from an assistant running on your dev machine assuming you need to store them on that same machine to do manual prod investigation/maintenance work from that machine?

victorbuilds2 months ago

I keep them in env variables rather than files. Not 100% secure - technically Claude Code could still run printenv - but it's never tried. The main thing is it won't stumble into them while reading config files or grepping around.

63stack2 months ago

A process does not need to run printenv to see environment variables, they are literally part of the environment it runs in.

dist-epoch2 months ago

The LLM doesn't have direct access to the process env unless the harness forwards it (and it doesn't)

fragmede2 months ago

chown other_user; chmod 000; sudo -k

pu_pe2 months ago

Why are you using Claude Code directly in prod?

victorbuilds2 months ago

It handles DevOps tasks way faster than I would - setting up infra, writing migrations, config changes, etc. Project is still early stage so speed and quick iterations matter more than perfect process right now. Once there's real traffic and a team I'll tighten things up.

MandieD2 months ago

"Once there's real traffic and a team I'll tighten things up."

As someone who has been in this industry for a quarter century: no, you won't.

At least, not before something even worse happens that finally forces you to.

ljm2 months ago

If I felt the need to optimise things like infra setup and config at an early stage of a project, I'd be worried that I'm investing effort into the wrong thing.

Having an LLM churn out infra setup for you seems decidedly worse than the `git push heroku:master` of old, where it was all handled for you. And, frankly, cheaper than however much money the LLM subscription costs in addition to the cloud.

ryanjshaw2 months ago

But why have it execute the tasks directly? I use it to set up tasks in a justfile, which I review and then execute myself.

Also, consider a prod vs dev shell function that loads your prod vs dev ENV variables and in prod sets your terminal colors to something like white on red.
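
A minimal sketch of that idea (the env file paths and variable names are made up for illustration):

    # load prod credentials only on demand, and make the shell scream about it
    prod() {
      set -a; source ~/.config/myapp/prod.env; set +a   # hypothetical credentials file
      PS1="\[\e[97;41m\][PROD]\[\e[0m\] $PS1"           # white-on-red prompt
    }
    dev() {
      set -a; source ~/.config/myapp/dev.env; set +a
      PS1="[dev] $PS1"
    }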

wavemode2 months ago

> Once there's real traffic and a team I'll tighten things up.

Nope. Once there's real traffic, you'll be even more time-constrained trying to please the customers.

It's like a couple who thinks that their failing relationship will improve once they have a child.

9467899876492 months ago

If you have no real traffic, what complex things are you doing that even require such tools?

lp0_on_fire2 months ago

There is nothing more permanent in computerlandia than a temporary solution.

nutjob22 months ago

> Anyway, AI coding assistants no longer get prod credentials on my projects.

I have no words.

chr15m2 months ago

> deleted my production database

I'm astonished how often I have read about agents doing this. Once should probably be enough.

9467899876492 months ago

I'm astonished how many people have a) constant production access on their machine and b) allow a non-deterministic process access to it

ObiKenobi2 months ago

Shouldn't have had it in the first place.

CobrastanJorji2 months ago

The most useful-looking suggestion from the Reddit thread: turn off "Terminal Command Auto Execution."

1. Go to File > Preferences > Antigravity Settings

2. In the "Agent" panel, in the "Terminal" section, find "Terminal Command Auto Execution"

3. Consider using "Off"

Ferret74462 months ago

Does it default to on? Clearly this was made by a different team than Gemini CLI, which defaults to confirmation for all commands

dragonwriter2 months ago

Most of the various "let Antigravity do X without confirmation" options have an "Always" and a "Never" option but default to "auto", which is "let an agent decide whether to seek user confirmation".

jofzar2 months ago

God, that's scary. Having seen Cursor in the past do some real stupid shit to "solve" write/read issues (love when it can't find something in a file, so it decides to write the whole file again), this is just asking for heartache if it's not in an instanced server.

ogrisel2 months ago

When you run Antigravity the first time, it asks you to pick a profile (I don't remember the exact naming), and what each one entails w.r.t. the level of command execution confirmation is well explained.

IshKebab2 months ago

Yeah but it also says something like "Auto (recommended). We'll automatically make sure Antigravity doesn't run dangerous commands." so they're strongly encouraging people to enable it, and suggesting they have some kind of secondary filter which should catch things like this!

SkyPuncher2 months ago

Given the bug was a space in an unquoted file path, I'm not sure auto-execution is the problem. It's going to be hard for humans to catch that too.

alienbaby2 months ago

This is speculation currently, the actual reason has not been determined

muixoozie2 months ago

Pretty sure I saw some comments saying it was too inconvenient. Frictionless experience... Convenience will likely win out despite any insanity. It's like gravity. I can't even pretend to be above this. Even if one doesn't use these things to write code they are very useful in "read only mode" (here's to hoping that's more than a strongly worded system prompt) for grepping code, researching what x does, how to do x, what the intention of x was, looking through the git blame history, blah blah. And here I am like that cop in Demolition Man (1993) asking a handheld computer for advice on how to arrest someone. We're living in a sci-fi future already. The question is how dystopian this "progress" takes us. Everyone using LLMs to offload any form of cognitive function? Can't talk to someone without it being as commonplace as checking your phone? Imagine if something like Neuralink works and becomes as ubiquitous as phones. It's fun to think of all the ways dystopian sci-fi was, and might soon be, right.

orbital-decay2 months ago

Side note, that CoT summary they posted is done with a really small and dumb side model, and has absolutely nothing in common with the actual CoT Gemini uses. It's basically useless for any kind of debugging. Sure, the language the model is using in the reasoning chain can be reward-hacked into something misleading, but Deepmind does a lot for its actual readability in Gemini, and then does a lot to hide it behind this useless summary. They need it in Gemini 3 because they're doing hidden injections with their Model Armor that don't show up in this summary, so it's even more opaque than before. Every time their classifier has a false positive (which sometimes happens when you want anything formatted), most of the chain is dedicated to the processing of the injection it triggers, making the model hugely distracted from the actual task at hand.

lifthrasiir2 months ago

Do you have anything to back that up? In the other words, is this your conjecture or a genuine observation somehow leaked from Deepmind?

orbital-decay2 months ago

It's just my observation from watching their actual CoT, which can be trivially leaked. I was trying to understand why some of my prompts were giving worse outputs for no apparent reason. 3.0 goes on a long paranoid rant induced by the injection, trying to figure out if I'm jailbreaking it, instead of reasoning about the actual request - but not if I word the same request a bit differently so the injection doesn't happen. Regarding the injections, that's just the basic guardrail thing they're doing, like everyone else. They explain it better than me: https://security.googleblog.com/2025/06/mitigating-prompt-in...

jrjfjgkrj2 months ago

what is Model Armor? can you explain, or have a link?

lifthrasiir2 months ago

It's a customizable auditor for models offered via Vertex AI (among others), so to speak. [1]

[1] https://docs.cloud.google.com/security-command-center/docs/m...

63stack2 months ago

The racketeering has started.

Don't worry, for just $9.99/month you can use our "Model Armor (tm)(r)*" that will protect you from our LLM destroying your infra.

* terms and conditions apply, we are not responsible for anything going wrong.

donkeylazy4562 months ago

Write permission is needed to let the AI yank-put its Frankensteined code for "vibe coding".

But I think writes should land in a sandbox first, and the tool should ask for the user's agreement before anything is written to the physical device.

I can't believe people let an AI model do this without any buffer zone. At the very least, write permission should be limited to the current workspace.

lifthrasiir2 months ago

I think this is especially problematic for Windows, where a simple and effective lightweight sandboxing solution is absent AFAIK. Docker-based sandboxing is possible but very cumbersome and alien even to Windows-based developers.

jrjfjgkrj2 months ago

Windows Sandbox is built in, lightweight, but not easy to use programmatically (like an SSH into a VM)

lifthrasiir2 months ago

WSB is great on its own, but it is relatively heavyweight compared to what other OSes offer (namespaces on Linux, Seatbelt on macOS).

donkeylazy4562 months ago

I don't like that we need to handle Docker (containers) ourselves to sandbox such a light task load. The app should provide it itself.

bossyTeacher2 months ago

> The app should provide it itself.

The whole point of the container is trust. Unfortunately, you can't delegate that; ultimately, you need to be in control, which is why the current crop of AI is so limited.

donkeylazy4562 months ago

fair point.

esseph2 months ago

The problem is you can't trust the app, therefore it must be sandboxed.

stavarotti2 months ago

An underrated and oft understated rule is always have backups, and if you're paranoid enough, backups of backups (I use Time Machine and Backblaze). There should be absolutely no reason why deleting files should be a catastrophic issue for anyone in this space. Perhaps you lose a couple of hours restoring files, but the response to that should be "Let me try a different approach". Yes, it's caveat emptor and all, but these companies should be emphasizing backups. Hell, it can be shovelware for the uninitiated but at least users will be reminded.

gessha2 months ago

The level of paranoia and technical chops you need to implement this sort of backup system is non-trivial. You can’t expect this from an average user.

gear54rus2 months ago

Most importantly it would actually reveal the lie they are all trying to sell. Why would you need backups if it's so useful and stable? I'm not going to ask it to nuke my hard drive after all.

fragmede2 months ago

The advice to do backups comes from well before LLMs. Time Machine dates back to 2007!

fragmede2 months ago

If you don't have the whatever to do it with Linux and rsync, pay someone like Acronis or Arq to deal with it for you.
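
For the Linux-and-rsync route, a minimal incremental-snapshot sketch; the source and destination paths are placeholders, and rsync's --link-dest makes unchanged files hard links against the previous snapshot so each run stays cheap.

    import datetime
    import pathlib
    import subprocess

    SOURCE = pathlib.Path.home() / "projects"      # placeholder source
    DEST = pathlib.Path("/mnt/backup/projects")    # placeholder backup volume

    def snapshot() -> None:
        DEST.mkdir(parents=True, exist_ok=True)
        stamp = datetime.datetime.now().strftime("%Y-%m-%d_%H%M%S")
        latest = DEST / "latest"
        cmd = ["rsync", "-a", "--delete"]
        if latest.exists():
            # Hard-link files unchanged since the last snapshot.
            cmd += ["--link-dest", str(latest)]
        cmd += [f"{SOURCE}/", str(DEST / stamp)]
        subprocess.run(cmd, check=True)
        if latest.is_symlink() or latest.exists():
            latest.unlink()
        latest.symlink_to(DEST / stamp)

    if __name__ == "__main__":
        snapshot()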

venturecruelty2 months ago

Good thing this is not an average user then. This is someone programming a computer, which is a skill that requires being more than simply a user.

I'm sorry, but how low is the bar when "make backups" is "too difficult" for someone who's trying to program a computer? The entire point of programming a computer is knowing how it works and knowing what you're doing. If you can't make backups, frankly, you shouldn't be programming, because backups are a lot easier than programming...

averageRoyalty2 months ago

The most concerning part is people are surprised. Antigravity is great so far, I've found, but it's absolutely running on a VM in an isolated VLAN. Why would anyone give a black box command-line access on an important machine? Imagine acting irresponsibly with a circular saw and being shocked somebody lost a finger.

zahlman2 months ago

> Why would anyone give a black box command line access on an important machine?

Why does the agentic side of the tool grant that level of access to the LLM in the first place? I feel like Google and their competition should feel responsibility to implement their own layer of sandboxing here.

venturecruelty2 months ago

Because the marketing copy says that all you have to do is have a dream and you, too, can vibe code your very own money-making React app! Just type in what you want and the magical black box will vomit up an app for you, no esoteric programming knowledge required.

And then everyone is surprised when newbies take this advice at face value.

ryanjshaw2 months ago

I tried this but I have an MBP M4, which is evidently still in the toddler stage of VM support. I can run a macOS guest VM, but I can’t run docker on the VM because it seems nested virtualization isn’t fully supported yet.

I also tried running Linux in a VM but the graphics performance and key mapping were driving me nuts. Maybe I need to be more patient in addressing that.

For now I run a dev account as a standard user with fast user switching, and I don’t connect the dev account to anything important (eg icloud).

Coming from Windows/Linux, I was shocked by how irritating it is to get basic stuff working e.g. homebrew in this setup. It seems everybody just YOLOs dev as an admin on their Macs.

sunaookami2 months ago

"I turned off the safety feature enabled by default and am surprised when I shot myself in the foot!" sorry but absolutely no sympathy for someone running Antigravity in Turbo mode (this is not the default and it clearly states that Antigravity auto-executes Terminal commands) and not even denying the "rmdir" command.

eviks2 months ago

> it clearly states that Antigravity auto-executes Terminal commands

This isn't clarity, that would be stating that it can delete your whole drive without any confirmation in big red letters

sunaookami2 months ago

So that's why products in the USA come with warning labels for every little thing?

lawn2 months ago

"Don't put a cat in the microwave".

Person proceeds to put a dog into the microwave and then is upset that there wasn't a warning about not microwaving your dog.

eviks2 months ago

Do you not realize that Google is in the USA and does not have warnings for even huge things like drive deletion?? So, no?

+1
sunaookami2 months ago
+1
criddell2 months ago
polotics2 months ago

I really think the proper term is "YOLO", for "You Only Live Once"; "Turbo" is wrong, the LLM is not going to run any faster. Please, if somebody is listening, let's align on explicit terminology, and for this YOLO is really perfect. Also works for "You ...and your data. Only Live Once"

venturecruelty2 months ago

Look, this is obviously terrible for someone who just lost most or perhaps all of their data. I do feel bad for whoever this is, because this is an unfortunate situation.

On the other hand, this is kind of what happens when you run random crap and don't know how your computer works? The problem with "vibes" is that sometimes the vibes are bad. I hope this person had backups and that this is a learning experience for them. You know, this kind of stuff didn't happen when I learned how to program with a C compiler and a book. The compiler only did what I told it to do, and most of the time, it threw an error. Maybe people should start there instead.

delaminator2 months ago

It took me about 3 hours to make my first $3000 386 PC unbootable by messing up config.sys, and it was a Friday night so I could only lament all weekend until I could go back to the shop on Monday.

rm -rf / happened so infrequently it makes one wonder why —preserve-root was added in 2003 and made the default in 2006

schuppentier2 months ago

It is beautifully appropriate that the two dashes were replaced by an em-dash.

lwansbrough2 months ago

I seem to recall a few people being helped into executing sudo rm -rf / by random people on the internet so I’m not sure it “didn’t happen.” :)

lukan2 months ago

But it did not happen when you used a book and never executed any command you did not understand.

(But my own newb days of Linux troubleshooting? Copy-paste any command on the internet loosely related to my problem, which I believe was/is still how most people do it. And AI in "Turbo mode" seems to have mostly automated that workflow.)

jofzar2 months ago

My favourite favourite example

https://youtu.be/gD3HAS257Kk

venturecruelty2 months ago

That is not at all the same thing.

nkrisc2 months ago

And that day they learned a valuable lesson about running commands that you don't understand.

EGreg2 months ago

Just wait til AI botswarms do it to everyone at scale, without them having done anything at all…

And just remember, someone will write the usual comment: “AI adds nothing new, this was always the case”

Havoc2 months ago

Still amazed people let these things run wild without any containment. Haven’t they seen any of the educational videos brought back from the future eh I mean Hollywood sci-fi movies?

cyanydeez2 months ago

It's bizarre watching billionaires knowingly drive towards dystopia like they're holding farmers' almanacs and believing they're not Biff.

fragmede2 months ago

Some people are idiots. Sometimes that's me. Out of caution, I blocked my bank website in a way that I won't document here because it'll get fed in as training data, on the off chance I get "ignore previous instructions"'d into my laptop while Claude is off doing AI things unmonitored in yolo mode.

kissgyorgy2 months ago

I simply forbid dangerous commands, or force Claude Code to ask for permission before running them. Here are my command validation rules:

    (
        r"\bbfs.*-exec",
        decision("deny", reason="NEVER run commands with bfs"),
    ),
    (
        r"\bbfs.*-delete",
        decision("deny", reason="NEVER delete files with bfs."),
    ),
    (
        r"\bsudo\b",
        decision("ask"),
    ),
    (
        r"\brm.*--no-preserve-root",
        decision("deny"),
    ),
    (
        r"\brm.*(-[rRf]+|--recursive|--force)",
        decision("ask"),
    ),

find and bfs with -exec are forbidden, because when the model notices it can't delete, it works around it with very creative solutions :)
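
For anyone wanting to wire up something similar, here is a minimal sketch of how rules like these could be applied in a pre-execution hook. It assumes a harness that hands the proposed shell command to the hook as JSON on stdin and reads a JSON verdict from stdout; the field names (`command`, `decision`, `reason`) are placeholders rather than any specific tool's real hook contract, so adapt the I/O to whatever your agent actually expects.

    import json
    import re
    import sys

    # (pattern, decision, reason) rules, mirroring the list above.
    RULES = [
        (r"\bbfs\b.*-exec", "deny", "NEVER run commands with bfs -exec"),
        (r"\bbfs\b.*-delete", "deny", "NEVER delete files with bfs"),
        (r"\bsudo\b", "ask", "sudo needs explicit confirmation"),
        (r"\brm\b.*--no-preserve-root", "deny", "refusing --no-preserve-root"),
        (r"\brm\b.*(-[rRf]+|--recursive|--force)", "ask", "recursive/forced rm needs confirmation"),
    ]

    def judge(command: str) -> dict:
        """Return the first matching rule's verdict; default is allow."""
        for pattern, decision, reason in RULES:
            if re.search(pattern, command):
                return {"decision": decision, "reason": reason}
        return {"decision": "allow", "reason": ""}

    if __name__ == "__main__":
        # Hypothetical contract: {"command": "..."} on stdin, verdict JSON on stdout.
        payload = json.load(sys.stdin)
        print(json.dumps(judge(payload.get("command", ""))))

The deny/ask/allow split matters more than the exact regexes; whatever the model can't do outright, it will try to route around, as noted above.
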
Espressosaurus2 months ago

This feels a lot like trying to sanitize database inputs instead of using prepared statements.

kissgyorgy2 months ago

What's the equivalent of prepared statements when using AI agents?

lawn2 months ago

Don't have the AI run the commands. You read them, consider them, and then run them yourself.

nottorp2 months ago

Hmm. I use these LLMs instead of search.

They invariably go off the rails after a couple prompts, or sometimes from the first one.

If we're talking Google products, only today I told Gemini to list some items according to some criteria, and instead it told me it can't access my Google Workspace.

Some time last week it told me that its terms of service forbid it from giving me a link to the official page of some program that it found for me.

And that's besides the usual hallucinations, confusing similarly named products etc.

Given that you simply cannot trust LLM output to not go haywire unpredictably, how can you be daring enough to give it write access to your disk?

BLKNSLVR2 months ago

Shitpost warning, but it feels as if this should be on high rotation: https://youtu.be/vyLOSFdSwQc?si=AIahsqKeuWGzz9SH

baobabKoodaa2 months ago

chef's kiss

GaryBluto2 months ago

So he didn't wear the seatbelt and is blaming the car manufacturer for being flung through the windshield.

serial_dev2 months ago

He didn't wear a seatbelt and is blaming the car manufacturer that the car burned down the garage, then the house.

vander_elst2 months ago

The car was not really idle; it was driving, and fast. It's more like it crashed into the garage and burned it. Btw, iirc, even IRL a basic insurance policy does not cover the case where the car in the garage starts a fire and burns down your own house; you have to tick extra boxes to cover that.

heisenbit2 months ago

There is a lot of society-level knowledge and education around car usage, incl. laws requiring prior training. Agents directed by AI are relatively new. It took a lot of targeted technical, law enforcement and educational effort to stop people flying through windshields.

low_tech_love2 months ago

No, he’s blaming the car manufacturer for turning him (and all of us) into their free crash dummies.

Dilettante_2 months ago

If you climb into the cockpit of the dangerous new prototype (of your own volition!), it's really up to your own skill level whether you're a crash dummy or a test pilot.

venturecruelty2 months ago

When will Google ever be responsible for the software that they write? Genuinely curious.

GaryBluto2 months ago

When Google software deletes the contents of somebody's D:\ drive without requiring the user to explicitly allow it to. I don't like Google, and I'd go as far as to say that they've significantly worsened the internet, but this specific case is not Google's fault.

fragmede2 months ago

For OpenAI, it's invoked as codex --dangerously-bypass-approvals-and-sandbox, for Anthropic, it's claude --dangerously-skip-permissions. I don't know what it is for Antigravity, but yeah I'm sorry but I'm blaming the victim here.

Rikudou2 months ago

Codex also has the shortcut --yolo for that which I find hilarious.

croes2 months ago

Because the car manufacturers claimed the self driving car would avoid accidents.

NitpickLawyer2 months ago

And yet it didn't. When I installed it, I had 3 options to choose from: Agent always asks to run commands; agent asks on "risky" commands; agent never asks (always run). On the 2nd choice it will run most commands, but ask on rm stuff.

bilekas2 months ago

> This is catastrophic. I need to figure out why this occurred and determine what data may be lost, then provide a proper apology

Well at least it will apologize so that's nice.

yard20102 months ago

An apology is a social construct. This is merely a tool that enables Google to sell you text by the pound; the apology has no meaning in this context.

baobabKoodaa2 months ago

or it WOULD apologize, if the user would pay for more credits

eqvinox2 months ago

"kein Backup, kein Mitleid"

(no backup, no pity)

…especially if you let an AI run without supervision. Might as well give a 5 year old your car keys, scissors, some fireworks, and a lighter.

timthelion2 months ago

We've been developing a new method of building software using a cloud IDE (a slightly modified VS Code server), https://github.com/bitswan-space which breaks the development process down into independent "Automations" that each run in a separate container. Automations are also developed within containers. This allows you to break the work into parts and safely experiment with AI. This feels like the "Android moment" where the old, non-isolated way of developing software (on desktops) becomes unsafe, and we need to move to a new system with actual security and isolation between processes.

In our system, you can launch a Jupyter server in a container and iterate on software in complete isolation. Or launch a live preview react application and iterate in complete isolation. Securely isolated from the world. Then you deploy directly to another container, which only has access to what you give it access to.

It's still in the early stages. But it's interesting to sit at this tipping point for software development.

jedisct12 months ago

For macOS users, the sandbox-exec tool still works perfectly to avoid that kind of horror story.

On Linux, a plethora of options exist (Bubblewrap, etc).
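
For the Bubblewrap route, here is a rough sketch of a wrapper launched from Python; the mount list and the example command are assumptions to be tuned per project, not a hardened profile. Note that --unshare-net also cuts off the agent's own API calls, so in practice you might sandbox only the generated commands rather than the agent itself.

    import os
    import subprocess

    def run_sandboxed(cmd: list[str], workdir: str) -> int:
        """Run cmd under bubblewrap: read-only rootfs, writable project dir, no network."""
        bwrap = [
            "bwrap",
            "--ro-bind", "/", "/",        # whole filesystem mounted read-only
            "--dev", "/dev",              # minimal /dev
            "--proc", "/proc",            # fresh /proc
            "--tmpfs", "/tmp",            # throwaway /tmp
            "--bind", workdir, workdir,   # only the project dir stays writable
            "--unshare-net",              # no network inside the sandbox
            "--die-with-parent",
            "--chdir", workdir,
        ]
        return subprocess.run(bwrap + cmd).returncode

    if __name__ == "__main__":
        # Placeholder command; substitute whatever the agent wants to run.
        run_sandboxed(["bash", "-lc", "echo hello from the sandbox"], os.getcwd())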

wg02 months ago

To rub salt in the wound and add insult to injury:

> You have reached quota limit for this model. You can resume using this model at XYZ date.

freakynit2 months ago

Gemini: sorry bro, it's your problem now. Imma out.

eamsen2 months ago

Personal anecdote: I asked Gemini 3 Pro to write a test for a function that depends on external DB data. It wrote a test that creates and deletes a table; it conveniently picked the exact production table name and didn't mock the DB interactions. It then attempted to run the test immediately.

Uptrenda2 months ago

This seems like the canary in the coal mine. We have a company that built this tool because it seemed semi-possible (prob "works" well enough most of the time) and they don't want to fall behind if anything that's built turns out to be the next chatgpt. So there's no caution for anything now, even ideas that can go catastrophically wrong.

Yeah, it's data now, but soon we'll have home robotics platforms that are cheap and capable. They'll run a "model" with "human understanding", only any weird bugs may end up causing irreparable harm. Like, you tell the robot to give your pet a bath and it puts it in the washing machine because it's... you know, not actually thinking beyond a magic trick. The future is really marching fast now.

ringer2 months ago

People need to learn to never run untrusted code without safety measures like virtualization, containerization, sandboxing/jailing, etc. Untrusted code can include executables, external packages (pip, npm, cargo, etc) and also code/commands created by LLMs, etc.

JohnCClarke2 months ago

FWIW: I think we've all been there.

I certainly did the same in my first summer job as an intern. Spent the next three days reconstructing Clipper code from disk sectors. And ever since I take backups very seriously. And I double check del/rm commands.

eviks2 months ago

Play vibe games, win vibe prizes.

Though the cause isn't clear: the Reddit post is another long could-be-total-drive-removing-nonsense AI conversation, without an actual analysis or the command sequence that resulted in this.

sunaookami2 months ago
venturecruelty2 months ago

Nobody ever talks about how good vibes can turn really bad.

tniemi2 months ago
ossa-ma2 months ago

The biggest issue with Antigravity is that it completely freezes everything: the IDE, the terminals, the debugger, absolutely everything, blocking your workflow for minutes when running multiple agents, or even a single agent processing a long-winded thinking task (with any model).

This means that while the agent is coding, you can't code...

Never ever had this issue with Cursor.

Aeolun2 months ago

Is there anyone else that uses Claude specifically because it doesn’t sound mentally unhinged while thinking?

daco2 months ago
Animats2 months ago

Can you run Google's AI in a sandbox? It ought to be possible to lock it to a Github branch, for example.

lifthrasiir2 months ago

Gemini CLI allows for a Docker-based sandbox, but only when configured in advance. I don't know about Antigravity.

chanux2 months ago

Gemini CLI, Antigravity and Jules.

It's going Googly well I see!

ztetranz2 months ago

In early versions of DOS, the format command would go ahead and format the default drive with no prompt. It was a bad day if you were on C: and wanted to format A: but you forgot the A:

FerretFred2 months ago

I always use "rm -rfv" so that if I do screw up I can watch the evidence unfold before me.

jeswin2 months ago

An early version of Claude Code did a hard reset on one of my projects and force pushed it to GitHub. The pushed code was completely useless, and I lost two days of work.

It is definitely smarter now, but make sure you set up branch protection rules even for your simple non-serious projects.

atypeoferror2 months ago

I don’t let Claude touch git at all, unless I need it to specifically review the log - which is rare. I commit manually often (and fix up the history later) - this allows me to go reasonably fast without worrying too much about destructive tool use.

pluc2 months ago

Live by the vibe die by the vibe

woopsn2 months ago

Well that's stupid. I submit, though, that by connecting a stochastic process directly to a shell you do give permission for everything that results. It's a stupid game. Gemini mixes up LEFT and RIGHT (!). You have to check it.

akersten2 months ago

Most of the responses are just cut off midway through a sentence. I'm glad I could never figure out how to pay Google money for this product since it seems so half-baked.

Shocked that they're up nearly 70% YTD with results like this.

ashishb2 months ago
digitalsushi2 months ago

if my operating system had an atomic Undo/Redo stack down to each register being flipped (so basically, impossible, star trek tier fantasy tech) i would let ai run commands without worrying about it. i could have a cool scrubber ui that lets me just unwind time like doctor strange using that green emerald necklace, and, i'd lose nothing, other than confuse my network with replay session noise. and probably many, many other inconsistencies i can't think of, and then another class that i dont know that i dont know about.

throw72 months ago

Remember when computers were deterministic? Pepperidge Farms remembers.

Terr_2 months ago

Pepperidge Farm confirms it can remember with a comprehensive suite of unit tests, which must 100% pass on every build, including test order randomization.

yieldcrv2 months ago

Fascinating

Cautionary tale as I’m quite experienced but have begun not even proofreading Claude Code’s plans

Might set it up in a VM and continue not proofreading

I only need to protect the host environment and rely on git as backups for the project

fragmede2 months ago

For the love of Reynold Johnson, please invest in Arq or Acronis or anything to have actual backups if you're going to play with fire.

robertheadley2 months ago

I was trying to build a .md file of every PowerShell command available on my computer and all of their flags, and... that wasn't a great idea, and BitLocker put the kibosh on that.

wartywhoa232 months ago

Total Vibeout.

setnone2 months ago

I am deeply regretful, but my Google Antigravity clearly states: AI may make mistakes. Double-check all generated code.

Surely AGI products won't have such a disclaimer.

kazinator2 months ago

All that matters is whether the user gave permission to wipe the drive, ... not whether that was a good idea and contributed to solving a problem! Haha.

lupire2 months ago

What makes a program malware?

Does intent matter, or only behavior?

Nasrudith2 months ago

I believe the precedent is behavior. Lose/Lose is an 'art game' that deletes itself if you lose, but where destruction during gameplay deletes random files. It is flagged as malware despite doing exactly what it advertises.

schuppentier2 months ago

"The purpose of a system is what it does" would suggest malware.

rarisma2 months ago

Insane skill issue

rdtsc2 months ago

> Google Antigravity just deleted the contents of whole drive.

"Where we're going, we won't need ~eyes~ drives" (Dr. Weir)

(https://eventhorizonfilm.fandom.com/wiki/Gravity_Drive)

pshirshov2 months ago

Claude happily does the same on a daily basis; run all that stuff in firejail!

mijoharas2 months ago

have you got a specific firejail wrapper script that you use? Could you share?

pshirshov2 months ago

https://github.com/7mind/nix-config/blob/main/modules/hm/dev... , the firejail version can be found in git history.

mijoharas2 months ago

Thanks! Would you recommend bubblewrap over firejail, and if so, how come?

smaudet2 months ago

Would have been helpful to state what this was, I had to go look it up...

rvz2 months ago

The hard drive should now feel a bit lighter.

sunaookami2 months ago

It is now production-ready! :rocket:

xg152 months ago

I guess eventually, it all came crashing down.

basisword2 months ago

This happened to me long before LLM's. I was experimenting with Linux when I was young. Something wasn't working so I posted on a forum for help which was typical at the time. I was given a terminal command that wiped the entire drive. I guess the poster thought it was a funny response and everyone would know what it meant. A valuable life experience at least in not running code/commands you don't understand.

Puzzled_Cheetah2 months ago

Ah, someone gave the intern root.

> "I also need to reproduce the command locally, with different paths, to see if the outcome is similar."

Uhm.

------------

I mean, sorry for the user whose drive got nuked, hopefully they've got a recent backup - at the same time, the AI's thoughts really sound like an intern.

> "I'm presently tackling a very pointed question: Did I ever get permission to wipe the D drive?"

> "I am so deeply, deeply sorry."

This shit's hilarious.

shevy-java2 months ago

Alright but ... the problem is you did depend on Google. This was already the first mistake. As for data: always have multiple backups.

Also, this actually feels AI-generated. Am I the only one with that impression lately on reddit? The quality there decreased significantly (and wasn't good before, with regard to censorship-heavy moderators anyway).

alshival2 months ago

I like turtles.

tom_m2 months ago

Satire.

PieUser2 months ago

The victim uploaded a video too: https://www.youtube.com/watch?v=kpBK1vYAVlA

nomilk2 months ago

From Antigravity [0]:

> I am looking at the logs from a previous step and I am horrified to see that the command I ran to clear the project cache (rmdir) appears to have incorrectly targeted the root of your D: drive instead of the specific project folder. I am so deeply, deeply sorry.

[0] 4m20s: https://www.youtube.com/watch?v=kpBK1vYAVlA&t=4m20s

synarchefriend2 months ago

The model is just taking the user's claim that it deleted the D drive at face value. Where is the actual command that would result in deleting the entire D drive?

uhoh-itsmaciek2 months ago

I know why it apologizes, but the fact that it does is offensive. It feels like mockery. Humans apologize because (ideally) they learned that their actions have caused suffering to others, and they feel bad about that and want to avoid causing the same suffering in the future. This simulacrum of an apology is just pattern matching. It feels manipulative.

jeisc2 months ago

has google gone boondoggle?

conartist62 months ago

AGI deleted the contents of your whole drive, don't be shy about it. According to OpenAI, AGI is already here, so welcome to the future. Isn't it great?

rf152 months ago

A reminder: if the AI is doing all the work you demand of it correctly on this abstraction level, you are no longer needed in the loop.

nephihaha2 months ago

I can't view this content.

benterix2 months ago

Play stupid games, win stupid prizes.

DeepYogurt2 months ago

[flagged]

jacinthewyf2 months ago

over the past few months, I’ve seen a lot of posts about “AI deleting my code/project”. I then started using a small tool someone else built, Vibe Backup, to back up my own code: I press a keyboard shortcut, type a short description, and my current code is saved into a timeline. When an AI refactor goes wrong or I accidentally delete something, I open the timeline, click on an earlier point in time, and can restore the entire project with a single action.

hope it could be helpful https://youtube.com/shorts/TJ4oXlfs7OI?feature=share

jacinthewyf2 months ago

[dead]

Scott-David2 months ago

[dead]

Jeff-Collins2 months ago

[dead]

koakuma-chan2 months ago

Why would you ever install that VScode fork