Show HN: ZeroFS – A log-structured filesystem for S3

43 points | 1 hour | zerofs.net
rockwotj 15 minutes ago

The claim of sub-millisecond writes with data in S3 is false and impossible. If you look at the benchmark, the fsync is not timed, so this is just the latency of either the network or in-kernel file operations, depending on the mount settings.
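To illustrate the point about untimed fsync, here is a minimal sketch (not the ZeroFS benchmark, and not run against a ZeroFS mount) that times the same writes with and without `os.fsync` inside the measured region. The function names, the file name `bench.dat`, and the default sizes are assumptions made for the example.

```python
import os
import tempfile
import time


def time_writes(path, payload, count, sync):
    """Return mean seconds per write; include fsync in the timing if sync is True."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            if sync:
                # Without this call the write may only have reached the page cache.
                os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / count


def compare(directory, count=50, size=4096):
    """Time buffered writes and fsynced writes to a scratch file in directory."""
    payload = b"x" * size
    path = os.path.join(directory, "bench.dat")  # hypothetical scratch file
    buffered = time_writes(path, payload, count, sync=False)
    durable = time_writes(path, payload, count, sync=True)
    return buffered, durable


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        buffered, durable = compare(d)
        print(f"write only:    {buffered * 1e6:.1f} us/op")
        print(f"write + fsync: {durable * 1e6:.1f} us/op")
```

Pointing `directory` at the mount under test would show how much of the reported latency comes from leaving the flush out of the timed region. The numbers depend entirely on the filesystem and mount options.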

xyzzy_plugh 6 minutes ago

I hate it when databases celebrate their performance without synchronous flushing. You should be clear about the data loss window (which should be zero for committed transactions by default!) and the flushing interval to persistent storage.

I'm okay if you batch writes, I'm okay if you offer a low-latency mode with less durability, but by being unclear about this it just feels like a scam.

Eikon 8 minutes ago

[dead]

BlackLotus89 11 minutes ago

For a good S3 fs look at geesefs https://github.com/perrynzhou/geesefs-s3

coxley 35 minutes ago

From the docs:

> ZeroFS fetches object data in 128 KiB parts

Read/write operations in object storage are _far more_ expensive than stored bytes. I'm always afraid of anything that abstracts over S3/GCS access specifically for that reason.
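A rough back-of-the-envelope sketch of the request arithmetic behind this concern. It assumes every 128 KiB part is one separate request with no caching or coalescing, which may well overstate what ZeroFS actually issues. The price constant is a placeholder assumption, not a quoted rate; substitute your provider's current pricing.

```python
KIB = 1024
GIB = 1024 ** 3


def ranged_reads(total_bytes, part_bytes=128 * KIB):
    """Number of part-sized fetches needed to read total_bytes once (ceiling division)."""
    return (total_bytes + part_bytes - 1) // part_bytes


def request_cost(requests, price_per_1000):
    """Cost of a number of requests at a given price per 1000 requests."""
    return requests / 1000 * price_per_1000


if __name__ == "__main__":
    # Assumed placeholder price per 1000 GET requests; check real pricing.
    price_per_1000_gets = 0.0004
    n = ranged_reads(1 * GIB)
    print(n, "requests to read 1 GiB in 128 KiB parts")
    print(f"{request_cost(n, price_per_1000_gets):.6f} per uncached full read")
```

Under these assumptions a single uncached pass over 1 GiB is 8192 requests, so the request count, rather than the stored bytes, is what scales with read-heavy workloads.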

karakanb 18 minutes ago

One of the reasons why ZeroFS seems interesting is they use SlateDB under the hood, which optimizes the requests that hit S3 behind the scenes.

throw123456789126 minutes ago

Especially since the “one fetch” is who knows how many reads and retries under the hood.

Eikon 27 minutes ago

[dead]

abtinf 59 minutes ago

Entrusting data storage to a vibe coded filesystem seems imprudent.

Eikon 30 minutes ago

Is it? :)

Ask me anything!

abtinf 20 minutes ago

Here is my unsolicited advice:

If one of your goals is to get others to adopt the software, I recommend you redo the marketing page and readme from scratch. Delete them without looking at them again, then hand write the content for them. Once you have the content, you can tell an LLM to format it into a nice landing page, but strictly keep your wording without changes.

Eikon 18 minutes ago

That's fair advice, thanks.

wyager 25 minutes ago

FYI it looks like some of your comments are getting auto-flagged by the HN moderation system and marked as dead

tribal808 24 minutes ago

I’ve seen things like this before; your key differentiator needs to be efficiency and safety compared to other options.

ChocolateGod 22 minutes ago

I believe the first version of this required the metadata to be stored on the ZeroFS server, making HA kinda hard.

Has this changed, so that if I stop the server and create a new instance with the same configuration file, it'll pick up the existing metadata from the bucket?

dan_sbl 58 minutes ago

> The test suites run in public CI.

> Each card links to the CI pipeline.

Thanks for being explicit, AI-written marketing site. Wouldn't have been able to figure that out! Every currently maintained and reasonably popular open source project either runs CI in public or makes the tests extremely easy to run.

xx_ns 56 minutes ago

I got the same vibe from

> These are asciinema recordings of real terminal sessions, rendered as text rather than video. Playback caps idle pauses at two seconds and changes nothing else.

Thanks? This sounds like it's the LLM's response to the prompter, not something you should display on the page itself...

dizhn 48 minutes ago

I feel bad for actually liking that part now. Capping pauses at 2 seconds would show you where it hung for 2+ seconds without wasting your time. Smart, I thought.
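For anyone curious what capping idle pauses amounts to, here is a minimal sketch. It is not the site's or asciinema's actual code. It assumes a recording is a list of `(timestamp, data)` pairs with absolute timestamps in seconds, and `cap_idle` is a hypothetical helper name.

```python
def cap_idle(events, limit=2.0):
    """Rewrite absolute timestamps so that no gap between events exceeds limit seconds."""
    capped = []
    previous = 0.0  # original timestamp of the previous event
    clock = 0.0     # rewritten playback timestamp
    for timestamp, data in events:
        gap = timestamp - previous
        # Short gaps pass through unchanged; long ones are clipped to the limit.
        clock += min(gap, limit)
        capped.append((clock, data))
        previous = timestamp
    return capped


if __name__ == "__main__":
    recording = [(0.5, "$ ls"), (1.0, "file.txt"), (9.0, "$ sync"), (9.2, "done")]
    for when, text in cap_idle(recording):
        print(f"{when:.1f}s  {text}")
```

The 8-second hang before `$ sync` in the sample recording still plays back as a visible 2-second pause, which is the behavior described above: you can see where it stalled without sitting through the full wait.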

Eikon 47 minutes ago

Thank you for the feedback. The idea behind this was to say "We make claims that are backed by workflows you can verify". I'll improve the phrasing.

preetham_rangu 27 minutes ago

How does this compare to JuiceFS or SeaweedFS in terms of metadata latency? The LSM tree approach is interesting but compaction pauses on a remote-backed store seem like they could be painful.

Eikon 20 minutes ago

[dead]

tmach32 42 minutes ago

See also: JuiceFS, S3FS, and quite a few others.

We have done loads of research into using object storage wherever we can (given how cheap it is compared to SSDs), and so far it seems like making your application object store-aware is a far surer bet than abstracting S3 behind the file system. The behavior is just too different.

I'm more interested in applications that cleverly use object storage, e.g. AutoMQ, which is quite compatible with Kafka APIs but needs no HDDs.

the8472 38 minutes ago

s3fs doesn't provide POSIX semantics. It's good enough™ for some uses, but not comparable to what this one is ostensibly providing.

iamalizaidi 47 minutes ago

Seems purely vibecoded

lukewarm707 14 minutes ago

wonder when we get agents good enough that we can't say vibecode any more and have to say 'code'.

there was slop with ai jesus but now gpt image is just a photo with hidden watermark

abtinf 55 minutes ago

Why does this landing page load js from merklemap.com?

xx_ns 51 minutes ago

Both projects have the same author.

Eikon 50 minutes ago

Just a self-hosted Plausible instance :)

breckognize 51 minutes ago

Under the hood, S3's storage nodes are also built on a log-structured file system: https://cdn.amazon.science/77/5e/4a7c238f4ce890efdc325df8326...

(Not POSIX-compliant because it doesn't need to be.)

aniketsaini777 19 minutes ago

[flagged]