Back

Binary fuse filters: Fast and smaller than xor filters (2022)

88 points5 daysarxiv.org
pastage11 minutes ago

I recommend the zig library [1], it was a joy to use. Bloom filters was one of the first interesting algorithms I did in class back in university, we upgraded hardware during the lab making the use of bloom filters unnecessary in a lab ment to interactively show its usefulness. I have had this repeated since then, these filters are magic until hardware catches up, having smaller filter is lovely.

[1] https://github.com/hexops/fastfilter

dang4 hours ago

Related prior work:

Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters - https://news.ycombinator.com/item?id=22742905 - March 2020 (25 comments)

Xor Filters: Faster and Smaller Than Bloom Filters - https://news.ycombinator.com/item?id=21840821 - Dec 2019 (81 comments)

vgb2k184 hours ago

Fast, but not faster than XOR filters. I was wondering if the title was a typo, but the article clarifies they sacrificed some speed for the smaller size.

vlovich12340 minutes ago

I think you misread:

> that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.

They are faster to construct (2x) and smaller (within 13% of theoretical limit) while maintaining the same query performance.

nine_k3 hours ago

One construction is smaller and faster to build than xor filters, and another is even more compact, though slower:

> we build probabilistic filters -- called binary fuse filters -- that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.

djmips5 hours ago

This feels like a better xor filter implementation.