Hacker News Comments on
BetrFS: A Right-Optimized Write-Optimized File System

Microsoft Research · Youtube · 2 HN points · 5 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Microsoft Research's video "BetrFS: A Right-Optimized Write-Optimized File System".
Youtube Summary
This talk will describe BetrFS, a file system built on Bε-trees, a Write-Optimized Data Structure (WODS). BetrFS outperforms widely-used file systems, such as ext4 and xfs, on many benchmarks, sometimes by orders of magnitude. A recent paper on BetrFS was the runner-up for best paper at USENIX FAST 2015. The talk will cover:
- Write-optimized data structures, such as LSM-trees and Bε-trees
- Comparison of WODS for file system applications
- How to design a file system around the performance strengths of WODS
- Ongoing work to make BetrFS "dominate" all other file systems
Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
The website doesn't seem to mention that several of the papers on the filesystem won best-paper awards at major conferences. The paper "Optimizing Every Operation in a Write-Optimized File System", in particular, won the best-paper award at FAST '16.

Also, if you're interested in learning more about Bε-trees, here's a talk given by Rob Johnson a few years ago at Microsoft Research: BetrFS: A Right-Optimized Write-Optimized File System https://www.youtube.com/watch?v=fBt5NuNsoII

In general, I think it's really cool that there is a file system that exists today (i.e., BetrFS) that uses data structures which didn't exist 25 years ago. It's a great example of theoreticians and systems researchers working together.

swills
The reason I posted it was I saw it in the FAST21 playlist:

https://www.youtube.com/watch?v=6KueHK9i8lE

pengaru
> here's a talk given by Rob Johnson a few years ago at Microsoft Research: BetrFS: A Right-Optimized Write-Optimized File System https://www.youtube.com/watch?v=fBt5NuNsoII

Interesting talk, I wonder how much of the perf. advantage diminishes in a finished, production-ready implementation though.

Comparing an 80%-complete R&D prototype mule against crash-resilient posix-compliant production filesystems is basically never a fair perf. comparison.

You might find that just implementing rename and hard-links properly alone is going to kill your perf. since you dispensed with on-disk inode equivalents.

Nice to see people poking at these issues nonetheless, Linux needs better filesystem options.
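
A toy illustration of one reading of the rename concern above, not BetrFS's actual code: it assumes a store whose keys are full paths, so renaming a directory means rewriting every key underneath it, while an inode-style layout only rewrites one directory entry. The function names and the in-memory dicts are hypothetical stand-ins for on-disk indexes.

```python
from __future__ import annotations


def rename_full_path(index: dict[str, bytes], old: str, new: str) -> int:
    """Rename `old` to `new` in a full-path-keyed index (hypothetical layout).

    Returns the number of keys that had to be rewritten.
    """
    old = old.rstrip("/")
    new = new.rstrip("/")
    prefix = old + "/"
    # Collect first, so the dict is not mutated while it is being iterated.
    moved = [key for key in index if key == old or key.startswith(prefix)]
    for key in moved:
        index[new + key[len(old):]] = index.pop(key)
    return len(moved)


def rename_inode(dirs: dict[int, dict[str, int]], parent: int,
                 old_name: str, new_name: str) -> int:
    """Rename one entry inside directory inode `parent` (inode-style layout).

    Children hang off the inode number, so only this one entry changes.
    """
    dirs[parent][new_name] = dirs[parent].pop(old_name)
    return 1
```

With an index holding /src, three files under it, and an unrelated /srcfoo, rename_full_path(index, "/src", "/dst") rewrites four keys, while the inode-style rename touches one entry regardless of subtree size.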

zokier
On the other hand, I can easily imagine a lot of applications that don't care about full posix compliance, and are perfectly happy to trade handling of some obscure feature for improved performance.
c0balt
Maybe a viable option for single-application container images. On the one hand, they would offer tight control over which functions are used (to allow for missing features); on the other, they could target one exact FS and be optimized for it.
fartattack
Containers don't control the FS that they're written to. A container image is in the tar format, and at runtime the underlying FS is defined by the host, which is why containers only run on hosts with union filesystems.
pengaru
Renames and hard-links are not obscure features.

And there are myriad mount options for tailoring performance vs. crash-resilience/posix-compliance to the application in most of the existing production filesystems. Which was honestly another aspect of the talk that was somewhat lacking: what journaling modes were used? barrier/nobarrier? Was it even made equivalent to what betrfs achieves? We don't even know if a betrfs instance can successfully mount after a mid-write hard reboot.

This isn't completely fair, although most of it is.

Some of the reasons for modifying Linux are performance enhancements to the filesystem layer that only make sense with a Bε-tree filesystem (and are not possible without patching Linux). They cover this in the hour-long talk at MS research linked elsewhere in this thread.[1] (Yes, it's long, but it's a pretty good presentation.)

E.g., they describe modifying the page cache to write-through small modifications to file data, rather than dirtying the entire page and writing it back later (a form of write amplification).

[1]: https://www.youtube.com/watch?v=fBt5NuNsoII

wtallis
> E.g., they describe modifying the page cache to write-through small modifications to file data, rather than dirtying the entire page and writing it back later (a form of write amplification).

Considering that all the storage on the market now has sectors at least as large as a 4k page, this isn't actually reducing write amplification. At most, in some cases it might save a tiny bit of bus traffic.

loeg
They batch small edits into a log, so many edits end up in a single sector write.
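
A back-of-the-envelope sketch of the disagreement in this subthread, under stated assumptions: each small edit lands on a different 4 KiB page, a conventional page cache writes every dirty page back whole, and a log packs edits (plus a hypothetical 16-byte header each) into whole sectors. The numbers and helper names are illustrative, not taken from the talk.

```python
PAGE = 4096    # bytes per page-cache page (assumed)
SECTOR = 4096  # bytes per device sector (assumed, per the comment above)


def writeback_bytes(num_edits: int, page: int = PAGE) -> int:
    """Bytes written if every edit dirties a distinct page that is written whole."""
    return num_edits * page


def logged_bytes(num_edits: int, edit_size: int,
                 header: int = 16, sector: int = SECTOR) -> int:
    """Bytes written if edits are appended to a log and flushed in whole sectors.

    `header` is a made-up per-entry overhead (offset, length, and so on).
    """
    payload = num_edits * (edit_size + header)
    sectors = (payload + sector - 1) // sector  # round up to whole sectors
    return sectors * sector


def amplification(device_bytes: int, num_edits: int, edit_size: int) -> float:
    """Device bytes written per byte of data the application actually changed."""
    return device_bytes / (num_edits * edit_size)
```

For 1000 scattered 16-byte edits this gives 4,096,000 bytes of whole-page writeback versus 32,768 bytes of log, though it ignores the later cost of applying logged edits to the tree and any sync that forces a partly filled sector out early.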
Talk given by Rob Johnson [1] at MSR a few years back...

BetrFS: A Right-Optimized Write-Optimized File System https://www.youtube.com/watch?v=fBt5NuNsoII

[1] http://www3.cs.stonybrook.edu/~rob/

jules
That's a great talk, thanks!
Feb 25, 2017 · 2 points, 0 comments · submitted by espeed
This reminded me of BetrFS, presented at FAST '15. It looks like there's a more up-to-date presentation online[1] where the delete performance issues have been fixed.

[1] https://www.youtube.com/watch?v=fBt5NuNsoII
