Working with Files Is Hard (2019)

203 points by nathan_phoenix 5 months ago

12 comments

continuational 5 months ago
> Pillai et al., OSDI’14 looked at a bunch of software that writes to files, including things we'd hope write to files safely, like databases and version control systems: Leveldb, LMDB, GDBM, HSQLDB, Sqlite, PostgreSQL, Git, Mercurial, HDFS, Zookeeper. They then wrote a static analysis tool that can find incorrect usage of the file API, things like incorrectly assuming that operations that aren't atomic are actually atomic, incorrectly assuming that operations that can be re-ordered will execute in program order, etc.

> When they did this, they found that every single piece of software they tested except for SQLite in one particular mode had at least one bug. This isn't a knock on the developers of this software or the software -- the programmers who work on things like Leveldb, LMDB, etc., know more about filesystems than the vast majority of programmers and the software has more rigorous tests than most software. But they still can't use files safely every time! A natural follow-up to this is the question: why is the file API so hard to use that even experts make mistakes?
praptak 5 months ago
Ext4 actually special-handles the rename trick so that it works even if it should not:

"If auto_da_alloc is enabled, ext4 will detect the replace-via-rename and replace-via-truncate patterns and [basically save your ass]"[0]

[0] https://docs.kernel.org/admin-guide/ext4.html
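For readers unfamiliar with the rename trick mentioned above, here is a minimal Python sketch of the replace-via-rename pattern with explicit fsync calls, so that it does not depend on a filesystem-specific heuristic such as ext4's auto_da_alloc. The function name `atomic_write` and the `.tmp-` prefix are hypothetical, and the directory fsync assumes a POSIX system; treat it as an illustration, not a guarantee of crash safety on every filesystem.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` via write-temp, fsync, rename (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the OS to flush the new contents to disk before the rename.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup of the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the containing directory so the rename itself is persisted (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping the first fsync is the case the ext4 documentation quoted above is describing: without it, a crash shortly after the rename may leave the new name pointing at incomplete data on filesystems that do not special-case the pattern.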
Retr0id 5 months ago
> they found that every single piece of software they tested except for SQLite in one particular mode had at least one bug.

This is why whenever I need to persist any kind of state to disk, SQLite is the first tool I reach for. Filesystem APIs are scary, but SQLite is well-behaved.

Of course, it doesn't always make sense to do that, like the dropbox use case.
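As a rough illustration of reaching for SQLite to persist state, here is a small key-value sketch using Python's standard `sqlite3` module. The table name `state` and the helpers `open_state`, `put`, and `get` are hypothetical names chosen for this example.

```python
import sqlite3
from typing import Optional


def open_state(path: str) -> sqlite3.Connection:
    """Open (or create) a tiny key-value store backed by SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def put(conn: sqlite3.Connection, key: str, value: str) -> None:
    # The connection context manager commits on success and rolls back on error,
    # so a failed write does not leave a half-applied update.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO state (key, value) VALUES (?, ?)", (key, value)
        )


def get(conn: sqlite3.Connection, key: str) -> Optional[str]:
    row = conn.execute("SELECT value FROM state WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None
```

The application never touches write, fsync, or rename directly; the ordering and atomicity concerns are delegated to SQLite's transaction machinery.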
edgarvaldes 4 months ago
As per HN headlines, files are hard, git is hard, regex is hard, time zones are hard, money as a data type is hard, hiring is hard, people are hard.

I wonder what is easy.
gavinhoward 5 months ago
I wonder if, in the Pillai paper, they tested the SQLite Rollback option with the default synchronous [1] (`NORMAL`, I believe) or with `EXTRA`. I'm thinking that it was probably the default.

I kinda think, and I could be wrong, that SQLite rollback would not have any vulnerabilities with `synchronous=EXTRA` (and `fullfsync=F_FULLFSYNC` on macOS [2]).

[1]: https://www.sqlite.org/pragma.html#pragma_synchronous

[2]: https://www.sqlite.org/pragma.html#pragma_fullfsync
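For reference, the settings discussed above can be applied from Python's `sqlite3` module roughly as follows. The helper name `open_durable` is hypothetical. Note that the `fullfsync` pragma takes a boolean (it switches on use of F_FULLFSYNC where the platform supports it), so the sketch sets it to `ON`. Whether this configuration is free of the vulnerabilities found in the paper is the commenter's conjecture, which the sketch does not demonstrate.

```python
import sqlite3


def open_durable(path: str) -> sqlite3.Connection:
    """Open a database in rollback-journal mode with the strictest sync settings."""
    conn = sqlite3.connect(path)
    # Rollback journal (as opposed to WAL); DELETE is the classic rollback mode.
    conn.execute("PRAGMA journal_mode=DELETE")
    # EXTRA additionally syncs the directory after the rollback journal is removed.
    conn.execute("PRAGMA synchronous=EXTRA")
    # Boolean flag; it only changes behavior on systems that support F_FULLFSYNC.
    conn.execute("PRAGMA fullfsync=ON")
    return conn
```

Querying `PRAGMA synchronous` afterwards returns the numeric level, which makes it easy to confirm the setting took effect on a given connection.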
wruza 5 months ago
For those interested: the article makes no mention of NTFS or Windows.
ryao 4 months ago
> On Linux ZFS, it appears that there's a code path designed to do the right thing, but CPU usage spikes and the system may hang or become unusable.

ZFS fsync will not fail, although it could end up waiting forever when a pool faults due to hardware failures:

https://papers.freebsd.org/2024/asiabsdcon/norris_openzfs-fsync-failure/
einpoklum 4 months ago
The article wraps up with this salient point:

> In conclusion, computers don't work (but I guess you already know this...
1vuio0pswjnm7 4 months ago
No Javascript or SNI:

https://archive.wikiwix.com/cache/index2.php?rev_t=&url=https%3A%2F%2Fdanluu.com%2Fdeconstruct-files%2F
AutistiCoder 4 months ago
It's a good thing I'm a Web developer.

The closest I come to working with files is localStorage, but that's thread safe.
jheriko 4 months ago
This whole thing is a story about using outdated stuff in a shitty ecosystem.

It's not a real problem for most modern developers.

pwrite? wtf?

Not one mention of fopen.

Granted, some of the fine-detail discussion is interesting, but it hasn't made practical sense since about 1990.
userbinator 5 months ago
I don't get it. The only times I've had problems with filesystem corruption in the past few decades were due to a hardware problem, and said hardware was quickly replaced. The FAT family has been perfectly fine, while I've encountered corruption on every other FS including NTFS, exFAT, and the ext* family.

Meanwhile you can read plenty of stories of others having the exact opposite experience.

If you keep losing data to power losses or crashes, perhaps fix the cause of that? It doesn't make sense to try to work around it.