科技回声

12 comments

continuational 4 months ago

> Pillai et al., OSDI’14 looked at a bunch of software that writes to files, including things we'd hope write to files safely, like databases and version control systems: Leveldb, LMDB, GDBM, HSQLDB, Sqlite, PostgreSQL, Git, Mercurial, HDFS, Zookeeper. They then wrote a static analysis tool that can find incorrect usage of the file API, things like incorrectly assuming that operations that aren't atomic are actually atomic, incorrectly assuming that operations that can be re-ordered will execute in program order, etc.

> When they did this, they found that every single piece of software they tested except for SQLite in one particular mode had at least one bug. This isn't a knock on the developers of this software or the software -- the programmers who work on things like Leveldb, LMDB, etc., know more about filesystems than the vast majority of programmers and the software has more rigorous tests than most software. But they still can't use files safely every time! A natural follow-up to this is the question: why is the file API so hard to use that even experts make mistakes?

praptak 4 months ago

Ext4 actually special-cases the rename trick so that it works even when it should not:

"If auto_da_alloc is enabled, ext4 will detect the replace-via-rename and replace-via-truncate patterns and [basically save your ass]" [0]

[0] https://docs.kernel.org/admin-guide/ext4.html
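For anyone unfamiliar with the "rename trick" mentioned above, here is a minimal sketch of the replace-via-rename pattern written so that it does not depend on filesystem heuristics. The function name `atomic_write` is hypothetical, and the directory fsync at the end assumes a POSIX system (it will not work as written on Windows).

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) using write-temp, fsync, rename."""
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on
    # one filesystem (a cross-filesystem rename is not atomic).
    fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # Push the file contents to stable storage before the rename,
            # instead of hoping the filesystem orders things for us.
            os.fsync(f.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup of the temp file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the containing directory so the rename itself is durable
    # (assumption: POSIX semantics for opening and syncing a directory).
    dir_fd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping the first `fsync` is the variant that relies on ext4 noticing the pattern; doing it explicitly costs latency but does not depend on mount options.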

Retr0id 4 months ago

> they found that every single piece of software they tested except for SQLite in one particular mode had at least one bug.

This is why whenever I need to persist any kind of state to disk, SQLite is the first tool I reach for. Filesystem APIs are scary, but SQLite is well-behaved.

Of course, it doesn't always make sense to do that, like the Dropbox use case.

edgarvaldes 4 months ago

As per HN headlines, files are hard, git is hard, regex is hard, time zones are hard, money as a data type is hard, hiring is hard, people are hard.

I wonder what is easy.

gavinhoward 4 months ago

I wonder if, in the Pillai paper, they tested the SQLite Rollback option with the default synchronous [1] (`NORMAL`, I believe) or with `EXTRA`. I'm thinking that it was probably the default.

I kinda think, and I could be wrong, that SQLite rollback would not have any vulnerabilities with `synchronous=EXTRA` (and `fullfsync=F_FULLFSYNC` on macOS [2]).

[1]: https://www.sqlite.org/pragma.html#pragma_synchronous

[2]: https://www.sqlite.org/pragma.html#pragma_fullfsync
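A rough sketch of what opting into those settings could look like from Python's standard `sqlite3` module. The helper name `open_durable` is made up for illustration, and this only shows how to set the pragmas; it makes no claim about whether the result is free of the bugs the paper looked for.

```python
import sqlite3


def open_durable(path):
    """Open a SQLite database in rollback-journal mode with strict syncing."""
    conn = sqlite3.connect(path)
    # Use the classic rollback journal rather than WAL.
    conn.execute("PRAGMA journal_mode = DELETE")
    # EXTRA is the strictest synchronous level SQLite documents.
    conn.execute("PRAGMA synchronous = EXTRA")
    # Per the SQLite docs, fullfsync only has an effect on macOS,
    # where it makes SQLite use F_FULLFSYNC; elsewhere it is harmless.
    conn.execute("PRAGMA fullfsync = ON")
    return conn
```

Reading the value back with `PRAGMA synchronous` is a cheap way to confirm the setting took effect, since SQLite silently ignores pragmas it does not recognize.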

wruza 4 months ago

For those interested: the article makes no mention of NTFS or Windows.

ryao 4 months ago

> On Linux ZFS, it appears that there's a code path designed to do the right thing, but CPU usage spikes and the system may hang or become unusable.

ZFS fsync will not fail, although it could end up waiting forever when a pool faults due to hardware failures:

https://papers.freebsd.org/2024/asiabsdcon/norris_openzfs-fsync-failure/

einpoklum 4 months ago

The article wraps up with this salient point:

> In conclusion, computers don't work (but I guess you already know this...

1vuio0pswjnm7 4 months ago

No Javascript or SNI:

https://archive.wikiwix.com/cache/index2.php?rev_t=&url=https%3A%2F%2Fdanluu.com%2Fdeconstruct-files%2F

AutistiCoder 4 months ago

it's a good thing I'm a Web developer.

closest I come to working with files is localStorage, but that's thread safe.

jheriko 4 months ago

this whole thing is a story about using outdated stuff in a shitty ecosystem.

it's not a real problem for most modern developers.

pwrite? wtf?

not one mention of fopen.

granted, some of the fine-detail discussion is interesting, but it hasn't made practical sense since about 1990.

userbinator 4 months ago

I don't get it. The only times I've had problems with filesystem corruption in the past few decades were with a hardware problem, and said hardware was quickly replaced. The FAT family has been perfectly fine, while I've encountered corruption on every other FS including NTFS, exFAT, and the ext* family.

Meanwhile you can read plenty of stories of others having the exact opposite experience.

If you keep losing data to power losses or crashes, perhaps fix the cause of that? It doesn't make sense to try to work around it.

Working with Files Is Hard (2019)
