The UNIX file system abstraction is very simple, and doesn't define post-crash states. I once proposed different guarantees for different types of files:<p>- Unit files - the unit of update is the file. Files are created, written, and closed, and are not visible to other processes until closed. Once closed, the file is read-only; it cannot be rewritten, only replaced as a unit. For POSIX-type systems, files created with 'creat' should be created in this mode. O_TRUNC should be interpreted as "replace the old file with the new version on close". If the program aborts before a proper close, the new file should be dropped, leaving the old version intact.<p>The crash guarantee should be that post-crash, you have a completely written file. It can be either the old version or the new version, but never a partial version. This eliminates the gyrations people go through today to get this behavior (write to a temporary file, sync, then rename over the original).<p>- Log files - the unit of update is the write, which must be at the end. These are files opened for append. Every write lands at the end of the file, even with multiple processes writing. "seek" is disallowed if the file is open for writing; you can only append.<p>The crash guarantee should be that post-crash, you have a file which is either complete to the last write, or truncated precisely after some write. The file must never be cut in the middle of a record or trail off into junk.<p>- Temporary files - after a crash, they're gone.<p>- Managed files - these are for databases, and support additional functions related to locking and file synchronization. That's what the article is about. For the other types of files, you don't need all those features.<p>In practice, most files are unit files, log files, or temporary files. The number of programs which use managed files is small; mostly they're database programs or libraries.<p>Programs which use managed files and need data integrity after a crash must be very aware of concurrency and safety semantics. A somewhat different API may be required.
There should be two notifications from a write - "data accepted" and "data safely committed". Callers should be able to make blocking writes based on either of those, or make non-blocking writes and get two callbacks. This puts the concurrency management in the database application, which knows what data depends on other data. The file system can't know that, and shouldn't try to.
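<p>To make the two-notification idea concrete, here is a minimal sketch in Python. The class name TwoPhaseWriter and its methods are hypothetical, not an existing API. It assumes "data accepted" means os.write returned (the kernel has the data) and "data safely committed" means os.fsync returned; whether fsync really implies durability depends on the file system and the hardware underneath.

```python
import os
import tempfile
from concurrent.futures import Future, ThreadPoolExecutor


class TwoPhaseWriter:
    """Hypothetical wrapper: every write produces two notifications."""

    def __init__(self, path):
        # O_APPEND so every write lands at the end of the file.
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        # A single worker keeps writes and syncs in submission order.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def write(self, data):
        """Queue a write; return (accepted, committed) futures."""
        accepted = Future()   # resolved once the kernel has taken the data
        committed = Future()  # resolved once fsync has returned

        def work():
            try:
                view = memoryview(data)
                while view:  # os.write may accept fewer bytes than offered
                    written = os.write(self._fd, view)
                    view = view[written:]
                accepted.set_result(len(data))
            except Exception as exc:
                accepted.set_exception(exc)
                committed.set_exception(exc)
                return
            try:
                os.fsync(self._fd)  # assumption: fsync return == committed
                committed.set_result(len(data))
            except Exception as exc:
                committed.set_exception(exc)

        self._pool.submit(work)
        return accepted, committed

    def close(self):
        self._pool.shutdown(wait=True)
        os.close(self._fd)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    writer = TwoPhaseWriter(path)
    accepted, committed = writer.write(b"record-1\n")
    # Non-blocking style: one callback per notification.
    accepted.add_done_callback(lambda f: print("accepted", f.result(), "bytes"))
    # Blocking style: wait until the data is safely committed.
    print("committed", committed.result(), "bytes")
    writer.close()
```

<p>A caller that only needs "accepted" waits on the first future; a caller that must not proceed until the data is durable waits on the second. Deciding which writes depend on which earlier commits stays with the database code, which is the point of the proposal.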