科技回声

5 comments

sciurus, over 12 years ago

Best comment on that thread (from the bug reporter):

"Those following along at home is probably half the human race, now we have posts on Phoronix, Slashdot and Heise. Who the hell submits things like this to random-terrified-user media outlets before we've even characterized the bloody problem? Every one of those posts is inaccurate, of course, through no fault of their own but merely because we didn't yet know what the problem was ourselves, merely that I and one other person were seeing corruption: we obviously started by assuming that it was something obvious and thus fairly serious, but that didn't mean we expected that to be true: I certainly expected the final problem to be more subtle, if still capable of causing serious disk corruption (my warning here was just in case it was not).

But now there's a wave of self-sustaining accidental lies spreading across the net, damaging the reputation of ext4 unwarrantedly, and I started it without wanting to.

It's times like this when I start to understand why some companies have closed bug trackers."


cokernel_hacker, over 12 years ago

This is symptomatic of one of my two big problems with journalling-oriented file systems.

My problems with journalling are twofold:

1) They are very slow:
1a) You have a nice big sequential write into the journal, which is OK.
1b) A flush track cache to make sure it is actually in the journal. This can sync whatever has accumulated in the track cache, which might not just be journal data.
1c) The actual overwrites that spew data randomly over the drive.
1d) Writes to update the journal header/terminate the transaction.
1e) A final flush track cache which will sync who-knows-what onto the platters/flash.

2) Replay behavior of the journal log is _very_ fragile code. You need to handle lots of terrible cases, the most awful of which is to ensure that you don't play older transactions on top of newer transactions. You might say "hey, that shouldn't happen!" but it happens. It happens because the code is not trivial to write and detecting these bad cases isn't trivial.

Even if you do get the code right, you are still screwed. Why? Because if your drive does not support the flush track cache mechanism you are in for a world of pain. You can have a journal and journal header that is ancient if it is just stuck in some cache...

The ext* family of filesystems do not appear to have natural resiliency to this sort of problem. Instead, it appears to be a coordinated, concerted effort between various parts of the journaling code.
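To make point 2 concrete, here is a toy, in-memory sketch of the replay check described above: never play an older transaction on top of newer data. This is not how ext4/jbd2 is implemented. The names `Transaction`, `ToyDisk`, `last_applied_seq` and `replay` are all hypothetical, and the model assumes each transaction carries a monotonically increasing sequence number and that the disk persistently records the last sequence number it applied.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    seq: int                   # assumed monotonically increasing transaction id
    writes: dict[int, bytes]   # block number -> new block contents
    committed: bool = False    # True once the commit record reached the journal


@dataclass
class ToyDisk:
    blocks: dict[int, bytes] = field(default_factory=dict)
    last_applied_seq: int = 0  # hypothetical "journal header" field


def replay(disk: ToyDisk, journal: list[Transaction]) -> list[int]:
    """Apply committed transactions in sequence order, skipping stale ones.

    Returns the sequence numbers that were actually replayed.
    """
    applied = []
    for txn in sorted(journal, key=lambda t: t.seq):
        if txn.seq <= disk.last_applied_seq:
            # Already on disk: replaying it would roll blocks back to old data.
            continue
        if not txn.committed:
            # Incomplete transaction: stop here, later ones may depend on it.
            break
        disk.blocks.update(txn.writes)
        disk.last_applied_seq = txn.seq
        applied.append(txn.seq)
    return applied
```

Calling `replay` on a disk whose `last_applied_seq` is 2 with a journal holding transactions 1, 3 and an uncommitted 4 applies only transaction 3. The check is only as good as `last_applied_seq` itself: if the header that stores it is stale because a cache flush never reached the platters, the guard stops working, which is the failure mode the comment warns about.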


bcl, over 12 years ago

Update here: https://lwn.net/Articles/521090/

Florin_Andrei, over 12 years ago

Oh great.

Ubuntu 12.04 Server crashes randomly due to some obscure bug in 3.2.

I upgraded the kernel to 3.6.3 specifically to stop the machine from crashing. Now let's hope that INDEED it doesn't crash, else I might get hit by this newfangled Ext4 bug.

Sounds like the accelerated kernel development is hitting various limits; they should go back to a more stodgy stable series like before.


batgaijin, over 12 years ago

What's the status with BTRFS? Also, is ZFS Linux performance still terrible, or did they find a way to fix that?


Ext4 data corruption trouble
