ZFS checksum flaw: Corruption may be left unnoticed

9 points by PaulCarrack 5 months ago

3 comments

magicalhippo 5 months ago
I think the headline is misleading.

The function in question is only used by the dataset send/receive functionality to compute the checksum of the resume token, and by the zdb debug tool, which is used to inspect the filesystem.

Importantly, it's not the function used to compute the checksum of the blocks on disk.

The resume token contains a header which includes the checksum in question as well as the length of the decompressed data in the token, followed by compressed data. The code checks that the length of the decompressed data matches the value stored in the header.

So the only possibility for corruption is when receiving a dataset and the token is corrupted in a way where Zlib still decompresses the correct amount of data, but the last few bytes are incorrect.
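
A minimal sketch of the verification flow described above, for readers who want to see its shape. The struct layout, field names, and which buffer the checksum covers are assumptions for illustration; this is not the actual OpenZFS code.

    /* Hypothetical resume-token verification, following the description
     * in the comment above. Not the OpenZFS implementation. */
    #include <stdint.h>
    #include <stddef.h>

    typedef struct resume_token_header {      /* assumed layout */
        uint64_t rt_cksum;                    /* checksum stored in the token */
        uint64_t rt_decomp_len;               /* expected size after inflation */
    } resume_token_header_t;

    /* Word-wise checksum of the kind discussed in this thread: bytes past
     * the last complete 32-bit word are never folded in. */
    static uint64_t
    token_cksum(const uint8_t *buf, size_t len)
    {
        uint64_t sum = 0;
        for (size_t i = 0; i + 4 <= len; i += 4)
            sum += (uint64_t)buf[i] | (uint64_t)buf[i + 1] << 8 |
                (uint64_t)buf[i + 2] << 16 | (uint64_t)buf[i + 3] << 24;
        return (sum);
    }

    /* Returns 0 if the token passes both checks, -1 otherwise. */
    static int
    verify_resume_token(const resume_token_header_t *hdr,
        const uint8_t *compressed, size_t comp_len,
        size_t decompressed_len)
    {
        /* Checksum over the payload: a word-wise checksum leaves the
         * trailing (comp_len % 4) bytes unchecked. */
        if (token_cksum(compressed, comp_len) != hdr->rt_cksum)
            return (-1);

        /* Length check on the inflated data. Corruption confined to the
         * unchecked tail can pass both tests if Zlib still happens to
         * inflate to the expected length. */
        if (decompressed_len != hdr->rt_decomp_len)
            return (-1);

        return (0);
    }

The narrow window described above is exactly the case where both checks succeed despite the payload being damaged.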
ggm 5 months ago
Corruption in the last 4 bytes of a checksummed block. So it's about the rate of detected corruption in the block overall, the distribution of corrupt bits, and the calculated odds of corruption landing in the 4-byte tail of a block.

Over terabytes, non-trivially non-zero I guess. Since the git submission has "my calculated version", you could imagine doing a post hoc comparison block by block on a tainted FS. Painfully slow, I bet.
rurban 5 months ago
Very common with overly optimized hash functions, such as fletcher4. The remaining bytes mod 4 are just ignored. BSD's nbperf does the same. Very sloppy.
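
For anyone who wants to see the failure mode concretely, here is a toy, self-contained demonstration of the pattern rurban describes: a Fletcher-4-style loop that walks whole 32-bit words, so two buffers that differ only in the trailing bytes produce identical checksums. It is a simplified illustration, not the OpenZFS fletcher_4 code.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Simplified Fletcher-4: four 64-bit accumulators updated once per
     * 32-bit word. size / 4 truncates, so the trailing (size % 4) bytes
     * are silently ignored. */
    static void
    toy_fletcher4(const void *buf, size_t size, uint64_t cksum[4])
    {
        const uint8_t *p = buf;
        size_t nwords = size / sizeof (uint32_t);   /* tail dropped here */
        uint64_t a = 0, b = 0, c = 0, d = 0;

        for (size_t i = 0; i < nwords; i++) {
            uint32_t w;
            memcpy(&w, p + i * sizeof (uint32_t), sizeof (w));
            a += w;
            b += a;
            c += b;
            d += c;
        }
        cksum[0] = a; cksum[1] = b; cksum[2] = c; cksum[3] = d;
    }

    int
    main(void)
    {
        /* 10 bytes: two complete words plus a 2-byte tail. */
        uint8_t x[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        uint8_t y[10];
        uint64_t cx[4], cy[4];

        memcpy(y, x, sizeof (x));
        y[9] ^= 0xff;               /* corrupt a byte in the ignored tail */

        toy_fletcher4(x, sizeof (x), cx);
        toy_fletcher4(y, sizeof (y), cy);

        printf("checksums %s\n",
            memcmp(cx, cy, sizeof (cx)) == 0 ? "collide" : "differ");
        return (0);
    }

Compiled with any C99 compiler, this prints "checksums collide", which illustrates the kind of miss being discussed in this thread.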