TechEcho (科技回声)

9 comments

dom0, about 8 years ago

So after the last blog post by The Author, which mainly showed The Author's lack of understanding, we have another article from The Author showing that he does indeed not understand the things he writes blog posts about: an incorrect rationale and assumptions about a 128 KiB block size being optimal, no readahead on virtual device files, and of course no mention of any of the FD-splicing alternatives in a post titled "Efficient ...", nor of any of the approaches involving memory mappings and explicit prefetching on those mappings.

I don't want to be overly dismissive or arrogant here, but this post pretty much boils down to "128 KiB is optimal because that number appears somewhere else, too, and that other spot even has something to do with I/O".
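For readers unfamiliar with the FD-splicing alternatives mentioned here, the following is a minimal sketch of one of them, `sendfile(2)`, via Python's `os.sendfile`. It assumes Linux and regular files on both ends; the function name `copy_with_sendfile` and the 1 MiB chunk size are illustrative choices, not something taken from the article under discussion.

```python
import os


def copy_with_sendfile(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path with sendfile(2); return bytes copied.

    The data moves between the two descriptors inside the kernel,
    so it is never read into a user-space buffer.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        offset = 0
        while remaining > 0:
            sent = os.sendfile(dst.fileno(), src.fileno(), offset,
                               min(chunk, remaining))
            if sent == 0:
                # The source was truncated while we were copying.
                break
            offset += sent
            remaining -= sent
        return offset
```

The chunk size only bounds how much is requested per call; it is not a tuned value.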

ars, about 8 years ago

The explanation concludes that readahead is the reason a 128 KB buffer is fastest in the benchmark, yet the benchmark uses /dev/zero and /dev/null, which don't have readahead.

You need to redo this article using actual reads and writes. Try it on both a quiet machine and a semi-busy one.
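A benchmark along the lines suggested here could be sketched as follows. This is a rough illustration, not the article's benchmark: `timed_copy` and `compare_buffer_sizes` are hypothetical helper names, the buffer sizes are arbitrary, and a fair comparison would also need to control for the page cache between runs (for example by using a source file larger than RAM or by dropping caches).

```python
import os
import time


def timed_copy(src_path, dst_path, bufsize):
    """Copy with a plain read/write loop; return (bytes_copied, seconds)."""
    start = time.perf_counter()
    copied = 0
    # buffering=0 gives raw file objects, so bufsize is what reaches read(2).
    with open(src_path, "rb", buffering=0) as src, \
         open(dst_path, "wb", buffering=0) as dst:
        buf = bytearray(bufsize)
        view = memoryview(buf)
        while True:
            n = src.readinto(buf)
            if not n:
                break
            pending = view[:n]
            while len(pending) > 0:
                # A raw write may be partial, so loop until the chunk is out.
                written = dst.write(pending)
                pending = pending[written:]
            copied += n
        os.fsync(dst.fileno())  # include the cost of reaching the device
    return copied, time.perf_counter() - start


def compare_buffer_sizes(src_path, dst_path,
                         sizes=(4096, 65536, 131072, 1 << 20)):
    """Return {bufsize: seconds} for one copy per buffer size."""
    return {size: timed_copy(src_path, dst_path, size)[1] for size in sizes}
```

Running `compare_buffer_sizes` on a real file, once on an idle machine and once under load, would give the two data points asked for above.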

JoshTriplett, about 8 years ago

I'd be interested to see how this compares to 1) mmapping both files and using memcpy, 2) mmapping the source and making a single call to write passing the whole buffer, and 3) copy_file_range.
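Option 2 from this comment might look roughly like the sketch below, assuming Python 3.8+ on Linux. The name `copy_mmap_write` is hypothetical, and the `MADV_SEQUENTIAL` hint is an added assumption, not something the comment asks for. A single `write` is attempted first, but the loop is needed because the kernel may accept fewer bytes than requested.

```python
import mmap
import os


def copy_mmap_write(src_path, dst_path):
    """Map the source read-only and write the mapping out; return bytes written."""
    with open(src_path, "rb") as src, \
         open(dst_path, "wb", buffering=0) as dst:
        size = os.fstat(src.fileno()).st_size
        if size == 0:
            return 0  # mmap refuses to map an empty file
        written = 0
        with mmap.mmap(src.fileno(), size, access=mmap.ACCESS_READ) as mapping:
            if hasattr(mmap, "MADV_SEQUENTIAL"):
                # Optional hint that the mapping is read front to back.
                mapping.madvise(mmap.MADV_SEQUENTIAL)
            with memoryview(mapping) as view:
                while written < size:
                    with view[written:] as chunk:
                        # write(2) may be partial, hence the loop.
                        written += os.write(dst.fileno(), chunk)
        return written
```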

amelius, about 8 years ago

For even faster copying on the same device, use a copy-on-write (COW) filesystem. (I wonder, though, what API the "cp" command would use to accomplish that.)
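On the API question: GNU `cp --reflink` uses the `FICLONE` ioctl on Linux. Below is a minimal sketch of that call from Python. The helper name `try_reflink` is hypothetical, the hard-coded request number assumes the common ioctl encoding used on x86 and ARM, and the list of "not supported" errno values is a guess at what filesystems without reflink support typically return.

```python
import errno
import fcntl

# _IOW(0x94, 9, int) from <linux/fs.h>; this value assumes the x86/ARM encoding.
FICLONE = 0x40049409

# Errors treated as "this filesystem cannot clone" (an assumption, not exhaustive).
_UNSUPPORTED = (errno.EOPNOTSUPP, errno.ENOTTY, errno.EXDEV,
                errno.EINVAL, errno.ENOSYS, errno.EPERM)


def try_reflink(src_path, dst_path):
    """Make dst_path a COW clone of src_path.

    Returns True on success and False when cloning is not supported,
    in which case dst_path is left as an empty file and the caller
    should fall back to an ordinary copy.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError as exc:
            if exc.errno in _UNSUPPORTED:
                return False
            raise
```

On filesystems such as Btrfs or XFS with reflink enabled, the clone shares extents with the source, so no file data is copied at all.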

jquast, about 8 years ago

The statvfs system call indicates the preferred block size of the filesystem. It is a very large value on zfs, for example.

https://docs.python.org/2/library/statvfs.html#statvfs.F_BSIZE
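In current Python the same information is available through `os.statvfs`. The sketch below simply reports it next to `st_blksize` from `stat`; the helper name `preferred_block_sizes` is made up for illustration.

```python
import os


def preferred_block_sizes(path):
    """Return the block-size hints the kernel reports for path."""
    vfs = os.statvfs(path)
    st = os.stat(path)
    return {
        "f_bsize": vfs.f_bsize,        # filesystem's preferred I/O block size
        "f_frsize": vfs.f_frsize,      # fundamental (fragment) block size
        "st_blksize": st.st_blksize,   # per-file preferred I/O size
    }
```

A copy tool could use one of these values as its buffer size instead of a hard-coded constant.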

valarauca1, about 8 years ago

I doubt these benchmarks are relevant anymore, as Linux has a system call dedicated just to copying files.

So you never even have to leave the page cache, let alone copy into user space.

OFC it was implemented post-4.0, so I doubt glibc supports it; therefore the whole world pretends it doesn't exist.

LeoPanthera, about 8 years ago

The most efficient way to copy a large number of small files is often to use a tarpipe. What block size does "tar" use? And for that matter, "nc", as a tarpipe through nc is a super fast way to move data between machines.
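On the block-size question: GNU tar's default blocking factor is 20, which gives 20 × 512 = 10240-byte records, and Python's `tarfile` uses the same value as `tarfile.RECORDSIZE`. A tarpipe within one machine can be sketched with `tarfile` in streaming mode. This is only an illustration of the idea: `tarpipe_copy` is a hypothetical name, and the usual shell form pipes one `tar` process into another instead.

```python
import os
import tarfile
import threading


def tarpipe_copy(src_dir, dst_dir):
    """Stream src_dir through a pipe as a tar archive and unpack it in dst_dir."""
    read_fd, write_fd = os.pipe()

    def pack():
        # Producer: write a tar stream into the pipe ("w|" = non-seekable stream).
        with os.fdopen(write_fd, "wb") as sink:
            with tarfile.open(fileobj=sink, mode="w|") as tar:
                tar.add(src_dir, arcname=".")

    packer = threading.Thread(target=pack)
    packer.start()
    # Consumer: unpack the stream as it arrives.
    with os.fdopen(read_fd, "rb") as source:
        with tarfile.open(fileobj=source, mode="r|") as tar:
            # Newer Python versions support extraction filters; use the safe one.
            kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            tar.extractall(dst_dir, **kwargs)
    packer.join()
```

The benefit for many small files comes from turning per-file work into one sequential stream, which is also why the same stream works well over a network connection.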

jaimex2, about 8 years ago

Good explanation, thank you. So when using dd or other copy tools, setting a block size of 128 KB should also be the best choice?

heinrich5991, about 8 years ago

What about `copy_file_range(2)`?

Efficient File Copying on Linux
