So after the last blog post by The Author, which mainly showed The Author's lack of understanding, we have another article from The Author demonstrating that he does indeed not understand the things he writes blog posts about: an incorrect rationale for 128 KiB being the optimal block size, no awareness that virtual device files have no readahead, and no mention of any of the FD-splicing alternatives in a post titled "Efficient ...", nor of the approaches involving memory mappings with explicit prefetching on said mappings.

I don't want to be overly dismissive or arrogant here, but the post pretty much boils down to "128 KiB is optimal because that number appears somewhere else too, and that other spot even has something to do with I/O".
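By "FD splicing" I mean splice(2), which moves data file -> pipe -> file entirely inside the kernel, never copying it into user space. A rough sketch, with error handling trimmed and an arbitrary 64 KiB chunk size picked purely for illustration:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    /* Copy in_fd to out_fd via an intermediate pipe; splice(2)
     * requires one end of each transfer to be a pipe. */
    int splice_copy(int in_fd, int out_fd)
    {
        int p[2];
        if (pipe(p) < 0)
            return -1;
        for (;;) {
            /* Pull up to 64 KiB from the source into the pipe. */
            ssize_t n = splice(in_fd, NULL, p[1], NULL, 65536, SPLICE_F_MOVE);
            if (n <= 0)
                break;              /* 0 = EOF, <0 = error */
            /* Drain the pipe into the destination. */
            while (n > 0) {
                ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                                   SPLICE_F_MOVE);
                if (m <= 0) { close(p[0]); close(p[1]); return -1; }
                n -= m;
            }
        }
        close(p[0]);
        close(p[1]);
        return 0;
    }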
The explanation concludes that readahead is why a 128 KiB buffer is fastest in the benchmark, yet the benchmark reads from /dev/zero and writes to /dev/null, neither of which has readahead.

You need to redo this article using actual file reads and writes. Try it on both a quiet machine and a semi-busy one.
I'd be interested to see how this compares to 1) mmapping both files and using memcpy, 2) mmapping the source and making a single write() call with the whole buffer, and 3) copy_file_range. A sketch of option 2 follows.
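A hedged sketch of option 2 (the function name is mine; it assumes the source fits in the address space and ignores short writes):

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int mmap_copy(int in_fd, int out_fd)
    {
        struct stat st;
        if (fstat(in_fd, &st) < 0)
            return -1;

        void *src = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, in_fd, 0);
        if (src == MAP_FAILED)
            return -1;

        /* Hint the kernel to fault pages in ahead of the copy. */
        madvise(src, st.st_size, MADV_SEQUENTIAL);

        /* One write() covering the whole mapping; a real implementation
         * would loop on short writes. */
        ssize_t n = write(out_fd, src, st.st_size);

        munmap(src, st.st_size);
        return n == st.st_size ? 0 : -1;
    }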
For even faster copying on the same device, use a copy-on-write (COW) filesystem.

(I wonder, though, what API the "cp" command would use to accomplish that.)
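For what it's worth: on Linux, cp --reflink asks for exactly this via the FICLONE ioctl, which COW filesystems such as btrfs (and XFS with reflink support) implement. A minimal sketch:

    #include <sys/ioctl.h>
    #include <linux/fs.h>        /* FICLONE */

    /* Share the source file's extents with the destination; no data
     * blocks are copied until one side is later modified. */
    int reflink_copy(int src_fd, int dst_fd)
    {
        return ioctl(dst_fd, FICLONE, src_fd);
    }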
The statvfs() call reports the filesystem's preferred block size (f_bsize). On ZFS, for example, it is a very large value.

https://docs.python.org/2/library/statvfs.html#statvfs.F_BSIZE
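In C, the equivalent query is a one-liner; a minimal sketch:

    #include <stdio.h>
    #include <sys/statvfs.h>

    int main(void)
    {
        struct statvfs st;
        /* f_bsize is the filesystem's preferred I/O block size. */
        if (statvfs("/", &st) == 0)
            printf("preferred block size: %lu\n", st.f_bsize);
        return 0;
    }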
I doubt these benchmarks are relevant anymore, as Linux has a syscall dedicated to copying files: copy_file_range.

So you never even have to leave the page cache, let alone copy into user space.

Of course, it landed after 4.0 (in 4.5), so I doubt glibc supports it, and therefore the whole world pretends it doesn't exist.
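A minimal sketch of using copy_file_range(2) (for the record, glibc did eventually gain a wrapper, in 2.27):

    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* The kernel copies between the two files directly, so the data
     * never crosses into user space. */
    int cfr_copy(int in_fd, int out_fd)
    {
        struct stat st;
        if (fstat(in_fd, &st) < 0)
            return -1;

        off_t remaining = st.st_size;
        while (remaining > 0) {
            ssize_t n = copy_file_range(in_fd, NULL, out_fd, NULL,
                                        (size_t)remaining, 0);
            if (n <= 0)
                return -1;
            remaining -= n;
        }
        return 0;
    }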
The most efficient way to copy a large number of small files is often a tarpipe. What block size does "tar" use? And, for that matter, "nc"? A tarpipe through nc is a super fast way to move data between machines.