Efficient File Copying on Linux

95 points by eklitzke, about 8 years ago

9 comments

dom0, about 8 years ago
So after the last blog post by The Author, which mainly showed The Author's lack of understanding, we have another article from The Author highlighting that he does indeed not understand the things he writes blog posts about (incorrect rationale and assumptions about a 128 KiB block size being optimal, no readahead on virtual device files, and of course no mention of any of the FD splicing alternatives in a post titled "Efficient ...", or of any of the approaches involving memory mappings and explicit prefetching on said mappings).

I don't want to be overly dismissive or arrogant here, but this post pretty much boils down to "128 KiB is optimal because that number appears somewhere else, too, and that other spot even has something to do with I/O".
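
One of the FD-based alternatives alluded to here is sendfile(2), which moves data between two descriptors inside the kernel. A minimal sketch, assuming Linux and Python 3; the helper name `sendfile_copy` and the chunk size are illustrative and do not come from the article:

```python
import os

def sendfile_copy(src_path, dst_path, chunk=1 << 30):
    """Copy src_path to dst_path with sendfile(2); returns bytes copied."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        offset = 0
        while remaining > 0:
            # The kernel moves the data; nothing passes through a user-space buffer.
            sent = os.sendfile(dst.fileno(), src.fileno(), offset, min(chunk, remaining))
            if sent == 0:  # source shrank while we were copying
                break
            offset += sent
            remaining -= sent
        return offset
```

With this approach the buffer-size question from the article largely moves into the kernel, since user space only decides how much to request per call.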
ars, about 8 years ago
The explanation concludes that readahead is the reason a 128 KB buffer is fastest in the benchmark, yet the benchmark uses /dev/zero and /dev/null, which don't have readahead.

You need to redo this article using actual reads and writes. Try it on both a quiet machine and a semi-busy one.
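
A rough sketch of what such a re-run could look like on real files, assuming Python 3 on Linux. The function names and the list of buffer sizes are made up for illustration, and a single timing per size is only a starting point:

```python
import os
import time

def copy_with_buffer(src_path, dst_path, bufsize):
    """Plain read/write loop with a bufsize-byte buffer; returns elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 gives raw file objects, so each read() is one read(2) call.
    with open(src_path, "rb", buffering=0) as src, \
         open(dst_path, "wb", buffering=0) as dst:
        while True:
            buf = src.read(bufsize)
            if not buf:
                break
            view = memoryview(buf)
            while len(view):  # handle short writes
                view = view[dst.write(view):]
        os.fsync(dst.fileno())  # include the time needed to reach the disk
    return time.perf_counter() - start

def compare_buffer_sizes(src_path, dst_path, sizes=(4096, 65536, 131072, 1048576)):
    """Time one copy per buffer size; returns {bufsize: seconds}."""
    return {size: copy_with_buffer(src_path, dst_path, size) for size in sizes}
```

Note that after the first pass the source will usually sit in the page cache, so later passes measure cached reads unless the cache is dropped between runs or the file is larger than RAM.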
JoshTriplett, about 8 years ago
I'd be interested to see how this compares to 1) mmapping both files and using memcpy, 2) mmapping the source and making a single call to write passing the whole buffer, and 3) copy_file_range.
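
Option 1 can be sketched as follows, assuming Python 3 on Linux; `mmap_copy` is a hypothetical helper name. The slice assignment between the two mappings is a single memory-to-memory copy:

```python
import mmap
import os

def mmap_copy(src_path, dst_path):
    """Copy by mapping both files and copying memory to memory; returns bytes copied."""
    size = os.path.getsize(src_path)
    # "w+b" opens the destination read-write, which a shared writable mapping needs.
    with open(src_path, "rb") as src, open(dst_path, "w+b") as dst:
        if size == 0:
            return 0  # empty files cannot be mapped
        dst.truncate(size)  # the destination must already have its final length
        with mmap.mmap(src.fileno(), size, access=mmap.ACCESS_READ) as s, \
             mmap.mmap(dst.fileno(), size, access=mmap.ACCESS_WRITE) as d:
            d[:] = s   # one bulk copy between the two mappings
            d.flush()  # write the dirty pages back to the file
    return size
```

The destination has to be extended before it is mapped, because writing past the end of a mapped file faults instead of growing the file.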
amelius, about 8 years ago
For even faster copying on the same device, use a copy-on-write (COW) filesystem.

(I wonder though what API the "cp" command would use to accomplish that).
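
On Linux, the interface for this is the `FICLONE` ioctl, which GNU cp exposes through its `--reflink` option. A minimal sketch, assuming Python 3 on Linux; `reflink_or_copy` is a hypothetical helper, and the constant is written out by hand because Python's `fcntl` module does not export it:

```python
import fcntl
import shutil

FICLONE = 0x40049409  # _IOW(0x94, 9, int) from <linux/fs.h>

def reflink_or_copy(src_path, dst_path):
    """Try to clone src into dst; fall back to a byte copy. Returns True if cloned."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Ask the filesystem to make dst share src's extents.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Filesystem without reflink support, or files on different filesystems.
            shutil.copyfileobj(src, dst)
            return False
```

On filesystems such as Btrfs, or XFS with reflink enabled, the clone completes without copying any data blocks; elsewhere the ioctl fails and the sketch falls back to an ordinary copy.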
jquast, about 8 years ago
The statvfs system call indicates the preferred block size of the filesystem. It is a very large value on zfs, for example.

https://docs.python.org/2/library/statvfs.html#statvfs.F_BSIZE
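
The linked `statvfs` module exists only in Python 2. In Python 3 the same value is available as an attribute on the result of `os.statvfs`. A small sketch; `preferred_block_size` is an illustrative name:

```python
import os

def preferred_block_size(path="."):
    """Return the filesystem's preferred I/O block size for path, in bytes."""
    return os.statvfs(path).f_bsize
```

A copy loop could use this value, or a multiple of it, as its buffer size instead of a hard-coded constant.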
valarauca1, about 8 years ago
I doubt these benchmarks are relevant anymore, as Linux has a system call dedicated just to copying files.

So you never even have to leave the page cache, let alone copy into user space.

OFC it was implemented post-4.0, so I doubt glibc supports it; therefore the whole world pretends it doesn't exist.
LeoPanthera, about 8 years ago
The most efficient way to copy a large number of small files is often to use a tarpipe. What block size does "tar" use? And for that matter, "nc", as a tarpipe through nc is a super fast way to move data between machines.
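
On the tar side, the format uses 512-byte blocks, and GNU tar by default groups 20 of them into 10 KiB records. Python's `tarfile` module uses the same values (`tarfile.BLOCKSIZE`, `tarfile.RECORDSIZE`). A local tarpipe can be sketched with that module, assuming Python 3; `tarpipe` is a hypothetical helper that plays the role of `tar -C src -cf - . | tar -C dst -xf -`:

```python
import os
import tarfile
import threading

def tarpipe(src_dir, dst_dir):
    """Stream the contents of src_dir into dst_dir through a pipe."""
    read_fd, write_fd = os.pipe()

    def pack():
        # "w|" writes a non-seekable tar stream, as tar does when writing to stdout.
        with os.fdopen(write_fd, "wb") as w, \
             tarfile.open(fileobj=w, mode="w|") as tar:
            tar.add(src_dir, arcname=".")

    packer = threading.Thread(target=pack)
    packer.start()
    # "r|" reads the stream sequentially, as tar does when reading from stdin.
    with os.fdopen(read_fd, "rb") as r, \
         tarfile.open(fileobj=r, mode="r|") as tar:
        tar.extractall(dst_dir)
    packer.join()
```

The benefit for many small files comes from turning them into one sequential stream; replacing the pipe with a socket gives the nc variant.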
jaimex2, about 8 years ago
Good explanation, thank you. So when using dd or other copy tools, should a block size of 128 KB also be the best choice?
heinrich5991, about 8 years ago
What about `copy_file_range(2)`?