How does rsync work?

319 points by secure, almost 3 years ago

8 comments

throw0101a, almost 3 years ago
This is also available as a video, "Why I wrote my own rsync":

* https://media.ccc.de/v/gpn20-41-why-i-wrote-my-own-rsync

* https://www.youtube.com/watch?v=wpwObdgemoE

boomskats, almost 3 years ago
This was a great write up. I've already sent it to a few people.

On the question of what happens if a file's contents change after the initial checksum, the man page for rsync[0] has an interesting explanation of the *--checksum* option:

> This changes the way rsync checks if the files have been changed and are in need of a transfer. Without this option, rsync uses a "quick check" that (by default) checks if each file's size and time of last modification match between the sender and receiver. This option changes this to compare a 128-bit checksum for each file that has a matching size. Generating the checksums means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is prior to any reading that will be done to transfer changed files), so this can slow things down significantly.

> The sending side generates its checksums while it is doing the file-system scan that builds the list of the available files. The receiver generates its checksums when it is scanning for changed files, and will checksum any file that has the same size as the corresponding sender's file: files with either a changed size or a changed checksum are selected for transfer.

> Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred, but that automatic after-the-transfer verification has nothing to do with this option's before-the-transfer "Does this file need to be updated?" check. For protocol 30 and beyond (first supported in 3.0.0), the checksum used is MD5. For older protocols, the checksum used is MD4.

[0]: https://linux.die.net/man/1/rsync
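
The difference between the "quick check" and *--checksum* described in the quoted man page can be sketched roughly as follows. This is a simplified local illustration, not rsync's actual implementation: the function names `needs_transfer` and `file_md5` are made up for this example, both files are assumed to be on the same machine, and modification times are compared at whole-second granularity as a simplifying assumption.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(src, dst, use_checksum=False):
    """Decide whether src should be copied over dst (hypothetical helper).

    Without use_checksum: compare size and modification time only.
    With use_checksum: compare size, then a whole-file MD5 digest.
    """
    if not os.path.exists(dst):
        return True
    src_stat, dst_stat = os.stat(src), os.stat(dst)
    if src_stat.st_size != dst_stat.st_size:
        return True  # a size mismatch selects the file in both modes
    if use_checksum:
        # Costly: both files are read in full before any transfer starts.
        return file_md5(src) != file_md5(dst)
    # Quick check: same size and same mtime is treated as "unchanged".
    return int(src_stat.st_mtime) != int(dst_stat.st_mtime)
```

Under these assumptions, two files with identical size and modification time but different contents are skipped by the quick check and selected only when `use_checksum=True`.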

lazypenguin, almost 3 years ago
Nice write up. rsync is great as an application but I found it more cumbersome to use when wanting to integrate it into my own application. There's librsync but the documentation is threadbare and it requires an rsync server to run. I found bita/bitar (https://github.com/oll3/bita) which is inspired by rsync & family. It works more like zsync which leverages HTTP Range requests so it doesn't require anything running on the server to get chunks. Works like a treat using s3/b2 storage to serve files and get incremental differential updates on the client side!
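
As a rough illustration of the HTTP Range idea mentioned above, a client that already knows which chunks it is missing can ask a plain file server for just those byte ranges. The sketch below only builds the request and does not send it. It is not taken from bita or zsync: the helper names and the `(offset, length)` chunk representation are assumptions made for this example.

```python
import urllib.request


def range_header(chunks):
    """Build a Range header value from (offset, length) pairs.

    HTTP byte ranges are inclusive, so a 100-byte chunk at offset 0
    is written as 0-99.
    """
    parts = [f"{offset}-{offset + length - 1}" for offset, length in chunks]
    return "bytes=" + ",".join(parts)


def build_chunk_request(url, chunks):
    """Return an unsent request asking only for the given chunks."""
    return urllib.request.Request(url, headers={"Range": range_header(chunks)})


# Hypothetical URL and chunk list, for illustration only.
request = build_chunk_request(
    "https://example.com/image.bin", [(0, 100), (200, 100)]
)
print(request.get_header("Range"))
```

A server that honors the header typically answers with status 206 and only the requested bytes. A server may also ignore the header and return the whole file, so a real client would need to check the response status.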

lloeki, almost 3 years ago
When trying to understand rsync and the rolling checksum I stumbled upon a small python implementation in some self-hosted corner of the web[0], which I have archived on GH[1] (not the author, but things can vanish quickly, as proved by the bzr repo which went *poof*[2]).

[0]: https://blog.liw.fi/posts/rsync-in-python/

[1]: https://github.com/lloeki/rsync/blob/master/rsync.py

[2]: https://code.liw.fi/obsync/bzr/trunk/
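
For readers who want a feel for the rolling checksum itself, here is a minimal sketch based on the weak checksum defined in the Tridgell and Mackerras technical report (two 16-bit sums, with modulus 2^16). It is not the code from the linked Python implementation or from rsync; the function names are chosen for this example.

```python
M = 1 << 16  # modulus used for both 16-bit halves


def weak_checksum(block):
    """Compute the (a, b) pair for a block of bytes from scratch.

    a is the plain sum of the bytes; b weights each byte by its distance
    from the end of the block (first byte has weight len(block)).
    """
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte to the right in constant time.

    out_byte leaves at the front of the window, in_byte enters at the end.
    """
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b


def combine(a, b):
    """Pack the two halves into a single 32-bit value."""
    return a + (b << 16)
```

The point of `roll` is that the checksum at every byte offset of a file can be obtained without re-summing each window, which is what makes searching for matching blocks at arbitrary offsets affordable.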

throw0101a, almost 3 years ago
See also the 1996 original paper by Tridgell (also of Samba fame) and Mackerras:

* https://rsync.samba.org/tech_report/

* https://www.andrew.cmu.edu/course/15-749/READINGS/required/cas/tridgell96.pdf

srvmshr, almost 3 years ago
I encountered a strange situation 2 days ago. I rsync my pdf files periodically between my hard drives. rsync showed no differences between two folder trees, but if I did `diff -r` between the two, 3 pdfs came out different.

I checked the three individually but they showed no corruption or changes on either side. How can this happen?

Edit: the hard drive copy was previously rsynced from this copy & both copies are mirrored with a google cloud bucket.

The 3 files which showed as different have the same MD5 checksum

CamperBob2, almost 3 years ago
I don't see why any of this is needed. Just install Dropbox, and...

bigChris, almost 3 years ago
Rsync's worst issue is someone port scanning and brute-forcing their way into your system. Turn off your port.