HTTrack Website Copier

136 points by iscream26 8 months ago

15 comments

Felk 8 months ago
Funny seeing this here now, as I _just_ finished archiving an old MyBB PHP forum. I used `wget`, though, and it took 2 weeks and 260 GB of uncompressed disk space (12 GB compressed with zstd). The process was not interruptible, so I had to start over each time my hard drive filled up. Maybe I should have given HTTrack a shot to see how it compares.

If anyone wants to know the specifics of how I used wget, I wrote them down here: https://github.com/SpeedcubeDE/speedcube.de-forum-archive

Also, if anyone has experience archiving similar websites with HTTrack and knows how it compares to wget for my use case, I'd love to hear about it!
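
For readers who hit the same "not interruptible, disk full, start over" problem, here is a minimal sketch of a checkpointed crawl loop. It is not how wget or HTTrack work internally. The names `run`, `load_state`, `save_state`, and `has_free_space`, the JSON state file, and the caller-supplied `fetch` function are all hypothetical and exist only for illustration.

```python
import json
import shutil
from pathlib import Path


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= min_free_bytes


def load_state(state_file, start_urls):
    """Resume from a saved state file, or start fresh from `start_urls`."""
    state_file = Path(state_file)
    if state_file.exists():
        return json.loads(state_file.read_text())
    return {"pending": list(start_urls), "done": []}


def save_state(state_file, state):
    """Write the state to a temp file, then rename it over the old one."""
    tmp = Path(str(state_file) + ".tmp")
    tmp.write_text(json.dumps(state))
    tmp.replace(state_file)


def run(state_file, start_urls, fetch, out_dir=".",
        min_free_bytes=1 << 30, max_pages=None):
    """Crawl until done, out of disk, or `max_pages` reached; always resumable.

    `fetch(url)` is a hypothetical callback that downloads one page and
    returns the list of links found on it.
    """
    state = load_state(state_file, start_urls)
    fetched = 0
    while state["pending"]:
        if max_pages is not None and fetched >= max_pages:
            break
        if not has_free_space(out_dir, min_free_bytes):
            break  # stop cleanly; the saved state allows a later resume
        url = state["pending"].pop(0)
        for link in fetch(url):
            known = link == url or link in state["done"] or link in state["pending"]
            if not known:
                state["pending"].append(link)
        state["done"].append(url)
        save_state(state_file, state)  # checkpoint after every page
        fetched += 1
    return state
```

Calling `run` again with the same state file picks up where the previous call stopped, so a full disk costs one pause rather than a restart from scratch.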
corinroyal 8 months ago
One time I was trying to create an offline backup of a botanical medicine site for my studies. Somehow I turned off the link depth limit and made it follow off-site links, then forgot about it. A few days later the machine crashed with a full disk from trying to cram as much of the WWW onto it as it could.
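
The two safeguards mentioned above (a depth limit and staying on the original host) can be sketched as a breadth-first traversal. This is only an illustration of the idea, not HTTrack's actual logic. `plan_crawl`, `fake_links`, and the `FAKE_SITE` link graph with its `.example` hosts are made up for the demo, so nothing is fetched from the network.

```python
from collections import deque
from urllib.parse import urlparse


def plan_crawl(start_url, get_links, max_depth=3, stay_on_host=True):
    """Return the URLs a crawl would visit, in breadth-first order."""
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    order = []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not expand this page's links
        for link in get_links(url):
            if link in seen:
                continue
            if stay_on_host and urlparse(link).netloc != start_host:
                continue  # off-site link: skip it
            seen.add(link)
            queue.append((link, depth + 1))
    return order


# Hypothetical in-memory link graph standing in for real pages.
FAKE_SITE = {
    "https://herbs.example/": ["https://herbs.example/a", "https://other.example/"],
    "https://herbs.example/a": ["https://herbs.example/b"],
    "https://herbs.example/b": ["https://herbs.example/c"],
    "https://other.example/": ["https://elsewhere.example/"],
}


def fake_links(url):
    """Look up outgoing links in the fake graph instead of fetching a page."""
    return FAKE_SITE.get(url, [])


print(plan_crawl("https://herbs.example/", fake_links, max_depth=2))
```

With both safeguards on, the plan stays within three pages of the fake site. Passing `stay_on_host=False` and a large `max_depth` lets the traversal wander onto the other hosts, which is the failure mode described in the comment.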
suriya-ganesh 8 months ago
This saved me a ton back in 2015, when I was in college in rural India without Internet. I would download whole websites at a nearby library and read them at home.

I've read py4e, OSTEP, and PG's essays using this.

I am who I am because of HTTrack. Thank you.
jregmail 8 months ago
I also recommend trying https://crawler.siteone.io/ for web copying/cloning.

Real copy of the netlify.com website for demonstration: https://crawler.siteone.io/examples-exports/netlify.com/

Sample analysis of the netlify.com website, which this tool can also provide: https://crawler.siteone.io/html/2024-08-23/forever/x2-vuvb0oi6qxkr-ku79.html
xnx 8 months ago
Great tool. Does it still work for the "modern" web (i.e. now that even simple/content websites have become "apps")?
dark-star 8 months ago
Oh wow, that brings back memories. I used HTTrack in the late 90s and early 2000s to mirror interesting websites from the early internet, over a modem connection (and early DSL).

Good to know they're still around. However, now that the web is much more dynamic, I guess it's not as useful as it was back then.
oriettaxx 8 months ago
I don't get it: the last release was in 2017, while on GitHub I see more releases...

So did the developer of the GitHub repo take over and start updating/upgrading it? Very good!
superjan 8 months ago
I tried the Windows version 2 years ago. The site I copied was our on-prem issue tracker (FogBugz), which we replaced. HTTrack did not work because the site relied too heavily on JavaScript rendering, and I could not figure out how to make it log in. What I ended up doing was embedding a browser (WebView2) in a C# desktop app. You can intercept all the images/CSS and, once the JavaScript rendering is complete, write out the DOM content to an HTML file. Also nice is that you can log in by hand if needed, and you can generate all URLs from code.
chirau 8 months ago
I use it to download sites with layouts that I like and want to use for landing pages and static pages for random projects. I strip all the copy and stuff and leave the skeleton to put my own content in. Most recently link.com, column.com and increase.com. I have neither the time nor the youth to start with all the JavaScript & React stuff.
zazaulola 8 months ago
Can the archive saved by HTTrack Website Copier be opened locally in https://replayweb.page, or do they use different save formats?
Alifatisk 8 months ago
Good ol&#x27; days
subzero06 8 months ago
I use this to double-check which of my web app's folders/files are publicly accessible.
j0hnyl 8 months ago
Scammers love this tool. I see it used in the wild quite a bit.
alberth 8 months ago
I always wonder if this gives false positives for people just using the same WordPress template.
woutervddn 8 months ago
Also known as: static site generator for any original website platform...