Show HN: Percollate – a command-line tool to grab web pages as PDFs

123 points by danburzo, over 6 years ago

8 comments

anonytrary, over 6 years ago
You might want to include some actual pictures of the input and output in the readme. The current examples are just one-line command snippets which aren't as useful to someone who hasn't decided to use the tool yet.
danburzo, over 6 years ago
I’ve been sporadically working on this over the last couple of weeks, and I think it’s now stable enough to get other people’s feedback on it. I got the idea while perusing Simon Wardley’s mapping book-in-progress (https://medium.com/wardleymaps), and I wondered whether I could bundle all the chapters into a decent-looking PDF. (It works pretty well for that purpose.) I also wanted it to be a sample app for gluing things together for the purpose of producing books in the browser.

I’d love it if you gave it a spin; please let me know if you find anything nasty!
dananjaya86, over 6 years ago
How is it different from, let's say:

chrome --headless --disable-gpu --print-to-pdf https://www.google.com/
burtonator, over 6 years ago
Polar has a similar feature if you just want an archive of web pages:

https://getpolarized.io/

We support 'captured' HTML pages. Basically, we fetch the full HTML of the content, store it in a PHZ file (Polar HTML archive), and save that to disk (it's just a zip file with JSON metadata).

The Polar app is an Electron app, so it has full access to render HTML.

We then inject ourselves into the network layer using protocol interceptors, and if you're loading the URL you just captured, we load the content from the PHZ instead of the network.

You can then annotate the content, take notes on it, tag it, and keep it forever without risk of it vanishing.

I use it for important documents that I can't afford to ever lose. For example, the Ethereum whitepapers are in HTML, not PDF. They're also living documents, so I can capture them anytime I want.

HTML files often don't print properly, so this way I can keep them the way they were meant to be seen.
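
For readers wondering what "a zip file with JSON metadata" might look like in practice, here is a minimal sketch in Python. It is not Polar's actual PHZ layout: the entry names `metadata.json` and `content.html`, the metadata fields, and both function names are assumptions made up for illustration.

```python
import io
import json
import zipfile


def write_capture(html, url, title):
    """Pack captured HTML plus JSON metadata into an in-memory zip archive.

    The entry names and metadata fields are hypothetical, not Polar's format.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("metadata.json", json.dumps({"url": url, "title": title}))
        zf.writestr("content.html", html)
    return buf.getvalue()


def read_capture(data):
    """Return (metadata dict, html string) from an archive made by write_capture."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        meta = json.loads(zf.read("metadata.json"))
        html = zf.read("content.html").decode("utf-8")
    return meta, html
```

A viewer that intercepts requests for the captured URL could then call something like `read_capture` and serve the stored HTML instead of going to the network.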
dustingetz, over 6 years ago
Examples please, and can you show me the differences made by the enhancements?
heinrichhartman, over 6 years ago
Just tried it on their GitHub page:

percollate pdf --output p.pdf https://github.com/danburzo/percollate

The font is gigantic and the page tiny. I barely get to the second headline on the first page.

And there is no way to tune this on the command line (yet).
dvfjsdhgfv, over 6 years ago
This made me smile:

> percollate html Not implemented yet
v01d4lph4, over 6 years ago
Nice!