Local PDF Tools – Powered by WebAssembly

227 points by twapi over 4 years ago

21 comments

svat over 4 years ago

Last year I wrote a couple of similar "local" PDF tools that run in the browser with no network requests. Each is just a single HTML file that will work offline:

- https://shreevatsa.net/pdf-pages/ is for extracting pages, inserting blank pages, duplicating or reversing pages, etc.
- https://shreevatsa.net/pdf-unspread/ is for splitting a PDF's "wide" pages (consisting of two-page spreads) in the middle.
- https://shreevatsa.net/mobius-print/ is the earliest of these, and written for a niche use-case: "Möbius printing" of pages, which is printing out an article/paper two-sided in a really interesting order. (I've tried it and love it.)

These don't use WebAssembly, but just use the excellent "pdf-lib" JS library. To keep the file self-contained, I put the whole minified source into a <script> tag at the bottom of the (otherwise hand-written) HTML file.
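
For illustration, the geometry behind splitting a two-page spread down the middle can be sketched in a few lines of Python. This is not the code behind pdf-unspread: the `Box` type, the `split_spread` and `unspread` names, and the "wider than tall" heuristic for detecting a spread are all assumptions made for this sketch, and applying the resulting boxes to a real PDF is left to whatever library is in use.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """A page rectangle in PDF points (hypothetical helper type)."""
    left: float
    bottom: float
    right: float
    top: float


def split_spread(box: Box) -> Tuple[Box, Box]:
    """Split one wide page box into a left half and a right half."""
    mid = (box.left + box.right) / 2
    left_half = Box(box.left, box.bottom, mid, box.top)
    right_half = Box(mid, box.bottom, box.right, box.top)
    return left_half, right_half


def unspread(pages: List[Box], min_aspect: float = 1.0) -> List[Box]:
    """Replace every page wider than min_aspect * height by its two halves.

    The aspect-ratio test is an assumed heuristic, not taken from any tool.
    """
    result: List[Box] = []
    for box in pages:
        width = box.right - box.left
        height = box.top - box.bottom
        if width / height > min_aspect:
            result.extend(split_spread(box))
        else:
            result.append(box)
    return result
```

A landscape box such as `Box(0, 0, 200, 100)` becomes two 100-point-wide boxes, while a portrait page passes through unchanged.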

kickbeak over 4 years ago

Hey, thanks for posting it here. I built this tool and hope you like it. Feel free to look at my source and contribute: https://github.com/jufabeck2202/localpdfmerger

Abishek_Muthian over 4 years ago

Is there a PDF tool which allows compression to a defined file size? Tools like Ghostscript can compress a PDF to different levels of quality by using different settings, but not to a defined file size. I understand that this has to do with the compression algorithm itself and that data can be compressed only up to a certain limit, but what if the target file size is within that limit?

I'm asking this because a user of my problem validation platform wanted a solution for this[1]: websites requiring document upload have a file size limit, and the compressed file often ends up either above the prescribed limit or well below it, thereby losing quality unnecessarily.

[1] 'Reduce document file size to specific size' (I have added the link to it on my profile, since it's my own platform).
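
One way to approximate "compress to a defined size" with a tool that only exposes quality levels is to binary-search the quality setting. The sketch below is a hedged illustration, not a feature of Ghostscript or any other tool: `compress` is a hypothetical callback that takes an integer quality and returns the compressed bytes, and the search assumes output size does not decrease as quality increases.

```python
from typing import Callable, Optional, Tuple


def best_quality_under(
    compress: Callable[[int], bytes],
    max_bytes: int,
    lo: int = 1,
    hi: int = 100,
) -> Optional[Tuple[int, bytes]]:
    """Return the highest quality whose output fits in max_bytes, or None.

    Assumes len(compress(q)) is non-decreasing in q (an assumption of
    this sketch; real compressors may not be strictly monotonic).
    """
    best: Optional[Tuple[int, bytes]] = None
    while lo <= hi:
        mid = (lo + hi) // 2
        data = compress(mid)
        if len(data) <= max_bytes:
            best = (mid, data)   # fits: remember it and try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1         # too large: try a lower quality
    return best


def fake_compress(quality: int) -> bytes:
    """Stand-in compressor for demonstration: 1000 bytes per quality step."""
    return b"x" * (quality * 1000)
```

With a 1 to 100 quality range this needs at most seven compression runs, which keeps the approach practical even when each run is slow.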

brailsafe over 4 years ago

I'd certainly be curious why wasm ends up being 15x slower than the native binary in this case, but it's not insurmountable. All of the major commercial PDF editing suites use wasm + their own C++ based PDF engine to great effect.

The article that this is based on is here, and it's a good read. It seems like it's at least non-trivial to get it working, and I wonder how the process looks for other compiled binaries, having not tried to do that implementation from scratch. https://dev.to/wcchoi/browser-side-pdf-processing-with-go-and-webassembly-13hn

sigvef over 4 years ago

Looks like this thread is all about sharing our own related local browser-based PDF tools. Here’s mine: https://pdftotext.github.io

kc0bfv over 4 years ago

There are a few versions of tools like this, or similar, available. Here's mine: https://kc0bfv.github.io/WASM-PDF-Combiner/

I used existing wasm compiles of PDF tools. This use of wasm is pretty awesome to me - I often end up working on very restricted desktop clients with little customization possible, but they always let me run a browser.

codetrotter over 4 years ago

Convenient if you are on a machine where you can’t install software (corporate computer, school computer, library computer, etc.).

For Linux and macOS computers that you are allowed to install software on, I recommend the pdftk command line tool.

Ubuntu family:
<pre><code>sudo apt install pdftk</code></pre>
macOS with Homebrew:
<pre><code>brew install pdftk-java</code></pre>

yomansat over 4 years ago

Can one easily install such apps as a Chrome app/PWA and deactivate their access to the internet, since they don't need it, so that one can safely merge personal PDFs?

mgm__ over 4 years ago

I created a PDF table extractor tool last year with the same idea that it should be local only. Try it here: https://pdftableutil.possiblenull.com/app/

Also available as a Google Docs addon (still local only): https://workspace.google.com/marketplace/app/pdf_table_importer/646940040599

I had a bad case of scope creep, so the tool can also extract tables from scanned/image PDFs using OpenCV.js and a Tesseract OCR wasm build!

danvk over 4 years ago

I’d love a tool (that’s not Acrobat) to manage comments on PDFs.

naedish over 4 years ago

This is helpful. Generally, if I need to do any PDF manipulation when I'm away from my own machine, I use an Android app - PDF Utils [1].

[1] https://play.google.com/store/apps/details?id=pdf.shash.com.pdfutility

not_knuth over 4 years ago

Does anyone know some good tutorials/explanations for understanding the PDF format at the byte level?

ithkuil over 4 years ago

Does anybody know a simple tool I can use to turn an academic two-column paper into a single-column PDF (so I can read it easily on e-paper like a reMarkable)?

(Ideally I'd like to be able to run such a tool from a browser/phone.)

georgeutsin over 4 years ago

Looking forward to using this tool! Are there plans to make this open source?

Ciantic over 4 years ago

It's great to see more PDF tools.

Many times I just want to clip white margins from PDFs so that they are easier to view on tablets or phones. Most viewers don't have a way to force the clipping of pages, so when you change page the zoom is lost and suddenly all the content is squished to the center.

The last time I looked for CLI programs to do it, approx. five years ago, it was really difficult to find good tools to edit PDFs like that.

It's actually not a trivial task, as sometimes pages have different margins, e.g. odd and even pages have different margins on the folding side of the page.
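
The odd/even margin problem can be handled by computing one crop box per page parity instead of one for the whole document. The Python sketch below only illustrates that idea under stated assumptions: the `Box` type and function names are hypothetical, and the per-page content bounding boxes are assumed to come from some earlier detection step that is not shown here.

```python
from typing import Dict, List, NamedTuple


class Box(NamedTuple):
    """A rectangle in PDF points (hypothetical helper type)."""
    left: float
    bottom: float
    right: float
    top: float


def union(boxes: List[Box]) -> Box:
    """Smallest box that contains every box in the list."""
    return Box(
        min(b.left for b in boxes),
        min(b.bottom for b in boxes),
        max(b.right for b in boxes),
        max(b.top for b in boxes),
    )


def crop_boxes_by_parity(content_boxes: List[Box]) -> Dict[str, Box]:
    """Compute one shared crop box for odd pages and one for even pages.

    content_boxes[0] is page 1, so even list indices are odd page numbers.
    """
    odd_pages = content_boxes[0::2]
    even_pages = content_boxes[1::2]
    result: Dict[str, Box] = {}
    if odd_pages:
        result["odd"] = union(odd_pages)
    if even_pages:
        result["even"] = union(even_pages)
    return result
```

Sharing a crop box within each parity keeps the zoom stable when flipping pages, while still letting the binding-side margin differ between left-hand and right-hand pages.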

travis729 over 4 years ago

For something like this, how do we know that the files are not sent to a server? Am I just trusting the web app? Is there any way to be sure other than having and reading the source?

desmap over 4 years ago

Something I still miss is a free and easy PDF tool that lets you delete, reorder and add pages from multiple PDFs. On Windows there is just Xodo, but its UX is unfortunately subpar. On macOS you have Preview, where the UI is better, but once you are pulling pages from multiple PDFs it can get confusing.

simonmales over 4 years ago

I have this same idea on my to-do list. Great that people are experimenting with webapps that don't send any data!

pabs3 over 4 years ago

Hmm, I think I would just compile the pdfcpu Go source to native code, that might be faster than WebAssembly?

horst_vie over 4 years ago

This is a nice use case for pdfcpu. If you are a pdftk user, give the pdfcpu CLI a spin. It is multi-platform and has some nice features baked in. https://pdfcpu.io/

djrogers over 4 years ago

There was an error merging PDFs. Not very helpful, can you tell me what the error was or how to avoid it?