TechEcho

1 comment

bazzargh almost 10 years ago

The PDFs this produces are simply collections of PNGs, so they won't be accessible. It's always a compromise, though. If you edit the PDF by adding black boxes and removing hidden objects, you may still leak data via the tagged PDF text, which doesn't have to match what's on the page exactly. So converting to PNG isn't a terrible idea, but it would be nice to combine this with something that OCR'd the PNG conversion, e.g.

https://github.com/fritz-hh/OCRmyPDF

(which uses Tesseract under the hood). The other thing this is missing, compared with commercial redactors I've used, is the ability to assist in the redaction, e.g. removing SSNs, phone numbers, and all occurrences of key phrases.
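To make the "assisted redaction" idea concrete, here is a minimal Python sketch of the text-matching half only. It assumes you already have the page text as a string (for example from an OCR pass) and that US-style SSN and phone formats are what you care about. The names `find_redaction_spans` and `redact`, and both regular expressions, are hypothetical illustrations, not part of PDF Redact Tools or OCRmyPDF. Mapping the matched spans back to regions of the page image is a separate problem that this does not attempt.

```python
import re

# Assumed patterns for US-style identifiers; real documents need broader rules.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE_RE = re.compile(
    r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
)


def find_redaction_spans(text, phrases=()):
    """Return sorted, merged (start, end) spans that should be redacted."""
    spans = [m.span() for m in SSN_RE.finditer(text)]
    spans += [m.span() for m in PHONE_RE.finditer(text)]
    for phrase in phrases:
        if phrase:  # skip empty phrases, which would match everywhere
            pattern = re.escape(phrase)
            spans += [m.span() for m in re.finditer(pattern, text, re.IGNORECASE)]
    spans.sort()
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching spans collapse into one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def redact(text, phrases=(), mask="X"):
    """Replace every matched span with mask characters of the same length."""
    out = []
    pos = 0
    for start, end in find_redaction_spans(text, phrases):
        out.append(text[pos:start])
        out.append(mask * (end - start))
        pos = end
    out.append(text[pos:])
    return "".join(out)
```

Something like `redact(page_text, phrases=["Project Falcon"])` would then mask every SSN, phone number, and case-insensitive occurrence of the phrase. A human still needs to review the result, since pattern matching will miss unusual formats and OCR errors.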

First Look Media Releases PDF Redact Tools

1 comment
