Ask HN: APIs to reliably convert any document to HTML?

7 points by grease over 14 years ago
Hi HN,

I've broken my head on this, and haven't found a reliable way to programmatically convert documents (doc, docx, pdf, etc.) to HTML. The only option seems to be OpenOffice running as a server, but it keeps crashing (at least once a day). I would like something that can process thousands of docs per day without crashing. Has anyone here faced this problem or found a solution?

[PS: In case you're wondering why, we run a web app for recruiting (recruiterbox.com) which requires converting resumes to HTML.]
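For what it's worth, one common way to keep a crash-prone converter from taking the whole pipeline down is to run each conversion in its own short-lived process, with a timeout and a couple of retries. Below is a minimal sketch of that idea, not a tested production setup. The function name `run_converter` and its parameters are hypothetical, and the actual converter command line is deliberately left to the caller, since it depends on which tool is installed; the demo at the bottom uses the Python interpreter as a stand-in converter.

```python
import subprocess
import sys
import time
from typing import Optional, Sequence


def run_converter(cmd: Sequence[str], timeout_s: float = 60.0,
                  retries: int = 2, backoff_s: float = 0.0) -> Optional[str]:
    """Run one conversion in its own process; return its stdout, or None on failure.

    `cmd` is whatever command line your converter needs (an assumption here:
    it exits with status 0 on success). A crash or hang only affects the
    child process, never the caller.
    """
    for attempt in range(retries + 1):
        try:
            result = subprocess.run(
                list(cmd), capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            result = None  # hung converter; subprocess.run has already killed it
        except OSError:
            return None  # command could not be started; retrying will not help
        if result is not None and result.returncode == 0:
            return result.stdout
        if attempt < retries:
            time.sleep(backoff_s * (attempt + 1))  # simple linear backoff
    return None


if __name__ == "__main__":
    # Stand-in "converter": the Python interpreter printing a fragment of HTML.
    html = run_converter([sys.executable, "-c", "print('<p>ok</p>')"])
    print(html)
```

A document that still fails after the retries comes back as `None`, so it can be logged and set aside while the rest of the day's queue keeps moving.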

2 comments

OneWhoFrogs over 14 years ago
I've never used it, but the Google Docs API seems to fit your requirements:

http://code.google.com/apis/documents/

It accepts doc, docx, and pdf and can export to HTML. I'm not sure what the API rate limit is, though. The FAQs suggest that it can be raised by using a premier account.
dinedal over 14 years ago
Document conversion is a tricky space for a startup. All the rules are defined by companies who would very much like to see you fail, and code-wise, it's almost the most boring task I can think of.