Show HN: Document OCR for Node.js with GPT Vision

1 point by thesandlord over 1 year ago
GPT Vision is probably the best generic OCR and document parsing tool out there. It can handle invoices, contracts, receipts, and even handwritten notes. However, the API has some limitations that make it hard to work with:

1) No support for JSON output
2) No support for PDFs

llm-document-ocr is a simple Node library that handles these pre- and post-processing steps for you. It converts PDFs into PNGs, crops whitespace around the images, and parses the JSON output.

Hope this saves you some time if you are building your own OCR stack on top of GPT and other LLMs!
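To give a feel for the JSON post-processing step, here is a minimal sketch of pulling a JSON payload out of a free-text model reply. It is not llm-document-ocr's actual code or API: the helper name `parseModelJson` is hypothetical, and it assumes the model either wraps its JSON in a Markdown code fence or embeds a single object or array in surrounding prose.

```typescript
// Hypothetical helper for illustration only; not part of llm-document-ocr.
// Assumption: the reply contains one JSON object or array, possibly wrapped
// in a Markdown code fence and/or surrounded by chatty prose.

// Build the three-backtick fence marker programmatically.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

function parseModelJson(reply: string): unknown {
  // Prefer the contents of a fenced block when the model emitted one.
  const fenced = reply.match(FENCED_BLOCK);
  const candidate = fenced ? fenced[1] : reply;

  // Otherwise fall back to the outermost {...} or [...] span in the text.
  const start = candidate.search(/[{\[]/);
  if (start === -1) {
    throw new Error("no JSON found in model reply");
  }
  const close = candidate[start] === "{" ? "}" : "]";
  const end = candidate.lastIndexOf(close);
  if (end < start) {
    throw new Error("unterminated JSON in model reply");
  }

  // JSON.parse throws a SyntaxError if the extracted span is still malformed.
  return JSON.parse(candidate.slice(start, end + 1));
}
```

A caller would pass the raw text of the model's reply and validate the returned value against whatever schema it expects (invoice fields, receipt totals, and so on), since the return type is deliberately `unknown`.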

1 comment

billconan over 1 year ago
Is it possible to extract the citation list of a PDF paper using this tool?