TechEcho

I’m one of the co-founders of Doctly AI. I wanted to share our story.

We didn’t originally set out to build a PDF-to-Markdown parser. It all started when we were building a RAG solution for a company that deals with regulatory agencies. All of their data was in PDFs, and as is apparently common with lawyers, they like to print and scan documents to make it hard on their counterparts. These documents contained complex tables that barely make sense and are rotated, with handwriting mixed in between. Many pages are number ruled and potentially rotated.

We spent a lot of time trying to get clean data from the PDFs. Most OCR tools spit out garbage in their output and can also lose formatting and document hierarchy. We also wanted to reference the original documents within the RAG output, which created additional challenges. The existing solutions we looked at were unable to give us the quality we were looking for. Garbage in, garbage out.

“Let’s just write our own. How hard could this be?” we thought. Well, it is definitely a difficult problem. AI vision models can do pretty well, but different ones excel at different things, from language support to table and chart conversion. Cost also becomes an issue.

We settled on an “agentic” approach where we detect features/layout within the document and route them to different models, and in some cases to traditional OCR, achieving very high quality conversions while keeping the costs at bay. After comparing our solution with existing solutions and APIs, we easily beat them on the quality of their ‘high-res’ offerings while matching or beating their ‘low-res’ costs.

We still have work to do, of course. We believe we can increase the quality and reduce the costs by a lot, but you have to start somewhere.

We’re offering free credits so you can try it out for yourself! Check us out at Doctly.ai. No credit card required. Let us know how it helps with your document processing!

Let me know if you need more credits for testing it out.

Would love to hear your feedback, and leads are always appreciated :)

You can reach me at ali at doctly.ai
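
For readers curious what "detect features/layout and route them to different models" can look like in code, here is a minimal sketch. It is not Doctly's actual pipeline: the `Region` type, the layout labels, and all three converter functions are hypothetical stand-ins, and the "models" are plain string functions so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Region:
    """One detected piece of a page. All field names are hypothetical."""
    kind: str      # layout label, e.g. "text", "table", "handwriting"
    payload: str   # stand-in for the cropped image of that region


def traditional_ocr(region: Region) -> str:
    # Placeholder for a cheap classic OCR pass.
    return region.payload


def table_model(region: Region) -> str:
    # Placeholder for a vision model that handles tables well.
    cells = region.payload.split(",")
    header = "| " + " | ".join(cells) + " |"
    divider = "| " + " | ".join("---" for _ in cells) + " |"
    return header + "\n" + divider


def handwriting_model(region: Region) -> str:
    # Placeholder for a vision model that handles handwriting well.
    return "*" + region.payload + "*"


# Routing table: layout label -> converter (assumed labels, not a real API).
ROUTES: Dict[str, Callable[[Region], str]] = {
    "text": traditional_ocr,
    "table": table_model,
    "handwriting": handwriting_model,
}


def convert_page(regions: List[Region]) -> str:
    """Route each region by its label; unknown labels fall back to plain OCR."""
    parts = [ROUTES.get(r.kind, traditional_ocr)(r) for r in regions]
    return "\n\n".join(parts)
```

The point of the dispatch table is cost control: cheap regions go to the cheap path, and only the regions that need a specialised model pay for one.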

1 comment

freakynit 7 months ago

Just tried it on this paper: https://arxiv.org/pdf/2010.11929

Pretty good results. Just wanted to know why the image URLs are like this: """ ![Vision Transformer (ViT) Diagram](image-url) """

I mean, it should give the actual URL, shouldn't it?

Other than that, it seems to work really well. Congrats.



Show HN: Doctly AI – Accurate AI-Powered PDF to Markdown Parser
