Show HN: Surya – OCR and line detection in 93 languages

11 points by vikp over 1 year ago

1 comment

vikp over 1 year ago
Hi HN - I released an open source OCR model yesterday that supports 93 world languages. It builds on a text line detector I created earlier.

In my benchmarks, it's more accurate than tesseract in every language except one. (see repo for benchmarking method)

Since it can run on GPU, speed is about equal to tesseract (when cost-matched with a 1x lambda A6000 vs 28 DigitalOcean CPU cores).

It's built using a modified donut architecture - I added an MoE layer, GQA for faster decoding, and UTF-16 decoding (can represent any character, and faster than UTF-8 since you can combine adjacent bytes.)

I theorized that character-level decoding would be an optimal compute allocation, and that a large embedding matrix (relative to UTF-8 decoding) would store language-specific information.

I trained it using 4x A6000s for about 2 weeks.

You can run surya via Python API, from the CLI, or via an interactive app in the repo.
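To make the UTF-16 vs UTF-8 point concrete, here is a minimal sketch that counts how many decoding steps a string would need under each scheme, assuming one token per UTF-8 byte versus one token per UTF-16 code unit. This is not Surya's actual tokenizer; the function names (`utf8_units`, `utf16_units`, `decode_utf16_units`) are hypothetical and only the Python standard library is used.

```python
def utf8_units(text: str) -> list[int]:
    """One token per UTF-8 byte (vocabulary of 256 values)."""
    return list(text.encode("utf-8"))


def utf16_units(text: str) -> list[int]:
    """One token per UTF-16 code unit (vocabulary of 65,536 values)."""
    raw = text.encode("utf-16-le")  # little-endian, no byte-order mark
    # Combine each pair of adjacent bytes into a single 16-bit integer.
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def decode_utf16_units(units: list[int]) -> str:
    """Inverse of utf16_units: turn code units back into text."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    return raw.decode("utf-16-le")


if __name__ == "__main__":
    # Hypothetical sample strings, chosen only for illustration.
    samples = {"English": "hello", "Chinese": "你好", "Emoji": "😀"}
    for name, text in samples.items():
        print(f"{name}: {len(utf8_units(text))} UTF-8 tokens, "
              f"{len(utf16_units(text))} UTF-16 tokens")
```

Under these assumptions, ASCII text needs the same number of steps either way, while most non-Latin scripts need fewer steps with UTF-16 units, at the cost of a much larger embedding matrix (65,536 rows instead of 256). Characters outside the Basic Multilingual Plane, such as most emoji, take two UTF-16 units (a surrogate pair).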