Show HN: Surya – OCR and line detection in 93 languages

11 points by vikp over 1 year ago

1 comment

vikp over 1 year ago
Hi HN - I released an open source OCR model yesterday that supports 93 world languages. It builds on a text line detector I created earlier.

In my benchmarks, it's more accurate than tesseract in every language except one. (see repo for benchmarking method)

Since it can run on GPU, speed is about equal to tesseract (when cost-matched with a 1x lambda A6000 vs 28 DigitalOcean CPU cores).

It's built using a modified donut architecture - I added an MoE layer, GQA for faster decoding, and UTF-16 decoding (can represent any character, and faster than UTF-8 since you can combine adjacent bytes.)

I theorized that character-level decoding would be an optimal compute allocation, and that a large embedding matrix (relative to UTF-8 decoding) would store language-specific information.

I trained it using 4x A6000s for about 2 weeks.

You can run surya via Python API, from the CLI, or via an interactive app in the repo.
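
To make the UTF-16 point concrete, here is a minimal sketch (not surya's actual tokenizer) that counts how many decoding steps a string would need if each step emitted one UTF-8 byte versus one 16-bit UTF-16 code unit. The function names are hypothetical and used only for illustration; the assumption is that one decoder step produces exactly one byte or one code unit.

```python
# Illustrative only: compares sequence lengths under byte-level UTF-8
# decoding and code-unit-level UTF-16 decoding. This is NOT surya's code.

def utf8_steps(text: str) -> int:
    """Number of decoder steps if each step emits one UTF-8 byte."""
    return len(text.encode("utf-8"))


def utf16_units(text: str) -> list[int]:
    """Split text into 16-bit UTF-16 code units (little-endian, no BOM)."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def utf16_steps(text: str) -> int:
    """Number of decoder steps if each step emits one UTF-16 code unit."""
    return len(utf16_units(text))


def decode_units(units: list[int]) -> str:
    """Rebuild the original string from a list of UTF-16 code units."""
    return b"".join(u.to_bytes(2, "little") for u in units).decode("utf-16-le")


if __name__ == "__main__":
    for sample in ["hello", "日本語", "😀"]:
        print(sample, utf8_steps(sample), utf16_steps(sample))
```

ASCII text needs the same number of steps either way, while CJK text needs three UTF-8 bytes but only one UTF-16 code unit per character. Characters outside the Basic Multilingual Plane, such as most emoji, take two code units (a surrogate pair) in UTF-16 and four bytes in UTF-8. The trade-off is vocabulary size: up to 65,536 possible code units instead of 256 byte values, which matches the post's remark about a larger embedding matrix.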