Ask HN: How to extract structured information from captured audio?

1 point by sandreas 13 days ago
Hey HN,

I would like to extract structured information from captured audio on a device that is not too expensive (a small LLM would be an option; I have an old Nvidia 1660 Super with 6 GB of VRAM).

OpenAI Whisper could be used to get the audio contents as text, but I don't really know how to reliably extract the information in a structured way. There is always a "purpose", which is selected from, let's say, 10 possible purposes, and "required data", which depends on the purpose and consists of key-value pairs that also have predefined values.

An example (spoken text):

    Please apply for leave from 1st November to 8th november.

Result (structured data):

    { purpose: "apply for leave", data: { start: "2025-11-01", end: "2025-11-08" } }

What are my options for doing this reliably, so that different purposes with different data can be matched using a "best match" approach?
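To make the target format above concrete, here is a minimal Python sketch of a per-purpose schema plus a validator that whatever does the extraction (LLM or rules) could be checked against. The names `SCHEMAS`, `is_iso_date`, and `validate`, and the second purpose "report sick", are hypothetical and only there for illustration; only "apply for leave" with `start`/`end` comes from the example.

```python
from datetime import date


def is_iso_date(value):
    """Return True if value is an ISO date string such as '2025-11-01'."""
    try:
        date.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


# Hypothetical schema: each purpose maps its required keys to a value check.
SCHEMAS = {
    "apply for leave": {"start": is_iso_date, "end": is_iso_date},
    "report sick": {"start": is_iso_date},  # invented second purpose
}


def validate(result):
    """Return a list of problems; an empty list means the result fits its schema."""
    purpose = result.get("purpose")
    schema = SCHEMAS.get(purpose)
    if schema is None:
        return [f"unknown purpose: {purpose!r}"]
    data = result.get("data", {})
    problems = []
    for key, check in schema.items():
        if key not in data:
            problems.append(f"missing field: {key}")
        elif not check(data[key]):
            problems.append(f"invalid value for {key}: {data[key]!r}")
    for key in data:
        if key not in schema:
            problems.append(f"unexpected field: {key}")
    return problems
```

A result that fails validation can be retried or routed to a human, which keeps the "reliable" part independent of how the text was interpreted.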

1 comment

sargstuff 13 days ago
Related OpenAI forum topic that covers similar issues [0].

Old school: mark each 'paragraph'/sentence, use regular expressions to strip out miscellaneous info (using linguistic 'typing', i.e. part-of-speech categories such as noun, verb, etc.), then dump the relevant remaining info in JSON/delimited format and normalize the data (e.g. "1st November" to 11/01). Multi-pass awk scripts, Perl, and Icon are languages with appropriate in-language support. Use regular expressions/statistics to detect 'outliers' and mark data for human review.

Multi-pass awk would require a codex of phrases related to each delimited/JSON tag. So on the first pass, identify phrases (perhaps also spell-correct them) and categorize the phrases related to a given delimited field (via human intervention); then rescan, check for 'outliers'/conflicting normalizations, and have the script make corrections per the human annotations.

Note: normalized phonetic annotations are a bit easier to handle than common dictionary spelling.

[0]: https://community.openai.com/t/summarizing-and-extracting-structured-data-from-long-text/453078/10
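The rule-based approach described above could be sketched roughly as follows, in Python rather than awk/Perl/Icon. This is only an illustration under stated assumptions: the phrase "codex" in `PURPOSE_PATTERNS` is invented, dates are normalized to ISO format to match the question's example, the year is passed in because the spoken text does not contain one, and the `needs_review` flag stands in for the "mark data for human review" step. It handles only a start/end date pair.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Hypothetical phrase codex: one regular expression per purpose.
PURPOSE_PATTERNS = {
    "apply for leave": re.compile(r"\b(apply for leave|take leave|vacation)\b", re.I),
    "book meeting room": re.compile(r"\b(book|reserve)\b.*\broom\b", re.I),
}

# Matches spoken dates such as "1st November" or "8th of november".
DATE_RE = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)?\s+(?:of\s+)?(" + "|".join(MONTHS) + r")\b",
    re.I,
)


def extract(text, year):
    """Pick a purpose and a start/end date pair; flag anything ambiguous."""
    purposes = [name for name, pattern in PURPOSE_PATTERNS.items()
                if pattern.search(text)]
    dates = []
    for day, month in DATE_RE.findall(text):
        try:
            dates.append(date(year, MONTHS[month.lower()], int(day)).isoformat())
        except ValueError:
            pass  # impossible date such as "31st February"; caught by the count check
    result = {
        "purpose": purposes[0] if len(purposes) == 1 else None,
        "data": dict(zip(("start", "end"), dates)),
    }
    # Outlier check: no single clear purpose, or not exactly two valid dates.
    result["needs_review"] = result["purpose"] is None or len(dates) != 2
    return result
```

Calling `extract("Please apply for leave from 1st November to 8th november.", 2025)` yields the purpose "apply for leave" with `start` "2025-11-01" and `end` "2025-11-08", and `needs_review` set to `False`.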