Simple Speech-to-Text on the '10 Cents' CH32V003 Microcontroller

141 points by victor82 12 months ago

11 comments

sen 12 months ago
It's really cool how people are taking these tiny cheap MCUs and making them do fun things for hobbyists. There's nothing better than a project with zero real-world use case that's done just because it was a challenge.

E.g.:

Making the CH32V003 programmable via USB: https://www.youtube.com/watch?v=j-QazXghkLY

CH32V003 "Super-Cluster": https://www.youtube.com/watch?v=lh93FayWHqw

Powering a Nixie Tube from USB with a CH32V003: https://www.youtube.com/watch?v=-4d3PgEXhdY

(A good rule in life in general is to just always watch CNLohr and Bitluni if you're into "useless but amazing hardware projects")

jononor 12 months ago
Minor nitpick/clarification. As it stands, this detects a fixed, small vocabulary of words; it is not open-ended speech-to-text covering an entire language. This is also called speech command recognition or keyword spotting, which is already impressive and useful. General STT on this grade of hardware would be an amazing feat!
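
To make the distinction concrete, here is a minimal sketch of closed-vocabulary keyword spotting: a feature vector is compared against one stored template per word, and the answer is always one of those words or "unknown". The function names, the two-dimensional feature vectors, and the rejection threshold are all hypothetical illustrations, not taken from the project discussed here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def spot_keyword(features, templates, reject_above=None):
    """Return the vocabulary word whose template is nearest to `features`.

    `templates` maps each word to a reference feature vector (hypothetical
    values here). If `reject_above` is set and even the best match is farther
    than that, return None, meaning "not a word in the vocabulary".
    """
    best_word, best_dist = None, float("inf")
    for word, template in templates.items():
        dist = squared_distance(features, template)
        if dist < best_dist:
            best_word, best_dist = word, dist
    if reject_above is not None and best_dist > reject_above:
        return None
    return best_word


# Toy templates, purely illustrative; real ones would be derived from audio.
TEMPLATES = {"zero": [0.0, 0.0], "one": [1.0, 1.0], "two": [2.0, 0.5]}

if __name__ == "__main__":
    print(spot_keyword([0.9, 1.2], TEMPLATES))
    print(spot_keyword([10.0, 10.0], TEMPLATES, reject_above=1.0))
```

The classifier can never emit a word outside `TEMPLATES`, which is what separates this kind of system from general speech-to-text.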

kragen 12 months ago
this is exciting! it's still at prototype stage: 'getting about 90% accuracy [distinguishing between the spoken digits 'zero' to 'nine',] with the code as it stands.'

i wonder if modern continuous optimization algorithms could yield a neural network that would do better than this mfcc approach at, perhaps, even lower computational cost

they seem to have gotten more expensive lately, though (11.83¢ in quantity 500), and lcsc is out of stock on the ch32v003. they only have in stock ch32v203 and up, which costs 37.5¢. https://www.lcsc.com/products/Microcontroller-Units-MCUs-MPUs-SOCs_11329.html?keyword=ch32v

digi-key, as usual, doesn't list the part at all
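
For readers unfamiliar with the MFCC approach mentioned above, the following is a rough floating-point sketch of the usual pipeline for one audio frame: window, power spectrum, triangular mel filterbank, log, then a DCT. It is not the project's code. The frame length, filter count, coefficient count, and all function names are assumptions chosen for illustration, and a part with a couple of kilobytes of RAM would more plausibly use fixed-point math and an FFT rather than the naive DFT shown here.

```python
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame, then take a naive DFT (bins 0..N/2)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * sample_rate / n_fft  # centre frequency of DFT bin k
            if lo <= f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f <= hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_energies(frame, sample_rate, num_filters=10):
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Small floor avoids log(0) for filters that see no energy.
    return [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]


def mfcc(frame, sample_rate, num_filters=10, num_coeffs=6):
    """DCT-II of the log mel energies; keep the first `num_coeffs` terms."""
    energies = log_mel_energies(frame, sample_rate, num_filters)
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(energies))
            for c in range(num_coeffs)]


if __name__ == "__main__":
    sr = 8000  # assumed sample rate
    tone = [math.sin(2 * math.pi * 1000 * i / sr) for i in range(64)]
    print(mfcc(tone, sr))
```

A sequence of such coefficient vectors, one per frame, is the kind of input a small classifier or template matcher would then consume.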

jononor 12 months ago
Really nice project! Great care is taken in optimized audio feature extraction, very cool to see. I am working on a very similar project[1], using the Puya PY32. I opted for that chip over the CH32 since it has DMA (which simplifies efficient ADC input at audio rates) and 1 kB more RAM, for a couple of cents more. I have already written about some of the hardware constraints on low-cost audio, and am getting to the audio DSP/ML in the coming months.

1. https://hackaday.io/project/194511-1-dollar-tinyml

buescher 12 months ago
I wonder how this performs compared to the "voice recognition" VCP200 chip sold by Radio Shack in the eighties (maybe early nineties?). https://21stdigitalhome.blogspot.com/2013/06/vcp200-voice-recognition-ic.html

It would also be interesting to know if Voice Control Products ever had a real design win.

I gather the VCP200 was a mask-programmed M6804 microcontroller. The M6804 was a strange and obscure beast, apparently a cost-reduced, internally serial ("1-bit"), partial reimplementation of the M6805, which was one of the first Motorola 8-bit microcontrollers based on the 6800. Max bus speed of 2.75 MHz, with an instruction cycle time of 44 microseconds. 32 bytes of RAM and 1K of mask-programmed ROM. No ADC. http://www.bitsavers.org/components/motorola/6804/M6804_MCU_Manual_Sep85.pdf

One should be able to do better with just about any modern microcontroller. Then again, for all I know the VCP200 was not fit for even the modest tasks (looks like toy/novelty/hobbyist) it was marketed for back then.

hales 12 months ago
Is there a recorded demo? Reading about speech-to-text is different from hearing it.

watersb 12 months ago
About 10 years ago, I used a basic flip phone, vendor-locked to a $15/month Verizon plan.

The Walmart page for a similar device is still up at

https://www.walmart.com/ip/Verizon-Wireless-Samsung-Gusto-3-128MB-Prepaid-Smartphone-Black/36771424

Among other things, it had limited speech recognition: you could say "Call" followed by a name, and it would match that against the address book on the device.

We live in strange times.

londons_explore 12 months ago
Projects like this really open the door to coin-sized devices that can record months of audio from a tiny battery.

You can imagine employers who might want a record of everything said on their premises, for example.

londons_explore 12 months ago
If you uploaded some training data somewhere, perhaps with some links to simulators, you might get a crowd of people code-golfing this to maximize accuracy.

countvonbalzac 12 months ago
What's the minimum-spec chip you would need to run the smallest Whisper model (looks like that's 39M parameters)?
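
A back-of-the-envelope sketch of the storage side of that question: weight memory is roughly the parameter count times the bytes per parameter. The 39M figure is the one quoted above. The CH32V003 sizes (16 KB flash, 2 KB SRAM) are not stated in this thread and are assumed here for comparison, and the estimate ignores activations, code size, and runtime buffers entirely.

```python
# Parameter count quoted in the comment above.
WHISPER_SMALLEST_PARAMS = 39_000_000

# Assumed CH32V003 memory sizes (not stated in the thread).
CH32V003_FLASH_BYTES = 16 * 1024
CH32V003_SRAM_BYTES = 2 * 1024

# Bytes needed per parameter at a few common precisions.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_bytes(params, bytes_per_param):
    """Storage needed for the weights alone."""
    return int(params * bytes_per_param)


def footprint_table(params=WHISPER_SMALLEST_PARAMS):
    """Map each precision name to the weight storage it would need, in bytes."""
    return {name: weight_bytes(params, b) for name, b in BYTES_PER_PARAM.items()}


if __name__ == "__main__":
    for name, size in footprint_table().items():
        ratio = size / CH32V003_FLASH_BYTES
        print(f"{name:8s} {size / 1e6:7.1f} MB  (~{ratio:,.0f}x the assumed flash)")
```

Even at 4 bits per weight, the weights alone would be in the tens of megabytes, so under these assumptions the answer is a much larger class of chip, probably with external memory.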

pcdoodle 12 months ago
90% accuracy on 10 digits is pretty disappointing, but cool project.