TechEcho

Year of the Voice – Chapter 2: Let's talk

199 points by balloob about 2 years ago

7 comments

balloob about 2 years ago
Founder of Home Assistant here. Let me know if anyone has any questions.

Edit: if you want to stay in the loop on the work we're doing, subscribe to our free monthly newsletter @ https://building.open-home.io/
pwpw about 2 years ago
A local voice assistant is the last missing link in my entirely local smart home setup, so this is exciting news. I would love to convert a Google Home Mini that I have on hand to use with this, but my understanding is that the hardware is too locked down for tinkering.

I love the VoIP integration shown off that can hook up to an old phone. One of my guilty pleasures is using peak forms of technology from the 20th century, when things were more analog. It could be a lot of fun to bring an old phone into the mix to complement my turntable and PVM.
follower about 2 years ago
The most exciting thing about Home Assistant's "Year of the Voice", for me, is that it is apparently enabling/supporting @synesthesiam's continued phenomenal contributions to the FLOSS off-line voice synthesis space.

The quality, variety & diversity of voices that synesthesiam's "Larynx" TTS project (https://github.com/rhasspy/larynx/) made available completely transformed the Free/Open Source Text To Speech landscape.

In addition, "OpenTTS" (https://github.com/synesthesiam/opentts) provided a common API for interacting with multiple FLOSS TTS projects, which showed great promise for actually enabling "standing on the shoulders of" rather than re-inventing the same basic functionality every time.

The new "Piper" TTS project mentioned in the article is the apparent successor to Larynx and, along with the accompanying LibriTTS/LibriVox-based voice models, brings to FLOSS TTS something it's never had before:

* Too many voices! :)

Seriously, the current LibriTTS voice model version has 900+ voices (of varying quality levels). How do you even navigate that many?![0]

And that's not even considering the even higher quality single-speaker models based on other audio recording sources.

Offline TTS, while immensely valuable for individuals, doesn't seem to be an attractive domain for most commercial entities due to the lack of lock-in/telemetry opportunities, so I was concerned that we might end up missing out on further valuable contributions from synesthesiam's specialised skills & experience due to financial realities & the human need for food. :)

I'm glad we instead get to see what happens next.

[0] See my follow-up comment about this.
coder543 about 2 years ago
> On a Raspberry Pi 4, Piper can generate 2 seconds of audio with only 1 second of processing time.

> On a Raspberry Pi 4, voice commands can take around 7 seconds to process with about 200 MB of RAM used.

Have you looked into supporting something like the Coral Accelerator[0], which can drastically speed up machine learning inference on a Raspberry Pi?[1]

It used to be available for $60, but it is hard to find in stock at the moment except for way over MSRP.

[0]: https://coral.ai/products/accelerator

[1]: https://www.hackster.io/news/benchmarking-machine-learning-on-the-new-raspberry-pi-4-model-b-88db9304ce4
glenngillen about 2 years ago
It's been a few years since I've been down this path, but last time I went exploring, one of the main challenges was getting decent hardware. The microphone array on something like an Echo at the time was far better than anything I seemed to be able to achieve without buying into the Amazon or Google ecosystems.

Is there better consumer stuff available now?
pabs3 about 2 years ago
Is there a more open alternative to Whisper?
boringuser2 about 2 years ago
Eh, call me a naysayer, but nobody cared about AI agents because they largely weren't useful until ~GPT 3.5, and Alpaca LoRA 7B is about as useful as a vestigial nipple and requires state-of-the-art hardware to run locally.

I'm also going to take the opportunity to poo-poo this pet peeve:

> More powerful CPUs, such as the Intel Core i5, can generate 17 seconds of audio in the same amount of time.

Oh, really?

...Intel Core i5 brand microprocessors... Introduced in 2009...

About as accurate as saying "users of four-wheeled vehicles" when carriages were still around.

At the very least, you need to provide the node.