Show HN: Aqua Voice 2 – Fast Voice Input for Mac and Windows

140 points by the_king about 1 month ago

Hey HN - It’s Finn and Jack from Aqua Voice (https://withaqua.com). Aqua is fast AI dictation for your desktop and our attempt to make voice a first-class input method.

Video: https://withaqua.com/watch

Try it here: https://withaqua.com/sandbox

Finn is uber dyslexic and has been using dictation software since sixth grade. For over a decade, he’s been chasing a dream that never quite worked: using your voice instead of a keyboard.

Our last post (https://news.ycombinator.com/item?id=39828686) about this seemed to resonate with the community, though it turned out that version of Aqua was a better demo than product. But it gave us (and others) a lot of good ideas about what should come next.

Since then, we’ve remade Aqua from scratch for speed and usability. It now lives on your desktop, and it lets you talk into any text field: Cursor, Gmail, Slack, even your terminal.

It starts up in under 50ms, inserts text in about a second (sometimes as fast as 450ms), and has state-of-the-art accuracy. It does a lot more, but that’s the core.

We’d love your feedback, and if you’ve got ideas for what voice should do next, let’s hear them!

27 comments

idk1 about 1 month ago

I’ve been using this for some time and I have to say it is fantastic. I’m intentionally not writing this with Aqua but by hand, and it is taking so much longer. This to me feels like what Apple Intelligence could be; it is so much better than the stuff all of big tech is doing. For example, if you tell Siri voice dictation to go back and delete something, Siri will just write out “go back and delete something”. Likewise, if you tell Siri to go back and spell a name differently, all Siri will do is write out the letters you said. Honestly, for voice dictation software it feels like travelling to another planet in terms of improvement.

niel about 1 month ago

Real-time text output à la Apple Dictation with the accuracy of Whisper is something I've been looking for recently - I'll definitely give Aqua a spin.

MacWhisper [0] (the app I settled on) is conspicuously missing from your benchmarks [1]. How does it compare?

[0]: https://goodsnooze.gumroad.com/l/macwhisper

[1]: https://withaqua.com/blog/benchmark-nov-2024

aylmao about 1 month ago

This is super impressive, great job!

Side-comment on something this made me think of (again): tech builds too much for tech. I've lived in the Bay before, so I know why this happens. When you're there, everyone around you is in tech, your girlfriend is in tech, you go to parties and everyone invariably ends up talking about work, which is tech. Your frustrations are with tech tools and so are your peers', so you're constantly thinking about tech solutions applicable to tech's problems.

This seems very much marketed to SF people doing SF things ("Cursor, Gmail, Slack, even your terminal"). I wonder how much effort has gone into making this work with code editors or the terminal, even though I doubt this would be a big use-case for this software if it ever became generally popular. I'd imagine the market here is much larger in education, journalism, film, accessibility, even government. Those are much more exciting demos.

fxtentacle about 1 month ago

This looks like it'll slurp up all your data and upload it into a cloud. Thanks, no. I want privacy, offline mode and source code for something as crucial to system security as an input method.

"we also collect and process your voice inputs [..] We leverage this data for improvements and development [..] Sharing of your information [..] service providers [..] OpenAI" https://withaqua.com/privacy

jrvarela56 about 1 month ago

Feedback: I use MacWhisper, and the Tiny WhisperKit model (English only) is way faster than any cloud service on my M1 MacBook Pro.

I’d say local is necessary for a delightful product experience, and the added bonus is that it ticks the privacy box.

alxlu about 1 month ago

I’ve been using this for a while now and I really enjoy it. I ran into a semi-obscure bug and emailed them and they basically fixed it the same day.

I do wish there was a mobile app though (or maybe an iOS keyboard). It would also be nice to be able to have a separate hotkey you can set up to send the output to a specific app (instead of just the active one).

rkagerer about 1 month ago

You mentioned it "lives on your desktop". How does licensing work, and can you install and use it on a machine without internet access?

rickydroll about 1 month ago

I've been using Aqua since it was announced on HN. I've survived the teething pains by using a mixture of Aqua and Dragon, depending on what I was doing. With this new Windows app, I've given up using Dragon for anything.

Things I've learned are:

1. It works better if you're connected by Ethernet than by Wi-Fi.
2. It needs to have a longer recognition history because sometimes you hit the wrong key to end a recognition session, and it loses everything.
3. Besides the longer history, a debugging mode that records all the characters sent to the dictation box would be useful. Sometimes, I see one set of words, blink, and then it's replaced with a new recognition result. Capturing would be useful in describing what went wrong.
4. There should be a way to tell us when a new version is running. Occasionally, I've run into problems where I'm getting errors, and I can't tell if it's my speaking, my audio chain, my computer, the network, or the app.
5. Grammarly is a great add-on because it helps me correct mis-speakings and odd little errors, like too many spaces caused by starting and stopping recognition.

When Dragon Systems went through bankruptcy court, a public benefits corporation bid for the core technology because it recognized that Dragon was a critical tool for people with disabilities to function in a digital world.

In my opinion, Aqua has reached a similar status as an essential tool. Well, it doesn't fully replace Dragon for those who need command and control (yet). The recognition accuracy and smoothness are so amazing that I can't envision returning to Dragon Systems without much pain. The only thing worse would be going back to a keyboard.

Aqua Guys, don't fuck it up.

replete about 1 month ago

Product/UI looks good. Nice job. I would pay for a completely offline version of this; cloud voice data is a non-starter for me though, unfortunately.

willwade about 1 month ago

The real market you need to go hard on is the assistive tech market. You know the biggest companies in this space are those solving problems for dyslexia, where govt grants in e.g. the UK fund pretty much all their work? I had an Access to Work assessment and they recommend stuff from Texthelp like sweets. It’s then paid for by the government following these assessments. But it’s crap. It literally is a crap tool for ADHD or dyslexia because these users literally CAN'T remember or deal with barriers like learning how to dictate correctly. Aqua Voice solves this. I’m your biggest fan. I recommend it in my AT assessments all the time :)

adamesque about 1 month ago

I was very delighted by Aqua v1, which felt like magic at first.

But I’ve noticed/learned that I can’t dictate written content. My brain just does not work that way at all: as I write I am constantly pausing to think, to revise, etc., and it feels like a completely different part of my brain is engaged. Everything I dictated with Aqua I had to throw away and rewrite.

Has anyone had similar problems, and if so, had any success retraining themselves toward dictation? There are fleeting moments where it truly feels like it would be much faster.

SCdF about 1 month ago

I currently use Talon, which I note is not in your benchmarks.

I can't find any documentation on how Aqua works, or how it compares, so I'm not sure whether it's meant to be a replacement for / competitor to Talon. What are you configuring? How are you telling it that you like "genz" style in Slack? Can I create custom configurations / macros?

One thing I like about Talon is it's not magic. Which maybe is not what you're going for. But I am giving it explicit commands that I know it will understand (if it understands my accent obvs), as opposed to guessing at a vague natural-language sentence and hoping that an LLM will work it out. Which means it feels like something I can actually become fast with, and build up muscle memory for.

Also, it's completely offline, so I can actually run it on a work computer without my security folks freaking out.

TylerE about 1 month ago

I will have to look into this. I am currently in the process of going on disability as I cannot work due to (amongst other things) carpal and cubital tunnel in both arms.

oulipo about 1 month ago

Interesting!

A nice open-source alternative is VoiceInk, check it out: https://github.com/Beingpax/VoiceInk

Do you also plan to open-source part of your platform?

roland_kovacs about 1 month ago

Hey guys, great idea! Most apps already have voice recognition. Have you thought about serving a niche where this feature doesn't exist yet? Also, the data protection part is unclear to me; I don't want everything uploaded to a cloud where I don't know what is happening with it.

somberi about 1 month ago

I use MacWhisper and it works well enough for me to stop looking for options. (MBA M2 24GB RAM - Large V3 English)

I wouldn't feel comfortable if someone were looking over my shoulder while I'm typing at a coffee shop.

I am not your customer.

bemmu about 1 month ago

I would like to try this, but I use Synergy to use two computers with the same keyboard. I have Aqua Voice now on the server computer, would be great if I could input text to the client computer using it as well.

vladstudio about 1 month ago

For the sake of alternatives, I've had good experience with https://tryvoiceink.com/

qntmfred about 1 month ago

I use the built-in voice typing in Windows and am pretty happy with it. How would you say this compares (presuming most of your comparisons are Mac-centric)?

aminsadeghi about 1 month ago

Is there going to be Linux support at some point?

tomblomfield about 1 month ago

I recently started using Aqua and it's great. The team really improved the latency in the last few weeks.

hu3 about 1 month ago

How does it compare to https://wisprflow.ai ?

btw, grats!

bklyn11201 about 1 month ago

With music playing on YouTube in Chrome and AirPods in, the desktop app and the sandbox/demo just don't work.

iAMkenough about 1 month ago

Tried viewing the Pricing link in the footer, but it requires a Google account to view.

waveringana about 1 month ago

Will we ever see local, open-source models? They are very important for accessibility, a use case this product could fit into but won't, because it is cloud-based (and proprietary).

hasperdi about 1 month ago

Can anyone recommend a good dictation app for Linux?

gnfedhjmm2 about 1 month ago

Broken on mobile
