Hey HN - It’s Finn and Jack from Aqua Voice (<a href="https://withaqua.com">https://withaqua.com</a>). Aqua is fast AI dictation for your desktop and our attempt to make voice a first-class input method.<p>Video: <a href="https://withaqua.com/watch">https://withaqua.com/watch</a><p>Try it here: <a href="https://withaqua.com/sandbox">https://withaqua.com/sandbox</a><p>Finn is uber dyslexic and has been using dictation software since sixth grade. For over a decade, he’s been chasing a dream that never quite worked — using your voice instead of a keyboard.<p>Our last post (<a href="https://news.ycombinator.com/item?id=39828686">https://news.ycombinator.com/item?id=39828686</a>) about this seemed to resonate with the community - though it turned out that version of Aqua was a better demo than product. But it gave us (and others) a lot of good ideas about what should come next.<p>Since then, we’ve remade Aqua from scratch for speed and usability. It now lives on your desktop, and it lets you talk into any text field -- Cursor, Gmail, Slack, even your terminal.<p>It starts up in under 50ms, inserts text in about a second (sometimes as fast as 450ms), and has state-of-the-art accuracy. It does a lot more, but that’s the core. We’d love your feedback — and if you’ve got ideas for what voice should do next, let’s hear them!
I’ve been using this for some time and I have to say it is fantastic. I’m intentionally not writing this with Aqua but by hand, and it is taking so much longer. This to me feels like what Apple Intelligence could be; it is so much better than what any of the big tech companies are doing. For example, if you tell Siri voice dictation to go back and delete something, Siri will just write out “go back and delete something”. And if you tell Siri to go back and spell a name differently, all Siri will do is write out the letters that you told it to go back and type. Honestly, for voice dictation software it feels like travelling to another planet in terms of improvement.
Real-time text output à la Apple Dictation with the accuracy of Whisper is something I've been looking for recently - I'll definitely give Aqua a spin.<p>MacWhisper [0] (the app I settled on) is conspicuously missing from your benchmarks [1]. How does it compare?<p>[0]: <a href="https://goodsnooze.gumroad.com/l/macwhisper" rel="nofollow">https://goodsnooze.gumroad.com/l/macwhisper</a><p>[1]: <a href="https://withaqua.com/blog/benchmark-nov-2024">https://withaqua.com/blog/benchmark-nov-2024</a>
This is super impressive, great job!<p>Side-comment of something this made me think of (again): tech builds too much for tech. I've lived in the Bay before, so I know why this happens. When you're there, everyone around you is in tech, your girlfriend is in tech, you go to parties and everyone invariably ends up talking about work, which is tech. Your frustrations are with tech tools and so are your peers', so you're constantly thinking about tech solutions applicable to tech's problems.<p>This seems very much marketed to SF people doing SF things ("Cursor, Gmail, Slack, even your terminal"). I wonder how much effort has gone into making this work with code editors or the terminal, even though I doubt this would be a big use-case for this software if it ever became generally popular. I'd imagine the market here is much larger in education, journalism, film, accessibility, even government. Those are much more exciting demos.
This looks like it'll slurp up all your data and upload it into a cloud. Thanks, no. I want privacy, offline mode and source code for something as crucial to system security as an input method.<p>"we also collect and process your voice inputs [..] We leverage this data for improvements and development [..] Sharing of your information [..] service providers [..] OpenAI"
<a href="https://withaqua.com/privacy">https://withaqua.com/privacy</a>
Feedback: I use MacWhisper, and the Tiny WhisperKit model (English only) is way faster than any cloud service on my M1 MacBook Pro.<p>I’d say local is necessary for a delightful product experience, and the added bonus is that it ticks the privacy box.
I’ve been using this for a while now and I really enjoy it. I ran into a semi-obscure bug and emailed them and they basically fixed it the same day.<p>I do wish there was a mobile app though (or maybe an iOS keyboard). It would also be nice to be able to have a separate hotkey you can set up to send the output to a specific app (instead of just the active one).
I've been using Aqua since it was announced on HN. I've survived the teething pains by using a mixture of Aqua and Dragon, depending on what I was doing. With this new Windows app, I've given up using Dragon for anything.<p>Things I've learned are:<p>1. It works better if you're connected by Ethernet than by Wi-Fi.<p>2. It needs to have a longer recognition history because sometimes you hit the wrong key to end a recognition session, and it loses everything.<p>3. Besides the longer history, a debugging mode that records all the characters sent to the dictation box would be useful. Sometimes, I see one set of words, blink, and then it's replaced with a new recognition result. Capturing would be useful in describing what went wrong.<p>4. There should be a way to tell us when a new version is running. Occasionally, I've run into problems where I'm getting errors, and I can't tell if it's my speaking, my audio chain, my computer, the network, or the app.<p>5. Grammarly is a great add-on because it helps me correct mis-speakings and odd little errors, like too many spaces caused by starting and stopping recognition.<p>When Dragon Systems went through bankruptcy court, a public benefits corporation bid for the core technology because it recognized that Dragon was a critical tool for people with disabilities to function in a digital world.<p>In my opinion, Aqua has reached a similar status as an essential tool. While it doesn't fully replace Dragon for those who need command and control (yet), the recognition accuracy and smoothness are so amazing that I can't envision returning to Dragon Systems without much pain. The only thing worse would be going back to a keyboard.<p>Aqua Guys, don't fuck it up.
Product/UI looks good. Nice job. I would pay for a completely offline version of this; unfortunately, cloud voice data is a non-starter for me.
The real market you need to go hard on is the assistive tech market. You know the biggest companies in this space are those solving problems for dyslexia, where govt grants in e.g. the UK fund pretty much all their work? I had an Access to Work assessment and they recommend stuff from Texthelp like sweets. It’s then paid for by the government following these assessments. But it’s crap. It literally is a crap tool for ADHD or dyslexia because these users literally CAN’T remember or deal with barriers like learning how to dictate correctly. Aqua Voice solves this. I’m your biggest fan. I recommend it in my AT assessments all the time :)
I was very delighted by Aqua v1, which felt like magic at first.<p>But I’ve noticed/learned that I can’t dictate written content. My brain just does not work that way at all — as I write I am constantly pausing to think, to revise, etc and it feels like a completely different part of my brain is engaged. Everything I dictated with Aqua I had to throw away and rewrite.<p>Has anyone had similar problems, and if so, had any success retraining themselves toward dictation? There are fleeting moments where it truly feels like it would be much faster.
I currently use Talon, which I note is not in your benchmarks.<p>I can't find any documentation on how Aqua works, or how it compares, so I'm not sure whether it's meant to be a replacement / competitor to Talon. What are you configuring? How are you telling it that you like "genz" style in Slack? Can I create custom configurations / macros?<p>One thing I like about Talon is it's not magic. Which maybe is not what you're going for. But I am giving it explicit commands that I know it will understand (if it understands my accent obvs), as opposed to guessing, constructing a vague natural-language sentence, and hoping that an LLM will work it out. Which means it feels like something I can actually become fast with, and build up muscle memory of.<p>Also, it's completely offline, so I can actually run it on a work computer without my security folks freaking out.
I will have to look into this. I am currently in the process of going on disability as I cannot work due to (amongst other things) carpal and cubital tunnel in both arms.
Interesting!<p>A nice open-source alternative is VoiceInk, check it out: <a href="https://github.com/Beingpax/VoiceInk">https://github.com/Beingpax/VoiceInk</a><p>Do you also plan to open-source part of your platform?
Hey guys, great idea! Most apps already have voice recognition. Have you thought about serving a niche where this feature doesn't exist yet?
Also, the data protection part is unclear to me. I don't want everything uploaded to a cloud where I don't know what is happening with it.
I use MacWhisper and it works well enough for me to stop looking for options. (MBA M2, 24 GB RAM, Large V3 English)<p>I wouldn't feel comfortable if someone were looking over my shoulder while I'm typing at a coffee shop.<p>I am not your customer.
I would like to try this, but I use Synergy to share one keyboard between two computers. I now have Aqua Voice on the server computer; it would be great if I could use it to input text on the client computer as well.
For the sake of alternatives, I've had good experience with <a href="https://tryvoiceink.com/" rel="nofollow">https://tryvoiceink.com/</a>
I use the built-in voice typing in Windows and am pretty happy with it. How would you say this compares (presuming most of your comparisons are Mac-centric)?
Will we ever see local, open-source models? They are very important for accessibility, which this product could serve, but won't because it is cloud-based (and proprietary).