It's a shame that the quality of open-source text-to-speech engines is so much worse than the current commercial state of the art, as that's the most notable difference in the demo videos between this and something like Siri. Would fixing that just be a matter of recording more high-quality free sample libraries, etc., or are there fundamental technical challenges to solve?
It says it's 100% open-source, but I can't seem to find the sources for the Jasper platform (not the Jasper client).<p>Even the "compile Jasper from scratch" installation method involves downloading some binaries: <a href="http://jasperproject.github.io/documentation/software/#install-binaries" rel="nofollow">http://jasperproject.github.io/documentation/software/#insta...</a><p>edit: more specific link
I actually made this exact thing Saturday afternoon. Great job!<p>My first prototype was on an Arduino, but it eventually ran out of firmware space. So I upgraded to the RPi, and from there it was a breeze.<p>Suggestions:<p>(1) Use Wit.ai for NLP. There is some added latency, but the capabilities far outstrip Sphinx's in the long run. It's free, there's less code to maintain, and it's easier to deploy and distribute.<p>(2) Try to find a small mic so that you can put everything in a sleek package.<p>(3) Add support for Bluetooth speakers (you're on an RPi, so it's basically done for you).<p>(4) 3D print a custom case, throw some 3M tape on it, and it's ready to be wall-mounted!
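For anyone curious about suggestion (1): Wit.ai exposes a simple HTTP endpoint (GET /message with a Bearer token) that returns the parsed intent for an utterance. A minimal sketch of building such a request (the token value is a placeholder you'd get from the Wit.ai console):

```python
import urllib.parse

# Assumption: a server access token obtained from the Wit.ai console.
WIT_TOKEN = "YOUR_WIT_AI_TOKEN"

def build_wit_request(utterance, token=WIT_TOKEN):
    """Build the URL and headers for a GET to Wit.ai's /message
    endpoint, which returns the parsed intent for an utterance."""
    url = "https://api.wit.ai/message?" + urllib.parse.urlencode({"q": utterance})
    headers = {"Authorization": "Bearer " + token}
    return url, headers

url, headers = build_wit_request("turn on the living room lights")
```

You'd then fire the request with whatever HTTP client you like and dispatch on the intent in the JSON response.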
Nice project--I like the code; it's very clean and well structured. You should check out the Subversion trunk of pocketsphinx: it has support for keyword spotting built in, so you can instantly recognize the persona keyword to enable the system instead of running the transcription through pocketsphinx and hoping for the best.<p>Unfortunately the keyword spotting stuff isn't documented yet, but check out the code for my Demolition Man swear detector project, which uses it: <a href="http://hackaday.io/project/531-Demolition-Man-Verbal-Morality-Statute-Monitor" rel="nofollow">http://hackaday.io/project/531-Demolition-Man-Verbal-Moralit...</a> The important bit is the ps_set_kws function call, which takes either a text keyword or a filename with a list of keywords. Then, after processing audio, call ps_get_hyp and it will return any spotted keywords. Check out the code here in PocketSphinxKWS.h/cpp: <a href="https://github.com/tdicola/DemoManMonitor" rel="nofollow">https://github.com/tdicola/DemoManMonitor</a>
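For anyone who'd rather try this from Python than C, here's a rough sketch of the same flow. The keyword-list file format (one phrase per line with a detection threshold) is pocketsphinx's; the decoder wiring mirrors the ps_process_raw/ps_get_hyp calls described above, but the exact Python binding method names may vary by version, so treat it as a sketch:

```python
def write_kws_file(path, keywords):
    """Write a pocketsphinx keyword-list file: one phrase per line,
    followed by its detection threshold, e.g. 'jasper /1e-20/'."""
    with open(path, "w") as f:
        for phrase, threshold in keywords:
            f.write("%s /%s/\n" % (phrase, threshold))

def spot_keyword(decoder, raw_audio):
    """Feed raw 16 kHz mono PCM audio to a pocketsphinx Decoder that
    was configured with -kws pointing at a keyword-list file; mirrors
    the C-side ps_process_raw/ps_get_hyp loop."""
    decoder.start_utt()
    decoder.process_raw(raw_audio, False, False)
    decoder.end_utt()
    hyp = decoder.hyp()  # None until a keyword has been spotted
    return hyp.hypstr if hyp else None

write_kws_file("keywords.kws", [("jasper", "1e-20"), ("hello world", "1e-30")])
```

Lower thresholds make spotting stricter; you'd tune them per phrase against real recordings.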
I am looking to use this (or something similar) at the startup where I work to toggle our robot's operating modes. I have a question about the voice recognition: is it done on the Pi itself, or is there a service that Jasper taps into to perform it?
That's amazing; I'm impressed. I have no idea how voice recognition software works, but is it possible to add other non-Latin languages (e.g. Greek, Arabic, Japanese) to this open-source project?
I'm not entirely clear on why it requires a WiFi adapter. Can you not use the wired connection? Module writing looks pretty nifty, though; I can't wait to give it a try over the weekend.
Guys, could you tell me where you got the music for your demo? How much did you pay? What was the process? Did you use something like Movie Maker?<p>I'm building a tool (prototype phase) for creating trailer videos. One of the use cases would be to create the demo movie of a product.
This looks pretty good.<p>I've been working on a freetext question answering service, but more on the question answering part (as opposed to the voice recognition side).<p>Looking at the documentation[1], it appears there is no way for it to handle free-text questions ("What is the population of X?", where X is any country), since all words need to be defined in advance. Is that correct, or am I missing something?<p>[1] <a href="http://jasperproject.github.io/documentation/api/standard/" rel="nofollow">http://jasperproject.github.io/documentation/api/standard/</a>
This looks great! I'm only missing a USB microphone. I'll be sure to make this once I get my hands on one.<p>It also seems pretty trivial to set up Wolfram Alpha on here. From what it looks like, you'd just have to:
1) get a developer account at Wolfram Alpha
2) download this promising looking module: <a href="https://pypi.python.org/pypi/wolframalpha/1.0.2" rel="nofollow">https://pypi.python.org/pypi/wolframalpha/1.0.2</a>
3) integrate it into Jasper (create a module)<p>I'll be sure to try it once I get it set up.
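The steps above could come together as a module along these lines. The WORDS/isValid/handle structure follows Jasper's standard module API; the profile key name and the exact shape of the wolframalpha package's response are assumptions, so treat this as a sketch:

```python
import re

# Vocabulary Jasper compiles into its language model (assumed trigger word)
WORDS = ["CALCULATE"]
PRIORITY = 1

def isValid(text):
    """Jasper calls this to decide whether this module handles the input."""
    return bool(re.search(r'\bcalculate\b', text, re.IGNORECASE))

def handle(text, mic, profile):
    """Strip the trigger word and send the rest to Wolfram Alpha."""
    import wolframalpha  # the PyPI module linked above
    query = re.sub(r'\bcalculate\b', '', text, flags=re.IGNORECASE).strip()
    client = wolframalpha.Client(profile['wolframalpha_app_id'])  # assumed key
    result = client.query(query)
    # Assumption: the first pod echoes the input, the second holds the answer
    pods = list(result.pods)
    answer = pods[1].text if len(pods) > 1 else "I couldn't find an answer."
    mic.say(answer)
```

Drop that in Jasper's modules directory and "Calculate two plus two" should route to it.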
Serious question: Why does everyone seem to confuse speech recognition with other parts of NLP (e.g. parsing)?<p>I can understand CNN or TechCrunch getting confused, but there seems to be a universal confusion here on HN too.<p>Not ranting. It is a bit exasperating to read comments and articles addressing only speech recognition. Siri is more than that.
Excellent work! I've always wanted to see a real-world use of pocketsphinx with Python; when I looked a year ago, the documentation was lacking. The module system looks nicely extensible as well.
I had a very similar (albeit less-complete) hack a while back: <a href="https://github.com/rob-mccann/Pi-Voice" rel="nofollow">https://github.com/rob-mccann/Pi-Voice</a>
Is there a way to make this subtract any noise being output by the Pi's speaker output, so that it can still understand me if I'm playing music or watching a movie?
It would be cool to trigger commands with my smartphone.
I'd say "Open the door", my phone would send it to my Raspberry Pi, which would then open my door.
Is this possible?
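A minimal sketch of one way this could be wired up: run a small HTTP endpoint on the Pi and have the phone POST the recognized phrase to it. Everything here (port, phrase-to-command mapping, the door-opening scripts) is an assumption for illustration:

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: map recognized phrases to commands to run on the Pi.
# open_door.py / close_door.py are hypothetical GPIO scripts.
COMMANDS = {
    "open the door": ["python", "open_door.py"],
    "close the door": ["python", "close_door.py"],
}

def command_for(phrase):
    """Normalize a phrase sent from the phone and look up its command."""
    return COMMANDS.get(phrase.strip().lower())

class TriggerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        phrase = self.rfile.read(length).decode("utf-8")
        cmd = command_for(phrase)
        if cmd:
            subprocess.Popen(cmd)  # fire and forget
            self.send_response(200)
        else:
            self.send_response(404)
        self.end_headers()

# To run on the Pi:
#   HTTPServer(("0.0.0.0", 8080), TriggerHandler).serve_forever()
```

The phone side is then just an HTTP POST of the phrase, which most voice-trigger apps can do out of the box.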
How accurate is the speech recognition (could you use it for dictation?), and how fast is it from the end of a command to the recognition/parsing of that command?
I think this is a great project! But a Raspberry Pi might be a bit overpriced/overpowered for this task; maybe something like the Arduino Yún would be a more appropriate choice. I am really hoping this movement of small GNU/Linux-based home appliances will take off and lower the price.