To me, there are three areas where voice user interfaces fall short:<p>- spelling out a word that isn't understood (mainly proper names)<p>- teaching new commands vocally (hey assistant, when I say “hit it”, play that song and turn on the living room lights)<p>- understanding words from a different language within a sentence.<p>That last one makes Alexa quite a pain to use in some countries. For example, if Alexa is set to French and you want to listen to an English song, you have to pronounce the title with a French accent, otherwise it won't get it. The same is true if Alexa is set to English and you ask for a French or Chinese song title.<p>It makes it so frustrating that it's unusable.
I have one concern about voice user interfaces: how difficult or feasible is it to localize them? I can translate the UI of a text-based application into my minority language in an afternoon. Especially if it uses gettext, it's a matter of editing a single text file. Will I be able to do that as easily for a voice user interface?
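For reference, the gettext workflow alluded to here looks roughly like this in Python (a minimal sketch; 'myapp', the locale/ directory, and the 'xx' language code are placeholder names):<p>

    import gettext

    # Load the message catalog for the target language. 'myapp' and
    # 'locale/' are placeholder names for this sketch; fallback=True
    # returns the original strings if no compiled .mo file is found.
    t = gettext.translation('myapp', localedir='locale',
                            languages=['xx'], fallback=True)
    _ = t.gettext

    # Every user-facing string goes through _(), so translating the
    # app means editing one .po file and compiling it to .mo.
    print(_('Create new document'))

Whether a voice interface could expose its prompts and grammar the same way is exactly the open question.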
My biggest problem with voice user interfaces is that everyone can bloody hear you using them! When you type, you can clickety clack, but nobody knows what your clickety clacking means. When you write and do work by hand, nobody can see what you write unless they're looking over your shoulder. But when you tell your voice computer: "Create new document, titled 'Reasons why Tom is being fired,'" the whole office will hear you! If they could make a voice computer where you wouldn't have to talk, that'd be a winner.<p>(On a more serious note, maybe some kind of throat mic?)