I've always found it fascinating that both speaker identification and speech recognition use MFCCs (which I discovered when talking with someone who had worked on speaker identification for their PhD):

* In speaker identification, you don't care about what is being said; you care about who is saying it.

* In speech recognition, you don't care about who is speaking; you want to know what is being said.

That both tasks use the same underlying features is very surprising to me. I imagine it points to something very powerful about the mapping of the mel scale to psychoacoustics, but I'd be interested to hear other theories about why it shakes out that way, especially given that the research on the mel scale has been frequently criticized.
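
To make the "same features, different questions" point concrete, here is a minimal sketch, assuming librosa is available and using a hypothetical clip.wav; it is only an illustration of the idea, not how any particular production system works:

    # Minimal sketch: identical MFCC features can serve both tasks.
    # "clip.wav" is a hypothetical file used for illustration.
    import librosa
    import numpy as np

    y, sr = librosa.load("clip.wav", sr=16000)          # load audio at 16 kHz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape: (13, n_frames)

    # Speech recognition typically consumes the frame-by-frame sequence
    # (what is being said), while a simple speaker-identification baseline
    # might summarize the same frames into one fixed-length vector
    # (who is saying it), e.g. per-coefficient means and variances.
    frame_sequence = mfcc.T                              # (n_frames, 13) for an ASR model
    speaker_summary = np.concatenate([mfcc.mean(axis=1), mfcc.var(axis=1)])  # (26,)

The interesting part, to me, is that the branching only happens after the feature extraction: the mel-warped spectral envelope apparently carries enough information for both questions.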