MFCCs: Engineering Features from Sound

9 points by yawz, nearly 5 years ago

1 comment

eindiran, nearly 5 years ago
I've always found it fascinating that both speaker identification and speech recognition use MFCCs (which I discovered when talking with someone who had worked on speaker identification for their PhD):

* In the case of speaker identification, you don't care about what is being said; you care about who is saying it.
* In the case of speech recognition, you don't care about who is speaking; you want to know what is being said.

That both tasks use the same underlying features is very surprising to me. I imagine that it points to something very powerful about the mapping of the mel scale to psychoacoustics, but I'd be interested to hear other theories about why it shakes out that way, especially given that the research on the mel scale has been frequently criticized.
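
For readers who have not seen the mel mapping the comment refers to, here is a minimal sketch of it in plain Python. It assumes the widely used `2595 * log10(1 + f / 700)` variant of the Hz-to-mel formula, which may not be the one used in the linked article; the function names and the choice of 10 filters over 0 to 8000 Hz are hypothetical, picked only for illustration. It shows how filter centers spaced evenly in mel end up packed closely at low frequencies and spread out at high frequencies.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list[float]:
    """Return center frequencies (Hz) of n_filters spaced evenly on the mel scale.

    The two band edges (f_min and f_max) are excluded, as in a typical
    triangular mel filterbank layout.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Hypothetical settings: 10 filters covering 0 Hz to 8 kHz.
    centers = mel_center_frequencies(10, 0.0, 8000.0)
    previous = 0.0
    for c in centers:
        # The gap to the previous center grows as frequency increases.
        print(f"center {c:8.1f} Hz   gap {c - previous:7.1f} Hz")
        previous = c
```

An MFCC pipeline would go on to take the log energy in each such band and apply a discrete cosine transform. This sketch stops at the frequency warping, since that is the part the comment asks about.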