Machine Learning for Drummers

101 points by psobot, almost 7 years ago

4 comments

TeMPOraL, almost 7 years ago
Great! Refreshing to see an ML post using some well-understood methods instead of throwing a random neural net from Kaggle at the problem...

Tangential:

> Is a given audio file a sample of a kick drum, snare drum, hi-hat, other percussion, or something else? (...) Humans have no trouble classifying these two sounds, as we’ve likely heard them tens of thousands of times before.

Are people taught that in schools or something? Because I personally can't classify those sounds, don't know these names, and I'm not sure how I was supposed to learn them, other than by playing in a band.
zneveu, almost 7 years ago
Had an idea to do this a couple of months ago, but haven't got around to implementing it yet. I'm curious: did you consider using standard image processing techniques with spectrograms as an alternative to decision trees? I know that's how iZotope does their Neutron instrument detection, but I'm not sure how it would compare performance-wise. Also, have you tried classifying percussive sounds that aren't actual drums? I'd love to see how it categorizes various stuff.
bagrow, almost 7 years ago
Surprised there's no discussion of FFT, power spectra, etc. Would like to see someone with an electrical engineering/signal processing background work on this problem.
flashman, almost 7 years ago
Could I use something like this to identify which of two or three people is speaking in an audio clip? Assume I can label several samples of each person's speech, then present an unlabeled sample for classification.