Could someone fill me in on why machine learning would be necessary for pitch detection? Isn't it something that could just be solved with an FFT, or is it a much more complicated task?
This is a transformer-based network model for pitch tracking of musical instruments.<p>The timbre of a musical note results from how harmonic relationships, the relative strengths of the harmonics, instrument resonant peaks, and structural resonant peaks combine and change over time.<p>It uses the transformer-based tuneNN network model for abstract timbre modeling and supports tuning for 12+ instrument types.
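To make the point about harmonic strengths concrete, here is a minimal sketch (not tuneNN's method, just an illustration) of why picking the strongest FFT bin is not enough on its own. It synthesizes a tone whose second harmonic is louder than its fundamental, then compares a naive DFT peak with a plain autocorrelation estimate. The sample rate, tone, amplitudes, and function names are all assumptions chosen for the example.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, assumed for illustration
N = 800             # 0.1 s of audio, so each DFT bin is 10 Hz wide


def synth(f0, amps):
    """Sum of harmonics of f0; amps[k] is the amplitude of harmonic k+1."""
    return [
        sum(a * math.sin(2 * math.pi * (k + 1) * f0 * n / SAMPLE_RATE)
            for k, a in enumerate(amps))
        for n in range(N)
    ]


def dft_peak_hz(x, max_hz=1000):
    """Frequency of the strongest DFT bin below max_hz (naive, unoptimized DFT)."""
    n = len(x)
    best_bin, best_mag = 0, 0.0
    for b in range(1, max_hz * n // SAMPLE_RATE):
        w = -2j * math.pi * b / n
        mag = abs(sum(x[i] * cmath.exp(w * i) for i in range(n)))
        if mag > best_mag:
            best_bin, best_mag = b, mag
    return best_bin * SAMPLE_RATE / n


def autocorr_pitch_hz(x, min_hz=50, max_hz=1000):
    """Pitch from the lag that maximizes the raw (unnormalized) autocorrelation."""
    n = len(x)
    lags = range(SAMPLE_RATE // max_hz, SAMPLE_RATE // min_hz + 1)
    best_lag = max(
        lags, key=lambda lag: sum(x[i] * x[i + lag] for i in range(n - lag))
    )
    return SAMPLE_RATE / best_lag


# A 200 Hz tone with a weak fundamental and a strong second harmonic.
tone = synth(200, [0.2, 1.0, 0.5])
print(dft_peak_hz(tone))        # 400.0: the loudest partial, an octave too high
print(autocorr_pitch_hz(tone))  # 200.0: the actual fundamental
```

Autocorrelation gets this clean synthetic case right, but real instruments add inharmonicity, noise, and transients, which is presumably where a learned model is meant to help.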
How does the accuracy of this compare to CREPE?<p><a href="https://github.com/marl/crepe">https://github.com/marl/crepe</a><p><a href="https://github.com/maxrmorrison/torchcrepe">https://github.com/maxrmorrison/torchcrepe</a><p>Does anyone know what the current state of the art is, within the Music Information Retrieval community?
What's the license?<p>What are your thoughts on PESTO, which learns pitch prediction very well with a small network and uses a self-supervised objective?<p><a href="https://arxiv.org/abs/2309.02265" rel="nofollow noreferrer">https://arxiv.org/abs/2309.02265</a><p><a href="https://github.com/SonyCSLParis/pesto">https://github.com/SonyCSLParis/pesto</a>
This is cool! The very best software-based tuning tech out there is probably in piano tuning apps; they cost hundreds of dollars or more and are specifically made to report on harmonics and other piano nuances.<p>Do you have any comparisons against other pitch detection tech? Accuracy? Delay/responsiveness? I assume it takes much more compute than a hand-coded FFT-type pitch detector.<p>I think this could find a use in the piano world if the output offers something new: something that can analyze what a master piano tuner hears and make it accessible to a mid-tier tuner.
Does anyone know where I should look if I want to detect specific sounds? Like a smoke alarm, a food bowl dispenser (it's very distinct), a cat meowing, a 3D printer collision, that sort of thing?