A library for audio and music analysis and feature extraction. It supports dozens of time-frequency analysis and transformation methods, as well as hundreds of corresponding time-domain and frequency-domain feature combinations. These features can be fed to deep learning networks for training and used to study classification, separation, music information retrieval (MIR), ASR, and other tasks in the audio field.
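To make "time-domain and frequency-domain features" concrete, here is a minimal sketch in plain Python. It does not use audioFlux's API; the function names (`rms`, `zero_crossing_rate`, `spectral_centroid`) and the synthetic 1 kHz test tone are assumptions chosen for illustration, and the naive DFT is only practical for short frames.

```python
import cmath
import math


def rms(frame):
    """Time-domain feature: root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Time-domain feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Frequency-domain feature: magnitude-weighted mean frequency in Hz.

    Uses a naive O(n^2) DFT over the non-negative frequency bins.
    """
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


# Hypothetical input: a 256-sample frame of a 1 kHz sine sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

features = {
    "rms": rms(frame),
    "zcr": zero_crossing_rate(frame),
    "spectral_centroid_hz": spectral_centroid(frame, sample_rate),
}
print(features)
```

Computed frame by frame over a recording, vectors like this are the kind of input a classifier or other model could be trained on. A dedicated library would use an FFT and offer many more features than these three.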
Thanks for sharing! I’ve been using some TTS to make audiobooks for my kids. Sometimes there will be weird artifacts (sounds like someone brushing against a microphone), or the voice will change from masculine to feminine during quotes, or it will struggle to speak a phrase and make guttural sounds.

My next task is to figure out where this might be happening so I can rerun those segments. The books can be hours long so it’s hard to catch.

Could audioFlux be used to support identifying some or all of these?