I often have voice recordings with a lot of background noise (e.g. a public lecture in a room with poor acoustics, recorded from a phone in the audience; there are usually sounds of paper rustling, noise from the street, etc.). Is this "source separation" the sort of thing that could help, or does anyone have other tips? The best approach I have so far is based on https://wiki.audacityteam.org/wiki/Sanitizing_speech_recordings_made_with_portable_audio_recorders#A_simple_two-step_process_taking_a_minute :

(1) Open the file in Audacity and switch to Spectrogram view,

(2) apply a high-pass filter at ~150 Hz, i.e. filter out frequencies lower than that (which tend to be loud anyway),

(3) *don't* remove the higher frequencies (which aren't loud), because they are apparently what make the consonants intelligible,

(4) look for specific noises, select the corresponding rectangle in the spectrogram, and use the "Spectral Edit Multi Tool".

But if machine learning can help, that would be really interesting! The Spleeter page does mention "active listening, educational purposes, […] transcription", so I'm excited.
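In case anyone else wants to try it, my understanding from the Spleeter README is that the basic usage looks roughly like this. A sketch only, not something I've tested on this kind of recording; the "2stems" model splits "vocals" from "accompaniment", and I'm assuming the vocals stem is what maps onto speech vs. background:

    # Sketch based on the Spleeter README (untested on lecture recordings).
    # "2stems" produces a "vocals" stem and an "accompaniment" stem;
    # the assumption here is that the vocals stem keeps the speech.
    from spleeter.separator import Separator

    separator = Separator("spleeter:2stems")
    separator.separate_to_file("lecture.wav", "output/")
    # By default this writes output/lecture/vocals.wav
    # and output/lecture/accompaniment.wav

There's also a command-line equivalent (spleeter separate) if you don't want to touch Python directly.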
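In the meantime, the high-pass part of the workflow above (step 2) can also be scripted outside Audacity. A rough sketch with scipy, assuming a 16-bit PCM WAV; the file name "lecture.wav" is just a placeholder and the 150 Hz cutoff is the value from the steps above:

    # Scripted version of the high-pass step: drop everything below ~150 Hz,
    # leave the higher frequencies (the consonants) alone.
    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import butter, sosfiltfilt

    rate, data = wavfile.read("lecture.wav")      # assumes 16-bit PCM WAV
    samples = data.astype(np.float64)
    if samples.ndim > 1:                          # mix stereo down to mono
        samples = samples.mean(axis=1)

    # 4th-order Butterworth high-pass at 150 Hz
    sos = butter(4, 150, btype="highpass", fs=rate, output="sos")
    filtered = sosfiltfilt(sos, samples)

    out = np.clip(filtered, -32768, 32767).astype(np.int16)
    wavfile.write("lecture_hp.wav", rate, out)

That still leaves step (4), the spot-removal of specific noises, which I only know how to do by hand in the spectrogram view.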