DeepFilterNet: Noise suppression using deep filtering

217 points by nitinreddy88, almost 2 years ago

11 comments

Frankly, what I hear is very similar to the results of classic spectral denoising, even with the characteristic FFT artifacts (for Linux, there's Noise Repellent [1] available for advanced spectral denoising; there's also a ton of commercial spectral processors available).

The demonstration could use more random background noises to separate it from FFT noise suppressors (as it's the primary benefit of ML-based filters), and more varied speech to separate it from RNNoise [2], which tends to suppress breath and cut the sibilants in an unnatural manner. The latency is also important: is it as low as in RNNoise? What about the CPU load?

[1] https://github.com/lucianodato/noise-repellent

[2] https://github.com/werman/noise-suppression-for-voice
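For readers unfamiliar with the "classic spectral denoising" mentioned above, here is a minimal sketch of spectral subtraction in plain Python. It is not the algorithm used by Noise Repellent, RNNoise, or DeepFilterNet. The frame size, over-subtraction factor, gain floor, and the assumption that the first few frames contain only noise are all illustrative choices. The isolated bins that survive the gate are one common source of the "FFT artifacts" (often called musical noise).

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def ifft(spec):
    """Inverse FFT via the conjugation trick."""
    n = len(spec)
    return [v.conjugate() / n for v in fft([s.conjugate() for s in spec])]


def stft(signal, n=64):
    # Periodic Hann window; with hop n/2 the shifted windows sum to 1.
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    return [fft([signal[s + i] * win[i] for i in range(n)])
            for s in range(0, len(signal) - n + 1, n // 2)]


def istft(frames, n=64):
    """Inverse FFT of each frame followed by overlap-add."""
    hop = n // 2
    out = [0.0] * (hop * (len(frames) - 1) + n)
    for f, spec in enumerate(frames):
        for i, v in enumerate(ifft(spec)):
            out[f * hop + i] += v.real
    return out


def spectral_subtract(signal, noise_frames=8, over=2.0, floor=0.05, n=64):
    """Attenuate each bin by its estimated noise magnitude, keeping the phase.

    Assumes (hypothetically) that the first `noise_frames` frames are noise only.
    """
    frames = stft(signal, n)
    # Noise profile: mean magnitude per bin over the leading frames.
    profile = [sum(abs(fr[k]) for fr in frames[:noise_frames]) / noise_frames
               for k in range(n)]
    cleaned = []
    for fr in frames:
        new = []
        for k, c in enumerate(fr):
            mag = abs(c)
            gain = max(1.0 - over * profile[k] / mag, floor) if mag > 0 else 0.0
            new.append(c * gain)
        cleaned.append(new)
    return istft(cleaned, n)


# Demo: 1024 samples of noise, then a tone buried in the same noise.
random.seed(0)
clean = [math.sin(2 * math.pi * 8 * i / 64) if i >= 1024 else 0.0
         for i in range(2048)]
noisy = [c + random.gauss(0.0, 0.1) for c in clean]
denoised = spectral_subtract(noisy)
energy = lambda xs: sum(v * v for v in xs)
print("noise-only energy ratio:", energy(denoised[64:960]) / energy(noisy[64:960]))
```

A stationary profile like this works for steady hiss or hum. It has no way to follow noise that changes over time, which is where the commenter expects ML-based filters to show their advantage.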

nitinreddy88, almost 2 years ago

Integrate with PipeWire: https://github.com/Rikorose/DeepFilterNet/blob/main/ladspa/README.md

YouTube demo: https://youtu.be/EO7n96YwnyE

Paper explanation: https://youtu.be/it90gBqkY6k

rektide, almost 2 years ago

It's so excellent how many moats are just getting obliterated.

I absolutely have been a real snarky hater against AI, as a horrible feudal unobservable black box that has way too much power in the world. But open source has been doing amazing at reading the papers & reproducing them, & it's glorious to see.

Amazing examples of a peership culture in action. Raising each other up is so divine. Share the knowledge & means.

WiSaGaN, almost 2 years ago

It looks like the Rust library uses `tract-onnx` for inference: https://github.com/Rikorose/DeepFilterNet/blob/2a84d2a1750a5fcb608323d1b4f964d9f1c037a6/libDF/Cargo.toml#L112

I wonder whether using Python for research and for training in big data centers, and Rust at the edge for efficient inference, will become a trend. We do have a larger C++ community for inference right now (e.g. ggml), but a Rust crate as a component for building AI applications is a joy to use.

narrationbox, almost 2 years ago

Since it does the signal processing in the Fourier domain, does this suffer from audio artefacts, e.g. hissing in the output? Torch's inverse STFT uses Griffin-Lim which is probabilistic and if you don't train it sufficiently, you may sometimes get noise in the output.

https://pytorch.org/docs/stable/generated/torch.istft.html#torch-istft

An alternative would be to use a vocoder network (or just target a neural speech codec like SoundStream).
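A note on the inverse-STFT point: as far as I can tell, `torch.istft` takes a complex spectrogram and inverts it by overlap-add. Griffin-Lim is a separate iterative method for estimating phase when only magnitudes are available (torchaudio ships it as `torchaudio.transforms.GriffinLim`). The sketch below uses plain Python rather than Torch, with an illustrative frame length and hop, to show that when the phase is kept, the STFT round trip is deterministic and exact up to floating-point error. It says nothing about artefacts introduced by whatever processing happens between the two transforms.

```python
import cmath
import math
import random

N, HOP = 16, 8  # frame length and hop (50% overlap); illustrative values


def dft(x, sign=-1):
    """Naive O(n^2) DFT; sign=+1 gives the unnormalised inverse."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def stft(signal):
    # Periodic Hann window: copies shifted by HOP sum to exactly 1.
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / N) for i in range(N)]
    return [dft([signal[s + i] * win[i] for i in range(N)])
            for s in range(0, len(signal) - N + 1, HOP)]


def istft(frames):
    """Inverse DFT of every frame, then overlap-add. No randomness, no iteration."""
    out = [0.0] * (HOP * (len(frames) - 1) + N)
    for f, spec in enumerate(frames):
        for i, v in enumerate(dft(spec, sign=+1)):
            out[f * HOP + i] += v.real / N
    return out


random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(256)]
y = istft(stft(x))
# Skip the first and last HOP samples, which only one window covers.
max_err = max(abs(a - b) for a, b in zip(x[HOP:-HOP], y[HOP:-HOP]))
print(f"max round-trip error: {max_err:.2e}")
```

Phase only has to be guessed when a model outputs magnitudes alone, which is the situation where Griffin-Lim or a vocoder network comes in.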

ZoomZoomZoom, almost 2 years ago

The demo with the vac is certainly not a success.

I sometimes wonder if all those filters optimise for the wrong thing. Removing noise is meaningless unless the overall discernibility improves. If you remove noise at the price of the voice becoming choppy, "robotic" and unnatural, you didn't improve the situation, and in some cases you could say you only made it worse.

What deteriorates legibility even further for most noise suppression filters is the discrepancy between the completely dry pauses and the remaining ambience "under" the voice. It would be much more interesting to see some style transfer for voice ambience as an alternative to current de-verbs.

When dealing with voice processing I advocate refraining from noise suppression filters for as long as possible, and I haven't yet seen a publicly available noise suppression filter that could change my position.

sniglom, almost 2 years ago

Sorry if hijacking.

I have a drone (DJI FPV) with a microphone. It can pick up some sounds, but the loudness of the rotors makes it really hard to hear anything in playback.

The rotor noise varies in frequency and has several harmonics as well, so it can't be band passed.

I understand that you can't get a clean or great signal from it, but something would be nice.

What tool would be good to use to filter out that noise?

stonelazy, almost 2 years ago

One of the challenges we face in this research problem is the lack of a reliable metric to evaluate the quality of the NN model. Recently I came to know of the 3QUEST metric (https://cdn.head-acoustics.com/fileadmin/data/global/Datasheets/Analysis_Software/ACQUA_Options/ACOPT-21-35-3QUEST-3QUEST-SWB-FB-ACQUA-Option-6844-6866-Data-Sheet.pdf) being helpful in this regard. Does anybody have any experience with this metric? Maybe in comparison with Microsoft's DNSMOS?
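To illustrate why a single number is hard to trust here, below is a minimal sketch of SI-SDR (scale-invariant signal-to-distortion ratio), a simple objective metric often reported for speech enhancement. It is not 3QUEST or DNSMOS, and it is not claimed to be what the commenter uses. It needs a clean reference signal, and it measures waveform fidelity rather than perceived quality, which is part of why perceptual metrics get proposed in the first place.

```python
import math


def si_sdr(reference, estimate):
    """Scale-invariant SDR in dB between two equal-length sample sequences.

    The estimate is projected onto the reference; whatever is left over
    counts as distortion. Undefined if the estimate is an exact scaled copy.
    """
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    scale = dot / ref_energy
    target = [scale * r for r in reference]
    target_energy = sum(t * t for t in target)
    residual_energy = sum((e - t) ** 2 for e, t in zip(estimate, target))
    return 10.0 * math.log10(target_energy / residual_energy)


# Toy signals: a sine as the "clean" reference plus two levels of interference.
ref = [math.sin(2 * math.pi * 5 * i / 200) for i in range(200)]
interference = [math.cos(2 * math.pi * 37 * i / 200) for i in range(200)]
mild = [r + 0.05 * d for r, d in zip(ref, interference)]
heavy = [r + 0.5 * d for r, d in zip(ref, interference)]

print("mild :", round(si_sdr(ref, mild), 1), "dB")
print("heavy:", round(si_sdr(ref, heavy), 1), "dB")
```

Rescaling the estimate leaves the score unchanged, so the metric cannot be gamed by changing the output gain. It can still rank a choppy, "robotic" result above a more natural one, so it is at best a complement to listening tests.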

stan_kirdey, almost 2 years ago

https://github.com/haoheliu/voicefixer is also a nice CLI tool for general speech restoration.

Demo page: https://haoheliu.github.io/demopage-voicefixer/

boneitis, almost 2 years ago

If you (especially on behalf of any hip, popular platforms like Discord) undertake any projects to aggressively denoise or compress audio, please (PLEASE) do us people with auditory processing difficulties a favor, and include such people in your testing.

I beg of you, with utmost sincerity.