I want to create a deep-learning-based filter that removes the human voice from a song to produce a karaoke track (regular filters are not effective). Is deep learning a good approach for this?
Kadenze has a program that eventually incorporates deep learning (<a href="https://www.kadenze.com/courses/creative-applications-of-deep-learning-with-tensorflow-iii-iii/info" rel="nofollow">https://www.kadenze.com/courses/creative-applications-of-dee...</a>). I believe it can be done, but I haven't seen anything that clearly explains how.
Yes. This is an instance of the cocktail party problem (separating one sound source from a mixture). You can tackle it with more traditional methods like SVD, but deep learning handles it really well too. See <a href="https://arxiv.org/abs/1504.04658" rel="nofollow">https://arxiv.org/abs/1504.04658</a>
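To make the "traditional methods like SVD" remark a bit more concrete, here is a rough sketch of one way it could look. It assumes you already have a magnitude spectrogram `mag` (frequency bins by time frames) and that the backing music is repetitive enough to be roughly low-rank, while the voice is not. The function names and the `rank` parameter are made up for illustration, and this is not the method from the linked paper.

```python
import numpy as np

def low_rank_accompaniment_mask(mag, rank=2, eps=1e-8):
    """Soft mask in [0, 1]: estimated share of each bin owned by the accompaniment.

    Assumption (hypothetical setup): `mag` is a non-negative magnitude
    spectrogram of shape (freq_bins, time_frames).
    """
    # Truncated SVD: keep only the `rank` largest singular components.
    u, s, vt = np.linalg.svd(mag, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    # The approximation can go negative or overshoot, so clip it to [0, mag].
    low_rank = np.clip(low_rank, 0.0, mag)
    return low_rank / (mag + eps)

def split_magnitudes(mag, rank=2):
    """Split `mag` into (accompaniment, vocals) estimates that sum back to `mag`."""
    mask = low_rank_accompaniment_mask(mag, rank)
    accompaniment = mask * mag
    vocals = (1.0 - mask) * mag
    return accompaniment, vocals
```

To get audio back you would multiply the mask into the complex STFT of the mix and invert it. As far as I understand, deep learning approaches often keep this same masking step and train a network to predict the mask instead of deriving it from an SVD.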