Spleeter – Music Source-Separation Engine

258 points by jph98, almost 5 years ago

26 comments

roddylindsay, almost 5 years ago
Once this technology gets incorporated into DJ mixers / CDJs, this is going to make DJing much more creatively interesting.

Historically, blending between mixed stereo tracks has been limited to mixing EQ bands, but now DJs will be able to layer and mix the underlying stems themselves, like putting the vocal from one track onto an instrumental section of another (even if no a cappella / instrumental versions were ever released).

It also opens up a previously unreachable world for amateur remixing in general; for instance, creating surround sound mixes from stereo or even mono recordings for playback in 3D audio environments like Envelop (https://envelop.us) [disclaimer: I am one of the co-founders of Envelop]
SyneRyder, almost 5 years ago
For anyone who wants to try Spleeter in a version that "just works" without having to install TensorFlow and mess with offline processing, Spleeter has been built into a wave editor called Acoustica from Acon Digital. It's been working really well for me, and the whole package is solid competition to editors like iZotope RX:

https://acondigital.com/products/acoustica-audio-editor/
mwcampbell, almost 5 years ago
Previous discussion, where I posted a demo using a full song (legally, under Creative Commons):

https://news.ycombinator.com/item?id=21431071

Note: I'm not affiliated with this project; I just think it's cool.
svat, almost 5 years ago
I often have voice recordings with a lot of background noise (e.g. a public lecture in a room with poor acoustics, recorded from a phone in the audience; there are usually sounds of paper rustling, noises from the street, etc.). Is this "source-separation" the sort of thing that could help, or does anyone have other tips? The best thing I have so far is based on this: https://wiki.audacityteam.org/wiki/Sanitizing_speech_recordings_made_with_portable_audio_recorders#A_simple_two-step_process_taking_a_minute

(1) Open the file in Audacity and switch to Spectrogram view, (2) set a high-pass filter at ~150 Hz, i.e. filter out frequencies lower than that (which tend to be loud anyway), (3) do not remove the higher frequencies (which aren't loud), because they are what make the consonants understandable (apparently), (4) look for specific noises, select the rectangle, and use "Spectral Edit Multi Tool".

But if machine learning can help, that would be really interesting! This Spleeter page does mention "active listening, educational purposes, […] transcription" so I'm excited.
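
The ~150 Hz high-pass in step (2) can be sketched in a few lines. This is a minimal illustration, not what Audacity does internally: it assumes a simple first-order (RC-style) filter, and the helper names `high_pass`, `rms`, and `sine` are made up for the example.

```python
import math


def high_pass(samples, sample_rate, cutoff_hz=150.0):
    """First-order high-pass filter over a list of float samples.

    Assumption: a one-pole RC-style filter, chosen for brevity.
    Real audio editors typically use steeper filters.
    """
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Each output follows the change in the input and decays towards zero.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


def rms(samples):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def sine(freq_hz, sample_rate, seconds=1.0):
    """Unit-amplitude test tone."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    rate = 8000
    rumble = sine(30.0, rate)    # stands in for low-frequency noise
    speech = sine(2000.0, rate)  # stands in for consonant-range content
    print("30 Hz level after filter:", rms(high_pass(rumble, rate)))
    print("2 kHz level after filter:", rms(high_pass(speech, rate)))
```

A first-order filter only attenuates gently below the cutoff, so low rumble is reduced rather than removed; that is consistent with the advice to treat remaining specific noises by hand in the spectrogram.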
iseanstevens, almost 5 years ago
There is a Max/Ableton Live plugin version here, which makes it much easier to experiment with Spleeter artistically.

https://github.com/diracdeltas/spleeter4max/releases/
tomduncalf, almost 5 years ago
Another recent open source contender for source separation is Open Unmix: https://github.com/sigsep/open-unmix-pytorch/

I've not had time to try it yet but have read good things.
voiper1, almost 5 years ago
Very cool!

I was even able to run it on their notebook https://colab.research.google.com/github/deezer/spleeter/blob/master/spleeter.ipynb without setting anything up locally.

The results of vocal separation were quite impressive.
leoncvlt, almost 5 years ago
Here's the sample output, for those who are curious:

- Sample track: https://files.catbox.moe/56op27.mp3
- Spleeted vocals: https://files.catbox.moe/4d9aru.wav
- Spleeted accompaniment: https://files.catbox.moe/y67g23.wav
Myce, almost 5 years ago
A local radio station has a four-hour broadcast. The station requires them to play a set number of music tracks (about 6 per hour), but there has been demand to make the broadcast available as a podcast without the music.

Could this make it possible to automatically remove the music from the MP3 file they have available? With 6 tracks per hour times 4 hours, manually removing the music is time-consuming.

I doubt it, as it seems all vocals are output to a single file...

Is there any other tool someone can recommend?
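
One way to think about the question: rather than separating sung vocals from speech, use the accompaniment stem only as a detector for where songs are, then cut those stretches out of the original file. The sketch below is a hypothetical approach, not a Spleeter feature. It assumes you have already computed one energy value per second for the accompaniment stem, and the function name, threshold, and minimum length are illustrative guesses that would need tuning.

```python
def music_segments(accomp_energy, threshold=0.1, min_len=60):
    """Find long stretches where the accompaniment stem is loud.

    accomp_energy: one energy value per frame (e.g. per second).
    Returns (start, end) frame index pairs, end exclusive, for runs
    above `threshold` lasting at least `min_len` frames.
    All parameter values are assumptions for illustration.
    """
    segments = []
    start = None
    for i, energy in enumerate(accomp_energy):
        if energy > threshold:
            if start is None:
                start = i  # a loud run begins here
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i))
            start = None  # quiet frame ends any current run
    # Handle a run that lasts until the end of the recording.
    if start is not None and len(accomp_energy) - start >= min_len:
        segments.append((start, len(accomp_energy)))
    return segments


if __name__ == "__main__":
    # 10 s of talk, 100 s of music, 10 s of talk, a 5 s jingle, 5 s of talk.
    energy = [0.0] * 10 + [0.5] * 100 + [0.0] * 10 + [0.5] * 5 + [0.0] * 5
    print(music_segments(energy))
```

Short jingles fall under the minimum length and are kept, while music beds playing under speech would probably confuse a detector this simple.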
jph98, almost 5 years ago
Leveraging a state-of-the-art source separation algorithm for music information retrieval

https://www.youtube.com/watch?time_continue=42&v=JIR6HJISrtY&feature=emb_logo
TedDoesntTalk, almost 5 years ago
Now we can create all-star bands that never existed. For example:

Neal Schon from Journey: lead guitar

Heart sisters doing lead vocals and lead/rhythm guitar

Flea: bass guitar from Chili Peppers

Neil Peart: drummer from Rush

Tony Kay: keys from Genesis

The only difficulty is they must all be playing the same song. Then we can extract, transpose if needed, and remix together.
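
The "transpose if needed" step comes down to a fixed frequency ratio per semitone in twelve-tone equal temperament. A small sketch of that arithmetic follows; the function names are hypothetical, and it does not cover tempo matching, which combining stems from different recordings would also need.

```python
import math


def semitone_ratio(semitones):
    """Frequency ratio for a shift of `semitones` in equal temperament."""
    return 2.0 ** (semitones / 12.0)


def semitones_between(from_hz, to_hz):
    """Number of semitones (possibly fractional) from one pitch to another."""
    return 12.0 * math.log2(to_hz / from_hz)


if __name__ == "__main__":
    # Moving a stem recorded in A (440 Hz reference) up a whole tone.
    print("ratio for +2 semitones:", semitone_ratio(2))
    print("semitones from 440 Hz to 880 Hz:", semitones_between(440.0, 880.0))
```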
grawprog, almost 5 years ago
I couldn't find any examples, so I was wondering, for anyone who has tried this: are the results better than using a bandpass filter and an equalizer to isolate frequencies, or one of those auto karaoke things?

Because the ability to separate any song into separate tracks would be amazing. The ability to remix any song or just play with any instrument or vocal track would be awesome. But does it have the same poor quality and limitations as most frequency-based source separation?
marksomnian, almost 5 years ago
Had a play with the Colab and it's quite good indeed. The authors claim "100x real time speed", which is mighty impressive, but I'd be more interested in seeing a "Try Really Hard" mode, trading off quality and speed. Is that a thing that can be done in the current code, I wonder?
mehrdadn, almost 5 years ago
If you're trying to run it on Windows with Python 3.8, add numpy and cython to the dependencies, and change TensorFlow's requirement to be >= rather than ==.

Though then you'll run into compatibility errors like "No module named 'tensorflow.contrib'" which you'll have to fix.
mbushey, almost 5 years ago
While this is awesome, it's trained on MUSDB18-HQ, which as far as I can tell is proprietary. zenodo.org claims it is available; however, I have filled out their "request access" page a half-dozen times. Does anyone know of a training dataset that's possible to obtain?

Here is the Zenodo response:

Your access request has been rejected by the record owner.

Message from owner: no justification given

Record: MUSDB18-HQ - an uncompressed version of MUSDB18 https://zenodo.org/record/3338373

The decision to reject the request is solely under the responsibility of the record owner. Hence, please note that Zenodo staff are not involved in this decision.
pabs3, almost 5 years ago
This reminds me of this open source project (and its predecessor manyears and open hardware projects 8/16soundsusb).

https://github.com/introlab/odas
https://github.com/introlab/manyears
https://github.com/introlab/16SoundsUSB

Website of the team behind these:

https://introlab.3it.usherbrooke.ca/
TheOtherHobbes, almost 5 years ago
Out of interest, and to put this in context: your brain can only do this for conversation, not music.

You routinely suppress background noise and room acoustics when listening to someone speaking. But you don't do the same thing when listening to music. At best you can focus on individual elements in a track, and you can parse them musically (and maybe lyrically).

But you don't suppress the rest to the point where you don't hear it.
fold_left, almost 5 years ago
Once you have obtained just the guitar from a track, are there any tools out there which can work out the tablature (e.g. https://www.ultimate-guitar.com//top/tabs) so you can play along?
InstaHeads, almost 5 years ago
Well, it seems neural networks have started to appear for vocal and instrumental track isolation. Recently I discovered https://www.lalal.ai and it works quite well.
philipov, almost 5 years ago
I tried using the 2-stem model to remove the music from an audio recording of two people talking. However, it kept letting some of the music through whenever someone started talking. Is there a better model to use for that?
FraKtus, almost 5 years ago
It says it can be 100 times faster than real time.

So can it be run in real time?

I am thinking about extracting features for music visualization, but it could also make a DJ happy.
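
A quick back-of-the-envelope check on that question, taking the quoted 100x figure at face value. It is a sketch under assumptions: the speedup is treated as constant, and the chunk size used in the latency estimate is a hypothetical parameter, not something stated by the Spleeter authors.

```python
def processing_seconds(audio_seconds, speedup):
    """Time needed to process a stretch of audio at a given speedup factor."""
    return audio_seconds / speedup


def min_added_latency_seconds(chunk_seconds, speedup):
    """Lower bound on delay when audio is processed chunk by chunk.

    A whole chunk has to be captured before it can be processed,
    so the chunk length itself is part of the delay.
    """
    return chunk_seconds + processing_seconds(chunk_seconds, speedup)


if __name__ == "__main__":
    # A 3-minute track at the claimed 100x speed.
    print("offline, 180 s track:", processing_seconds(180.0, 100.0), "s")
    # Live use with a hypothetical 5-second analysis chunk.
    print("live, 5 s chunks:", min_added_latency_seconds(5.0, 100.0), "s delay")
```

High throughput alone does not give low latency: for live use the delay is dominated by how much audio the model needs to see at once, not by the processing speed.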
manceraio, almost 5 years ago
You could try Spleeter in the cloud here: https://voxremover.com
philipov, almost 5 years ago
The output appears to cut off after 10 minutes. How do you make it operate on longer files, like in the 100-minute range?
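
A cutoff at exactly 10 minutes is consistent with a default maximum processing duration of 600 seconds in the Spleeter command line. That is an assumption worth checking against `spleeter separate --help` for the installed version, where a duration option (and an offset option) should be listed if they exist. If raising the limit is not practical, for example because of memory, the file can be processed in consecutive windows. The helper below only computes the windows; its name and the 600-second default are illustrative.

```python
def split_into_windows(total_seconds, window_seconds=600.0):
    """Split a recording into consecutive (offset, duration) windows.

    Both values are in seconds. The last window is shorter when the
    total length is not a multiple of the window size.
    """
    if total_seconds < 0 or window_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and window_seconds > 0")
    windows = []
    offset = 0.0
    while offset < total_seconds:
        duration = min(window_seconds, total_seconds - offset)
        windows.append((offset, duration))
        offset += window_seconds
    return windows


if __name__ == "__main__":
    # A 100-minute recording in 10-minute windows.
    for offset, duration in split_into_windows(100 * 60):
        print(f"start at {offset:.0f} s, process {duration:.0f} s")
```

Each window would then be separated on its own and the resulting stems concatenated. Expect possible audible seams at window boundaries, since the model sees no context across them.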
jbverschoor, almost 5 years ago
Deezer is pretty useless if all supported hardware requires your phone to stream.

They should spend dev time on something that matters.
peterhookgen, almost 5 years ago
This is very cool. I have started using it to experiment with creating hardstyle dance remixes of popular songs.
fit2rule, almost 5 years ago
This is ultra-cool. I have a few terabytes of jam-session recordings that I'm going to throw at this. If it ends up being usable to the point that I can re-do vocals over some of the greatest moments in the archive, I'll be praising whatever Spleeter deity makes itself visible to me at the time, most highly.