Facebook's work on separating multiple sources in an audio stream is fundamentally different from prior ICA-based methods of Blind Source Separation [0], in ways that are both interesting and seem to be part of a broader trend at FB Research.<p>ICA-based BSS requires at least n microphones to separate n sound sources; this work does the separation with a single microphone.<p>What makes this more broadly interesting is that FB Research has separately developed the capability to reconstruct full 3D models from single photos[1].<p>Both of these reconstruct-from-a-single-sensor problems are MUCH harder than their reconstruct-from-multiple-sensors variants (ICA in the case of audio; stereo separation or photogrammetry in the case of imagery), so they aren't efforts one undertakes casually.<p>The obvious motivation for this single-sensor approach is augmenting existing video and audio clips, most of which were captured with a single camera and a single microphone (or closely spaced stereo microphones with minimal separation), and which people have already uploaded to Facebook in massive numbers.<p>The more interesting motivation could be that FB (Oculus) is widely believed to be developing next-generation AR or VR glasses. 
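To make the contrast concrete, here is a minimal numpy-only sketch of the classical multi-microphone approach: FastICA (symmetric fixed-point iteration with a tanh nonlinearity) recovering two sources from two "microphone" recordings. The source waveforms, mixing matrix, and iteration count are made up for illustration; this shows why n sensors suffice for n sources, which is exactly the assumption the single-microphone work drops.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
t = np.linspace(0, 8, n)
s1 = np.sign(np.sin(3 * t))              # source 1: square wave
s2 = 2 * ((1.3 * t) % 1.0) - 1.0         # source 2: sawtooth
S = np.vstack([s1, s2])                  # true sources, shape (2, n)

A = np.array([[1.0, 0.6],                # hypothetical unknown mixing matrix:
              [0.5, 1.0]])               # each "microphone" hears both sources
X = A @ S                                # two microphone recordings

# Center and whiten the observations
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(X @ X.T / n)
Z = (E @ np.diag(d ** -0.5) @ E.T) @ X   # whitened: cov(Z) = I

# Symmetric FastICA with g = tanh
W = rng.standard_normal((2, 2))
for _ in range(200):
    G = np.tanh(W @ Z)
    W_new = (G @ Z.T) / n - np.diag((1 - G ** 2).mean(axis=1)) @ W
    U, _, Vt = np.linalg.svd(W_new)      # symmetric decorrelation:
    W = U @ Vt                           # W <- (W W^T)^{-1/2} W

Y = W @ Z  # recovered sources, up to permutation and sign
```

With two microphones the mixing is invertible and ICA finds the unmixing rotation by maximizing non-Gaussianity; with one microphone that inverse simply doesn't exist, which is why the single-sensor problem needs learned priors instead.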
Most of the discussion around AR/VR headsets focuses on the displays, but if you wanted to keep both physical size and hardware cost to an absolute minimum, one of the things you'd want to minimize is the sensor count.<p>FB Research seems to have a strong interest in techniques that reduce the number of sensors required to provide high-grade AR/VR experiences, and that make it possible to explore pre-existing conventional media in spatialized 3D contexts.<p>[0] <a href="https://en.m.wikipedia.org/wiki/Independent_component_analysis" rel="nofollow">https://en.m.wikipedia.org/wiki/Independent_component_analys...</a><p>[1] <a href="https://ai.facebook.com/blog/facebook-research-at-cvpr-2020/" rel="nofollow">https://ai.facebook.com/blog/facebook-research-at-cvpr-2020/</a>