The samples were released a while back: <a href="https://google-research.github.io/seanet/stream_vc/" rel="nofollow">https://google-research.github.io/seanet/stream_vc/</a>
Are there any use cases driving this? Is there a huge burning need for this technology?<p>Are kidnappers and con men a huge underserved market that Google is hoping to serve? Are deepfake videos not convincing enough to serve the needs of fraudsters?<p>I am totally against regulating AI, but shit like this gives fodder to the other side.
From the poster:<p>In this work, we propose a light-weight (~20M param.) causal voice conversion solution that can run in real-time with low latency on a commercially available mobile device. The key design elements are: (1) using a causal encoder to learn soft speech units; (2) injecting whitened f0 to improve pitch stability without leaking source speaker info.<p>In our later V2 version, we found that f0 rescaling followed by a NSF-style harmonic-plus-noise conditioning (as is done in RVC) results in better quality.
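<p>For anyone wondering what "whitened f0" and "f0 rescaling" could look like in practice, here is a minimal sketch. It is not the authors' code. I am assuming that whitening means normalizing log-f0 to zero mean and unit variance over the voiced frames of an utterance, and that rescaling means multiplying by the ratio of target to source median pitch. The function names `whiten_f0` and `rescale_f0` and the convention that unvoiced frames are marked with 0 are my own, and a truly causal system would need running statistics rather than whole-utterance ones.

```python
import math


def whiten_f0(f0_hz, eps=1e-8):
    """Normalize log-f0 to zero mean and unit variance over voiced frames.

    Assumption: unvoiced frames are marked with f0 <= 0 and map to 0.0.
    The result keeps the pitch contour shape but drops the speaker's
    absolute pitch level and range.
    """
    voiced = [math.log(f) for f in f0_hz if f > 0]
    if not voiced:
        return [0.0] * len(f0_hz)
    mean = sum(voiced) / len(voiced)
    var = sum((v - mean) ** 2 for v in voiced) / len(voiced)
    std = math.sqrt(var) + eps  # eps guards against a perfectly flat contour
    return [(math.log(f) - mean) / std if f > 0 else 0.0 for f in f0_hz]


def rescale_f0(f0_hz, source_median_hz, target_median_hz):
    """Shift the source contour toward the target speaker's pitch register.

    Assumption: a single multiplicative ratio of median pitches is used.
    """
    ratio = target_median_hz / source_median_hz
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]


if __name__ == "__main__":
    contour = [100.0, 0.0, 200.0]  # Hz, with one unvoiced frame
    print(whiten_f0(contour))
    print(rescale_f0(contour, source_median_hz=100.0, target_median_hz=200.0))
```

The point of the whitening step is that a contour an octave higher gives the same whitened values, so the absolute pitch of the source speaker does not leak through. Rescaling keeps f0 in Hz, which is what a harmonic-plus-noise source module would need as input.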
What are the anticipated use cases?<p>I know of one: transgender people often would like to alter the timbre of their voice and spend a lot of time training it. At least for online scenarios, this can just do it.<p>But other than that, AI voice-altering research seems like it mostly benefits scammers? I'm just wondering what they tell themselves they're doing. I didn't see this in the paper.