Sounding the Secrets of AudioLM

85 points by tullie about 2 years ago

5 comments

knaik94 about 2 years ago
I think audio models will be much more sensitive to input issues than text or art models. Humans are very good at picking up the nuances in audio and also process it very quickly. I wonder how far we are from being able to manipulate the emotions of how something sounds. In my opinion, that's the Turing test for any audio generative AI. Native speakers will immediately know when something is AI generated or adjusted, for the same reason they immediately detect accents.

I am curious what kind of audio repair AI models are being worked on to help make outputs sound more natural. This research feels like progress towards that goal as well.
chaps about 2 years ago
Possibly weird question, but have there been any attempts at modeling this sort of audio model specifically where tokens aren't defined by the audio itself, but instead by the movement of the tongue/mouth/lips/vocal cords, etc.?
stanleydrew about 2 years ago
It's off-topic (or maybe not?) but I get a very strong "ChatGPT wrote the first draft of this" vibe from a lot of the introductory prose in this post.
rnosov about 2 years ago
More examples on the AudioLM page. Some are pretty impressive (assuming they are cherry picked).

https://google-research.github.io/seanet/audiolm/examples/
visarga about 2 years ago
AudioLM's advantage is that we have orders of magnitude more audio than text.