AudioGen: Textually Guided Audio Generation

146 points by pierre, over 2 years ago

9 comments

solardev, over 2 years ago
The last thing you'll hear before the AI eats you: https://felixkreuk.github.io/text2audio_arxiv_samples/large_32factor_1streams_2048codesPerBook/continuous_laughter_and_chuckling.mp3
iamthemonster, over 2 years ago
It would be very interesting indeed to have an ebook reader paired with bluetooth earphones, and it simultaneously feeds the words into this to make an ambient soundtrack, perhaps also choosing music appropriate to the word-choice on the page.
nudpiedo, over 2 years ago
That could be another missing piece to videogame generational art, sfx sounds and soon soundtracks.
kevmo314, over 2 years ago
The speech samples are really funny. Very Sims-esque.
karmasimida, over 2 years ago
It will be more useful if it can narrate text along with those background effects.
youssefabdelm, over 2 years ago
-__- I wish researchers would train a stereo 44.1kHz version...why always 16kHz? I know I know 16kHz saves more compute but come ooooon you're Meta
fragmede, over 2 years ago
Text2audio is impressive, but I wanna see dance2audio. Just need a million dollars in funding to pay for cameras and dancers.
fuzzythinker, over 2 years ago
[code] redirects to the same page
uwagar, over 2 years ago
s/textually/sexually

i giggled :)