Show HN: Vocal timing conditioned audio diffusion in real-time

8 points by haykmartiros, over 1 year ago
We've been cooking up a new experiment where you can record yourself singing or talking and the app will generate vocals to match your words and timings. It's backed by an end-to-end latent diffusion model that generates audio conditioned on both the style and the lyric timings, and it's quite fast. Your actual voice and melody are not used, just the transcription, and we don't store the recording.

We've found it's a really natural way to control the output you want and dream up a song concept. Curious to hear what you think!

3 comments

badFEengineer, over 1 year ago
I've been pretty bearish on gen AI for music, but this is the most fun I've had playing with an AI tool in a long time. The filters remind me of the OG Instagram filter effect, where even shitty photos from phones could "magically" be enhanced.

This + the Music ControlNet post from yesterday gives me some hope that audio AI will go the direction of creative tools, rather than dystopian full song generation.

ricepaddies3, over 1 year ago
I'm impressed with the quality of the sound! Some of my generations were for certain bops, and I'm finding myself regenerating on "Surprise" just to see what the model can toss up.

Would it be possible for the model to generate based on the recorded melody in the future? It might also be cool to have increased controls, e.g. choose between male and female vocals, and things like that.

Super nice work!

sarawiltberger, over 1 year ago
Very cool! Is this the state-of-the-art music gen model out there?