Show HN: Infinity – Realistic AI characters that can speak

481 points by lcolucci 9 months ago
Hey HN, this is Lina, Andrew, and Sidney from Infinity AI (https://infinity.ai/). We've trained our own foundation video model focused on people. As far as we know, this is the first time someone has trained a video diffusion transformer that’s driven by audio input. This is cool because it allows for expressive, realistic-looking characters that actually speak. Here’s a blog with a bunch of examples: https://toinfinityai.github.io/v2-launch-page/

If you want to try it out, you can either (1) go to https://studio.infinity.ai/try-inf2, or (2) post a comment in this thread describing a character and we’ll generate a video for you and reply with a link. For example:
“Mona Lisa saying ‘what the heck are you smiling at?’”: https://bit.ly/3z8l1TM
“A 3D pixar-style gnome with a pointy red hat reciting the Declaration of Independence”: https://bit.ly/3XzpTdS
“Elon Musk singing Fly Me To The Moon by Sinatra”: https://bit.ly/47jyC7C

Our tool at Infinity allows creators to type out a script with what they want their characters to say (and eventually, what they want their characters to do) and get a video out. We’ve trained for about 11 GPU years (~$500k) so far and our model recently started getting good results, so we wanted to share it here. We are still actively training.

We had trouble creating videos of good characters with existing AI tools. Generative AI video models (like Runway and Luma) don’t allow characters to speak. And talking avatar companies (like HeyGen and Synthesia) just do lip syncing on top of previously recorded videos. This means you often get facial expressions and gestures that don’t make sense with the audio, resulting in the “uncanny” look you can’t quite put your finger on. See blog.

When we started Infinity, our V1 model took the lip syncing approach. In addition to mismatched gestures, this method had many limitations, including a finite library of actors (we had to fine-tune a model for each one with existing video footage) and an inability to animate imaginary characters.

To address these limitations in V2, we decided to train an end-to-end video diffusion transformer model that takes in a single image, audio, and other conditioning signals and outputs video. We believe this end-to-end approach is the best way to capture the full complexity and nuances of human motion and emotion. One drawback of our approach is that the model is slow despite using rectified flow (2-4x speed up) and a 3D VAE embedding layer (2-5x speed up).

Here are a few things the model does surprisingly well: (1) it can handle multiple languages, (2) it has learned some physics (e.g. it generates earrings that dangle properly and infers a matching pair on the other ear), (3) it can animate diverse types of images (paintings, sculptures, etc.) despite not being trained on those, and (4) it can handle singing. See blog.

Here are some failure modes of the model: (1) it cannot handle animals (only humanoid images), (2) it often inserts hands into the frame (very annoying and distracting), (3) it’s not robust on cartoons, and (4) it can distort people’s identities (noticeable on well-known figures). See blog.

Try the model here: https://studio.infinity.ai/try-inf2

We’d love to hear what you think!
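
As a rough sanity check on the training budget quoted in the post, the sketch below converts 11 GPU years into GPU-hours and derives the implied hourly rate. It assumes 365-day years and that the ~$500k figure covers GPU time only; neither assumption is stated in the post, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of "about 11 GPU years (~$500k)".
# Assumptions (not from the post): 365-day years, cost is GPU time only.
HOURS_PER_YEAR = 365 * 24


def implied_gpu_hour_rate(gpu_years: float, total_cost_usd: float) -> float:
    """Return the implied cost in USD of one GPU-hour."""
    gpu_hours = gpu_years * HOURS_PER_YEAR
    return total_cost_usd / gpu_hours


gpu_hours = 11 * HOURS_PER_YEAR
rate = implied_gpu_hour_rate(11, 500_000)
print(f"{gpu_hours:,} GPU-hours, roughly ${rate:.2f} per GPU-hour")
```

Under those assumptions the budget works out to a little over $5 per GPU-hour, which is only an order-of-magnitude figure since the post gives both numbers as approximations.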

79 comments

yellowapple 9 months ago
As soon as I saw the "Gnome" face option I gnew exactly what I gneeded to do: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/b31e494f-95ef-4c7d-8500-8bb5c17dab36-hqGJLnkiRkxuj7B9n7KWBOgrdpJzb2.mp4
EDIT: looks like the model doesn't like Duke Nukem: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/0e79d09d-3dfe-4bc3-86a1-844c823c4d95-UpgNkUjajW56PSjY7JkkMqXRIrPMAq.mp4
Cropping out his pistol only made it worse lol: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/b8c4889c-f6c8-4dd5-b580-75ff651badf4-Za9lv58BUCQQMlYw456TJJC1jGGDOi.mp4
A different image works a little bit better, though: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/ee0ca607-6a22-4be5-a8d2-1af5d66cacf9-f1cKpF81QZt5iIkACTXUfWeJA54Q85.mp4
squarefoot 9 months ago
Someone had to do that, so here it is: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/56f2ff47-8535-4bbc-b234-a2fddcc8daf6-Zz1AqAiHRbvoMfkhbjih2kMiPSm669.mp4
vessenes 9 months ago
Hi Lina, Andrew and Sidney, this is awesome.
My go-to for checking the edges of video and face identification LLMs are Personas right now -- they're rendered faces done in a painterly style, and can be really hard to parse.
Here's some output: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/67ece032-f495-43a1-8d50-6fee07fc92cd-Ra0Cg3WWofQlxbOPujQLhuiq26WHY9.mp4
Source image from: https://personacollective.ai/persona/1610
Overall, crazy impressive compared to competing offerings. I don't know if the mouth size problems are related to the race of the portrait, the style, the model, or the positioning of the head, but I'm looking forward to further iterations of the model. This is already good enough for a bunch of creative work, which is rad.
hansoolo 9 months ago
This is fun!
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/026845f6-6ec5-40fd-8350-10a3a779e545-XYtTJdrc9VRxMam8PeuRqYgr9jE8oO.mp4
PerilousD 9 months ago
Damn - I took an (AI) image that I "created" a year ago that I liked and then you animated it AND let it sing Amazing Grace. Seeing IS believing. This technology pretty much means video evidence ain't necessarily so.
shitloadofbooks 9 months ago
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/ebc93c27-de42-4b9a-af68-7010f13703c2-uCT9hWe33kHcfmIt4r9iXRyWXPrfA3.mp4
It’s astounding that 2 sentences generated this. (I used text-to-image and the prompt for a space marine in power armour produced something amazing with no extra tweaks required).
advael 9 months ago
There is prior art here, e.g. Emo from Alibaba research (https://humanaigc.github.io/emote-portrait-alive/), but this is impressive and also actually has a demo people can try, so that's awesome and great work!
Andrew_nenakhov 9 months ago
I tried making this short clip [0] of Baron Vladimir Harkonnen announcing the beginning of the clone war, and it's almost fine, but the last frame somehow completely breaks.
[0]: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/b59104aa-ea6c-44c7-baa1-38708e7ae770-tTiAflTPj5aEBIHW26EgUvA8XLkdPF.mp4
dang 9 months ago
This is my favorite: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/5e318b66-fde8-474f-bede-bef45266c7b1-yFXXKBL4EdyjgfDyhnr9zYKcpmxJ7O.mp4
b0ner_t0ner 9 months ago
Steve Jobs on Microsoft Edge: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/00eb29a5-f42a-4cbc-afc7-43b8f4d3eac1-LY3dpLW1qKxNjByNYHkVKm9BpalBB9.mp4
zach_miller 9 months ago
Tried to make this meme [1] a reality and the source image was tough for it.
Heads up, little bit of language in the audio.
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/b31545d6-810f-48ab-847f-55e27d2aadc1-5FAFCmmkhYkB2ae1A8hjjb2fc6lMfb.mp4
[1] https://i.redd.it/uisn2wx2ol0d1.jpeg
johnchristopher 9 months ago
Well, I don't know what to think about this, I don't know where we are going. Maybe I should read some sci-fi from back then about conversational agents?
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/413520c7-35c0-43d6-8cf3-a3e384d75e68-EUXxMTcK8S0fVMn4cHAf5QfPLgDa0m.mp4
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/c21092dd-07de-460a-867e-75d2d8581f1d-zhGBxbhSPUBFK32nZaTL7a6cmGBCcT.mp4
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/c414c311-f1f9-470a-81f9-ea1031073e71-XjGTCpQPtQHJfa9KqVz2kqtSISK7wU.mp4
marginalia_nu 9 months ago
Tried my hardest to push this into the uncanny valley. I did, but it was pretty hard. Seems robust.
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/d10e8a10-0e03-463e-a137-7de74830ef4c-pP54DqbM7Yf4P635cI5pEcYZv9x87o.mp4
ardrak 9 months ago
> It often inserts hands into the frame.
Looks like too much Italian training data
RobinL 9 months ago
Have to say, whilst this tech has some creepy aspects, just playing about with this my family have had a whole sequence of laugh-out-loud moments - thank you!
naveensky 9 months ago
Is it similar to https://loopyavatar.github.io/? I was reading about this today and even the videos are exactly the same.
I am curious if you are in any way related to this team?
zoogeny 9 months ago
I am actively working in this area from a wrapper application perspective. In general, tools that generate video are not sufficient on their own. They are likely to be used as part of some larger video-production workflow.
One drawback of tools like Runway (and Midjourney) is the lack of an API allowing integration into products. I would love to re-sell your service to my clients as part of a larger offering. Is this something you plan to offer?
The examples are very promising by the way.
nextcaller 9 months ago
It's great https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/d377220b-f52e-4b53-a825-d406584f9c77-RJhuiH6ZkQdvNuHZMu1OwWzeP6L9xr.mp4
naveensky 9 months ago
For such models, is it possible to fine-tune models with multiple images of the main actor?
Sorry if this question sounds dumb, but I am comparing it with regular image models, where the more images you have, the better output images you generate for the model.
w10-1 9 months ago
Breathtaking!
First, your (Lina's) intro is perfect in honestly and briefly explaining your work in progress.
Second, the example I tried had a perfect interpretation of the text meaning/sentiment and translated that to vocal and facial emphasis.
It's possible I hit on a pre-trained sentence. With the default manly-man I used the phrase, "Now is the time for all good men to come to the aid of their country."
Third, this is a fantastic niche opportunity - a billion+ memes a year - where each variant could require coming back to you.
Do you have plans to be able to start with an existing one and make variants of it? Is the model such that your service could store the model state for users to work from if they, e.g., needed to localize the same phrase or render the same expressivity on different facial phenotypes?
I can also imagine your building different models for niches: faces speaking, faces aging (forward and back); outside of humans: cartoon transformers, cartoon pratfalls.
Finally, I can see both B2C and B2B, and growth/exit strategies for both.
johnyzee 9 months ago
It's incredibly good - bravo. The only thing missing for this to be immediately useful for content creation is more variety in voices, or ideally some way of specifying a template sound clip to imitate.
artur_makly 9 months ago
oh this made my day: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/f35d0dda-06aa-4e04-838d-77f02a370a04-Bc10YGY5SgVAcWdgPTZlFIZYvauaJl.mp4
!NWSF --lyrics by Biggy$malls
max4c 9 months ago
This is amazing and another moment where I question what the future of humans will look like. So much potential for good and evil! It's insane.
svieira 9 months ago
Quite impressive - I tried to confuse it with things it would not generally see and it avoided all the obvious confabulations: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/6a532857-ba9f-42dc-b64a-bdb2cee2cb76-gIqdMttxqWDvVUsRrrL06PujVVKyTn.mp4
scotty79 9 months ago
It's awesome for very short texts, like a single long sentence. For even slightly longer sequences it seems to lose adherence to the initial photo and also ventures into the uncanny valley with exaggerated facial expressions.
A product that might be built on top of this could split the input into reasonable chunks, generate video for each of them separately and stitch them with another model that can transition from one facial expression into another in a fraction of a second.
An additional improvement might be feeding the system not one image but a few showing different emotional expressions. Then the system could analyze the split input to find out which emotional state each part of the video should start in.
On an unrelated note ... generated expressions seem to be relevant to the content of the input text. So either the text-to-speech or the video model itself might understand language a bit.
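
The chunking step suggested in the comment above could be sketched roughly as follows. This is only an illustration of splitting a script on sentence boundaries under a length budget; `split_script` and `max_chars` are made-up names, the 200-character default is an arbitrary assumption, and nothing here reflects how Infinity's service actually works. The stitching model the comment describes is not covered.

```python
import re


def split_script(script: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept as its own
    oversized chunk rather than being cut mid-sentence.
    """
    # Split after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # The budget would be exceeded: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_script("One. Two. Three.", max_chars=9):
        print(chunk)
```

Each chunk would then be sent for generation separately. Splitting on sentence boundaries keeps the cuts at natural pauses, which is presumably where a transition between clips would be least noticeable.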
siffin 9 months ago
Very cool, thanks for the play.
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/bb39b162-ef10-45f3-ae28-1bfbed5ca660-lb1VNy1uUxiZs3XY0F77Ut5CPK7Cht.mp4
Managed to get it working with my doggo.
snickmy 9 months ago
Out of curiosity, where are you training all this? In other words, where do you find the money to support such training?
IXCoach 8 months ago
WOW this is very good!!
I have an immediate use case for this. Can you stream via AI to support real time chat this way?
Very very good!
Jonathan
founder@ixcoach.com
We deliver the most exceptional simulated life coaching, counseling and personal development experiences in the world through devotion to the belief that having all the support you need should be a right, not a privilege.
Test our capacity at ixcoach.com for free to see for yourself.
sharemywin 9 months ago
You need a slider for how animated the facial expressions are.
Andrew_nenakhov 9 months ago
I wonder how long it would take for this technology to advance to a point where nice people from /r/freefolk would be able to remake seasons 7 and 8 of Game of Thrones to have a nice proper ending? 5 years, 10?
archon1410 9 months ago
The website is pretty lightweight and easy to use. The service also holds up pretty well, especially if the source image is high enough resolution. The tendency to "break" at the last frame seems to happen with low-resolution images.
My generation: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/45f0f17b-a277-49b5-b7f4-cdb857794b9b-CtaiVQTisuwnFT3whrlZsdoKxdpKLm.mp4
parkaboy 9 months ago
Max Headroom hack x hacker's manifesto! I'm impressed with the head movement dynamism on this one.
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/53b50fc2-a598-49b9-86e5-688b91570763-QxaRpEOqGdLSFjUVQ4y3yy9iLp89Hk.mp4
nickfromseattle 9 months ago
I need to create a bunch of 5-7 minute talking head videos. What's your timeline for capabilities that would help with this?
WaffleIronMaker 9 months ago
Does anybody know about the legality of using Eminem's "Godzilla" as promotional material [1] for this service?
I thought you had to pay artists for a license before using their work in promotional material.
[1] https://infinity.ai/videos/setA_video3.mp4
sroussey 9 months ago
I look forward to dubbed movies where the face and lips move to match the dubbed text, while also using the original actor's voice.
ladidahh 9 months ago
I uploaded an image and then used text-to-image, and neither video was animated, though the audio was included.
guessmyname 9 months ago
Is this the original? https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/2f40980e-5fab-4040-bc6f-e0c5ce8c9e54-09mf5OnMq8ZslE9pOzgd50tKpIAK1H.mp4
eth0up 9 months ago
Lemming overlords
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/f200244c-5d8a-4858-9c15-88315dbe9212-7ViUaDgTrLwFASzv6xCfPsbECsIfLw.mp4
LarsDu88 9 months ago
Putting Drake as a default avatar is just begging to be sued. Please remove pictures of actual people!
zaptrem 9 months ago
The e2e diffusion transformer approach is super cool because it can do crazy emotions which make for great memes (like Joe Biden at Live Aid! https://youtu.be/Duw1COv9NGQ)
Edit: Duke Nukem flubs his line: https://youtu.be/mcLrA6bGOjY
SlackingOff123 9 months ago
Oh, this is amazing! I've been having so much fun with it.
One small issue I've encountered is that sometimes images remain completely static. Seems to happen when the audio is short - 3 to 5 seconds long.
doctorpangloss 9 months ago
If you had a $500k training budget, why not buy 2 DGX machines?
AnnaMere 9 months ago
This is surprisingly intelligent and awesome. Any plans for a research paper, a full-grown project with pricing, or open source?
dhbradshaw 9 months ago
So good it feels like I think maybe I can read their lips
ilaksh 9 months ago
It would be amazing to be able to drive this with an API.
sidneyprimas 9 months ago
After much user feedback, we removed the Infinity watermark from the generated videos. Thanks for the feedback. Enjoy!
whitehexagon 9 months ago
Thank you for no signup, it's very impressive, especially the physics of the head movement relating to vocal intonation.
I feel like I accidentally made an advert for whitening toothpaste:
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/83ab9fdc-9ca7-4d1b-ac0d-5e15c14a80db-HcvE02BBvymrbkG3UEpbpDuh0G5Luo.mp4
I am sure the service will get abused, but wish you lots of success.
modeless 9 months ago
Won't be long before it's real time. The first company to launch video calling with good AI avatars is going to take off.
kemmishtree 9 months ago
I'd love to enable Keltar, the green guy in the ceramic cup, to do this www.molecularReality/QuestionDesk
billconan 9 months ago
Can this achieve real-time performance, or how far are we from a real-time model?
android521 9 months ago
This is great. Is it open source? Is there an API, and what is the pricing?
bufferoverflow 9 months ago
It completely falls apart on longer videos for me, unusable over 10 seconds.
dvfjsdhgfv 9 months ago
Hi, there is a mistake in the headline, you wrote "realistic".
lofaszvanitt 9 months ago
Rudimentary, but promising.
vadiml 9 months ago
Let's see what Putin says about it: https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/5d163052-090e-4e1b-bc20-b4ff05be31d9-ETINjzAisdCIwxAlG9TZD6JVGJ3R3I.mp4
protocolture 9 months ago
Sadly wouldn't animate an image of SHODAN from System Shock 2
strogonoff 9 months ago
Is it fairly trained?
jadbox 9 months ago
Awesome, any plans for an API and, if so, how soon?
naveensky 9 months ago
Is there any limitation on the video length?
bschmidt1 9 months ago
Amazing work! This technology is only going to improve. Soon there will be an infinite library of rich and dynamic games, films, podcasts, etc. - a totally unique and fascinating experience tailored to you that's only a prompt away.
I've been working on something adjacent to this concept with Ragdoll (https://github.com/bennyschmidt/ragdoll-studio), but focused not just on creating characters but producing creative deliverables using them.
fsndz 9 months ago
Super nice. Why does it degrade the quality of the image so much? It quickly makes it look obviously AI-generated.
DevX101 9 months ago
Any details yet on pricing or too early?
aagha 9 months ago
This is so impressive. Amazing job.
barrenko 9 months ago
Talking pictures. Talking heads!
siscia 9 months ago
Can I get a pricing quote?
atum47 9 months ago
This is super funny.
sharemywin 9 months ago
accidentally clicked the generate button twice.
deisteve 9 months ago
What is the TTS model you are using?
la64710 9 months ago
Nice
toisanji 9 months ago
Can we choose our own voices?
slt2021 9 months ago
great job Andrew and Sidney!
bosky101 9 months ago
Dayum
Log_out_ 9 months ago
and now a word from our..
dorianmariefr 9 months ago
quite slow btw
ianbicking 9 months ago
The actor list you have is so... cringe. I don't know what it is about AI startups that they seem to be pulled towards this kind of low brow overly online set of personalities.
I get the benefit of using celebrities because it's possible to tell if you actually hit the mark, whereas if you pick some random person you can't know if it's correct or even stable. But jeez... Andrew Tate in the first row? And it doesn't get better as I scroll down...
I noticed lots of small clips so I tried a longer script, and it seems to reset the scene periodically (every 7ish seconds). It seems hard to do anything serious with only small clips...?
xpe 9 months ago
Given that I don't agree with many of Yann LeCun's stances on AI, I enjoyed making this:
https://6ammc3n5zzf5ljnz.public.blob.vercel-storage.com/inf2-clips/1445ab24-e321-428b-98ce-a322d904c9d5-CE61KDAwyVyc5gOaiZDOmNqBNJYfgd.mp4
Hello I'm an AI-generated version of Yann LeCoon. As an unbiased expert, I'm not worried about AI. ... If somehow an AI gets out of control ... it will be my good AI against your bad AI. ... After all, what does history show us about technology-fueled conflicts among petty, self-interested humans?
aramndrt 9 months ago
Quick tangent: Does anybody know why many new companies have this exact web design style? Is it some new UI framework or other recent tool? The design looks sleek, but they all appear so similar.
cchance 9 months ago
I tried with the Drake avatar and had Drake say some stuff, and while it's cool, it's still lacking, like his teeth are disappearing partially :S
jl6 9 months ago
Say I’m a politician who gets caught on camera doing or saying something shady. Will your service do anything to prevent me from claiming the incriminating video was just faked using your technology? Maybe logging perceptual hashes of every output could prove that a video didn’t come from you?
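
The perceptual-hash logging idea in the comment above could look roughly like this. The sketch uses a simple average hash as a stand-in technique; a real service would likely need something more robust to cropping and re-encoding. It assumes each frame has already been decoded and downscaled to an 8x8 grayscale grid, and all function names and the distance threshold are hypothetical.

```python
def average_hash(gray: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if brighter than the frame's mean.

    Assumes `gray` is a small grayscale grid (e.g. 8x8, values 0-255)
    produced by decoding and downscaling a video frame elsewhere.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def seen_before(candidate: int, logged: list[int], max_distance: int = 5) -> bool:
    """True if any logged hash is within max_distance bits of candidate."""
    return any(hamming(candidate, h) <= max_distance for h in logged)


# Toy frames: a striped pattern, a slightly brightened copy, and its inverse.
frame = [[0, 255] * 4 for _ in range(8)]
brighter = [[min(255, p + 3) for p in row] for row in frame]
inverted = [[255 - p for p in row] for row in frame]

log = [average_hash(frame)]
print(seen_before(average_hash(brighter), log))  # near-duplicate of a logged frame
print(seen_before(average_hash(inverted), log))  # unrelated frame
```

A match against the log would suggest a clip came from the service. The absence of a match proves much less, since heavy edits or a different generator would both produce unlogged hashes.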