Create AI videos by simply typing in text

256 points by vladoh almost 4 years ago

59 comments

Synthesia is also the name of a much more established, extremely popular MIDI/piano visualisation software [1]. If you've ever looked up "<song> piano tutorial" on YouTube, you've probably seen that program.

It's a shame they chose that name, since it was such a great play on words for the MIDI software (synesthesia is sound into colorful visuals, and MIDI uses synths), whereas this product has basically no relation.

[1] https://synthesiagame.com/

stevenicr almost 4 years ago

"Avoid getting your video rejected. Please make sure you adhere to our content guidelines. Please keep your script professional and business related. Political, sexual, personal, criminal and discriminatory content will not be tolerated or approved."

Ahh.. the anchor fm problem.. guess I'll need an open source version.

I started toying with libreBot, I think it's called, which allows you to do anything you want with these things if you self-host; the license was a grand, I think.

This Synthesia didn't even get the first sentence I tried. It also requires a 'business email' and agreeing to terms that include "I agree to receive occasional product information as per Synthesia Privacy Policy *".

Trying hard to keep the genie in the bottle, aren't they.

shannifin almost 4 years ago

While the tech is impressive in itself, it still doesn't look to be something I'd pay for. The lip sync is annoyingly off, and the bland expressions that come from not understanding context make the communication even worse. If having a visual talking head is that important for a project, it still seems better to just hire someone.

(On a side note, I'm not sure I understand the appeal of emotionally bland fake-smile talking heads in general, even when they're real.)

question000 almost 4 years ago

Can you think of one good use for this product?

No, I'm not asking if you think you can use this to make money, I'm asking: do you personally want to sit through a video of a robot telling you to do things? Are we supposed to believe this is preferable to simply reading this or hearing recorded audio? This is flat out consumer hostility, basically telling your customers to talk to a sock puppet instead of a real person. I hope this fails; I would pay money to make this illegal.

xiphias2 almost 4 years ago

Here's the cookie text if you are too lazy to read it... it sounds a bit creepy: https://share.synthesia.io/a4159eee-f70b-4318-a8bc-ec0fdf6af751

erichurkman almost 4 years ago

Are sales spam emails going to start including personalized videos? I guess I'll look forward to the "Hello dollar sign firstname. I'm dollar sign agentname. My colleague recommended I connect with you, as you both work at dollar sign employer" template misfires.

istorical almost 4 years ago

So where's the version that allows NSFW content? Can't be the only one who wanted to test this with erotica.

firefoxd almost 4 years ago

Impressive. Funny enough I've started to see those faces appear on YouTube. The intention may be to create these corporate style videos, but I'm counting down the minutes until my aunt starts forwarding questionable things on WhatsApp.

anonytrary almost 4 years ago

https://share.synthesia.io/d8860a05-2870-4315-9316-b03cbc76a6ad

Animations are pretty good. Pronunciation could use some work. There also does not seem to be a way to influence the inflection, which is an absolutely crucial component for sales pitches. It's not so much what you say, but how you say it. Also, the right people have to sell the right things. Words coming from Elon's mouth in regards to cryptocurrency have a far greater effect on market behavior than the exact same words coming from this AI person's mouth.

K0balt almost 4 years ago

Uncanny valley meets mixed messages and bad delivery.

The incoherent facial expressions actually manage to confuse the message more than the dissociated pronunciation.... "witch is know small feet".

This tech is a neat trick at this stage but is less useful than just leaving the text as text, in fact adding negative value to an already fully functional process.

Fiverr is a better option, and I would not recommend that.

For an interesting and highly unethical experiment, someone should raise a thousand infants with this drivel and see what happens... I’m going to posit that the result is not good. Children’s narration is exactly where this is headed though; I can see this as a multimillion-view, no-effort YouTube babysitter.

Children find a pleasant, smiling female face soothing... so this is going to be another way that the dollar and human laziness will use AI to make the world a slightly worse place.

going_to_800 almost 4 years ago

What awful comments here, you're all criticizing something really exciting. Of course AI can't beat real humans, what do you expect? But it's the closest we've ever been, especially since it is available to consumers. People in sales and marketing know how valuable this is for improving conversion rates... if you're not in those fields, it's not for you. Saying something is useless just because you have no knowledge of other domains is highly ignorant.

nemothekid almost 4 years ago

Wow this feels like a blast from the past. There used to be a service that did exactly this (little help chats with "AI" generated voices), in the mid 2000s but instead of having human avatars they were animated. Seeing the woman speak immediately unlocked a memory in my kid brain.

Swizec almost 4 years ago

Fantastic technology and I love that the videos look and sound super lifelike. The face looks like most instagram influencers with vanilla broad-appeal pretty faces, which I guess is the style these days.

But what’s the point?

If you’re gonna send someone a soulless corporate drone video, is that really better than a soulless corporate email? I thought the goal of doing video was that it’s more personable and human ... an AI video doesn’t quite hit those goals does it?

geuis almost 4 years ago

Here’s a sample video with a custom script produced earlier: https://share.synthesia.io/4b75b584-9b3b-4a96-86c2-6b34b8711d10

cs702 almost 4 years ago

Pretty good.... but not quite there yet, in my humble opinion.

The lips, eyes, and facial features move in natural ways, but the head remains frozen in a somewhat unnatural manner. It's just inside the uncanny valley, with barely perceptible creepiness.

I would hope to see improvements to make face/neck movements look more natural, to overcome these issues over time!

2bitencryption almost 4 years ago

There's something quite cyberpunk about smiling AI-generated corporate headshot faces extolling the wonders of <insert product here>. And I don't mean that in a good or bad way. I imagine we'll start seeing these all over the place quite soon.

I mean, combine it with GPT-3 and you've got something that's nearly science fiction. Really interested to see where this goes.

Cyril_HN almost 4 years ago

The eyes aren't quite right and sometimes the voice is a little off, but I probably wouldn't notice in a real world setting without prior knowledge.

artur_makly almost 4 years ago

I want to see her on my wall, every day, bald, with green eyes. Spouting Shakespearean slurs at Alexa, then following up with some Rumi poetry, and a dash of Alan Watts... all powered by a Markov chain.

andersco almost 4 years ago

Very close but not quite human. A textbook example of the uncanny valley: https://en.m.wikipedia.org/wiki/Uncanny_valley

hyperpallium2 almost 4 years ago

Related: given a script, "generating all aspects of a cinematic scene, including staging, acting, editing, framing and lighting in Assassin's Creed Odyssey."

https://youtube.com/watch?v=DFM5zbekZ7c (hour-long dev talk, GDC)

codeulike almost 4 years ago

Their David Beckham video is pretty good: https://www.synthesia.io/post/david-beckham

cupcake-unicorn almost 4 years ago

What's the point of using AI if it needs to be manually reviewed? I suppose the outputs are manually reviewed as well, to keep the AI from going rogue?

p-sharma almost 4 years ago

People don't want to talk to computers, that's why chatbots (in their current form) fail one after the other. People also don't want to listen to emotionless robots. As long as this technology is not 100% accurately mimicking a human, the Uncanny valley effect will kick in and just leave an uncomfortable feeling.

bredren almost 4 years ago

Here is an instructional reading of advice I gave my friend over text on how to use enzymatic cleaner should his new kittens have an accident:

https://share.synthesia.io/2761933d-4ec7-48c7-b67e-85fc9d6864b9

herval almost 4 years ago

I know I'll probably sound a bit Luddite by saying this, but just the examples already make me cringe: a welcoming video for a corporation saying "we're looking forward to have you here", narrated by a _bot_, is as dehumanizing as it gets. :(

ilaksh almost 4 years ago

Interesting. I hope the models were paid adequately, considering that the company can now use them effectively for free, indefinitely.

Reminds me of the movie The Congress.

Obviously this technology has a long way to go, but it seems that actors should feel less secure about their jobs being resistant to automation.

FraserGreenlee almost 4 years ago

These videos are incredibly lifelike. I can see many virtual companions being made with this.

MarkMc almost 4 years ago

Impressive, but not quite good enough to avoid the 'uncanny valley' - the lips are not perfectly synced to the audio. Also it should allow a way to stress certain words in the input script.

aishwaryaashok almost 4 years ago

So, a bit curious how this factors in emotions and depth that could vary depending on the nature of the video [onboarding vs launch videos, say]? And how do you not run out of options for voice/person selection? It shouldn't end up being like stock images (same face used in multiple brands). How well does brand identity get maintained for, say, paying customers?

andrewmcwatters almost 4 years ago

Ah dang, I pasted some literal Lorem Ipsum in to see how it would sound from the AI, and it just puts you through an invite funnel. Oh well.

YeGoblynQueenne almost 4 years ago

>> Synthesia lets you create great business videos in minutes. Say goodbye to actors, film crews and expensive equipment.

Yay! At last! And when we've automated away everyone's work, also say goodbye to Synthesia and every other automation service, because there's no business left to use it. Woo-hoo, future world, here I come!

system2 almost 4 years ago

1 - We will review your video
2 - You will receive your video in your email
3 - You will receive an account creation invite

What a great sample.

evan_ almost 4 years ago

A really creepy use case for this would be to combine it with one of those IP-to-company name lists. If you visit a vendor it could play a video greeting you by mentioning your business name. “Click here to learn what we can do for Acme Industries!”

Again, super creepy and not really clear if it would drive engagement.

dalmo3 almost 4 years ago

Wow, the Portuguese pronunciation, intonation and lipsync are incredibly accurate, 10x more so than the English voice. I wonder if that's true for other latin-ish languages and if that means those languages are easier to learn.

pedalpete almost 4 years ago

I think in general the quality is quite good, but the characters lack personality. I think that is the opportunity. Create something with more lively movement. Think the Sham-wow guy.

Anybody can stand blankly in front of a camera without emotion. But this is an impressive start.

Meph504 almost 4 years ago

Will not demo anything that requires me to put in that much of my data to try their product.

jordhy almost 4 years ago

I love it 1000%. Need to create videos for a new crypto. This helps translate the videos to 10 different languages and kick off a global service. It's not perfect but it's fast and looks very professional.

mensetmanusman almost 4 years ago

Groups like nxivm are going to do strange things with this tech in the future.

smusamashah almost 4 years ago

They require you to agree to receive promotional emails before creating the video.

boboche almost 4 years ago

Would have been interesting to try out but unfortunately, the email prompt ended my evaluation. A lot of people will probably stop there and move on as well.

ravenstine almost 4 years ago

Aw man, it kind of made it seem like it would be generated fast, but then you find out after putting in your information that it requires manual review.

anotheryou almost 4 years ago

I'm more stunned by the good speech synthesis than by the already good visuals.

Does anyone know what's under the hood for the text to speech?

junon almost 4 years ago

No thanks. I don't like having to give you all of this personal information you really don't need in order to try your product.

0xx almost 4 years ago

Founder here. AMA :)

To answer a few recurring questions in the thread

---> Use case.

Video is a way more effective way to communicate than text. Not for the HN crowd, but if you're a blue collar worker a 2 minute video in your native language is much preferred to a 5 page pdf for training.

Anyone who has tried to record a simple corporate video knows the pain of cameras, film crews, 25 takes to get one that works and post production. Cumbersome, slow and multidisciplinary. By the time the video is done the content is out of date.

Synthetic video is not yet at the quality of real video. Eventually it will be. But the mistake many are making here is comparing it to real video; it should be compared with text.

In X years we'll be able to make Hollywood films on a laptop without needing anything but time and imagination. Just like we can digitally compose music in Ableton, create images in Photoshop and type novels on keyboards rather than with pen and paper.

My (obviously biased ;)) belief is that synthetic media will eventually become foundational technology that will move media production from cameras/microphones to APIs. We'll be able to do all kinds of things we couldn't do before.

E.g. personalized and interactive rich media, video-driven chatbots and eventually Hollywood blockbusters made by your favourite YouTuber from his or her bedroom.

---> Uncanny valley

Simulating real video is incredibly hard. We're constantly improving and launching more expressive synthesis soon.

From our tests with some of our largest clients 8/10 people don't realise it's a synthetic video (unless they are asked to look for it).

---> Tech

Has been developed over the last 3 yrs. Origins/team from Stanford/UCL/TUM.

Learning: Going from research to working, scalable product is hard and takes time. But very rewarding when it works.

[1] https://www.youtube.com/watch?v=ohmajJTcpNk
[2] https://www.youtube.com/watch?v=qc5P2bvfl44

---> Bad uses

Bad actors will do bad things with synthetic media. Like with any other technology from smartphones to cars. We're moderating all content and building safeguards and verification + working with FAANG and others on detection and provenance technology.

Recommended read - deepfakes perfectly follow the story arc of any new, powerful technology: https://journals.sagepub.com/doi/full/10.1177/1745691620919372

---> Actors

Real actors getting rev share + upfront fee from every video generated with their likeness. Like being a stock photo actor.

jelling almost 4 years ago

I’m deeply interested in synthetic media but it’s hard to believe there is a shortage of people who want to be video presenters.

devops000 almost 4 years ago

I created a step-by-step tutorial, but the voice still sounds too robotic. Unfortunately it doesn't inspire trust in users.

darepublic almost 4 years ago

Gonna have dynamic open world video games too, where custom cut scenes can play based on your character's actions.

lxe almost 4 years ago

Is this based on a paper/demo previously posted on HN? I vaguely remember seeing the faces elsewhere.

Gualdrapo almost 4 years ago

It forces you to select that option to receive promotional emails from them before submitting a script.

doener almost 4 years ago

This site does not let me try the demo without giving them the permission to send spam eMails.

joshribakoff almost 4 years ago

After filling out the recaptcha I cannot scroll to the submit button on mobile safari.

alexfromapex almost 4 years ago

Warning: you have to agree to receive marketing emails from them

Exuma almost 4 years ago

Just give me the ability to be offensive. Who are you to stop me?

flemhans almost 4 years ago

Scripts require manual review. It's not automated

cush almost 4 years ago

The sample videos made me incredibly uncomfortable

rkagerer almost 4 years ago

You want my email to try it out? Hard pass.

ratsimihah almost 4 years ago

The lack of empathy in her voice is chilling

aalfson almost 4 years ago

This is really cool.

gibba999 almost 4 years ago

$3/minute of video seems a bit steep. $180/hour of video.
