Once generation times get to real-time, I dunno what's going to happen. I follow the voice acting community a lot, and this is an existential-threat level of worry in that community.

But I also see some positives for the narrative voice field, at the expense of actual actors.

The latest sequel to a favorite audiobook series has the professional narrator pronouncing character names and town names entirely differently than in the previous 7 books.

In another book, the narrator completely changed the character voices in the sequel compared to the first.

The positive for listeners is that eventually we can guarantee voices are completely consistent between narrations. An editor will soon be able to describe the emotions of a character and nuance how the AI performs certain scenes.

It's really sad, but I think human audio storytelling is coming to an end quite rapidly.
I think he's right about this: things are going to get weird and ugly, and people aren't prepared for what's coming. Here's my suggestion for safeguarding your sanity in the years ahead: find the people and the blogs you love, embrace RSS, block or avoid everything else, read old books, stop listening to podcasts, revert to email as the primary channel for occasional "social" correspondence, abandon all side projects that require additional screen time, go outside.

This is the way.
Those fake AI voices are perfect tools for phishing campaigns.

I wonder if ChatGPT combined with a voice generator can mount successful phishing campaigns.

It looks like the cyberpunk future we never asked for is already here.
I tried ElevenLabs and it is truly amazing. You don't need much audio to train it, and the final result can be incredible. I shared some of the snippets with friends and relatives, and after the initial scare (same as John) we agreed that the outcome wouldn't be different than a few years ago...

We've had impersonators for decades, so what has stopped political parties from hiring an impersonator to create fake audio? AI in this case is going to make it more accessible, but a political party doesn't want thousands of fake audio clips (they would lose credibility); you need only one.

Also, photoshopping photos would have a similar effect, and that technique has been available for years as well.
I've used a tool to make myself sound like a girl, and it's freakishly realistic and uncanny hearing what it produces. Chinese love-scam rings are going to be all over this within a year, coupled with image/video generation.
> I don’t think the general population is prepared for this

I don't know how many videos it will take of Trump and Biden shit-talking each other on Xbox Live, but it looks like several on YouTube are near 1M views.
Would a Roger Stone or Steve Bannon type fake a Russian collusion dossier and then run it through the media for literally years?

What are the "types" that would do something like this?

https://www.bbc.com/news/world-us-canada-59168626

https://www.theguardian.com/us-news/2022/oct/11/russian-analyst-igor-danchenko-steele-dossier-sources

https://www.nytimes.com/2022/10/18/us/politics/igor-danchenko-russia-acquittal-trump.html

https://www.npr.org/2021/11/12/1055030223/the-fbi-arrests-a-key-contributor-to-efforts-trying-to-link-trump-with-russia
It is scary for sure. We were heading towards a brave new world even before ChatGPT, where what's real was no longer to be taken for granted and you couldn't trust your eyes and ears. ChatGPT has just hastened things. I'm sure a new market will emerge for authentication services that verify the speaker or the person in the video, similar to how we had the Twitter verified checkmark.

On the flip side, I can't wait for someone to build a product where I record a few conversations with my parents while they're alive, and then when they're gone, through ChatGPT + vocal fakes, I can have a parent forever. Sure, I'll know it's not the real thing, but when you really miss them, it can certainly make the pain a little less.
Meh. We achieve a state of complete systemic disbelief, and alternate trust mechanisms develop.

[Clip of Abe Lincoln in RayBans saying "Don't believe everything you see and hear on the Internet" goes here]

Even the stuff that wasn't deepfaked was mostly bollocks anyway.
Was anyone else not particularly impressed by this? I haven't listened to much Steve Jobs stuff, but it sounded just stilted enough for me to think something was up. I'm not sure if that was because I came in with a skeptical mindset from the article and the context around it. It may also be due to the fact that I've been using screen-reading software for about 30 years; the only people I've heard more than synthetic speech may be my close family, and I'm not sure about that. Is there anywhere that offers a test where you have to determine what is generated and what is not, with random clips lacking background info?
On the flip side, if these deepfakes are so simple to generate, they will surely bombard us constantly throughout the day: radio, TV commercials, YouTube videos, podcasts.

And people quickly become desensitized to that kind of thing. It could be the case that, after some initial "ramping up" period, these deepfakes are so cheap and abundant that no one falls for them.

When was the last time you got a call from a number you didn't recognize, and you actually picked it up? If you're like me, probably not in a long time, because you're aware it's almost certainly some scam/robo call.
If you know Casey Liss from ATP or his other podcasts: James Thomson made a deepfake of him, with permission, and it is extremely good.

https://mastodon.social/@jamesthomson/110062947060928918

I had no idea things had gone this far.
It is difficult to predict what will happen, but nefarious use of something this powerful is a given.

Generated images and audio clips are still pretty easy to detect; we're not there yet, but close.

What I worry about is not the obvious usage and risks, but the second- and third-order effects, those we can't predict.

It will be interesting.
This feels so over-the-top. "A recording of Joe Biden forgetting his own name or what year it is, or Kamala Harris claiming to be running an abortion clinic?" Give me a break; you could already have done this with a high-quality voice impersonator, and cost is no real concern at that level anyway.

I think this is only really a risk at the low end: people scamming others with fake references from quasi-celebrities. Not great, but it feels like a pretty minor concern overall. We already allow the carriers to scam everyone in the US by letting anyone call your cell phone and tell you you're behind on your insurance, or that your computer has a virus. There are plenty of scams out there; if we really thought this was a problem, we'd care more about the existing ones.
When this topic has come up with family and friends, folks often say they aren't worried (yet) because, while a human can be fooled, these fakes can't yet fool readily available forensic tools, and perhaps never will.

I can't speak to the veracity of that claim, but as the post points out, the past several years have shown us that it doesn't matter, not in the least.

The author goes on to say how it feels inevitable that we'll see a Bannon or Stone type use this technology to create fake scandals.

I'm more worried about the grassroots efforts: crowdsourced conspiracies like QAnon. Now they'll have more capable tools to radicalize people.
Everyone's worried about people using this to generate things that people haven't said (the "grab 'em" comment by Trump), but the inverse is also a problem: you can claim any REAL recording of someone was AI-generated by someone else.
I've already been seeing issues with AI art being used to fake Trump's arrest. People are genuinely confused, and it even took me a second to realise I was being had.
> It’s all fun and games in these demos, but this is inevitably going to be put to use by ratfuckers to create fake scandals in political campaigns... And it feels inevitable that a Roger Stone or Steve Bannon type will use this technology to commission, say, a recording of Joe Biden forgetting his own name...

The road to hell is paved with good intentions, and all politicians are going to use this for their political campaigns, not just one specific side. Both.

Let's not veer off the wider point and believe that only one side will use it for bad things. All politicians are liars, and no matter what side they're on, if it benefits their agenda to influence the electorate and gain power, they will use it, even if that means spreading lies or false and misleading claims.
This is one of those things where Balaji is ahead of the curve: the way to guarantee metadata (e.g. who the speaker is) is to do it cryptographically, on-chain.

https://twitter.com/balajis/status/1583495595737481217
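For anyone curious what that would look like in practice, here's a minimal sketch in Python using the `cryptography` library's Ed25519 primitives. The function names and record layout are my own illustration, not Balaji's actual scheme, and the on-chain anchoring step is elided to a comment:

```python
# Sketch of signed audio provenance: hash the recording, bind the hash to
# speaker metadata, sign the bundle. Publishing the resulting record on a
# public ledger would add a tamper-evident timestamp (not shown here).
import hashlib
import json
import time

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def sign_recording(audio: bytes, speaker: str, key: Ed25519PrivateKey) -> dict:
    """Produce a detached attestation binding `speaker` to this exact audio."""
    record = {
        "sha256": hashlib.sha256(audio).hexdigest(),
        "speaker": speaker,
        "timestamp": int(time.time()),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = key.sign(payload).hex()
    return record  # this record is what you would anchor on-chain


def verify_recording(audio: bytes, record: dict, pub: Ed25519PublicKey) -> bool:
    """True only if the audio matches the hash and the signature checks out."""
    if hashlib.sha256(audio).hexdigest() != record["sha256"]:
        return False  # the audio was altered after signing
    payload = json.dumps(
        {k: record[k] for k in ("sha256", "speaker", "timestamp")},
        sort_keys=True,
    ).encode()
    try:
        pub.verify(bytes.fromhex(record["signature"]), payload)
        return True
    except InvalidSignature:
        return False


key = Ed25519PrivateKey.generate()
record = sign_recording(b"...raw audio bytes...", "Joe Biden", key)
assert verify_recording(b"...raw audio bytes...", record, key.public_key())
```

The obvious limitation: a signature only proves who vouched for a clip, not that the clip is authentic. Trust still has to bottom out in whose keys you believe, which is the hard part the tweet glosses over.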