Personally I love this new AI voice content boom. My favorites are the AI presidents discussing/playing/rating stuff [1] and Attenborough narrating Warhammer [2].<p>While slightly silly, I think in future people might start to take the internet way less seriously when you can literally make things up. Which I think is a good trend. People getting depressed or so consumed by internet culture that they lose all interest in the real world has been a trend I've disliked. We might just be going back to the time of early, anonymous internet forums, when most people who used them just didn't take them too seriously, and a lot of communities were closed/invite-only.<p>[1] <a href="https://youtu.be/IkaAZE_UGMo" rel="nofollow">https://youtu.be/IkaAZE_UGMo</a><p><a href="https://youtu.be/q6ra0KDgVbg" rel="nofollow">https://youtu.be/q6ra0KDgVbg</a><p><a href="https://youtu.be/iAq-yg72GWw" rel="nofollow">https://youtu.be/iAq-yg72GWw</a><p>[2] <a href="https://youtu.be/X6RCLJ4pDaw" rel="nofollow">https://youtu.be/X6RCLJ4pDaw</a>
After listening to this, I feel annoyed at the folks in the "AI clones teen girl’s voice in $1M kidnapping scam: ‘I’ve got your daughter’" thread that was on the front page yesterday saying "ppfftt bullshit you'd need to be an idiot to fall for an AI voice" - Guess I'm an idiot. Like it or not, generative media is getting better and better by the day.
There is a need for a solution to this problem -- something like public/private key cryptography or a "more advanced social security" number to verify the authenticity of a source via digital signature.<p>There is a business to be made there.
It sounded weird at times, almost like someone reading a script with a higher pitch than his normal voice. But the technology is amazing and I would most likely be fooled by a short clip if the content wasn't completely out of character for him.
The ironic thing about this is that it reminds me of how the podcast used to be a few years before Spotify, before it got bogged down in politics, when Joe asked more interesting questions and it was just generally more fun. I just listened to this for 20-odd minutes, which is probably longer than I've listened to the actual podcast in the last month or two.
ChatGPT always answers questions like:
Prompt: Can you tell me about <description of thing>
ChatGPT: Sure, let me tell you <slightly rephrased description of thing><p>It's pretty funny to hear that in Sam Altman's voice, along with the umms.<p>There's a growing number of companies working on voice. We used one recently on a game: it's not quite ready for main characters (yet!), but for background characters and rapid prototyping on main characters (that we plan on rerecording for the final assets) it's already there. It's so close, but none of them quite capture inflection; it's the Stable Diffusion fingers of audio AI.
I think what gives it away is that when answering questions, ChatGPT first repeats the question.
For example the question is:
"If you were to be found in a small blue room with your favorite food, what would you do?".<p>The answer would start with:
"if I were found in a small blue room with my favorite food I would..."<p>Normal people don't usually talk like that.
Rogan is always the first one to be used as input/demo for this stuff.<p>I have a feeling it's going to remain the trend for every new AI tool.
This isn't a new issue. Image and likeness laws exist for a reason.<p>A clearer example of how this could be harmful, and how we are already equipped to deal with it: I saw an ad for a supplement that had a convincing AI Joe Rogan "talking about it on his podcast" and how it's the greatest thing that everyone needs to buy. This is illegal currently, and it's not any different from hiring a Joe Rogan impersonator to talk in a similar-looking podcast set to trick people. It's why we have systems to enforce ownership over trademarks, copyrights, and your own image.
I think we're going to need some sort of 'real human' proof system where, when you record any audio or video, you publish the media hash, n-second segment hashes, and signed participants to a ledger/blockchain. You could also build a tamper-proof device that you place in frame that uses a combination of a hard-to-get-at private hardware key and the local ambiance to produce a signature you encode as a subtle signal that can later be used to authenticate the video.
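A rough sketch of the hashing half of that idea, in case it helps make it concrete. Everything here is an assumption for illustration: the function names (`build_manifest`, `manifest_digest`, `clip_matches`) are made up, the recording is treated as raw bytes at a fixed bytes-per-second rate, and the signing and ledger steps are left out entirely. The digest of the manifest is simply the value you would sign and publish.

```python
import hashlib
import json


def build_manifest(media: bytes, bytes_per_second: int, segment_seconds: int) -> dict:
    """Hash the whole recording plus each n-second segment (hypothetical layout)."""
    if bytes_per_second <= 0 or segment_seconds <= 0:
        raise ValueError("bytes_per_second and segment_seconds must be positive")
    step = bytes_per_second * segment_seconds
    # The last segment may be shorter than the others.
    segments = [media[i:i + step] for i in range(0, len(media), step)]
    return {
        "media_hash": hashlib.sha256(media).hexdigest(),
        "segment_seconds": segment_seconds,
        "segment_hashes": [hashlib.sha256(s).hexdigest() for s in segments],
    }


def manifest_digest(manifest: dict) -> str:
    """SHA-256 of the canonical JSON form: the value that would be signed and published."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def clip_matches(manifest: dict, index: int, clip: bytes) -> bool:
    """Check a clip's bytes against the published hash for segment `index`."""
    hashes = manifest["segment_hashes"]
    return 0 <= index < len(hashes) and hashlib.sha256(clip).hexdigest() == hashes[index]
```

Note that this only works on byte-identical clips cut at segment boundaries. Any re-encoding or an off-boundary cut changes the hashes, so a real system would probably need something more robust than plain SHA-256 over raw bytes.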
The old boys club which has been outsourcing programming to India for 20 years figured out that they can just fake it since nobody has complained yet.<p>And before software engineers unionize they came up with some snake oil to put us in our place, maybe try to force us back into offices, or just eliminate us.<p>Nobody can deny that Sam Altman and Bill Gates have been trying to "reduce costs" for a long time. The startups have devs in Portugal, Costa Rica, Mexico, Spain, Ukraine, India, China, anyplace where they can pay 5 dollars for a day's work.<p>When Bill Gates said he would pay programmers 7 dollars an hour, we were all offended; we didn't realize that he was already doing it and that it would be a significant raise in pay.
Fantastic execution. Generating the dialogue I understand; how did they get the voices so incredibly lifelike?! Where do I start if I want text to speech like this?!
I think we now have the technology to build the talking portraits we saw in Harry Potter.
Voice, facial movements, and dialogue that matches the character can all be generated by AI now.
I'm just not sure if this can be done in real time yet to interact with another person.<p>In the future, you may be able to have a conversation with a portrait of your parents even after they pass away. (So collect as much training data now?)
I listened and I think it’s really good. I wonder how much effort went into polishing the sound.<p>Or did they just feed a script into some voice generator?
The one thing that I want from all this AI voice stuff is a better selection of voices for the Siris and Alexas out there, i.e. the ability to clone someone's voice onto an assistant.<p>Some voices are just more pleasant than others, and it differs by the person doing the listening.
You're telling me this was fake all along? <a href="https://twitter.com/TallBart/status/1643108942627864577" rel="nofollow">https://twitter.com/TallBart/status/1643108942627864577</a>
I think we'll see media personalities move to a public key signing model, where authentic streams are signed in some manner with a private key so that anyone can verify their origin with the matching public key.
Personally I prefer AI Joe Rogan discussing Bionicle with AI Jordan Peterson.<p><a href="https://www.youtube.com/watch?v=kVX1PB19TYE">https://www.youtube.com/watch?v=kVX1PB19TYE</a>
Biggest appeal of his podcast is that he's a brilliant standup comedian, and he uses that in his podcasts. That's completely missing from this copy.
Based on my experience with Substack posts of ChatGPT rewriting layoff notices in various voices, including Trump's:<p>The novelty of this idea has worn off. The first post has almost 20x the views of the second. I don't think Rogan has anything to worry about.