Launch HN: Wondercraft (YC S22) – Use text-to-speech to create podcasts easily

153 points by diminikolaou almost 2 years ago

Hi HN! We’re Dimitris and Youssef, founders of Wondercraft (https://www.wondercraft.ai/), a platform that uses AI voices to make podcast creation simple. This video shows how it works: https://www.loom.com/share/fa8ac8eba8b9440dbe0321ccb8ba9426?sid=fef9c08a-1003-4377-9d79-8d8a7bf14eb0

“Hacker News Recap” (https://www.wondercraft.ai/podcasts/hacker-news-recap), a podcast produced using our platform, has been running for 4 months and currently gets close to 23k listens per month. We’ve made its analytics publicly available: https://op3.dev/show/f77aea62-97e5-5cce-92c6-9464e51c30c6

Having previously attempted to start a podcast, we were well aware of the difficulties. Figuring out what equipment and software you need to buy is a daunting start. Editing is a lengthy and tedious process, technical difficulties often occur during recording, and planning the logistics around recording is a hassle. As a result, content release is infrequent, which leads to lackluster growth.

At the same time, podcast consumption is experiencing exponential growth. There are 500M podcast listeners around the world, double the number of 5 years ago. Apart from the growth in listeners, podcasts are the medium that is most likely to influence behavior, which is why the number of businesses with podcasts has grown 5x over the past 5 years.

Finally, the last piece that led to the creation of Wondercraft is that text-to-speech models saw a big improvement about 6 months ago, with ElevenLabs releasing models whose output is almost indistinguishable from human speech (see HN thread here: https://news.ycombinator.com/item?id=34361651).

Wondercraft integrates realistic text-to-speech with an infrastructure that simplifies podcast creation. For example, you can integrate music, publish your podcast / create an RSS feed, generate a video for your episode, get assistance with script generation, auto-generate show notes and a transcript, and translate your podcast, all in one place. All text-based tasks (e.g. script assistance, show note generation, etc.) are completed using a chain of custom prompts to LLMs. All text-to-speech is done through custom voices that are either synthetically generated or professionally cloned from voice actors, using the ElevenLabs platform. Tasks such as episode translation involve the use of both LLMs and ElevenLabs. Video generation runs using Remotion, and the RSS feed is an XML creation and updating routine.

Since launching, we’ve had more than 13k users sign up to create their podcast. Use cases that we’re seeing include: businesses repurposing their blogs and generating video content for their socials; writers/bloggers/newsletters reaching their audience through another medium; news outlets and publications adding a news rundown podcast to their lineup; businesses creating internal educational/cultural material; and podcast studios using Wondercraft to serve client needs faster.

Wondercraft is not a tool for fully AI-generated content. Rather, we save people time by transferring content they’ve created (e.g. an article they’ve written) to another medium. This technology is best suited for news rundowns and narration-style podcasts (often used by businesses talking about a niche topic).

And while interview and conversational formats will sound better person-to-person, the logistical and (often) sound quality issues remain, so we’re testing out an “Async Podcasts” feature, where an interviewee can respond to questions async in writing, share a photo and (optionally) a clip of their voice, and a podcast will be created out of it.

We’d love to hear any thoughts, comments or experiences you may have had in relation to using text-to-speech for podcast creation. Thank you for taking the time to read!
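The post describes the RSS feed as "an XML creation and updating routine". As a rough illustration only, and not Wondercraft's actual code, here is a minimal sketch of what such a routine could look like in Python with the standard library. The function names (`create_feed`, `add_episode`, `render`), the chosen fields and the example URLs are all assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def create_feed(title, link, description):
    """Build a minimal RSS 2.0 document with an empty channel (hypothetical helper)."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    return rss


def add_episode(rss, title, audio_url, size_bytes, published):
    """Update the feed by appending one episode <item> (hypothetical helper)."""
    channel = rss.find("channel")
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = title
    # Assumption: the audio URL doubles as the episode's unique identifier.
    ET.SubElement(item, "guid").text = audio_url
    # RSS expects RFC 822 style dates; format_datetime produces that form.
    ET.SubElement(item, "pubDate").text = format_datetime(published)
    # The enclosure element is what podcast clients download.
    ET.SubElement(item, "enclosure", {
        "url": audio_url,
        "length": str(size_bytes),
        "type": "audio/mpeg",
    })
    return item


def render(rss):
    """Serialize the feed to an XML string."""
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    feed = create_feed("Example Show", "https://example.com", "Demo feed")
    add_episode(
        feed,
        "Episode 1",
        "https://example.com/ep1.mp3",
        12345,
        datetime(2023, 8, 11, tzinfo=timezone.utc),
    )
    print(render(feed))
```

A real podcast feed would usually carry more metadata (for example, directory-specific tags), which this sketch leaves out.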

33 comments

Kwpolska almost 2 years ago

People like podcasts, because they are interesting stories told by humans. Good podcasts have a lot of creativity behind them. Your HN Recap podcast uses a bland voice that sometimes struggles with tech terms, and the auto-generated summaries often feature deep details and miss the intention of the story. Auto-generated content on YouTube is usually misleading spam, how will you prevent your auto-generated podcasts from flooding podcast aggregators with such content?

zurfer almost 2 years ago

I love it. I am also a regular listener to "PG Essays"[1]. I would never have read as many of his essays as I'm listening to.

[1] https://podcasts.google.com/feed/aHR0cHM6Ly9hcGkyLndvbmRlcmNyYWZ0LmFpL2ZlZWRzL3BvZGNhc3RzLzE1ODliYTI1LTRmMzktNGVkMC1hZTZjLWM0ZDI0NGJkMDE2OS5yc3M

vishnuharidas almost 2 years ago

I am a regular listener of the podcast "Hacker News Recap" (linked in the post description) and I always doubt if that's a real human reading the script. It is not a simple text-to-speech thing. Instead, it feels like a real human talking, with real emotion in it. I am already in love with Anna, the voice behind the HN Recap podcast!

Also I played around with their podcast generation tool, where it neatly built a podcast from my blog posts. This is a good example of what Generative AI can do in the media domain. Congrats on the production launch! Keep it up!

jedberg almost 2 years ago

This is a great product: I've listened to the HN podcast and it's great.

> podcast consumption is experiencing exponential growth

I find this so interesting! I know my personal podcast consumption has fallen off a cliff since the pandemic started. I pretty much only listen to podcasts when I commute, and I stopped commuting then. I assumed that everyone did that but I guess I was wrong.

aloknnikhil almost 2 years ago

I personally wouldn’t use this. I don’t know if your point about information being “locked” in written form is even being addressed here. There are so many audio books out there but I personally only really enjoy audio books delivered by the author themselves or someone who can actually capture the nuances in the text. So I think you’ll just end up moving this information from being locked in prose to being locked in sound, unless you can accurately capture the tone, nuances and the context around the whole text.

dutchbrit almost 2 years ago

I was listening to an audiobook the other day on a commute that was also done with AI. The main issue I had was focus, the voice was very monotone to begin with, and at one point it pronounced “it” as “IT”. I didn’t finish listening… That said, the voice isn’t that bad in the HN example.

mcpackieh almost 2 years ago

How will you prevent your service from being used to flood the world with worthless algorithmically generated slop?

RankingMember almost 2 years ago

This is both brilliant and scary- I anticipate that the amount of web-scraped stuff about to land on Spotify's podcasts tab is going to be insane.

porkbeer almost 2 years ago

Great, more robot voices. No thanks. The point of a podcast is the human part. I can just have gpt blather to me thru tts if i wanted fake podcasts. I regret saying this, but your tech will actively make the world worse.

benzible almost 2 years ago

Different but related idea, this creates a personal podcast feed: https://reca.st

monological almost 2 years ago

Podshorty does something kind of similar, but it takes any YouTube link, summarizes it and generates a podcast using the voices of the original speakers. Also creates transcripts so you can follow along. https://www.podshorty.com

another-dave almost 2 years ago

Seems cool! One thing I noticed listening to one of the PG essays was that it changed voices for one of the pull quotes, which was a nice touch!

Might be cool to have a feature that read out the source too, like someone would if a human was reading a quote from a book. Hard to control for everyone's different annotation style though, I'd imagine.

Imply8215 almost 2 years ago

Translating a podcast into so many languages with two clicks increases our reach so much. Great stuff Wondercraft, keep it up

cca778 almost 2 years ago

Recently I have produced some short video lectures to distribute to research partners. I can write reasonably well in English, but my speaking is terrible. I managed to prepare fine-tuned English subtitles.

Text-to-speech can help create English audio tracks for those producing original content in other languages.

ilovetts almost 2 years ago

Hello, The TTS voice is fantastic. Any plan to make it available to developers on iOS, Android, Windows etc? The bundled TTS voices aren’t great on these platforms.

kyriakosel almost 2 years ago

There are too many books that I would want to listen to that don't have audiobooks. I'd definitely give it a try with that in mind.

rw2 almost 2 years ago

Great product, first of all. I can really see a use for it. Are you afraid that this is too easy to clone?

Someone with Speechify (https://speechify.com/) who is willing to write code against the Spotify API can do this.

colesantiago almost 2 years ago

This looks great and exciting, congrats on the launch.

I am so happy that this exists. I was considering creating a podcast, but it involved too much effort, with doing and redoing takes, and I had other priorities.

I will be considering Wondercraft, and others if they exist, entirely for this now.

GordonS almost 2 years ago

Not sure if I'm just not seeing it, but I can't find any information on pricing, or whether there's a free tier?

(There's a "start for free" button, but that could mean anything, and it wants me to create an account.)

sakopov almost 2 years ago

Congrats on the launch! Looks like ElevenLabs is your direct competitor. How do you plan to differentiate? So far their pricing is a little better and they also provide the ability to create a custom voice model.

causi almost 2 years ago

I wish there was more in this space geared toward audiobooks. There are so many brilliant novels in my collection that never got an audiobook release and it'd be amazing to be able to generate my own.

swyx almost 2 years ago

congrats on launching! i've been a vocal fan for a little bit: https://twitter.com/swyx/status/1661848597728575489

however when i tried signing up for your pod to make my own, i was disappointed that it would only take manually entered content. i want to hook it up to my twitter or rss or discord feed, and have you Do The Thing. please!

hexage1814 almost 2 years ago

It's amazing how good your service is. Not only the text to speech, but like even the emphasis it knows how to give on each word. Fantastic times, fantastic times indeed.

snissn almost 2 years ago

Your latest hn podcast post has a low level noise that makes it unlistenable to me. You should be able to use non ai tools to remove the background hiss/hum

ksajadi almost 2 years ago

I look at it from the point of view of a consumer of podcasts, not a producer. If the content is good and the voice quality is natural then this can only help unlock more good content, by lowering the barrier to entry. Can't see why that is a bad thing. Sure, there will always be the equivalent of content farms, but remember that content farms exist because of Google. Without Google traffic the incentive to create useless content diminishes. Podcasts are not like that. You might be tricked into listening to one episode of an AI-generated (not just AI-voiced) podcast, but in all likelihood you won't subscribe, removing the incentive.

jermaustin1 almost 2 years ago

I might have missed it on the site, but is there any plan for multiple voices on a single podcast? And any type of annotations to add emotion to the voice (scared, excited, angry)?

magdyks almost 2 years ago

I’ve been listening and really loving the hacker news recap. Keep up the good work and please let me listen in different languages!

oo0shiny almost 2 years ago

This looks like a really cool idea. A question though: who holds the rights to the created audio? The user, or Wondercraft?

languagehacker almost 2 years ago

I was half expecting this to be Wondery spinning off whatever they do to make all their narrators sound like robots.

hdivider almost 2 years ago

This is a welcome tool indeed. Question: how would you describe how you're different to Descript?

rgrieselhuber almost 2 years ago

This is pretty awesome. Are any languages other than English supported?

funthree almost 2 years ago

1hr and 5hr is not long enough

0921kiyo almost 2 years ago

Congrats on the launch! The output quality is very good!
