科技回声

15 comments

ttul 7 months ago

The more I listen to NotebookLM “episodes”, the more I am convinced that Google has trained a two-speaker “podcast discussion” model that directly generates the podcast off the back of an existing multimodal backbone. The two speakers interrupt and speak over each other in an uncannily humanlike manner. I wonder whether they basically fine-tuned against a huge library of actual podcasts along with the podcast transcripts, and perhaps generated synthetic “input material” from the transcripts to feed in as training samples.

In other words, take an episode of The Daily and have one language model write a hypothetical article that would summarize what the podcast was about. Then pass that article into the two-speaker model, transcribe the output, and see how well that transcript aligns with the article fed in as input.

I am sure I’m missing essential details, but the natural sound of these podcasts cannot possibly be coming from a text transcript.
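
The last step of that guess (checking how well a transcript of the generated audio lines up with the article that was fed in) can be sketched roughly as below. This is purely illustrative: nothing here reflects how NotebookLM is actually trained, the function names `tokenize` and `alignment_score` are made up for this sketch, and plain vocabulary overlap (Jaccard similarity) is only a stand-in for whatever alignment measure a real pipeline might use.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def alignment_score(article: str, transcript: str) -> float:
    """Jaccard overlap between the two vocabularies, from 0.0 to 1.0.

    Hypothetical stand-in for an alignment metric: 1.0 means the article and
    the transcript use exactly the same words, 0.0 means no words in common.
    """
    article_tokens = tokenize(article)
    transcript_tokens = tokenize(transcript)
    if not article_tokens and not transcript_tokens:
        return 1.0  # two empty texts are trivially aligned
    shared = article_tokens & transcript_tokens
    combined = article_tokens | transcript_tokens
    return len(shared) / len(combined)


if __name__ == "__main__":
    article = "The city council approved the new transit budget."
    transcript = "So the council approved a new budget for transit."
    print(f"alignment: {alignment_score(article, transcript):.2f}")
```

A low score on a held-out pair would suggest the generated episode drifted from its source material; an embedding-based similarity would likely be more robust than word overlap, but that is also an assumption.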

notpushkin 7 months ago

This is in fact pretty explicitly not open source: https://github.com/meta-llama/llama-recipes/blob/d83d0ae7f5c9953737d9bbdca490c03282f24c37/pyproject.toml#L18

(And given there is no LICENSE file, I’m afraid you can only use this code as reference at best right now.)

jrm4 7 months ago

Great to see this. Fellow tech-geeks, ignore the NotebookLM thing at your peril.

NotebookLM, far and away, has been the "AI Killer App" for the VAST MAJORITY of bright-but-not-particularly-techy people I know. My 70ish parents and my 8-year-old kid are both just blown away by this thing and can't stop playing with it.

Edit: As someone pointed out below, I absolutely mean just the "podcast" thing.

terhechte 7 months ago

I tried to build something kind of like NotebookLM (personalized news podcasts) over the past months (https://www.tailoredpod.ai), but the biggest issue is that the existing good TTS APIs are so expensive that a product such as NotebookLM is not really possible for a normal company that doesn't have internal access to Google's models. OpenAI has the cheapest TTS API with good-enough quality, but even then generating hours of audio for free is way too expensive.

Open source TTS models are slowly catching up, but they still need beefy hardware (e.g. https://github.com/SWivid/F5-TTS).

lelag 7 months ago

Pretty weird choice of TTS engines. None of them are anywhere near state of the art as far as open TTS systems go. XTTSv2 or the new F5-TTS would have been much better choices.

rmorey 7 months ago

The sample output is very poor. Cool demo, but really just emphasizes how much of a hit product the NotebookLM team has managed to come up with, ostensibly with more or less the same foundation models already available.

danpalmer 7 months ago

I'm not sure this is so much an open source NotebookLM as it is a few experiments in an IPython notebook. What NotebookLM does at an LLM level is not particularly novel; it's the packaging as a product, in a different way than what others are doing, that I think is interesting. Also, the "podcast" bit is really just an intro/overview of a large corpus. Far more useful is being able to discuss that corpus with the bot and get cited references.

What this does demonstrate, however, is that prototyping with LLMs is very fast. I'd encourage anyone who hasn't had a play around with the APIs to give it a go.

antononcube 7 months ago

Here is another (Jupyter-based) notebook solution supporting LLaMA models: https://raku.land/zef:antononcube/Jupyter::Chatbook

Here is a demo movie: https://youtu.be/zVX-SqRfFPA

zmmmmm 7 months ago

It only creates the podcasts, right?

I am more interested in the other features of NotebookLM. The podcasts are fun but gimmicky.

alanzhuly 7 months ago

If we could have this running locally on a mobile phone, that would be pretty cool. Imagine receiving a work document (for example, a product requirements document) and having this turn it into a podcast to play for me while I am driving. I think my productivity would go through the roof, and I wouldn't need to worry about compliance issues.

sajid-aipm 7 months ago

I wonder how soon they will release this in other languages and with different accents, especially SE Asian accents.

jklein11 7 months ago

Man.. the sample is pretty rough

mmaunder 7 months ago

I’d love to hear the output if anyone has used this.

luxus 7 months ago

Now I need something that pseudonymizes my PDFs/input in the first step.

gnabgib 7 months ago

Page title: NotebookLlama: An Open Source version of NotebookLM

NotebookLlama: An open source version of NotebookLM
