TechEcho

20 comments

ggerganovover 1 year ago

Heh, funny to see this popup here :)The performance on Apple Silicon should be much better today compared to what is shown in the video as whisper.cpp now runs fully on the GPU and there have been significant improvements in llama.cpp generation speed over the last few months.

评论 #38113936 未加载

评论 #38113405 未加载

评论 #38115967 未加载

评论 #38117816 未加载

评论 #38114039 未加载

评论 #38125049 未加载

评论 #38116315 未加载

wxnxover 1 year ago

This is cool. I hooked up Llama to an open-source TTS model for a recent project and there was lots of fun engineering that went into it.On a different note:I think the most useful coding copilot tools for me reduce "manual overhead" without attempting to do any hard thinking/problem solving for me (such as generating arguments and types from docstrings or vice-versa, etc.). For more complicated tasks you really have to give copilot a pretty good "starting point".I often talk to myself while coding. It would be extremely, extremely futuristic (and potentially useful) if a tool like this embedded my speech into a context vector and used it to as an additional copilot input so the model has a better "starting point".I'm a late adopter of copilot and don't use it all the time but if anyone is aware of anything like this I'd be curious to hear about it.

nwoliover 1 year ago

A few months until this is effectively outlawed if the open weights proposal in 270 days comes into existence

评论 #38113985 未加载

评论 #38113896 未加载

评论 #38117142 未加载

评论 #38114595 未加载

评论 #38112245 未加载

lbltavaresover 1 year ago

I'm getting a "floating point exception" when running ./talk-llama on arch and debian. Already checked sdl2lib and ffmpeg (because of this issue: <a href="https://github.com/ggerganov/whisper.cpp/issues/1325">https://github.com/ggerganov/whisper.cpp/issues/1325</a>) but nothing seems to fix it. Anyone else?

评论 #38139300 未加载

avereveardover 1 year ago

Aren't there text to talk solution that can receive a stream of text so one doesn't have to wait for llama to finish production before getting the answer talked out?I guess it'd only work if the model can keep the buffer filled fast enough so the tts engine doesn't stall.

评论 #38111497 未加载

评论 #38111691 未加载

评论 #38113299 未加载

cjbprimeover 1 year ago

Would it be possible to reduce lag by streaming groups of ~6 tokens at a time to the TTS as they're generated, instead of waiting for the full LLM response before beginning to speak it?

评论 #38117233 未加载

boiler_up800over 1 year ago

What’s the best chat interface for llama? I have a 3090 and would love to get one of the models running in my terminal for quick coding tasks.

评论 #38113458 未加载

评论 #38113116 未加载

评论 #38113650 未加载

评论 #38112266 未加载

评论 #38115965 未加载

d3nj4lover 1 year ago

This makes me wonder, what's the equivalent to ollama for whisper/SOTA OS tts models? I'm really happy with ollama for locally running OS LLMs, but I don't know of any project that makes it that simple to set up whisper locally.

评论 #38113252 未加载

评论 #38114126 未加载

评论 #38113257 未加载

onemoresoopover 1 year ago

Could anyone explain the capability of this in plain English? Can this learn and retain context of a chat and build on some kind of long term memory? Thanks

评论 #38114048 未加载

kristopolousover 1 year ago

this has really strong eliza vibes.

SillyUsernameover 1 year ago

Does anybody have a quick start for building it all in Windows for this? I could probably check it out as a VS project and build but I'm going to bet since it's not documented it's going to have issues specifically because the Linux build instructions are the only ones that are a first class citizen...

human183729153over 1 year ago

What are currently the best/go-to approaches to detect the end of an utterance? This can be tricky even in conversations between humans, requiring semantic information about what the other person is saying. I wonder if there’s any automated strategy that works well enough.

评论 #38144519 未加载

wahnfriedenover 1 year ago

How does this choose when to speak back? (Like is it after a pause, or other heuristics.) I tried looking through the source to find this logic.

评论 #38116544 未加载

rgbrgbover 1 year ago

very sick demo! if anyone wants to work on packaging this up for broader (swiftUI/macos) consumption, I just added an issue <a href="https://github.com/psugihara/FreeChat/issues/30">https://github.com/psugihara/FreeChat/issues/30</a>

bribriover 1 year ago

Elevenlabs voice is amazing but it's so expensive. You can easily spend $20 on a single conversation.

评论 #38122576 未加载

danielEMover 1 year ago

All I need is ... Vulkan support :) pls, pls, pls :)

columnover 1 year ago

why use this instead of "memgpt run" ?

评论 #38111773 未加载

评论 #38113682 未加载

评论 #38111480 未加载

horsellamaover 1 year ago

can this be used to talk with local documents?say I have a research paper in pdf, can I ask llama questions about it?

评论 #38114114 未加载

BigRedDog1669over 1 year ago

Tacoma

birdyroosterover 1 year ago

I don’t want to talk to anything in my terminal. It’s a shitty interface for that.

评论 #38114666 未加载

评论 #38120195 未加载

20 comments

ggerganovover 1 year ago

评论 #38113936 未加载

评论 #38113405 未加载

评论 #38115967 未加载

评论 #38117816 未加载

评论 #38114039 未加载

评论 #38125049 未加载

评论 #38116315 未加载

wxnxover 1 year ago

nwoliover 1 year ago

A few months until this is effectively outlawed if the open weights proposal in 270 days comes into existence

评论 #38113985 未加载

评论 #38113896 未加载

评论 #38117142 未加载

评论 #38114595 未加载

评论 #38112245 未加载

lbltavaresover 1 year ago

评论 #38139300 未加载

avereveardover 1 year ago

评论 #38111497 未加载

评论 #38111691 未加载

评论 #38113299 未加载

cjbprimeover 1 year ago

Would it be possible to reduce lag by streaming groups of ~6 tokens at a time to the TTS as they're generated, instead of waiting for the full LLM response before beginning to speak it?

评论 #38117233 未加载

boiler_up800over 1 year ago

What’s the best chat interface for llama? I have a 3090 and would love to get one of the models running in my terminal for quick coding tasks.

评论 #38113458 未加载

评论 #38113116 未加载

评论 #38113650 未加载

评论 #38112266 未加载

评论 #38115965 未加载

d3nj4lover 1 year ago

评论 #38113252 未加载

评论 #38114126 未加载

评论 #38113257 未加载

onemoresoopover 1 year ago

Could anyone explain the capability of this in plain English? Can this learn and retain context of a chat and build on some kind of long term memory? Thanks

评论 #38114048 未加载

kristopolousover 1 year ago

this has really strong eliza vibes.

SillyUsernameover 1 year ago

human183729153over 1 year ago

评论 #38144519 未加载

wahnfriedenover 1 year ago

How does this choose when to speak back? (Like is it after a pause, or other heuristics.) I tried looking through the source to find this logic.

评论 #38116544 未加载

rgbrgbover 1 year ago

bribriover 1 year ago

Elevenlabs voice is amazing but it's so expensive. You can easily spend $20 on a single conversation.

评论 #38122576 未加载

danielEMover 1 year ago

All I need is ... Vulkan support :) pls, pls, pls :)

columnover 1 year ago

why use this instead of "memgpt run" ?

评论 #38111773 未加载

评论 #38113682 未加载

评论 #38111480 未加载

horsellamaover 1 year ago

can this be used to talk with local documents?say I have a research paper in pdf, can I ask llama questions about it?

评论 #38114114 未加载

BigRedDog1669over 1 year ago

Tacoma

birdyroosterover 1 year ago

I don’t want to talk to anything in my terminal. It’s a shitty interface for that.

评论 #38114666 未加载

评论 #38120195 未加载

Talk-Llama

20 comments

Talk-Llama

20 comments