Show HN: YakGPT – A locally running, hands-free ChatGPT UI

287 points by kami8845, about 2 years ago

Greetings!

YakGPT is a simple, frontend-only ChatGPT UI you can use to either chat normally, or, more excitingly, use your mic + OpenAI's Whisper API to chat hands-free.

Some features:

* A few fun characters pre-installed
* No tracking or analytics, OpenAI is the only thing it calls out to
* Optimized for mobile use via hands-free mode and cross-platform compressed audio recording
* Your API key and chat history are stored in browser local storage only
* Open-source, you can either use the deployed version at Vercel, or run it locally

Planned features:

* Integrate Eleven Labs & other TTS services to enable full hands-free conversation
* Implement LangChain and/or plugins
* Integrate more ASR services that allow for streaming

Source code: https://github.com/yakGPT/yakGPT

I’d love for you to try it out and hear your feedback!
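The post says the API key is kept in browser local storage only. A minimal TypeScript sketch of that idea follows. The slot name `openai-api-key` and the helper names are assumptions for illustration, not taken from the YakGPT source. The store is passed in as a parameter so the same code can run against `window.localStorage` in a browser or against an in-memory stand-in elsewhere.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical slot name, not taken from the YakGPT source.
const API_KEY_SLOT = "openai-api-key";

// Persist the key in the given store; nothing here sends it over the network.
function saveApiKey(store: KeyValueStore, apiKey: string): void {
  store.setItem(API_KEY_SLOT, apiKey.trim());
}

// Return the stored key, or null when none has been saved yet.
function loadApiKey(store: KeyValueStore): string | null {
  const value = store.getItem(API_KEY_SLOT);
  return value !== null && value.length > 0 ? value : null;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data: { [key: string]: string } = {};

  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key)
      ? this.data[key]
      : null;
  }

  setItem(key: string, value: string): void {
    this.data[key] = value;
  }
}
```

In a browser, `saveApiKey(window.localStorage, key)` would keep the key on the device. Any script running on the same page can still read local storage, which is worth keeping in mind when a page loads third-party scripts.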

39 comments

jwarden, about 2 years ago

Nice. It took about a minute to clone it, run it, enter my API key, and get started. The speech-to-text worked flawlessly.

Most people can talk faster than they can type, but they can read faster than other people can talk. So an interface where I speak but read the response is an ideal way of interfacing with ChatGPT.

What would be nice is if I didn't have to press the mic button to speak -- if it could just tell when I was speaking (perhaps by saying "hey YakGPT"). But I see how that might be hard to implement.

Would love to hook this up to some smart glasses with a heads-up display where I could speak and read the response.
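One simple way to approximate "tell when I was speaking" is an energy threshold over short audio frames. The sketch below is an illustration under stated assumptions, not YakGPT's implementation: the function names, the threshold of 0.05, and the three-frame minimum are all hypothetical, and it does not attempt wake-word detection such as "hey YakGPT".

```typescript
// Root-mean-square energy of one frame of PCM samples in the range [-1, 1].
function rms(frame: number[]): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (const sample of frame) sum += sample * sample;
  return Math.sqrt(sum / frame.length);
}

// Returns true once `minFrames` consecutive frames exceed `threshold`.
// Both defaults are illustrative guesses and would need tuning per microphone.
function detectSpeech(
  frames: number[][],
  threshold: number = 0.05,
  minFrames: number = 3
): boolean {
  let run = 0;
  for (const frame of frames) {
    // A quiet frame resets the run, so short clicks do not trigger detection.
    run = rms(frame) > threshold ? run + 1 : 0;
    if (run >= minFrames) return true;
  }
  return false;
}
```

Requiring several loud frames in a row is a crude guard against keyboard clicks and other short noises. A real hands-free mode would also need to decide when the speaker has stopped.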

JimmyRuska, about 2 years ago

I tried it, it looks good! I had to modify the code to accept 8000 tokens for ChatGPT. It would be good if it saved the JSON payload of the responses as well.

It uses 2 external calls to a JavaScript CDN for the microphone package and something else. It would probably be best if it made localhost calls only, since it handles an API key.

FriedPickles, about 2 years ago

I love the concept of this and other alternate ChatGPT UIs, but I hesitate to use them and pay for my calls when I could use chat.openai.com for free.

Any chance you could integrate the backend-api and let me paste in my Bearer token from there?

teawrecks, about 2 years ago

> Run locally on browser – no need to install any applications

That's not what "run locally" means. This isn't any more "local" than talking to ChatGPT directly, which is never running locally.

blairanderson, about 2 years ago

Honestly your "idea generator" blew my mind. Would love to see a section that includes a larger catalog of prefilled prompts.

I'm thinking: What would a GPT project manager do? What would a GPT money manager do? What would a GPT logistics manager do? GPT data analyst, etc.

meghan_rain, about 2 years ago

> Run locally on browser – no need to install any applications

> Please enter your OpenAI key...

Do people just not get it?

I would in fact rather give all my company secrets to this random dude than OpenAI.

asow92, about 2 years ago

Love the idea of prompt dictation. Taking that idea a step further, would it be possible to have a feature where ChatGPT responses are spoken back to the user?

smusamashah, about 2 years ago

This is fast. And talking to it is a nice touch. Consider adding text to speech too :)

One feature I am missing from all these front ends is the ability to edit your text and generate a new response from that point. The official ChatGPT UI is the only one that seems to do that.
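Editing an earlier message and regenerating from that point amounts to replacing that message and discarding everything after it before resending the history. A minimal sketch, assuming the `role` and `content` message shape used by OpenAI's chat API; the function name `editAndTruncate` is hypothetical and not taken from any of the front ends mentioned here.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Replace the message at `index` with new content and drop everything after
// it. The input array is left untouched so the old branch can still be kept.
function editAndTruncate(
  history: ChatMessage[],
  index: number,
  newContent: string
): ChatMessage[] {
  if (index < 0 || index >= history.length) {
    throw new RangeError("index out of range: " + index);
  }
  const kept = history.slice(0, index);
  const edited: ChatMessage = { role: history[index].role, content: newContent };
  return kept.concat([edited]);
}
```

The returned array would then be sent as the `messages` of a new request, so the model answers the edited text rather than the original.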

Tiberium, about 2 years ago

Looks cool! Are you planning on adding more customization to be able to influence the AI? See https://bettergpt.chat/ (it's also open source and uses the API in the browser). Basically with that frontend you can control the role of all messages (e.g. add system messages) and also edit them all to better influence the AI in some cases.

computershit, about 2 years ago

BRO. Your transcription is SO fast. I've hacked at a similar project passing to the Whisper API and honestly I was already blown away with its speed and accuracy (as was anyone I showed it to), but your implementation is so much faster both in TTS as well as the response from their API. I will absolutely use this.

ilovepuppies, about 2 years ago

Very cool. I use a custom local UI as well, based on a fork of a similar project called ChatPad (https://github.com/deiucanta/chatpad). That also uses Mantine UI, and lets you create and save prompts just like chats. Data is stored locally using IndexedDB. I embedded it in an Electron app, which lets me run it from my dock rather than a terminal. But what's missing is speech-to-text, so it's great to see this project has that.

There are a few drawbacks to local, I've discovered. For example, I doubt the new plugins can be extended beyond ChatGPT's web UI. Also, it doesn't stream response tokens as they're generated, which is a pain. I haven't looked into whether OpenAI lets you do that though.

Nice work!
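On the streaming question: OpenAI's chat completions endpoint accepts `stream: true` and then returns server-sent events, each a `data:` line carrying a JSON chunk whose `choices[0].delta.content` holds the next piece of text, ending with `data: [DONE]`. A rough sketch of parsing such a buffer follows. It assumes the buffer contains only complete lines, and the names `extractDeltas` and `StreamResult` are hypothetical.

```typescript
interface StreamResult {
  text: string; // concatenated text pieces found in the buffer
  done: boolean; // true if the [DONE] sentinel was seen
}

// Pull the text deltas out of a buffer of server-sent-event lines.
function extractDeltas(sseText: string): StreamResult {
  let text = "";
  let done = false;
  for (const rawLine of sseText.split("\n")) {
    const line = rawLine.trim();
    // Skip blank separator lines and anything that is not a data line.
    if (line.indexOf("data:") !== 0) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload);
    const choice = chunk.choices && chunk.choices[0];
    const piece = choice && choice.delta ? choice.delta.content : undefined;
    // The first chunk usually carries only a role, so content may be absent.
    if (typeof piece === "string") text += piece;
  }
  return { text, done };
}
```

A real client would read the response body incrementally and buffer any partial line until its newline arrives, since `JSON.parse` throws on a truncated chunk.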

ezzato, about 2 years ago

Looks great. Super interesting to browse other people's code. I'm working on a desktop app for ChatGPT.

https://github.com/EzzatOmar/delegate

throwaway675309, about 2 years ago

Given that Vocode (realtime audio, LLM, etc.) came out a few days ago, could you speak to how yours compares to it?

https://github.com/vocodedev/vocode-python

user-, about 2 years ago

Cool! I tried out the speech to text and it was instant and accurate; I had no idea Whisper was that good.

Do you know their privacy policy for our voices? Do they train on it, hear it, etc.?

Karunamon, about 2 years ago

I absolutely love this! The UI is nice and responsive and this is the first ChatGPT UI that has voice recognition that works outside of Chrome!

I kind of want to throw this up on a server for my housemates to use. I am currently the only person with an OpenAI account, so I would like the ability to embed my API key. Minor feature request :-)

einpoklum, about 2 years ago

Hi ChatGPT! Let me register using my personal information, then tell you what my tasks are at work, what I'm interested in, what I'm struggling with in life and a bunch of other sensitive personal information. I trust you completely, and am sure a nice AI such as yourself would never use my personal data for anything.

illuminated, about 2 years ago

The only thing I'd suggest adding is some sort of authentication. If I deploy this on a server so I could reach it with my mobile, on the go, and it has my API credentials, I wouldn't want anyone who stumbles upon the page to be able to use ChatGPT at my expense.

Otherwise, it really looks good.

fudged71, about 2 years ago

I've been playing around with your Idea Generator persona for the last 15 minutes and have been absolutely blown away. Excellent prompt engineering.

As mentioned by others, it would be great to customize or write new personas/prompts.

Also, could you add a voice chatbot as well using Vocode? It could be an alternative UI for each of the personas.

diversionfactor, about 2 years ago

So if you add audio output to it so I can talk to my computer like in Star Trek, I'll Venmo you $100. Then, I want to have a command line module so I can ask it to write files to the local disk and run them, so I can deploy code it's just written to AWS; that's worth at least another $100.

dingclancy, about 2 years ago

It would be great if I could just press "space" in the app and it just let me talk to it. Keyboard shortcuts!

BTW I have a lot of these ChatGPT UI apps installed, mostly free and open-source. Perhaps this is really the era of going back to just talking to a chat interface like the old times.

chenxi9649, about 2 years ago

This is very well made and designed. I will likely use this instead of the actual ChatGPT UI since their API is a lot cheaper than the $20/month pricing for my usage amount.

Interesting note: I tried speaking Mandarin Chinese into the mic and it auto-translated what I said into English.

donpark, about 2 years ago

Just tried this in both English and Korean. Fumbled a bit with voice control but worked well once I got it going. Very nice. Korean prompts got translated to English so had to tell ChatGPT to respond in Korean to get full non-English UX.

Well done.

oriettaxx, about 2 years ago

It's pretty bad to ask people to enter a private secret key on a website (any website, I mean).

terran57, about 2 years ago

I installed it locally about an hour ago and have been running it through some paces. Nice work! (In addition to the predefined prompts, I like the API usage meter at the top.)

(Now I just need OpenAI to take me off the waitlist for GPT-4.)

psychoslave, about 2 years ago

I’m a bit confused. I tried to utter some queries in Esperanto and French, and it transcribed them as (fine) English translations. Can I disable this behavior to have the text transcribed in the language uttered?
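For context, OpenAI's transcription endpoint (`/v1/audio/transcriptions`) takes an optional `language` form field holding an ISO-639-1 code, while a separate `/v1/audio/translations` endpoint always produces English. Whether setting `language` would change the behavior described here is an assumption, since the thread does not say how YakGPT calls Whisper. A sketch of building the form fields, with a hypothetical helper name:

```typescript
const TRANSCRIPTIONS_URL = "https://api.openai.com/v1/audio/transcriptions";

interface TranscriptionFields {
  model: string;
  language?: string;
}

// Build the non-file form fields for a transcription request.
// `buildTranscriptionFields` is an illustrative helper, not part of any SDK.
function buildTranscriptionFields(language?: string): TranscriptionFields {
  const fields: TranscriptionFields = { model: "whisper-1" };
  if (language !== undefined) {
    // Only the shape is checked: two lowercase letters, such as "fr" or "ko".
    if (!/^[a-z]{2}$/.test(language)) {
      throw new Error("expected a two-letter language code, got: " + language);
    }
    fields.language = language;
  }
  return fields;
}

// In a browser these fields, plus the recorded audio under the name "file",
// would be appended to a FormData and POSTed to TRANSCRIPTIONS_URL with an
// "Authorization: Bearer <key>" header.
```

Leaving `language` unset lets the model detect the language itself, so a UI could expose it as an optional setting rather than a required one.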

andymac4182, about 2 years ago

I might be missing it, but do we have an idea about the prompt that ChatGPT uses so we can replicate the experience?

I haven't played with the OpenAI API yet. Are there examples of good prompts to use to get good responses?

noobcoder, about 2 years ago

Love this. A few things we could add:

- Search feature
- A way to import/export chats
- Star/favourite replies by ChatGPT
- For GPT-4, provide 8k/32k model variations
- Prompt dictionary

victorantos, about 2 years ago

I get a 404 error in the browser console for http://localhost:3000/encoderWorker.umd.js

afro88, about 2 years ago

This is exactly what I need, thank you for building this! We're using Azure Cognitive Services for API access to OpenAI models though. With any luck, expect a PR today for basic Azure support :)

LanternLight83, about 2 years ago

Could I hook this up to one of text-generation-webui's API formats?

aryamaan, about 2 years ago

Would be so fun if you could fork a project on Vercel, i.e. this project has a button to fork, which:

- forks its GitHub repo
- makes a new project on your Vercel, because it's connected to your GitHub
- opens a new tab with your project running

kulikalov, about 2 years ago

Isn’t GPT a trademark owned by OpenAI? Is it legal to use it?

ushakov, about 2 years ago

What's the use-case for this instead of the default UI?

itsthecourier, about 2 years ago

Cross-platform compressed audio recording!? How!?

yosito, about 2 years ago

> Run locally on browser – no need to install any applications

This seems to be a contradiction. Am I running it locally, or is it running on someone else's server?

Obertr, about 2 years ago

Speech-to-text didn't transcribe the text after a minute. The recording was 5s long (((

thelittleone, about 2 years ago

All your prompts are belong to us

kristopolous, about 2 years ago

Make it easier to try

jerrygoyal, about 2 years ago

Could you please add some screenshots of how it looks?
