Code is available at: <a href="https://github.com/dnhkng/GlaDOS">https://github.com/dnhkng/GlaDOS</a><p>You can also run the Llama-3 8B GGUF, with the LLM, VAD, ASR and TTS models fitting in about 5 GB of VRAM total, but it's not as good at following the conversation or at being interesting.<p>The goals for the project are:<p>All local! No OpenAI or ElevenLabs; this should be fully open source.<p>Minimal latency - You should get a voice response within 500 ms (but no canned responses!)<p>Interruptible - You should be able to interrupt whenever you want, but GLaDOS also has the right to be annoyed if you do...<p>Interactive - GLaDOS should be multi-modal and able to proactively initiate conversations (not done yet, but planned)<p>Lastly, the code base should be small and simple (no PyTorch, etc.), with minimal layers of abstraction.