I built Vibevoice, a tool that lets you dictate text and run voice commands with screen context anywhere on your system. It works like this:<p>For regular dictation, hold the right Ctrl key while speaking, then release it to have your words typed automatically wherever your cursor is. This works well for coding, emails, or chat apps.<p>The more interesting feature is the AI command mode: hold the Scroll Lock key, speak a prompt, and a local LLM responds based on both your words and a screenshot of what you're looking at. The AI's response gets typed directly into your application as if you had typed it yourself.<p>Everything runs locally, using Whisper for transcription and Ollama for the LLM (I recommend gemma3:27b for best results). No cloud services or API costs.<p>The tool was inspired by Karpathy's "vibe coding" and builds upon Vlad's whisper-keyboard project. I extended it to work with local models and added the screenshot context feature, which makes the AI much more useful for everyday tasks like writing emails.<p>Let me know what you think!
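<p>For anyone curious how the AI command mode could hand a spoken prompt plus a screenshot to a local model, here is a minimal sketch. It is an illustration under assumptions, not necessarily how Vibevoice does it: it assumes Ollama is listening on its default port (11434) and uses Ollama's `/api/generate` endpoint, which accepts base64-encoded images for multimodal models. The helper names `build_payload` and `ask_ollama`, and the file `screenshot.png`, are hypothetical.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint (assumes an unmodified local install).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, screenshot_png: bytes, model: str = "gemma3:27b") -> dict:
    """Build a non-streaming generate request with one attached image.

    Hypothetical helper: the transcript is the prompt, and the screenshot
    is sent as a base64 string in the "images" list.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(screenshot_png).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def ask_ollama(payload: dict, url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """POST the payload to Ollama and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


if __name__ == "__main__":
    # Assumes a screenshot was already saved to disk and the model is pulled.
    with open("screenshot.png", "rb") as f:
        payload = build_payload("Draft a short reply to this email.", f.read())
    print(ask_ollama(payload))
```

The returned text would then be passed to whatever types keystrokes into the focused application. Transcription, screenshot capture, and hotkey handling are left out here.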