A hackable AI assistant using a single SQLite table and a handful of cron jobs

800 points by stevekrouse about 1 month ago

42 comments

xp84 about 1 month ago

I don't know if I love this more for the sheer usefulness, or for the delightful over-the-top "Proper English Butler" diction.

But what really has my attention is: Why is this something I'm reading about on this smart engineer's blog rather than an Apple or Google product release? The fact that even this small set of features is beyond the abilities of either of those two companies to ship -- even with caveats like "Must also use our walled garden ecosystem for email, calendars, phones, etc" -- is an embarrassment, only obscured by the two companies' shared lack of ambition to apply "AI" technology to the 'solved problem' areas that amount to various kinds of summarization and question-answering.

If ever there was a chance to threaten either half of this lumbering, anticompetitive duopoly, certainly it's related to AI.

dogline about 1 month ago

This made me think: what if my little utility assistant program, similar to your Stevens, had access to a mailbox?

I've got a little utility program that I can tell to get the weather or run common commands unique to my system. It's handy, and I can even cron it to run things regularly, if I'd like.

If it had its own email box, I could send it information, it could use AI to parse that info, and possibly send an email back, or a new message. Now I've got something really useful. It would parse the email, add it to whatever internal store it has, and delete the message, without screwing up my own email box.

Thanks for the insight.
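A minimal sketch of the "parse the email and add it to an internal store" step described above, using only the Python standard library. The table name `inbox_notes`, the function `ingest_message`, and the `summarize` hook (where an LLM call could go) are all hypothetical names chosen for illustration. Fetching mail from the dedicated mailbox and deleting it afterwards are left out.

```python
import sqlite3
from email import message_from_bytes, policy
from typing import Callable


def ingest_message(
    conn: sqlite3.Connection,
    raw: bytes,
    summarize: Callable[[str], str] = lambda text: text.strip(),
) -> int:
    """Parse one raw email and store it as a row; returns the new row id.

    `summarize` is a placeholder for whatever AI parsing step is used;
    by default it just trims whitespace from the plain-text body.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""

    conn.execute(
        "CREATE TABLE IF NOT EXISTS inbox_notes ("
        "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, note TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO inbox_notes (sender, subject, note) VALUES (?, ?, ?)",
        (str(msg["From"] or ""), str(msg["Subject"] or ""), summarize(body)),
    )
    conn.commit()
    return cur.lastrowid
```

Keeping the AI step behind a plain callable means the storage logic can be tried out without any API key, and the model can be swapped later.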

groseje about 1 month ago

This is the kind of pragmatic AI hack I want to see. It feels like sometimes we are forgetting why certain tooling even exists. To simplify things! No fancy vector DBs or complex architectures, just practical integration with existing data sources. Love it.

squireboy about 1 month ago

> Initially, Stevens spoke with a dry tone, like you might expect from a generic Apple or Google product. But it turned out it was just more fun to have the assistant speak like a formal butler.

Honestly, saying way too little with way too many words (I already hate myself for it) is one of the biggest annoyances I have with LLMs in the personal assistant world. Until I'm rich and thus can spend the time having cute conversations and become friends with my voice assistant, I don't want J.A.R.V.I.S., I need LCARS. Am I alone in this?

jredwards about 1 month ago

I've been kicking around an idea for a similar open source project, with the caveats that:

1. I'd like the backend to be configurable for any LLM the user might happen to have access to (be that the API for a paid service or something locally hosted on-prem).

2. I'm also wondering how feasible it is to hook it up to a touchscreen running on some hopped-up Raspberry Pi platform so that it can be interacted with like an Alexa device or any of the similar offerings from other companies. Ideally, that means voice controls as well, which are potentially another technical problem (OpenAI's API will accept an audio file, but for most other services you'd have to do voice to text before sending the prompt off to the API).

3. I'd like to make the integrations extensible. Calendar, weather, but maybe also Homebridge, Spotify, etc. I'm wondering if MCP servers are the right avenue for that.

I don't have the bandwidth to commit a lot of time to a project like this right now, but if anyone else is charting in this direction I'd love to participate.

Workaccount2 about 1 month ago

Lately I have been experimenting with ways to work around the "context token sweet spot" of <20k tokens (or <50k with 2.5). Essentially doing manual "context compression", where the LLM works with a database to store things permanently according to a strict schema, summarizes its current context when it starts to get out of the sweet spot (I'm mixed on whether it is best to do this continuously like a journal, or in retrospect like a closing summary), and then passes this to a new instance with fresh context.

This works really effectively with thinking models, because the thinking eats up tons of context, but also produces very good "summary documents". So you can kind of reap the rewards of thinking without having to sacrifice that juicy sub-50k context. The database also provides a form of fallback, or RAG I suppose, for situations where the summary leaves out important details, but the model must also recognize this and go pull context from the DB.

Right now I have been trying it out to build what is essentially an inventory management/BOM optimization agent for a database of ~10k distinct parts/materials.
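The hand-off loop described above could be sketched roughly as follows. This is only an illustration of the "closing summary" variant, not the commenter's actual setup: `CompressingContext`, the `archive` table, and the `summarize` callable (standing in for an LLM call) are hypothetical, and the 4-characters-per-token estimate is a crude assumption rather than a real tokenizer.

```python
import sqlite3
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)


class CompressingContext:
    """Keeps a working context under a token budget by summarizing on overflow."""

    def __init__(
        self,
        conn: sqlite3.Connection,
        summarize: Callable[[List[str]], str],
        budget: int = 20_000,
    ) -> None:
        self.conn = conn
        self.summarize = summarize  # placeholder for an LLM summarization call
        self.budget = budget
        self.messages: List[str] = []
        conn.execute(
            "CREATE TABLE IF NOT EXISTS archive (id INTEGER PRIMARY KEY, content TEXT)"
        )

    def tokens(self) -> int:
        return sum(estimate_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        if self.tokens() > self.budget:
            self._hand_off()

    def _hand_off(self) -> None:
        # Archive the raw messages first so details remain retrievable later.
        self.conn.executemany(
            "INSERT INTO archive (content) VALUES (?)",
            [(m,) for m in self.messages],
        )
        self.conn.commit()
        # Start the "fresh instance" with only the summary document.
        self.messages = [self.summarize(self.messages)]
```

The archive table plays the fallback role mentioned in the comment: anything the summary drops can still be looked up, provided the model knows to go and query it.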

mikethemerry about 1 month ago

Along the same lines, I've just done a build called Jeeves. A bit less flair, but very fast to put together. The stack is:

1. Claude Desktop

2. Projects

3. MCPs for [Notion, Todoist], and I'm exploring emails + WhatsApp for a next upgrade

This is for me to support productivity workflows for consulting + a startup. There are a few Notion databases - clients, projects, meetings, plus a Jeeves database. How Jeeves uses the Jeeves database is up to Jeeves, but with some guidance. Jeeves uses his own database for things like tracking a migration of all of my previous meeting notes etc. under the new structure.

For my databases, I've set up my best practices for use. Here's how my minutes look, here's what client one-pagers look like, here's the information to connect it all together, and here's how I manage to-dos. I then drop transcriptions into a new chat, with some text-expanding prompts in Alfred for a few common meetings or similar, and away he goes. He'll turn the transcript into meeting notes, create the to-dos, check everything with me, do a pass, and then go and file everything away into Notion and Todoist via MCP.

It's also self-documenting on this. The Todoist MCP had some bugs, so I instructed Jeeves to go, run all the various use cases it could, figure out the limitations and strengths, document it, and it's filed away in the Jeeves database that it can pull into context.

It lacks the cron features, which I would like, but honestly, a once-a-day prepared prompt dropping into Claude is hardly difficult.

angusturner about 1 month ago

The thing this really drives home for me is how Apple is totally asleep at the wheel.

Today I asked Siri “call the last person that texted me”, to try and respond to someone while driving.

Am I surprised it couldn’t do it? Not really at this point, but it is disappointing that there’s such a wide gulf between Siri and even the least capable LLMs.

tossandthrow about 1 month ago

Here I thought they used the sqlite DB for next token prediction.

For others: they use Claude.

evacchi about 1 month ago

Hah! This is great. I built something similar using mcp.run and a task: https://docs.mcp.run/tasks/tutorials/telegram-bot

For memories (still not shown in this tutorial) I have created a pantry [0] and a servlet for it [1], and I modified the prompt so that it would first check if a conversation existed with the given chat id, and store the result there.

The cool thing is that you can add any servlets on the registry and make your bot as capable as you want.

[0] https://getpantry.cloud/
[1] https://www.mcp.run/evacchi/pantry

Disclaimer: I work at Dylibso :o)

didip about 1 month ago

So… I have a number of questions:

1. How did he tell Claude to “update” based on the notebook entries?

2. Won’t he eventually run out of context window?

3. Won’t this be expensive when using hosted solutions? For just personal hacking, why not simply use ollama + your favorite model?

4. If one were to build this locally, can vector DB similarity search, or a hybrid combined with full-text search, be used to achieve this?

I can totally imagine using pgai for the notebook logs feature and local ollama + deepseek for the inference.

The email idea mentioned by other commenters is brilliant. But I don’t think you need a new mailbox; just pull from Gmail and grep for messages where the sender and receiver are both yourself (aka the self tag).

Thank you for sharing. OP’s project is something I have been thinking about for a few months now.

theptip about 1 month ago

This is fun! I think this sort of tooling is going to be very fertile ground for hackers over the next few years.

Large swathes of the stack are commoditized OSS plumbing, and hosted inference is already cheap and easy.

There are obvious security issues with plugging an agent into your email and calendar, but I think many will find it preferable to control the whole stack rather than ceding control to Apple or Google.

eitland about 1 month ago

> It’s rudimentary, but already more useful to me than Siri!

For me, that is an extremely low barrier to cross.

I find Siri useful for exactly two things at the moment: setting timers and calling people while I am driving.

For these two things it is really useful, but even in these niches, when it comes to calling people, despite it having been around me for years now, it insists on stupid things like telling me there is no Theresa in my contacts when I ask it to call Therese.

That said, what I really want is a reliable system that I can trust with calendar access and that I can have a discussion with, ideally voice based.

0xbadcafebee about 1 month ago

Hmm, there's supposed to be a Tasks [reminders] feature in ChatGPT, but it's in beta (I don't have access to it). Whenever it gets released, you could make some kind of "router" that connects to different communication methods and connect that up to ChatGPT statefully, and you could just "speak"/type to ChatGPT from anywhere, and it would send you reminders. No need for all the extra logic, cron jobs, or SQLite table (ChatGPT has memory across chats).

hwpythonner about 1 month ago

Very cool. I’m wondering if you’ve thought about memory pruning or summarization as usage grows?

What do you think of this: instead of just deleting old entries, you could either do LRU (I guess Claude can help with it), or you could summarize the responses and store the summary back into the same table — kind of like memory consolidation. That way raw data fades, but a compressed version sticks around. Might be a nice way to keep memory lightweight while preserving context.
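A rough sketch of the consolidation idea above, assuming a simplified `memories` table with only `id`, `date`, and `text` columns (the real project's schema may differ). The function name `consolidate` and the `summarize` callable, which stands in for a call to Claude or another model, are hypothetical.

```python
import sqlite3
from typing import Callable, List

# Simplified, assumed schema for illustration only.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS memories ("
    "id INTEGER PRIMARY KEY, date TEXT, text TEXT)"
)


def consolidate(
    conn: sqlite3.Connection,
    cutoff: str,
    summarize: Callable[[List[str]], str],
) -> int:
    """Replace all rows dated before `cutoff` (ISO date) with one summary row.

    Returns the number of raw rows that were folded into the summary.
    """
    rows = conn.execute(
        "SELECT id, text FROM memories WHERE date < ? ORDER BY date, id",
        (cutoff,),
    ).fetchall()
    if len(rows) < 2:
        return 0  # nothing worth compressing

    summary = summarize([text for _, text in rows])
    with conn:  # one transaction: delete the raw rows, insert the summary
        conn.executemany(
            "DELETE FROM memories WHERE id = ?", [(row_id,) for row_id, _ in rows]
        )
        conn.execute(
            "INSERT INTO memories (date, text) VALUES (?, ?)", (cutoff, summary)
        )
    return len(rows)
```

Dating the summary row at the cutoff means a later run with a newer cutoff folds the old summary in again, so the compressed history keeps shrinking instead of accumulating.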

simianwords about 1 month ago

I have built something similar that runs without a server. It required just a few lines in Apple Shortcuts.

TL;DR: I made shortcuts that work on my Apple Watch directly to record my voice, transcribe it and store my daily logs in a Notion DB.

All you need are 1) a ChatGPT API key and 2) a Notion account (free).

- I made one shortcut on my iPhone to record my voice, use the Whisper model to transcribe it (done locally using a POST request) and send this transcription to my Notion database (again a POST request in Shortcuts).

- I made another shortcut that records my voice, transcribes it and reads data from my Notion database to answer questions based on what exists in it. It puts all data from the DB into the context to answer -- costs a lot, but it is simple and works well.

The best part is -- this workflow works without my iPhone and directly on my Apple Watch. It uses POST requests internally, so there is no need to host a server. And the Notion API happens to be free for this kind of use case.

I like logging my day-to-day activities just using Siri on my watch and possibly getting insights based on them. Honestly, the Whisper model is what makes it work, because the accuracy is miles ahead of the local transcription model.

fedeb95 about 1 month ago

The title is a bit misleading since it relies on the Claude API to function.

paulnovacovici about 1 month ago

Curious, how come you decided to use a cloud solution instead of hosting this on a home server? I’ve recently bought a mini PC for small projects like this and have been loving being able to host with no cost associated with it. Granted, it’s probably still incredibly cheap to use an IaaS or PaaS, but it's still a barrier to entry for random projects I want to work on over a weekend.

drog about 1 month ago

I've been using my own telegram -> ai bot and it's very interesting to see what others do with a similar interface.

I had not thought about adding a memory log of all current things and feeding it into the context. I'll try it out.

Mine is a simple stateless thing that captures messages and voice memos, and creates task entries in my org mode file with actionable items. I only feed the current date into the context.

It's pretty amusing to see how it sometimes adds a little bit of its own personality to simple tasks. For example, if one of my tasks is phrased as a question, it will often try to answer the question in the task description.

kylecazar about 1 month ago

I like the idea of parsing USPS Informed Delivery emails (a lot of people I encounter still don't know that this service exists). Maybe I'll make something to alert me when my checks are finally arriving!

avinassh about 1 month ago

For "memory", I wonder how it would be if you used vector search in SQLite and passed only that info along to reduce context size. The ValTown SQLite should have support for the vectors API: https://docs.turso.tech/features/ai-and-embeddings#vectors
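To illustrate the idea of retrieving only the most relevant memories instead of passing everything into the context, here is a small sketch that does brute-force cosine similarity in plain Python over embeddings stored as JSON text. It deliberately does not use the Turso vectors API linked above; the `memories_vec` table and the `top_k` function are hypothetical, and the embeddings are assumed to come from some embedding model that is not shown.

```python
import json
import math
import sqlite3
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(conn: sqlite3.Connection, query_vec: Sequence[float], k: int = 3) -> List[str]:
    """Return the texts of the k stored memories most similar to query_vec."""
    rows = conn.execute("SELECT text, embedding FROM memories_vec").fetchall()
    scored = [(cosine(query_vec, json.loads(emb)), text) for text, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]
```

A full scan like this is fine for a personal assistant with a few thousand rows; a native vector index only starts to matter at much larger scale.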

hiatus about 1 month ago

Is there some way to git clone this? It appears to use git under the hood but doesn't offer a publicly accessible interface.

larsonnn about 1 month ago

I'd argue that these kinds of tools are fun to play with, but in the end, are they really helpful? I start my day like every other day, and at work I just check the calendar. My private calendar has all the information I need. Where is the gap where an assistant makes sense, and where are we just complicating our lives?

emporas about 1 month ago

It reminds me of "Generative AI is just a phase. What’s next is interactive AI." [1]

The more I think about it, however, command line applications are about as interactive as a program can be.

Let's say one is interested in finding out on which of the next 7 days it is going to rain and sending that to a Telegram bot: `weather_forecast --days 7 | grep rain | send_telegram`.

The whole thing of off-loading everything to nondeterministic computation instead of the good ol' determinism does seem strange to me. I am a huge fan, though, of using non-deterministic computation for creating deterministic computation, i.e. programming.

As a side note, I have played chess against Ishiguro many times on lichess.

[1] https://www.technologyreview.com/2023/09/15/1079624/deepmind-inflection-generative-ai-whats-next-mustafa-suleyman/

ajcp about 1 month ago

I'm a little confused as to the 16-bit game interface shown in the article. Is that just for illustration purposes in the article itself, or is there an actual UI you've built to represent Steven/Steven's world?

lazyeye about 1 month ago

A nice little project. I think you could probably do the same with N8N running on a Raspberry Pi.

https://reddit.com/r/n8n

stunnAR about 1 month ago

This is probably naive, and I'm looking forward to a correction: isn't sending your info to Claude's API (or really any "AI API") a violation of your safeguarded private data?

ww520 about 1 month ago

This is awesome. Keep things simple and direct.

The background tasks can call MCP servers to connect to more data sources and services. At least you don’t have to write all the connectors to them yourself.

Sphax about 1 month ago

This is really cool. How much would that cost in Claude API calls?

jurgenaut23 about 1 month ago

Love it, such a nice idea coupled with a flawless execution. I think the future of AI looks a lot more like this than the half-cooked agent implementations that plague LinkedIn…

smusamashah about 1 month ago

Sorry for being pedantic, but the title sounded like no LLM was being used, and was therefore a lot more intriguing. It uses Claude.

> cron job which makes a call to the Claude API

maCDzP about 1 month ago

This is awesome. I think I will play around with this idea using Apple shortcuts. I have a hunch you’ll get really far just using shortcuts.

OSDeveloper about 1 month ago

I think that projects like this are pretty smart, and I like little simple hacked-together things like this, most likely made in a weekend.

sunshine-o about 1 month ago

This is brilliant!

I am wondering, how powerful does the AI model need to be to power this app?

Would a self-hosted Llama-3.2-1B, Qwen2.5-0.5B or Qwen2.5-1.5B on a phone be enough?

triyambakam about 1 month ago

First:

> I’ll use fake data throughout this post, because our actual updates contain private information

but then later:

> which makes a call to the Claude API

I guess we have different ideas of privacy

cess11 about 1 month ago

"It’s very useful for personal AI tools to have access to broader context from other information sources."

How? This post shows nothing of the sort.

"I’ve written before about how the endgame for AI-driven personal software isn’t more app silos, it’s small tools operating on a shared pool of context about our lives."

Yes, probably, so now is the time to resist and refuse to open ourselves up to unprecedented degrees of vulnerability towards the state and corporations. Doing it voluntarily while it is still rather cheap is a bad idea.

pmdr about 1 month ago

Well it's probably ahead of Apple Intelligence in usefulness and functionality. We should see more things like this.

sneak about 1 month ago

Telegram isn’t end to end encrypted. Why would you use an insecure app to transmit private family information like this?

jonahss about 1 month ago

I think the best part was the little video-game video of Stevens checking different datasets by walking around. Love it.

fullstackchris about 1 month ago

Love it - and ironically this is something one would struggle to build with "vibe coding" alone

lnenad about 1 month ago

@stevekrouse FYI getGoogleCalendarEvents is not available.

geonic about 1 month ago

Super fun project – love it!