I wrote this tool to get familiar with the CLIP model. I know many people have written similar tools with CLIP before, but I'm new to machine learning, and writing a classic tool like this helps me study.

The unusual thing about my version is that it's written in pure Node.js, using node-mlx, a machine learning framework for Node.js.

The linked repo is mostly the indexing and the CLI; the model implementation lives in a separate Node.js module: https://github.com/frost-beta/clip

Hope this helps other learners!
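For anyone else learning along: the core idea is just to embed every image once with CLIP's image encoder, embed the search query with the text encoder, and rank images by cosine similarity. A rough sketch of that part in plain Node.js, where `embedText` is a hypothetical placeholder for whatever the CLIP module exposes (not the repo's actual API):

```js
// Rank an in-memory index of {path, embedding} records against a text query.
// The embeddings are Float32Arrays produced by the image encoder at indexing time;
// embedText() is a placeholder for the CLIP text encoder.

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function search(query, index, topK = 10) {
  const q = await embedText(query); // placeholder: text -> Float32Array
  return index
    .map(({ path, embedding }) => ({ path, score: cosine(q, embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```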
I was planning to do this myself, lol. I was going to use SQLite as the index and `sqlite-vec` or something similar to query for similar files directly (rough sketch below). The only other things I was planning were more filters: `"positive term" -"negative term"` to negate results, `>90"search"` to find images that match by more than 90%, and some generic filters like `--size >1mb` to help narrow things down when you're looking for a specific image. Quantizing the embeddings to make them smaller/faster also seemed interesting, but I haven't tried it yet.
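Roughly what I had in mind, assuming the `sqlite-vec` Node bindings and `better-sqlite3` (untested sketch; the table name is made up, and I'm assuming 512-dim CLIP ViT-B/32 embeddings). With L2-normalized embeddings, the default L2 distance gives the same ranking as cosine similarity:

```js
const Database = require("better-sqlite3");
const sqliteVec = require("sqlite-vec"); // loads the vec0 extension into the connection

const db = new Database("photos.db");
sqliteVec.load(db);

// One row per image; CLIP ViT-B/32 embeddings are 512 floats.
db.exec(`CREATE VIRTUAL TABLE IF NOT EXISTS photo_vecs USING vec0(embedding float[512])`);

// Indexing: store each embedding as a raw float32 blob, keyed by the photo's rowid.
function addPhoto(rowid, embedding /* Float32Array of length 512 */) {
  db.prepare("INSERT INTO photo_vecs(rowid, embedding) VALUES (?, ?)")
    .run(rowid, Buffer.from(embedding.buffer));
}

// Search: KNN over the stored vectors; with normalized embeddings,
// ordering by L2 distance is equivalent to ordering by cosine similarity.
function search(queryEmbedding /* Float32Array of length 512 */) {
  return db.prepare(`
    SELECT rowid, distance
    FROM photo_vecs
    WHERE embedding MATCH ?
    ORDER BY distance
    LIMIT 10
  `).all(Buffer.from(queryEmbedding.buffer));
}
```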
It uses only one core at 100% under Linux; can this be changed?

10 images, each ~20 KB in size, took more than 10 minutes to index. Is that normal without GPU acceleration?
Very cool! Here is a similar Python version:

https://github.com/spullara/photoindex

Oh, and if you want to run something locally on your iPhone, you can use my app, which I'm still testing:

https://x.com/getrememberwhen
This is cool. Is there also a way to show the contents of an image as indexed? E.g. image 1 has a cat and a dog.

There are a lot of tools/apps that let you "search images", but not many that let you just as easily "read images".
I have wanted to clean up my photo collection for ages and remove any NSFW pictures that might be hiding somewhere.

Would this be able to do that, and how likely is it that it will see a PC release?
I've been enjoying https://github.com/mazzzystar/Queryable on iPhone.
How does CLIP compare to YOLO [1]? I haven't looked into image classification/object recognition for a while, but I remember that YOLO was quite good and worked on realtime video too.

[1]: https://pjreddie.com/darknet/yolo/
I use a similar app, rclip: https://github.com/yurijmikhalevich/rclip
I have made a similar Android app for semantic image search; it works offline too. I'm still gathering feedback and polishing the UI, but it works. If you are brave enough, here it is: https://drive.google.com/file/d/1tE0cY6umj5h5zCY_Jvaou1M8sCfzWMOR/view?usp=drive_link
In Russian, "sisi" is a variation of "tits".

Is there a job/service that confirms branding is appropriate across different languages? Seems like a non-trivial problem to solve.