科技回声

11 comments

mxmlnkn, over 2 years ago

Interesting read, especially the lookup method based on partitioning. I tried to implement a similar reverse image search based on dHash, as explained here: <a href="https://github.com/Rayraegah/dhash" rel="nofollow">https://github.com/Rayraegah/dhash</a>. However, I also had lookup performance problems. Exact matches are not a problem, but Hamming-distance threshold matching is. Because my project was in Python, I tried to eke out more performance by writing a BK-tree backend module in C++: <a href="https://github.com/mxmlnkn/cppbktree" rel="nofollow">https://github.com/mxmlnkn/cppbktree</a>. It was 2 to 10x faster than an existing similar module but was still too slow for lookups in a database of millions of images. Since lookup speed tended to depend on the exact Hamming-distance threshold value, my next step would have been to try to optimize the hash, e.g., make it shorter so that only a small Hamming distance needs to be looked up. But the mentioned multi-indexing method looks much more promising and better tested.
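For readers who have not met BK-trees, here is a minimal Python sketch of threshold lookup over integer hashes. It is an illustration only and not the cppbktree API: the `BKTree` class and its `add` and `search` methods are hypothetical names. Each child is stored under its Hamming distance to the parent, and the triangle inequality lets a query skip every subtree whose edge distance falls outside `[d - threshold, d + threshold]`.

```python
from typing import Dict, List, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


class BKTree:
    """Minimal BK-tree over integer hashes (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # A node is (value, children); children are keyed by their
        # Hamming distance to the node's value.
        self.root: Optional[Tuple[int, Dict]] = None

    def add(self, value: int) -> None:
        if self.root is None:
            self.root = (value, {})
            return
        node = self.root
        while True:
            d = hamming(value, node[0])
            if d == 0:
                return  # already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (value, {})
                return
            node = child

    def search(self, query: int, threshold: int) -> List[Tuple[int, int]]:
        """Return sorted (distance, value) pairs within the threshold."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            value, children = stack.pop()
            d = hamming(query, value)
            if d <= threshold:
                results.append((d, value))
            # Triangle inequality: only these edges can lead to matches.
            for edge, child in children.items():
                if d - threshold <= edge <= d + threshold:
                    stack.append(child)
        return sorted(results)


tree = BKTree()
for h in (0b0000, 0b0001, 0b0011, 0b1111):
    tree.add(h)
print(tree.search(0b0000, 1))  # [(0, 0), (1, 1)]
```

The pruning window grows with the threshold, so larger thresholds visit more of the tree, which is consistent with the threshold sensitivity described above.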

Comment #33262533 not loaded

ajtulloch, over 2 years ago

“Fast Search in Hamming Space with Multi-Index Hashing” (<a href="https://www.cs.toronto.edu/~norouzi/research/papers/multi_index_hashing.pdf" rel="nofollow">https://www.cs.toronto.edu/~norouzi/research/papers/multi_in...</a>) is a great paper. Note that you can do significantly better (in terms of query latency/throughput) with specialized implementations, as in, e.g., FAISS (<a href="https://github.com/facebookresearch/faiss/wiki/Binary-hashing-index-benchmark" rel="nofollow">https://github.com/facebookresearch/faiss/wiki/Binary-hashin...</a>).
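A rough Python sketch of the partitioning idea behind multi-index hashing, under simplifying assumptions: a 64-bit hash is split into 4 chunks of 16 bits, and only exact chunk matches are probed. By the pigeonhole principle, two hashes that differ in at most 3 bits must agree exactly on at least one of the 4 chunks, so 4 table lookups yield every candidate. The full method in the paper also probes nearby chunk values to support larger radii; that part is omitted here, and the `MultiIndex` class is a hypothetical name, not an API from the paper or from FAISS.

```python
from collections import defaultdict
from typing import List

HASH_BITS = 64
CHUNKS = 4
CHUNK_BITS = HASH_BITS // CHUNKS
CHUNK_MASK = (1 << CHUNK_BITS) - 1


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def split(h: int) -> List[int]:
    """Split a 64-bit hash into CHUNKS pieces of CHUNK_BITS bits each."""
    return [(h >> (i * CHUNK_BITS)) & CHUNK_MASK for i in range(CHUNKS)]


class MultiIndex:
    """One hash table per chunk position (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.tables = [defaultdict(set) for _ in range(CHUNKS)]

    def add(self, h: int) -> None:
        for i, chunk in enumerate(split(h)):
            self.tables[i][chunk].add(h)

    def search(self, query: int, threshold: int) -> List[int]:
        # Exact chunk probing is only complete while threshold < CHUNKS.
        if threshold >= CHUNKS:
            raise ValueError("threshold must be smaller than the chunk count")
        candidates = set()
        for i, chunk in enumerate(split(query)):
            candidates |= self.tables[i].get(chunk, set())
        # Verify candidates with the full Hamming distance.
        return sorted(h for h in candidates if hamming(query, h) <= threshold)


index = MultiIndex()
stored = 0x0123456789ABCDEF
index.add(stored)
print(index.search(stored ^ 0b101, 3) == [stored])  # True: 2 bits differ
```

Each stored hash costs one row per chunk and each query costs one lookup per chunk, which matches the "4 rows" and "4 requests" figures mentioned elsewhere in this thread.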

johndough, over 2 years ago

There is also <a href="https://rom1504.github.io/clip-retrieval/?back=https%3A%2F%2Fknn5.laion.ai&index=laion5B&useMclip=false" rel="nofollow">https://rom1504.github.io/clip-retrieval/?back=https%3A%2F%2...</a>, which indexes the 5B-image LAION dataset (basically a subset of the images from Common Crawl) and allows both text and image queries. The code as well as the image dataset is open source (although you need a new hard drive just to store the URLs).

fxtentacle, over 2 years ago

4 requests per image lookup x 2000 requests per second = 8000 DynamoDB reads per second = 28,800,000 reads per hour. The AWS price list says reserved instances come in at "$0.00013 per RCU" x 28,800,000 = $3744 per hour. Does this really cost $2.6 million per month to operate? If yes, contact me and I'll be happy to save you $2.595 million monthly by re-implementing things in C++ on bare metal servers.
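The arithmetic above can be reproduced in a few lines of Python. The first reading follows the comment and charges the quoted $0.00013 for every individual read. The second reading is an assumption, not verified against the price list here: it treats the quoted rate as a price per RCU-hour, where one provisioned RCU sustains roughly one read per second, and it ignores item size and read consistency. The two readings differ by a factor of 3600, so which one applies matters a great deal.

```python
# Figures quoted in the comment above.
REQUESTS_PER_SECOND = 2000
READS_PER_LOOKUP = 4
PRICE_PER_RCU = 0.00013  # USD, as quoted
HOURS_PER_MONTH = 24 * 30  # assumes a 30-day month

reads_per_second = REQUESTS_PER_SECOND * READS_PER_LOOKUP  # 8000
reads_per_hour = reads_per_second * 3600  # 28,800,000

# Reading 1 (the comment's): the price applies to every single read.
per_read_hourly = reads_per_hour * PRICE_PER_RCU
per_read_monthly = per_read_hourly * HOURS_PER_MONTH

# Reading 2 (assumption): the price is per RCU-hour, and one RCU
# sustains one read per second, so 8000 reads/s needs 8000 RCUs.
per_rcu_hour_hourly = reads_per_second * PRICE_PER_RCU
per_rcu_hour_monthly = per_rcu_hour_hourly * HOURS_PER_MONTH

print(f"per read:     ${per_read_hourly:,.2f}/hour, ${per_read_monthly:,.0f}/month")
print(f"per RCU-hour: ${per_rcu_hour_hourly:,.2f}/hour, ${per_rcu_hour_monthly:,.2f}/month")
```

With a 30-day month, the per-read reading lands near $2.7 million, in the same range as the figure in the comment.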

Comment #33266367 not loaded

Comment #33266339 not loaded

jonatron, over 2 years ago

I used ORB descriptors and Hamming distance for a fuzzy/partial/inexact reverse image search at my last job. While it didn't scale like the OP's does, it found some really interesting fuzzy matches.

tiffanyh, over 2 years ago

<a href="https://tineye.com" rel="nofollow">https://tineye.com</a> is pretty great for doing internet-wide reverse image searches.

Comment #33262217 not loaded

Corendos, over 2 years ago

This was my internship subject (at another company) just before I graduated. I wonder what they used for the perceptual hash; ours was SIFT features. Happy to see that what I implemented would have been able to scale that much!

Comment #33261472 not loaded

Comment #33262344 not loaded

jcbages, over 2 years ago

This looks so cool. I didn't know anything about perceptual hashing, but the idea makes a lot of sense. I'm curious whether the system loses its effectiveness if a user shifts an image a few pixels, reflects it, or applies a filter that gives all pixels a slightly different tonality. I'm also thinking of some system, maybe a small ML program, that applies a minimal amount of noise to the image such that it looks the same to the human eye but is totally different pixel-wise.
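Two of these cases can be explored with a toy difference hash (dHash, the scheme mentioned elsewhere in this thread). This is a sketch under assumptions: the article's actual perceptual hash may differ, the 9x8 grid is a made-up test pattern, and a real dHash would first downscale the image to 9x8 grayscale. Since dHash only records whether each pixel is brighter than its right-hand neighbor, a uniform brightness offset leaves every bit unchanged (ignoring clipping at 255), while a mirror image reverses the comparisons and produces a very different hash.

```python
from typing import List


def dhash(pixels: List[List[int]]) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | int(left > right)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


# Hypothetical 9x8 grayscale grid (8 rows of 9 pixels gives 64 bits).
image = [[(x * 37 + y * 91) % 256 for x in range(9)] for y in range(8)]

# Uniform tonal shift: every pixel gets brighter by the same amount.
brighter = [[p + 10 for p in row] for row in image]

# Horizontal reflection.
mirrored = [row[::-1] for row in image]

print(hamming(dhash(image), dhash(brighter)))  # 0
print(hamming(dhash(image), dhash(mirrored)))  # large: mirroring is not tolerated
```

A system that needs to match reflected copies would have to index the hash of the mirrored image as well, or use a different descriptor; the sketch says nothing about adversarial noise.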

culi, over 2 years ago

95% of the time when I'm using reverse image search I'm just trying to find the original source of something. This is the opposite of what Google Lens focuses on but is a much simpler problem to solve :(

fxtentacle, over 2 years ago

I would love to hear about the cost of this deployment. 10 billion images x 4 rows each in Amazon DynamoDB with 2000 rps sounds like it'll burn through your wallet faster than most explosives...

swyx, over 2 years ago

something i felt was missing from the piece: what is the typical Hamming distance cutoff for something like this? he explains that 2 is small enough to be identical, but presumably it can go really high. what happens with a distance of like 100? what's the state of research on false positives and adversarial attacks?

Comment #33262559 not loaded

Simple, Fast, and Scalable Reverse Image Search
