Ask HN: Musical archive needs your help - Safety ML filter for 200K+ photos

4 points by panosfilianos about 2 years ago

Hello everyone,

A friend works as an archivist on one of the largest historical music archives. His team has been tasked with going through the personal archive of a donor.

The archive contains hundreds of thousands of pictures that the donor downloaded during ordinary browsing, all saved with random filenames.

The issue is that useful pictures are mixed together with adult pictures, and there is no way to tell them apart other than manual review. The team has been working on this for weeks and still has a ton of pictures to go through.

Is there any software I can help them put together to make the distinction automatically with good accuracy? Even a rough sort (possibly adult / possibly safe) is good enough. I am an SWE myself, so I can build something quickly in many languages (especially Python and JS).

The archive pertains to one of the greatest opera singers of the 20th century and is the largest to date, so your help here will be meaningful.

Thanks a lot!

2 comments

jpoesen about 2 years ago
No experience with any of this, but a quick search turns up stuff like https://deepai.org/machine-learning-model/nsfw-detector, which looks affordable and straightforward to implement.

And here's a bunch more: https://rapidapi.com/collection/nudity-detection-image-moderation-api
sinuhe69 about 2 years ago
You can use CLIP to build a database that you can search (offline) for keywords like "nude" or "naked". Specifically, I use clip-anytorch with the ViT-B/16 pretrained model and find the results very good. Just go to PyPI and the corresponding GitHub repository. They have examples and a demo for a quick start. It can run on CPU too, albeit a bit slowly.
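
For anyone who wants to try the CLIP suggestion, here is a minimal sketch of how it could feed the "possibly adult / possibly safe" sort the original post asks for. It assumes `clip-anytorch` exposes the same `clip` module as the OpenAI CLIP package (`clip.load`, `clip.tokenize`, `encode_image`, `encode_text`) and that `torch` and Pillow are installed. The two text prompts, the 0.2 / 0.8 thresholds, the bucket names, and the helpers `bucket_for`, `triage`, `make_clip_scorer`, and `main` are all illustrative choices, not something from the thread, and would need tuning on a hand-labelled sample before being trusted.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, Iterable, List

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}

# Hypothetical prompts; index 0 is the "adult" class.
PROMPTS = [
    "a photo containing nudity or explicit adult content",
    "a safe photo with no nudity",
]


def bucket_for(p_adult: float, low: float = 0.2, high: float = 0.8) -> str:
    """Map an 'adult' probability to a coarse bucket (thresholds are assumptions)."""
    if p_adult >= high:
        return "possibly_adult"
    if p_adult <= low:
        return "possibly_safe"
    return "needs_review"


def triage(paths: Iterable[str], scorer: Callable[[str], float],
           low: float = 0.2, high: float = 0.8) -> Dict[str, List[str]]:
    """Score every path and group paths by bucket.

    Files the scorer cannot handle (corrupt, unsupported) go to manual review.
    """
    buckets: Dict[str, List[str]] = {
        "possibly_adult": [], "possibly_safe": [], "needs_review": []
    }
    for path in paths:
        try:
            p_adult = scorer(path)
        except Exception:
            buckets["needs_review"].append(path)
            continue
        buckets[bucket_for(p_adult, low, high)].append(path)
    return buckets


def make_clip_scorer(model_name: str = "ViT-B/16") -> Callable[[str], float]:
    """Build a scorer returning the softmax weight CLIP gives the 'adult' prompt."""
    # Heavy imports are kept local so the pure helpers above work without them.
    import torch
    import clip  # assumed to be provided by the clip-anytorch package
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load(model_name, device=device)
    with torch.no_grad():
        text_features = model.encode_text(clip.tokenize(PROMPTS).to(device))
        text_features = text_features / text_features.norm(dim=-1, keepdim=True)

    def score(path: str) -> float:
        image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
        with torch.no_grad():
            image_features = model.encode_image(image)
            image_features = image_features / image_features.norm(dim=-1, keepdim=True)
            probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
        return probs[0, 0].item()

    return score


def main(archive_dir: str, report_csv: str = "triage_report.csv") -> None:
    """Walk the archive and write a bucket,path report instead of moving files."""
    paths = [str(p) for p in Path(archive_dir).rglob("*")
             if p.suffix.lower() in IMAGE_EXTS]
    buckets = triage(paths, make_clip_scorer())
    with open(report_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["bucket", "path"])
        for bucket, items in buckets.items():
            for item in items:
                writer.writerow([bucket, item])
```

Calling `main("/path/to/archive")` would produce a CSV the archivists can sort or filter. Writing a report rather than moving files keeps the originals untouched, and the middle `needs_review` bucket leaves borderline cases to a human instead of forcing a yes/no call.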