Show HN: Ichido, search engine that tags sites using Google and Cloudflare

122 points by anthonyhn about 2 years ago

Hello HN, in my spare time I work on an experimental search engine named Ichido. Search is fascinating: there are so many features you can add to a search engine, yet I find the existing search engines a bit limited in the features they offer. So I decided to work on my own search engine to test out different features, search algorithms, and front ends in order to improve my (and hopefully others') searching experience. Ichido includes a tagging system that provides more info on search results. For example, if a site links to Google services or uses Cloudflare, a tag is shown with the search result that lets the user know about that site's use of those services. Ichido also includes links to RSS feeds in search results, making it much easier to find RSS feeds. This search engine is free to use, but if you like the service and want to support continued development, please consider making a donation (Ichido currently supports donations through Liberapay).
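
As a rough illustration of the tagging idea described above, here is a minimal sketch in Python. It is not Ichido's actual method, which the post does not describe. The host list, the tag names, and the `tag_page` function are all assumptions made for the example; the only real-world facts relied on are that Cloudflare-proxied responses typically carry a `Server: cloudflare` or `CF-RAY` header and that feeds are commonly advertised with `<link rel="alternate">`.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical, incomplete list of Google-operated hosts (assumption).
GOOGLE_HOST_SUFFIXES = (
    "google.com", "googleapis.com", "google-analytics.com",
    "googletagmanager.com", "gstatic.com", "doubleclick.net",
)
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class _LinkCollector(HTMLParser):
    """Collects every src/href URL and any advertised feed links."""

    def __init__(self):
        super().__init__()
        self.urls = []
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        url = a.get("src") or a.get("href")
        if url:
            self.urls.append(url)
        is_alternate = "alternate" in (a.get("rel") or "").lower().split()
        is_feed_type = (a.get("type") or "").lower() in FEED_TYPES
        if tag == "link" and is_alternate and is_feed_type and a.get("href"):
            self.feeds.append(a["href"])


def _is_google_host(host):
    # Match the domain itself or any subdomain, but not "notgoogle.com".
    return any(host == s or host.endswith("." + s) for s in GOOGLE_HOST_SUFFIXES)


def tag_page(html, headers):
    """Return (tags, feed_urls) for one fetched page. Tag names are made up."""
    parser = _LinkCollector()
    parser.feed(html)
    tags = set()
    if any(_is_google_host(urlparse(u).hostname or "") for u in parser.urls):
        tags.add("Google services")
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    if lowered.get("server") == "cloudflare" or "cf-ray" in lowered:
        tags.add("Cloudflare")
    return tags, parser.feeds
```

A heuristic like this only sees what the crawler fetched at crawl time, so a tag can go stale if a site later changes its setup.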

16 comments

superasn about 2 years ago

I think the tags could be grouped, like "Extreme trackers", "Moderate trackers", etc., with clicking on a group expanding the full list. Also, one really useful tag would be "Affiliate links", if there is a way to identify that a page contains affiliate links like Amazon affiliate, etc. Those pages are almost always crap. Also a tag for "Modal popups": those are too often just marketing-related websites, and I'd definitely want to skip them if I knew prior to visiting.

mg about 2 years ago

I run this search engine comparison tool: https://www.gnod.com/search/ Just added Ichido. Click on "more engines" to activate it.

corobo about 2 years ago

Search engines will do literally anything except offer the option "never show results from this domain again". Is there something obvious I'm missing that makes it infeasible, or is it just something only I want? As for this site, there are too many tags for them to be useful imo. Give it 2 weeks of using the search engine and I bet you could hide silly fake tags in there and I'd never notice. Lots of tags = no tags. I was picturing maybe a little pillbox type thing like you might find appended to Google search results. For instance, when a result is a PDF: https://img.imgy.org/-7lq.jpg
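
For what it's worth, the filtering step itself for a "never show results from this domain again" option can be sketched in a few lines of Python. This says nothing about why engines don't offer it; the `is_blocked` and `filter_results` names and the shape of the result dicts are assumptions made for the example.

```python
from urllib.parse import urlparse


def is_blocked(url, blocked_domains):
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)


def filter_results(results, blocked_domains):
    """Drop results whose URL falls under any blocked domain.

    `results` is assumed to be a list of dicts with a "url" key.
    """
    return [r for r in results if not is_blocked(r["url"], blocked_domains)]
```

Matching on a leading dot keeps `blog.example.com` blocked when `example.com` is on the list, without also catching an unrelated host such as `notexample.com`.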

coolspot about 2 years ago

I would prefer more logical tags like “top 1k”, “aggregator”, “user-generated content” over technical ones like “utm” and “obfuscated scripts”. Also, I would prefer tags grouped together into expandable lists and not all shown by default. Every site uses JavaScript; I don’t want to see it over and over again unless I specifically query for it.

jesprenj about 2 years ago

Another interesting search proxy is SearX. Written in Python, it supports many backend engines and can be self-hosted. And here's a lightweight frontend/proxy I wrote in C for using Google search on low-end phones that can't render bloated HTML (SearX was too complicated to install): http://searc.4a.si:7327/search?q=news It's also nice that the structured HTML it produces, which doesn't constantly change, makes it ideal for programmatically querying Google. You still run into captchas, though, which it cannot solve, if the queries get too suspicious.

ocdtrekkie about 2 years ago

This looks great, I am really glad to see things making it more obvious how pervasive malicious Google scripts are. I find the webp flag interesting, as I don't think webp itself is inherently harmful, except for being an image spec that solely exists because Google NIHs everything and wants to write their own everything. (Long live JPEG-XL!) I'm curious why you chose to tag it explicitly though.

TekMol about 2 years ago

In your about page, I see you are using Bing's API. I didn't even know Bing has a search API that everyone can use! How much do you have to pay them for this?

danuker about 2 years ago

Thank you! I think any competition is welcome for search engines, with Google going down the monetization path. A piece of feedback: when I select "Remove top ...." and click Submit, then click Next, the popularity filter is gone. Edit: looks like the file type filter is dropped as well. Please add the filter arguments to the pagination links.
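
One way to keep filters across pages is to build each pagination link from the current query string and change only the page number. A minimal sketch in Python follows; the parameter names (`q`, `popularity`, `filetype`, `page`) and the `page_url` helper are hypothetical, since the site's real parameters are not given in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def page_url(current_url, page):
    """Return current_url with its page parameter set to `page`.

    Every other query argument (search terms, filters) is carried over
    unchanged, so the filters survive a click on Next.
    """
    parts = urlparse(current_url)
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != "page"  # "page" is an assumed parameter name
    ]
    params.append(("page", str(page)))
    return urlunparse(parts._replace(query=urlencode(params)))
```

For example, `page_url("https://example.com/search?q=rust&popularity=1000&page=1", 2)` keeps `q` and `popularity` and only swaps the page number.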

1vuio0pswjnm7 about 2 years ago

The pagination keeps increasing past the point where Bing provides no more results. Testing a popular search term, for which there are no doubt millions of results, it was only possible to get new results up to page 45. Yet the website keeps incrementing the page number and result numbers as if new results were being returned. Then I tried the same search with popularity set to 500000 and could not even get a single full page of 10 results. It's laughable to assume from this "search" that only, say, 500004 out of the millions of websites in existence include this term. Not that I want to browse a full list, but at least I want to know how many hits I got. Then I can add more terms and try to reduce that number.

simultsop about 2 years ago

What would be the issue with being hosted on CF? I believe it is a better option than the rest of the shared hosting industry. If it's nothing critical, what's the intention of tagging it?

flas9sd about 2 years ago

I see you offer an opensearch.xml already. If you embed it as a link node with the appropriate type, it will be straightforward to add it to the browser as the (default) search engine: https://developer.mozilla.org/en-US/docs/Web/OpenSearch#autodiscovery_of_search_plugins Also: happy to give this a try, more knobs for power users.

daoudc about 2 years ago

This is really cool! Please consider joining forces with us at mwmbl.org, would love to incorporate some of these ideas.

partyguy about 2 years ago

Nice project! However, when I search for my site (https://spacehey.com), it shows multiple tags, most of them false (Cloudflare, UTM Tracking, WEBP Images). I used Cloudflare at one point in the past, but don't anymore. Additionally, there has never been UTM tracking or anything like that, nor WEBP images... Where do you get such data from? Apart from that, awesome project!

bastawhiz about 2 years ago

What's the use case for this? If I don't want Google scripts, I block them. I'll use a user agent that doesn't download or run them. If I don't want cookies, I'll instruct my browser not to save cookies. In what situation would knowing whether a site uses these things decide whether it's a search result I want to visit?

jacooper about 2 years ago

Brave Goggles also does something similar, allowing you to filter search results the way you want.

KomoD about 2 years ago

Too many tags, and if a site has something, like scripts, why do you say "may"? If a site has scripts then it's not "This site may be using Javascript", it's for sure that the site uses it...? And the popularity filter doesn't work: the results are empty, and if you try going to any of the other pages it removes the filter.