Hi HN!<p>I've been fed up with search results so much that I decided to make a giant blocklist that removes garbage links using uBlacklist.<p>I browsed other blocklists and wasn't very satisfied with what exists now; the goal of this one is to be super organized and transparent, explaining why each site was blocked via issues. Contributions welcome!<p>Even though only around 100 domains are blocked so far, I've already noticed a big improvement in casual searches. You'd be surprised how some AI-generated websites can dominate the first page on DuckDuckGo.
I'm fed up too. Spammy, AI-looking sites are showing up more and more. For some reason, many of them use the same WordPress theme with a light gray table of contents - they look like this: <a href="https://imgur.com/a/totally-not-ai-generated-efsumgZ" rel="nofollow">https://imgur.com/a/totally-not-ai-generated-efsumgZ</a><p>The problem seems worse on "alternative" search engines, e.g. DuckDuckGo and Kagi, which both use Bing. It's been driving me back to Google.<p>A blocklist seems like a losing proposition unless, like adblock filter lists, it balloons to tens of thousands of entries and gets updated constantly.<p>Unfortunately, this kind of blocklist is highly subjective. This list blocks MSN.com! That's hardly what I would have chosen.
Installed! This shouldn't be a function of the search engine or a plugin, though. This should be integrated into the browser.<p>Another great feature (not for this plugin) would be the option to "bundle" all search results from the same domain. Stuff them under one collapsible entry. I hate going through lists and pages of apple/google/synology/sonos/crab URLs when I already know that I have to search somewhere else.
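The bundling idea above is simple to sketch. Here's a minimal illustration of the grouping step (the URLs are made up; a real extension would do this over the rendered result list):

```python
from urllib.parse import urlparse
from collections import OrderedDict

def bundle_by_domain(results):
    """Group search-result URLs under their hostname, preserving the
    order in which each domain first appears in the result list."""
    bundles = OrderedDict()
    for url in results:
        domain = urlparse(url).hostname or url
        bundles.setdefault(domain, []).append(url)
    return bundles

# Hypothetical result list for a tech-support query:
results = [
    "https://support.apple.com/en-us/101555",
    "https://support.apple.com/guide/mac-help/welcome/mac",
    "https://forum.synology.com/t/some-thread",
]
for domain, urls in bundle_by_domain(results).items():
    print(domain, len(urls))
```

Each bundle would then render as one collapsible entry, with the per-domain count as a hint of how much it's crowding the page.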
It's not going to be long before we need to move to a whitelist model, rather than a blacklist model.<p>It ironically makes me think of the Yahoo Web Directory in the 90s.<p>Time is a flat circle.
So, if you already run uBlock Origin (and of course you do), you can use this list without installing any additional extensions by going to 'Filter lists' in the uBlock settings, then Import, then entering <a href="https://raw.githubusercontent.com/popcar2/BadWebsiteBlocklist/refs/heads/main/uBlacklist.txt" rel="nofollow">https://raw.githubusercontent.com/popcar2/BadWebsiteBlocklis...</a> as the URL.<p>Not saying you <i>should</i>, just that you <i>could</i>...
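For anyone curious what's inside a file like that: uBlacklist subscription lists are, as far as I know, just one match pattern per line, roughly like this (hypothetical domains):

```
*://*.example-content-farm.com/*
*://spammy-recipe-blog.example/*
```

That one-pattern-per-line shape is why it's so easy to repurpose the same file across different blockers.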
Hi @popcar2 — how are you sourcing the domains for the blocklist? We'd like to evaluate those domains and consider whether they should be removed from DuckDuckGo as spam. You can also report a site directly in the search results by clicking the three-dot menu next to the link and selecting "Share Feedback about this Site".
The Kagi search engine has a way in the settings to bulk-upload lists of domains to block (or raise) them. Has anyone uploaded a list like this to it?<p>I may do that.
The problem with a list like this is that a “bad website” is in the eye of the beholder. I’m not saying that there’s anything wrong with you personally not liking the Shopify or Semrush blogs. But everyone else has their own calculus.<p>It’s the same reason why social media blocklists can be problematic—everyone’s calculus is different.<p>My suggestion is that you promote it as a starter list and suggest that users fork it for their own needs.
I recently started a crypto scam/phishing blocklist if you wanna roll these into your list as well.<p>It also works well with Pi-hole and other platforms.<p><a href="https://github.com/spmedia/Crypto-Scam-and-Crypto-Phishing-Threat-Intel-Feed">https://github.com/spmedia/Crypto-Scam-and-Crypto-Phishing-T...</a>
This is one of those features a proper search engine (i.e., not a thinly-veiled advertising network) should have. If users can customize their search results and share their sorting/filtering methods, then that presents a large number of constantly-moving targets that greatly drives up the cost of SEO. There's no "making the Google algorithm happy." Instead, it becomes more "making the users happy."
I don't understand why so many corporate blogs are blocked.
Most of them are about their product, or about the industry in general.<p>- For example, the Kaspersky blog doesn't look bad.<p>- The CCleaner blog is just a list of updates.
Related: Freya Holmér - "Generative AI is a Parasitic Cancer" <a href="https://www.youtube.com/watch?v=-opBifFfsMY" rel="nofollow">https://www.youtube.com/watch?v=-opBifFfsMY</a> (1h19m54s) [2025-01-02].<p>She talks at length about how pages of AI-generated nonsense text are cluttering search results on Google and all other search engines.
I've been using GoogleHitHider, which also works on other search engines like DDG. It's worked well for many years. It's a list I curated myself for personal use, though; I definitely wouldn't mind seeing what other people have.
This is cool. It would be pretty easy to add the domains from this list to Kagi's blocked domain list and have it integrated in the search without a plugin. The downside obviously is having to update that list from the repo, but still, as OP says, even with just a hundred domains blocked it's already a big improvement.
I think there's big potential in using DNS blacklists for this: they have the advantage of being massively scalable and simple to maintain, and client configuration to use them is easy too.<p>The scalability comes from the caching inherent in DNS; instead of millions of people downloading text files from a website over HTTP on a regular basis, the data is in effect lazy-uploaded into the cloud of caching DNS resolvers, with no administration cost on the part of the DNSBL operator.<p>Reputation whitelists (or other scoring services) would also be just as easy to implement.
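To make the mechanism concrete: a domain-based DNSBL client just prepends the domain to the blocklist zone and does an ordinary A lookup; by convention, an answer in 127.0.0.0/8 means "listed" and NXDOMAIN means "not listed". A minimal sketch, assuming a hypothetical zone `dnsbl.example`:

```python
import socket

DNSBL_ZONE = "dnsbl.example"  # hypothetical blocklist zone

def query_name(domain: str, zone: str = DNSBL_ZONE) -> str:
    """Build the DNSBL query name: the checked domain prepended to the zone."""
    return f"{domain.rstrip('.')}.{zone}"

def is_listed_answer(address: str) -> bool:
    """Domain-based DNSBLs conventionally answer from 127.0.0.0/8 when
    the queried name is listed."""
    return address.startswith("127.")

def check(domain: str) -> bool:
    """True if `domain` is listed in the (hypothetical) DNSBL."""
    try:
        answer = socket.gethostbyname(query_name(domain))
    except socket.gaierror:  # NXDOMAIN: not listed
        return False
    return is_listed_answer(answer)

print(query_name("content-farm.example"))
# content-farm.example.dnsbl.example
```

Every resolver along the path caches the answers, which is where the "lazy upload into the cloud" scaling comes from.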
This is cool! Not entirely sure whether I think it's a good idea, but I wonder if it'd be useful to come up with a way to tranche websites.<p>Some sites are complete garbage and should be blocked, of course. Others (e.g., in my experience, Quora) are sometimes quite good and sometimes quite bad. Wouldn't be my first choice, but I've found them useful at times.<p>For a given search, maybe you try with the most aggressive blocking / filtering. If you fail to find what you're looking for, maybe soften the restriction a bit.<p>Maybe this is overwrought...
One enraging thing: if some guy on GitHub can do this, why the F** can't billion-dollar search giants put in a little human effort to do it too, right in their search engines?<p>SEO spam and AI slop are easily spotted on the human level. Google has hundreds of thousands of employees. Just put ONE of them on this f**ing job!<p>It's criminal what these companies have let happen to the web.
Tangent: I may laughably still use Malwarebytes, but when I'm image searching on Google and it stops me from opening a picture with an adware alert, I'm like "oh damn"... I use an adblocker and generally don't do anything sus on my main OS, but yeah, I'm still unsure whether I'm safe (paranoia ensues).<p>I use a VM in other scenarios, but even then, is it properly separated?
What on earth are people still searching for using search engines? I’ve found ChatGPT to be significantly better at answering questions I have than Google or DDG or any other search engine. It’s still AI slop, but at least it’s a bit more succinct, and I can ask follow-up questions.
Brave has Goggles, which do exactly this. You can even share the list with others.<p><a href="https://search.brave.com/goggles/discover" rel="nofollow">https://search.brave.com/goggles/discover</a>
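From memory (check Brave's Goggles quickstart for the exact grammar), a Goggle is a plain-text rule file with metadata comments and per-site actions, roughly like this (hypothetical names and domains):

```
! name: My anti-slop list
! description: Discard content farms, boost primary sources
! public: false
! author: someone

$discard,site=example-content-farm.com
$boost=2,site=docs.example.org
```

The nice part versus a pure blocklist is that you can downrank or boost instead of only removing, and the whole file is shareable by URL.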
Every time I search for content about Supabase, some trash AI-generated content website like Restack shows up and wastes my time. I'm not saying Restack is bad, but a customizable blocker that blocks a site only for specific topics might be good for me.
A hosts file with tens of thousands of entries, Kagi for search, and recipes from the spammers' godsend LLM in LibreWolf is still an option, but no idea for how long.