Search engines and SEO spam

592 points by iamjbn, over 3 years ago

107 comments

dang, over 3 years ago
This was in response to mwseibel's thread, which had a big discussion yesterday:

*Google no longer producing high quality search results in significant categories* - https://news.ycombinator.com/item?id=29772136 - Jan 2022 (1167 comments, spread over multiple pages - note the "X more comments" links at the bottom)
jrockway, over 3 years ago
To some extent, I worry that the problem with search engines is that there isn't any data worth returning. Yesterday's thread talked a lot about reviews. Writing a review is hard work that requires deep domain expertise, experience with similar products, and months of testing. If you want a review for something that came out today, there is no way that work could have been done, so there simply isn't anything to find. Instead you'll get a list of "Best TVs 2021" or whatever, with some blurb and an affiliate link, not an actual review. That's what people can make for free with a day's notice, so if you write a search engine that discards those sites, that's fine, you'll just return "no results" for every interesting query.

I guess what I'm saying is that if you want better reviews, you probably want to start writing reviews and figuring out how to sell them for money. Many have tried, few have succeeded. But there probably isn't some JavaScript that will fix this problem.
Veen, over 3 years ago
> Why not try writing a search engine specifically for some category dominated by SEO spam?

Back in the olden days, there were lots of organizations that collated high quality content from the best writers. They nurtured expert writers and paid them well. They fact-checked the content and employed diligent editors and proofreaders so it was accurate and well-written. Over the years, they'd build a reputation for reliability and trustworthiness that kept people coming back for more. If you wanted to learn about fitness, or cars, or cooking, or science, you'd find a reputable author and publisher and buy their magazines or books.

But then, in the early 2000s, the geniuses from SV "disrupted" the publishing industry and its financial model. They brought us a much better way to find content, the search engine. Because they were so much better than the old-fashioned publishers, search engines gobbled up the advertising money and became the dominant gateway to content. Publishers had to abandon expensive high-quality writing because rankings and eyeballs now mattered more than quality and trustworthiness. Instead of investing in writers, they invested in marketers and SEO specialists.

The result: worthless content, writers banging out garbage for peanuts, and useless search engines.

Two decades later, looking at the barren wasteland they had created, the SV geniuses thought: I know what we need, more search engines, but smaller ones that collate high-quality content from the best writers. There must be money in that, right?
canyonero, over 3 years ago
I've been troubled by the just plain awful results being delivered by Google search over the last few years. I think these are just plain hard problems to solve, and ones that Google is not incentivized to solve. Google wants you to click on ads at the end of the day, full-stop.

Oftentimes I find myself searching for "best ($product|$thing_to_do)", which I think many other people do as well because we all want the best. Other times I'm looking for a music or book recommendation with some depth. This of course nearly always leads to SEO'd trash. There is no relevance, nor is there trust. So, like others, I use keywords like "reddit" or "forum" to get to real humans whom I trust and whose intentions are not to sell via affiliate links.

These issues often lead to the need to find trust in real human-centered recommendations that stem from real human interests and needs. I've never found an algorithmic solution to this problem. This is why I think college radio stations or those south-of-the-dial end up being so, so much better. And why beer recommendations from your local brew-shop owner are better than anything you can find on the net.

I think building search verticals that are hand-curated would be very interesting to see. But I also think we need to build more communities which allow recommendations to be shared without an incentive to get hits via search, which aren't paid for by large corporations, and where community impact/quality _is_ incentivized. I do worry that those days may be gone and there may just not be enough folks (not in tech) willing to spend so much time online contributing to niche communities. A lot of folks spend much of their time in walled gardens like Facebook, Instagram or Twitter, so it'll be challenging to be sure.
hammock, over 3 years ago
> This may not just be a problem with Google but possibly also the recipe for beating Google. A startup usually has to start with a niche market. Why not try writing a search engine specifically for some category dominated by SEO spam?

> You might need to do a lot of manual spam fighting initially. That could be both the thing-that-doesn't-scale, and the thing that differentiates you by being alien to Google's DNA. (They must hate manual interventions; so inelegant).

Is he describing... Yahoo circa 1994? A manually curated directory service.
birken, over 3 years ago
The funny thing is that *if* the people who worked on spam at Google were free to talk about it, I'm sure it would become evident that they know more about spam and anti-spam efforts than anybody else in existence. It's a ridiculously hard problem, especially when people are targeting you directly. But they aren't free to talk about it, because if they did it would just give more assistance to the spammers, and make the problem worse.

I'm not saying that curated search results for particular verticals are a terrible idea (though I'm sure, like anything, the devil is in the details), but on the whole Google search is very, very good considering the constant assault they are under from spammers (which most other search engines are not, at least directly).
mg, over 3 years ago
> Why not try writing a search engine specifically for some category dominated by SEO spam?

I like to compare search engine results and wrote this tool to make it easy:

https://www.gnod.com/search

There are in fact many vertical search engines. You can click on "more engines" to see the whole list.
abakker, over 3 years ago
I have a version of fixing this that I would personally enjoy a lot. Leave Google alone, let it crawl the web and prioritize what it wants to via algorithms. But give me a version of that which ONLY surfaces results from discussion forums (including SO, Reddit, HN, etc). For most of the stuff where I am actively *searching* and not just looking stuff up, discussion forums of motivated, self-selected contributors have the stuff I need with the context I need. It used to be that blogs had answers, but that medium has been categorically ruined by SEO.

Now, one of the deficiencies here has been examples. Try this: "best miter saw". You will not find any websites that actually discuss the answer to this question, despite it being a product category with a lot of price variability and performance tradeoffs (weight, capacity, power, corded vs cordless, accuracy).

Nearly all product reviews for large purchases follow the same pattern unless Consumer Reports has decided to dig deep (e.g. washing machines).

How about guitar strings? Sandpaper? Printers? Google's algorithm has allowed profit-motivated websites to displace the commons to too great an extent.
Fede_V, over 3 years ago
I think the complaints about SEO spam are valid, but I think mwseibel and pg misdiagnose the challenge. The challenge is that you are dealing with an adversarial system: the better your search engine is, the more widely used it becomes, and the more valuable it is for your adversaries to find ways to game your rankings.

Any new niche search engine will go through a small window of time where they have the luxury that none of the sites they are indexing are spending all their effort trying to reverse engineer their signals and optimize against them. I'm incredibly skeptical that they can remain useful once all the SEO efforts of various marketers start to be turned against them.
bombcar, over 3 years ago
Just give users the ability to blacklist domains when searching; pretty soon you'll have a decent list of what users consider worthless.

And Pinterest would die.
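
The blocklist idea above can be illustrated with a minimal Python sketch. Everything here is hypothetical: the function names, the `*.example` domains, the exact-host matching, and the `min_users` threshold are assumptions for illustration, not a description of how any real search engine works.

```python
from collections import Counter
from urllib.parse import urlparse


def aggregate_blocklists(user_blocklists):
    """Count how many distinct users have blocked each domain."""
    counts = Counter()
    for domains in user_blocklists.values():
        # A set ensures one user cannot count twice for the same domain.
        counts.update({d.lower() for d in domains})
    return counts


def filter_results(urls, personal_blocklist, counts, min_users=None):
    """Drop URLs the user blocked; optionally drop widely blocked domains too."""
    blocked = {d.lower() for d in personal_blocklist}
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host in blocked:
            continue  # the user's own list always applies
        if min_users is not None and counts[host] >= min_users:
            continue  # enough other users consider this domain worthless
        kept.append(url)
    return kept


# Hypothetical data: three users and their personal blocklists.
user_blocklists = {
    "alice": ["spam.example", "pins.example"],
    "bob": ["pins.example"],
    "carol": ["pins.example", "listicle.example"],
}
counts = aggregate_blocklists(user_blocklists)
results = ["https://pins.example/board/1", "https://forum.example/thread/2"]
print(filter_results(results, [], counts, min_users=3))
```

Note that a shared list like this is itself a ranking signal, so it would presumably need protection against coordinated abuse, which is the adversarial problem raised elsewhere in the thread.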
djoldman, over 3 years ago
Google knows how to surface relevant results and they choose not to because they aren't optimizing for relevant results, they're optimizing for revenue or profit within some constraints (don't lose too many users, privacy, avoid actually terrible or completely irrelevant results).

All the various suggestions in this thread, plus far more complex and insightful solutions, are known to Google. Most of it boils down to using automated user feedback to improve or measure search result relevancy.

Google doesn't need to solicit user upvotes / downvotes to improve rankings. They can monitor user clicks on results, in addition to analytics on the sites the users visit, to determine which sites are relevant to which searches.

Google doesn't optimize for search relevancy.
bretpiatt, over 3 years ago
This is already happening for a bunch of verticals:

Travel - Expedia, Hotels.com, Kayak, etc.

Consumer Goods - Amazon, Walmart, eBay, Etsy, etc.

Automobile Purchase - Cars.com, Autotrader, etc.

Career/Job - Indeed, LinkedIn, etc.

As Google continues to lose search volume on these big revenue categories, it is going to make spam much more difficult to deal with, as they are working to sort out long-tail spam. Way harder.
netcan, over 3 years ago
I'm almost certain that pg gave "*compete with Google by competing in some niche*" advice 10+ years ago.

In any case, I'm not sure that competing in search is a very attractive notion. AdWords is the only meaningfully profitable search business. Even if you steal 10% of Google's market, that absolutely doesn't translate into 10% of the revenue.

That said, recipes. Someone make a search engine where the top results don't start with 500 words on the history & etymology of butter, because that's what Google wants.
deltarholamda, over 3 years ago
The quote-Tweeted thread mentioned recipes as one of the things that has been SEObliterated. It's a great example of the problem, and also a great example of the problems any solution will encounter.

Recipes have become a bellwether Internet problem. In the past, your great-grandmother had a card file with a bunch of 3x5 index cards with the ingredients and instructions on how to make everything, and they pretty much all fit on one side. There was a great deal of domain knowledge required (e.g. "whip to stiff peaks"), but these things reveled in their terseness.

Internet recipes all begin with 9 paragraphs of the author's first time encountering the dish in a Moroccan bazaar in 1997, and the life story of the chef. There are two embedded 10-minute videos of the lifecycle of the vanilla bean. And then you get to the ingredient list. Then two more 10-minute videos, then instructions.

The drive to make recipes full-contact Internet content has changed what it means to be a recipe. This is similar to how cooking shows evolved from Julia Child working on a sound stage to a carnival barker presentation with vivid personalities dominating the scene.

I'm not sure there is any technological solution to a problem that has fundamentally changed what it means to be a recipe, short of establishing a new informational silo in the form of a new Web site devoted to recipes only. You could encourage an RSS-like format for recipes, but that requires buy-in from places that profit from the new evolution. This new status quo may be good or bad (you can make the argument either way) but it is what it is. A cultural change is required more than tweaking algorithms.

(Unless tweaking algorithms can be foundational to cultural change, in which case we really, really, really need to take a hard look at the corporate behemoths and their algorithms, and sooner better than later.)
rickdeveloper, over 3 years ago
I think a lot of this is due to Google owning both search and the ads on the websites (AdSense). There’s an incentive for them to prioritize click farms (and other sites filled with their ads). I think in general there may be a correlation between the number of ads on a site and its usefulness to me, which is inverse to its usefulness to Google.

I’m curious what would happen if those products were split up into 2 separate companies.
curiousllama, over 3 years ago
Good idea. You could start with fitness. Lots of high-quality information out there that’s entirely, 100% inaccessible via Google.

Over COVID, I did the whole fitness thing from a few different angles (overhauled diet, trained for a marathon, now lifting weights a lot). I found I could only find good info by going directly to a trusted source: literally typing http://www. like I’m in the 90s or something. This is the exact issue a search engine should solve, but Google doesn’t.
floatingatoll, over 3 years ago
He’s just describing Webrings, except in a reactive tense (“filter out spam sites”) rather than a proactive tense (“associate your site with other worthwhile sites”). Google’s ranking algorithm only works when someone is proactively curating, and only SEO spammers do so these days. Reactive curation is not a viable way to manage information.

The simplest way to compete with Google is to create a DIY Webrings site that disallows harvesting of data by Google. Charge curators to create a webring, and let curators select three hashtags and a description that represent their list of fifty or fewer sites. Use the revenue to pay a human to curate the list of hashtags, and let users tip a webring curator in gratitude with an Apple Pay button.

This is how to make a million dollars, Pinboard-style, out of the ashes of the original curated Yahoo idea and the information structures of hashtagging. It doesn’t work if you allow free-for-all infinite-sized lists, it doesn’t work if you allow free-for-all hashtags, but with clear limits and moderation of tags (instead of webrings), it would thrive. By moderating tags, users can keep the webring they paid for, and SEO rings will stick out for having no shared network with any other rings, which allows for easier detection and culling of malicious non-participatory actors. Plus, with the curation networks in place, it becomes possible to bubble up rings that have unusual content for *positive* human moderation activity.

I tried to find some good podcast lists yesterday and each site I visited had a really interesting cross-section, but there were so many duplicates. I wish the ring site existed, so that it could remember what it had shown me already, and I could say “show me rings that intersect with this podcast and have something new I haven’t seen before”.

That’s where the theory of pagerank and the practice of curation and the capabilities of search align, and given that moderation of hashtags scales very cheaply, it is a billion dollar opportunity that Google and Amazon cannot compete with if handled properly. It’s not about trying to get a cut of every visit’s revenue potential. It’s about giving human beings a directory that respects their time and remembers what they’ve seen.
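
The query described above ("rings that intersect with this podcast and have something new I haven't seen") reduces to a few set operations. A rough Python sketch follows; the `suggest_rings` name, the ring data, and the tie-breaking order are all made up for illustration.

```python
def suggest_rings(rings, anchor, seen):
    """Return (ring name, unseen items) for rings containing `anchor`.

    Rings with the most unseen items come first; ties are ordered by name.
    """
    suggestions = []
    for name, members in rings.items():
        if anchor not in members:
            continue  # the ring does not intersect with the anchor item
        unseen = set(members) - set(seen) - {anchor}
        if unseen:  # skip rings that would show nothing new
            suggestions.append((name, sorted(unseen)))
    suggestions.sort(key=lambda item: (-len(item[1]), item[0]))
    return suggestions


# Hypothetical curated rings of podcasts.
rings = {
    "history": ["Podcast A", "Podcast B", "Podcast C"],
    "tech": ["Podcast A", "Podcast D"],
    "cooking": ["Podcast E", "Podcast F"],
}
for ring, new_items in suggest_rings(rings, "Podcast A", seen={"Podcast B"}):
    print(ring, new_items)
```

The only state the site would need to keep per user in this sketch is the `seen` set, which matches the "remembers what they've seen" requirement.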
rhtgrg, over 3 years ago
I think pg is missing something important here. The reason Google was able to beat Yahoo, Altavista, Ask, etc. was not just because they had a better formula; it was also because they started in the era where 'search' was still seen as secondary to 'portals' by the big guys. Had these companies known how important search is to the internet back then, they would've copied Google's secret sauce and crushed it long before it could suck up their traffic.

This isn't going to happen again. Google isn't going to sit around twiddling its thumbs while a competitor develops a better algorithm.

You have to attack the problem from a different angle entirely (make something that looks nothing like a search engine); I don't think a niche market is going to be enough.

Perhaps you just want to make something that scares Google into acquiring you, rather than actually bettering the situation. If that's the case, I implore you to think of better ways to spend your life.
PaulHoule, over 3 years ago
For medical search the answer is PubMed. Not only is the collection of documents clean (of low-grade scammers; pharma companies have to pay big $ to play) but the NIH has done a large amount of search quality and ontology work -- the system knows "Tylenol" is synonymous with "Paracetamol", "Acetaminophen", etc.
pkamb, over 3 years ago
A search engine that only indexed Reddit, Stack Exchange, Wikipedia, and a small number of other "good" sites would get 80% of the way there.

No, DDG bang operators don't let you do this. I want an SERP, not a shortcut to a single site's on-site search.
leoc, over 3 years ago
https://twitter.com/mwseibel/status/1477707884632834049

> I’m pretty sure the engineers responsible for Google Search aren’t happy about the quality of results either. I’m wondering if this isn’t really a tech problem but the influence of some suit responsible for quarterly ad revenue increases.

Please no more of this. Two men, Page and Brin, together have basically unfettered control over Google.* If Google does something bad then, unless it's genuinely something small enough that those two could not be expected to hear about it, it's happening with (at the very least) their acquiescence. And low overall search quality is not something that some "suit" is successfully hiding from Good Czar Larry. They could fire the "suit", or command him or her to make other decisions. This is (again, at the very least) something that they have chosen not to do. The responsibility lies with them.

* There *is* the risk of lawsuits from the minority shareholders, I assume. But IIUC this is not realistically that big a restraint on what shareholders with a majority of votes can do. However IANAL.
donio, over 3 years ago
What I am looking for is control over the results. Personalized blacklists and lists of sites to be (de)prioritized, and also the ability to subscribe to community-curated versions of the same.

And to be clear, I want to be able to control these myself, not have an algorithm trying to guess my preferences. No guessing, just do what I tell you to.

Multiple search profiles with different priorities would be nice too.

I would like the search algorithm to be transparent: I should be able to tell why I got a certain result and how I can avoid such results in the future.
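
One way to read this wish list is as a re-ranking step that applies only rules the user wrote and records why each result landed where it did. The Python sketch below assumes results arrive as `(url, base_score)` pairs from some upstream engine; the `rerank` function, the profile format, and the example domains are invented for illustration.

```python
from urllib.parse import urlparse


def rerank(results, profile):
    """Re-rank (url, base_score) pairs using explicit, user-written rules.

    `profile` maps a host name to "block" or to a numeric multiplier.
    Each returned entry carries a "why" field explaining its score.
    """
    ranked = []
    for url, base_score in results:
        host = (urlparse(url).hostname or "").lower()
        rule = profile.get(host)
        if rule == "block":
            continue  # the user said never to show this domain
        if rule is None:
            score, reason = base_score, "no rule for this domain"
        else:
            score, reason = base_score * rule, f"{host} weighted x{rule}"
        ranked.append({"url": url, "score": score, "why": reason})
    ranked.sort(key=lambda r: r["score"], reverse=True)
    return ranked


# Hypothetical profiles: the same query can be viewed through either one.
profiles = {
    "coding": {"qa.example": 2.0, "contentfarm.example": "block"},
    "shopping": {"forum.example": 1.5},
}
results = [
    ("https://contentfarm.example/top-10", 0.9),
    ("https://qa.example/questions/1", 0.6),
]
for entry in rerank(results, profiles["coding"]):
    print(entry["url"], entry["why"])
```

A community-curated list could be subscribed to by merging its rules into the profile dictionary before calling `rerank`, with the user's own entries taking precedence.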
wenbin, over 3 years ago
FYI - Google hires 10,000+ search result raters [1], who are contractors, to evaluate search result quality.

In an ideal world, you build a thing, and it's done. It runs automatically and prints out money.

In reality, you still need human labor to do manual tasks, even in the tech industry.

[1] https://www.searchenginejournal.com/google-eat/quality-raters-guidelines/
krono, over 3 years ago
Just a minute ago, I made a small typo in a non-obscure programming-related search term.

    Showing results for searchterm
    No results found for searchterm

Followed by an unending list of random celebrities I neither know nor care about, businesses I've never been to that sell items I have absolutely no use for, and random foreign news articles.

Failure to recognise the typo is unexpected but forgivable. But then, rather than helping me with my search, they attempt to distract and lead me away from it, using triggers that you'd think they should have known wouldn't work.

I really don't understand how this is even possible, and it's not a rare occurrence.
yuliyp, over 3 years ago
Was this linked for the irony of everyone spamming replies advertising their startups which don't solve the problem but kinda-sorta do, resulting in something hard to read and understand?
igammarays, over 3 years ago
Interestingly, Google Maps doesn't suffer as much from the issues with Google Search. Maybe because it has those community-driven curation features that PG is talking about? Google Maps is fantastic at finding places to go to (and getting you there).

Also, why hasn't Apple built a search engine yet? It baffles me that they chose to go head-to-head with Google on Maps, yet outsourced their search engine. I would've liked it the other way around: Google Maps and Apple Search.
jefftk, over 3 years ago
I'm seeing a lot of comments along the lines of, "Google shows ads on the SEO-gamed sites that show up in results, so their incentive is to give spammy results". But wouldn't this predict that results would be much better on Bing and other search engines that don't have much presence in the "put ads on random sites" market?

(Disclosure: I work on ads at Google, speaking only for myself)
nobbis, over 3 years ago
10 years ago, the original engineer of Google's search engine told me what he now wanted was asynchronous, human-powered search with curated results, e.g. a Google-like interface, but queries cost $5 and take 15 minutes.

Money's no object for him, so he wanted to outsource the filtering, ranking, and interpreting of results. Would be even more useful today (albeit a tiny TAM.)
rel2thr, over 3 years ago
> You might need to do a lot of manual spam fighting initially

This is why I am very hyped about Brave Search's Goggles feature: it will let you share exclude / include site lists to use with the search engine. Hopefully it will empower these niche communities to curate a list of non-spam sites (like the ad-blockers do with ads today).
gomox, over 3 years ago
I believe that the only moat protecting $100B of AdWords revenue is the quality of the Google Search results. There is no meaningful switching cost to using a new search engine, and the inertia in ad spend is not very significant (i.e. any online marketing manager will happily spend 5% of their budget in a different search engine's AdWords-like program if they get better ROI; there is no incentive to be a "Google Ads only shop").

On the other hand, Google needs to maintain the ballistic trajectory of its revenue growth. So how can they fix search quality when they've minmax'd themselves into this situation in the first place? If they were to make the ads background yellow again, that would have negative short-term effects that I doubt any career exec can stomach.
baby-yoda, over 3 years ago
search ads responsible for the rise of google search[1], content ads (seo spam) responsible for google search's fall?

my guess is the rate of spam content production far outpaces the rate of original content creation. so the power law concentrates even further in the tiny percentage of OC and a moat forms around them (highest ad $, highest authority/authenticity).

where do we end up 5 years from now? further consolidation and the continued return to aol style portals (telco/media giants and fast-lane to own content?) pay-to-access silos dominating the internet?

[1] oversimplifying a bit of course, there was a novel ranking method that was more than accurate enough, and it scaled, which allowed for the search ad business to go gangbusters.
Jenk, over 3 years ago
Single-page thread: https://threadreaderapp.com/thread/1477760548787920901.html
anovikov, over 3 years ago
Sadly the only way of fixing it is making search results unattractive for cracking. Ranking of a page in search results is a metric, and every metric is a hackable metric. The only way they won't be hacked is if there's no incentive to.

Sure, a search engine that specialises in a narrow area of knowledge without much money in it can be very relevant and bullshit-free.

But there's no way to make it work for general web search. People hack things. If they didn't, we'd have Communism built by now (yes, the "good" one: classless, stateless).
jeffbee, over 3 years ago
All of the amateur search quality experts forget to mention the regulatory environment. Obviously, Google could nuke Pinterest from orbit, dramatically improving image search results. Clearly, Google could effectively take down Statista, technically. But various Eurocracies have shown an extreme willingness to take the side of Yelp, Pinterest, and whatever other spam/scam mills are able to form a shadow alliance with Microsoft's astroturf campaigns like "fairsearch" and whatever.
EamonnMR, over 3 years ago
Being able to flag Fandom, Quora, and Pinterest results would bring me great joy.
lucasyvas, over 3 years ago
I don't think this problem can be solved by another search engine as they currently exist. The problem must be solved by a new kind of search engine that *exclusively* searches Internet communities (HN, Reddit, etc.). The content must be community produced, since all other forms of writing have a monetary incentive. You will find the best results in communities of enthusiasts with respected moderation teams.

So, a curated strategy where the users can UP/DOWN sources they trust for particular topics. Relevancy and user ranking of answers determine the score. Of course, the user can search any sources they like, but a voting system would control the defaults.

I think this avoids slanted results as well, because topics are objective. The subjectivity will be in the comments, where it belongs. That's in contrast to today, where SEO scams can determine how high up results are.

So, to game my proposed search engine, you need to infiltrate the users. I believe this is harder to do across numerous sources compared to the current system, where you game the algorithm.
zavkz, over 3 years ago
Oh so it's not just me... Most recently I was trying to find a way to reset a printer and also fix a certain error code. I searched on Google and the results were filled with irrelevant content unrelated to the model number I had just put in: scam websites, ink sellers, even though I used the correct filters such as the + sign and the syntax, which is a joke if you think about it. A billion dollar company and this is the best they can do. The advanced search is buried, and the syntax they have is explained by third parties or buried somewhere in their options.

I jumped onto YouTube, I put in the model number, same thing pretty much. I got unrelated videos mixed in with ones for the model number I put in. I'm pretty sure some videos are not being shown even though the model number is the same. Ironically there's a video explaining a certain solution and warning people not to fall victim to another video scamming people, on which the dislike count has been removed and the comments obviously deleted, so some people may be calling and getting scammed.
anyfactor, over 3 years ago
I have seen several Google-alternative search engine projects being posted on HN every other week. You have your privacy-focused open source Google alternative search engine for "insert niche here" with big hopes of disruption.

I will give you my two cents. I have used DuckDuckGo, Bing, searx etc. for extended periods of time and hated every one of those things. The problem is that what you search with seems to be essentially the gateway to the wild west of the internet. I understand the proposition of spam control in search engines, but at least to me, I think the early days of Google without DMCA and copyright bans made Google the best.

I fear SEO spam control will only bring the worst of the moderated internet. It will not be the first time big tech tried to douse a gasoline fire with more gasoline because they thought more fire meant the previous fire would get suffocated by the lack of oxygen(?). Rather than using "AI" as a crutch to solve SEO as a problem, I want to see an option that is true to 2005-era Google.
fuckcensorship, over 3 years ago
Go read any default subreddit on Reddit to see what this idea would look like long-term, especially the "amateur police" part.
mirekrusin, over 3 years ago
The whole thing seems "simple" to me: a graph of identities with url vetting/liking/approve-this-message-like actions, you don't need anything else.

Reputation, non-fakeness etc. can be derived from it for anybody - you just list identities you trust/follow (with weights?) and anything you look at can be scored.

Virtual identities can also be created, i.e. an identity listing all links mentioned on HN (with positive sentiment only?), links from Wikipedia etc., so people can follow those to create their reality graphs.

The interesting part is that it doesn't claim universal truthness - depending on who you follow, your results will be skewed towards their opinion of the world. I.e. if you follow MIT, Wikipedia and E. Musk you'll see a different view of truthness than somebody following FOX News and the Flat Earth Society, for example.

It could be interesting to focus on "dislike" marking (only?) as it may be much more lightweight to approach it from the blacklisting side.
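
As a rough illustration of the scoring step described above, here is a minimal Python sketch. It assumes one hop only (the identities a user follows directly), marks of +1 for a like and -1 for a dislike, and a plain weighted sum; the identity names, URLs, and weights are placeholders, and a real system would presumably also need transitive trust and protection against fake identities.

```python
def score_url(url, follows, marks):
    """Score a URL from one user's point of view.

    follows: {identity: weight} for the identities this user trusts.
    marks:   {identity: {url: +1 or -1}} published by each identity.
    Identities the user does not follow contribute nothing.
    """
    total = 0.0
    for identity, weight in follows.items():
        total += weight * marks.get(identity, {}).get(url, 0)
    return total


# Hypothetical published marks, including a "virtual identity" built from
# links mentioned on a forum.
marks = {
    "forum_links": {"https://a.example/post": 1},
    "encyclopedia": {"https://a.example/post": 1, "https://b.example/claim": -1},
    "contrarian": {"https://b.example/claim": 1},
}
alice = {"forum_links": 1.0, "encyclopedia": 0.5}
bob = {"contrarian": 1.0}

# The same URL gets a different score depending on who you follow.
print(score_url("https://b.example/claim", alice, marks))
print(score_url("https://b.example/claim", bob, marks))
```

The dislike-only variant mentioned at the end of the comment corresponds to marks that are only ever -1, in which case the score acts as a personalized blocklist strength.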
baby over 3 years ago
Searching code is also impossible on Google. If there’s a competing search engine for that I’ll use it at least for this use case.
yumraj over 3 years ago
If I remember correctly, it used to be called About.com, with categorized and human-curated links.

It was big during the dot-com days, but withered after Google.

Interestingly, I do think that model may need to be revisited.

Edit: I feel that Reddit is filling some of this need, at least for things like vacuum and espresso machines, with dedicated spaces.
hooande over 3 years ago
What are some search categories that are so dominated by spam that they are unusable?

I'll start: "how to rent a car" [0]

[0] Worth noting that I personally get somewhat reasonable results for this, with a 3rd result from nerdwallet.com and a 4th from wikihow.com, both of which seem to answer the question in an unbiased way.
lifeisstillgood over 3 years ago
Google is not important because it has all the information. It's important because it has hardly any.

A major complaint is that there used to be good free reviews of commercial products that could be easily found.

That is not "all the information". Information about the current round of commercially advertised products is something like 5-10% of all commerce (or less).

And we are entering a world where "all the information" is what we do all day, what we say, how we react to different stimuli.

Those are the real review sites: why do people take this train and not that one, why is that park safe and this one full of muggings.

We need to solve the Google problem not because we want blogging like it's 2009 but because epidemiology is about to open humanity's eyes. And it's going to hurt if we don't make it free and open.
freediver over 3 years ago
Google’s job is to serve its customers, and it does that really, really well.

The problems being discussed today (and yesterday in the similar thread) come from the fact that for Google, user != customer.

When you have incentives that are misaligned like this, you can only go so far! We seem to have reached that point with Google, where there is not much more that can be done on the search experience front without jeopardizing customer experience (ad revenue).

Disclosure: I’m working on a paid search engine to solve this problem on a fundamental level, by aligning the incentives and making the user also the customer so we can best serve them and their needs. It is called Kagi and is currently in closed beta, accepting beta testers.

https://kagi.com
mtnGoat over 3 years ago
There is big money in SEO. I had an acquaintance all the way back in the early 2000s who had tens of thousands of domains that he ran experiments on to reverse engineer how the algorithm worked. He also had tens of thousands he ran SEO/link networks on. He made a lot of money for a long time by being on the front page for a lot of terms.

The same thing is happening today, there are just more of these actors doing it. They just game the algorithm for terms related to products. Notice you still get decent SERPs in Google for terms that don’t relate to something that can be sold using an affiliate link.

Fairly easy problem to fix, but Google would have to hire a black hat to help solve it. But the good ones ain’t gonna work there.
quickthrower2 over 3 years ago
Google is so spammy I now instinctively use other search methods at times. Which is very interesting, because doing so is high friction. But it's so spammy out there that pain(spam) > pain(friction).

It is not to be "un-Google" but because I get better results.

For example, searching in a good subreddit can be more fruitful, giving answers from genuine people in moderated parts of the internet. If you get crap then try another subreddit: some mods are better than others.

Is this a business opportunity? I think so, although I have no idea how you would go about it. Maybe a decent search engine for programmers would be a good start! E.g. "Exception Message XYZ" + site with decent answers.
wslh over 3 years ago
In 2013 I elaborated on this topic: http://blog.databigbang.com/letters-from-the-future-challenging-googles-search-engine/ I would add that in 2021 we can easily do Natural Language Understanding (NLU) and Natural Language Generation (NLG) and can build zillions of web pages that don't follow Google's original page ranking concept. Important sites probably share fewer low-rank pages, and there are many more link rings and clusters. More decentralized blogs seem a thing of the past (expecting to be rebooted in the future).
Covzire over 3 years ago
Could this issue be related to Gmail's spam filtering? For approximately 2 years now it's been downright porous. I'm getting on average 1 obvious spam message in my inbox that is something like:

c0nGrats-You_HaVe_Won_ThE_Pr1ze!

...or some silly variation of this that takes literally 0.1 ms for a human to discern as spam. Yet something happened to Gmail's spam algorithm in the last couple of years that has been consistently letting these through. To be fair, it does catch most spam, but it's only batting something like 75%, and the spam it does catch is oftentimes much less obvious to human eyes than the stuff it lets through.
konaraddi over 3 years ago
A problem is that good SEO doesn’t mean good quality. And assessing quality is hard, so people lean on other people to assess quality (either by appending something like “Reddit” to their search queries or asking friends IRL or on Twitter/Discord).

I wrote a bit more about search engine competition and problems/opportunities here: How Alternative Search Engines Can Win Users https://konaraddi.com/writing/2021/2021-08-05-on-search-competition/
pictur over 3 years ago
People don't want to search anymore. They want to see well-categorized data. For example, instead of searching for cheap vacuum cleaners, I think they want a site that lists vendors that sell cheap vacuums.
anderspitman over 3 years ago
I for one am optimistic about what a "post-search" world might look like. Maybe a lot like the early web. I don't think affiliate links themselves are necessarily the problem. I'll gladly use a link from a high-quality reviewer to give them a little kickback. SEO seems to be the issue. Maybe we end up with trusted brands for reviewing specific things. For example, I trust outdoorgearlab.com for pretty much anything camping related, and no purchase comes to mind that I've regretted yet.
NmAmDa over 3 years ago
One example: the website called gitmemory, which crawls GitHub data regularly and has better SEO than GitHub, so you will usually find its results above the original GitHub links.
noduerme over 3 years ago
Hey, smart people: It's called *CURATION BY HUMANS*.
cassianoleal over 3 years ago
I'm not very well versed in SEO but isn't this just good old Goodhart's Law?

Come up with criteria to determine which websites are "better quality". Measure them, rank them, put the ones that fit the criteria best at the top.

On the other side, there are the people promoting their websites. Do what you can to get as close to Google's ideal as possible through whatever means. Profit.

At this point the criteria become useless for any real quality analysis.
heisenbit over 3 years ago
Affiliate links are environmental toxic waste and it would be only logical to tax such affiliate payments to fund cleanup and mitigation efforts.
hnbad over 3 years ago
I guess Paul's definition of "beating Google" is "creating a startup without a clear revenue path, aiming to be acquired by Google or a competitor", as I can't think of any meaningful way a niche search engine would provide a good enough value proposition against existing Google competitors or embeddable search engines (as well as SaaS like Algolia).
MarkMc over 3 years ago
I just switched to DuckDuckGo yesterday and was not impressed. When I searched "define hot take", Google gave me a canonical, prominent definition with a bold-font title and a button to hear the pronunciation. DDG gave me multiple definitions in a normal search results page with none prominent, and no way to hear the pronunciation.

I'll be switching back to Google.
mrkramer over 3 years ago
Good idea, Paul. I had a similar one, but there is no way you would do it manually. Machine learning algorithms need to detect spam, not people, because otherwise the search engine can't scale. If people were marking what's good content and what's not, such a search engine would be reduced to content curation, not organic search and discovery.
aronpye over 3 years ago
A lot of the spam results just seem to be copy-pasted content.

I wonder how difficult it is to compare the main body of text in search results, then, if it is over a 95% match with another site (i.e. it has been copy-pasted), demote it in the search results. If a site generates too many of these demotions then it gets blacklisted from the index.
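One way the "95% match" check could be sketched is with word shingles and Jaccard similarity. This is an assumption about how to measure a match, not a description of what any search engine does; the function names and the shingle size `k=5` are hypothetical, and only the 0.95 threshold comes from the comment.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0  # do not treat two empty pages as copies of each other
    return len(a & b) / len(a | b)


def is_near_copy(text_a, text_b, threshold=0.95, k=5):
    """True if the two texts share at least `threshold` of their shingles."""
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold


original = "The quick brown fox jumps over the lazy dog."
scraped = "the quick brown fox jumps over the lazy dog"
unrelated = "An entirely different sentence about search engine spam results"

print(is_near_copy(original, scraped))
print(is_near_copy(original, unrelated))
```

Comparing every page against every other page this way does not scale to a web index; a production system would need some form of hashing or sketching to find candidate pairs first, which this sketch leaves out.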
imranhou over 3 years ago
I believe Google tracks click-throughs from search results pages, which should in theory provide plenty of insight into which links are working for specific keywords and which aren't, thus helping improve or reduce the rankings of SEO-laden sites.

I wonder if someone can shed light on why this isn't effective.
new_here over 3 years ago
> Maybe ultimately you open up spam fighting to your users. If you managed this well, you could harness a lot of energy.

Doesn't Google already consider that if a user returns to the results page (or clicks a second link), then the first link visited was not satisfactory? Seems like a pretty elegant solution.
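The return-to-results signal described above could be sketched as a dwell-time check. Whether Google actually uses such a signal is not confirmed anywhere in this thread; the `Click` record, the `quick_bounces` function, and the 10-second threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Click:
    url: str
    clicked_at: float             # seconds since the results page loaded
    returned_at: Optional[float]  # None if the user never came back


def quick_bounces(clicks, min_dwell=10.0):
    """Return URLs the user left again in under min_dwell seconds."""
    return [
        c.url
        for c in clicks
        if c.returned_at is not None and (c.returned_at - c.clicked_at) < min_dwell
    ]


# Hypothetical session: one fast bounce, one click the user never returned from.
session = [
    Click("https://example.com/listicle", 2.0, 6.0),
    Click("https://example.org/answer", 8.0, None),
]
print(quick_bounces(session))
```

A short visit is ambiguous on its own: as another comment in this thread points out, a page that answers the question quickly also produces a fast return, so a signal like this would need to be combined with others.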
ben7799 over 3 years ago
It'd be really interesting if Google allowed upvote/downvote on search results... but it'd be super hard to imagine them ever taking the votes into account much versus ad revenue.

And the upvote/downvote would be very tricky to implement in a way that the SEO crowd couldn't just game horribly.
PaulHoule over 3 years ago
I have wondered about this.

When I run web sites I frequently look at the log and find a large fraction of the traffic is from search engines. This is a problem because it costs me money to serve that traffic. It might not be initially obvious, but it costs more than serving real users because the search engines will scan everything and break the cache.

Google sends a significant amount of traffic. Bing sends a detectable amount of traffic. Baidu's crawler might be more active than the two of those together, but I never get hits from Baidu. Other crawlers deliver me trouble instead of value: even if I'm not interested in hosting pirated or plagiarized content, a crawler that is looking for trouble is only going to bring me trouble.

I hate doing it, but I turn off crawlers other than Google and Bing at both the robots.txt and web server level because I just can't afford to serve Baidu queries.

I'd like to sign an exclusivity contract with a search engine such that they get exclusive access to crawl my site and in turn I get a privileged position in search results. This would give the search engine and myself an incentive to deliver end-to-end quality results.
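A sketch of the robots.txt half of that policy, checked with Python's standard `urllib.robotparser`. The file contents are an assumption about what such a policy could look like, not the commenter's actual file, and `example.com` is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow Google's and Bing's crawlers, refuse the rest.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: bingbot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for agent in ("Googlebot", "bingbot", "Baiduspider"):
    allowed = parser.can_fetch(agent, "https://example.com/page")
    print(agent, "allowed" if allowed else "blocked")
```

robots.txt is only advisory, which is presumably why the comment also mentions blocking at the web server level; that part is not shown here.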
notananthem over 3 years ago
We still need a search engine that actually blacklists everything serving ads. Google beat AltaVista, now we need to beat Google.

I mean no mincing about: recipe sites that are ads are blocked. Results with pixel trackers etc. are blocked. Hell, results that are paywalled are blocked because they're useless.
Marazan over 3 years ago
Google's ranking algorithm shapes the web.

And the web now looks like a 1500-2000 word listicle with 3 images because that is what the ranking algorithm favours.

If you find the info you need and leave quickly, that actually down-ranks the page. That is idiotic. Pages that give you what you want quickly are punished!
AtNightWeCode over 3 years ago
I think it will be hard to create a great search engine while the web works as it does today. Maybe there could be something like a sitemap, but for text content, with the content structured, indexed, and signed by a trusted party in a way that makes it easy to analyze for plagiarism and so on.
Quenhus over 3 years ago
For developers, you can remove some spam websites from Google and other search engines with these uBlock filters: https://github.com/quenhus/uBlock-Origin-dev-filter
digitcatphd over 3 years ago
Most of their complaints are not related to Google Ads, which means the poor results are not there because of profit motives.

Moreover, they are more related to a specific type of search query, which is likely a result of broad-based ranking algorithms that are, loosely, the most efficient ranking system.
blunte over 3 years ago
Google search results are garbage, at least from a developer's perspective.

Most of the results are poorly formatted content "gathered" from Stack Overflow, GitHub, Quora, etc.

And from a "person who wants to see an image" perspective, Google is purely a gateway to Pinterest or Getty Images.
lubesGordi over 3 years ago
How about a platform for curation? Curators who know a subject well can link to content that looks good to them. Search goes through the curators, and people can favorite certain curators. Lots of people like to curate. This is a better idea than trying to go after spam.
cpeterso over 3 years ago
Amazon’s search results and scammy third party sellers are a similar trust problem. When possible I try to purchase directly from the product manufacturer’s website. Similarly, I don’t search Google for product reviews, I go directly to trustworthy review websites.
jliptzin over 3 years ago
Improve your search engine results with this one weird trick!

Just block any domain containing the word pinterest.
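Taken literally, the trick above is a hostname filter over a result list. A small sketch follows; the function name `drop_blocked` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit


def drop_blocked(urls, blocked_word="pinterest"):
    """Keep only URLs whose hostname does not contain blocked_word."""
    kept = []
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if blocked_word not in host:
            kept.append(url)
    return kept


results = [
    "https://www.pinterest.com/pin/1",
    "https://pinterest.co.uk/ideas",
    "https://example.com/pinterest-tips",
]
print(drop_blocked(results))
```

Matching on the hostname rather than the whole URL keeps pages that merely mention the word in their path, and catches country-specific domains without listing each one.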
anfilt over 3 years ago
Another thing that does not help is how some sites gate content from being scraped. Also, forums are not as popular today, again reducing the amount of indexable content. Think about how some sites have migrated from forums to something like Discord.
cblconfederate over 3 years ago
Good. In fact, if we want people to visit websites other than google.com (rather than read the answer in the snippet or the box in the sidebar), then it's good that Google results are crap. Use Google less.
ttiurani over 3 years ago
Is there a search engine for programming? One that not only searches Stack Overflow, GitHub, relevant subreddits and the other big sites, but also finds programming articles in personal blogs?

That would be valuable to me.
calltrak over 3 years ago
Here is a handy list of alternatives to Google search:

https://fabform.io/a/alternative-search-engines
mrlanderson69 over 3 years ago
If anyone has the skills to work on something like this, please email me (email address in my profile).

I can show you a demo. Just to show I am not screwing around: if you don't like the demo I will pay you $500.
stickyricky over 3 years ago
What are the pros and cons of a user generated tagging system? If you have a community of dedicated individuals who maintain a group of tags, searching those tags should yield high quality results.
SavantIdiot over 3 years ago
> And boy would Google find it hard to follow you down that road.

This is a good perspective. Where can Google not go? Places that don't lead to profit. They will try (cough Wave cough) but will give up.
pwdisswordfish9 over 3 years ago
> Lots of people want to be amateur police. And boy would Google find it hard to follow you down that road.

Kinda like they tried with YouTube Heroes?

But then, who’s to say you won’t get the same kind of backlash?
throwaway14356 over 3 years ago
Ah, so everyone wanted to move from carefully crafted personal websites, where every detail counts and low-effort publications are harshly punished, to platforms with guaranteed readership, and now we have a curation problem?

Someone (who probably doesn't have a website) said that comment moderation on your own website is too much work. Perhaps the whole internet is too much work?

But I like the spam search engine by and for spammers as a way of finding the latest and greatest affiliate marketing and blockchain swindle.
snth over 3 years ago
Several people mention DuckDuckGo in that Twitter thread. I use DuckDuckGo as my main search engine, and it's not obviously any better than Google regarding SEO spam.
nanna over 3 years ago
If you were to start a search engine, what stack would you use?
james-redwood over 3 years ago
www.neeva.com and www.kagi.com: two privacy-oriented search engines with results and features surpassing Google's (did I mention that they’re ad-free?)
beefield over 3 years ago
Okay, given that we have pretty successful examples in Wikipedia as a general crowdsourced information store and Stack Overflow as a specialized-domain crowdsourced Q&A site, would it be impossible to build a crowdsourced search engine? Not even scraping the web: I would just type my search term, and if that has already been searched and its results voted on, I would see those. If it was a completely new search term, I would get no immediate results, but my search would be displayed on a "new searches" page, which some volunteers would be following and trying to add relevant results to.
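The workflow in that comment (known queries return voted results, unknown queries land on a "new searches" page) could be sketched with two in-memory structures. The class and method names are hypothetical, and the sketch ignores persistence, abuse, and anything resembling scale.

```python
class CrowdSearch:
    def __init__(self):
        self.results = {}       # normalized query -> {url: net votes}
        self.new_searches = []  # queries that have no results yet

    @staticmethod
    def _key(query):
        # Treat queries that differ only in case or spacing as the same.
        return " ".join(query.lower().split())

    def search(self, query):
        key = self._key(query)
        if key not in self.results:
            # First time anyone asked: queue it for volunteers.
            self.results[key] = {}
            self.new_searches.append(key)
        ranked = sorted(self.results[key].items(), key=lambda item: item[1], reverse=True)
        return [url for url, _ in ranked]

    def add_result(self, query, url):
        key = self._key(query)
        self.results.setdefault(key, {}).setdefault(url, 0)
        if key in self.new_searches:
            self.new_searches.remove(key)

    def vote(self, query, url, delta):
        # Raises KeyError if the query or URL was never added.
        self.results[self._key(query)][url] += delta


engine = CrowdSearch()
print(engine.search("best espresso grinder"))   # nothing yet, query is queued
engine.add_result("best espresso grinder", "https://example.org/grinders")
engine.add_result("best espresso grinder", "https://example.org/other")
engine.vote("best espresso grinder", "https://example.org/other", 2)
print(engine.search("Best  Espresso grinder"))
```

The hard parts the sketch leaves out are the ones other comments in this thread raise: who the volunteers are, and how votes are kept from being gamed.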
usrusr over 3 years ago
“You might need to do a lot of manual spam fighting initially”

How would this be limited to "initially"? Wouldn't it be a lot initially, and then only get worse?
1024core over 3 years ago
... and the moment you gain some traction, the SEO monster will train its eye on you like Sauron; and without a billion-dollar budget, you will be toast.
legohead over 3 years ago
It would work until you got big enough, then you'd end up following the same path as Google, as that's where the money is.
coding123 over 3 years ago
> Lots of people want to be amateur police. (pg)

This is very true. How many times have I clicked on a site only to be met with ads so bad that the browser slows down, and after 10 seconds the page gets covered up by more and more crap, and then sometimes a paywall shows up too. Now here's the thing: a competitor to Google might detect you clicking back and then pop up a special set of controls near the search result that lets you say "too many ads" or "paywall".

However, if such an engine were to start beating Google, I'm sure Google would implement it in their own way: automatically detect why you clicked back in such a short timespan.
nojito over 3 years ago
The issue is and will always be monetizing. Anyone competing with Google will need to have a robust monetizing strategy to survive.
nabla9 over 3 years ago
When the search engine is funded by ads, there is incentive to produce results that people who click ads like.
ape4 over 3 years ago
One approach would be to have moderators from the community who are allowed to make decisions about results.
Jenk over 3 years ago
> What would a paid version of Google Search results look like - where Google can just try to give me the best possible results and not be worried about generating revenue?

God please no. YouTube Premium shows what Google would do, i.e., they would further ruin the free experience by ramping up the number of ads you see to "incentivize" the premium search.
streamofdigits over 3 years ago
Eventually search will become a decentralized activity (no, not a web3/crypto/coin type of decentralization, I am talking about the useful type).

Is there any particular reason why internet search has to have a distorting gatekeeper to the global commons (one that pretends to play Maxwell's demon)? For chrissake, the stuff being indexed is *public*.
RichardHeart over 3 years ago
His suggestion is basically to become DMOZ.org, if you are old enough to remember it.
ChuckMcM over 3 years ago
As I pointed out to paulg yesterday, this was *exactly* the business model / concept that Blekko was created to address. The idea being that one could use "slashtags" to curate web sites that were "good" on a topic (not spammy) and pull results from that rather than the general web. Guess what? It works great! Also, it doesn't make enough money to support the company using advertising.

For a couple of years, Blekko ran a "3 card monte" game where we white listed the results from Google, Bing, and our own index. For every "contested" query, Blekko consistently beat the others by a significant margin. If the query wasn't contested, Bing and Google did about the same, and if the query was obscure, typically Google did better than Bing or Blekko.

What is a "contested" query? That is one where there is a lot of money on the line. My favorite one was "best credit card" (which is search engine shorthand for "What is the best credit card?" because the stop words "What", "is", and "the" are removed).

Why is it contested? Because if you put an advertisement into the results of that query, and the person making it clicked on that link and signed up for a credit card, you could be paid $50 or more. For a single click. Other queries whose traffic advertisers would pay well for were car dealerships, hotel chains, jewelry retailers, and university "referral" services (like the one that was busted for getting people into Ivy League schools by faking academic records).

Extremely few people click on an ad put onto a page of search results for the query "what is shoe rubber made of?"[1]. However, it is required to serve queries like that so that people will come back when they are looking to spend money on something.

So using the same exact idea that Paul proposed, Blekko built an English-language index which allowed you to curate the crap out of your search results and return much better data. The "value" of that was not considered to be high enough to insist on people logging in to use the engine. Knowing an id for the person making the query allowed for user-specific blacklists of spammers (so if, for example, you never wanted to see a Pinterest link in your results you could make that happen).

Without sufficient traffic, ranking algorithms that use the feedback loop "of these documents, which one was clicked as the 'best' answer?" fail to converge rapidly enough for decent ranking.

Without a credible threat that if your site is not included in the index, your traffic will be greatly reduced, it is difficult to negotiate with web sites to permit crawling, rather than deny your crawls with the robots.txt file.

Blekko's best customers and most ardent fans? Reference librarians. Yup, people who needed web search to do their jobs, not to find the movie times for the latest feature. Blekko never did try to create a subscription service, but I think such a service that is somewhere between free and the $$$ of LexisNexis has a shot, at least as a lifestyle business. You still need to get rights to the data and that gets harder and harder.

[1] Okay, bots do, but humans don't.
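Two mechanics from that comment can be sketched briefly: dropping stop words from a query, and restricting results to a curated host list minus a per-user blacklist. This is not Blekko's implementation, which the comment does not describe; the function names, the host names, and the three-word stop list (just the words the comment names) are assumptions for illustration.

```python
import re
from urllib.parse import urlsplit

# Only the three stop words named in the comment; a real list would be longer.
STOP_WORDS = {"what", "is", "the"}


def normalize_query(query):
    """Lowercase the query, strip punctuation, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


def curated_results(urls, slashtag_hosts, user_blacklist=frozenset()):
    """Keep URLs on a curated host list that the user has not blacklisted."""
    kept = []
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if host in slashtag_hosts and host not in user_blacklist:
            kept.append(url)
    return kept


# Hypothetical curated list for a finance topic.
finance_hosts = {"bank.example", "reviews.example"}
urls = [
    "https://bank.example/cards",
    "https://spam.example/best-cards-2021",
    "https://reviews.example/credit-cards",
]

print(normalize_query("What is the best credit card?"))
print(curated_results(urls, finance_hosts, user_blacklist={"reviews.example"}))
```

The per-user blacklist is the piece that, per the comment, depends on knowing an id for the person making the query.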
mrlanderson69 over 3 years ago
We are working on exactly this problem.

If anyone wants to see a demo, please email me.
hammyhavoc over 3 years ago
Why not Searx or YaCy?
mitchtbaum over 3 years ago
How will these search engines interoperate?
deadalus over 3 years ago
I also consider paywalls to be spam. Clicking on a link and finding out that it is paywalled is a massive waste of time.
tester756 over 3 years ago
How about Bing?

Is it viable competition?
liveoneggs over 3 years ago
Do Google search engineers use ad blockers?
1vuio0pswjnm7 over 3 years ago
Ranking search results on popularity is flawed. It may improve search engine performance and the effectiveness of online advertising but it penalises users who can think critically and independently. There is an underserved market that has been left behind by Google and its "competitors".

PageRank seemed to borrow from the concept of citation count. The idea that "importance" could be measured by the number of times a webpage, like a paper published in a peer-reviewed academic journal, was referenced by other webpages, like other papers published in peer-reviewed academic journals. The initial name for the project before "Google" was "Backrub", referring to the reliance on "backlinks" to quantify importance.

An index of a commercially-oriented www full of sites supported by online advertising is nothing like Web of Science or some other database collection that allows ranking by citation count. The www has no peer review and no limits on commercial activity.

Google succeeded in creating something highly profitable and sometimes useful, but the founders never delivered on their original promise. That was a search engine in the academic realm, where the technical details were public, and one that would be free from the influence of advertising.^1 Instead the project was turned into an online advertising business. A 180-degree pivot.

The moral/ethical debate went from the question of being advertising-supported to the question of invading the personal privacy of users, for the benefit of advertising. Whatever ideals the founders held in 1998 were overtaken by the lure of pure financial success. Once opposed to the idea of using cookies for advertising purposes, the founders were persuaded to purchase DoubleClick, ground zero for the explosion of online ads, for $3.1 billion. Not sure what, if any, moral/ethical debate remains today. While the company is being sued simultaneously by hundreds of plaintiffs, including the US government, one of the founders is "hiding out" on a small island in the South Pacific. Whatever motivations he had to make an open, academic search engine free from the influence of advertising, they seem to be gone.

In sum, the world still needs a decent web search engine free from the influence of online advertising.

1. https://infolab.stanford.edu/~backrub/google.html

Excerpts:

"Up until now most search engine development has gone on at companies with little publication of technical details. This causes search engine technology to remain largely a black art and to be advertising oriented (see Appendix A). With Google, we have a strong goal to push more development and understanding into the academic realm.

Appendix A: Advertising and Mixed Motives

Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users.

For this type of reason and historical experience with other media [Bagdikian 83], we expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers.

Furthermore, advertising income often provides an incentive to provide poor quality search results.

[T]here will always be money from advertisers who want a customer to switch products, or have something that is genuinely new. But we believe the issue of advertising causes enough mixed incentives that it is crucial to have a competitive search engine that is transparent and in the academic realm."
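The backlink idea in that comment can be illustrated with a small power-iteration version of PageRank. This is a textbook-style sketch, not Google's production ranking; the three-page link graph is hypothetical, and the damping factor of 0.85 is the conventional choice rather than anything stated in the comment.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a dict of {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical graph: both "a" and "b" link to "c", and "c" links back to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Unlike a raw citation count, a link here is worth more when it comes from a page that itself ranks highly, which is also what makes link rings and link networks worth building for anyone trying to game it.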
1vuio0pswjnm7 over 3 years ago
Ranking search results on popularity is flawed. It may improve search engine performance and the effectiveness of online advertising but it penalises users who can think critically and independently. There is an underserved market that has been left behind by Google and its "competitors".

PageRank seemed to borrow from the concept of citation count. The idea that "importance" could be measured by the number of times a webpage, like a paper published in a peer-reviewed academic journal, was referenced by other webpages, like other papers published in peer-reviewed academic journals. The initial name for the project before "Google" was "Backrub", referring to the reliance on "backlinks" to quantify importance.

An index of a commercially-oriented www full of sites supported by online advertising is nothing like Web of Science or some other database collection that allows ranking by citation count. The www has no peer review and no limits on commercial activity.

Google succeeded in creating something highly profitable and sometimes useful, but the founders never delivered on their original promise. That was a search engine in the academic realm, where the technical details were public, and one that would be free from the influence of advertising.^1 Instead the project was turned into an online advertising business. A 180-degree pivot.

The moral/ethical debate went from the question of being advertising-supported to the question of invading the personal privacy of users, for the benefit of advertising. Whatever ideas the founders held in 1998 regarding the influence of advertising on web search were overtaken by the lure of pure financial success. Once opposed to the idea of using cookies for advertising purposes, the founders were persuaded to purchase DoubleClick, a company with a terrible privacy record that uses cookies and purchasing data to profile users as ad targets,^2 for almost double what they paid for YouTube. Not sure what, if any, moral/ethical debate remains today. While the company is being sued simultaneously by hundreds of plaintiffs, including the US government, one of the founders is "hiding out" on a small island in the South Pacific. Whatever motivations he had to make an open, academic search engine free from the influence of advertising, they seem to be gone.

In sum, the world still needs a decent web search engine free from the influence of online advertising.

1. https://infolab.stanford.edu/~backrub/google.html

Excerpts:

"Up until now most search engine development has gone on at companies with little publication of technical details. This causes search engine technology to remain largely a black art and to be advertising oriented (see Appendix A). With Google, we have a strong goal to push more development and understanding into the academic realm.

Appendix A: Advertising and Mixed Motives

Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users.

For this type of reason and historical experience with other media [Bagdikian 83], we expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers.

Furthermore, advertising income often provides an incentive to provide poor quality search results.

[T]here will always be money from advertisers who want a customer to switch products, or have something that is genuinely new. But we believe the issue of advertising causes enough mixed incentives that it is crucial to have a competitive search engine that is transparent and in the academic realm."

2. https://www.nytimes.com/2000/02/17/technology/us-investigating-doubleclick-over-privacy-concerns.html

https://slate.com/technology/2005/11/why-web-surfers-love-to-hate-cookies.html
waynesonfire over 3 years ago
What a great idea.
thr0wawayf00 over 3 years ago
I'm honestly not trying to take a potshot at PG or YC here, but it's kinda funny to see him saying this after I worked for a YC-backed startup years ago that built its core revenue streams around generating SEO spam; we just marketed it as something else. Just to be clear, I don't think PG or YC are responsible for all or even most SEO spam, but I know firsthand that they've profited from it through at least one of their incubated companies.

I never considered the possibility that an incubator would support a specific product, then later on call for alternatives that would essentially freeze out the original product that they supported. I'm sure this very rarely happens, but it's interesting to see a real-world example in action.