Hmmm. Let's say that Bing sets up a script that sends queries to Google and then records the results. That's clearly copying. But what Bing does is when you use its toolbar, it watches what you do and uses that information to rank results. Is that really copying? It showed Google's Honeypot page because Google's engineers were clicking on the Honeypot page with the toolbar installed. That isn't copying Google's results, that's copying the actions of Bing toolbar users.<p>This can easily be demonstrated. Google can set up a second honeypot but instruct its engineers not to click on the link, ever. If it shows up in Bing's results, then Bing is watching what Google returns and scraping its results.<p>But if the second Honeypot doesn't show up in Bing's results, then clearly Bing isn't copying Google's results, it's copying its toolbar's preference for links.<p>The entire thing is moot to me. The takeaway isn't whether Bing copies Google. The takeaway is that Bing's toolbar is spyware :-)
I had a front row seat for this test. I believe the experiment we ran provides conclusive proof. I'm on a panel with a representative from Bing later today and I'll ask Bing about this directly.
I thought this was the most interesting part:<p>> The day after that, Bing contacted me. They were hosting an event on February 1 to talk about the state of search and wanted to make sure I had the date saved, in case I wanted to come up for it. I said I’d make it. <i>I later learned that the event was being organized by Wadhwa, author of that TechCrunch article.</i> [emphasis mine]<p>So the supposedly independent author of an article on TechCrunch that kicked off a massive wave of Google criticism is, less than a month later, organizing events specifically for a Google competitor? Boy, <i>that</i> sure seems above-board.
Uhh... Yeah? Everyone in search does this. I've worked at and with 3 major search engine initiatives, and we all tested heavily against Google in a variety of ways.<p>But the article definitely gets a few things wrong. For example, having worked at Bing I can tell you this: in general "obvious" misspellings are autocorrected without comment. It's not some sort of magical copying procedure, it's actually a policy. Want proof? Here's an example query you can repeat: <a href="http://fayr.am/4KdG" rel="nofollow">http://fayr.am/4KdG</a> (direct query link: <a href="http://fayr.am/4JZD" rel="nofollow">http://fayr.am/4JZD</a>)<p>But otherwise, shit yes everyone is scrutinizing google trying to figure out what they're doing. That doesn't mean other players aren't doing their own optimizations, or even running relevancy metrics against other search engines. Relevancy is not a concept with fixed metrics, and every player in the search market does everything they can to figure out what their competitor is doing.<p>And even the raw results leakage is fairly par for the course. It's not like Bing searches are a crawl of google searches; Microsoft gets this data from browsers running this toolbar and uses it to help shore up queries where they don't return good results.
No surprise, as both DDG and Blekko disclose that they use Bing for long-tail queries, but it works at both of those engines, too:<p><pre><code> http://duckduckgo.com/?q=hiybbprqag
http://duckduckgo.com/?q=mbzrxpgjys
http://duckduckgo.com/?q=indoswiftjobinproduction
http://blekko.com/ws/hiybbprqag
http://blekko.com/ws/mbzrxpgjys
http://blekko.com/ws/indoswiftjobinproduction</code></pre>
It may not be illegal or 'cheating', but it's incredibly stupid for the same reason cheating is: Without the person you are cheating from, you can't pass the test!<p>In this case, the customers don't get relevant results unless other potential customers use the competition! In short, Bing's results are only good if Google is popular.<p>Why would you invest time relying on your competition? Shouldn't you be striving to match or beat them, rather than trying to piggy-back on them?
"Is it illegal? Is it cheating? Is it unfair?" Who cares? Google already got everything it needed out of this situation: a gigantic PR win, and a morale boost for their own team. Well played.
My take is this: the whole Google ethos is that they are trying to have the best algorithm to give the best results. Outside of this sting they have always been at pains to put forward the view that nothing is manually ranked.<p>I think the same thing applies to Bing here: if they have a generic algorithm that ranks results based on toolbar (or other data) it could be easy to see how their data is skewed by Google given the amount of traffic Google search gets compared to the rest of the internets. This seems fine to me.<p>But if their algorithm does stuff with activity on google.com <i>because it is google.com</i> then this is a pretty clear foul - it is both essentially copying, and the equivalent of manually ranking results (specifically, Google results)<p>The corollary of this is that if their algorithm is generic, then it will still work if Google were to cease to exist. If it's not generic, it would be useless without Google.
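The generic-versus-targeted distinction above can be made concrete with a rough sketch. Everything here is hypothetical: the event format `(referrer_url, terms, clicked_url)`, the function names, and the example hosts are assumptions for illustration, not anything known about Bing's actual pipeline.

```python
from urllib.parse import urlparse


def referrer_host(url):
    # Host of the page the user was on when they clicked (None if unparseable).
    return urlparse(url).hostname


def click_counts(events, only_host=None):
    """Count (terms, clicked_url) pairs from hypothetical toolbar events.

    With only_host=None the signal is generic: every referrer counts.
    With only_host set, the signal singles out one site's result pages.
    """
    counts = {}
    for referrer, terms, clicked in events:
        if only_host is not None and referrer_host(referrer) != only_host:
            continue
        key = (terms, clicked)
        counts[key] = counts.get(key, 0) + 1
    return counts


events = [
    ("http://www.google.com/search?q=x", "x", "http://a.example/"),
    ("http://search.example/?q=x", "x", "http://a.example/"),
]

# The "would it still work without Google?" test: drop Google's events.
without_google = [e for e in events if referrer_host(e[0]) != "www.google.com"]
print(click_counts(without_google))                              # generic signal
print(click_counts(without_google, only_host="www.google.com"))  # targeted signal
```

Under these assumptions the generic version still has data once Google's events are removed, while the targeted version has nothing left, which is the corollary in the comment.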
When asked by SearchEngineLand, Google's Singhal seems to imply Google Toolbar clicktrail data is never used for ranking, but his wording is actually a bit vague:<p><i>Absolutely not. The PageRank feature sends back URLs, but we’ve never used those URLs or data to put any results on Google’s results page. We do not do that, and we will not do that.</i><p><i>Put any results</i> could be read narrowly as "this data isn't used to add pages to the index", or more generally as "this data isn't used to rank results relative to each other". Also, Singhal's pledge that "we will not do that" is much stronger than any statement I've ever seen in any Google privacy policy, which all pretty much say Google may use any info they have to improve their services.<p>Matt Cutts, can you clarify if Singhal in fact meant the 'narrow' or 'general' interpretation above?<p>And, if the 'general' meaning, then is there any statement about the use of clicktrail data in Google's published privacy policies that is as strong as Singhal's?
> It strongly suggests that Bing was copying Google’s results, by watching what some people do at Google via Internet Explorer.<p>Wow, it almost seems that is exactly what they are doing, which is some pretty dirty stuff. MS has always had a shady track record, but I thought the company had gotten a lot better recently.
The real question should be why Google is not doing this. Bing seems to be learning from what results users choose and improving their results.<p>Seems like a no-brainer, unless I missed something.<p>I also really like this for some reason. It's very ... gangster. Shows that Bing is scrappy and willing to bend the rules.<p>That being said, I will still continue using Google.
Sort of petulant on Google's part to release this, no?<p><i>Of course</i> your competitors are going to copy you. It's not innovative, and you might consider it 'cheating' if you forget that each and every one of us is building on a foundation laid by other people. But it works, and that's why it happens and will continue to happen.
Do I have this right?<p>1. User does a search in a Microsoft toolbar, using Google as his search engine. User is searching for $terms.<p>2. User gets a results page. User clicks on the entry in the results for $site.<p>3. Toolbar sends back to Microsoft that the $site was the first result the user chose for $terms.<p>4. Bing uses this to increase $site's placement in searches for $terms.<p>An interesting question then would be whether or not Microsoft also "copies" from Bing? That is, if you are using Bing as your search engine, do they still use the fact that you went to $site after searching for $terms to adjust the rankings?
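Steps 3 and 4 above can be sketched in a few lines. This is only an illustration under stated assumptions: `ClickLog`, `rerank`, the additive `weight`, and the `*.example` hosts are all made up here, and nothing is known publicly about how Bing actually weighs such a signal.

```python
from collections import Counter, defaultdict


class ClickLog:
    """Hypothetical store of (terms, clicked site) pairs reported by a toolbar."""

    def __init__(self):
        self._clicks = defaultdict(Counter)

    def record(self, terms, site):
        # Step 3: the toolbar reports which site the user chose for these terms.
        self._clicks[terms.strip().lower()][site] += 1

    def click_share(self, terms):
        # Fraction of recorded clicks for these terms that went to each site.
        counts = self._clicks.get(terms.strip().lower(), Counter())
        total = sum(counts.values())
        return {site: n / total for site, n in counts.items()} if total else {}


def rerank(terms, baseline_scores, log, weight=1.0):
    # Step 4: add a click-derived boost to whatever score the engine already had.
    shares = log.click_share(terms)
    sites = set(baseline_scores) | set(shares)
    scored = {
        s: baseline_scores.get(s, 0.0) + weight * shares.get(s, 0.0) for s in sites
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda s: (-scored[s], s))
```

Note what happens for a nonsense query with no baseline results at all: the click data is the only signal, so a handful of clicks decides the entire result page. That would be consistent with the honeypot terms surfacing while ordinary queries are mostly unaffected.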
So in an effort to be as good as a competitor, MS is watching what you do when you interact with that competitor's website and sending that information home. Seems like a really big reason to suggest to anybody you know that they uninstall the Bing toolbar.
Google gathers lots of user data on 3rd party websites via services such as (to name a few):
- Google analytics (opted in for data sharing)
- Chrome
- Google toolbar<p>@Matt Cutts - I'd love it if you could confirm exactly which user data you DO and DO NOT use to influence rankings. Or, at the very least say on record that you don't do what Bing are doing and use data from bing.com<p>Overall, I'm not surprised that Bing are doing this for some keywords - all the major search engines use a massive number of different signals. I'll be more surprised if it turns out this is happening at a large scale or for competitive terms.
It's instructive to think of the cases where Google can return a search result, even though the searched word doesn't appear on the page. Most often, this occurs because another site includes an outlink to the page, with the searched word. That is, they're 'copying' a publicly-available source that indicates that word is associated with that page.<p>I see this Microsoft tactic as similar. They're considering search terms that resulted in a visit to the page from other search engines as being important indicators of the page content. If they have that URL-to-URL-trail data legally, and the signal works well, and they are not singling out Google's URLs as the only source of such a signal, I'm not sure what the problem is.<p>Google didn't get where they are by throwing out legally-collected useful data, and Bing won't catch up to a leader who has clicktrail sensors <i>everywhere</i>, via analytics/toolbar/ads/mobile/etc., by throwing away legally-collected useful data.
TL;DR<p>1. Bing is inferring search results from user behavior, collected via Bing Toolbar<p>2. Google team makes an experiment: using Bing Toolbar to feed Bing particular behavior. Namely, they all go from a search result page on Google.com laden with a unique word to a particular target site.<p>3. Bing infers connection between the unique word and the target site.<p>4. Google cries cheating.
Wow, I'm surprised by all the developers on Microsoft's side on this one. Google spends a lot of money developing proprietary algorithms for determining search results. Microsoft is then stepping in and taking advantage of the money Google spent by copying some of their results. It's rather like someone taking the results of a Consumer Reports list and publishing it themselves. It borders on illegal, and it's definitely shady.<p>But what I think is more important is all of the flak that Google has been catching for supposedly slipping in its quality of search results. If its quality is so poor, then why is Bing stealing its results? It's a great method of striking back at the negative PR they've been receiving.
So what? Is it a scandal that Walmart and Target both send employees into each other's stores and actively monitor prices on items? It's called being competitive, and to be competitive you have to at least match what your competitor is doing, then beat them.
It's interesting. A little bit like the browser wars, isn't it? Browsers are really similar to one another. If any noteworthy new feature appears in one, it is very likely to be copied by another, which is a very good thing for end users and is a reason why competition is good. At the end of the day, users want more or less the same functionality, no matter which browser they use. There are some differences in details and quality, but they are rather minor.<p>Both Bing and Google are targeted at the mass market and I think people expect the same from both. If Google does it right, there is nothing more to invent. And even if there is, it is probably pretty expensive. It is so much easier to copy than to invent from scratch, just to get something almost exactly the same as Google :)<p>I am really interested in what Bing could do to be REALLY different from or better than Google. And if they did, Google would most likely do something very similar :)
They should have made one of the search terms "Agloe, New York"! [1]<p>Footnotes<p>[1] - <a href="http://en.wikipedia.org/wiki/Fictitious_entry" rel="nofollow">http://en.wikipedia.org/wiki/Fictitious_entry</a>
"Is this illegal?"<p>IANAL, but in certain jurisdictions, most certainly <i>yes</i>. Many countries have copyright laws that protect compilations of things that are individually not worthy of copyright, for example telephone books. Copying down an individual telephone book entry is of course not a copyright violation, but copying the whole listing in a systematic fashion is.<p>I'd guess that this law applies to search engine rankings as well - rankings/listings of individual items that are not protected by copyright, but where a lot of effort goes into producing the listing itself.
A few quick thoughts:<p>- generally speaking, the conclusion seems to be that for regular queries, Bing uses mostly other clues to figure out relevance, so this is basically a storm in a cup of water. Regardless, since both Google's and Bing's algos are closed-source, we're going on faith when either company says data gathered from one of their products doesn't affect search quality.<p>- the whole thing about making a ranking overrider and talking about it publicly seems like a stupid move. Why in the world would you say you developed such code and then "deleted it" in an all-code-is-version-controlled-these-days world? This won't go very well against the claims that Google gives preferential treatment to its own services (e.g. email, maps) vs competitors.<p>- The experiment reportedly was triggered because Bing results were getting better for misspelled searches. But, seriously, returning wikipedia as the top result for something with low levenshtein distance to a rare word is not exactly rocket science...<p>- if Google feels that its SERPs are the most relevant possible, shouldn't it make sense that competitors trying to improve relevance will inevitably end up showing the same results as Google on at least a subset of queries?<p>- if you're saying Bing has just as good results as Google, regardless of the means to the goal, then how does publicizing that help the whole "Google's overrun by spam" meme going on?
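For anyone unfamiliar with the Levenshtein point above, here is a minimal sketch of distance-based spelling correction. The `closest_term` helper, the toy vocabulary, and the `max_distance` cutoff are assumptions for illustration; real engines presumably combine this kind of thing with query logs and much else.

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_term(query, vocabulary, max_distance=2):
    # Pick the known word nearest to the query, or None if nothing is close.
    best = min(vocabulary, key=lambda w: (levenshtein(query, w), w), default=None)
    if best is None or levenshtein(query, best) > max_distance:
        return None
    return best


print(closest_term("levenstein", ["levenshtein", "lichtenstein"]))
```

Once the rare word is recovered this way, ranking an encyclopedia article first for it is the easy part, which is the point being made about the misspelling results.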
The interesting thing here is that Google now has the smarts and power to play games with Bing, and were I MS, that fact would scare me more than any lawsuit.
<i>Google likens it to the digital equivalent of Bing leaning over during an exam and copying off of Google’s test.</i><p>Isn't that basic classroom solidarity?
I'm really curious as to how this is different from Bing using Google's search results in some form of aggregate page ranking. If we assume that some arbitrary metric of "authenticity" exists for searches and a search for mbzrxpgjys yields a low (<0.1%) authenticity score, but Google suddenly declares that www.page.com is the foremost authority on mbzrxpgjys, it stands to reason that a good page-ranking scheme would take that into account and bump it to the front of the line.<p>I don't think it's cheating. Nowhere in the article does it claim that they aren't doing their own search; they are just using Google's results as part of their own search algorithm. Is that really such a crime?
While the "cheating" angle on this seems hugely overblown, I do think that companies that harvest data through toolbars etc. should be obligated to explain upfront in clear language how they use the data. Not bury it in the legalese of a vast impenetrable ToS.
Give me a break, MS has always played the fast follower game which means they will ride on the work and investment done by the market leader and it's worked out well for them in other parts of their business.<p>Using signals from user behavior on the toolbar on ANY search engine seems to make a lot of sense when it comes to improving search results. MS employees are the biggest QA group for Bing. Internal tools allow employees to tag queries and results that are superior/inferior to Google. Both are displayed side by side and employees provide active feedback to help improve the algorithm and identify more systemic underlying ranking issues.
Instead of whining, I would have gone on the offensive.
So we have a competitor copying our search results. Great. Now how can we fuck with that?<p>Figure out which requests are coming from Microsoft and return a different set of search results (e.g. XXX stuff) so that it doesn't show up in organic Google results. Set the trap and once Bing has incorporated those results for a key term, spam TC and LOL at Steve Ballmer getting worked up.<p>This discredits the relevancy of Bing, and all those PR dollars spent rebranding would have gone down the drain. Imagine searching for a harmless search term like 'poodle' and getting hardcore triple xxx results.<p>Oh well, don't do evil, right?
> For the first time in its history, Google crafted one-time code that would allow it to manually rank a page for a certain term<p>If that's accurate, that's a precedent I'd rather not have seen.<p>(a little help on the grammar here, anyone?)
This is the ultimate opportunity for Google. Can't they somehow spoof the results that are sent back to Bing? I know that if someone was cheating off me in an exam, I would try to give them the wrong answer.
I'm not sure about this. It almost sounds like Google is posturing. The reason I say this is that while Google was getting bombed up until last week with scraper sites, Bing wasn't.<p>If Bing was really copying results, they would have reflected the spam sites, because people click on those when they are highly ranked just as often as they click on the originator site. After all, the problem is that the content is identical.
><i>Is It Illegal?<p>Suffice to say, Google’s pretty unhappy with the whole situation, which does raise a number of issues. For one, is what Bing seems to be doing illegal? Singhal was “hesitant” to say that since Google technically hasn’t lost anything. It still has its own results, even if it feels Bing is mimicking them.</i><p>Funny... that's the <i>exact same argument</i> software / music piracy often makes.
Google wasting time embarrassing Bing? I think this is more interesting than learning what Bing are doing.<p>Google are clearly monitoring Bing (and others) as a matter of course. I'm interested to know what they'd have done if they'd found Bing providing better quality results. Would they have spent resources trying to figure out what Bing were doing right, or would that be "copying" too?
The biggest article surprise for me was Google's claim they don't use the toolbar or Chrome directly to improve search queries. I assumed measuring bounce rates and patterns in link graph traversal across the entire web was part of their raison d'etre, as with Google Analytics
> <i>When the experiment was ready, about 20 Google engineers were told to run the test queries from laptops at home</i><p>An interesting side-effect is that Bing has in its logs the home IPs of the Googlers involved in this research (i.e., anyone who searched for "hiybbprqag" in Dec. '10).
Since when has reverse engineering been cheating? If the article is correct, there still is no allegation that Google's algorithms have been used. I don't think Google is in much of any position to cry foul over any company using data mining to tailor search results.
Buried halfway through the article:<p>"These searches returned no matches on Google or Bing — or a tiny number of poor quality matches, in a few cases — before the experiment went live. [...] Only a small number of the test searches produced this result, about 7 to 9 (depending on when exactly Google checked) out of the 100. Google says it doesn’t know why they didn’t all work, [...]"<p>The writer apparently thinks these results justify concluding the article with this takeaway:<p>"When Bing launched in 2009, the joke was that Bing stood for either “Because It’s Not Google” or “But It’s Not Google.” Mining Google’s searches makes me wonder if the joke should change to “Bing Is Now Google.”<p>okayyyy.
If Bing is really copying, what about those who fought the Bing vs Google war as if it were the Vi vs Emacs war?<p>By any chance, is Bing named after Chandler Bing?<p>"DuckDuckGo" has become my default. It's awesome.
Perhaps Bing tracks clicks for every search engine, not only Google. If so, they are not copying Google, but legally tracking user behavior across the web.
Did anyone stop to think that Bing is leveraging IE to improve/influence search results <i>regardless</i> of the URLs this monitored traffic is gleaned from?<p>Google states 9 of 100 planted queries showed up on Bing. You think Amazon, Godaddy, and AOL could make similar claims?<p>Probably...but those examples aren't worried about their market share evaporating.
With Google's results filled with spam, and with me hardly ever finding what I want on the first search, any sane search engine shouldn't even be replicating them (if they are).
The balls on MS people are fucking amazing: "Harry Shum, VP of search development at Bing, responded by admitting that Google had uncovered a new form of search fraud, and said he wished Google had spoken to Microsoft about it before taking it to the press". So bing is either (a) scraping all web behavior out of ie, or (b) scraping G's search engine results, or (c) both -- and dude is pissy because G didn't give them time to get their lies together in private? Amazing.<p>ps -- there's a word for what MS's software appears to be doing: spyware.
Search for "hacker news" on both. The results are quite different. One might argue Bing is better because they don't have a duplicate result at the top.
I personally don't mind if one is copying the search of another. The whole idea is to get the BEST search results possible, and that's what I use a search engine for: getting by far the best results possible. I don't mind how they do it, and as far as I can tell they aren't breaking any copyright laws...