I’d like to defend this guy. What he is doing is testing the trust mechanism.<p>If he went to Google and said ‘I think the trust mechanism is broken’ Google would say: ‘We know, that’s why we are pushing to move everyone to https.’<p>‘That isn’t enough. The padlock on the https page gives users a false sense of security.’<p>‘We don’t agree with that. Where’s your data?’<p>Google wouldn’t have accepted this. They have pushed full HTTPS hard, and suggesting that it has a negative consequence is unacceptable to them.<p>His experiment has proven the problem. How else could it have been demonstrated?<p>Ideally this would have been a large scale study done by academics. But this guy doesn’t have those resources. Nobody is going to fund this research.<p>The depressing thing here is that everybody is more interested in calling this guy a jerk than dealing with the issues he has raised.<p>Trust on the internet is broken. This guy did it with ease. Imagine what is being done by those who want to scam millions?<p>But yeh, call him a jerk and then you can bury your unease beneath a big pile of outrage. It’s fine. Fine. He’s a jerk.
Hi everyone! I did this. It was just a random cool idea I wanted to try. It worked a little too well and I quickly moved it to a disposable site to test if the page will get penalised by Google. I got busy with other things and forgot about it. When I bumped into it again I decided to write about it, for two reasons: 1) To me it's hard to believe that Chrome would allow for this to happen in the first place and 2) that Google wouldn't penalise a site doing this. Well, since the story was published Google tracked down my test page (most likely by using the source code I revealed on my blog) and completely de-indexed the whole domain.
Surprisingly few comments about the actual attack mechanism here. IMO discussion of whether the author's PoC was ethical is interesting but far less important than the question about how to handle the actual vulnerability; this kind of attack could be used for far more damaging things than just recording user behavior. (Such as phishing.)<p>IMO "get rid of the browser history API" (as the article author recommends) isn't the right solution. The history API is important, as it's the only way to make the back button work as expected in single-page applications, or in multi-page applications that don't trigger a full page reload when you click a link. Rather, I'd suggest the following mitigations:<p>1. Require a user gesture for `History#pushState` and `History#replaceState`<p>2. Follow Firefox's example and highlight the most important part of the domain name in the browser UI<p>3. Don't label HTTPS sites as "Secure", as this can be misleading (Chrome's planning to do this starting next month <a href="https://blog.chromium.org/2018/05/evolving-chromes-security-indicators.html" rel="nofollow">https://blog.chromium.org/2018/05/evolving-chromes-security-...</a> )<p>4. Give the back button a different icon when it's taking you to a different domain (maybe "Up" instead of "Back"?)<p>Any other ideas?
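To make mitigation 1 above a bit more concrete, here is a toy model of a session history that only honours a push when a user gesture preceded it. This is purely a sketch: the `SessionHistory` class and its method names are made up for illustration, and it is not how any browser actually implements its history or user-activation logic.

```python
class SessionHistory:
    """Toy model of a tab's history stack (hypothetical, for illustration only)."""

    def __init__(self, first_url):
        self.entries = [first_url]
        # Assumption: a single flag stands in for "the user recently interacted".
        self.user_activated = False

    def user_gesture(self):
        """Record that the user clicked, tapped or pressed a key on the page."""
        self.user_activated = True

    def push_state(self, url):
        """Add an entry only if a gesture is pending; report whether it was added."""
        if not self.user_activated:
            return False  # script-only push with no gesture is ignored
        self.entries.append(url)
        self.user_activated = False  # each gesture pays for one push
        return True


history = SessionHistory("https://example.com/")
history.push_state("https://example.com/fake-results")  # ignored: no gesture yet
history.user_gesture()
history.push_state("https://example.com/page-2")  # accepted
```

In this model a page that pushes entries the moment it loads gets nothing added, while an SPA that pushes in response to a link click keeps working.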
This is an interesting yet disturbing case of blackhat SEO and phishing, where the site owner hijacks the back button and sends visitors to fake sites where he can observe their behaviour.<p>FTA:<p><i>Here’s what I did:<p>1. User lands on my page (referrer: google)<p>2. When they hit “back” button in Chrome, JS sends them to my copy of SERP<p>3. Click on any competitor takes them to my mirror of competitor’s site (noindex)<p>4. Now I generate heatmaps, scrollmaps, records screen interactions and typing.</i>
Somewhat related, Google AMP is also destroying the ability for users to trust URLs. In fact it’s kind of the inverse problem; the URL bar says google.com when the user expects to be on another website. I wouldn’t be surprised if observing the AMP pattern subconsciously made users less suspicious of the trick in OP.<p>It’s also a bit rich to see all the outrage here and deranking by google, since hijacking/proxying to sites in search results is <i>exactly what AMP does.</i>
I don't understand why you would have been expected to report this to Google. It's not an issue or bug with Google, it's a simple gray hat social engineering trick.<p>People linking to fake sites as a dark pattern is nothing novel, you just did so to capture analytics instead of, say, installing a virus or taking someone's credentials. That said, you certainly could have done the latter and gotten views into your competitors' user portals. In my head that's not fundamentally different from, or more unethical than, what you ended up doing.<p>I don't necessarily begrudge you for trying it, but I don't think it's for a noble reason nor do I think it was particularly innovative and the end result is Google doing something unsurprising.
For context: Firefox greys out anything that is not the "real" domain, which remains black. So:<p>google.com.fakesite.io/foobar<p>becomes:<p>(grey "google.com.")(black "fakesite.io")(grey "/foobar")<p>This makes it at least a little more obvious you're not on Google.<p>Although that's still a tricky one for non technical users to protect against. Aside from EV, I can't immediately think of anything else a browser could systematically do to guard against this, to be honest. Blacklists etc, but that's very unsatisfying.<p>It's a pretty old problem, to be fair. I remember almost being phished this way myself back on Myspace, were it not for Firefox's blacklists catching the form submission.<p>Domain names being little endian has been one of the most expensive web sec mistakes in history.
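The grey/black split described above can be sketched roughly like this. Big assumption: the public suffix is treated as a fixed number of trailing labels (one by default, e.g. "io"), whereas real browsers consult the Public Suffix List. The function name `split_for_highlight` and its `suffix_labels` parameter are made up for this example.

```python
def split_for_highlight(url_without_scheme, suffix_labels=1):
    """Split 'host/path' into (grey_prefix, black_domain, grey_rest).

    Assumption: the registrable domain is the last `suffix_labels` labels
    plus one more. A real implementation would use the Public Suffix List.
    """
    host, sep, path = url_without_scheme.partition("/")
    labels = host.split(".")
    keep = suffix_labels + 1
    if len(labels) <= keep:
        # Nothing in front of the registrable domain to grey out.
        return "", host, sep + path
    grey_prefix = ".".join(labels[:-keep]) + "."
    black_domain = ".".join(labels[-keep:])
    return grey_prefix, black_domain, sep + path


print(split_for_highlight("google.com.fakesite.io/foobar"))
```

For suffixes with more than one label (for instance "co.uk") you would have to pass `suffix_labels=2` by hand, which is exactly the part the Public Suffix List automates.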
> Record actual sessions (mouse movement, clicks, typing)<p>> I gasped when I realised I can actually capture all form submissions and send them to my own email.<p>How many bad actors have been doing the same and for how long? This doesn't sound like something Google should just brush under the carpet and expect no one else is doing it. Although I wish the author had reached out to Google first to see how they would have handled it, I thank him for publishing it.
The big issue here is: Who does our browser work for?<p>People worry that self-driving cars will take us to "promoted" coffee, if we're not specific. More generally, software agents as a rule are loyal to their creators, not to us. That we put up with this is absurd.<p>Browsers should be intelligent agents that are entirely loyal to the person browsing. For example, no site should be able to tell whether we see ads or not. As one site-by-site option, process the ads exactly as if they're reaching our senses, but don't actually render them so they reach our senses.<p>Not even having a back button loyal to us? That's obscene. Copyright infringement is the MacGuffin in this movie; the real story is that we're wusses for having totally lost this balance of power struggle in our personal software.
Disturbing, fascinating, obvious in hindsight.<p>Here’s another angle: a “bounce” back to Google that comes too quickly is a negative ranking signal. Keeping visitors from going back to Google, while making them think they did, also makes this black hat SEO.
Why do browsers allow changing the back button history before the visitor arrived at the domain? Seems like a subtle cross origin attack if that is truly what's happening.
I'm surprised he's willing to put his real name to this. I can't immediately see that it's actually illegal, but it still screams red flag for unethical behavior.
In many ways this is malicious deception. In any instance where a login form is included in the scraped mirror, that represents an attacked user, and a phishing attempt.<p>If someone did this in the wild, in an uncontrolled situation involving random strangers, it risks serious misinterpretation, and worse.
While we're on this topic, I have a related situation and wonder if my case is common:<p>I built a brochure site for a mom-and-pop business a decade ago. The domain expired some time ago, and it was snapped up by someone who repopulated it with the original content scraped from the Internet Archive. It looks and behaves exactly like it did when I controlled it, except that a phrase in the frontpage content now links to some supplement sale site.<p>Is there a name for this SEO bullshitery? What can someone do who isn't American and who therefore can't file a DMCA?
I've seen Dejan speak and I'd recommend following his work because he does very interesting black hat things like this in SEO. He has so many out of the box ideas like this which are brilliant.
Long time ago I wrote about this technique: <a href="http://mixedbit.org/referer.html" rel="nofollow">http://mixedbit.org/referer.html</a>
Besides back button navigation, I also had ideas to use a fake malware warning or just take a victim directly to fake search engine results.
This seems to have some fairly scary security implications if used maliciously, but I can't think of a good way to protect against this.<p>Does anyone know of a browser extension to limit access to the history API?
A couple of years back I was talking to someone who did SEO for a popular education network. The company was spending millions of dollars every month on SEO and advertising.<p>Their modus operandi went like this:<p>1. Offer money to license or buy a smaller competitor's content<p>2. If that doesn't work, crawl and clone the site<p>3. Pump a lot of money into Google Ads, so that the cloned site now appears as an ad above the legitimate site. Google makes such scams easier now by making the ads look like organic results - a non technical user would hardly notice.<p>4. The legitimate site just dies.<p>I was asked to build a tool which crawls sites, which I refused. But I learned how professional SEO works.
Is this kind of blatant censorship, where Google delists information it doesn't like, common? It's not like the experiment was ongoing, is it?
fun fact: you can do the same thing again, but use the AMP version and call yourself an amp-provider, just like google does.<p>technically they won't be able to complain because you can say providing amp content assumes they want to be served by you, and you can fiddle as much as you want (e.g. adding tracking code) just like google does when it serves someone else's content as amp.
I just realized that it is not necessary to hijack the back button!<p>1) Watch out for users coming from Google (or Bing) using the referrer field.<p>2) Randomly choose 5% of them and redirect them to your shady domain using a temporary 303 redirect. [If Google notices this, they will hate you.]<p>3) Host a copy of your competitor's page on the shady domain, with all the tracking enabled. [This is illegal! You may get a lawyer's C&D, a nastygram or more.]<p>I guess that when users find your site in Google and click the link, most of the time they will not be sure which link they chose, so they will not notice the change. And if they realize that they went to the wrong site, they will click the back button and click the search result again, and get the normal page like the other 95% of people.<p>This is probably more credible if the search query in the referrer doesn't have your site in it, so the user is looking for any generic site that includes you and your competitors.<p>As I said before, this is shady and some parts are illegal, so don't do it. Google may demote your site, and you could also get into legal trouble.
This is really a good example of why it is so difficult for security experts to do research and experiments where real users are involved.<p>What Mr. Petrovic did is illegal in most developed countries: copyright violation (copying web pages) and monitoring and storing user behavior without their consent (and, even worse, by phishing). It doesn't matter that he did it for a "very brief period of time (for ethical reasons)". LOL. If I tried this kind of stuff where I work, I would have a long unpleasant talk with our legal/ethics department afterwards. I cannot even do a network scan on the Internet without first notifying God and a couple of lesser gods.<p>I am also wondering whether that's good publicity for the author's company. The author is basically saying: "We are doing things without being fully aware (or without caring) of the legal consequences. Are you sure you want to be our customer?".
What's concerning is that the post author seems not to see the problem with trying to sit on both sides of the fence at once.<p>As others have said, the way this was done is likely to be against numerous laws in most major jurisdictions. If you wish to do this as a PoC then simply put a notice up on the page that initiates it and use dummy "competitor" content, so you've got some semblance of user consent/transparency without copyright infringement. That would work just as well for flagging it up as a concern to others.<p>Or if being up-front about it is not the side you are on, do this fully admitting that it's wrong and face any consequences (it doesn't sound like this was the post author's aim, especially given the follow-up comments).<p>For a "very brief period of time" doesn't cut the mustard here, just as it wouldn't with briefly stealing something from a bank or briefly kidnapping someone (both crimes where one could sometimes argue there may not be permanent damage, although even that likely isn't true in many cases).
1. Years ago, when I was learning web development, I bought a domain and just copy-pasted Amazon’s log in page to check how it works. Amazon somehow found out about this, and Google punished that domain after the incident; it just couldn’t go up in rankings after that.<p>If I remember correctly they had even put that domain on sites that report/list “phishing” sites, so if you Googled the domain you would also get the “they are fraud” results.<p>2. I think that most pro users just New-Tab everything and go from there. Seems to me that going in and out of search results all in one tab is kind of slow too.
> I had this implemented for a very brief period of time (for ethical reasons) and then moved to one of my disposable domains where it still runs after five years and ranks really well, though for completely different search terms.<p>Am I reading this correctly? He's been doing this since 2013 and still wants to use the white hat card?
Interesting hack. Sorry about the whole google de-indexing thing. My question would be, did you really gain any useful insights? From competitors, you can normally figure out which page is their most viewed and then figure out how they merchandise it on their homepage. Without "hacking" it.
Just remove the push state history api. Set state is fine.<p>Push state is totally unnecessary since we already had a technology for this: anchors!<p>Instead of site.com/my/page, it's site.com/#/my/page.<p>What is wrong with this? It does literally everything you need and is supported by most routers out of the box!
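As a rough sketch of the anchor approach above, this is how a route could be read out of the fragment of a URL like site.com/#/my/page. The function name `route_from_url` and the fallback to "/" are my own assumptions, not something any particular router does.

```python
from urllib.parse import urlsplit


def route_from_url(url):
    """Return the in-page route stored after '#', or '/' if there is none."""
    fragment = urlsplit(url).fragment
    # Assumption: only fragments that look like paths count as routes, so an
    # ordinary anchor such as '#section-2' falls back to the root route.
    if fragment.startswith("/"):
        return fragment
    return "/"


print(route_from_url("https://site.com/#/my/page"))
```

One thing to keep in mind with this scheme: browsers do not send the fragment to the server, so the route is only visible to client-side code.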
If Chrome outright disables JavaScript's ability to alter the "Back" path, it may break some (poorly designed) applications. A compromise is to prompt with a warning.
Presenting yourself as someone else is called fraud in my book.<p>Changing the back button might be clever, but all the rest is just simple. People don't do this because, I think, in a lot of countries it is illegal.<p>There is a way you can protect your site a little from this: add canonical tags to all your pages.
When an attacker updates the back button to Google they will have a hard time getting the cloned pages up in the results.
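For anyone unsure what the canonical tag suggestion looks like in practice, here is a minimal sketch of generating one server-side. The helper name `canonical_link_tag` and the example URL are made up; the only thing taken as given is the standard `<link rel="canonical">` element. The idea is that a scraper who copies your markup wholesale also copies a tag pointing back at your original URL.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Build a <link rel="canonical"> element for a page's <head>."""
    # Escape the URL so characters such as '&' and '"' stay valid inside
    # the attribute value.
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url, quote=True))


print(canonical_link_tag("https://example.com/page?a=1&b=2"))
```

This only helps a little, as said above: a careful attacker can strip or rewrite the tag when cloning the page.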
Honestly, it doesn't shock me in the slightest that someone who markets themselves as an SEO expert would not only do something as unethical as this, but also brag about it, as though they think they've done something they should be proud of.