How I recorded user behaviour on my competitor’s websites

766 points by lukestevens over 6 years ago

40 comments

GedByrne over 6 years ago

I’d like to defend this guy. What he is doing is testing the trust mechanism.

If he went to Google and said ‘I think the trust mechanism is broken’ Google would say: ‘We know, that’s why we are pushing to move everyone to https.’

‘That isn’t enough. The padlock on the https page gives users a false sense of security.’

‘We don’t agree with that. Where’s your data?’

Google wouldn’t have accepted this. They have pushed full HTTPS hard, and suggesting that it has a negative consequence is unacceptable to them.

His experiment has proven the problem. How else could it have been demonstrated?

Ideally this would have been a large scale study done by academics. But this guy doesn’t have those resources. Nobody is going to fund this research.

The depressing thing here is that everybody is more interested in calling this guy a jerk than dealing with the issues he has raised.

Trust on the internet is broken. This guy did it with ease. Imagine what is being done by those who want to scam millions?

But yeh, call him a jerk and then you can bury your unease beneath a big pile of outrage. It’s fine. Fine. He’s a jerk.

dejanseo over 6 years ago

Hi everyone! I did this. It was just a random cool idea I wanted to try. It worked a little too well and I quickly moved it to a disposable site to test if the page will get penalised by Google. I got busy with other things and forgot about it. When I bumped into it again I decided to write about it, for two reasons: 1) To me it's hard to believe that Chrome would allow for this to happen in the first place and 2) that Google wouldn't penalise a site doing this. Well, since the story was published Google tracked down my test page (most likely by using the source code I revealed on my blog) and completely de-indexed the whole domain.

Ajedi32 over 6 years ago

Surprisingly few comments about the actual attack mechanism here. IMO discussion of whether the author's PoC was ethical is interesting but far less important than the question of how to handle the actual vulnerability; this kind of attack could be used for far more damaging things than just recording user behavior (such as phishing).

IMO "get rid of the browser history API" (as the article author recommends) isn't the right solution. The history API is important, as it's the only way to make the back button work as expected in single-page applications, or in multi-page applications that don't trigger a full page reload when you click a link. Rather, I'd suggest the following mitigations:

1. Require a user gesture for `History#pushState` and `History#replaceState`
2. Follow Firefox's example and highlight the most important part of the domain name in the browser UI
3. Don't label HTTPS sites as "Secure", as this can be misleading (Chrome's planning to do this starting next month: https://blog.chromium.org/2018/05/evolving-chromes-security-indicators.html )
4. Give the back button a different icon when it's taking you to a different domain (maybe "Up" instead of "Back"?)

Any other ideas?
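To make mitigation 1 concrete, here is a toy model of the policy, written in Python rather than as browser code. Everything in it is an assumption for illustration: `HistoryModel`, `push_state` and the `user_gesture_active` flag are made-up names, and real browsers track user activation internally in their own way. It only shows the rule itself: a history entry is added only while the page is handling a user gesture.

```python
class HistoryModel:
    """Toy model of a tab's session history with gesture-gated pushState."""

    def __init__(self, first_url: str) -> None:
        self.entries = [first_url]
        # Hypothetical flag the browser would set while handling a click or key press.
        self.user_gesture_active = False

    def push_state(self, url: str) -> bool:
        """Add a history entry only if a user gesture is being handled."""
        if not self.user_gesture_active:
            return False  # script-initiated call (e.g. on page load) is ignored
        self.entries.append(url)
        return True


tab = HistoryModel("https://attacker.example/landing")
print(tab.push_state("https://attacker.example/fake-serp"))  # no gesture yet
tab.user_gesture_active = True  # the user clicked something on the page
print(tab.push_state("https://attacker.example/page-2"))
print(tab.entries)
```

Under this rule a landing page could not quietly stack extra entries behind itself before the visitor has interacted with it, while a single-page application that pushes state in response to link clicks would keep working.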

lukestevens over 6 years ago

This is an interesting yet disturbing case of blackhat SEO and phishing, where the site owner hijacks the back button and sends visitors to fake sites where he can observe their behaviour.

FTA:

Here’s what I did:

1. User lands on my page (referrer: google)
2. When they hit “back” button in Chrome, JS sends them to my copy of SERP
3. Click on any competitor takes them to my mirror of competitor’s site (noindex)
4. Now I generate heatmaps, scrollmaps, records screen interactions and typing.

chatmasta over 6 years ago

Somewhat related, Google AMP is also destroying the ability for users to trust URLs. In fact it’s kind of the inverse problem; the URL bar says google.com when the user expects to be on another website. I wouldn’t be surprised if observing the AMP pattern subconsciously made users less suspicious of the trick in OP.

It’s also a bit rich to see all the outrage here and deranking by google, since hijacking/proxying to sites in search results is exactly what AMP does.

nkozyra over 6 years ago

I don't understand why you would have been expected to report this to Google. It's not an issue or bug with Google, it's a simple gray hat social engineering trick.

People linking to fake sites as a dark pattern is nothing novel, you just did so to capture analytics instead of, say, installing a virus or taking someone's credentials. That said, you certainly could have done the latter and gotten views into your competitors' user portals. In my head that's not fundamentally different or more unethical from what you ended up doing.

I don't necessarily begrudge you for trying it, but I don't think it's for a noble reason nor do I think it was particularly innovative and the end result is Google doing something unsurprising.

nothrabannosir over 6 years ago

For context: Firefox greys out anything that is not the "real" domain, which remains black. So:

google.com.fakesite.io/foobar

becomes:

(grey "google.com.")(black "fakesite.io")(grey "/foobar")

This makes it at least a little more obvious you're not on Google.

Although that's still a tricky one for non technical users to protect against. Aside from EV, I can't immediately think of anything else a browser could systematically do to guard against this, to be honest. Blacklists etc, but that's very unsatisfying.

It's a pretty old problem, to be fair. I remember almost being phished this way myself back on Myspace, were it not for Firefox's blacklists catching the form submission.

Domain names being little endian has been one of the most expensive web sec mistakes in history.
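The greying-out described in this comment can be sketched roughly as follows. This is a minimal Python illustration, not Firefox's actual code: `split_for_display` is a made-up name, and it naively assumes the "real" domain is the last two labels of the hostname. A real browser consults the Public Suffix List, so that names such as `example.co.uk` are split correctly.

```python
from urllib.parse import urlsplit


def split_for_display(url: str) -> tuple[str, str, str]:
    """Split a URL into (grey prefix, black domain, grey suffix).

    Assumption: the registrable domain is the last two labels of the host.
    That is wrong for suffixes such as co.uk; real browsers use the
    Public Suffix List instead.
    """
    # urlsplit only fills in the host when a scheme or leading // is present.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.hostname or ""
    domain = ".".join(host.split(".")[-2:])
    prefix = host[: len(host) - len(domain)]
    suffix = parts.path
    if parts.query:
        suffix += "?" + parts.query
    return prefix, domain, suffix


print(split_for_display("google.com.fakesite.io/foobar"))
```

For the URL in the comment this yields the three pieces `google.com.`, `fakesite.io` and `/foobar`, matching the grey/black/grey split described above. The scheme and port are dropped here purely to keep the sketch short.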

3eto over 6 years ago

> Record actual sessions (mouse movement, clicks, typing)

> I gasped when I realised I can actually capture all form submissions and send them to my own email.

How many bad actors have been doing the same and for how long? This doesn't sound like something Google should just brush under the carpet and expect no one else is doing it. Although I wish the author had reached out to Google first to see how they would have handled it, I thank him for publishing it.

Syzygies over 6 years ago

The big issue here is: Who does our browser work for?

People worry that self-driving cars will take us to "promoted" coffee, if we're not specific. More generally, software agents as a rule are loyal to their creators, not to us. That we put up with this is absurd.

Browsers should be intelligent agents that are entirely loyal to the person browsing. For example, no site should be able to tell whether we see ads or not. As one site-by-site option, process the ads exactly as if they're reaching our senses, but don't actually render them so they reach our senses.

Not even having a back button loyal to us? That's obscene. Copyright infringement is the MacGuffin in this movie; the real story is that we're wusses for having totally lost this balance of power struggle in our personal software.

encoderer over 6 years ago

Disturbing, fascinating, obvious in hindsight.

Here’s another angle: a “bounce” back to Google that comes too quickly is a negative ranking signal. Keeping visitors from going back to Google, by making them think they in fact did, makes this black hat SEO as well.

paulryanrogers over 6 years ago

Why do browsers allow changing the back button history before the visitor arrived at the domain? Seems like a subtle cross origin attack if that is truly what's happening.

yjftsjthsd-h over 6 years ago

I'm surprised he's willing to put his real name to this. I can't immediately see that it's actually illegal, but it still screams red flag for unethical behavior.

saintPirelli over 6 years ago

How does a person get so much flak for hacking - on Hacker News?

hw_penfold over 6 years ago

In many ways this is malicious deception. In any instance where a login form is included in the scraped mirror, that represents an attacked user, and a phishing attempt.

If someone did this in the wild, in an uncontrolled situation involving random strangers, it risks serious misinterpretation, and worse.

markdown over 6 years ago

While we're on this topic, I have a related situation and wonder if my case is common:

I built a brochure site for a mom-and-pop business a decade ago. The domain expired some time ago, and it was snapped up by someone who repopulated it with the original content scraped from the Internet Archive. It looks and behaves exactly like it did when I controlled it, except that a phrase in the frontpage content now links to some supplement sale site.

Is there a name for this SEO bullshitery? What can someone do who isn't American and who therefore can't file a DMCA?

jackgolding over 6 years ago

I've seen Dejan speak and I'd recommend following his work because he does very interesting black hat things like this in SEO. He has so many out of the box ideas like this which are brilliant.

mixedbit over 6 years ago

A long time ago I wrote about this technique: http://mixedbit.org/referer.html Besides back button navigation, I also had ideas to use a fake malware warning or just take a victim directly to fake search engine results.

digitalboss over 6 years ago

Update from site "Google’s team has tracked down my test site, most likely using the source code I shared and de-indexed the whole domain."

nsmog767 over 6 years ago

This is easy to hate on, and certainly ethically dubious....but man do I love it.

caffeinated_me over 6 years ago

This seems to have some fairly scary security implications if used maliciously, but I can't think of a good way to protect against this.

Does anyone know of a browser extension to limit access to the history API?

jeswin over 6 years ago

A couple of years back I was talking to someone who did SEO for a popular education network. The company was spending millions of dollars every month on SEO and advertising.

Their modus operandi went like this:

1. Offer money to license or buy a smaller competitor's content
2. If that doesn't work, crawl and clone the site
3. Pump a lot of money into Google Ads, so that the cloned site now appears as an ad above the legitimate site. Google makes such scams easier now by making the ads look like organic results - a non technical user would hardly notice.
4. The legitimate site just dies.

I was asked to build a tool which crawls sites, which I refused. But I learned how professional SEO works.

megous over 6 years ago

Is this kind of blatant censorship, where Google delists information it doesn't like, common? It's not like the experiment was ongoing, is it?

gcb0 over 6 years ago

fun fact: you can do the same thing again, but use the AMP version and call yourself an amp-provider, just like google does.

technically they won't be able to complain because you can say providing amp content assumes they want to be served by you, and you can fiddle as much as you want (e.g. adding tracking code) just like google does when it serves someone else's content as amp.

gus_massa over 6 years ago

I just realized that it is not necessary to hijack the back button!

1) Watch out for users coming from Google (or Bing) using the referrer field.
2) Randomly pick 5% of them and redirect them to your shady domain using a temporary 303 redirect. [If Google notices this, they will hate you.]
3) Host a copy of your competitor's page in the shady domain, with all the tracking enabled. [This is illegal! You may get a lawyer C&D, nastygram or more.]

I guess that when the user finds your site in Google and clicks the link, they will most of the time not be sure which link they chose, so they will not notice the change. And if they realize that they went to the wrong site, they will click the back button and click the search result again, and get the normal page like the other 95% of people.

This is probably more credible if the search field in the referrer doesn't have your site in it, so the user is looking for any generic site that includes you and your competitors.

As I said before, this is shady and some parts are illegal, so don't do it. Google may demote your site, and also you can get some legal problems.

tralarpa over 6 years ago

This is really a good example why it is so difficult for security experts to do research and experiments where real users are involved.

What Mr. Petrovic did is illegal in most developed countries: copyright violation (copying web pages) and monitoring and storing user behavior without their consent (and, even worse, by phishing). It doesn't matter that he did it for a "very brief period of time (for ethical reasons)". LOL. If I tried this kind of stuff where I work, I would have a long unpleasant talk with our legal/ethics department afterwards. I cannot even do a network scan in the Internet without first notifying God and a couple of lesser gods.

I am also wondering whether that's good publicity for the author's company. The author is basically saying: "We are doing things without being fully aware (or without caring) of the legal consequences. Are you sure you want to be our customer?".

nmstoker over 6 years ago

What's concerning is that the post author seems not to see the problem with trying to sit on both sides of the fence at once.

As others have said, the way this was done is likely to be against numerous laws in most major jurisdictions. If you wish to do this as a PoC then simply put a notice up on the page that initiates it and use dummy "competitor" content, so you've got some semblance of user consent/transparency without copyright infringement. That would work just as well for flagging it up as a concern to others.

Or if being up-front about it is not the side you are on, do this fully admitting that it's wrong and face any consequences (it doesn't sound like this was the post author's aim, esp given follow up comments).

For a "very brief period of time" doesn't cut the mustard here, just as it wouldn't with briefly stealing something from a bank or briefly kidnapping someone (both crimes where one could sometimes argue there may not be permanent damage, although even that likely isn't true in many cases).

everydaypanos over 6 years ago

1. Years ago when I was learning web development I bought a TLD and just copy-pasted Amazon’s log in page to just check how it works. Amazon somehow found out about this and Google punished that TLD after that incident and it just couldn’t go up in rankings after that.

If I remember correctly they had even put that TLD on sites that report/list “phishing” sites so if you Googled about that TLD you would also get the “they are fraud” results.

2. I think that most pro users just New-Tab everything and go from there. Seems to me that going in n out search results all in one tab is kind of slow too.

glandium over 6 years ago

One more reason to kill the referer HTTP header, I guess.

Zalastax over 6 years ago

> I had this implemented for a very brief period of time (for ethical reasons) and then moved to one of my disposable domains where it still runs after five years and ranks really well, though for completely different search terms.

Am I reading this correctly? He's been doing this since 2013 and still wants to use the white hat card?

danvoell over 6 years ago

Interesting hack. Sorry about the whole google de-indexing thing. My question would be, did you really gain any useful insights? From competitors, you can normally figure out which page is their most viewed and then figure out how they merchandise it on their homepage. Without "hacking" it.

sattoshi over 6 years ago

Just remove the push state history api. Set state is fine.

Push state is totally unnecessary since we already had a technology for this: anchors!

Instead of site.com/my/page, it's site.com/#/my/page.

What is wrong with this? It does literally everything you need and is supported by most routers out of the box!
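For readers unfamiliar with the anchor-based approach: the route lives in the URL fragment, which the browser does not send to the server, so a router only has to read whatever follows the `#`. A small sketch in Python; the function name `route_from_url` is made up for illustration, and a real client-side router would do the equivalent in JavaScript when the fragment changes.

```python
from urllib.parse import urlsplit


def route_from_url(url: str) -> str:
    """Return the client-side route stored in the URL fragment."""
    fragment = urlsplit(url).fragment
    # An empty or missing fragment is treated as the application's root route.
    return fragment if fragment.startswith("/") else "/" + fragment


print(route_from_url("https://site.com/#/my/page"))
```

With the example URL from the comment, the route that comes back is `/my/page`, the same path a `pushState`-based router would have put directly in the URL.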

tabtab over 6 years ago

If Chrome outright disables JavaScript's ability to alter the "Back" path, it may break some (poorly designed) applications. A compromise is to prompt with a warning.

wu-ikkyu over 6 years ago

Will Google be pushing out a fix for this vulnerability?

aquarin over 6 years ago

It seems my habit of opening Google links in new tabs with right click has more meaning now. I initially used this to avoid referral information.

slim over 6 years ago

Bonus: you steal some ranking clout from the competitor, since they don't get that precious click on Google search results

jhoh over 6 years ago

That huge fixed navbar on mobile is just horrible. Can't read the article because of it.

moltar over 6 years ago

That’s genius

bo1024 over 6 years ago

And people still think javascript is a good idea...

pasta over 6 years ago

Presenting yourself as someone else is called fraud in my book.

Changing the back button might be clever but all the rest is just simple. But people don't do this because I think in a lot of countries this is illegal.

There is a way you can protect your site a little from this: add canonical tags to all your pages. When an attacker updates the back button to Google they will have a hard time getting the cloned pages up in the results.
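For anyone who has not used one: a canonical tag is a `<link rel="canonical">` element in the page's `<head>` that names the preferred URL for the content. A minimal sketch of generating one in Python; `canonical_tag` is a hypothetical helper and the URL is a placeholder.

```python
from html import escape


def canonical_tag(url: str) -> str:
    """Build a <link rel="canonical"> element for the page's <head>."""
    # escape() also escapes quotes by default, so the URL is safe inside an attribute.
    return f'<link rel="canonical" href="{escape(url)}">'


print(canonical_tag("https://example.com/products?id=1&ref=home"))
```

A scraper that copies the markup verbatim carries the tag along, which hints to search engines that the original URL is the preferred one. It is only a hint, and a scraper can strip or rewrite it, which fits the "a little" in the comment above.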

shruubi over 6 years ago

Honestly, it doesn't shock me in the slightest that someone who markets themselves as an SEO expert would not only do something as unethical as this, but also brag about it, as though they think they've done something they should be proud of.
