I launched the site http://altexplorer.net at the start of January as a Block Explorer and information hub for alternative cryptographic currencies. This morning I found a site, http://4co.in, which is ripping off my site in real time: every time a page is loaded on 4co.in, it uses PHP to load the corresponding page from http://altexplorer.net, removes analytics and ad tags, replaces the site name, and replaces the link URLs.<p>I've put a lot of effort into building this site and keeping it running, and now someone in India is stealing it in real time. Every page load on 4co.in causes an identical page load in the nginx logs of http://altexplorer.net. What can I do besides blocking the source IP address to stop this?<p>Screenshots:
Alt Explorer home page: https://d1eem2029tdth0.cloudfront.net/img/altexplorer-home.png<p>4coin home page: https://d1eem2029tdth0.cloudfront.net/img/4coin-home.png<p>Alt Explorer profitability page: https://d1eem2029tdth0.cloudfront.net/img/altexplorer-prof.png<p>4coin profitability page: https://d1eem2029tdth0.cloudfront.net/img/4coin-prof.png
Lots of good suggestions already. I am not sure whether you are interested in contacting the perpetrator directly and asking them to stop, but I did a little research for you.<p>Looking up the whois info, the registrant's email is bgrf@ymail.com.<p>When I put this email into Google, I came across another spammy site called baklinks.blogspot.com. This site asks you to swap backlinks. At the bottom of the blog post, I found the name of the person, "Naveen K R".<p>I then searched Google for "Naveen K R + bgrf" and found a site he (probably) runs called www.zokali.com.<p>After more googling combos, I finally found his LinkedIn profile and his name, "Naveen K Ramanand":<p><a href="https://www.linkedin.com/in/krnaveen" rel="nofollow">https://www.linkedin.com/in/krnaveen</a>.<p>Maybe you can contact this guy directly. It seems he is the one doing this, or at least he knows who is.
If you end up trying to block his IP, don't just DROP or REJECT his packets. TARPIT [1] them! This way not only would you be denying him access, but you would also be draining his resources.<p>Another thing to try is to see just how much data his server will take. See if you can send him a GB-sized response.<p>[1] <a href="http://www.netfilter.org/projects/patch-o-matic/pom-external.html#pom-external-TARPIT" rel="nofollow">http://www.netfilter.org/projects/patch-o-matic/pom-external...</a>
The javascript solution has already been suggested, but take a step back and think about it: the same way the leech worked out your links, domain name, logo and all the stuff that brands your website, he can easily figure out the simple JS code suggested here.<p><i><img src="x" onerror="if(document.location.href==='http://4co.in')document.location='//xxxxxx.xxxx';"></i><p>So I say, go a step further:<p>- do not send his users to a black hole, instead show a banner warning them about the leech and then after a few seconds redirect the user to your website.<p>- The JS code for the above should go in the same JS file that provides core functionality to your website.
After doing that, run your JS through <a href="http://closure-compiler.appspot.com/home" rel="nofollow">http://closure-compiler.appspot.com/home</a> or, better still, install the yuicompressor CLI (<a href="http://yui.github.io/yuicompressor/" rel="nofollow">http://yui.github.io/yuicompressor/</a>) on your machine.
The resulting code will be minified/compressed and seriously obfuscated, so trying to defeat it will take the leech hours, if not days, depending on his experience.<p>- encode/obfuscate the warning string (first point) to make it harder to find within the JS code.<p>- and finally, do a daily spot check on the website, following @jarrett's comment below
You found the right first step yourself: block the source IP address. Sure, it will turn into a game of whack-a-mole with them changing their IP, but eventually their customers will get fed up with the downtime.<p>Second idea: use JavaScript to redirect all of your pages to your own subdomain. Again, it's just a step in an arms race, but this would be a little too hard/expensive to take to court. You can win an arms race if you try.
Don't punish users. The goal here shouldn't be to silently redirect or deceive them with fake data or throw up goatse.<p>Instead, make it annoyingly clear to anyone that visits 4co.in that the content is stolen. 4co.in users aren't visiting 4co.in to spite you. They just don't know and will gladly use your website instead.<p>The game of whack-a-mole is strongly in your favor because you're on the right side of a trapdoor.
Look for either the PHP user agent and/or the source IP. Why not use mod_rewrite or something similar and redirect him to some bizarre internet meme site? I would suggest tub girl or goatse. It will get the point across loud and clear. Or just serve him a different copy of your site that makes it loud and clear that what he is doing is not OK. Either way, you can use mod_rewrite to cause this guy agony and stop him from doing this.
Could we make him pay a few bucks?<p>Specifically, can we multiply his traffic? I wonder what exactly he is doing with request headers... maybe this could work:<p>1) set up a page /fluffy with highly compressible contents, say 50MB of $£€$£€$£€$£€$£€.., and always force gzip encoding
2) set up a few bots (Amazon?) to download that page from his site, but do not accept any compression<p>Start the attack at a time when the guy is probably sleeping; it might go on for a few hours before he notices, costing him a couple of hundred bucks in bandwidth.<p>Or maybe just waste some CPU in the same vein: the guy has to unpack the gzip before forwarding to do the string replace and re-zip it afterwards, so you can make sure that the content REALLY balloons..
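To get a feel for how lopsided the numbers in the /fluffy idea are, here is a rough sketch using Python's standard `gzip` module. The 5 MB size and the `build_payload` helper are assumptions for illustration only (the comment above suggests about 50MB), and the exact ratio will depend on the compressor settings.

```python
import gzip


def build_payload(size_bytes: int, pattern: str = "$£€") -> bytes:
    """Repeat a short pattern and truncate the UTF-8 bytes to exactly size_bytes."""
    unit = pattern.encode("utf-8")
    repeats = size_bytes // len(unit) + 1
    return (unit * repeats)[:size_bytes]


# Hypothetical size: 5 MB keeps the demo quick; scale up as needed.
payload = build_payload(5 * 1024 * 1024)
compressed = gzip.compress(payload)
ratio = len(payload) / len(compressed)

print(f"raw: {len(payload)} bytes, gzipped: {len(compressed)} bytes, ratio: {ratio:.0f}x")
```

The origin only sends the small gzipped body, while a proxy that decompresses it to do string replacement and then serves clients that refuse compression has to push the full raw size out for every request.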
You can use javascript frame busting techniques to redirect back to the main page. You can also use mod_rewrite or some proxy setups to make it so a completely different set of pages shows up for people coming from that site. This is better than just blocking it because it's a bit more subtle and lets you tell that site's users what's happening.
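As a minimal sketch of the "different set of pages" idea, here is the same logic written as Python WSGI middleware instead of mod_rewrite. Everything named here is an assumption: `BLOCKED_IPS`, the notice text, and the `real_site` stand-in app are hypothetical, and behind a reverse proxy such as nginx the client address may arrive in a forwarded header rather than `REMOTE_ADDR`.

```python
# Hypothetical address of the scraper, taken from your own access logs.
BLOCKED_IPS = {"203.0.113.7"}

NOTICE = (
    b"<html><body><h1>This page is copied from altexplorer.net</h1>"
    b"<p>Please visit <a href='http://altexplorer.net'>the original site</a>.</p>"
    b"</body></html>"
)


def notice_middleware(app, blocked_ips=BLOCKED_IPS):
    """Wrap a WSGI app so blocked client IPs receive the notice page instead."""
    def wrapped(environ, start_response):
        if environ.get("REMOTE_ADDR") in blocked_ips:
            start_response("200 OK", [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(NOTICE))),
            ])
            return [NOTICE]
        # Everyone else gets the normal site.
        return app(environ, start_response)
    return wrapped


def real_site(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"real content"]


app = notice_middleware(real_site)
```

Returning a normal 200 with a notice, rather than an error, means the mirror's visitors actually see the message.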
This exact same thing happened to me a couple years ago.<p>This is how I got it resolved within a day:<p><a href="http://pzxc.com/internet-is-still-wild-west" rel="nofollow">http://pzxc.com/internet-is-still-wild-west</a>
I dealt with a somewhat similar situation a while back: <a href="https://news.ycombinator.com/item?id=4291454" rel="nofollow">https://news.ycombinator.com/item?id=4291454</a><p>I issued a DMCA takedown notice to their host and it was taken care of in a couple of days. I suggest doing the same.
If you have time, go to war.<p>Have each page spit out the IP/hostname of whoever requested it in a hidden section. Using that you can identify his IPs/hostnames, so if he changes them, you can always detect it.<p>Now that you can detect him, when he crawls your site, feed him garbage info for every single page, then constantly check his page for the hidden IP/hash in case he changes his IP/host. Hide that in minified JS. You can also feed his page bogus links that violate Google's SEO guidelines so he gets blacklisted.
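The hidden IP/hash trick could look roughly like the sketch below. It is only one way to do it, under assumptions: `SECRET_KEY`, the `data-v` attribute, and the helper names are all hypothetical, and the marker is a hidden `<span>` rather than something buried in minified JS, purely to keep the example short.

```python
import hashlib
import hmac
import re

SECRET_KEY = b"change-me"  # hypothetical secret, known only to your server


def ip_token(ip: str) -> str:
    """Keyed hash of the client IP, so the marker does not reveal the IP itself."""
    digest = hmac.new(SECRET_KEY, ip.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:16]


def tag_page(html: str, client_ip: str) -> str:
    """Insert a hidden marker for the requesting IP just before </body>."""
    marker = f'<span hidden data-v="{ip_token(client_ip)}"></span>'
    return html.replace("</body>", marker + "</body>", 1)


def find_source_ip(mirrored_html: str, candidate_ips):
    """Match the marker found on the mirror against IPs seen in your own logs."""
    match = re.search(r'data-v="([0-9a-f]{16})"', mirrored_html)
    if match is None:
        return None
    for ip in candidate_ips:
        if hmac.compare_digest(ip_token(ip), match.group(1)):
            return ip
    return None
```

You would call `tag_page` when rendering each response, then periodically fetch the mirror and pass its HTML to `find_source_ip` along with recent client IPs from your access log.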
First post here at HN... but I would try a shame tactic (per codegeek's helpful name research). In a nice bright box just above your normal content, send the following text back to his IP address ...<p>"Hello, my name is <insert his name here once you are certain> and I've stolen the content that you are viewing right now -- someone's hard work. I stole it in a very intentional and fairly disrespectful way. Sometimes we get life lessons and this may well be one of mine. Instead of using my skills to do good with the precious time that I have in this beautiful world, I've chosen to write a fairly nefarious script to copy every single page of someone else's website and suck it back into my website, so that I can profit from someone else's work. The message you are reading right now may go away for a day or two, if I change my IP address. But rest assured, it will be back once my IP address is rediscovered. This event will also follow me forever on search engines when people search my name -- future employers, friends, family. I have been doing this for <x> days and have been asked to stop. I haven't yet, but time will tell.... (<insert-pretty-date-here>)<p>In the meantime, if you would like to visit the real website go <here>..."
The JavaScript frame busting methods are not the right approach: you have no control over what his users see. There is no reason he can't filter out any JavaScript or other HTML. In fact he might not even display your live HTML. He might have copied it to make his page templates and be scraping just the data from your site; you just don't know. If he isn't doing this now, he will if he gets in an arms race with you.<p>You need to return bad data to his site by IP address and possibly user-agent. Don't make the data bad to mess with the users, just make it return unusable data, for example all numbers are zeros. Then what you do is make a scheduled task that scrapes <i>his</i> website (using his domain name). If you start getting HTTP requests in your logs that correspond to the scheduled job you created, then you add the new requesting IP to the blacklist of funny data, then make a second request to his website to validate the IP you blacklisted. You could set up your scraping tool to use random Tor exit nodes and cycle the user-agent info.<p>He could do the same (random IPs) but might not... Really you need some type of accountability, which you can never have on a public website, but requiring registration/authentication would help some if it becomes that important to you.
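The "scheduled probe, then look in your logs" step can be sketched as a simple time-window correlation. This is a simplified illustration under assumptions: the function names, the 5-second window, and the 80% threshold are all made up for the example, and the log entries are assumed to be already parsed into `(timestamp, ip)` pairs.

```python
from collections import Counter
from datetime import datetime, timedelta


def suspect_ips(probe_times, log_entries, window=timedelta(seconds=5)):
    """For each IP, count how many probes were followed by a request from it within `window`."""
    hits = Counter()
    for probe in probe_times:
        seen = {ip for ts, ip in log_entries if probe <= ts <= probe + window}
        hits.update(seen)
    return hits


def confirmed(hits, total_probes, min_fraction=0.8):
    """Keep only IPs that showed up after most of the probes."""
    return sorted(ip for ip, n in hits.items() if n / total_probes >= min_fraction)


# Hypothetical data: five probes of the mirror, ten minutes apart.
base = datetime(2014, 1, 1, 12, 0, 0)
probes = [base + timedelta(minutes=10 * i) for i in range(5)]
log = [(p + timedelta(seconds=1), "203.0.113.7") for p in probes]
log.append((base + timedelta(seconds=2), "198.51.100.9"))  # unrelated visitor

print(confirmed(suspect_ips(probes, log), len(probes)))
```

Requiring a match across most probes keeps ordinary visitors who happened to load a page at the same moment off the blacklist. Probing a rarely visited URL would cut the noise further.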
Use imagemagick to watermark all image requests on the fly so you can keep changing the position of a url watermark on all images.<p>edit - actually, don't do this as it is trivially easy to get around by doing 2 or 3 requests and keeping anything that hasn't changed.<p>Or if you do do this, add a low level noise filter on top so that the attacker can't just directly equate pixel values.
Currently 4co.in is showing this:<p>---
Site is down!<p>Sorry everyone! i really apologize for what happend!!<p>It all happend because of my silly mistake and misconfiguration and it was affected for at max 10hrs.<p>Instead of making a scene somebody would have contacted me!<p>Now i understand the risks of live development. It was not my intension to steal anything.
---
Before you react, try to estimate how much money this is costing you, then determine how much money you're willing to spend to combat the problem. Try to keep the costs of your response in line with the damage inflicted.
I think mentioning the short URL provider auto-killed fragmede's comment, and my copy&paste of it. Here goes again:<p>fragmede's comment below is [dead], but has very good advice.<p>---<p>Nice bit of news you added to the top, which 4co.in is putting on their own site.<p>One piece of advice though: Drop the short link and link directly to altexplorer.net, otherwise it looks like 4co.in was 'hacked' and the short link is a phishing/some other sort of scam and not legit.<p>You should be able to pick up the 4co.in domain as the referrer if you want metrics for how many people were using 4co.in.<p>---
Report them to their web host and the ad networks they use. Don't troll them with different content; just go for the kill. Some accounts, like AdSense, carry lifetime bans.
I would contact the other site first and find out WTF. It is unlikely, but they might have a good reason for it. If they are just trying to rip you off, the solution might be as simple as asking them to stop.
You should be able to apply behavioural detection here even if the IP address changes - they'd have to be polling your site regularly. Is there a discernible pattern in the logs?
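One hedged way to look for such a pattern, assuming the scraper really does poll on a schedule, is to flag IPs whose gaps between requests are nearly constant. The function name and both thresholds below are assumptions for illustration, and a proxy that fetches only when its own visitors arrive would not show this signature.

```python
from collections import defaultdict
from statistics import mean, pstdev


def regular_pollers(requests, min_requests=5, max_cv=0.1):
    """requests: iterable of (unix_timestamp, ip) pairs.

    Returns IPs whose inter-request gaps have a low coefficient of variation,
    i.e. the requests arrive at an almost fixed interval.
    """
    by_ip = defaultdict(list)
    for ts, ip in requests:
        by_ip[ip].append(ts)

    flagged = []
    for ip, stamps in by_ip.items():
        if len(stamps) < min_requests:
            continue  # too few samples to judge
        stamps.sort()
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        if avg > 0 and pstdev(gaps) / avg <= max_cv:
            flagged.append(ip)
    return sorted(flagged)
```

Feeding it timestamps parsed from the nginx access log gives a shortlist to inspect by hand. It is a heuristic, not proof.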
In addition to the other ideas here, I'd also recommend feeding a completely fake site to the source IP of the thief. Possibly including some political ideas that could get him in trouble in his host country (up to you depending on how mean you wish to be).
There are companies offering services to deal with this; it just depends on how much your time is worth. Here is one option: www.distilnetworks.com ... in case you tire of whack-a-mole
Ok, time for a reality check<p>If you can't imagine what to do in this situation you shouldn't be running a website of this nature<p>This type of thing can (and does) happen and it's up to you to know how to defend yourself.<p>The others have given plenty of ideas, but I guess there are more specific things that can be done depending on their page structure/ads etc