I would like to gain some high-level insight into the traffic accessing my website. For example:<p><pre><code> - Unique visitor counts
- Most viewed pages
- Referring sites
- Activity per time of day/week/month
</code></pre>
I do not want to be able to track individual users - I want to keep this strictly to statistics rather than intrusive tracking. That throws out pretty much anything that involves JavaScript or stuff done on the client-side.<p>I've been trying to put together a solution using the AWStats log analyser, however this requires me to collect IP addresses. If I remove or obfuscate IP addresses, then the 'Unique Visitors' count doesn't work. Unfortunately it seems that AWStats uses IPs as the primary method for identifying unique visitors.<p>What other solutions are out there? My site is PHP so doing something myself would also be acceptable.
I have built a platform that does exactly that. It does require JavaScript, for a couple of reasons:<p>1. It allows single-page apps to be analyzed.<p>2. Page caching has no effect on whether the JS is executed. Most back-end tracking doesn't know that a page was visited when it is served from a cache.<p>So I would recommend using JavaScript if the above reasons apply to you. As far as I know, you can't really obfuscate an IP address in a way that prevents tracking a visitor. That's why I decided to drop IP addresses from our logs and not use them at all.<p>Regarding your last point: unique visitors are hard to measure if you don't use an IP address or a cookie. A cookie is tracking, but I'm not sure how intrusive you consider that to be. It could be a cookie with just a value like 'visited=1', so you know it's a non-unique visitor when the cookie is present. That way, I think, you aren't tracking individuals.<p>You can see demo stats of my platform here <a href="https://simpleanalytics.io/simpleanalytics.io" rel="nofollow">https://simpleanalytics.io/simpleanalytics.io</a>
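To make the 'visited=1' idea concrete, here is a minimal sketch in Python (the same logic ports to PHP easily). The function name `count_visit`, the counter keys, and the 30-day lifetime are my own assumptions for illustration, not how any particular product does it. The cookie carries no identifier, so it only answers "has this browser been here recently?".

```python
from http.cookies import SimpleCookie
from typing import Optional

COOKIE_LIFETIME = 30 * 24 * 3600  # assumed window: 30 days, in seconds


def count_visit(cookie_header: str, counters: dict) -> Optional[str]:
    """Update counters for one request.

    Returns the value for a Set-Cookie header when the visitor is new,
    or None when the 'visited' cookie was already present.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")

    counters["page_views"] = counters.get("page_views", 0) + 1
    if "visited" in cookies:
        return None  # returning browser: not a new unique visitor

    counters["unique_visitors"] = counters.get("unique_visitors", 0) + 1

    # The cookie holds a constant, not an ID, so it cannot single out a user.
    out = SimpleCookie()
    out["visited"] = "1"
    out["visited"]["max-age"] = COOKIE_LIFETIME
    out["visited"]["path"] = "/"
    return out["visited"].OutputString()


counters = {}
header = count_visit("", counters)    # first request, no cookie yet
count_visit("visited=1", counters)    # later request from the same browser
print(counters)
```

One caveat: a browser that refuses cookies is counted as new on every request, so this tends to overcount rather than undercount.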
You could hash the IPs before storing them, but since there are only about 4.3 billion IPv4 addresses, it would be trivial to reverse the hashes by brute force.<p>However, if you use Bloom filters to calculate "distinct counts", then I think you cannot reliably reconstruct visitors' IPs. You have to do some planning in advance on implementation details so that you can extract the stats you are looking for.
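A rough sketch of the Bloom filter approach, in Python, assuming one filter per reporting period (say, per day). The class names, the filter size, and the number of hash positions are illustrative choices of mine. One addition that goes beyond the comment above: a bare filter can still be queried with a candidate IP, so this sketch mixes in a random salt that lives only in memory and is discarded with the filter.

```python
import hashlib
import secrets


class BloomFilter:
    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7):
        self.m = num_bits          # power of two keeps the positions distinct
        self.k = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)
        # Assumption: per-period salt, never written to disk.
        self.salt = secrets.token_bytes(16)

    def _positions(self, item: str):
        digest = hashlib.sha256(self.salt + item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step size
        # Double hashing: derive k positions from two base hashes.
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> bool:
        """Insert item; return True if it was definitely not seen before."""
        is_new = False
        for pos in self._positions(item):
            byte, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte] & mask:
                self.bits[byte] |= mask
                is_new = True
        return is_new


class UniqueVisitorCounter:
    """Counts distinct IPs for one period without storing any of them."""

    def __init__(self):
        self.filter = BloomFilter()
        self.count = 0

    def record(self, ip: str) -> None:
        if self.filter.add(ip):
            self.count += 1


today = UniqueVisitorCounter()
for ip in ["203.0.113.5", "198.51.100.7", "203.0.113.5"]:
    today.record(ip)
print(today.count)
```

False positives mean the count can only err on the low side, and the error grows as the filter fills up, so size the filter for the traffic you expect. At the end of the period, keep the number and throw away the filter and its salt.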