
Why web log analyzers are better than JavaScript based analytics

33 points by mindaugas, almost 16 years ago

10 comments

timmaah, almost 16 years ago
Note that Data Land Software (the host of the blog) sells an "interactive web log analyzer".
eli, almost 16 years ago
Uh, the author *greatly* underestimates the headache of filtering out bot traffic. It's bad enough that some of the fancier comment spam bots load javascript, but going through the server logs would be nuts. The "Contact Us" form would show as the most popular page, since it's constantly being assaulted by automated bot-net based attacks.
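The filtering headache eli describes is easy to see in practice. Below is a minimal sketch of user-agent-based bot filtering over combined-format access log lines; the bot signature list and the sample lines are made up for illustration, and real filtering needs far more signatures plus IP and behavioral checks, which is exactly the point.

```python
import re

# A handful of common crawler signatures. Real bot filtering needs a much
# longer, constantly maintained list -- plus IP and behavioral checks --
# since many bots send browser-like user agents.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp|curl|wget", re.IGNORECASE)

def split_human_hits(log_lines):
    """Split combined-format log lines into (human, bot) lists by user agent."""
    humans, bots = [], []
    for line in log_lines:
        # The user agent is the last quoted field in combined log format.
        fields = re.findall(r'"([^"]*)"', line)
        user_agent = fields[-1] if fields else ""
        (bots if BOT_PATTERN.search(user_agent) else humans).append(line)
    return humans, bots

# Hypothetical sample lines in Apache combined log format.
sample = [
    '1.2.3.4 - - [10/Jul/2009:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT)"',
    '5.6.7.8 - - [10/Jul/2009:10:00:01 +0000] "GET /contact HTTP/1.1" 200 256 "-" "Googlebot/2.1"',
]
humans, bots = split_human_hits(sample)
```

A signature list like this only catches well-behaved bots that identify themselves; the spam bots hammering a contact form typically spoof browser user agents and slip straight through, which is why log-only analytics overcounts them.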
pie, almost 16 years ago
I don't think anyone suggests (for serious websites) that log files should be abandoned or ignored. On the contrary, script-based analytics, which do indeed offer significant value that this article ignores, should be considered supplemental or complementary to more traditional methods.
axod, almost 16 years ago
JS based:

* Can record Java version, Flash version, other plugin info
* Can record screen size, browser window size, color space
* Can detect and record ad block presence

etc.

Both have their uses.
bjplink, almost 16 years ago
6. Bots (spiders) are excluded from JavaScript based analytics

To me that is actually a benefit of JS based analytics programs. When I check Google Analytics in the morning, I don't want to see how many search engine bots and scrapers hit my site the previous day. I want to know how many actual human beings used my site.

Also, and this is probably obvious since it's been pointed out that these people have a vested interest in log parsers, this article would be better titled "10 Reasons Why Web Log Analyzers Should Be Used WITH JavaScript Based Analytics." I would argue most people serious about tracking traffic use both anyway, but those that don't should see the benefits.
vradmilovic, almost 16 years ago
I'm the author of this article - thank you all for commenting. I don't have any intention of starting a flame war - both methods have pros and cons, but in this "GA craziness" people tend to forget that log analysis even exists. Hence the article. :)

And yes, Dataland Software "sells" an interactive web log analyzer, but I can't really see how that's important?
eggnet, almost 16 years ago
One of the great things about javascript based analytics is that the cached version of your page is just as good as someone grabbing it directly. You can set long cache times on all of your pages without worrying about people viewing your site without you knowing. This more than counteracts the handful of people who have javascript turned off.

This is also particularly important for sites like Heroku that put an HTML cache in front of your site. If you serve pages that are cached, javascript logging is your only option.
jacquesm, almost 16 years ago
The reasons 1 by 1:

1) You don't need to edit HTML code to include scripts

The author asserts that you'd have to do this by hand if you had a lot of static HTML. This is incorrect (you could easily insert the code using some script), but it also doesn't make sense: most larger sites (if not all these days) are dynamically constructed, and adding a bit of .js is as easy as changing a footer.

2) Scripts take additional time to load

This is true, but it only matters if you place your little bit of javascript in the wrong place on the page (say, in the header). When positioned correctly it does not need to take more time to make the connection.

3) "If the website exists, log files exist too"

This is really not always the case. Plenty of very high volume sites rely almost entirely on 3rd party analysis simply because storing and processing the logs becomes a major operation by itself.

4) "Server log files contain hits to all files, not just pages"

That's true, but for almost every practical purpose I can think of, that is a very good reason to use a tag based analysis tool rather than to go through your logs. The embedding argument the author makes is fairly easily taken care of by some cookie magic and/or a referrer check.

5) You can investigate and control bandwidth usage

Bot detection and blocking is a reason to spool your log files to a ramdisk and analyze them in real time; doing it the next day is totally pointless. Interactive log analysis (such as the product sold by this company) can help there, but a simple 50 line script will do the same thing just as well and can run in the background instead of requiring 'interaction'.

6) See 5.

7) Log files record all traffic, even if javascript is disabled

Yes, but trust me on this one: almost everybody has javascript enabled these days because more and more of the web stops working if you don't have it. The biggest source of missing traffic is not people who have javascript turned off but bots.

8) You can find out about hacker attacks

True, but your sysadmin probably has a whole bunch of tools looking at the regular logs already to monitor this. Basically, once all the 'regular' traffic is discarded from your logs, the remainder is bots and bad guys. A real attack (such as a DDoS) is actually going to work much better if you are writing log files, because you're going to be writing all that totally useless logging information to the disk. Also, in my book a 'hacker' is going to go after other ports than port 80.

9) Log files contain error information

This is very true, and should not be taken lightly. Your server should log errors, and you should poll those error logs periodically to make sure they're blank (or nearly so) in case you've got a problem on your site.

10) By using a log file analyzer, you don't give away your business data

Well, you're not exactly giving away your business data, but the point is well taken. For most sites, however, the benefit of having fairly detailed site statistics in real time for $0 vs. 'giving away business data' is clearly in favor of giving away that data.

Google and plenty of others of course have their own agenda for what they do with 'your' data, but as long as they don't get too evil with it, it looks like the number of sites that analyze via tags is going to continue to expand.
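The "simple 50 line script" jacquesm mentions for real-time bot detection can be sketched roughly as follows: flag any IP whose request rate in a sliding window exceeds a threshold. This is an illustrative sketch, not his actual script; the window size and request limit are made-up numbers you would tune for your own traffic.

```python
from collections import defaultdict, deque

# Hypothetical thresholds -- tune for your site's real traffic patterns.
WINDOW_SECONDS = 60
MAX_REQUESTS = 100

def find_hot_ips(events, window=WINDOW_SECONDS, limit=MAX_REQUESTS):
    """events: iterable of (timestamp, ip) pairs in arrival order.

    Returns the set of IPs that at some point exceed `limit` requests
    within any `window`-second sliding window.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in events:
        q = recent[ip]
        q.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while q and ts - q[0] > window:
            q.popleft()
        if len(q) > limit:
            flagged.add(ip)
    return flagged
```

Run in the background against a tailed access log (or a log spooled to a ramdisk, as suggested above), the flagged set could feed a firewall rule; the point is that this takes a few dozen lines, not an interactive analyzer.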
jawngee, almost 16 years ago
Log analysis is a major PITA, especially if you're operating a farm of web servers like we do. We use an epic shit ton of realtime stats (Woopra, Mint, GA), so we have most needs covered and a real time view into what's going on.

We do rotate our logs up to S3, but haven't done anything with them thus far.
davidw, almost 16 years ago
IIRC, a while ago there used to be an analysis system that you'd place in the appropriate location in your network, and it would sniff packets and piece together its own log files. I don't recall what the advantages were supposed to be... perhaps that you could get some information on speed/latency.