
Readership distortion and the net as a hostile environment

Like Xerexes mentioned, Rogues of Clwyd-Rhan was hacked this week. It happens to all of us. As an administrator on Talk About Comics, I know that it happens to some of us a lot. This time, hackers targeted the Gallery application, which I hadn't upgraded in a long time and which had known exploits, so I had it coming. The damage was minimal: the hack tied up system processes until system administrator Xepher stepped in and killed it.

Joey Manley, who owns Talk About Comics, a regular target for hackers and script kiddies, has on occasion wondered whether someone out there was out to get him. I don't believe that - I am sure crackers don't know or care about the nature of the sites they destroy. Most of their work is automated anyway. All they're interested in is the exploit and what they can do with it. The web in 2006 is a deeply hostile environment, and crackers aren't the only problem.

If you read blogs, you may be aware of comment spam. It's a nuisance, all right. You try to follow a string of ever-shriller, ever more hilarious comments on how horrible the Bush administration is, but by the end of the thread you're drowning in phentermine! And half of the pill-pushing comments are in a markup format the blog software doesn't recognise, so what you see is this unreadable soup of UBB tags. But you've had years of experience with spam in email so you shrug, put up with it, and ignore the spam until the owner of the blog cleans it up. To the owner of the blog, though, comment spam is a lot worse than email spam: a flood of spam comments can take down a blog. It happened to me when I upgraded my blog software in late 2004 and made a mistake that prevented me from keeping my blacklisting plugin running. Tom Coates had it happen to him even though he didn't make any mistake. Ever since that upgrade, I've had comments disabled on my weblog, much to its detriment. I'm actually afraid to switch comments back on.

The content management system I use for the comic, called WillowCMS, is pretty safe against that sort of thing, if only because only two or three installations exist in the world. That doesn't mean spammers and hackers don't try. Two thirds of all comments that have ever been posted on the parts of my site that run on Willow have been spam, and it takes ever more stringent filtering, blacklisting and client-to-server handshaking just to keep that proportion stable. And there is one form of damage that, while it doesn't affect people using the website and doesn't take it down, can still trip up an ambitious website owner: the distortion of readership statistics.

A few weeks ago, I told a friend that, despite the fact that I'd been posting ancient material for almost a year, my readership numbers remained pretty stable: about 500 to 600 a day, give or take the occasional spike (the logging functionality filters out known webcrawlers, by the way). But when I looked a little closer, I realised that on any given day, up to 40 "visitors" could be opening a single page not prominently linked on the front page without passing on a referrer (sign of a script kiddie at work). About a dozen more could be referral spammers faking a visit so that they could insert a bogus referrer into my statistics (not knowing, or caring, that my statistics are not accessible to the public and not linked to from anywhere). Another dozen could be failed attempts at posting comment spam (because of the way the blacklisting is set up, these show up as hits to a node called "commentblock"). And another, currently much smaller, number of "visitors" consists of successful comment spammers who don't read any of the site either. So spammers and hackers, in addition to the waste and extra effort they cause me and the people supporting me, also cause my stats to be off by a significant amount, making a comic that's slowly declining in popularity seem like it's chugging along with stable readership numbers. If you have any ambition at all for your webcomic, that could hurt you real bad.
If you are the sort of creator who pays attention to readership numbers, are you sure that the numbers actually represent readers and not bots who don't care at all about your work?
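
For anyone who wants to run the same check on their own logs, here is a minimal Python sketch of the sorting described above. It is not how WillowCMS does its logging. The `Hit` record, the front-page paths and the referrer blacklist entry are all hypothetical stand-ins, and only the "commentblock" node name comes from the post. The no-referrer rule is a rough heuristic: bookmarks and feed readers also arrive without a referrer, so treat the result as an estimate.

```python
from collections import Counter
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical values: adjust these to your own site and logs.
FRONT_PAGE_PATHS = {"/", "/index.html"}        # pages people normally reach without a referrer
SPAM_REFERRER_DOMAINS = {"pills.example.com"}  # made-up blacklist entry
BLOCKED_COMMENT_NODE = "commentblock"          # node name mentioned in the post


@dataclass
class Hit:
    path: str      # requested path, e.g. "/comic/123"
    referrer: str  # value of the Referer header, "" if none was sent


def classify(hit: Hit) -> str:
    """Put one logged hit into one of the rough categories from the post."""
    if BLOCKED_COMMENT_NODE in hit.path:
        return "blocked_comment_spam"
    host = urlparse(hit.referrer).hostname or ""
    if host in SPAM_REFERRER_DOMAINS:
        return "referral_spam"
    if not hit.referrer and hit.path not in FRONT_PAGE_PATHS:
        # Deep page opened with no referrer: possibly a script probing for exploits.
        return "suspected_probe"
    return "probable_reader"


def summarize(hits):
    """Return per-category counts and the fraction of hits that are probably not readers."""
    counts = Counter(classify(h) for h in hits)
    total = sum(counts.values())
    inflation = (total - counts["probable_reader"]) / total if total else 0.0
    return counts, inflation
```

Calling `summarize(hits)` on a day's worth of hits gives the counts per category and the share of "visitors" that probably aren't readers. With the upper-end figures from the post (40 + 12 + 12 = 64 suspect hits out of 500 to 600 a day), that share would be roughly 11 to 13 percent.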

Maybe I'm just too

scarfman's picture

Maybe I'm just too low-profile, or maybe it's because I only see the top ten referrals, but I don't see anything like that in my referral logs. Maybe it's because I handcode with straight unadorned HTML? My interest in WordPress sparked by moovok's post last week has just waned. And my message board is one of the free ones TalkAboutComics.com offers (thanks Joey!), but I don't see much in the way of spam postings there, probably because you must be registered.

Paul Gadzikowski, paul@arthurkingoftimeandspace.com

Arthur, King of Time and Space New cartoons daily

I use AWSTATS which filters

Greg Carter's picture

I use AWSTATS which filters for bots, crawlers, etc. And I'm always checking the logs and adding to the list. I watch the stats pretty close and see the same things you see, Reinder.

I've been lucky except for spammers in the forum. My site doesn't get a ton of traffic, so that's a plus. It means I can analyze the stats closely. No service or piece of software will ever be as good as keeping familiar with what's going on so you can notice changes.

Also I filter the site in several ways - I run a total site set, a forum only set, and a gallery only set so it's easier to eyeball the stats and see what's going on. I don't have to spend as much time that way.

What's the saying? Vigilance is the price of democracy? It's also the price of running a secure website.


Greg Carter Abandon UpDown Studio

Greg Carter - Abandon: First Vampire - Online Graphic Novel

Actually, I think it wasn't

djcoffman's picture

Actually, I think it wasn't so much Google Analytics as the hosting provider they use. I found that if your host sucks, it doesn't like loading a lot of scripts or plugins or "calls" to other sites, and pages really drag.

If you have a tight host, you should be ok. Google Analytics is pretty darn thorough as far as actual statistics. Same goes for a barebones thing called Analog (http://www.analog.cx/), which comes built in with all Dreamhost plans.

Google Analytics

Iain Hamp's picture

I know it's still in Beta, so I am forgiving, but the gripe I have about Google Analytics (and I seriously only have one true gripe so far) is that there are at times several hours of delay before the stats are displayed. I keep the free version of statcounter.com's system on my site along with Analytics for now, because there are times I just want to be unhealthily obsessive about the traffic on my websites.

Hmmm... I'm skeptical about

Hmmm... I'm skeptical about Google Analytics. The first I heard of it was when one comic I used to read regularly suddenly stalled on something whenever I tried to load it, and trying to track down the problem I found that Google Analytics was causing the stalling. I don't want that on my website unless I'm damned sure that that sort of problem has been ironed out.

As for selling stuff, you're making me cry now.

I would suggest using Google

djcoffman's picture

I would suggest using Google Analytics for stat tracking. They filter out all the bots and spam hosts.

Another way you can possibly know you have an audience is if you're actually selling things you have for sale. Spam bots don't buy books and swag.


DJ Coffman yirmumah.net