Readership distortion and the net as a hostile environment
Submitted by Reinder on July 8, 2006 - 13:05
As Xerexes mentioned, Rogues of Clwyd-Rhan was hacked this week. It happens to all of us. As an administrator on Talk About Comics, I know that it happens to some of us a lot. This time, hackers targeted the Gallery application, which I hadn't upgraded in a long time and which had known exploits, so I had it coming. The damage was minimal: the hack tied up system processes until system administrator Xepher stepped in and killed it.
Joey Manley, who owns Talk About Comics, a regular target for hackers and script kiddies, has on occasion wondered whether someone out there was out to get him. I don't believe that - I am sure crackers don't know or care about the nature of the sites they destroy. Most of their work is automated anyway. All they're interested in is the exploit and what they can do with it. The web in 2006 is a deeply hostile environment, and crackers aren't the only problem.
If you read blogs, you may be aware of comment spam. It's a nuisance, all right. You try to follow a string of ever-shriller, ever more hilarious comments on how horrible the Bush administration is, but by the end of the thread you're drowning in phentermine! And half of the pill-pushing comments are in a markup format the blog software doesn't recognise, so what you see is this unreadable soup of UBB tags. But you've had years of experience with spam in email so you shrug, put up with it, and ignore the spam until the owner of the blog cleans it up. To the owner of the blog, though, comment spam is a lot worse than email spam: a flood of spam comments can take down a blog. It happened to me when I upgraded my blog software in late 2004 and made a mistake that prevented me from keeping my blacklisting plugin running. Tom Coates had it happen to him even though he didn't make any mistake. Ever since that upgrade, I've had comments disabled on my weblog, much to its detriment. I'm actually afraid to switch comments back on.
The content management system I use for the comic, called WillowCMS, is pretty safe against that sort of thing, if only because just two or three installations exist in the world. That doesn't mean spammers and hackers don't try. Two thirds of all comments that have ever been posted on the parts of my site that run on Willow have been spam, and it takes ever more stringent filtering, blacklisting and client-to-server handshaking just to keep that proportion stable. And there is one form of damage that, while it doesn't affect people using the website and doesn't take it down, can still trip up an ambitious website owner: the distortion of readership statistics.
A few weeks ago, I told a friend that, despite the fact that I'd been posting ancient material for almost a year, my readership numbers remained pretty stable. About 500 to 600 a day, give or take the occasional spike (the logging functionality filters out known webcrawlers, by the way). But when I looked a little closer, I realised that on any given day, I could have up to 40 "visitors" opening one page not prominently linked on the front page without passing on a referrer (sign of a script kiddie at work). About a dozen more "visitors" could be referral spammers faking a visit so that they could insert a bogus referrer into my statistics (not knowing, or caring, that my statistics are not accessible to the public and not linked to from anywhere). Another dozen could be failed attempts at posting comment spam (because of the way the blacklisting is set up, these show up as hits to a node called "commentblock"). And another, currently much smaller, number of "visitors" consists of successful comment spammers who don't read any of the site either. So spammers and hackers, in addition to the waste and extra effort they cause me and the people supporting me, also cause my stats to be off by a significant amount, making a comic that's slowly declining in popularity seem like it's chugging along with stable readership numbers. If you have any ambition at all for your webcomic, that could hurt you real bad.
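For anyone who wants to try the same check on their own logs, here is a rough Python sketch of the sorting I did by eye. It is not how WillowCMS does its logging; the Hit record, the list of front-page links and the spam referrer host are all made up for illustration, and only the "commentblock" node name comes from my own setup. The rules are crude guesses: a real reader with a bookmark and referrers switched off would be counted as a script.

```python
from collections import Counter, defaultdict
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class Hit(NamedTuple):
    visitor: str             # hypothetical visitor key, e.g. an IP address
    path: str                # requested path
    referrer: Optional[str]  # referrer sent by the client, or None


# Placeholder values for illustration only; adjust to your own site.
COMMENT_BLOCK_PATH = "/commentblock"
FRONT_PAGE_LINKED = {"/", "/comic/latest", "/archive"}
SPAM_REFERRER_HOSTS = {"cheap-pills.example"}


def classify_visitor(hits):
    """Return a rough label for one visitor's hits on a single day."""
    # Blocked comment spam shows up as hits to the commentblock node.
    if any(h.path == COMMENT_BLOCK_PATH for h in hits):
        return "blocked comment spam"
    # A referrer pointing at a known spam host marks a referral spammer.
    for h in hits:
        if h.referrer and urlparse(h.referrer).hostname in SPAM_REFERRER_HOSTS:
            return "referral spam"
    # One hit, no referrer, on a page not linked from the front page.
    if (len(hits) == 1 and hits[0].referrer is None
            and hits[0].path not in FRONT_PAGE_LINKED):
        return "probable script"
    return "probable reader"


def summarise(hits):
    """Group hits by visitor and count how many visitors get each label."""
    by_visitor = defaultdict(list)
    for h in hits:
        by_visitor[h.visitor].append(h)
    return Counter(classify_visitor(v) for v in by_visitor.values())


sample = [
    Hit("a", "/", None),
    Hit("a", "/archive", "http://rocr.example/"),
    Hit("b", "/node/1234", None),
    Hit("c", "/", "http://cheap-pills.example/buy"),
    Hit("d", "/commentblock", None),
]
print(summarise(sample))
```

Taking the figures above at face value (up to 40 scripted hits plus two lots of about a dozen), that is around 64 fake "visitors" out of 500 to 600 on a bad day, or roughly 11 to 13 percent.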
If you are the sort of creator who pays attention to readership numbers, are you sure that the numbers actually represent readers and not bots who don't care at all about your work?