Truth About Computer Security Hysteria
The debut of realtime virus data
Rob Rosenberger, Vmyths co-founder
Friday, 23 February 2001
KOURNIKOVA MARKS THE first media event with realtime virus data.
Media Axiom #1: all other things being equal, a guy with a number gets more airtime.
Take Kenny Liao (Trend Micro), for example. It doesn't even look like he gave the Associated Press a hard number. He merely pulled an estimate out of his butt: "if we look at the total number of users that have been reported to us and consider we are only contacting a small portion of the Australian population, the estimation would be more than 100,000 [infected Australian computer users]." The Xinhua news agency soon reported a complete meltdown of email service in Australia thanks to Liao's brown-tinged guesstimate. This leads us to
Corollary #1 to Media Axiom #1: all other things being equal, a guy with a big number gets more airtime.
The media suckled for 15 years on wild conjectures in the antivirus world and I see no real end to their guess-feeding. Reporters will forever mistake estimates for facts, but at least now we'll see some actual empirical data mixed in with all the fearmongering.
THEN AGAIN, ADDING real data to the freakshow may result in more fearmongering, not less. The more raw figures you have, the more easily you can mislead reporters. I can imagine someone making the following conjecture while talking to the press:
...so as you can see, our raw data obviously shows a virus catastrophe on the horizon. It's going to be big, too. Based on a pseudo-logarithmic increase in detections from 3:14am to 5:13pm, and based on a quasi-logarithmic increase from 5:14pm to 7:32pm, we at Fearmongers Inc. believe this horrifying virus will infect 148,932,000 PCs, with a margin of error of +/- 22,500, before antivirus vendors ultimately contain the threat. Of course, this is an estimate; your actual mileage may vary. We at Fearmongers Inc. pray our calculations prove wrong, so we beg every reporter on the planet to help us spread the word. The media will make a difference if they save even one PC user...
WHAT DOES IT truly mean? Answer: it means you can still promote virus hysteria — even if your own (broken) virus tracking maps contradict everything you say. In an earlier column, I pointed out how MessageLabs' initial comparison charts didn't support their claims about the spread of Kournikova. Spokesmodel Alex Shipp wrote to say he agreed with my view:
What we meant to convey by this was that the time between first release and achieving epidemic spread rates was half as much for Kournikova as for LoveBug. However, looking at the page again, I take your point that it is not at all obvious what we mean, and in fact we haven't even bothered to tell anyone what the coloured bars mean on the graph, so how anyone is meant to make any sense of it all beats me.
Read Shipp's quote again: "it is not at all obvious what we mean." Reporters still lavished ink on MessageLabs, simply because they provided realtime virus data. Shipp promised to update the pages "to be more useful," and indeed he did. Click on the radioactive tennis ball to see the improvements. (A radioactive tennis ball? The antivirus world spends too much time creating graphics if you ask me.) One of the best improvements came when they added this caveat: "MessageLabs [acquired] more customers since the outbreak of LoveBug [last year], so direct comparisons by number may not be too meaningful." Their updated report offers a much better comparison, shown in numbers as a ratio of malicious emails. You don't see raw data with those statistics (tsk tsk), but the web page includes enough data to deduce MessageLabs' daily email flow during ILoveYou and Kournikova. I'll leave it for you to calculate as an exercise in math.
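For anyone who would rather not do the exercise by hand, here is a minimal sketch of the arithmetic. It assumes the report gives two things: a count of malicious emails stopped in a day, and a ratio of the form "1 in N emails." The function name and the sample figures are invented for illustration; they are not MessageLabs' numbers.

```python
def estimated_daily_flow(malicious_count, one_in_n):
    """Estimate how many emails were scanned in a day.

    If 1 in every `one_in_n` emails was malicious, and `malicious_count`
    malicious emails were stopped, the total flow is roughly their product.
    """
    if malicious_count < 0 or one_in_n <= 0:
        raise ValueError("count must be >= 0 and ratio must be > 0")
    return malicious_count * one_in_n


# Hypothetical figures only, NOT taken from any vendor's report.
print(estimated_daily_flow(2_000, 100))
```

Run the same calculation for each outbreak and you can see how much of a jump in raw detections comes from a bigger customer base rather than from a nastier virus.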
I WON'T BLAME MessageLabs for misleading the media with their initial comparison charts. They really do want to please reporters & critics alike. No, the real blame must fall on the press. They collectively can't (or won't) study a chart for validity. Reporters saw some numbers on a bar graph and said "duh, it's intuitively obvious." This leads us to
Corollary #2 to Media Axiom #1: all other things being equal, a guy with a slick chart gets more airtime.
I critiqued MessageLabs pretty heavily, but I want to give them genuine kudos for supplying raw data and for using time as the key factor in a virus proliferation map. As I said, I waited a long time for this to happen. MessageLabs set a new standard by providing realtime virus data during the Kournikova coverage. Honorable mentions go to Mail.com, Brightmail, McAfee, and Trend Micro. Once you've got raw data, you can begin to create metrics. Come on, say it out loud with me: "you can't develop metrics without raw data." I'd love to see a line chart showing each day's ratio of malicious emails, for example. Forget the immediacy of a plummeting value; I want to see how the line rises or falls over the long term.
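As a rough sketch of the kind of metric I mean, the snippet below turns raw daily counts into a daily ratio of malicious emails, which is the series you would plot on that line chart. The `daily_ratios` helper and every number in `sample` are hypothetical; no vendor supplied them.

```python
def daily_ratios(daily_counts):
    """Return (day, malicious / total) pairs, skipping days with no mail."""
    ratios = []
    for day, malicious, total in daily_counts:
        if total > 0:
            ratios.append((day, malicious / total))
    return ratios


# Hypothetical raw data: (day, malicious emails, total emails scanned).
sample = [("Mon", 5, 1000), ("Tue", 50, 1250), ("Wed", 20, 2000)]

for day, ratio in daily_ratios(sample):
    print(f"{day}: {ratio:.2%}")
```

Dividing by each day's total is what keeps a growing customer base from masquerading as a growing epidemic.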
Corollary #3 to Media Axiom #1: all other things being equal, a guy with more numbers gets more airtime.
Oh, sure, the antivirus industry will stumble while they try to figure out how to present the raw data they collect. Security managers will scratch their heads while they try to analyze the numbers. CIOs will stare blankly at charts they've never seen before. But I can assure you, our world just got better. All because some firms published a bit of raw data during a virus media event.
NOT EVERYONE MAY feel as ecstatic as I do about the debut of realtime virus data. Shipp seemed just a little too eager to email me about his firm's data collection theories. I'll go out on a limb here — didn't MessageLabs find anyone else willing to listen to them? Surely Virus Bulletin would beg Shipp to write an article about the vectoring data he assembled. It makes little sense to hand me so much great material unless other antivirus media outlets turned it down first. If they did turn it down, then we need to ask why. Shipp knows more than he wants to say under direct questioning. Fine: his evasiveness gives me an opportunity to speculate. I believe the antivirus industry holds a dirty little secret. Namely, they don't have access to as much "proprietary client data" as they've led reporters to believe over the years. Indeed, this industry has thrived for its whole life on inaccuracy. I first started ranting about it, what, a dozen years ago? (Time flies when you're having fun.)
LET'S TALK NOW about some data we gather at Vmyths.com. What does it truly mean when we track the weekly 'Top 3' virus hoaxes? Take a quick glance at the data. You'll notice every 'winner' received a few dozen reports at most. We don't yet break a hundred with the weekly top three combined. Say it out loud: "your figures look boring, Rob." And rightly so. You can't deduce worldwide impacts from a few dozen emails per week. Ah! Now let's suppose Vmyths.com never divulged the raw numbers. Suppose we offered percentages instead. We might further cloud things by saying we provide the world's only open-source data on virus hoax proliferation. You'd take our weekly 'Top 3' list a lot more seriously then, wouldn't you? In fact, you might even take it as seriously as Sophos' monthly 'Top 10' virus list. If I recall correctly, Sophos used to include raw numbers in their reports. The low values didn't give the impression of a large firm, so they switched to percentages. The raw data in our 'Top 3' list ironically explains why you shouldn't take it too seriously right now. We think it will grow more precise over time as the word gets out — but it qualifies as a novelty for the moment. Treat it as such. And why shouldn't we admit it? After all, "truth" is the first word in our website's slogan.