Approximately Wrong

I’m continuing my trend of going against what other people are saying. It’s easier to do than coming up with original ideas of my own. That being said, the non-original ideas I’m looking at today include: the 1% rule, the Wisdom of Crowds, and prediction markets.

Let’s start with the wisdom of crowds. If you’re unfamiliar with the term, it basically means that the knowledge or assumptions of a large group of people will fall along a familiar bell curve. There are always a few people who are completely wrong or misinformed about a given issue, but progressively more people will have information that is progressively closer to the truth. Taken as a whole, a group’s knowledge of something will tend to center on the correct pieces of information. The “guess the jellybeans in the jar” game is a good illustration of this. Take all the guesses and average them, and what you get is something very close to the actual number. That is wisdom of the crowds.
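For the code-minded, here is a minimal sketch of the jellybean game. The jar size and every guess below are invented for illustration; none of it is real data. It only shows the averaging step: individual guesses can be badly off while their mean lands close to the actual count.

```python
from statistics import mean

# Hypothetical numbers: we assume the jar holds 850 jellybeans.
ACTUAL_COUNT = 850
# Ten made-up guesses, some wildly low, some wildly high.
guesses = [400, 620, 700, 790, 830, 860, 900, 950, 1100, 1300]


def crowd_estimate(values):
    """Collapse the individual guesses into a single crowd-level estimate."""
    return mean(values)


estimate = crowd_estimate(guesses)
crowd_error = abs(estimate - ACTUAL_COUNT)
# For comparison: how far off was the single worst guesser?
worst_individual_error = max(abs(g - ACTUAL_COUNT) for g in guesses)

print(f"crowd estimate: {estimate} (off by {crowd_error})")
print(f"worst individual guess was off by {worst_individual_error}")
```

Note that the guesses here were picked to scatter on both sides of the true count. That is the assumption the whole trick rests on.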

The first thing you’ll likely notice is that while wisdom of the crowds gets you close to a given piece of information, there’s no guarantee that it will get you exactly that piece of information. Now, I’ve argued before that one reason Wikipedia works for me is because it’s good enough. It may never be 100% accurate, but it’s good enough for your everyday double-checking of facts from the water cooler conversation.

An extension of the “wisdom of the crowds” idea is predictive markets. If you get a group of people together to bet on the odds of something happening, the same thing happens as in the guesses on the number of jellybeans. The probabilities from individuals will be all over the board, but an average will emerge which is supposed to be representative of the true chance of some given event happening.

More on these in a minute.

The 1% rule says that for any given user-content-driven site on the internet, only 1% of people will actually create content. Out of every hundred visitors, one will create content, about nine will interact with the content created, and the other ninety will just sit around and look at it. These statistics are consistent across all the user-content sites on the web.
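To put rough numbers on that, here is a small sketch of the arithmetic. It assumes the usual 1/9/90 statement of the rule, and the visitor count is a made-up round number, not a measured figure for any real site.

```python
def split_audience(visitors):
    """Split a visitor count into (creators, interactors, lurkers).

    Assumes the common 1% / 9% / 90% phrasing of the 1% rule.
    Integer arithmetic keeps the three groups summing to the total.
    """
    creators = visitors // 100          # the 1% who create content
    interactors = visitors * 9 // 100   # the ~9% who comment, edit, or rate
    lurkers = visitors - creators - interactors  # everyone else just looks
    return creators, interactors, lurkers


# Hypothetical audience size, chosen only to make the math easy to read.
HYPOTHETICAL_VISITORS = 100_000_000
creators, interactors, lurkers = split_audience(HYPOTHETICAL_VISITORS)

print(f"creators:    {creators:,}")
print(f"interactors: {interactors:,}")
print(f"lurkers:     {lurkers:,}")
```

Even a 1% slice of a big enough audience is a lot of people in absolute terms; whether it is a representative lot of people is a separate question.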

So now we put the concepts together and find—a contradiction!

The justification for Wikipedia’s integrity is based on the wisdom of the crowds. Now, the Wiki-P gets a huge number of unique visitors a year, so the number of unique visitors who interact with the site, despite being “just” 1%, is still huge. It’s probably more than a million people (although I’m entirely too lazy to go verify this). I think that qualifies as a crowd, and probably one that has a diverse enough range of knowledge to populate Wikipedia with entries about intelligent, important topics, such as Klingons. I mean Hogwarts. No! Wait, I mean real important topics that have to do with history and culture, such as Federalist Architecture.

Hm…

If you didn’t actually follow the links, a recap (all statements were accurate at the time of this posting):

  • The entry for Klingons: Nine sections
  • The entry for Hogwarts: Fifteen sections
  • The entry for Federalist Architecture: One paragraph

Now the question is, is the Federalist style a legitimate comparison to these other topics? I mean, it is a fairly specific style of architecture, no? Well, there have been books written about it. I’m fairly certain there’s more than one paragraph’s worth of information to be said here.

You get the idea. You can’t have “wisdom of the crowds” when your “crowd” is a tiny slice of an already specific slice of the population. It’s been shown that the population of US internet users differs significantly in its demographics from the US population as a whole. Now you want to take this already skewed population, chop it down to 1% (which isn’t even a random sample, mind you; it’s the people motivated enough, or with enough desire, to create the content on these wikis) and argue that this is a slice representative enough of the original population to contain whatever wisdom that population has? Are you a moron? (But I belabor the obvious; the question answers itself.)

If crowds have any sort of wisdom, it’s born of the fact that they represent the widest sample possible. The “crowds” creating junk on the internet are not this sample. This has never been measured, but I’d be willing to bet that the 1% of content creators on the internet are of a very specific demographic. How can you expect this very specific group to have the wisdom required to do anything valuable?
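Here is a toy simulation of the difference between a random 1% and a self-selected 1%. Everything in it is an assumption made up for the sketch: the population size, the idea that each person has a single numeric "view" from 0 to 100, and the idea that motivation to contribute tracks that view plus some noise. It proves nothing about real contributors; it only shows how a self-selected slice can drift away from the crowd it came from.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

POPULATION_SIZE = 10_000
SAMPLE_SIZE = POPULATION_SIZE // 100  # the "1%"

# Hypothetical population: each person holds a view scored 0..100.
views = [rng.uniform(0, 100) for _ in range(POPULATION_SIZE)]

# Assumption: motivation to contribute correlates with the view itself.
motivation = [v + rng.gauss(0, 10) for v in views]

# A genuinely random 1% of the population.
random_sample = rng.sample(views, SAMPLE_SIZE)

# A self-selected 1%: the most motivated people show up and contribute.
ranked = sorted(range(POPULATION_SIZE), key=lambda i: motivation[i], reverse=True)
self_selected = [views[i] for i in ranked[:SAMPLE_SIZE]]

population_mean = mean(views)
random_gap = abs(mean(random_sample) - population_mean)
self_selected_gap = abs(mean(self_selected) - population_mean)

print(f"population mean view:       {population_mean:.1f}")
print(f"random 1% is off by:        {random_gap:.1f}")
print(f"self-selected 1% is off by: {self_selected_gap:.1f}")
```

If motivation were unrelated to the view being measured, the two samples would look much the same. The gap comes entirely from the assumed link between who bothers to contribute and what they think.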

You can’t have it both ways, Internets. It’s clear the 1% rule is not wrong because it’s easily measured. So that means the wisdom of the crowds on the internet is, as Penn and Teller say, bullshit. Any “wisdom” showing up on the internet is from a very specific group of people, and therefore is very likely to be flawed, or at least incomplete.

But what about wisdom of the crowds in general? Still there, right? No. We don’t even need to speculate about this! We have examples already lying around; you just have to look at them for what they are. Can you think of anything at all that the “crowds” have decided, independent of the internet? Anything that everyone just “knows”?

Well, sure. They’re called stereotypes. Now obviously you’ve heard the cliche that there’s a truth to every stereotype. (Though if you’re proving the validity of stereotypes with cliches, you have bigger problems.) And everyone’s met at least one stereotypical person of some sort. But in general, most normal people don’t consider stereotypes to be all that valid, or at the very least, they’re bad insomuch as they act as a barrier to understanding.

But wait, there’s more! What about urban legends? It’s been a popular thing lately, to debunk or otherwise disprove urban legends. They’re generally set up in such a way as to seem like a personal anecdote, and for whatever reason, we’re hardwired to believe anecdotes from trustworthy sources as proof, despite the fact that they’re nothing of the sort. Over time, a series of anecdotes will add up to something like an urban legend, and it becomes common knowledge.

There’s your “wisdom of the crowds” in action. Now many people do not seriously believe in either stereotypes or urban legends, but then again, many people do. The balancing out happens between the informed and the uninformed, and the uninformed group can be pretty large. In other words, the only thing needed to create truth out of crowds is a big enough crowd. The only thing needed to create false truths out of crowds is a big enough and deluded enough crowd. Not very wise. In fact, it’s so unwise that this situation has its own fancy-schmancy Latin saying to describe it, argumentum ad populum, appeal to the people. Try to use the so-called “wisdom of the crowds” in any serious debate and I can tell you exactly how much water your argument holds. (Hint: less than one drop.)

So what’s this have to do with predictive markets? Simple. Predictive markets didn’t become a reality until the internet came along. In a sense, that’s what telephone polls were supposed to be, but they never rated event probabilities, they rated personal opinions. But now all kinds of people can go online and place bets on certain things happening and we aggregate these and come up with a prediction.

This is almost the stupidest thing I’ve ever heard.

First of all, consider the crowd again. Who’s participating in these online predictive markets? Right, the 1 percenters: the same 1 percenters with highly idiosyncratic knowledge about the world. Highly idiosyncratic knowledge does not equal reliable predictions. Furthermore, since we don’t know much about this 1% group of people who are interacting with the internets, we don’t even know which way they might skew the results.

But, let’s say for the sake of argument that it was a completely unbiased sample of users betting on world events. It still wouldn’t matter. The entire concept of a market driven by people to output probabilities is flawed. It’s flawed because any free market is subject to bubbles and “bandwagon” fluctuations which make the predictions useless. But not only are predictive markets flawed—they don’t even mean anything.

Case in point: Who can forget the last two elections? Deadlocked. In both, it came down to a matter of thousands of votes. Very small numbers. Run the conditions for the 2004 election through a predictive market and—assuming they’re accurate—you would have gotten: Bush 50% chance of winning, Kerry 50% chance of winning.

And this tells us…? Nothing! Zero. There is no data here except that the American people are divided—but we already knew that from phone polls, so it doesn’t even predict that. The only thing it gives us is likelihoods. It’s like a big people-powered version of quantum physics. So what if something has a 43% chance of happening? This means nothing because ultimately the event either happens or it doesn’t. How can you possibly evaluate this as being even close to correct? Not only do predictive markets not actually do anything useful, there’s no way to check to see if they’re even doing it right!

And not only that, but predictive markets can only give predictions on events that are fed into them. If nobody thinks to add the scenario of “aliens invading Earth”, then there won’t be any predictions on it; but if it happens, then what was the point of the prediction engine in the first place? It can’t account for unusual events even though they occasionally happen. Even if unusual events were in the system and it predicted a 0% chance of them happening, it still wouldn’t tell us that they absolutely won’t happen, only that nobody believes they will happen.

Finally there’s the issue of people not understanding probability. Is this a prerequisite? What if I say the possibility of a terrorist attack by the end of the year is 1/3? I just pulled that out of my butt. Not only is it a random number I didn’t put any thought into, what does it even mean? A 1 out of 3 chance of terrorist attacks? Like terrorists are going to congregate three times, but on two of those occasions they decide not to attack and disband, leaving only 1 attack out of 3 potentials? And what constitutes something vague like a “terrorist attack”? Timothy McVeigh was a terrorist, but he wasn’t linked to Al Qaida. Does that count even if the bettors were thinking Al Qaida when they placed their bets? What if the attack is foiled? Is that a hit because they might have gotten away with it if they waited another five minutes in the bathroom, or is it a miss because, ultimately, they didn’t carry it out? What if a terrorist attack does occur, despite it being favored against? Does that mean the system is wrong? Or right?

But more importantly, can the 1% of the people online who interact with predictive markets collectively use their wisdom to answer these questions?

Using the collective wisdom of a crowd of 1, constituting 100% of the demographic of me, I predict the answer to that last question will be 100% no.

-Ted