Face to Face: Big data is the biggest loser in 2016 election (also prediction markets, pundits, etc.)

December 10, 2015

The massive hype over "big data" during the past several election cycles, not to mention in the world at large, is finally being revealed as nonsense. Nassim Taleb has been the only major voice calling the whole approach bullshit, although he hasn't focused so much on the Trump phenomenon.

This election shows the fatal flaw of the big data program -- like all statistical learning programs, it has absolutely no clue what answer to give when it encounters an entirely unfamiliar environment. Maybe it'll give the right answer, and maybe it'll give a wrong answer -- whatever it says, our only rational response is to ignore it and look elsewhere, if anywhere, for answers.

Take an example: the 538 blog of poser quants tells us that, historically, the eventual Presidential nominee for a party had already done very well in opinion polls with the electorate, had amassed huge amounts of funds from donors, and/or had racked up scores of endorsements from politicians.

With Trump dominating the polls -- and media coverage -- while raising very little money from donors and receiving no endorsements from major politicians, science says he can't win. Or at least, his chances are way below those of Fiorina, whom they were "bullish" on after the second GOP debate, compared to their "bearish" stance on the master.

What the spergs can't see is that Trump is unlike anything in the data-set that they've honed their intuitions on. We haven't seen something like him since Teddy Roosevelt, but nerds generally don't appreciate history, and cannot force themselves to think back further than WWII, and typically 1980 in politics. Sure enough, 538's graphs on the "history" of endorsements for candidates only go back to 1980.

Simply put, if there's no similar event to the Trump phenomenon in their history, why consult the history at all? It's like asking someone who's been trained on conjugating Spanish verbs to weigh in on how some verb is conjugated in Chinese. [1]

This whole situation brings up one of the central topics of statistical inference -- making a prediction based on interpolation vs. extrapolation.

With interpolation, you're making a guess about an item that lies within the range of what you've already seen, even if you haven't seen that exact item before. Nobody has any major objections to making these kinds of predictions, if you've got a dense enough data-set that will reveal how things behave within that range. You're mapping out a tiny square-inch within a territory that has been extensively surveyed for a mile around it.

With extrapolation, you're making a prediction about an item that lies well outside the range that your data-set covers. Honest folks view extrapolation as bogus -- not that the prediction is bound to be wrong, but that there's no reason to pay any heed to a guess that has no basis or grounding in the data-set. You are now sailing into uncharted waters, and assuming that the patterns of a territory you explored earlier will continue to apply in this unexplored territory. What could go wrong with assuming that the same pattern holds true everywhere?

For example, let's say there are two variables X and Y -- I promise, even innumerate people can get this -- like you remember from graphing equations in algebra class. Suppose you have a huge data-set -- thousands of points on the graph, revealing the fine-grained shape of the relationship between the two. Sample points -- (1,2), (2,4), (3,6), (1.1, 2.2), (2.1, 4.2), (3.1, 6.2), etc., all clearly suggesting that the Y value is 2 times the X value.

But what if the points in your data-set only had positive X values? Well, it might not present an obstacle if you're asked to predict what Y value will go with an X value of 2.5 -- supposing you hadn't already been given that point, you'd guess pretty safely that it would be 5, fitting with the rest of the multitude of points around it, and that Y would be 2 times X here as well.

However, if you were thrown a curveball, like X being a negative number, say -10, you wouldn't really know what to predict for the Y value anymore. Points with a negative value for X are outside of the data-set that you're drawing an association from, so you'd have no basis for a good guess. Maybe it'll continue the pattern from the points with positive X values, and Y will be -20. Then again maybe there's an absolute value function at work, making the magnitude the same but always giving a positive Y value, in this case X = -10 and Y = 20. Or any other of an infinite number of imaginable behaviors in this environment that you have no previous information about.

In such an unfamiliar territory, your guess is as good as any. Maybe it'll turn out right, maybe wrong, but you'll have no basis on those thousands of points of "big data" for your guess. If you do guess correctly, it will only be pure dumb luck, and nobody should pay any heed to your guess in the meantime.
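
To make the X and Y example concrete, here is a minimal Python sketch. The data points and the two candidate rules (`linear` and `absolute`) are made up for illustration; nothing here is a real forecasting model. Both rules fit every positive-X point perfectly, agree on the interpolated point X = 2.5, and flatly contradict each other at X = -10.

```python
# Toy data-set: positive X values only, with Y = 2 * X (made-up points).
xs = [0.1 * i for i in range(1, 41)]
ys = [2 * x for x in xs]

# Two hypothetical rules that the positive-X data cannot tell apart.
def linear(x):
    return 2 * x

def absolute(x):
    return 2 * abs(x)

def max_error(model):
    """Largest miss of a rule on the data we actually observed."""
    return max(abs(model(x) - y) for x, y in zip(xs, ys))

print(max_error(linear), max_error(absolute))  # 0.0 0.0 (both fit perfectly)
print(linear(2.5), absolute(2.5))              # 5.0 5.0 (interpolation: they agree)
print(linear(-10), absolute(-10))              # -20 20  (extrapolation: they disagree)
```

The thousands of points do nothing to pick between the two rules, because the only place they differ is where there is no data.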

How does extrapolation confuse people in an election like this one, with a never-before-seen candidate like Trump?

Lazy people have likened Trump to Perot, particularly if he decides to run as a third-party candidate. They try to analogize from the Perot phenomenon and conclude that Trump has little chance of winning the GOP nomination, and would crash and burn as a third-party candidate.

But Perot had zippo in the polls; he certainly wasn't dominating the GOP polls by double digits for more or less the entire time, and increasing more or less steadily all the while. He wasn't given wall-to-wall media coverage, and did not consistently draw crowds in the thousands and even tens of thousands. And he was a complete unknown before the election, while Trump has instant brand recognition. Not to mention their policy differences, with Trump being a broad populist and Perot focusing narrowly on NAFTA and trade agreements.

Since Trump's situation is radically different from Perot's, the earlier example of Perot predicts nothing about Trump today.

Slightly less lazy comparisons to George Wallace also don't hold up. When he sought the Democratic nomination in 1964, his appeal was largely regional (the Deep South), whereas Trump draws huge enthusiastic crowds in the Midwest, Plains, Deep South, Appalachia, New England, the Southwest -- everywhere, really. And Wallace did not consistently dominate opinion polls. In 1968, he ran third party, but did not do so after dominating polls and coverage and crowds while earlier running within one of the two main parties. In 1972, he was nearly assassinated and his campaign ground to a halt. So far (knock on wood), no analogy can be drawn from Wallace's several campaigns to Trump's.

There has quite simply never been a candidate who was so dominating of the polls of a major party, media coverage, and crowd attendance, all throughout the second half of the year leading up to the primaries -- yet who was so loathed by the party's leadership, its elected officials, and his fellow candidates, let alone the other major party, with whom they launched an all-out mission to take him out.

Therefore, we have no idea whatsoever how the whole thing will unfold. Will the leadership bite the bullet and let him win, or try to sabotage him with attack ads? If that doesn't succeed, will they rig the primaries? If not, will they rig or buy off those at the Convention in the summer? Will they team up with Hillary to keep Trump from re-directing the Republican party? Or help to rig the general election? Or try to assassinate him?

We have no "big data" to draw on that would illuminate our current state of uncertainty. There just hasn't been anything like this before -- certainly, not an earlier example that also has tons of data to learn from. Our hunches may turn out to be right or wrong, but they will not be so on account of "what the data tell us". In an entirely unfamiliar setting, the data tell us nothing.

[1] Speaking of language, this is why computers cannot learn human languages to the degree we do. They do poorly with irregular forms, such as irregular verbs and irregular plurals. The statistical learning algorithms look for patterns between a present and past tense form of a verb, for thousands of verbs. It's not hard to learn the pattern for regular verbs -- stick "-ed" on the end. Some irregular verbs fall into families with similarities, but it's not hard-and-fast, and some verbs are sui generis.

Train the computer on verbs like "drink / drank / drunk" and it can correctly guess that "sing" goes "sing / sang / sung".

But ask it about the incredibly common verb "hit" -- it'll try to apply some variation to the root form, maybe "hit / hat / hut", or guess that it's regular "hit / hitted / hitted". All its guesses will be wrong since the forms are all the same, "hit / hit / hit". After training on "tooth / teeth," it won't be able to guess that it's "foot / feet," since the sound similarities between "tooth" and "foot" only held in an earlier stage of English (when the vowel was a long "oo"), and today they just have to be memorized individually.

These failures of machine learning apply very generally, and are the central weakness in connectionist and neural network approaches to modeling human language and cognition more broadly. They are good at abstracting associations within the data-set that they've been trained on, and can make good guesses about the properties of a new item if it resembles an item they've already seen. If the new item is unfamiliar from the training data-set, the guesses go all over the place and are all equally worthless.

Big data cannot think outside the box.

22 comments:

Curtis, 12/10/15, 8:02 AM
Speaking of which:
http://www.cnn.com/2015/12/10/politics/donald-trump-2016-independent-presidential-campaign/index.html
Severian, 12/10/15, 8:08 AM
"nerds generally don't appreciate history, and cannot force themselves to think back further than WWII, and typically 1980 in politics."

You can say that again! There are, in fact, plenty of historical parallels for the Donald Trump experience. It's just that "pollster," and "campaign consultant" in general, have only existed since the later 1960s, so people use Richard Nixon as their lower bound. (Boomers ruin everything, don't they?)

Read just a little bit of history, and the comparisons jump right out. One is Weimar Germany. I am NOT saying Trump is a Nazi, since I'm not an idiot leftist. But the phenomenon of the masses voting for the only viable "fuck the major parties" candidate is not new. Hitler was the "anything but more of the Weimar Republic" candidate. We're living in Weimar America.

Or, if you prefer an American parallel, the Republican Party in the 1850s. From about 1820, the only issue that mattered in American politics was slavery, and both parties did everything in their considerable power to avoid talking about it. Then a third party came along and talked about it, and pulled in all the disaffected members of both parties -- the Whigs collapsed, the majority wing of the Democratic Party became the out-n-proud proslavery party, and I forget what happened after that. I'm sure it all worked out fine.

In fact, if you want to broaden it out further, you could simply say that the US government is going through a crisis of legitimacy. It doesn't do what it's supposed to do, and the elite's contempt for the people is overwhelming and obvious. I can't think of a single situation, in the entire modern West, where that didn't end in a revolution. The only thing unique about 2016, on that view, is that it'll (probably) be peaceful.... for a few more years.
Contaminated NEET, 12/10/15, 8:09 AM
You're right about this, but no matter how wrong the quants get this question, or any other, we'll never learn. The seeming objectivity of numbers is just too tempting to the bureaucratic mind, and bureaucratic minds run the world.
Curtis, 12/10/15, 11:51 AM
A low disgust reflex correlates with arrogance (humility vs. arrogance on the HEXACO big six inventory), which is the cause of the big data industry. It's just another example of a trend in cocooning times to try to control life and perfect human nature. Cocooners are offended by the idea of bad luck and, being control freaks, believe that they can control everything that happens to them. Those who live in outgoing, healthy societies know better.
agnostic, 12/10/15, 1:52 PM
"you could simply say that the US government is going through a crisis of legitimacy."

It is, but that doesn't help us predict who's going to get the GOP nomination, which is what the "big data" and "prediction markets" hype is all about. Or, who will eventually win the election.

Some big populist change is coming in the short-to-medium term, but historical parallels don't give us more detail than that. We can't rule out the GOP elite rigging the nomination or the general election, or assassination, or etc.

The closest parallel we have is Teddy Roosevelt, but he was not elected into the Presidency -- he was VP, and President McKinley was assassinated by a bitter anarchist Slav whose parents were immigrants. Voters actually vote for the Pres, and the VP coasts along with him, so we can't even say that Trump's nearest persona won a tough election (though he did get re-elected, but that's much easier).

And then when Roosevelt went third-party with the Progressive / Bull Moose Party, he lost to the Democrat Wilson, with the Republican Taft finishing third. Likewise in 2016, if Trump went third-party and lost, it would be to the Democrat Hillary rather than the unelectable place-holder that the GOP chose over Trump.

The parallel of the emergence of the Republicans as a third-party success story is not as relevant as it sounds. America was still a fairly young nation, and the two "established" parties were not so deeply established. In 1860, the Whig + Democrat establishment was only a generation or so old. Not too hard to vote for dumping a party that was inchoate -- feels like trial and error.

By now, the Democrats and Republicans have been the only two parties for 150 years. When there have been major realignments -- Teddy Roosevelt among the Republicans, FDR among the Democrats, Reagan among the later Republicans, Clinton among the later Democrats -- it has never resulted in a third party, but a transformation of an existing established party.
agnostic, 12/10/15, 2:13 PM
BTW, none of the above should temper our enthusiasm for the Trump phenomenon. By all signs, he's going to enjoy even greater success than Teddy Roosevelt, who only got the VP spot and got lucky that the Pres was assassinated.

Historical parallels repeating themselves doesn't mean down to the 50th decimal place. There's going to be variation even among re-incarnations.

So perhaps this time around there will be even greater transformation (of the GOP and of society generally) than there was under Teddy Roosevelt.

Data won't tell us one way or the other; we just have to wait and see, and enjoy the changing tone of political discussions that Trump has already effected.
Derrick Bonsell, 12/10/15, 3:25 PM
There is simply too much inertia behind the two-party system for it to disintegrate. Part of it is ballot access. A candidate running for the Republicans or Democrats doesn't need as many signatures to run for office as a third-party or independent candidate does. More important than that, arguably, is our First-Past-the-Post election system. Every country with FPTP has two major parties, and maybe a third party that sometimes reaches power as part of a coalition. The UK has the Tories and Labour, with the Liberal Democrats getting only a very small percentage of seats. Canada has the Liberals and the Conservatives, with the NDP getting a shot a few years ago, but in the most recent election they finished a distant third and seem like they'll fade into the background. In a FPTP system, you're a member of a district and you cast your vote for one of a series of candidates. If your guy loses, your district gets nothing and your vote is irrelevant. So why take a chance on a candidate that cannot win? The two major parties benefit from this system.

In a proportional or semi-proportional system your vote still matters. If there are 100 seats and your party only gets 30% of the vote, well, you still have 30 seats*.

*Assuming it's proportional, semi-proportional systems (like Germany) are a bit more complicated.
NZT, 12/10/15, 4:49 PM
Good post. There's a certain type of bureaucratic, managerial mind to whom modeling, "risk mitigation", "data analytics", etc, are catnip, even when applied to inherently chaotic fields where they've been consistently proven unreliable. In my work I brush shoulders with a lot of financial models, and while a few of them are helpful (usually limited to very specific domains, like predicting how a bond portfolio would react to a change in interest rates), the vast majority are ultimately just very sophisticated ways of taking random-ass guesses at what the future will bring, and their main function is to make managers and executives feel better by having a standard process to follow.

What really gets under my skin is when people intuitively know a model won't really help us solve a problem, but they mewl about "but still, it's better than nothing". Well, no, it isn't. A model presents a veneer of objectivity and "mathiness" that easily lulls people into making bad decisions based on false confidence. Sometimes the most honest option is to admit you aren't sure and make the best choice you can based on judgment, intuition, and luck. But concepts like "judgment" and "risk" and "uncertainty" are anathema to the bureaucratic mind, so you wind up throwing a lot of man-hours at projects that don't actually increase knowledge or improve decisions.

Incidentally, I'm normally no big fan of Malcolm Gladwell, but he wrote a great article about Taleb back in 2002 that might interest you:

http://gladwell.com/blowing-up/
TGGP, 12/10/15, 11:45 PM
Teddy Roosevelt doesn't seem that clear a parallel to Trump to me. As you note, he became President via McKinley's assassination. Prior to that he had been a governor, Lieutenant Colonel (and founder of the Rough Riders), Assistant Secretary of the Navy, Police Commissioner, Civil Service Commissioner and state assemblyman. Donald Trump, by contrast, has never been elected or appointed to any public office and doesn't have a consistent history as a member of the Republican party (contrast with Roosevelt, who refused to support mugwumps of the Democratic party, even if they had some common goals). I compared Trump once to Wendell Willkie, who at the time was unusual in garnering a major party nomination despite little history with that party or any kind of public service. Of course, Willkie was nominated under a very different process and the FDR years were an unusual period in American history (more so than Teddy's).

My offer for a Trump bet still stands.
agnostic, 12/11/15, 1:53 AM
Your offer is empty because we are in an unprecedented setting, meaning nobody would be drawing on anything, and the guesses ("bets") would be pure noise, no signal, and whoever won would be from luck.

You're comically trying to make yourself look like a rebel calling someone's bluff -- trying to make it sound like we're all so confident that Trump is going to get the nomination, but are then hypocritically toning it down when asked to bet on it.

Maybe you haven't been paying attention, but nobody on the Trump train assumes he's got the nomination in the bag -- including the Don himself. Why else does he keep open the option of running third party if he's treated dishonestly by the party establishment? Why else are the majority of his supporters willing to vote for him as a third party candidate, unless everyone senses the real possibility of the GOP elite sabotaging its own frontrunner (by far)?

This is what I mean by prediction markets being big losers along with big data (not big enough to include a relevant comparison point). They are telling us nothing that we don't already know -- i.e., that Fiorina, Kasich, and Graham are highly unlikely to get the nom, compared to Rubio, Bush, or Trump.

As for who among the top 5 would get it, they have nothing to go on. Even GOP insiders don't know what's going to happen at the convention -- they're keeping the option open to sabotage Trump, but they're also keeping open the option that that won't be necessary because (they keep hoping) Trump will fall in the polls.

People in the prediction markets have no specialist information, since there is no parallel to draw from. And they have no insider information, since it's still up-in-the-air for both the frontrunner and the party leadership.

We might as well ask the Magic 8-Ball.
agnostic, 12/11/15, 2:47 AM
Prediction markets are also worthless because they only score guesses as correct or incorrect. They therefore do not measure an *expectation* that people have about the future -- which is the probability of some event happening, multiplied by the magnitude of the effect if it happens, summed over all possible events.

Technically the prediction markets could be set up that way -- that bets would require an expectation, or expectation with variance, or a full distribution. But that's too nitty-gritty and boring to the ADD audience of pop-sci TED Talk bullshit.

Poser science junkies only want to see a flashy scoreboard of what percent chance each candidate has of winning.

As an example, suppose the bet was about the number of American residents who will be deported during the administration of whoever the next President is.

This reduces to a bet on Trump winning the election, since everyone else is a wimp on immigration -- especially on using deportation as a means to enforce the law. There's some small number that all the other candidates might be comfortable with, so that would be the reference point (based on recent history from Bush II and Obama, who are ample parallels to all candidates aside from Trump).

The bet would then be, how many deportations in excess of the recent historical average (over 5, 10, or however-many years in the past), do you predict to unfold? Maybe on the scale of a year, 4 years, first 4 years of an administration, whatever.

Let's say there are two bets to make -- Trump or anyone else. If Trump wins, those bettors get paid by the losers an amount equal in dollars to the number of deportations beyond the recent average.

According to Trump skeptics themselves, the Trump enthusiasts are taking the riskier bet, an almost laughably long-shot bet, in the skeptics' minds -- that the GOP won't sabotage the nomination process and he'll then get the nomination and he'll then beat Hillary, or that they will sabotage but he'll then run third party and win over both Hillary and the GOP cuck -- so they should get paid more if right.

And the Trump bettors' wins -- or the non-Trump bettors' losses -- should be unbounded above, just to give the skeptics something to think about. Because the skeptics are way more confident of their predictions, yet if the skeptics are wrong, they're *really* wrong. It's not the same as betting Bush becomes President but then it's Rubio. No big deal for deportations. But if Trump wins and you're wrong, you're wrong by perhaps millions, and being so wildly wrong, while speaking so confidently in public, should incur a wildly harsh loss.

If Trump loses, the non-Trump bettors win an amount equal in dollars to the absolute difference between the actual number of deportations and the recent average. It won't amount to that much -- whether more or less than the average, it will be close. But it's tightly bounded by the fact that all other candidates are weak on immigration and deportation especially, so their wins (Trump bettors' losses) aren't going to be so potentially high.

That's as it should be, though, since -- as they themselves tell it -- they're making a safer bet.
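
A minimal Python sketch of the payoff rule described above, from the point of view of someone who bet on Trump. The function name and the deportation figures are hypothetical, chosen only to show the asymmetry; flooring a Trump win at zero is also an assumption, since the bet as described only pays for deportations "beyond the recent average".

```python
def payoff_to_trump_bettor(trump_wins, deportations, recent_average):
    """Dollars won (positive) or lost (negative) by a Trump bettor.

    Assumption: one dollar per deportation of difference, as in the comment above.
    """
    if trump_wins:
        # Paid by the losers for every deportation beyond the recent average.
        # No upper bound; floored at zero (an assumption of this sketch).
        return max(0, deportations - recent_average)
    # Trump loses: the other side collects the absolute gap, which stays small
    # if every other candidate keeps deportations near the recent average.
    return -abs(deportations - recent_average)

# Hypothetical numbers, purely for illustration.
print(payoff_to_trump_bettor(True, 1_500_000, 400_000))   # 1100000
print(payoff_to_trump_bettor(False, 380_000, 400_000))    # -20000
```

The long-shot side has a small, bounded downside and an open-ended upside, which is the whole point of the design.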
Severian, 12/11/15, 6:12 AM
"It is, but that doesn't help us predict who's going to get the GOP nomination, which is what the "big data" and "prediction markets" hype is all about. Or, who will eventually win the election."

True, but this too is missing the forest for the trees. It doesn't matter who wins the GOP nomination; the era of the two-party system is over.

If Trump doesn't win the GOP nomination, he goes third party. If Trump wins the nomination, the rest of the GOP goes 3rd party. And from there it falls like a line of dominoes. I honestly think Trump wins a three-way race between himself, Rubio, and Clinton, and once that happens, the Democrats split along the same lines -- their establishment wing going one way, the radical wing another.

[This, too, has happened before. Recall that Lincoln won the 1860 election because the Dems split three ways -- the mainline proslavery party under Stephen Douglas, the really proslavery party under John C. Breckinridge, plus a me-too quasi-Whig party under John Bell].

Eventually the dust will settle and the final American fascist system will take shape, but before that happens I wouldn't be surprised to see four or five "parties" slugging it out in the next election or two.

None of which, as you say, could have been predicted by the quants. But then again, trying to slap numbers and run simulations on human behavior has been a sucker bet from the get-go.
Severian, 12/11/15, 6:21 AM
If you want another parallel, consider the situation of the Democratic Party in 1968. Nobody ever talks about anything but RFK's assassination in that one, but check out Eugene McCarthy. There's your Trump figure. Outsider, focuses popular anger, shifts the conversation, and by extension his party, quite a bit... by the end of 1968, the Democratic Party was the "antiwar" party, running against the war they themselves started.
TGGP, 12/11/15, 8:42 PM
I wouldn't say I was a rebel when I challenged Morgan Warstler. I thought he made a lot of exceedingly confident claims and since I take seriously Robin Hanson on betting, I decided to try it for myself. It worked out pretty well, so why not do it again for another GOP primary? I had interpreted your statements as indicating that prognosticators and prediction markets were giving inaccurately low odds for Trump, and thought you might be willing to put some money/prognosticator reputation on that. But now you are saying "They are telling us nothing that we don't already know", and if you're not slighting their accuracy then it doesn't seem to me you've made much of a critique. The markets aren't indicating maximum entropy, and they don't simply mirror the polls. I haven't been closely following the race, so when I checked before making that offer again it was telling me things I didn't know. But if you actually think the prediction markets are inaccurately relying on old models of politics and that they really don't know anything, then the expected value of betting against them is positive, and I'm offering to be the less faceless fool with his money and bragging rights eager to depart.

I've taken part in prediction markets which ask questions of magnitude like you are interested in. However, that was part of a DARPA project rather than something as commercially viable as horserace polls.

The degree to which a skeptic or bettor is wrong depends on how confident they are. Those odds can be calculated into the payoff of a bet.

I am also willing to bet that the 2 party system is not dead, that the top two recipients of electoral & popular votes of this election (and the next) will be the nominees of the Republican & Democratic parties.
agnostic, 12/11/15, 10:41 PM
You have a serious brain problem where you keep cutting out qualifications that are explicitly given.

I did not say, without qualification, that prediction markets "don't tell us anything we don't already know" -- I said that they tell us nothing new at the bird's-eye view, namely that Kasich Fiorina Etc. are unlikely, and Trump Rubio Etc. are more likely. And that they are worthless at the finer-grained scale -- that they tend to slight Trump and hype Rubio.

The true situation at this point is closer to even chances for Trump and whoever the cuck candidate will be -- probably Rubio.

It's possible the prediction markets have it right with Rubio over Trump in the nomination -- in the case where the GOP sabotages Trump. And it's possible they've got it backwards -- in the case where the GOP allows its frontrunner (by far) to secure the nomination.

There's no way to decide between the two cases because there is no specialist info to be drawn on (no parallel in the past), and no insider info (GOP elite is still undecided about what to do).

BTW, I would say the prediction markets were worthless even if they had Trump far in the lead, mirroring the polls. Why? Because that assumes that the will of the GOP electorate will translate monotonically into the nomination process -- which is anything but clear. Maybe the elite will allow it, maybe they won't, no way to tell.

Hence no reason to bet on Trump over the head cuck (probably Rubio) or vice versa.
agnostic, 12/11/15, 10:52 PM
The other major reason why the prediction markets don't tell us anything is because there's no way to evaluate them after the fact.

The 2016 GOP nomination, the 2016 Presidential election, etc., are one-time-only events. There is no way we can run the experiment over again for thousands of trials, and see how many times Rubio (or Trump or whoever) got the GOP nomination, and check that frequency against what the prediction markets said (at any point in time). Ditto for evaluating their guesses about the President.

Suppose Rubio gets the nomination: does that prove the prediction markets "got it right"? No, because Rubio can get the nomination even in the case where his true chance of success was lower than Trump's, or about the same (as I'm suggesting). Remember, these are all probabilities, and as long as it isn't close to zero, they have some shot at winning.

With no frequency-based approach being possible to evaluate the prediction markets, we are left with a Bayesian approach. But that just reduces to how subjectively confident we are in the predictions being accurate. My hunch is they're over-hyping Rubio, who should be close to even with Trump. The mainstream hunch is that Rubio's way favored over Trump. Someone who isn't wise to the potential plot against Trump by the party elite would say Trump is way favored over Rubio.

But all of those Bayesian evaluations of the predictions are just a bunch of hot air, scientifically speaking, with no specialist or insider info to justify one or another.
agnostic, 12/11/15, 11:03 PM
That's why I say that I'll only seriously consider a bet where the gains from the riskier bet are unbounded above, and linked to the degree to which the safe bet got things wrong.

That takes it out of the mundane realm of trying to estimate an event's probability of success, where we can't run the experiment over and over to evaluate how accurate that probability estimate was. It takes it into the real world, where successful events have consequences, and of varying magnitudes.

Probability of success times size of impact, summed over all possible outcomes, is the expectation. That's something meaningful we can talk about in predicting the future. Throw in variance to allow for some deviation from that expectation, and now we're talking.
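
As a rough Python sketch of that arithmetic: the expectation is the sum of probability times payoff over every outcome, and the variance is the probability-weighted squared deviation from it. The two bets below (`bet_a`, `bet_b`) and their numbers are invented for the illustration. Bet B is "right" 90% of the time and still has a worse expectation than the coin-flip bet A, because the rare miss is huge.

```python
def expectation(outcomes):
    """Sum of probability * payoff over all (probability, payoff) pairs."""
    return sum(p * x for p, x in outcomes)

def variance(outcomes):
    """Probability-weighted squared deviation from the expectation."""
    mu = expectation(outcomes)
    return sum(p * (x - mu) ** 2 for p, x in outcomes)

# Hypothetical bets: each is a full list of (probability, payoff) outcomes.
bet_a = [(0.5, 10.0), (0.5, -10.0)]    # coin flip, small stakes either way
bet_b = [(0.9, 1.0), (0.1, -100.0)]    # usually a small win, rarely a big loss

for name, bet in [("A", bet_a), ("B", bet_b)]:
    print(name, round(expectation(bet), 2), round(variance(bet), 2))
```

A scoreboard that only reports the 90% would rank bet B well ahead of bet A, which is exactly backwards once magnitudes are counted.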

Any prediction market that doesn't include at least a mean of a (mostly) unbounded variable is just blowing smoke up our ass. Pretending to estimate probabilities of success for utterly non-repeatable events is a new low in the desecration of science by TED Talk cargo cult statisticians.

TGGP, 12/12/15, 11:25 AM
If you really think the markets are slighting Trump relative to Rubio/Cruz (and that the "true situation" is close to even), you should regard the given odds as a profitable betting opportunity. You can talk about what's "possible", but lots of things are possible (yet exceedingly unlikely). To the extent that the party leadership is important in determining the nomination and the prediction markets (but not polls) are telling us that, then they aren't worthless at all.

"GOP elite is still undecided about what to do"
I think they've decided "not Trump", and it's just a question of which not-Trump will get it. I don't know if Cruz, Rubio or some dark horse will get it, but I am predicting that whoever it is will get the nomination rather than Trump as the anti-Trump coalition solidifies behind them.

"The other major reason why the prediction markets don't tell us anything is because there's no way to evaluate them after the fact."
Yes, we can evaluate prediction markets after the fact. They depend on having an answerable question by a certain amount of time, and we can see whether they were predicting the correct outcome and how long it took for them to get to that point. And they evaluate one-time events all the time! You get at a bit of truth in that a single prediction won't tell you much about the worth of prediction markets as a whole, because the point of such markets is to make many predictions and we evaluate how well-calibrated they are by aggregating all the outcomes.
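Here is a rough sketch of what "aggregating all the outcomes" could look like in practice: group many one-off predictions by their stated probability and compare each group's average forecast with how often the event actually happened. The data below is synthetic and purely illustrative, and `calibration_table` is a hypothetical helper, not any betting platform's actual method.

```python
from collections import defaultdict


def calibration_table(forecasts, n_bins=5):
    """Bin (predicted_probability, happened) pairs and summarize each bin.

    Returns a list of (mean_forecast, observed_frequency, count) tuples,
    ordered from the lowest-probability bin to the highest.
    """
    bins = defaultdict(list)
    for p, happened in forecasts:
        # Clamp so that p == 1.0 falls into the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, happened))
    table = []
    for idx in sorted(bins):
        rows = bins[idx]
        mean_p = sum(p for p, _ in rows) / len(rows)
        hit_rate = sum(1 for _, h in rows if h) / len(rows)
        table.append((mean_p, hit_rate, len(rows)))
    return table


# Synthetic record of 20 distinct one-time events (placeholder data):
# ten priced at 70% of which seven happened, ten priced at 30% of which three happened.
history = (
    [(0.7, True)] * 7 + [(0.7, False)] * 3
    + [(0.3, True)] * 3 + [(0.3, False)] * 7
)
for mean_p, hit_rate, n in calibration_table(history):
    print(f"forecast ~{mean_p:.2f}  observed {hit_rate:.2f}  (n={n})")
```

A well-calibrated market is one where the two columns roughly agree in every bin. No single event in the record says much by itself; the check only works over the whole collection.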

"Rubio can get the nomination even in the case where his true chance of success was lower than Trump's, or about the same (as I'm suggesting)"
I'm not sure what you mean by "true chance of success". It's not a coin flip or die roll, so from a frequentist perspective wouldn't his odds just be 100% or 0% depending on whether he gets it?

"My hunch is they're over-hyping Rubio, who should be close to even with Trump"
Another possible interpretation just occurred to me which would be relevant: are you saying that Rubio's odds should be lowered to Trump's, with the remaining probability distributed to other candidates? Because I was really only willing to go along with market-odds on whether Trump would get it rather than on whether Rubio would be the not-Trump candidate. But if your hunch is that Trump's probability should be raised (which is what I had been thinking), then I'm willing to make a bet you'd regard as profitable.

The gains from a risky bet are already determined by the degree to which its opposite "got things wrong". Those are the odds. The mundane world is part of the real world, in which there are events whose outcomes we are interested in that can't be re-run. You can be interested in consequences downstream from those events, but those would just be more possible events (just ones commercial betting platforms happen to be less interested in than newsy stuff like horseraces). Personally, I'd be unwilling to bet on deportations because of stuff like the recent claims Obama has been "deporter in chief" if you count things in a rather dubious way.

Random Dude on the Internet, 12/12/15, 2:07 PM
Big data is really useless until people have time to really digest the impact social media has on politics. Donald Trump is able to get his message out by bypassing the traditional media entirely; it is not a surprise that predictions keep faltering. The reason is that he has a strong social media strategy. Other candidates might have a Twitter and an Instagram, but Trump does things totally differently.

Nate Silver and the like are using outdated models to predict the outcome of the election. The interesting thing to look out for is how Trump's poll percentages tie to his primary result percentages. I have very little faith that the GOP will play fairly here; vote tampering and fraud are going to be very real issues. I have little doubt though that the Breitbarts of the world are going to keep a close eye and report any shenanigans that happen. I bet the GOP wishes now that they had superdelegates like the Democrats who can be the Trump stumpers.

agnostic, 12/12/15, 6:08 PM
"If you really think the markets are slighting Trump relative to Rubio/Cruz (and that the "true situation" is close to even), you should regard the given odds as a profitable betting opportunity."

I don't give a damn about making a few bucks from betting. I'm talking big picture -- the estimates from the prediction markets *as a model for the real world* are utterly useless. That is how these things are conceived of by those who pay any attention to them at all -- as a model (however accurate or inaccurate) of the real world chances that each candidate has of winning.

To check a model's accuracy, we need to see how well it fits the data. Except we cannot have any data here -- there will be one candidate who gets the nomination, and that's it. We can't run the nomination process lots of times, and see if the model's estimates of success for each candidate match the percent of winning trials for that candidate.

"I think they've decided "not Trump", and it's just a question of which not-Trump will get it."

That's not what we're talking about -- obviously they've made up their minds about not wanting Trump to get it. But what will their tactics amount to -- letting the convention proceed fairly, and trying to trash Trump through attack ads before the primaries? Or holding off on pre-convention smearing, but then buying off or otherwise rigging the convention?

Their behavior that will influence the nomination choice is what is totally up-in-the-air and never-before-seen.

"I'm not sure what you mean by "true chance of success". It's not a coin flip or die roll, so from a frequentist perspective wouldn't his odds just be 100% or 0% depending on whether he gets it?"

These prediction markets are being treated as models for the underlying chances of success for each of the candidates. There is no frequentist interpretation possible, since the trial cannot be repeated. The Bayesian interpretation amounts to how confident the bettor is in the success of one outcome over the others.

Now, if I say that the cuck candidate and Trump have nearly even chances of success -- that isn't frequentist, since we can't run the nomination over and over again. It's me saying that we have no information from earlier parallels, and no insider information available, that would nudge our guess one way or the other with respect to the lead cuck vs. Trump. (Obviously we can tell Kasich Fiorina Etc. won't get it.)

Others -- most of those in the prediction markets -- are saying they feel more subjectively confident that Rubio will get it over Trump, for whatever reason (say, being the party favorite trumping the popularity with voters).

Suppose Rubio gets it. What does that do to resolve the disagreement over how subjectively confident we should have been at this point? It tells us nothing, unless we had the odds tilted in favor of one by orders of magnitude. But we're talking about a narrow disagreement -- the hypers of Rubio and Trump both give the other one a good chance, maybe 30-70 or 40-60 if it's just the two of them, not 1-99 or 0.1-99.9.

The outcome of a single event can't tell us how subjectively confident we should have been, if we were already in a narrow band to begin with (say, 30% or 40% or 50% in favor of one of two outcomes).
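One way to put a number on this is the likelihood ratio between two forecasters after a single outcome. This is my own illustrative sketch, not anything the prediction markets compute: the function name `likelihood_ratio` is hypothetical, and the probabilities are just the ballpark figures from the argument above.

```python
def likelihood_ratio(p_a, p_b, happened):
    """How much more probable the observed outcome was under forecaster A than under B.

    p_a, p_b: each forecaster's stated probability that the event happens.
    happened: whether the event actually happened.
    A ratio near 1 means the outcome barely distinguishes the two forecasters.
    """
    like_a = p_a if happened else 1 - p_a
    like_b = p_b if happened else 1 - p_b
    if like_b == 0:
        raise ValueError("forecaster B gave the observed outcome zero probability")
    return like_a / like_b


# Narrow disagreement: A says 70% Rubio, B says 40% Rubio, and Rubio wins.
print(likelihood_ratio(0.7, 0.4, happened=True))

# Orders-of-magnitude disagreement: A says 99%, B says 1%, and the event happens.
print(likelihood_ratio(0.99, 0.01, happened=True))
```

In the narrow case the single outcome favors one forecaster by a factor of less than 2, which is weak evidence. Only in the lopsided 99-to-1 case does one result say something strong about who was right.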

agnostic, 12/12/15, 6:17 PM
"are you saying that Rubio's odds should be lowered to Trumps, with the remaining probability distributed to other candidates?"

Or Trump's raised to Rubio's or whatever, with a small amount left for the also-rans.

"The gains from a risky bet are already determined by the degree to which its opposite "got things wrong"."

Right, but like I said, for variables with simple values of success vs. failure, rather than anything quantitative that can be measured (like deportations), we can't win or lose very much when we've already agreed that a few outcomes have chances of succeeding in the neighborhood of one another.

We can't win or lose much, and there are high up-front costs required to really fine-tune our estimates (turning it into a tedious, time-and-energy consuming research project, losing money by taking days off work, etc.). So, no point in setting up the bet.

But, if one of us could win or lose YUGE amounts, then it might be worth it.

And winning or losing big means one of us would have to be astronomically wrong. Well, that won't be about who gets the nomination -- you say Rubio 70% and I say Rubio 50%, so neither of us is very far off in case Rubio gets it, with 100%.

However, deportations will vary by orders of magnitude depending on who gets it. So now there's a real disagreement, something real-world rather than laboratory, and tied to an *expectation* about the future behavior of some random variable.

An expectation is a *real* prediction, unlike subjective impressions of how likely a non-repeatable event is to go one way or another.

Wm Jas Tychonievich, 5/6/16, 9:13 PM
"It's like asking someone who's been trained on conjugating Spanish verbs to weigh in on how some verb is conjugated in Chinese."

A very appropriate example -- since Chinese is so different from European languages that there's actually no such thing as conjugating verbs in Chinese.