Showing posts with label Probability. Show all posts

Saturday, July 19, 2014

Snappy responses you weren't hoping for that nonetheless answer the question quite well

In the last few years, unable to hold a list of just four grocery items in my head, I’d begun to fret a bit over my literal state of mind. So to reassure myself that nothing was amiss, just before tackling French I took a cognitive assessment called CNS Vital Signs, recommended by a psychologist friend. The results were anything but reassuring: I scored below average for my age group in nearly all of the categories, notably landing in the bottom 10th percentile on the composite memory test and in the lowest 5 percent on the visual memory test.
All this means that we adults have to work our brains hard to learn a second language. But that may be all the more reason to try, for my failed French quest yielded an unexpected benefit. After a year of struggling with the language, I retook the cognitive assessment, and the results shocked me. My scores had skyrocketed, placing me above average in seven of 10 categories, and average in the other three. My verbal memory score leapt from the bottom half to the 88th — the 88th! — percentile and my visual memory test shot from the bottom 5th percentile to the 50th. Studying a language had been like drinking from a mental fountain of youth.
What might explain such an improvement?
Regression toward the mean.
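
For anyone who wants to see the effect rather than take it on faith, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the score scale, the noise level, and the idea that a test score is just fixed ability plus independent noise on each sitting. Nobody in the simulation learns anything between sittings, yet the people who scored in the bottom decile the first time tend to land much closer to average the second time.

```python
import random
import statistics

def simulate_retest(n_people=20_000, noise_sd=10.0, cutoff_pct=0.10, seed=0):
    """Simulate two noisy test sittings with NO real change in ability."""
    rng = random.Random(seed)
    # True ability is fixed; each sitting adds independent measurement noise.
    # (Mean 100 and standard deviation 10 are made-up illustrative values.)
    ability = [rng.gauss(100, 10) for _ in range(n_people)]
    first = [a + rng.gauss(0, noise_sd) for a in ability]
    second = [a + rng.gauss(0, noise_sd) for a in ability]

    # Select the people who landed in the bottom decile on the first sitting.
    threshold = sorted(first)[int(cutoff_pct * n_people)]
    low = [i for i in range(n_people) if first[i] <= threshold]

    # Compare that same group's average score across the two sittings.
    mean_first = statistics.mean(first[i] for i in low)
    mean_second = statistics.mean(second[i] for i in low)
    return mean_first, mean_second

mean_first, mean_second = simulate_retest()
print(f"bottom-decile group, first sitting:  {mean_first:.1f}")
print(f"same people, second sitting:         {mean_second:.1f}")
```

The group was selected partly for bad luck on the first sitting, and the bad luck doesn't repeat on average. That alone produces an apparent improvement.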

Friday, March 21, 2014

The hard part of Malaysia Airlines Flight 370

So much has been said speculating about the missing Malaysia Airlines flight that may or may not have crashed, or been hijacked, or been deliberately flown into the ocean, or god knows what else. I think a lot of people were surprised to find out that in this day and age it is possible for a jet to simply go missing for this long without anyone having a clear idea of what the hell happened to it.

What struck me about the story, however, is how particularly devastating it must be for the relatives of those who were on the plane. In the first place, it's hard to see many ways that their loved ones came out of this alive. If the plane crashed into the ocean due to some mechanical failure or pilot suicide, they're long gone. And the possibility of what that ending might have been like would surely be a haunting one. The most optimistic scenario is a hijacking, but given the plane hasn't turned up and there haven't been any announcements, either to gloat over prisoners or demand ransoms (does anybody even do that anymore? I dunno), any group that wanted to just steal the plane would probably not want to leave hundreds of potential witnesses around afterwards. Bottom line, it's looking pretty damn grim.

But the situation is made significantly worse, even relative to a normal plane crash, by the fact that humans are incredibly bad at dealing emotionally with probabilistic scenarios. What does it mean for there to be a 0.5% chance that your dad is still alive somewhere and being held hostage, a 30% chance he got smashed to pieces in a crash and a 69.5% chance he got killed by terrorists? How should you feel about that? 30% of the time you might be philosophical about bad luck, 69.5% of the time you might be outraged by the depravity of human beings and demanding vengeance. And 0.5% of the time, you should be very nervously hoping that somehow things can be negotiated to a satisfactory conclusion, and doing everything in your limited power to make that happen.

In other words, 99.5% of the time you should be trying to move on with your life. This is made harder by the fact that you don't know what lesson to learn. And 0.5% of the time, you should be hanging on to the hope that they're still coming back, because they may have had an incredibly lucky escape.

Unfortunately, most people's emotions don't work this way - they can only feel one thing at a time. To make this work, they have to round all bar one of these probabilities down to zero - maybe at the crude level of dead or alive, but maybe even at the level of which scenario among the various cases. Either you decide that your Dad is dead, for sure, or you decide that he's alive for sure. Obviously given these odds, most people should go with 'dead', but you would need to be very hard of heart to not understand why people are reluctant to let go of hope when it comes to their loved ones.

I hate the word 'closure', as it's associated so much with feel-good claptrap that's just a cover for narcissistic emotional exhibitionism. But if the term means anything useful, it's that people find it hard to deal emotionally with events where they only know the outcome probabilistically, and different outcomes are associated with very different emotions. James Bagian can probably deal with them. I flatter myself that I can probably deal with them. This would test to your very core whether you can actually feel statistics, or just know them intellectually.

But most people can't. They just get torn up over and over with no end. Affective forecasting says it takes about 3 months to get used to most things. The families here don't even get that, because the clock doesn't even start running properly.

What a terribly sad circumstance to have to deal with.

Monday, January 20, 2014

Free Startup Ideas – Traffic Predictions and Alarm Clocks Done Right

Here’s an idea for some enterprising engineer (most likely at Google or somewhere else with access to good traffic data) that I’m almost certainly not the first to have thought of.

A good traffic prediction algorithm would let you specify a time of day you need to arrive at a particular destination, a starting point, and tell you when you need to leave. Google Now already does a crude version of this. If you have flight details in your Gmail account, it will send you an alert when you need to leave in order to get to the airport an hour before your flight. But there’s a lot more cool stuff you could do with this.

For instance, it would be great to be able to take the directions in Google Maps and specify a day of the week and time (or day of the year) and see an estimate for how long the trip would take at that particular point in time. Since Google has oodles of historical traffic data, they’d be able to get a pretty good estimate based just on historical traffic conditions. Ideally, you’d be able to take the same route and plot out how the expected length of journey varies with the starting time.

This would tell you what times of the day and night to avoid, letting you figure out how to adjust your work schedule to avoid traffic. It would also tell you about a fascinating quantity – the elasticity of time arrived to time left. There are times of the day, such as peak hour, where leaving 10 minutes later might cause you to arrive 15 minutes later (an elasticity of 1.5, suggesting that wasting those minutes is very costly), or at the back end when you can leave 10 minutes later and only arrive 8 minutes later (making those minutes subsidised).
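
The elasticity is just the change in arrival time divided by the change in departure time. A quick sketch, using the 10/15 and 10/8 figures from the paragraph above (the 40-minute base trip and the clock times are made up purely so there is something to plug in):

```python
def arrival_elasticity(depart_a, arrive_a, depart_b, arrive_b):
    """Minutes of extra arrival delay per minute of extra departure delay.

    All times are minutes after midnight.
    """
    return (arrive_b - arrive_a) / (depart_b - depart_a)

# Peak hour: leaving 10 minutes later means arriving 15 minutes later.
# (The 8:00 start and 40-minute base trip are illustrative assumptions.)
peak = arrival_elasticity(8 * 60, 8 * 60 + 40, 8 * 60 + 10, 8 * 60 + 55)

# Back end of the peak: leaving 10 minutes later means arriving only 8 later.
tail = arrival_elasticity(9 * 60, 9 * 60 + 40, 9 * 60 + 10, 9 * 60 + 48)

print(peak, tail)  # 1.5 0.8
```

Anything above 1 means dawdling costs you more than one-for-one. Anything below 1 means the road is giving some of those minutes back.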

Notably, everything I’ve described (like Google Now in its current form) only speaks of a point estimate of how long things will take, presumably either the mean or median. In reality, there’s much more interesting stuff you can do with the whole distribution.

For instance, lots of unexpected things happen with traffic – accidents, weather, what have you. So for a trip that leaves at 8am on a Monday, there’s actually a distribution of possible arrival times. For someone who knows what a distribution actually means, it would be very useful to be able to specify an acceptable percentage of the time that you would be late (or more than X minutes late), and have the algorithm give you a time that you needed to leave your house in order to get there on time with that probability.
If this were done, you could just subtract the number of minutes you need to get ready each morning, and that’s when you need to set your alarm.
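
Here is roughly what that calculation might look like, as a sketch rather than anything a real traffic service does. It assumes you have a list of past trip durations for the same route and time slot (the ten numbers below are invented), and it simply picks the trip-time budget that past trips exceeded no more often than your tolerated lateness rate.

```python
import math

def latest_departure(arrive_by, past_durations, max_late_prob):
    """Latest departure (minutes after midnight) with P(late) <= max_late_prob.

    past_durations: observed trip lengths in minutes for this route and slot.
    The probability is estimated from the empirical distribution only.
    """
    ordered = sorted(past_durations)
    n = len(ordered)
    # Number of past trips we can tolerate being longer than our budget.
    allowed_late = math.floor(max_late_prob * n)
    k = max(1, n - allowed_late)
    budget = ordered[k - 1]  # trip-time allowance in minutes
    return arrive_by - budget

def hhmm(minutes):
    """Format minutes after midnight as HH:MM."""
    return f"{minutes // 60:02d}:{minutes % 60:02d}"

# Hypothetical history of Monday 8am trips, in minutes.
durations = [32, 35, 31, 40, 33, 58, 34, 36, 33, 45]

leave_at = latest_departure(9 * 60, durations, max_late_prob=0.10)
alarm_at = leave_at - 45  # assumes 45 minutes to get ready

print("leave by", hhmm(leave_at))
print("alarm at", hhmm(alarm_at))
```

With only ten data points the tail estimate is crude. The point is just that the input is a lateness tolerance and the output is a departure time, not an average trip length.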

Even more interesting would be to improve these predictions from unconditional to conditional by making use of both current traffic and weather conditions. The overall distribution of, say, Mondays in January, would give you the unconditional distribution of the chances of arriving on time. But you could definitely do better by generating conditional distributions that morning that relied on the local weather conditions and the current traffic conditions relative to the historical distribution. In other words, if you normally need to leave home at 8am, the app could use the fact that traffic at 6:30am is heavier than normal to estimate that you may need to wake up earlier than normal as well.

Done properly, I’d gladly pay $20 for this kind of app. If it really worked, I’d probably value it at much more than that, notwithstanding that an irrational cheapskate instinct kicks in regarding the prospect of paying more than a few bucks for an online app.

As with all Shylock ideas, should the app succeed I insist on receiving either fat royalties or a free t-shirt that says ‘I came up with the idea for [Traffick-ator] and all I got was this lousy t-shirt’. Medium please.

Wednesday, May 8, 2013

Hate Generalisations? You Probably Just Hate Statistics

One of the most oft-repeated nonsense claims by a certain type of low-wattage intellectual lefty is that one 'shouldn't generalise'. (For reasons that are worthy of a separate post, this seems to me to be reasonably correlated with people who also proudly announce that they 'don't judge').

Apparently, one of the Worst Things In The World you can do is to notice that information about the generality of a distribution may be useful in predicting where a specific point in the distribution will lie.

For those people that don't like to 'generalise', I wonder what, if any, statistical measures they actually find interesting or legitimate.

What is an average, if not a statement that lets one generalise from a large number of data points to a concise summary property about all of the points combined? Or a standard deviation? Or a median?

The anti-generalisers tend to apply their argument ('assertion' is probably a better description) in two related ways, varying slightly in stupidity:

a) One should not summarise a range of data points into a general trend (e.g. 'On average, [Group X] commits murders at a higher rate than [Group Y]').

b) One should not use a general trend to form probabilistic inferences about a particular data point (e.g. 'Knowing statement a), if I also know that person A is in Group X, and person B is in Group Y, I should infer that person A has a higher probability of committing a murder than person B').

Version a) says you shouldn't notice trends in the world. Version b) says you shouldn't form inferences based on the trends you observe.

Both are bad in our hypothetical interlocutor's worldview, but I think version b) is what particularly drives them batty.

But unless you just hate Bayesian updating, the two statements flow from each other. b) is the logical consequence of a).

Now, this isn't a defence of every statement about the world that people make which cites claims a) and b). To a Bayesian, you have to update correctly.

You can have priors that are too wide, or too narrow.

You can make the absurd mistake of assuming that P(R|S) = P(S|R).

You can update too fast or too slowly based on new information.

And none of this has even begun to specify how you should treat the people you meet in life in response to such information.

None of my earlier statements are a defence of any of this. The first three are all incorrect applications of statistics. The last one is a question about manners, fairness, and how we should act towards our fellow man.

But there's nothing wrong with the statistical updating.
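
To make the P(R|S) versus P(S|R) point concrete, here is a small sketch of Bayes' rule with made-up numbers. R and S are just the abstract events from the formula above, and the three input probabilities are assumptions chosen to make the gap obvious.

```python
def posterior(prior_r, p_s_given_r, p_s_given_not_r):
    """Return P(R | S) via Bayes' rule."""
    # Total probability of observing S, across R and not-R.
    p_s = p_s_given_r * prior_r + p_s_given_not_r * (1 - prior_r)
    return p_s_given_r * prior_r / p_s

# Hypothetical inputs: R is rare, S is common when R holds,
# but S also shows up fairly often when R doesn't hold.
prior_r = 0.01
p_s_given_r = 0.90
p_s_given_not_r = 0.20

p_r_given_s = posterior(prior_r, p_s_given_r, p_s_given_not_r)
print(f"P(S|R) = {p_s_given_r:.2f}")
print(f"P(R|S) = {p_r_given_s:.4f}")
```

Observing S does move you, from 1% to a bit over 4% in this example, which is the legitimate update. Treating that 4% as if it were 90% is the illegitimate one.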

If your problem is with 'generalising', your problem is just some combination of 'the world we live in' and 'rationality'.

Suppose the example statements in a) and b) made you slightly uncomfortable. Let me ask you the following:

What groups X and Y did you have in mind when I spoke about the hypothetical murder trends example? Notice I didn't specify anything.

One possibility that you may be thinking I had in mind was that X = 'Blacks' and Y = 'Whites'. People don't tend to like talking about that one.

In actual fact, what I had in mind was X = 'Men' and Y = 'Women'. This one is not only uncontroversial, but it almost goes without saying.

As it turns out, both are true in the data.

Do inferences based on these two both make you equally uncomfortable? Somehow I doubt it.

And if they don't, you should be honest enough to admit that your problem is not actually with statistical updating, or 'generalisations'. You're just trying to launder some sociological or political concern by browbeating the correct application of statistics.

So stop patronisingly sneering that something is a generalisation, and using that as an implied criticism of an argument or moral position. Otherwise zombie Pierre-Simon Laplace is going to come and beat yo' @$$ with a slide rule.

Tuesday, April 16, 2013

On the Ex-Ante and the Ex-Post

Some thoughts on the occasion of receiving an email from a friend. He went down to the Boston marathon to watch his friend finish, and was planning to view things at the finish line. He found it too crowded, and walked up the street. This caused him to miss the first explosion, which was right near where he was originally standing. It also put him right next to where the second explosion was. By sheer coincidence, in the shock from the first blast, he started to walk towards the finish line, the site of the initial explosion. This caused him to be just far enough away from the second bomb when it exploded, right near where he'd been. He managed to escape unhurt.

I don't know about you, but studying enough statistics has had a subtle but deep effect on how I view the world. We who aspire to rationality make all our decisions in the realm of ex-ante calculations. When you understand probability, you realize that it doesn't make any sense to regret betting on heads when tails comes up as the winner, just as it doesn't make any sense to thrill at having chosen tails. You can only organise your life around things you know now, and decisions are only truly good or bad when evaluated according to what you knew at the time.

And yet...

When all that's said and done, you don't eat the expectation, you eat the coin flip. Every day, it tumbles through the sky, and all you can do is gird your loins and brace for whatever happens at the end. You plan and plan, and still, one day when you're not thinking, everything comes down to whether or not you took three steps in the right direction.

Different people give lots of different names to that - chance, luck, fate, God, Kamma. Ultimately, they're describing the same thing - whether you live to write the email or you don't.

In the end, it just wasn't your day to die. I'm extraordinarily glad of that. You get to see the sunrise and keep your health, and we get to keep our friend. Somewhere else, other people are receiving much sadder emails.

Such is life.

Wednesday, March 13, 2013

The Intrade End Game

The most useful source of information on US political events, Intrade, is shutting down:
With sincere regret we must inform you that due to circumstances recently discovered we must immediately cease trading activity on www.intrade.com.
These circumstances require immediate further investigation, and may include financial irregularities which in accordance with Irish law oblige the directors to take the following actions:
-Cease exchange trading on the website immediately.
-Settle all open positions and calculate the settled account value of all Member accounts immediately.
-Cease all banking transactions for all existing Company accounts immediately.
During the upcoming weeks, we will investigate these circumstances further and determine the necessary course of action.
To mitigate any further risk to members’ accounts, we have closed and settled all open contracts at fair market value as of the close of business on March 10, 2013, in accordance with the Terms and Conditions of our customers’ use of the website. You may view your account details and settled account balances by logging into the website.
At this time and until further notice, it is not possible to make any payments to members in accordance with their settled account balance until the investigations have concluded.
Hmmm.

The outcome of shutting down was bound to happen eventually. But it's hard to know what to make of this press release in particular.

Part of the backstory is this disgraceful attempt by the US Commodity Futures Trading Commission to sue them over the fact that US investors were using the markets. Imagine that! US citizens having a bet on the outcome of an election! How will the republic survive?

They did the same thing with the sports betting version, Tradesports, a few years ago. Under the rubric of the standard 'Think of the Children!' argument, the US government had to work hard to stamp Tradesports out for a much bigger sin - offering a better sports gambling product at a lower cost than Vegas was willing to offer. And hey presto! Out you go. Tradesports made the Superbowl actually fun (no mean feat), as you could watch each play and see how the price reacted, getting real-time information on the progress of the game.

Tradesports was taken out sooner, because Vegas makes decent money off sports betting. They were willing to let the political side linger a bit longer, because this is small potatoes. Seeing the writing on the wall, the company that originally ran both (Intrade and Tradesports) sensibly decided to split the two parts off into separate companies. This managed to stave off the crocodile a little longer. What they really needed to do was start bribing, er, making donations to some US senators. That would have been more useful. The trouble with being incorporated in Ireland is that you're far enough away that you can't have political influence, but not so far away that you can escape prosecution. Then again Full Tilt Poker was incorporated in Ireland. Then again again, when the DOJ came looking for them, they ended up getting acquired by PokerStars, incorporated in The Isle of Man. I don't know how much the jurisdiction helps. If it were me, I'd try for Macau. You can bet the ChiComs wouldn't bother you. If you are going to do it, though, you need to learn the lesson that DeBeers executives figured out, but David Carruthers of BetOnSports.com didn't figure out - don't plan to set foot in the US.

As with all this stuff, the way they go after you is the Wikileaks trick - they make it illegal for credit card companies to transfer you money. He who controls Visa and Mastercard controls the world. At least until BitCoin gets big. Mencius Moldbug predicted that if BitCoin ever did look like it was getting big, the government would shut it down. Care to take the other side of that wager? I'll give you pretty good odds.

Whether the CFTC is motivated by the same considerations that made the government put the squeeze on Tradesports is unclear. Frankly, bureaucratic petty jealousy would be more than enough to motivate these pinheads. Look, someone somewhere is trading a financial product without our authority! Shut it down! Sue them into oblivion!

The real question is why Intrade decided to stop trading now. After all, the loathsome CFTC press release was from November. Why now?

As far as I can see, there seem to be two possibilities.

One is that the government is pulling a Conrad Black. This is where they charge you with a crime and then find some pretext to freeze your assets, thereby making it incredibly hard for you to raise the money to pay for decent lawyers. How freezing their members' accounts will help them is unclear - I'd be quite surprised if under Irish law they're allowed to use member funds (which are likely in some kind of trust) to pay for legal bills.

The other is that they've figured out that the money simply isn't there. Corporate directors don't use the words 'financial irregularities' unless they really have to. In other words, the reason they can't pay out members is that they've figured out that someone has been skimming money off the top, and now there aren't enough funds there to pay out everyone in full. In which case, you can't pay out anyone until you find out what the hell is going on. This was the substance of the allegations against Full Tilt Poker. If you're skimming money off the top, you're effectively running a Ponzi scheme, although not one with explosive growth.

It's possible the two ideas actually interact. In other words, the member funds are held in trust, but the company has been forced to make some provision for losses under the CFTC action. If you think there's now a chance that you'll be insolvent when it's all finished, it's not clear exactly what steps you'd take as a company, but this may well be one of them. (Not remembering all of my trusts law so clearly, I tried getting some kind of answer for the US here, but it looked rather hard. Actual lawyers would probably know the answer). So even if there isn't anything untoward going on in the company accounts, it wouldn't surprise me if this had something to do with it.

So at last, the government gets their way! The prediction markets in the US get eviscerated, except for BetFair, which actually does make it hard for US investors to take part (which probably doesn't help the accuracy of US election predictions), and, more importantly, is large enough in the UK that they have political power and can't be pushed around so easily.

The government doesn't specifically want inaccurate predictions. It just doesn't want anyone other than Vegas or Wall Street to make any money on contingent financial contracts, no matter how small the amount, no matter how trifling the event being predicted.

The real winner in all this is Nate Silver. If you want predictions for important events, don't look to markets to save you. The markets would be happy to oblige, of course, but too many important people stand to lose money if that happens. Nothing personal old chap, you know how it is.

Sunday, November 4, 2012

A Small Change to Improve TV Poker

On televised poker shows, they often display the probability that each player will win the hand. This leads to games being essentially about the probability of an upset - can the 20% guy pull out the victory with the next card? I guess this makes for some dramatic tension, but it's not terribly useful for understanding poker.

The main reason is that what they display are full information probabilities - if you knew both players' hands, this is what you'd calculate the odds as being.

Of course, the whole point of poker is that you don't know what the other guy is holding, and you're trying to infer it. The question of how exactly you infer it, from the cards on the board and the way he's betting, is the entire art and science of the game.

The most scientific (or at least probabilistic) part is knowing your chances of winning given only the cards in your hand. If you're only going to display one probability for each player, this is the useful one to understand what the players are actually doing - if there are two players and you hold the Ace of Diamonds and 3 of Hearts, what are your odds of winning if all cards are dealt? This would help people understand basic things like why the guy keeps betting if he's only got a 14% chance of winning - he doesn't know that he's only got a 14% chance of winning. He thinks he's got a 54% chance of winning, and doesn't know the other guy is holding a flush.

This number would also be much more useful for helping people learn to play poker better. They'd learn faster what each set of cards implied.

Now, the criticism here is that good poker players will infer much more than the unconditional probabilities based on the flop and how the guy is betting. But if you display both numbers (full information and conditional only on own cards), you'd at least know which way a skillful player was likely to be updating. In other words, he's inferring something between 14% and 54%.
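
For the curious, here is a rough Monte Carlo sketch of the two numbers for a two-player Texas Hold'em hand before any board cards are dealt. It is an illustration under my own assumptions, not how any TV graphics package works: the hand evaluator is a homemade one, the opponent's pocket kings are a hypothetical holding, and ties are counted as half a win.

```python
import itertools
import random
from collections import Counter

RANKS = "23456789TJQKA"
DECK = [r + s for r in RANKS for s in "cdhs"]

def rank5(cards):
    """Score a 5-card hand as (category, tiebreak ranks); bigger is better."""
    vals = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(vals)
    # Ranks ordered by multiplicity first, so pairs and trips lead the tiebreak.
    groups = sorted(counts, key=lambda v: (counts[v], v), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    straight = len(counts) == 5 and vals[0] - vals[4] == 4
    if vals == [12, 3, 2, 1, 0]:  # A-2-3-4-5 plays as a five-high straight
        straight, groups = True, [3, 2, 1, 0, -1]
    if straight and flush:
        cat = 8
    elif shape == [4, 1]:
        cat = 7
    elif shape == [3, 2]:
        cat = 6
    elif flush:
        cat = 5
    elif straight:
        cat = 4
    elif shape == [3, 1, 1]:
        cat = 3
    elif shape == [2, 2, 1]:
        cat = 2
    elif shape == [2, 1, 1, 1]:
        cat = 1
    else:
        cat = 0
    return (cat, groups)

def best7(cards):
    """Best 5-card score available from 7 cards."""
    return max(rank5(c) for c in itertools.combinations(cards, 5))

def win_prob(hero, villain=None, trials=1000, seed=0):
    """Estimate hero's chance of winning heads-up (ties count as half).

    If villain is None, the opponent's hole cards are drawn at random
    each trial, which is the 'own cards only' view.
    """
    rng = random.Random(seed)
    known = list(hero) + list(villain or [])
    stub = [c for c in DECK if c not in known]
    score = 0.0
    for _ in range(trials):
        draw = rng.sample(stub, 5 if villain else 7)
        board, opp = draw[:5], (villain or draw[5:])
        h, v = best7(list(hero) + board), best7(list(opp) + board)
        score += 1.0 if h > v else 0.5 if h == v else 0.0
    return score / trials

hero = ["Ad", "3h"]
own_cards_only = win_prob(hero)                 # what the player can compute
full_info = win_prob(hero, villain=["Kc", "Kd"])  # what the TV graphic shows
print(f"own cards only:   {own_cards_only:.0%}")
print(f"full information: {full_info:.0%}")
```

The gap between the two printed numbers is the whole story of the hand: the player is betting on the first one, while the audience is watching the second.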

I assume that the TV networks have decided that putting two probabilities on the screen is simply too confusing for the average boob TV audience. But I'm not so sure. Frankly, to watch the game at all, you've got to have some interest in poker, and it is simply impossible to be interested in poker without understanding the rudiments of probability (intuitively, if not formally). The guys who would find this totally confusing probably are never going to watch the show anyway.

I am as skeptical of human nature and ability as the next man, but on this one, I say give viewers the benefit of the doubt and put both full-information and partial information probabilities up.

(As a side note, I initially was going to title this post 'A Modest Proposal For Improving TV Poker', which has a good ring to it. The problem is that Jonathan Swift meant the 'modest' in a sarcastic way, and it adds greatly to the confusion to also use it for truly modest proposals - it's like people who use the Casablanca 'shocked, SHOCKED' line for things that are actually shocking. Don't do it!)

Wednesday, August 1, 2012

Why I don't use hotel safes

People focus on the salient risks. OMG, someone might steal my passport!

Fair enough - they might. But truthfully, how high is the risk of this if you're staying in a decent hotel and it's somewhere not in plain sight, such as in a bag?

I submit that it's not very high. The only guy I know who ever personally had anything stolen was staying in a dorm room in a backpackers at the time, and it was stolen by the other guy in the room, not the maid. As it turns out, the backpacker stole the MP3 player he'd fallen asleep listening to, right out of his ear! Talk about chutzpah. We'll file that as 'one more reason to avoid hippies in backpackers'.

But a low risk of theft is, on its own, no reason at all not to use a hotel safe.

On the other hand, if you're anything like me, do you know what the much bigger risk of you being separated from your passport is?

Leaving it in the damn hotel safe when you check out of the room because you forgot to get it out.

I've done that at least once, years ago, but thankfully I remembered when the taxi was only halfway to the airport.

It's not a salient risk, but it's much, much higher.