Showing posts with label Correlations. Show all posts

Thursday, May 10, 2018

On Predicting Divorce


Divorce, like death, is one of those things that deep down everyone assumes will only happen to other people. People believe this despite all the statistics and reasoning to the contrary. Also like death, it usually takes a divorce happening to someone close to you for the full gravity and horror of the situation to become apparent. But if it comes, there's a good chance it will take you by surprise. Maybe your marriage gets randomly run over by a bus. Maybe it develops a debilitating and malignant lung cancer, until when the end finally arrives it almost comes as a relief. Of course, divorce doesn’t have to happen to you. But maybe that’s just because the other inevitability steps in first. On a long enough time frame, the survival rate for everybody drops to zero, after all. If we lived for a million years, would any marriage last that long?

I have not been divorced. I have not even been married. Which makes me wholly unqualified to talk on the subject. But then again, even the most ardent real life students of the topic probably only have a few first-hand experiences on the subject. And knowledge on the subject is almost by construction going to be piecemeal. The people with the most firsthand experience of what it’s like to go through one are likely those with relatively less understanding of why it tends to occur. Or they’re bizarre gluttons for punishment.

If one is interested in forestalling divorce, there are two questions to ask. The first is how you should act in a marriage, conditional on your spouse. For this you can go to your local marriage counselor, or Dalrock, or Heartiste. Weight the three according to taste.

But there’s a second question – whom should you marry in the first place? I’ve probably spent more time thinking about this question, because it’s the Russian Roulette of high-stakes inference. And if I spend more time thinking about it than most people, perhaps oddly so, I at least have the defense that most people spend insufficient time thinking about it in cold, concrete terms.

So what might be things I’d look for?

The first, which doesn’t require much insight, is divorced parents, uncles and aunts, brothers and sisters, etc. Everything is partly heritable, so a fair amount of behavior will come from genetics. But this is one of those cases where you don’t really care where the predictive power comes from. The bit that’s environmental is being passed down too. Freud may have been wrong about the specific hypotheses he had on how children relate to their parents, but he was right on one thing – if you want to understand the child, look at their parents, and the child’s relationship with their parents.

Some people end up explicitly modeling themselves as a rejection and reaction against their parents’ failings. But most people end up subconsciously taking in expectations of what “normal” behavior looks like. Marital breakdown is like a car crash. Because crashes are quite infrequent, you probably want to spend more time analyzing near misses, where there’s a lot more frequent data to go on. In the marital domain, I find a quite illuminating question to be “how often did your parents tend to argue when you were a kid”? Everyone assumes their answer holds across the board for everyone. It doesn’t. Try it out.

So then we turn to characteristics of the person themselves. What traits are worrying?

To me, the biggest personality trait I’d worry about is selfishness and self-centredness, broadly defined. And importantly, you can’t look to how they are with you. You have to look at how they are with other people, especially those they don’t really like. Sacrificing and making an effort when in the first flush of excitement and love is very different than doing it after ten years when you’ve got two young children and you’re chronically underslept. The latter is when it actually matters. How does the person behave when they’re tired, and stressed, and having to do something they don’t really like?

Selfishness and self-centredness aren’t the same thing, of course, but they overlap. Selfishness is probably something that people are more apt to notice and avoid instinctively – is the person just stingy and rarely generous in unsolicited ways, unless they’re getting something out of it? This is probably likely to make your marriage unpleasant, leading to a visible deterioration. But it’s also something that is likely to make you avoid marrying someone in the first place just as an experiential aspect, regardless of the specific divorce question.

I suspect that self-centredness is both harder to diagnose, and more likely to get you blind-sided by a surprise divorce. In other words, does the person think that the main question to be answered is “Is this marriage something that makes me happy?”. If this is the relevant question, you might be surprised how their behavior turns on a dime when the answer switches to “no”. When things are going well and marriage makes them happy, a self-centred person might do lots of nice things for their spouse. But once it doesn’t, suddenly their desire to be generous decreases a lot in a way that seems surprising from the outside.

So how do you spot someone who’s not self-centred? Self-centredness can have a number of opposite traits, which manifest in different ways. One is empathy – genuine empathy, that is. Genuine empathy frequently asks the question “I wonder how that would feel to the other person?”. Someone who asks this frequently will wonder far in advance what divorce would be like for their husband, and their children. Self-centredness can coexist with kindness to others, and even compassion. This is the main way people don’t tend to spot it. Doing well-understood nice things to other people, because it feels good, is not the same thing as habitually thinking about how one’s words and actions will affect those around them. A self-centred person might do sweet things like buy a present for someone, but then later inadvertently hurt them with some carelessly chosen phrase, because they just weren’t really thinking about how it would impact the other person.

Another opposite trait is a sense of duty. Duty is a very old-fashioned word. Someone who has a concept of the duties of a wife is not just thinking about themselves. I suspect that a general sense of duty across the board is useful. Do they call their parents often, for instance? Do they have a sense of religious obligation? Even beyond their specific views on marriage, duty says that there are more important questions than just whether something makes you happy in the short term, or even at all. Some things just ought be done. And the broad sense of duty does not need to require a specific set of saint-like devotion to husbandly happiness. Good luck finding that in the Current Year (or, honestly, probably in any year). It’s probably enough to just have a stubborn insistence that one is obligated to work out one’s marital problems no matter what, because divorce is just not done.

Between the two, empathy avoids self-centredness by being able to reason on-the-fly about what other people around them are thinking and feeling. Duty is the conservative, Chesterton’s Fence version – because most people will insufficiently be able to reason out all the ways to make social arrangements work, we should roughly codify the parts that seem to be best practice. The former is more useful in a wide range of social situations, but probably also harder to find. The latter is scalable to more people, but of course we as a society don’t bother doing that scaling anymore.

There’s an additional component at play here, but it requires more honest introspection. Having a partner who isn’t self-centred is especially important if you yourself are self-centred. Because that’s exactly the nightmare kind of situation. When you’re both in the first flush of love, it will bring you pleasure to do nice things for each other, the other person’s nice behavior will bring out more niceness in you, and you’ll think it will last that way forever. But when things deteriorate, you’ll both start making excuses to start looking out for number one.

The one trait that I think is a) true and b) more likely to be emphasized by marriage counselors than Heartiste is the other person's ability to communicate about problems, figure out reasonable solutions, and stick to them. If I don’t dwell on this one at length, it’s not because I think it’s less important, just that I think it’s sufficiently obvious that you don’t need to come here to hear about it.

The final trait I would look at is the extent to which the person bears grudges, or how they act towards people they hate. Do they just try to move on and remain civil, or do they dwell a lot on the subject of people they dislike? It’s not so much that this predicts the possibility of divorce, but I suspect it surely predicts how they will act if and when it comes about. The central mistake that causes people to underestimate how bitter their divorce will be is that when they imagine the process of divorce, they’re imagining their wife or girlfriend now who loves them deeply. This is a terrible failure to do statistical conditioning. Conditional on getting divorced, the person hates your guts. So how does this person act towards people whose guts they hate? More importantly, are they willing to be reasonable and compromise, or are they willing to pay a price to stick it to someone they hate? This is the difference between a grudging and terse two hour conversation about who gets what and $500 in lawyers fees, vs $200K and two years of utter misery. Once the arms race train gets started, it’s very hard to stop. And people underestimate the arms race. Your lawyers will emphasise the part that ends with the divorce settlement. They won’t emphasise what it’s like to have to see that person you now loathe every second weekend to pick up the kids.

But sometimes, to paraphrase Sherlock (not Shylock) Holmes, we have to decide when the R² of the regression is not as good as we would like it to be. And this is one of those cases. It is hard not to feel that, when all is said and done, one’s best calculations may not help one much here. One cannot, after all, pick a constellation of personality traits. One can only evaluate the girlfriend or boyfriend in front of you, and make a call one way or the other without knowing in any concrete way who the actual counterfactual girlfriend you haven’t yet met is. So you go with your gut, and roll the dice.
There is no means of testing which decision is better, because there is no basis for comparison. We live everything as it comes, without warning, like an actor going on cold. And what can life be worth if the first rehearsal for life is life itself? That is why life is always like a sketch. No, "sketch" is not quite the word, because a sketch is an outline of something, the groundwork for a picture, whereas the sketch that is our life is a sketch for nothing, an outline with no picture.
The good news is that, having rolled the dice, one can then (or hopefully sooner) turn all one’s attention to the second question of what to do once you’re in a marriage.

The bad news is that that, too, is subject to the Kundera problem.

Sometimes, despite everything, death happens to you too.

Sunday, July 2, 2017

On the time-series, the cross-section, and epistemic humility

One of the advantages of the economist's training is just the ability to instinctively think in terms of empirical tests. There's a reason that economists have tended to colonise other fields like sociology, law and politics - as much as anything, it comes from knowing about how to design proper empirical tests, and what good identification is.

Perhaps more importantly, it comes from knowing what poor identification is, including basic issues of endogeneity, reverse causality and omitted variables. For instance, knowing that you cannot infer almost anything about the relationship between prisons and crime just by looking at the variation over time in the total number of prisoners and the total number of criminals in a society.

The gold standard for identification, of course, is pure randomisation. When that isn't available, as it usually isn't outside a laboratory, you go for natural experiments - where something almost exogenous occurred. This gets used as an instrument - in the prisons and crime case, for instance, Steve Levitt used ACLU prison overcrowding litigation as a quasi-random shock to the prison population.

Of course, the pendulum swings back and forth. If the initial identification push corrected the free-for-all of 1980's empirical work (regressing the number of left shoes on the number of right socks!), the subsequent view seems to have gone towards 'identification uber alles'. In other words, the only question of interest is whether you've really truly identified totally exogenous variation, not the importance of the underlying topic. Plus, oddly, very few people seem willing to learn from an imperfect instrument. It seems to me that if there are 100 potential explanations, and an imperfect instrument rules out 90 of them, then we've learned something quite valuable - the answer is either the main hypothesis, or one of the remaining 10. But making the perfect the enemy of the good seems to be the way of things these days.
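To put rough numbers on the "100 explanations" point, here is a minimal sketch. It assumes, purely for illustration, that all 100 candidate explanations start out equally likely and that the imperfect instrument cleanly eliminates 90 of them. Neither assumption comes from any real study; the variable names are made up for the example.

```python
from fractions import Fraction

# Assumption: a flat prior over all candidate explanations.
n_hypotheses = 100
n_ruled_out = 90

prior_main = Fraction(1, n_hypotheses)       # belief in the main hypothesis before the test
n_survivors = n_hypotheses - n_ruled_out     # explanations the instrument cannot exclude
posterior_main = Fraction(1, n_survivors)    # belief after renormalising over the survivors

improvement = posterior_main / prior_main    # how much the imperfect instrument taught us

print(f"prior: {float(prior_main):.0%}, posterior: {float(posterior_main):.0%}")
```

Under those assumptions the main hypothesis goes from a 1-in-100 shot to a 1-in-10 shot. That is still far from proof, but it is a tenfold improvement from an instrument nobody would call clean.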

These questions end up being most important when you actually run empirical tests. Me, I'm lazy - there is a large hurdle for me to actually download a dataset and start fiddling with it. I'm always impressed by people like Audacious Epigone and Random C. Analysis who do this stuff all the time.

But there is another aspect of empirical training that ends up being even more useful for the computationally lazy - helping you sort through hypotheses just by knowing the panel nature of the dataset.

In particular, a lot of questions that purport to be about a time-series are really about a panel. That is, it's not really about a single variable over time (the time-series), it's about a group of different individuals over time. And thinking of the cross-section simultaneously with the time series greatly clarifies a lot of things. That is, instead of just coming up with hypotheses about why the average of some variable has changed over time, think about whether this hypothesis would also be able to explain which individuals would change more or less, or would be higher or lower on average.

One of my favorite examples is birthrates. The classic question is about the time-series : why have birthrates, on average, declined over time?

But there is also a cross-section - whose birthrates? This could be of individuals, or countries, or characteristics. Moreover, the cross-section exists both today, and in the past. If you're too lazy to run a regression and just wanted to sort through hypotheses about the time series with help from your friend William of Occam, one rule of thumb might be as follows: A good variable explains both the time series and the cross section. A mediocre variable explains the time series, but not the cross-section. A bad variable explains the time-series, but predicts the cross-section in the wrong direction.

For instance, suppose I wanted to know why birthrates in the west had declined overall. Here's the US Total Fertility Rate over time



You might look at that graph, and notice a big decline starting in the early 1960s. Certain hypotheses start suggesting themselves. What was going on in the 60's? Feminism? The Pill, approved by the FDA in 1960?

Quite possibly. But what hypotheses would come to mind if instead I showed you this graph:


The US is now in green. But we've also plotted New Zealand, in purple, Sweden, in light blue, and Japan in pink.

I personally would be astonished if your reaction isn't at least partly like mine, thinking that the world is actually quite a lot more complicated than you'd bargained for. Sweden was actually rising in the early 1960s, and New Zealand was higher than the US for a long time. Japan, meanwhile, has a completely different picture altogether.

You can add any number of these together at the excellent World Bank website to test whatever theory you have.

But there are other cross-sections within countries which can be used to test theories further.

Suppose I were primarily interested in the west, and my theory was about a rising cost of having children. It's a lot more expensive to raise children than it was in the past, as you have to pay for daycare, and "good schools", and all this other stuff, because societal expectations are higher.

I certainly hear people with kids complaining about cost all the time, which tells me that maybe there's something to it.

In the first place, I don't think this hypothesis stands up well to an actual examination of the data above. It turns out there's nothing quite like downloading the bloody data and plotting it to explode a lot of preconceptions. This probably actually is the most basic economist's tool of all, to be honest. Because the US graph above shows that not only did birth rates in general go down, but they went down precipitously starting in the 1960s, hit their low point around 1975, and have been slightly rising since then. Of course, since the graph only starts in the 1960s, you have to be wary of giving prominence to this date alone for the initial decline. But still, does this look like a graph of what you expect the cost of raising a child was?

I'd guess not, but with something hard to measure like "the cost of raising a child", who knows, maybe it is. After all, the aggregate time series is always hard. We only have one run of history, and lots of things are changing at once. But the cross-section has a lot more data.

For instance, consider the cross-section of rich and poor. At least in 2000, here's how they looked:


In other words, the poor have more children than the rich.

This immediately doesn't sound like a cost story. Even if the cost has gone up for everyone, why are the rich less able to bear it than the poor? Even if you think that the poor are on welfare so they don't care about the cost, as long as the rich have more money, they still are better placed to be able to deal with it. Are children some sort of inferior good, getting substituted for jetskis as income rises?

And there's another aspect - there's a historical cross-section as well. I suspect, though it turns out to be harder than I thought to find an easy citation for this, that this dysgenic pattern in birthrates with respect to income is a relatively recent phenomenon. Certainly in pre-history or in polygamous societies, only the rich could afford to have large families (or multiple wives, in the polygamy case). When the Malthusian limit binds, access to resources matters, and the rich outbreed the poor.

I've written before about how I think improved birth control is a big part of the story. But, doubt it not, this does not do much better at explaining the current cross-section than a cost story does. That is, it approximately fails to predict it at all (making it mediocre by my rule of thumb), rather than predicting it in the wrong direction like costs do. To get birth control to explain the cross-section of income as well, you'd need to believe that the poor are unable to afford birth control, but are able to afford the resulting children. Seems hard to square to me.
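For the computationally lazy, the rule of thumb can be written down as a tiny grading function. This is only a sketch: the names `CrossSection` and `grade` are invented for the example, and the inputs are my own judgment calls from the discussion above, granting for the sake of argument that both stories fit the time series.

```python
from enum import Enum


class CrossSection(Enum):
    RIGHT = "predicts the cross-section in the right direction"
    SILENT = "does not predict the cross-section either way"
    WRONG = "predicts the cross-section in the wrong direction"


def grade(explains_time_series: bool, cross_section: CrossSection) -> str:
    """Grade a candidate variable by the good / mediocre / bad rule of thumb."""
    if not explains_time_series:
        # The rule only ranks variables that at least fit the time series.
        return "not a candidate"
    if cross_section is CrossSection.RIGHT:
        return "good"
    if cross_section is CrossSection.SILENT:
        return "mediocre"
    return "bad"


# Judgment calls taken from the text, not estimates from data.
candidates = {
    "cost of raising children": (True, CrossSection.WRONG),
    "improved birth control": (True, CrossSection.SILENT),
}

grades = {name: grade(*verdict) for name, verdict in candidates.items()}

for name, result in grades.items():
    print(f"{name}: {result}")
```

The function does nothing clever. Its only use is forcing you to write down, for each pet theory, what it says about the cross-section before you fall in love with how well it fits the time series.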

Cost, incidentally, also has similar cross-sectional problems when it comes to the increase in obesity. Leftists love the 'food desert' explanation, whereby the poor are forced to become obese because the stores in their area don't have enough fresh fruits and vegetables, and hence the only options to them are potato chips and coke.

Again, from a cross-sectional point of view, this is possible. But it's a disaster from the time-series point of view. As society in general has gotten richer, we've also gotten fatter. How is it that the poor today "can't afford to eat healthily", but the poor in the 1930's could?

So explanations have to get more complicated. There's no rule of nature that everything has a single explanation. I've picked fertility and obesity because they are two of the most stubborn problems facing the west today, which suggests, but does not guarantee, that they may not be amenable to a single simple explanation.

I think there's something quite humbling about looking at the totality of the data, because it rarely looks like any one neat explanation of anything. It reminds you that your models of the world are just that - models. You include what you think are the most important parts, but you leave out lots of other stuff too. Even if you're right on what's important (a big if), the world is a large and complicated place.

Tuesday, March 1, 2016

Micro-snapshots of personal agency

One of my minor hobbies is noticing small correlations in how people speak that reveal things about them. Some examples here, here and here.

I was reminded of one from a conversation I overheard in an elevator today:
Girl: I forgot to bring a pen. 
Guy: Oh well, we can go back up and get one. 
Girl: I used to have a nice one that I'd carry with me. 
Guy: For some reason, the crummy pens stick around, while the good pens always disappear.
Girl: Yeah, that's because people always end up taking them.

Which reminded me of something I noticed way back in the third grade.

Like all small children, our pencils would often go missing. And when they did, people immediately fell into one of two narratives

a) I lost it.

b) Someone stole it.

I was always in the first category. I assume that I'm just forgetful and careless, which I am.

But some kids were always certain, without any proof, that the world was full of malicious people out to get them, stealing all their pens and pencils.

And if the girl's conversation is anything to go by, I suspect this difference persists later in life.

I may simply be naive about this, and extrapolating from my own mental state. But I can't quite believe that there's that many pen thieves out there in the offices and classrooms of the world. Who are all these people apparently swiping pens? Even the guy's point, which is the better one, seems more obviously explained by the fact that you only notice when a good pen goes missing, and the crummy pens go missing too, but you didn't pay attention because you didn't care.

The first sign that there isn't a pen conspiracy is that pens seem to go missing at approximately the same rate as individual socks go missing in the washing process. And I don't think anyone actually believes that the underpants gnomes are taking them. Things get dropped randomly, or forgotten, or misplaced. That's just life.

But when these kinds of annoying things happen, do you accept that as just part of the random bad luck of life? Do you blame yourself? Or do you blame a conspiracy of others?

I would wager that people who think pens frequently go missing because they get stolen are less likely to accept responsibility for their own screwups in life. I would wager that these people are probably somewhat less self-aware.

That seems like a strong conclusion to draw. It's only a hunch, presented as such. But it's how I'd bet.

Off such small pieces of information are efficient estimates of personality made.

Given enough data about the world, nobody is really a mystery.

Friday, May 9, 2014

Mail Order Brides - Applied Inference, High Stakes Edition

Those of us who enjoy collecting correlations as a hobby sometimes yearn for a higher stakes version of our craft, something like the Correlation Olympics. The premise would be simple - you're given a small amount of information about a person, and asked to infer as much stuff as you possibly can about them. Points would be given both for being right, and for the non-obviousness of the conclusion you drew.

The closest real-world equivalent would be getting a mail-order bride. The market for lemons being what it is, I do not anticipate that getting a mail order bride is likely to be a sensible decision on average. And it really is a market for lemons - there are almost certainly decent men and women on both sides that could have quite happy pseudo-arranged marriages, but the problem is the high risk of golddiggers (on the one side) and abusive creeps (on the other). The bad prospects drive out the good.

That said, I don't think the people who do it are all necessarily broken or crazy (though many of them probably are). The reason is that I would wager that the international dating market has a higher chance of mispricing than the domestic one. Like every market, the fewer the people are who are attempting to trade on perceived mispricing, the more likely mispricing is to exist. Then again, lots of people go broke buying penny stocks on the same rationale. Illiquid markets just say there might be mispricing, not that your personal hunches will be able to sniff it out.

But I still retain a perverse fascination with the idea of choosing a mail order bride. This would be somewhere between Russian (pun intended) Roulette and the World Series of Poker when it comes to correlation studies.

Think about it - in the extreme form, for each person you've got 5 photos and a one paragraph description, possibly written in broken English, and from that you have to decide on somebody to spend the rest of your life with. In other words, you have to extract every single drop of useful information out of what you're presented with. What are they wearing? What are they doing? Is there anyone else in the photo? What's their body language? Where were they taken? How many photos are they smiling in? You need to devise an entire assessment of a person's character from such tiny scraps, and then be willing to back it up with a marriage commitment.

If you get it wrong, financial and emotional misery await. If you get it right, you may have finally found a happy life partner and a way out of a previous lonely existence.

Talk about high stakes. For reasons I can't express well, the prospect of backing one's judgment to such an outrageous level seems both terrifying and thrilling at the same time.

Of course, one doesn't actually have to gamble one's life on the outcome to play a practice version - just go to one of the many sites and look at a few profiles, and decide which one you would pick if you had to make a choice, and why. Playing poker for matchsticks is not the same as playing for bearer bonds, but you probably don't want your first game of poker to be the latter.

Better study those correlations, son!

Saturday, September 14, 2013

Awesome

Steve Sailer links to this fantastic New Yorker comic:


Ouch! Please report to the burn unit of the hospital!

This hits so many outrageous buttons at once: 'incisively observing an unusual but true correlation', 'needless withering putdown of other people's dubious choices' and 'old school snobbishness' all in one.

I went through the list of people I knew with tattoos for P(Divorce | Tattoo), and it went 'Yep... Yep... Nope... Yep...'. Okay, what about the other direction, of the non-tattoo folks for P(Divorce | No Tattoo)? 'Nope... Nope... Nope... Yep... Nope... Nope...'

Day-amn.
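Taking the tallies above at face value (and they may well be a stylised rendering rather than an exact headcount), the two conditional probabilities work out as follows. The helper name `share_yes` is made up for this sketch.

```python
from fractions import Fraction

# Tallies exactly as recited above; ten acquaintances is a tiny sample.
tattoo = ["Yep", "Yep", "Nope", "Yep"]
no_tattoo = ["Nope", "Nope", "Nope", "Yep", "Nope", "Nope"]


def share_yes(answers):
    """Fraction of 'Yep' answers in a list, i.e. the empirical conditional probability."""
    return Fraction(answers.count("Yep"), len(answers))


p_divorce_given_tattoo = share_yes(tattoo)
p_divorce_given_no_tattoo = share_yes(no_tattoo)

print(f"P(Divorce | Tattoo)    = {p_divorce_given_tattoo} = {float(p_divorce_given_tattoo):.2f}")
print(f"P(Divorce | No Tattoo) = {p_divorce_given_no_tattoo} = {float(p_divorce_given_no_tattoo):.2f}")
```

That is 3 in 4 against 1 in 6. With only ten people in the sample, nobody should bet the house on the exact numbers, but the gap is not subtle.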

If you, like me, are not particularly enamored of the spreading of this social trend, there are far more eloquently reasoned and interesting critiques of tattooing (for instance, this great Theodore Dalrymple essay), but as Mr Mencken put it, one good horse laugh is worth ten thousand syllogisms.

As to why the underlying correlation exists, I think it works on two levels.

One is the treatment effect of traumatic parental events in a child's upbringing. Part of the appeal of tattoos (as far as I can tell) is the notion of their permanence - being able to inscribe something on yourself that will stay fixed, committing an idea or picture to permanent association with yourself. I can imagine that this desire is subconsciously more sought out by people for whom a significant event in their childhood was the disruption and dissolution of the home life they'd thought of as permanent.

The other is the likely heritability of time preference, and compulsive decision-making more generally. I can imagine that the kind of parent who enters into a rash marriage, or decides to have an affair with the secretary or mailman, will (through probably both genes and culture) result in a child who will think less about how the tattoo is going to look when they're 50 with wrinkled skin, or 26 and applying to the law firm.

Still, whatever the reason, I'm mentally filing this one away in the list of life's correlations to bear in mind when one needs to get all Last Psychiatrist in one's analysis of a person.

Monday, August 19, 2013

How to tell if a coffee shop serves good coffee, part 2...

Without drinking it, obviously.

This is continuing in the 'news you can use' category, among the trivialities that have been occupying my life of late while the events of the world pass me by.

I used to go with the smallest cup size offered by the cafe. There's a tendency among bad coffee shops to serve you up enormous bathtubs full of bilge water. Of course, to get a larger cup of coffee, they simply run the water through the same set of grounds until it turns into a burnt mess. The places that offer you a small sized coffee are more likely to know what they're doing.

But this was superseded by a tip from AL - the number of milk jugs on display. Good places will never heat their milk more than once. As a result, they tend to have a lot of small milk jugs around. If you see that, it's very likely somewhere that knows what they're doing. On the other hand, I've never had a good coffee from a place that had a single giant milk jug that kept being reheated.

If the place is failing the above signals and you still need a coffee, at a minimum order the smallest size you can.

(For the previous best signal, see here)

Monday, April 8, 2013

42

Apparently they're making a movie called '42'.

Knowing only this much about the movie, it's a useful way to segment people into a couple of groups based on the first association that comes to your mind from that number, either

a) Ah, that's Jackie Robinson's old number.

b) That movie must be about the answer to life, the universe and everything.

If I ranked people according to the chances that I'd find them interesting based on whether they thought of various combinations of a) and b), it would probably go:

1. b) only

2. Both a) and b)

3. Neither a) nor b)

4. a) only.

Your mileage may vary, but if you're reading this blog, I'd wager that it probably won't vary much.

Monday, January 28, 2013

Segregation Lives On

Not forced segregation, mind you. Like so many reactionary ideas (some of which were good, some of which, like this one, were not) it's gone and not coming back. You can measure how much it's not coming back by the infinitesimal number of Americans who would rate its absence as anything other than a clear indication of social progress.

So people like the idea that the government no longer forces people to segregate by race. So far, so good - the government certainly has no business enforcing such a policy.

People will also tell you that they don't like the idea of segregation in and of itself, even if it's not being imposed by government fiat. That, too, is a perfectly defensible and reasonable position.

But what's all the more puzzling is that notwithstanding the large number of Americans who would express such an opinion, geographically America is incredibly segregated by race. And nobody seems much bothered by it, as long as they don't have to talk about it.

Don't believe me? Check out this fascinating New York Times website that lets you visualise the demographic breakdown of each area.

Here's Chicago.


The green dots are white people, the yellow dots are Hispanics, the blue dots are black people, and the red dots are Asians.

Amazing, no? There are some parts where there's a gradual gradient across racial lines, but others where it's an incredibly sharp division.

Some of this can be explained as an effect of sorting on income. But if you look at the sharp divides between some of the black and Hispanic areas, it's hard to see much in the way of economic difference between them. Compare say zip code 60604 (94.8% black, median household income $26,930) with, say, zip code 60623 (62.9% Hispanic, median household income $28,203) or zip code 60608 (62.7% Hispanic, median household income $28,026) and it's hard to explain this as a rich area/poor area phenomenon.

This isn't just a Chicago thing, either. Go here and type in 'New York', 'Cleveland', 'St Louis', 'Los Angeles' or 'Las Vegas'. Everywhere you go, it's there.

So if this isn't an income thing, and it isn't a legislatively coerced thing (and I imagine it's not a 'provision of government services' thing), then what exactly is it? Do people actually just prefer to live around people of the same race, all other things equal? If you find the idea uncomfortable, don't blame me, I didn't make the city of Chicago look like that. Neither did the government. Millions of individuals, freely choosing where to live, created the map above.

It's certainly not a pleasant hypothesis. But honestly, if you look at the map, do you have a better explanation?

Thursday, December 6, 2012

Things you can infer about 'Songs of Love'

I always enjoy when someone's choice of words reveals things about them that they almost certainly didn't intend to convey.

A great example of this can be found in the wonderful Ben Folds song, 'Songs of Love'.

Let me pose the challenge in advance to you. Where was Ben Folds when he was inspired to write the song?

I've put a copy of the video below. To make sure you focus on the important part of the lyrics, I've written down the first two verses. Read through them, and see if you can infer what I inferred.

Pale pubescent beasts,
Roam through the streets,
And coffee shops.
Their prey gather in herds,
Of stiff knee-length skirts,
And white ankle socks.
But while they search for a mate
My type hibernate,
In bedrooms above,
Composing their songs of love.

Young, uniform minds
In uniform lives,
And uniform ties,
Run round, with trousers on fire
and signs of desire they cannot disguise,
While I try to find words,
As light as the birds,
That circle above,
To put in my songs of love.

The song is here:




In case you want to guess, the answer is below the fold (no pun intended):

Monday, November 19, 2012

Predicting if someone is Brazilian by how they speak English

One of my minor hobbies is trying to guess where people were born based on small details about them.

A fun way of doing this is with language. When people speak English (or any other language), they often subconsciously import assumptions about pronouncing words from their original tongue. Certain sounds will get pronounced in ways that sound slightly odd to a native English speaker, but are often correlated among people who grew up speaking a particular tongue, or from a particular region. The great OKH informed me that the study of this area is called 'phonotactics', so you might call me an amateur phonotactician.

The latest one I came across is a diagnostic for Brazilians. Like all linguistic tics, it's not universal, but it's reasonably predictive - it's neither necessary nor sufficient, but it's closer to being sufficient than it is to being necessary. It's the following:

With past tense verbs (e.g. words that end in 'ed'), they will sometimes pronounce the 'ed' as a hard sound.

So, for instance, the word 'combined', they'll sometimes pronounce as 'combine-ed', with the last sound being pronounced as in the start of 'education'.

I noticed this first in two Brazilians that I know, and confirmed it out of sample this weekend with another guy - he had dark brown hair and pale-ish skin with an accent that I couldn't easily place when I heard him giving a talk. He did the hard 'ed' sound in a talk, so I googled him and sure enough he was from Brazil.
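To make "closer to being sufficient than necessary" concrete, here is a minimal sketch using counts that are entirely made up for illustration (they are not data from anything above). The function name `diagnostic_rates` and its parameters are hypothetical. It compares how often the tic picks out a Brazilian with how often a Brazilian actually shows the tic.

```python
# Hypothetical counts, invented purely for illustration (not observed data).
def diagnostic_rates(tic_and_brazilian, tic_not_brazilian, no_tic_brazilian):
    """Return (P(Brazilian | tic), P(tic | Brazilian)) from simple counts."""
    # "Sufficiency": of the people who use the hard 'ed', how many are Brazilian?
    p_brazilian_given_tic = tic_and_brazilian / (tic_and_brazilian + tic_not_brazilian)
    # "Necessity": of the Brazilians, how many use the hard 'ed'?
    p_tic_given_brazilian = tic_and_brazilian / (tic_and_brazilian + no_tic_brazilian)
    return p_brazilian_given_tic, p_tic_given_brazilian


# Assumed sample: 10 speakers with the tic (8 of them Brazilian),
# plus 32 Brazilians who never use it.
sufficiency, necessity = diagnostic_rates(8, 2, 32)
print(f"P(Brazilian | hard 'ed') = {sufficiency:.2f}")
print(f"P(hard 'ed' | Brazilian) = {necessity:.2f}")
```

With numbers like these, hearing the tic is a strong hint that the speaker is Brazilian, while not hearing it tells you very little, since most Brazilians in the assumed sample never do it.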

The previous one (which I noted in the comments here, but which deserves its own post) is the following:

A strong diagnostic for Turkish people speaking English is that with words that end in a hard 'r', they sometimes combine the 'r' with a 'zh' afterwards (think as in Dr Zhivago, or 'Jean-Claude' in the French pronunciation). So the word 'cover', they'll pronounce almost like 'coverj', if that makes sense. They won't do it all the time, so you often have to listen for a while before they'll do it. It's not uniquely Turkish - I've also come across it in one or two Eastern European groups, although I forget which. But it's a pretty strong predictor.

I've confirmed this across a few people, but I'll report to you soon an out of sample test - I heard my tailor say it the other day when I took in a suit to get adjusted. I'm going to ask him when I return, and we'll see if I'm right.

[Update]: Confirmed - he is indeed Turkish.

Correlations, baby. Though you throw them out with a pitchfork, yet they return.

Monday, February 20, 2012

Random correlations from a weekend in San Francisco

-Since it has been established by rigorous analysis that McDonalds restaurants tend to be the most profitable places at airports, it seems vanishingly unlikely that their absence in any first world airport is due to lack of demand. Hence a leading indicator of maddening nanny-state-ism gone mad is when airport terminals lack any fast food. This is because some pinhead bureaucrat or politician decided that it would be too low brow, or too unhealthy, or too commercial, or [insert modish condescending reason here]. True to form, the worst examples of this are the American Airlines terminal in SFO, and the international terminal in Heathrow. I leave the reader to their own conclusions.

-A bizarrely strong indicator that you're in a tourist trap area is the presence of a Bubba Gump Shrimp store. They always manage to find the area in any city where the worst rubberneck tourists congregate, and plonk their store down there. Pier 39 at Fisherman's Wharf in San Francisco, Navy Pier in Chicago, Times Square in New York, Cancun in Mexico, Mall of America in Minnesota, Santa Monica Pier in LA... I challenge you to look at their location map and find me an exception to this rule. Shylock's tip - if you see a Bubba Gump store, leave the area you're in straight away.

-Notwithstanding this grousing, San Francisco is a very fun city. Great Chinese food, very walkable, pretty architecture, and even the hordes of weirdo hippies lend a colourful charm when one is only there for the weekend.

Friday, January 27, 2012

Insight of the Day That I Was Most Pleased With

I was listening to a talk by this Greek girl today.

I was speaking to The Greek afterwards, and asked him the following: "Hey, does the Greek language have any words that end in either 't' or 'p'?"

Sure enough, it doesn't. Which I knew it wouldn't.

How did I know this?

Listening to the girl talk, there were certain words where she would add half an extra vowel at the end, particularly words that ended in 't' or 'p'. So the word 'treatment' became something almost like 'treatmenta' and 'group' became 'groupa'. Not with a strong emphasis on the 'a' at the end, but noticeable.

My hunch, which it seems was right, is that this came from the fact that she wasn't used to words ending in 't' and 'p' - she was used to a vowel at the end after these letters. And this was so subconscious that she was adding it in slightly in English, even though it wasn't there. This would only seem to work if words ending in these letters were completely absent.

Bam! It makes you look like Sherlock (not Shylock) Holmes when you can spot these kinds of obscure connections.

There are few things as satisfying as correctly identifying something random about the world based on correlations that most people aren't paying attention to.