Saturday, 13 August 2011

Riots in England: Conflating Correlation with Causation

It is an elementary point in statistics that it is not wise to infer causation from correlation. While it might appear that two variables have a causal relationship, it is possible that they do not. An example commonly used to illustrate this point is:

As ice cream sales increase, so does the rate of deaths from drowning.
Therefore ice cream causes drowning.

This reasoning ignores the confounding variable, i.e. good weather, which actually explains both the increase in ice cream sales and the increase in the rate of deaths from drowning. This is a simple example that is used to make a point.
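The ice cream example can be made concrete with a small simulation. What follows is only a sketch under made-up assumptions: the variable names, the noise levels, and the idea that "temperature" feeds equally into both outcomes are all invented for illustration, and no real data is involved. It generates two series that share a common cause but have no effect on each other, then compares their raw correlation with the correlation left over once the common cause is regressed out.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n_days=2000, seed=0):
    """Hypothetical data: temperature drives both series; they never interact."""
    rng = random.Random(seed)
    temperature = [rng.gauss(0, 1) for _ in range(n_days)]
    ice_cream = [t + rng.gauss(0, 1) for t in temperature]   # assumed relationship
    drownings = [t + rng.gauss(0, 1) for t in temperature]   # assumed relationship
    return temperature, ice_cream, drownings


temperature, ice_cream, drownings = simulate()

# Raw correlation between the two outcomes, ignoring the weather.
raw = pearson(ice_cream, drownings)

# Correlation once temperature has been regressed out of both series.
adjusted = pearson(
    residuals(ice_cream, temperature), residuals(drownings, temperature)
)

print(f"raw correlation:      {raw:.2f}")
print(f"adjusted correlation: {adjusted:.2f}")
```

Under these assumptions the raw correlation comes out clearly positive, while the adjusted one sits close to zero, even though ice cream never influences drowning anywhere in the code.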

After the recent riots in London and other English cities, most on the left are attributing the events to "deprivation". Some have mapped the location of the riots on top of deprivation data (only including London, for some reason), observed a pattern, and decided that it's case closed. Not so fast! What pattern would be observed if we were to look at the relationship between other variables and rioting areas? I'm going to use population density and proportion of the population that is non-white here, because they are variables that the left will not want to consider.

Table 1 depicts the 40 highest ranking Local Authorities for each of the following measures, separately: average LSOA score on Indices of Deprivation 2010, population density, and proportion of population that is non-white. The Local Authorities in bold font are those for which I was able to find evidence of rioting having taken place. The table quite clearly indicates that if someone wanted to pick a variable and claim it as an explanation, they'd choose population density or proportion of population that is non-white, and not deprivation. Not only do more of the LAs where rioting took place rank highly on these variables, they seemingly account for those areas which were considered surprising from the deprivation perspective. For example, while Croydon and Ealing are right up there on these measures, they're not high on the deprivation index (107 and 80, respectively).
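The kind of tally behind a table like this can be sketched in a few lines. This is an illustration only: the function name `top_n_overlap`, the area labels and every score below are placeholders I have made up, not the figures from Table 1. For a given measure, it counts how many of the top-ranked areas also appear in the set of areas where rioting was recorded.

```python
def top_n_overlap(scores, rioted, n):
    """Count how many of the n highest-scoring areas are in the rioted set."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sum(1 for area in ranked[:n] if area in rioted)


# Placeholder data, invented purely to show the mechanics.
deprivation = {"A": 40, "B": 35, "C": 30, "D": 20, "E": 15, "F": 10}
density = {"A": 85, "B": 20, "C": 90, "D": 80, "E": 10, "F": 30}
rioted = {"A", "C", "D"}

for name, scores in [("deprivation", deprivation), ("density", density)]:
    hits = top_n_overlap(scores, rioted, n=3)
    print(f"{name}: {hits} of the top 3 areas saw rioting")
```

A higher count for one measure only says that its ranking overlaps more with the riot areas. Like the maps, it says nothing about cause.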

Of course, the left wouldn't even entertain the idea that these variables might have some explanatory value. They dismiss them out of hand, as they're not interested in anything which would lead to arguments against over-population or immigration. They are ideologically obsessed with equality, and they will always perceive events in this context. That they're so impressed by misleading maps that simply plot one variable against another demonstrates that they're not at all practiced in arguing from evidence. Take this entry on the Greek Left Review blog as an example.

In summary, the purpose of this post is to highlight that inferring causation from correlation is not a sensible practice and that variables other than deprivation can be used in a similarly simplistic fashion to 'prove' entirely different arguments.


  1. >>Of course, the left wouldn't even entertain the idea that these variables might have some explanatory value.

    As someone with strong left wing views I dispute your assertion. Perhaps having high population density does predispose a population towards rioting. It certainly appears more common in urban than rural areas.

    I dispute the notion that the left shies away from evidence it dislikes. My view is always that we should be guided by what the best available evidence states.

    Also, you have constructed a straw man argument at the start of this blog. You argue that people sometimes make fallacious arguments. Your example is a correlation between drowning and ice cream sales, although other humorous examples exist, e.g. there is a correlation between the number of pirates declining and global warming.

    However, crucially you haven't established that the link between deprivation and rioting is a false one. You have only managed to establish that further statistical analysis would be needed to prove it.

  2. See here for evidence casting doubt on the deprivation hypothesis:

    The purpose of this post is simply to demonstrate that correlation is not necessarily causation. As I've shown, simplistic observations of correlation show that population density and proportion of population that is non-white can be used to 'prove' different arguments. So, someone else can argue that it's evidence that we really need to decrease the population size for a stable society, and someone else can argue that the riots are evidence that racial diversity destabilises society.

    It isn't up to me to disprove a relationship. It is up to the person claiming a relationship to provide strong statistical evidence that it has some basis in reality. As I've argued here and on the Guardian article, simply looking at data and maps doesn't do that. It's amateurish.

    In the field of epidemiology, for example, a great deal of effort is invested in identifying causal relationships. They wouldn't just bung some data on a map, observe a pattern, publish a document on public health policy and then say "prove me wrong!".