Saturday, April 7, 2007

Causation and correlation

I thought I would follow up my post about causation with this article about correlation. It's well worth a read. Merely seeing that two things happen together doesn't mean that one is caused by the other, and assuming that it does can lead to disastrous consequences.

A woman in Holland is spending time in prison because several people died while she was working as a nurse. The prosecution claims that this is just too unlikely to be chance and that she *must* be a serial killer, and it's not clear that they have any real evidence against her beyond the improbability of the event. It was an unlikely event, but given the number of hospitals in Holland, one would expect at least one of them to see an unlikely event like this. Ben Goldacre gives a good treatment of the issue in Losing the Lottery:

Meanwhile, a huge amount of corollary statistical information was almost completely ignored. In the three years before Lucia worked on the ward in question, there were 7 deaths. In the three years that Lucia did work on that ward, there were 6 deaths. It seems odd that the death rate should go down on a ward at the precise moment that a serial killer – on a killing spree – arrives on the scene. In fact, if Lucia killed them all, then there must have been no natural deaths on that ward at all, in the 3 years that she worked there.
It seems very likely this is an innocent woman...
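
The point about the number of hospitals can be made concrete with a little arithmetic. This is only a rough sketch: the function name and both input numbers are made up for illustration (they are not figures from the case or from Goldacre's article), and it assumes each ward is independent of the others.

```python
def prob_at_least_one(p_single, n_trials):
    """Chance that an event with probability p_single happens
    at least once across n_trials independent trials."""
    # P(at least one) = 1 - P(none), and P(none) = (1 - p) ** n
    return 1 - (1 - p_single) ** n_trials


# Hypothetical inputs, chosen only to illustrate the argument:
# a cluster of deaths that has a 1-in-10,000 chance on any given ward,
# looked for across 10,000 wards.
p_cluster = 1 / 10_000
n_wards = 10_000

print(f"Chance on one particular ward: {p_cluster:.4f}")
print(f"Chance on at least one ward:   {prob_at_least_one(p_cluster, n_wards):.2f}")
```

With these invented inputs the second number comes out at about 0.63. Something that looks nearly impossible for one particular nurse becomes more likely than not to happen to some nurse somewhere, which is why the improbability on its own says very little about guilt.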


