A Bayesian analysis of the case of Lucia de B.

de Vos, A. F. (2004).

Door statistici veroordeeld? Nederlands Juristenblad, 13, 686-688.

Here, the result of Google-translate by RD Gill

Would having posterior thoughts

Not be offending the gods?

Only the dinosaur

Had them before

Recall its fate! Revise your odds!

(made for a limerick competition at a Bayesian congress).

The following article was the basis for two full-page articles on Saturday, March 13: one in the science supplement of the NRC (unfortunately with disruptive typos in the final calculation) and one in “the Forum” of Trouw (with the predictable announcement on the front page that I claimed the chance that Lucy de B. was wrongly convicted to be 80%, which is not the case).

Condemned by statisticians?

Aart F. de Vos

Lucy de B. has been sentenced to life imprisonment. Statistical arguments played a role in this, although the media overestimated their influence. Many people died around the times that she was on duty. By chance? The statistician consulted, Henk Elffers, repeated during the current appeal his earlier statement that the probability was 1 in 342 million. I quote from the article “Statisticians do not believe in coincidence” in the Hague newspaper of January 30th: “The probability that nine fatal incidents took place in the JKZ during the shifts of the accused purely by chance is nil. (…) It wasn’t a coincidence. I don’t know what it was. As a statistician I can’t say anything about it. Weighing the evidence is up to you.” The rest of the report showed that the judge had great difficulty with this answer, and that the difficulty was never really resolved.

Many witnesses were then heard, who spoke about circumstances, plausibility, oddities, improbabilities and undeniably strong connections. The court must combine all of this and arrive at a wise final judgment. A heavy task, certainly given that the legal conceptual system contains very many elements that have to do with probabilities, yet does not call for quantification or probability calculus when combining them.

The crucial question is of course: how probable is it that Lucy de B. committed murders? Most laypeople will think that Elffers has answered that question, and that her guilt is practically certain.

This is a misunderstanding. Elffers did not answer that question. Elffers is a classical statistician, and classical statisticians do not make statements about what is going on, but only about how unlikely things are if nothing were going on. However, there is another branch of statistics: the Bayesian. I belong to that other camp. And I have done some calculating too, with the following bewildering result:

If the information that Elffers used to reach his 1 in 342 million were the only information on which Lucy de B. was convicted, I think that, based on a fairly superficial analysis, there would be about an 80% chance that the conviction was wrong.

This article is about this great contrast. It is not an indictment of Elffers, who was extremely modest in court when interpreting his outcome, nor a plea to acquit Lucy de B., because the court relies mainly on other arguments, albeit without explicit statements of probability, even though there is no such thing as absolute certainty. It is a plea to study Bayesian statistics seriously in the Netherlands, and this applies to both mathematicians and lawyers.

There is some similarity to the case of Sally Clark, who was sentenced to life imprisonment in 1999 in England after two of her sons died shortly after birth. A wonderful analysis can be found in the September 2002 issue of “living mathematics”, an internet magazine (http://plus.maths.org/issue21/features/clark/index.html).

An expert (not a statistician, but a doctor) stated that the chance that such a thing happened “by accident” in the given circumstances was 1 in 73 million. I quote: “probably the most infamous statistical statement ever made in a British courtroom (..) wrong, irrelevant, biased and totally misleading.” That statement is torn to shreds in the article mentioned, which includes a reference to a Bayesian analysis and a calculation putting the probability that she was wrongly convicted above 2/3. In this case the expert’s statement was wrong on all counts, so that half the nation fell upon him and Sally Clark was released, though only after four years. The case of Lucy de B., however, is infinitely more complicated. Elffers’ statement is, I will argue, not wrong, but it is misleading. Moreover, the Netherlands has no jury trials but judges, and even though their verdicts are not directly based on extensive knowledge of probability theory, they are much more carefully weighed. That does not alter the fact that the Lucy de B. and Sally Clark cases have an element in common.

Bayesian statistics

My calculations are therefore based on an alternative kind of statistics, the Bayesian, named after Thomas Bayes, the first to write about “inverse probabilities”. That was in 1763. His discovery did not become really important until after 1960, mainly through the work of Leonard Savage, who proved that when you decide under uncertainty you cannot avoid the question of what probabilities the possible states of the world have (in our case the states “guilty” and “not guilty”). Bayes taught how you can learn about such probabilities from data. Scholars agree on the form of those calculations, which is pure probability theory. However, there is one problem: you have to think about what probabilities you would have given to the states before you saw your data (the prior). Often these are subjective probabilities. And if you have little data, the impact of those subjective probabilities on your final judgment is great. That is a reason for many classical statisticians to oppose this approach, certainly in the Netherlands, where statistics is practised mainly by mathematicians, people who are trained to solve problems without wondering what they have to do with reality. After decades of fanatical struggle over the foundations (see my piece “the religious war of statisticians” at http://staff.feweb.vu.nl/avos/default.htm) the parties have come closer together. With one exception: the classical test. Bayesians have fundamental objections to classical tests. And Elffers’ statement takes the form of a classical test. This is where the foundational debate is concentrated.

The Lucy Clog case

Following Elffers, who explained his method of calculation in the Nederlands Juristenblad using a fictional case “Klompsma” [“klomp” is the Dutch word for “clog”; the suffix “-sma” indicates a person from the province of Groningen. This is all rather insulting. RDG] (which I also recalculated, arriving at totally different conclusions), I want to discuss the fictional case of Lucy Clog. Lucy Clog is a nurse who has experienced 11 deaths in a period in which on average only one occurs, but where no further evidence can be found. In this case too, Elffers would report an extremely small probability of coincidence in court, about 1 in 100 million. This is the case for which I claim that a conviction, given my information and my estimates of the context, has a chance of about 80% of being wrong.

This requires some calculations. Some of them are complicated, but the most important aspect is not too difficult, although many people turn out to struggle with it. A simple example may make this key point clear.

You are at a party and a stranger starts telling you a whole story about the probability that Lucy de B. is guilty, calculating away merrily as he goes. What do you think: is this a lawyer or a mathematician? If you say a mathematician, because lawyers usually cannot calculate that well, then you fall into a classic trap. You think: a mathematician can calculate well and the probability that a lawyer can calculate well is 10%, so it must be a mathematician. What you forget is that there are 100 times more lawyers than mathematicians. So even if only 10% of lawyers can tell such a story, there are still 10 times as many of them as there are mathematicians. So, under these assumptions, the probability is 10/11 that it is a lawyer. To which I must add that (I think) 75% of mathematicians are male against 40% of lawyers, which I did not take into account. Had the story said “she” instead of “he”, that would have made a difference.

The same mistake, forgetting the context (more lawyers than mathematicians), can be made in the case of Lucy de B. The probability that you are dealing with a murderous nurse is a priori (before you know what is going on) very much smaller than the probability that you are dealing with an innocent one. You have to weigh that against the fact that the probability of 11 deaths is many times greater in the case of “murderous” than in the case of “innocent”.

The Bayesian way of performing the calculations in such cases also turns out not to be intuitively easy to grasp. Looking back at the example of the party, though, it may not be that hard.

The calculation is not done in terms of probabilities but in terms of “odds”, an untranslatable word that has no currency in the Netherlands. Odds of 3 to 7 mean a probability of 3/10 that something is true and 7/10 that it is not. The English understand this better thanks to horse racing: you win 7 if you are right and lose 3 if you are wrong. Probabilities and odds are two ways of describing the same thing. Another example: odds of 2 to 10 correspond to a probability of 2/12.
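The conversion between odds and probabilities can be sketched in a few lines. This is only an illustration; the function names are invented here and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def odds_to_probability(in_favour, against):
    """Odds of `in_favour` to `against`, expressed as a probability."""
    return Fraction(in_favour, in_favour + against)

def probability_to_odds(p):
    """A probability expressed as one odds ratio (in favour / against)."""
    p = Fraction(p)
    return p / (1 - p)

print(odds_to_probability(3, 7))             # 3/10
print(odds_to_probability(2, 10))            # 1/6, the same as 2/12
print(probability_to_odds(Fraction(3, 10)))  # 3/7
```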

You need two elements for a simple Bayesian calculation: the prior odds and the plausibility ratio (known in the English literature as the likelihood ratio). In the example, the prior odds of mathematician against lawyer are 1 to 100. The plausibility ratio is the probability that a mathematician starts calculating (100%) against the probability that a lawyer does so (10%). So 10 to 1. Bayes’ theorem now says that you must multiply the prior odds (1:100) by the plausibility ratio (10:1) to get the posterior odds (1:10), corresponding to a probability of 1/11 that it is a mathematician and 10/11 that it is a lawyer. Precisely as calculated before. The posterior odds are what you can say after the data are known, the prior odds what you could say before. And the plausibility ratio is the way you learn from data.
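As a check on the party example, the multiplication rule can be written out as a minimal sketch. The numbers are the ones given in the text; the function and variable names are made up for this illustration.

```python
from fractions import Fraction

def posterior_odds(prior_odds, plausibility_ratio):
    """Bayes' theorem in odds form: posterior = prior x plausibility ratio."""
    return prior_odds * plausibility_ratio

# Odds are "mathematician : lawyer", stored as one fraction.
prior = Fraction(1, 100)    # 100 times more lawyers than mathematicians
ratio = Fraction(100, 10)   # 100% of mathematicians calculate, 10% of lawyers

post = posterior_odds(prior, ratio)
p_mathematician = post / (1 + post)
p_lawyer = 1 - p_mathematician

print(post)             # 1/10
print(p_mathematician)  # 1/11
print(p_lawyer)         # 10/11
```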

Back to the Lucy Clog case. If the probability of 11 deaths is 1 in 100 million when Lucy Clog is innocent, and 1/2 when she is guilty (more about that later), then the plausibility ratio of innocent against guilty is 1 to 50 million. But to calculate the probability of guilt, you need the prior odds. They follow from the probability that a random nurse will commit murders. I estimate that at 1 to 400,000. There are forty thousand nurses in hospitals in the Netherlands, so that would mean a murderous nurse once every 10 years. I hope that is an overestimate.

Bayes’ theorem now says that the posterior odds of “innocent” in the event of 11 deaths would be 400,000 to 50 million. That is 8 to 1000, a small probability, maybe small enough to convict someone. Yet large enough to want to know more. And there is much more worth knowing.

It is strange that nobody noticed anything. It is even stranger when further investigation yields no evidence of murder. If you think that there would be an 80% chance of finding clues in the event of many murders, against of course 0% if it is a coincidence, then the plausibility ratio of the fact “nothing has been found” is 100 to 20 in favour of innocence. Applying the rule shows that we now have odds of 40 to 1000, so a probability of innocence of just under 4%. Conviction now becomes really questionable. And if the suspect continues to deny, which is more plausible when she is innocent than when she is guilty, say twice as plausible, the odds become 80 to 1000, a probability of almost 8%.
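The chain of updates for Lucy Clog can be traced step by step. The following is a minimal sketch using only the estimates stated in the text. For the denial only the ratio (twice as plausible under innocence) is given, so the two separate probabilities used for it below are an assumption chosen to produce that ratio.

```python
from fractions import Fraction

# Odds are "innocent : guilty", stored as one fraction.
prior_odds = Fraction(400_000, 1)

# Each fact: (P(fact | innocent), P(fact | guilty)).
facts = [
    ("11 deaths",     Fraction(1, 100_000_000), Fraction(1, 2)),
    ("no clues",      Fraction(1),              Fraction(1, 5)),
    # Assumed split; the text only states the 2-to-1 ratio.
    ("keeps denying", Fraction(1),              Fraction(1, 2)),
]

odds = prior_odds
steps = []
for name, p_innocent, p_guilty in facts:
    odds *= p_innocent / p_guilty          # multiply by the plausibility ratio
    steps.append((name, odds))
    p = odds / (1 + odds)
    print(f"after {name}: odds {odds}, P(innocent) = {float(p):.3f}")

# The odds run 8:1000, then 40:1000, then 80:1000.
```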

By way of explanation, here is a picture that requires less calculation (but says the same thing). It follows from the assumptions that in 20,000 years there will be 1008 cases of a nurse with 11 deaths: 1,000 guilty and 8 innocent. Clues are found for 800 of the guilty, and 100 of the remaining 200 confess. That leaves 100 guilty and 8 innocent.
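The same picture can be reproduced by counting cases. This is a sketch under one assumption that the text does not spell out: the probabilities are read as applying per nurse per year, which is the reading under which the 40,000 nurses and 20,000 years give the stated totals.

```python
from fractions import Fraction

nurses = 40_000
years = 20_000
nurse_years = nurses * years   # assumption: probabilities are per nurse-year

p_murderous = Fraction(1, 400_000)
p_11_if_guilty = Fraction(1, 2)
p_11_if_innocent = Fraction(1, 100_000_000)

guilty = nurse_years * p_murderous * p_11_if_guilty
# The negligible factor (1 - p_murderous) is ignored, as in the text.
innocent = nurse_years * p_11_if_innocent

guilty_no_clues = guilty * Fraction(1, 5)          # clues found in 80% of cases
guilty_denying = guilty_no_clues * Fraction(1, 2)  # half of those confess

share_innocent = innocent / (innocent + guilty_denying)

print(guilty, innocent)         # 1000 8
print(guilty_denying)           # 100
print(float(share_innocent))    # about 0.074
```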

So Lucy Clog must be acquitted. And then I have not even mentioned my doubts about the probability of 1 in 100 million that 11 people die “by chance”. This probability would come out many times higher in any Bayesian analysis. A Bayesian analysis can include uncertainties, for example about the comparability of circumstances and the qualities of nurses. And uncertainties increase the probability of extreme events enormously; the literature contains many interesting examples. I estimate, based on experience, that if I had access to the data that Elffers used, I would arrive not at 1 in 100 million but at 1 in 2 million. At least, I assume that for the time being; it would not surprise me if it were much higher. Preliminary calculations show that it could even be as high as 1 in 100,000. But 1 in 2 million already differs by a factor of 50 from 1 in 100 million, and my odds would then be not 80 to 1000 but 4000 to 1000, so 4 to 1. An 80% chance of a wrongful conviction. This is the 80% probability of innocence that I mentioned at the beginning. Unfortunately, it is not possible within the framework of this article to explain the factor of 50 in this last step (or a factor of 1000 if the 1 in 100,000 turns out to be correct) without lapsing into mathematics.
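How strongly the conclusion depends on the probability of 11 deaths “by chance” can be seen by rerunning the same multiplication with the three values mentioned. This is a rough sketch; the function name is invented for the illustration and every other estimate is held fixed at the value used earlier in the text.

```python
from fractions import Fraction

def odds_of_innocence(p_chance):
    """Posterior odds 'innocent : guilty' for a given chance probability."""
    prior = Fraction(400_000)                     # innocent : guilty beforehand
    deaths = Fraction(p_chance) / Fraction(1, 2)  # ratio for 11 deaths
    no_clues = Fraction(100, 20)                  # ratio for "nothing found"
    denial = Fraction(2)                          # ratio for continued denial
    return prior * deaths * no_clues * denial

results = {}
for one_in in (100_000_000, 2_000_000, 100_000):
    odds = odds_of_innocence(Fraction(1, one_in))
    results[one_in] = odds
    p = odds / (1 + odds)
    print(f"1 in {one_in:,}: odds {odds}, P(innocent) = {float(p):.3f}")

# 1 in 100 million gives 80:1000, 1 in 2 million gives 4:1 (80%),
# and 1 in 100,000 gives 80:1.
```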

What I hope has become clear is that you can always add information. “Nothing could be found” and “has not confessed” are new facts that change the probability. And perhaps there are countless more facts to add. In the case of Lucy de B., facts of that kind exist. In the hypothetical case of Lucy Clog, they do not.

The fact that you can always add information is the most beautiful aspect of a Bayesian analysis. From prior odds you arrive, through data (11 deaths), at posterior odds, and these in turn serve as prior odds for the next steps: no clues and no confession. Virtually all further facts that emerge in a court case can be incorporated into the analysis in this way. Any fact that has a different plausibility under the guilty hypothesis than under the innocent hypothesis contributes. You may have noticed that I used only probabilities of what actually happened, never probabilities of what could have happened. A classical test always speaks of the probability of 11 or more deaths. That “or more” is irrelevant and misleading according to Bayesians. Incidentally, it is not necessarily easier to talk only about what happened. What is the probability of exactly 11 deaths if Lucy Clog is guilty? The degree of murderousness, something about which there is a lot of uncertainty, determines how many deaths there are, but if you are fired after 11 deaths, you are deprived of the opportunity to commit more. And that last fact matters for the odds. I simply put down 50% there; that is off by at most a factor of 2.
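The difference between “exactly 11” and “11 or more” can be made concrete for the innocent hypothesis. The sketch below assumes, purely for illustration, that the number of deaths follows a Poisson distribution with a mean of one. The text does not say which model Elffers used, so this is not a reconstruction of his calculation.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson model."""
    return math.exp(-mean) * mean**k / math.factorial(k)

mean = 1.0  # assumed: on average one death in the period

p_exactly_11 = poisson_pmf(11, mean)
# Sum the upper tail directly; terms beyond k = 60 are negligible here.
p_11_or_more = sum(poisson_pmf(k, mean) for k in range(11, 61))

print(f"exactly 11: 1 in {1 / p_exactly_11:,.0f}")
print(f"11 or more: 1 in {1 / p_11_or_more:,.0f}")
```

Under this assumed model the “11 or more” figure comes out at roughly 1 in 100 million, the order of magnitude quoted for the Lucy Clog case.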

It may be clear that it is not really easy to arrive at conclusions when there is no convincing evidence. The most famous example, on which many Bayesians have done calculations, is a murder in California in 1956, committed by a black man with a white woman in a yellow Cadillac. A couple matching this description was taken to court, and many statistical analyses followed. I have done many calculations on this example myself, and have experienced how difficult, but also how surprising and satisfying, it is to keep adding new elements.

An entire book has even been devoted to another famous case: “A Probabilistic Analysis of the Sacco and Vanzetti Evidence,” published in 1996 by Jay Kadane, professor at Carnegie Mellon and one of the most prominent Bayesians. Anyone who wants to know more can consult the CV on his website, http://lib.stat.cmu.edu/~kadane. In the field of “Statistics and the Law” alone he has more than thirty publications to his name, alongside hundreds of other articles. This is by now a well-developed field in America.

Conclusion?

I have thought for a long time about what the conclusion of this story is, and I have had to revise my opinion several times. The perhaps surprising conclusion is: the actions of all parties are not that bad; only their rationalization is, to put it mildly, a bit strange. Elffers makes strange calculations but formulates his conclusions in court in such a way that it becomes intuitively clear that he is not giving the answer the court is looking for. The judge makes pronouncements that sound like statements of probability but of which I can make neither head nor tail. Yet when I see what actually happens, I get the feeling that it is much closer to what is optimal than I would have thought possible, given the absurd rationalizations. The explanation is simple: actions result from a process shaped by evolution; justifications are stuck on afterwards and are based on education. In my opinion, the Bayesian method is the only way to bring actions and rationalization into balance when deciding under uncertainty. And that can be very fruitful. But the gain is initially much smaller than people think. What the court does in the Lucy de B. case is surprisingly rational. The 11 deaths are not convincing in themselves, but they are enough to change the prior odds of 1 in 40,000 into odds of 16 to 5, in short, an order of magnitude at which it is necessary to gather additional information before judging. That is exactly what the court does.

When I made my calculations, I thought at times: I have to go to the court. In the end I sent in the article, but I heard nothing more about it. It turned out that the defence had called a witness who severely criticized Elffers’ calculations, without, however, presenting the solution.

Maybe I will some day have the opportunity to work through the Lucy de B. case in full. That could provide new insights. But it is quite a job. In this case there is much more information than is used here, such as traces of poison in patients. Here too, a Bayesian analysis that takes all the uncertainties into account is likely to show that statements by experts who say something like “it is impossible that there is any explanation other than the administration of poison by Lucy de B.” should be taken with a grain of salt. Experts are usually people who overestimate their certainties. On the other hand, incriminating information can also accumulate. Ten independent facts that are each twice as likely under the guilt hypothesis change the odds by a factor of about 1000. And if it turns out that the traces of poison in five deceased patients are each nine times as likely to result from Lucy de B.’s “murderousness” as from other explanations, that yields a factor of nine to the fifth, just under 60,000. And so on.
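The way independent pieces of evidence accumulate is a plain multiplication of their plausibility ratios. A small sketch of the two examples just mentioned, with variable names invented for the illustration:

```python
import math

# Ten independent facts, each twice as likely under guilt.
ten_facts = [2] * 10
# Five toxicological findings, each nine times as likely under guilt.
five_findings = [9] * 5

factor_ten_facts = math.prod(ten_facts)
factor_five_findings = math.prod(five_findings)

print(factor_ten_facts)       # 1024, the "factor of about 1000"
print(factor_five_findings)   # 59049, just under 60,000
```

The multiplication is only valid if the facts really are independent given guilt or innocence, as the text stipulates.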

But I think the court works more or less like that, in a language that is incomprehensible to probability calculators but sanctioned by evolution. We have few cases in the Netherlands of convictions that were later found to be wrong. [Well! That was a Dutch layperson, writing in 2004. According to Ton Derksen, about 10% of very long term prisoners (very serious cases) in the Netherlands are innocent. It is probably something similar in other jurisdictions. RDG].

If you conducted the entire trial in terms of probability calculations, the resulting debates between prosecutors and lawyers would be impossible to keep track of. And given their poor knowledge of probability, it is also undesirable for the time being. They have their secret language, which has usually led to reasonable conclusions. Even “the probability that Lucy de B. is guilty” does not really fit into it. Nor is there any law in the Netherlands that defines “lawful and convincing evidence” in terms of the probability of a correct decision. Is that 95%? Or 99%? Judges will maintain that it is 99.99%. But judges are experts.

So I don’t think it is wise to try to cast the trial in terms of probability right now. But perhaps this discussion will produce something in the longer term: judges who are well informed about the statistical meaning of the starting situation and who then write down a number for each piece of evidence brought by prosecutor and defender, namely the plausibility ratio of the fact discussed, and who multiply all these numbers at the end and have the calculation checked by a Bayesian statistician. However, I consider this a long-term prospect. I fear (I am not really young anymore) that it will take more than my lifetime.