Why I am more than 99.99% certain that Lucy Letby is innocent

I use Bayes’ theorem in its odds form: posterior odds equals prior odds times likelihood ratio. For an introduction, please read this nice blog post: https://entropicthoughts.com/bayes-rule-odds-form

I use this rule, Bayes’ rule, repeatedly, each time taking account of another part of the evidence. It is named for Thomas Bayes, a Presbyterian minister and mathematician, who was interested in using it to find a mathematical proof of the existence of God. https://en.wikipedia.org/wiki/Thomas_Bayes

The likelihood ratio for the question at hand, based on some part of the evidence, is the ratio of the probabilities of that part of the evidence under the two competing hypotheses. More precisely, one uses the conditional probabilities of that fact given previously incorporated evidence. We have to start somewhere and we start by describing two alternative hypotheses and our probabilities or degrees of belief or personal betting odds for those two hypotheses, before further evidence is taken into account.

Let’s start with the news reports of a police investigation of a possible killer nurse at a neonatology unit in the UK; the investigation being triggered by a disturbing spike in the death rate on that unit.

I think that in the last fifty years there simply hasn’t been a case in the UK of a killer nurse on a neonatal ward, except possibly the case of Beverley Allitt. One might argue that there are doubts as to the safety of her conviction, or one might argue that there may have been serial killer nurses who completely evaded detection. Did Allitt work in an intensive care unit? I also think that in recent years, every year has seen a scandalous calamity in a UK neonatal ward, leading to avoidable deaths of quite a few babies. So a priori, the relative chances of the spike being due to a killer nurse or simply to poor care are, in my estimation, 50:1 in favour of poor care in a failing hospital unit rather than the activity of a killer. If you disagree, give me your arguments for both those rates and hence their ratio. If you would like to take a different starting point, try that. E.g., what is the chance that a random nurse is a serial killer? At some point one will have to use the information that this was a neonatal unit, and one will have to take account of the “normal” rate of deaths on the unit. I think my choice is reasonably specific. One could argue that the prior odds should be 10 to 1, or 100 to 1, instead of 50 to 1. I expect that most people will at least agree that killer nurses on neonatal units are very rare, while disastrously poor care on a neonatal unit in the UK is not rare at all.

So we are back in 2017 and hear the news and rightly we should be sceptical that there really is a case here. But clearly there are grounds to investigate what is the cause of that spike, and maybe there is more information which the police already have.

Then, many years go by. A particular nurse is detained for questioning in two successive years, and finally arrested in a third year. Two more years go by (Corona). At last, a trial begins. It turns out that roughly seven years of police investigation has uncovered no direct evidence at all (no medical evidence, toxicological evidence, witness testimony, CCTV recordings, fingerprints or DNA) of unlawful action by the nurse who has been under intensive investigation all that time. And not just no evidence against that nurse: no strong, direct evidence of malevolent activity by anyone.

One might want to argue that the insulin evidence is strong toxicological evidence. We could argue about that for a long time. Even if one or two babies were given unauthorised doses of insulin there is no direct proof that Lucy Letby did that herself. There is the possibility of accidental administration (twins in adjacent cots). The argument that Lucy did administer insulin seems to have been that we know at some point she carried out other murderous attacks and it is unlikely that there were two murderous nurses working in the unit. But why do we believe there are murderous nurses working on the unit? This argument can only be made after hearing all the other evidence in the case.

So we have to estimate the probability that a seven-year police hunt for evidence of murder by a particular nurse finds no direct evidence of any malevolent activity at all by anyone, first on the hypothesis that Lucy Letby is innocent, and then on the hypothesis that she truly is a serial killer. In my opinion, what we actually observed is much more likely under the innocence hypothesis than under the guilt hypothesis. If she truly is innocent, the chance of finding powerful, directly incriminating evidence must be rather small; if she truly is a serial killer, then it must be unlikely that no baby can be definitely proven to have been murdered or attacked. I guess the two probabilities of no hard evidence to be 95% and 5% respectively. These are probabilities of 19/20 and 1/20, so a likelihood ratio of 19. I’ll be a bit more cautious and call it 10.
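The arithmetic behind that likelihood ratio can be checked in a few lines. This is only a sketch in Python; the two probabilities are the guesses stated above, and the variable names are mine.

```python
from fractions import Fraction

# Guessed probabilities of "no hard evidence found" under each hypothesis
# (the 95% and 5% from the text; they are judgments, not measurements).
p_if_innocent = Fraction(19, 20)
p_if_guilty = Fraction(1, 20)

# Likelihood ratio in favour of innocence.
likelihood_ratio = p_if_innocent / p_if_guilty
print(likelihood_ratio)  # 19
```

Using exact fractions avoids floating-point noise; the rounding down from 19 to 10 is a deliberate choice to be cautious, not part of the calculation.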

We already had odds of 50:1 in favour of innocence. We have a likelihood ratio of 10:1 in favour of innocence, having learnt that police investigation uncovered no strong and direct proof of malevolent harm to any baby. The odds on Lucy being innocent are therefore now 50 times 10, or 500 to 1.
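As a sketch of this single updating step (Python; the function names `update_odds` and `odds_to_probability` are my own, chosen for illustration), one multiplies the odds by the likelihood ratio and, if desired, converts the odds to a probability:

```python
from fractions import Fraction

def update_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds of 'odds to 1' into a probability."""
    return odds / (1 + odds)

# Prior odds 50:1 for innocence, likelihood ratio 10 for "no direct evidence".
odds = update_odds(Fraction(50), Fraction(10))
print(odds)                              # 500
print(float(odds_to_probability(odds)))  # 500/501, roughly 0.998
```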

Let’s now bring in the evidence from psychology. Are there reasons to believe Lucy is a psychopath? Which surely she must be, if she is a serial killer of babies in her care. It seems there is no reason at all to suspect she is a psychopath. I think that there very likely would be strong independent signs of psychopathy in her life history if she really is a serial killer, but obviously not so likely if she is completely innocent. [Clearly she could be a psychopath and yet not have harmed or tried to harm any baby. I don’t think this is an interesting hypothesis to explore. I will also not pay attention to the Munchausen by proxy idea, that she was trying to attract the attention of an older male doctor. All the evidence says that he was more romantically interested in her than vice versa.]

Put the likelihood ratio at 2, i.e. it is twice as likely to see no evidence for psychopathy if she is innocent than if she is a serial killer. Actually I think it should be closer to 10. We should ask some psychologists. Lucy Letby did not sadistically kill little animals when she was a child. By all accounts, she was a dedicated nurse and cared deeply for her work.

We were at 500 to 1 for innocence. Factor in a likelihood ratio of 2 for psychological evidence. Now it’s 1000 to 1. But we are not done yet.

Next, I would like to take account of the statistical evidence that the spike in deaths is quite adequately explained by the acuity of the patients being treated in those 18 months. I would say that this is exactly what we would expect if Lucy is innocent but very unlikely if she’s a serial killer. I think this hypothesis is very adequately supported by published MBRRACE-UK statistics, and what we know about the acuity of the babies in the case. We know why acuity went up in around 2014 and we know why it went down midway in 2017. The spike seems to have been caused by hospital policy which was being made and implemented by the consultants on that unit. They should have expected it.

Say a likelihood ratio of 10. That brings us to 10,000 to 1 that she’s innocent; a posterior probability of 99.99%. I haven’t yet brought in the facts of an investigation driven by tunnel vision and coached by doctors who, as we now know, were making quite a few deadly mistakes themselves. I haven’t yet brought in the innocent explanation of the post-it note. In my opinion, the post-it note is powerful evidence for innocence; it makes absolutely no sense under the hypothesis of guilt. Nor have I brought in the irrelevance of the handover notes and the notations in her diary. Facebook searches? Her alleged lies (about what she was wearing when she was arrested). Anything else?
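The whole chain of updates can be replayed as follows. This is a sketch in Python: the likelihood ratios are the ones chosen above, the alternative priors of 10:1 and 100:1 are the ones mentioned earlier as arguable, and the dictionary labels and function name are mine. Multiplying the ratios together assumes, as the text does, that each is conditional on the evidence already taken into account.

```python
from fractions import Fraction
from math import prod

# Likelihood ratios in favour of innocence, as chosen in the text.
likelihood_ratios = {
    "no direct evidence after years of investigation": 10,
    "no signs of psychopathy": 2,
    "spike explained by patient acuity": 10,
}

def posterior_odds(prior_odds, ratios):
    """Apply Bayes' rule in odds form once per piece of evidence."""
    return Fraction(prior_odds) * prod(Fraction(r) for r in ratios)

# Sensitivity check: the prior used in the text (50:1) and two alternatives.
for prior in (10, 50, 100):
    odds = posterior_odds(prior, likelihood_ratios.values())
    probability = odds / (1 + odds)
    print(f"prior {prior}:1 -> posterior {odds}:1, "
          f"probability {float(probability):.4%}")
```

With the 50:1 prior the posterior odds are 10,000 to 1, i.e. a probability of 10000/10001. Readers who prefer other priors or ratios can edit the numbers and rerun.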

Anyway, I am now well above 99.99% sure that Lucy is innocent and since the press conference and the report of Shoo Lee and his colleagues, we can all be even more sure that that is the case.

Letter to the BMJ

Rapid response to:

John Launer: Thinking the unthinkable on Lucy Letby

BMJ 2023; 382 doi: https://doi.org/10.1136/bmj.p2197, published 26 September 2023, cite as: BMJ 2023;382:p2197

Dear Editor

I am a coauthor of the report of the Royal Statistical Society https://rss.org.uk/news-publication/news-publications/2022/section-group-reports/rss-publishes-report-on-dealing-with-uncertainty-i/. It is deeply distressing that the police investigation into the case of Lucy Letby and the subsequent trial made all of the mistakes in our book. The jury was never told how the police investigation arrived at that list of “suspicious” events and how it was further narrowed down to the list of charges. This is a case in which a target was painted around a suspect by investigators. We call it confirmation bias, in statistics. It is also often referred to as the Texas sharpshooter fallacy.

Thanks to amateurs who report their work on Twitter and YouTube, we now know how the list of charges in the Lucy Letby case evolved. It is utterly scandalous that this history was not revealed to the court. Here is the broad picture.

Doctors reported Lucy to the police, against the wishes of the hospital board.

They told the police the exact period she had been on the ward and gave them the files on all deaths in that period and on some of the incidents: namely, exactly and only those “arrests” at which Lucy had been present.

What qualifies as an incident, what is an arrest?

There is no medical category “arrest, resuscitation” under which such events are logged in hospital administration. Probably there were about five times as many such events when Lucy was not on duty, but nobody has ever looked. There is no medical definition of such an event. No formal criteria.

“Unexpected, unexplained, sudden” are also not defined in any formal way. Nor is “stable”.

Next, the absolutely unqualified, long-retired paediatrician Dewi Evans, who has a business helping out in civil child custody cases, went through those medical files looking for anomalies around which he could fantasise a murder or a murderous attack. His ideas that milk was injected into the stomach or air into the veins were far-fetched, and later not confirmed by any other evidence. On the contrary, the actual evidence certainly contradicts the idea that Lucy Letby actually attacked any child. He never gave alternative medical explanations, as would have been the obligation of a forensic scientist. All the deaths had had a post-mortem and a coroner’s report. Every single event on the charge sheet has an absolutely normal explanation. Lucy was never seen doing anything wrong.

The medical experts for the prosecution merely confirmed Evans’ diagnosis; they too did not do the job of a forensic scientist.

The defence had no experts. They had brought in one paediatrician. But at the pre-trial hearing he said he wasn’t qualified in endocrinology, toxicology, and so on.

This was Texas sharpshooter, big time. Plus utterly incompetent defence.

Richard Gill

Member of Royal Dutch Academy of Sciences

Past president of Dutch statistical society.

How to lie with data

This spreadsheet was shown on TV both yesterday (Friday August 18, the day of the verdicts) and at the start of the trial of Lucy Letby. Apparently, Cheshire Constabulary find this absolutely damning evidence against Lucy. And indeed, many journalists seem to agree.

The 25 events are almost all of the events at which LL was present during the periods investigated. They are suspicious because she was under suspicion when the police started their investigations. Not surprisingly, most nurses are not present at many of these events. And of course, many nurses probably work far fewer hours than LL. Many are often on administrative duties.
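The selection effect described above can be illustrated with a small simulation. This is a sketch in Python under made-up assumptions: the six nurses, their on-duty probabilities and the number of events are all hypothetical and are not taken from the trial data. Events occur independently of who is on duty, so nobody in the simulation harms anyone; the only thing that singles out nurse "A" is that the list is compiled by keeping the events at which she was present.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical staff and hypothetical chances of being on duty at any event.
NURSES = ["A", "B", "C", "D", "E", "F"]
P_ON_DUTY = {"A": 0.4, "B": 0.3, "C": 0.3, "D": 0.2, "E": 0.2, "F": 0.1}
N_EVENTS = 200  # hypothetical number of collapses and deaths

# Each event records who happened to be on duty; nobody causes anything.
events = [
    {n for n in NURSES if rng.random() < P_ON_DUTY[n]}
    for _ in range(N_EVENTS)
]

# The "investigation": keep only the events at which nurse A was present.
selected = [on_duty for on_duty in events if "A" in on_duty]

# Tabulate presence at the selected events, as in the spreadsheet.
presence = {n: sum(n in on_duty for on_duty in selected) for n in NURSES}
for n in NURSES:
    print(f"nurse {n}: present at {presence[n]} of {len(selected)} selected events")
```

By construction nurse A has an unbroken column of crosses, while everyone else is present only at the fraction of events their rota would predict. The table says something about how the events were chosen and nothing about who caused them.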

The doctors on the ward are of course missing. Doctors were never investigated as suspects but from the start of police investigations apparently always believed to speak gospel truth. During cross-examination, during the trial, some of them have changed various parts of their stories. Of course, unlike Lucy, they do not lie, since they could never (under oath in court, or earlier, when being interviewed as witnesses by police) be saying untruths in order to deceive.

Back to the spreadsheet. When drawing conclusions from any data it is important to know how it was gathered. It is important to know what data is missing but would be needed to draw even the most preliminary and tentative inferences.

There was an NHS investigation into the raised rates of deaths and collapses at Countess of Chester Hospital (CoCH) in summer 2015 and summer 2016. It was published in 2017 by the Royal College of Paediatrics and Child Health (RCPCH). The investigation blamed the consultants for the appallingly low standard of care, and the terrible situation regarding hygiene. The RCPCH investigators actually wrote that nurse Lucy Letby could not be associated with the events, but that passage was redacted out of the published report for privacy reasons. We know that consultants had already presented their fears to hospital management. One of them (successful TV doctor and Facebook influencer Dr Ravi Jayaram) was on TV yesterday proudly telling the world that he had been vindicated. Management was inclined not to believe them, and did not act on them, but they certainly came to the ears of the RCPCH. On publication of the report, four consultants had had enough, and went to the police with their suspicions that LL was a murderer.

Thanks to FOI requests and statistical analysis by independent scientists, we now know that the rate of events (deaths and collapses) is just as much raised when Lucy is not on the ward as it is when she is on the ward. A lot of medical information (as well as the state of the drains at CoCH) points to a seasonal virus epidemic.

The elevated rate went back to normal after the hospital was downgraded (no longer accepting high-risk patients), when the drains were rebuilt, and when the senior consultant retired, all of which happened soon after the police investigation started. Incidentally, the rates of stillbirths and miscarriages show exactly the same pattern.

Lucy must certainly have been a witch in order to kill babies in the womb and even when she is far from the hospital.

Those familiar with miscarriages of justice involving serial killer nurses will be familiar with this police and prosecution tactic. Is it evil or is it just stupid? (cf. Hanlon’s razor). I think it is quite simply “learnt”. Police and prosecution learn what convinces jurors over the years, and that is why the same “mistakes” are made again and again. They work!

CBS statistics, the benefits affair, out-of-home placements

This saga continues with a new publication from CBS (Statistics Netherlands), https://www.cbs.nl/nl-nl/achtergrond/2023/05/onzekerheidsmarges-onderzoek-toeslagenaffaire-en-jeugdbescherming. Well, it came out a month and a half ago. I was busy with other things …

Here is a first impression. Confidence intervals are now calculated, and one sees immediately that the statistical uncertainty is enormous. Of course, these calculations rest on statistical assumptions, and those are always debatable. But at the very least the intervals can be interpreted in a purely data-descriptive way, as a sensitivity analysis. A wide interval shows that if the data had been slightly different, the answer would have been completely different. We know anyway that there are all kinds of sources of error; we know that the records in the data files of government institutions can lie very far from the experiences of citizens; and we know that they depend on all kinds of definitions and conventions that have their origin in bureaucratic administration.

An important result is the picture below, in which statistical margins of uncertainty have been added to a picture from the first (and controversial) CBS report. Figure 6.1.1.

I have included the “small print” and the “even smaller small print”, not to be read, but to show that a whole technical story goes with it.

The first impression is that the line in the middle is roughly flat. So: the nasty intervention (being duped) in year “zero” has no strong effect. Over several years one sees, among the same 4000 families, a slight increase in youth protection measures which, by the look of it, could well have been due to chance. The hypothesis of “no impact” cannot be rejected on the basis of these figures.

But that is not the only possible interpretation of the picture, and the alternative can be rejected just as little. That little bump in the graph could also be “real”, and moreover caused by the blow that the tax authorities dealt in “year zero”. It looks like an increase of half a percent per year, over several years. The most plausible estimate is that there are 20 to 30 (or even more) real double victims; double victims in the sense that being duped in the benefits scandal really did lead to an out-of-home placement that would otherwise not have happened.

The real effect is damped and smeared out by all the shortcomings of the study. The conclusion has to be: there are certainly dozens and possibly even a hundred.

Incidentally, I would very much like to have one extra number that would let me evaluate the statistical uncertainty in the difference in height between these two values (blue and green).

There are roughly 4000 victims and they are paired one to one with comparable non-victims. In effect we are dealing with around 4000 matched pairs. CBS knows, for each member of each pair, whether a youth protection action took place. So in effect we have 4000 observations of pairs, each of which can take one of the four values (0, 0), (0, 1), (1, 0), (1, 1); call the two components (x, y). A “1” means an out-of-home placement (or something similar), a “0” means no out-of-home placement. We are interested in the mean of the x’s minus the mean of the y’s. That is the same as the mean of all the (x – y) values; each of them is equal to –1, 0, or +1. I would like to see the 2×2 table of counts of each of the four possible joint outcomes (x, y). I would like to calculate the standard deviation of the (x – y) values. This would give us insight into how successful the matching was: if all is well, we would see a positive correlation between the outcomes of the two groups. A correlation of +1 would imply that the outcome is completely determined by the matching variables, which would mean: being duped really made no difference. Come on, CBS!
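A minimal sketch in Python of the calculation I have in mind, given the 2×2 table. CBS has not published that table, so the counts below are invented purely for illustration: they are chosen to add up to 4000 pairs and to give a difference of half a percentage point. The function name and the dictionary keys are also my own.

```python
from math import sqrt

def matched_pair_summary(n00, n01, n10, n11):
    """Summarise matched-pair outcomes (x, y) from the 2x2 table of counts.

    n_xy is the number of pairs with outcome x for the duped family and
    outcome y for its matched non-duped family (1 = placement, 0 = none).
    """
    n = n00 + n01 + n10 + n11
    mean_x = (n10 + n11) / n
    mean_y = (n01 + n11) / n
    # x - y is +1 for (1, 0) pairs, -1 for (0, 1) pairs, and 0 otherwise.
    mean_diff = (n10 - n01) / n
    var_diff = (n10 + n01) / n - mean_diff ** 2  # divisor n, not n - 1
    sd_diff = sqrt(var_diff)
    se_mean_diff = sd_diff / sqrt(n)
    # Correlation between x and y (the phi coefficient of the table);
    # undefined if either outcome is constant.
    cov = n11 / n - mean_x * mean_y
    denom = sqrt(mean_x * (1 - mean_x) * mean_y * (1 - mean_y))
    corr = cov / denom if denom > 0 else None
    return {
        "n": n, "mean_x": mean_x, "mean_y": mean_y, "mean_diff": mean_diff,
        "sd_diff": sd_diff, "se_mean_diff": se_mean_diff, "corr": corr,
    }

# HYPOTHETICAL counts, for illustration only; not CBS data.
summary = matched_pair_summary(n00=3800, n01=60, n10=80, n11=60)
for name, value in summary.items():
    print(name, value)
```

Only the discordant pairs, (1, 0) and (0, 1), contribute to the difference and to its standard deviation, which is why the full table is needed and the two marginal percentages alone are not enough.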

The bogeyman (Algemeen Dagblad, 26 January)

Date: 2023-01-26 09:38:02

Richard Gill. © Rob Voss

Professor Gill helped exonerate Lucia de B., and is now making mincemeat of the CBS report on the benefits affair

Top statistician Richard Gill cracks down on the research conducted by Statistics Netherlands (CBS) into custodial placements of children of victims in the benefits affair. ‘CBS should never have come to the conclusion that this group of parents was not hit harder than other parents.’

Carla van der Wal 26-01-23, 06:00 Last update: 08:10

Emeritus professor Richard Gill would prefer to pick edible mushrooms in the woods and spend time with his grandchildren. Nevertheless, the top statistician in the Netherlands, who previously helped to exonerate the unjustly convicted Lucia de B, is now firmly committed to the benefits affair.

CBS should never have started the investigation into the custodial placement of children of victims in the benefits affair, says Gill. “And CBS should never have drawn the conclusion that this group of parents has not been hit harder than other parents. It left many people thinking: only the tax authorities have failed, but fortunately there is nothing wrong with youth care. So all the fuss about ‘state kidnappings’ was unnecessary.”

After Statistics Netherlands calculated how many children of benefit parents were placed out of home (in the end it turned out to be 2090), it seemed that victims in the affair lost their children more often than similar parents who were not victims. The results, which Gill now denounces, were presented on 1 November last year.

Gill is emeritus professor of mathematical statistics at Leiden University and in the past was an advisor to the methodology department of Statistics Netherlands. In the case of Lucia de B. he showed that calculations purporting to show that De B. had more deaths during her shifts were incorrect.

CBS abuses

There is a special reason why Gill is now sinking his teeth into the benefits affair, but more on that later. First, the CBS report. Gill states that Statistics Netherlands is not set up for this type of research and points out that after two research methods were dropped, only one ‘not ideal, but only option’ remained. He also thinks, among other things, that the more severely affected victims of the benefits affair should be at the centre of the investigation. He emphasizes that relatively mildly affected families most likely had to deal with much less drastic consequences. CBS itself also says that it would have liked to use information about how severely families were duped, but that none was available.

CBS also acknowledges some of the criticisms. “CBS itself noted a number of them as caveats in the report. On one point there seems to be a misunderstanding,” said a spokesperson, who also said that CBS still fully supports the conclusions. CBS will soon be discussing the methodology used with Gill, but in any case CBS sees itself as the right party to carry out the study. “CBS has the task of providing insight into social issues with reliable statistical information and data and has the necessary expertise and techniques. In this case there was a clear social need for statistical insight.”

Gill thinks otherwise and thinks it’s important to raise this. Because injustice keeps him awake at night. That was also a reason to offer his help when questions arose about the conviction of Lucia de B., who can simply be called Lucia de Berk again since her acquittal. In 2003 she was sentenced to life imprisonment.

Out-of-home placement

With the acquittal in 2010, Gill became not only a top statistician, but also a beacon of hope for people who experienced injustice. And José Booij, a mother of a child placed in care, contacted him many years ago.

Somewhere in Gill’s house in Apeldoorn there is still a box with papers from José. It contains diaries, newspaper clippings and diplomas of hers. She was a little different from other people. A doctor who fell for women, fled the Randstad and settled in Drenthe. There she became pregnant and had a baby. And she had a neighbour with whom she had a disagreement. “That neighbour had made all kinds of reports about José to the local police, said that something terrible would happen to the child.” After six weeks, José’s daughter was removed from home.

State kidnapping

“What happened to José at the time, I also call that a state kidnapping, just as the custodial placements among victims of the benefits affair are now called.” The woman continued to fight to get her child back. But gradually that fight drove her insane. She lost her job, she lost her home. She fled abroad. “Despite a court ruling that the child had to be returned to José, that did not happen. José eventually derailed. I now know that she has left information with more people in the Netherlands to ensure that it is available to her daughter when she is ready. But I can’t find José anymore. I heard she was seen in the south of the Netherlands after escaping from a psychiatric clinic in England.”

Demonstration by victims of the benefits affair. © ANP / ANP

And meanwhile he keeps that box. And Gill thinks of José when he considers the investigation by the Central Bureau of Statistics into custodial placements of children of victims in the benefits affair. Gill makes mincemeat of it. “The only thing CBS can say is that the results suggest that the differences between the two groups that have been compared are quite small. There should be a lot more caution, and yet in the summary you see emphatic statements in bold type, such as: ‘Being duped does not increase the likelihood of child protection measures’. I suspect that CBS was put under pressure to conduct this study, or wanted to justify its existence. Perhaps there is an urge to be of service.”

Time for justice

Now is the time to put that right, Gill thinks. Research needs to be done to find out what’s really going on. “I had actually hoped that younger colleagues would have stood up by now, who would take up such matters.” But as long as that doesn’t happen, he’ll do it himself. Maybe it’s in his genes. It was Gill’s mother (he was born in England) who helped crack the Enigma code used by the Germans to communicate during World War II. Gill wasn’t surprised when he found out. He already suspected that his excellent mind was inherited not only from his father, but also from his mother.

Love

Yet in the end it was his wife, for love of whom he settled in the Netherlands, who put him on this track. She pointed Gill to Lucia de Berk’s case and encouraged him to get to work. She may have regretted that. For example, when Gill threatened to burn his Dutch passport during a broadcast of The World Keeps on Turning Round (“De wereld draait door”) if the De Berk case was not reviewed. “She said, ‘You can’t say things like that!’”

In fact, he would like to enjoy his retirement with her now; he has been out of paid work for six years. Then he would spend his days in the woods looking for edible mushrooms. And spend a lot of time with his grandchildren. But now his calculations also help exonerate other nurses. Last year, Daniela Poggiali was released in Italy after Gill got involved in the case together with an Italian colleague. There are still things waiting for him in England.

And then here in the Netherlands there is the benefits affair, which, as far as Gill is concerned, needs more in-depth, thorough research to find out exactly what caused the custodial placements. “That is why I ended up with Pieter Omtzigt and Princess Laurentien, who are also involved in the benefits affair.” Among the people who express themselves diplomatically, he wants to be the bad cop, the man who shakes things up, as he did when he threatened to set his passport on fire. But at the same time, he also hopes that a young statistician will emerge who is prepared to take up the torch.

CBS provided this site with an extensive explanation in response to Gill’s criticism. It recognizes the complexity of this type of research, but sees itself as the appropriate body to carry out that research. An appointment to speak with Gill has already been scheduled. “CBS always tries to explain as clearly and transparently as possible in its reports what has been investigated, how it was done and what the results are.”

Statistics Netherlands also points to nuances in the text of the report, for example after the sentence above a piece of text: ‘Being duped does not increase the chance of child protection measures’. “On an individual level, there may be a relationship between duping and youth protection, which is stated in several places in the report.” Even if ‘on average no evidence is found for a relationship between duping and youth protection’, as Statistics Netherlands notes.

Statistics Netherlands fully supports the research and the conclusions as stated in the report. It is pointed out, however, that there are still opportunities for follow-up research, as has also been indicated by Statistics Netherlands.

De boeman (Algemene Dagblad, 26 januari)

Hoogleraar Gill hielp bij vrijpleiten Lucia de B., en maakt nu gehakt van CBS-rapport toeslagenaffaire

Topstatisticus Richard Gill kraakt het onderzoek dat het Centraal Bureau voor de Statistiek (CBS) uitvoerde naar uithuisplaatsingen van kinderen van gedupeerden in de toeslagenaffaire. ‘De conclusie dat deze groep ouders niet harder is geraakt dan andere ouders, had het CBS nooit mogen trekken.’

Carla van der Wal 26-01-23, 06:00 Laatste update: 08:10

Het liefste zou emeritus hoogleraar Richard Gill eetbare paddenstoelen plukken in het bos, en tijd doorbrengen met zijn kleinkinderen. Toch bijt de topstatisticus van Nederland, die eerder hielp bij het vrijpleiten van de onterecht veroordeelde Lucia de B, zich nu vast in de toeslagenaffaire.

Het CBS had nooit aan het onderzoek naar de uithuisplaatsing van kinderen van slachtoffers in de toeslagenaffaire moeten beginnen, zegt Gill. ,,En de conclusie dat deze groep ouders niet harder is geraakt dan andere ouders, had het CBS nooit mogen trekken. Die liet velen denken: alleen de belastingdienst heeft gefaald, maar er is gelukkig niets mis met jeugdzorg. Al die ophef over ‘staatsontvoeringen’ was dus onnodig.’’

Nadat het CBS becijferde hoeveel kinderen van toeslagenouders uit huis werden geplaatst (uiteindelijk bleken het er 2090), leek het of gedupeerden in de affaire vaker hun kinderen kwijtraakten dan soortgelijke ouders die geen slachtoffer waren. Op 1 november vorig jaar werden de resultaten gepresenteerd, die Gill nu hekelt.

Gill is emeritus hoogleraar mathematische statistiek aan de universiteit van Leiden en was in het verleden adviseur bij de afdeling methodologie van het CBS. In de zaak van Lucia de B. liet hij zien dat berekeningen die zouden aantonen dat De B. vaker sterfgevallen in haar diensten had, niet klopten.

Misstanden CBS

Dat Gill zich nu vastbijt in de toeslagenaffaire heeft een bijzondere reden – maar daarover later meer. Eerst nog over het rapport van het CBS. Gill stelt dat het CBS niet is ingericht op dit type onderzoek en wijst erop dat nadat twee onderzoeksmethodes afvielen slechts één ‘niet ideale, maar enige optie’ overbleef. Ook vindt hij onder meer dat zwaarder getroffen gedupeerden in de toeslagenaffaire centraal zouden moeten staan bij het onderzoek. Hij benadrukt dat relatief licht geraakte gezinnen hoogstwaarschijnlijk met veel minder ingrijpende gevolgen te maken hebben gehad. Het CBS zegt overigens zelf ook dat het graag informatie over de mate van gedupeerdheid gebruikt, maar dat die er niet was.

Het CBS erkent ook sommige punten van kritiek. ,,Een aantal heeft het CBS zelf als kanttekening genoemd bij het rapport. Op een enkel punt lijkt sprake van een misverstand’’, aldus een woordvoerder, die ook zegt dat het CBS nog volledig achter de conclusies staat. Over de gebruikte methodologie gaat het CBS binnenkort met Gill in gesprek, maar het CBS ziet zich in elk geval wél als de juiste partij om het onderzoek uit te voeren. ,,Het CBS heeft als taak om met betrouwbare statistische informatie en data inzicht te geven in maatschappelijke vraagstukken en beschikt over de nodige expertise en technieken. In dit geval was een duidelijke maatschappelijke behoefte aan statistisch inzicht.’’

Gill denkt daar anders over en vindt het belangrijk dat aan te kaarten. Want hij ligt wakker van onrecht. Dat was ook reden om zijn hulp aan te bieden toen er vragen rezen over de veroordeling van Lucia de B., die sinds haar vrijspraak gewoon weer Lucia de Berk genoemd kan worden. In 2003 werd ze veroordeeld tot een levenslange gevangenisstraf.

Out-of-home placement

Through the acquittal in 2010, Gill became not only a top statistician but also a beacon of hope for people who had experienced injustice. And so, many years ago, José Booij, the mother of a child who had been placed out of home, contacted him.

Somewhere in Gill’s house in Apeldoorn there is still a box of José’s papers. It contains her diaries, newspaper clippings and diplomas. She was a little different from other people. A lawyer who was attracted to women, fled the Randstad and settled in Drenthe. There she became pregnant and had a baby. And she had a neighbour with whom she was in conflict. “That neighbour had made all kinds of reports about José to the local police, and had said that something terrible would happen to the child.” After six weeks José’s baby daughter was removed from her home.

State kidnapping

“What happened to José at the time, I also call that a state kidnapping, just as the out-of-home placements among victims of the benefits scandal are now called.” The woman kept fighting to get her child back. But gradually that fight drove her to madness. She lost her job, she lost her house. She fled abroad. “Despite a court ruling that the child had to be returned to José, that did not happen. José eventually went off the rails. I now know that she left information with more people in the Netherlands, to make sure that it is available for her daughter when she is ready for it. But I have not been able to find José again. I heard that she was still seen in the south of the Netherlands, after she had escaped from a psychiatric clinic in England.”

[Photo: Victims of the benefits scandal demonstrating. © ANP]

And in the meantime he keeps that box. And Gill thinks of José when he studies the research by Statistics Netherlands (Centraal Bureau voor de Statistiek, CBS) into out-of-home placements of children of victims of the benefits scandal. Gill tears it to shreds. “The only thing CBS can say is that the results suggest that the differences between the two groups that were compared are fairly small. There should be much more caution, and yet in the summary you see firm statements in bold type, such as: ‘Being a victim does not increase the likelihood of child protection measures’. I suspect that CBS was put under pressure to do this research, or wanted to justify its right to exist. Perhaps there is an urge to be of service.”

Time for justice

Now it is time to put that right, Gill believes. Research must be done to see how things really are. “I had actually hoped that by now younger colleagues would have stood up to take on this kind of case.” But as long as that does not happen, he will do it himself. Perhaps it is in his genes. It was Gill’s mother (he was born in England) who during the Second World War helped to crack the Enigma code, which the Germans used to communicate. Gill was not surprised when he found out. He already suspected that his excellent mind was a legacy not only of his father but also of his mother.

Love

Yet in the end it was his wife (love is what made him settle in the Netherlands) who set him on this track. She drew Gill’s attention to the case of Lucia de Berk and encouraged him to get to work on it. Perhaps she has sometimes regretted that. For example when Gill, during recordings of De wereld draait door, threatened to burn his Dutch passport if the De Berk case was not reopened. “She said: you can’t do that, can you?”

Really he would now like to enjoy his retirement with her; he stopped his paid work six years ago. He would fill his days in the woods, searching for edible mushrooms. And spend a lot of time with his grandchildren. But now his calculations are also helping to exonerate other nurses. Last year Daniela Poggiali was released in Italy, after Gill got involved in the case together with an Italian colleague. In England there are further cases waiting for him.

And here in the Netherlands there is the benefits scandal, which as far as Gill is concerned needs deeper, more thorough research to find out what exactly caused the out-of-home placements. “That is how I ended up with Pieter Omtzigt and Princess Laurentien, who are also involved with the benefits scandal.” Among the people who express themselves diplomatically, he is quite willing to be the bad cop, the man who shakes things up, as he did when he threatened to set fire to his passport. But at the same time he hopes above all that a young statistician will stand up who is willing to take over the torch.

CBS gave this site an extensive explanation in response to Gill’s criticism. It acknowledges the complexity of this kind of research, but does see itself as the appropriate body to carry it out. The appointment to talk with Gill has already been scheduled. “In its reports, CBS always tries to explain as clearly and transparently as possible what was investigated, how that was done and what the outcomes are.”

CBS also points to nuances in the text of the report, for example after the sentence that heads a section of text: ‘Being a victim does not increase the likelihood of child protection measures’. “At the individual level there can indeed be a relationship between being a victim and youth protection; that is stated in several places in the report.” Even if ‘on average no evidence is found for a relationship between being a victim and youth protection’, as CBS concludes.

CBS stands fully behind the research and the conclusions as stated in the report. It does point out that there are still possibilities for follow-up research, as CBS has also indicated.

More on the benefits scandal and out-of-home placements

Below is an attempt (20 January 2023, in the morning) to write down the core of the story in 500 words and in plain, simple language. It did not succeed.

Does CBS have a monopoly on the truth?

Many were shaken awake by the words of comedian (cabaretier) Peter Pannekoek: “1115 state kidnappings”. But they may have been lulled back to sleep by the CBS report “Jeugdbescherming en de toeslagenaffaire – Kwantitatief onderzoek naar kinderbeschermingsmaatregelen bij kinderen van gedupeerden van de toeslagenaffaire” (Youth protection and the benefits affair: quantitative research into child protection measures for children of victims of the benefits affair). One of the main conclusions (summary, first page) reads

“Being a victim does not increase the likelihood of child protection measures”.

That is a powerful statement. No qualification whatsoever, no “small print”. No mention that it is a statement which can only be made under a whole series of assumptions. Unfortunately, a whole series of assumptions of which many are patently untrue.

My answer: maybe not 1115, but maybe: 115

Now CBS excels at descriptive statistics, which is also its statutory task. It should neutrally disclose and present the facts that politicians, administrators and citizens need. Where CBS has less expertise in house, because it is definitely not part of its task, is in disentangling cause and effect. Nowadays we call that “causality”, and it is an extremely topical, important, subtle and complex subject within scientific research, which has grown explosively since Judea Pearl’s book “Causality” from 2000. Can you conclude causality by observing correlation or association?

Example. Lucia de B. experienced a terribly large number of incidents during her shifts. Far more than one would have expected, and that indeed led to a life sentence for serial murder. Only later did it become clear that her presence was precisely the reason why medical investigators characterised certain events as incidents!

But can the absence of association also point to causality? It certainly can! Statistics can mislead. An appealing visual representation of statistics all the more so. My eye was drawn to Figure 6.1.2 in the CBS report, in which we see three cheerfully coloured bars representing the percentages 1%, 4% and 4%. You see! The percentage of out-of-home placements among the victims is exactly what you would have expected if all those families had not been victims at all!

I would say that cannot be a coincidence. After studying the research protocol, including the many algorithms used by the team, it also becomes clear that it is no coincidence. Through the research choices that the research team felt compelled to make, the difference in the chance of out-of-home placement between “comparable” victims and non-victims has been systematically reduced. The difference is therefore larger than it appears (it appears to be zero, but it definitely is not). The correct conclusion of the research should have been, first, that there were certainly dozens of “extra” out-of-home placements because of the affair, and possibly a hundred (or even a few hundred). A second conclusion should have been that this daring pilot study has proven that a totally different research design is needed to answer the question posed. Possibly something along the lines of the previously rejected research proposal of Prof. Bart Tromp of the University of Groningen. Incidentally, it is never necessary to comb through all the files of the entire history of all victims. By cleverly taking a random sample from a sensibly chosen subpopulation, one can limit oneself to thoroughly investigating relatively few cases.

Good “Data Science” is impossible without combining great expertise from three areas at once: 1) algorithms and computing capabilities; 2) probability theory and inferential statistics (i.e. quantifying the uncertainty in the results found); 3) (last but not least!) subject-specific knowledge of the intended field of application; in this case psychology, law, public administration.

I am currently writing out the justification of my claims in my blog, https://gill1109.com/2023/01/18/de-statistiek-van-slachtoffers-van-uitkeringsschandaal/; it still has to be expanded a great deal with further substantiation, references, and so on.

I am thinking of a statistical simulation to illustrate my point. Those two numbers “4%” need error bars of about +/- 1%. Tricky, because I have to take account of the correlation within the pairs. We can only guess how large it is. So: several simulations with different guesses.

Yet more on the Dutch benefits scandal and child removals

This is a first attempt to summarise my claims in 500 words and simple language. It didn’t succeed.

Does CBS have direct access to the truth?

Many were shaken up by the words of comedian (cabaretier) Peter Pannekoek: “1115 state kidnappings”. But they may have been lulled back to sleep by the CBS report “Youth protection and the benefits affair – Quantitative research into child protection measures in children of victims of the benefits affair”. One of the main conclusions (summary, first page) reads

“Being a victim of the benefits scandal does not increase the likelihood of child protection measures”.

That’s a powerful statement. No qualification whatsoever, no “small print”. No mention of it being a statement that can only be made under a slew of assumptions. Alas, a slew of assumptions many of which are patently untrue.

My answer: Maybe not 1115, but could well have been 115.

Now CBS excels at doing descriptive statistics, which is also its legal assignment. It should neutrally disclose and represent the facts that politicians, administrators and citizens need. Where CBS has less in-house expertise, because it is certainly not part of its task, is in disentangling cause and effect. This is what we call “Causality” today, and it is an extremely topical, important, subtle and complex subject of scientific inquiry, one that has grown explosively since Judea Pearl’s 2000 book “Causality”. Can you infer causality by observing correlation or association?

Example. Lucia de B. experienced an awful lot of incidents on her shifts. Many more than one would have expected, and that led to life imprisonment for serial murder. Only later did it become clear that her presence was precisely the reason why medical investigators characterized certain events as incidents!

But can *no* association also indicate causality? Yes! Statistics can be misleading. An appealing visual representation of statistics all the more so. My eye was drawn to Figure 6.1.2 in the CBS report, in which we see three brightly colored bars representing the percentages 1%, 4% and 4%. See! The percentage of out-of-home placements among the victims is exactly what you would have expected if all those families had not been victimized at all!

I’d say that can’t be a coincidence. After studying the research protocol, including the many algorithms used by the team, it also becomes clear that this is no coincidence. Due to the research choices that the research team felt compelled to make, the difference in out-of-home placements between “comparable” victims and non-victims has been systematically reduced. So the difference is greater than it appears (it appears to be zero, but it is definitely not). The correct conclusion of the investigation should have been, first, that there were certainly dozens of “extra” out-of-home placements because of the affair and possibly a hundred (or even a few hundred). A second conclusion should have been that this bold pilot study has proven that a completely different research design is needed to answer the question posed. Possibly something along the lines of the previously rejected research proposal of Prof. dr. Bart Tromp of the University of Groningen. Incidentally, it is never necessary to go through *all* files of the entire history of all victims. By smartly taking a random sample in a sensibly chosen sub-population, one can limit oneself to properly sorting out relatively few cases.

Good “Data Science” is impossible without combining great expertise from three areas at the same time: 1) algorithms and computing capabilities; 2) probability theory and inferential statistics (i.e. quantifying the uncertainty in the results found); 3) (last but not least!) subject-specific knowledge of the intended application area; in this case psychology, law, administration.

I am currently writing out the justification for my claims in my blog, https://gill1109.com/2023/01/18/de-statistiek-van-slachtoffers-van-toeslagsschandaal/; it still needs to be expanded a lot with further substantiation, references, and so on.

I’m thinking of a statistical simulation to illustrate my point. Those two numbers “4%” need error bars of about +/- 1%. Tricky because I must take account of the correlation within the pairs. We can only guess how big it is. So: several simulations with different guesses.
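Such a simulation could look roughly like the sketch below, written in Python with only the standard library. Everything in it is an assumption made for illustration, not the CBS procedure: 4000 matched pairs, a placement rate of 4% in both groups, and a crude device for the unknown within-pair correlation `rho` (with probability `rho` the two families in a pair share a single outcome). The names `sd_of_difference`, `N_PAIRS`, `P` and `N_REPS` are hypothetical.

```python
import math
import random
import statistics

N_PAIRS = 4000   # round group size quoted in the text
P = 0.04         # placement rate of about 4% in both groups
N_REPS = 200     # number of simulated studies per guess (arbitrary choice)


def sd_of_difference(rho, rng):
    """Simulated SD of (study % minus control %) for within-pair correlation rho."""
    diffs = []
    for _ in range(N_REPS):
        study = control = 0
        for _ in range(N_PAIRS):
            if rng.random() < rho:
                # With probability rho the two matched families share one outcome;
                # this construction gives the pair a correlation of rho.
                shared = rng.random() < P
                study += shared
                control += shared
            else:
                study += rng.random() < P
                control += rng.random() < P
        diffs.append(100 * (study - control) / N_PAIRS)
    return statistics.stdev(diffs)


rng = random.Random(2023)  # fixed seed so that the run is repeatable
results = {}
for rho in (0.0, 0.25, 0.5):  # three guesses at the unknown correlation
    simulated = sd_of_difference(rho, rng)
    # Formula for this toy model: Var(X - Y) per pair is 2 P (1 - P) (1 - rho).
    theory = 100 * math.sqrt(2 * P * (1 - P) * (1 - rho) / N_PAIRS)
    results[rho] = (simulated, theory)
    print(f"rho={rho:.2f}: simulated SD {simulated:.2f} points, formula {theory:.2f} points")
```

In this toy model a larger guess for the correlation shrinks the random error in the difference between the two percentages, which is why the guess matters. Systematic errors from the matching procedure are a separate issue that a simulation of this kind does not address.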

Statistics of victims of the Dutch child-benefits scandal

Commentary on the CBS report

Author: prof.dr. (em.) Richard D. Gill

Mathematical Institute, Leiden University

Monday January 16, 2023

Richard Gill is emeritus professor of mathematical statistics at Leiden University. He is a member of the KNAW and former chairman of the Netherlands Statistical Society (VVS-OR).

=========================================

Mr. Pieter Omtzigt has asked me to give my expert opinion on the CBS report that examines whether the number of out-of-home placements of children by the Dutch child protection authorities increased because their families had fallen victim to the child benefits scandal in the Netherlands.

The current note is preliminary and I intend to refine it further. My purpose is to stimulate discussion among relevant professionals of the methodology used by the CBS in this particular case. Feedback, please!

The report gives a clear (and short) account of creative statistical analysis of much complexity. The sophisticated nature of the analysis techniques, the urgency of the question, and the need to communicate the results to a general audience probably led to important “fine print” about the reliability of the results being omitted. The authors seem to me to be too confident in their findings.

Numerous choices had to be made by the CBS team to answer the research questions. Many preferable options are excluded due to data availability and confidentiality. Changing one of the many steps in the analysis through changes in criteria or methodology could lead to wildly different answers. The actual finding of two nearly equal percentages (both close to 4%) in the two groups of families is, in my opinion, “too good to be true”. It’s a fluke. Its striking character may have encouraged the authors to formulate their conclusions much more strongly than they are entitled to.

In this regard, I found it significant that the authors note that the datasets are so large that statistical uncertainty is unimportant. But this is simply not true. After constructing an artificial control group, they have two groups of size (in round numbers) 4000, with 4% of cases in each group, i.e. about 160. According to a rule-of-thumb calculation (Poisson variation), the statistical variation in each of those two numbers has a standard deviation of about the square root of 160, so about 12.6. That means that each of those numbers (160) could easily be off by twice the standard deviation, which is about 25. The conclusion that the benefits scandal did not lead to more children being removed from home than would have been the case without it certainly cannot be drawn. Taking into account the statistical sampling error, it is quite possible that the count in the control group (those not afflicted by the benefits scandal) would have been 50 lower. In that case, the study group experienced 50 more removals than they would have done had they not been victims of the benefits scandal.
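The rule-of-thumb arithmetic can be checked in a few lines of Python. This is only a sketch under stated assumptions: the round numbers 4000 and 4% come from the text, the Poisson rule (standard deviation equals the square root of the expected count) is the one named above, and treating the two groups as independent is an extra simplification introduced here. Matching makes the groups correlated, so the true figure for the difference will differ.

```python
import math

group_size = 4000   # round number quoted in the text
rate = 0.04         # about 4% in each group
count = group_size * rate                 # about 160 cases per group

sd_count = math.sqrt(count)               # Poisson rule of thumb: SD = sqrt(mean)
sd_difference = math.sqrt(2 * count)      # SD of the difference of two counts,
                                          # ASSUMING the groups are independent

print(f"expected cases per group: {count:.0f}")
print(f"SD of one count:          {sd_count:.1f}")
print(f"two SDs of one count:     {2 * sd_count:.1f}")
print(f"SD of the difference:     {sd_difference:.1f}")
print(f"two SDs of the difference as a share of a group: "
      f"{100 * 2 * sd_difference / group_size:.2f}%")
```

Even on these crude assumptions, random variation alone is of the order of tens of cases, which is far from negligible next to counts of about 160.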

To make the numbers easier still, suppose there was an error of 40 cases too few in the light blue bar standing for 4%. 40 out of 4000 is 1 out of 100, 1%. Change the light blue bar from height 4% to height 3% and they don’t look the same at all!

And this is before taking into account possible systematic errors. The statistical techniques used are advanced and model-based. This means that they depend on the validity of many particular assumptions about the form and nature of the relationships between the variables included in the analysis (using “logistic regression”). The methodology uses these assumptions for their convenience and power (more assumptions mean stronger conclusions, but threaten “garbage in, garbage out”). Logistic regression is such a popular tool in so many applied fields because the model is so simple: the results are easy to interpret, and the calculation can often be left to the computer without user intervention. But there’s no reason why the model should be exactly true; one can only hope that it is a useful approximation. Whether it is useful depends on the task for which it is used. The current analysis uses logistic regression for purposes for which it was not designed.

The assumptions of the standard model of logistic regression are certainly not exactly met. It is not clear whether the researchers tested for failure of the assumptions (for example, by looking for interaction effects – violation of additivity). The danger is that the failure of the assumptions can lead to systematic bias in the results, bias that affects the synthetic (“matched”) control group. The central assumption in logistic regression is the additivity of effects of various factors on the log-odds scale (“odds” means probability divided by complementary probability; log means logarithm). This could be true to a first rough approximation, but it is certainly not exactly true. “All models are wrong, but some are useful”.
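What additivity on the log-odds scale means, and what an interaction effect does to it, can be shown with a small numerical sketch in Python. The two binary factors `a` and `b` and all coefficient values are hypothetical, chosen only for illustration; they are not taken from the CBS report.

```python
import math


def inv_logit(x):
    """Convert log-odds to a probability."""
    return 1 / (1 + math.exp(-x))


def odds(p):
    """Probability divided by complementary probability."""
    return p / (1 - p)


# Hypothetical coefficients on the log-odds scale (illustration only).
INTERCEPT, EFFECT_A, EFFECT_B, INTERACTION = -3.0, 0.5, 0.8, 0.6


def prob(a, b, with_interaction):
    """Probability of the outcome for binary factors a and b."""
    log_odds = INTERCEPT + EFFECT_A * a + EFFECT_B * b
    if with_interaction:
        log_odds += INTERACTION * a * b   # violation of additivity
    return inv_logit(log_odds)


def odds_ratio_of_a(b, with_interaction):
    """Odds ratio for factor a at a fixed level of factor b."""
    return odds(prob(1, b, with_interaction)) / odds(prob(0, b, with_interaction))


for with_interaction in (False, True):
    label = "with interaction" if with_interaction else "additive model"
    print(f"{label}: odds ratio of a is "
          f"{odds_ratio_of_a(0, with_interaction):.2f} when b=0 and "
          f"{odds_ratio_of_a(1, with_interaction):.2f} when b=1")
```

In the additive model the odds ratio for one factor is the same at every level of the other. With an interaction term it is not, and a model fitted without that term would then be systematically wrong for some subgroups.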

A good practice is to build models by analyzing a first data set and then evaluating the final chosen model on an independently collected second data set. In this study, not one but numerous models were tested. The researchers seem to have chosen from countless possibilities through subjective assessment of plausibility and effectiveness. This is fine in an exploratory analysis. But the findings of such an exploration must be tested against new data (and there is no new data).

The end result was a procedure to choose “nearest neighbour matches” with respect to a number of observed characteristics of the cases examined. Errors in the logistic regression used to choose matched controls can systematically bias the control group.

Further big questions concern the actual selection of cases and controls at the beginning of the analysis. Not all families affected by the benefits scandal had to pay back a huge amount of subsidy. Mixing the hard-hit and the lightly hit dilutes the effect of the scandal, both in magnitude and accuracy, the latter because smaller samples lead to relatively less accurate determination of effect size.
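The dilution can be illustrated with invented numbers. In the Python sketch below, the share of hard-hit families (25%) and their excess risk (4 percentage points) are purely hypothetical; only the round group size of 4000 and the baseline rate of 4% come from the text. The sketch compares the pooled group with the hard-hit subgroup on its own.

```python
import math

n_total = 4000           # round group size from the text
baseline = 0.04          # placement rate of about 4% without the scandal
share_hard_hit = 0.25    # HYPOTHETICAL share of hard-hit families
excess_hard = 0.04       # HYPOTHETICAL extra risk among the hard-hit
excess_light = 0.0       # HYPOTHETICAL extra risk among the lightly hit

# Average excess risk when both kinds of family are pooled.
pooled_excess = share_hard_hit * excess_hard + (1 - share_hard_hit) * excess_light


def se_of_rate(p, n):
    """Binomial standard error of an observed proportion."""
    return math.sqrt(p * (1 - p) / n)


n_hard = int(share_hard_hit * n_total)
se_pooled = se_of_rate(baseline + pooled_excess, n_total)
se_hard = se_of_rate(baseline + excess_hard, n_hard)

print(f"pooled:   excess {100 * pooled_excess:.1f} points, "
      f"SE {100 * se_pooled:.2f} points, ratio {pooled_excess / se_pooled:.1f}")
print(f"hard-hit: excess {100 * excess_hard:.1f} points, "
      f"SE {100 * se_hard:.2f} points, ratio {excess_hard / se_hard:.1f}")
```

With these made-up inputs the pooled excess is a quarter of the excess among the hard-hit. The smaller subgroup has the larger standard error, but the effect there is so much bigger that it still stands out more clearly against the noise.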

Another problem is that the pre-selection control population (families in general from which a child was removed) also contains victims of the benefits scandal (the study population). That brings the two groups closer together, even more so after the familywise one-on-one matching process, which of course selectively finds matches among the subpopulation most likely to be affected by the benefits scandal.

The statistics of victims of the benefits scandal

Author: prof.dr. (em.) Richard D. Gill

Mathematical Institute, Leiden University

Monday 16 January 2023

Richard Gill is emeritus professor of mathematical statistics at Leiden University. He is a member of the KNAW and former chairman of the Netherlands Statistical Society (VVS-OR).

===========================================

Mr. Pieter Omtzigt has asked me to give my expert opinion on the CBS report which examines whether the number of out-of-home placements of children by the Dutch child protection authorities increased because their families had become victims of the child benefits scandal in the Netherlands. The current note is preliminary and I intend to refine it further. Comments and criticism are welcome.

The report gives a clear (and short) account of creative statistical analyses of some complexity. The advanced character of the analysis techniques, the urgency of the question, and the need to communicate the results to a general audience have probably led to important “small print” about the reliability of the results being left out. The authors seem to me to have too much confidence in their findings.

To answer the research questions, the CBS team had to make numerous choices. Many preferable options were excluded because of data availability and confidentiality. Changing one of the many steps in the analysis, through changes in criteria or methodology, can lead to hugely different answers. The actual finding of two almost equal percentages (both close to 4%) in the two groups of families is, in my opinion, “too good to be true”. It is a fluke. Its striking character may have encouraged the authors to formulate their conclusions much more strongly than they are entitled to.

In this connection I found it telling that the authors remark that the datasets are so large that statistical uncertainty is unimportant. But this is simply not true. After construction of an artificial control group they have two groups of size (in round numbers) 4000, and 4% of cases in each group, that is to say about 160. According to a rule-of-thumb calculation (Poisson variation), the statistical variation in each of those two numbers has a standard deviation of about the square root of 160, so about 12.6. That means that each of those numbers (160) can easily be off, by chance, by twice the standard deviation, namely about 25.

Taking into account the statistical sampling error, it is quite possible that the count in the control group (those not affected by the benefits scandal) would have been 50 lower. In that case the study group experienced 50 more removals than they would have done had they not been victims of the benefits scandal.

To make the numbers easier still, suppose there was an error of 40 cases too few in the light blue bar that stands for 4%. 40 out of 4000 is 1 out of 100, 1%. Change the light blue bar from height 4% to height 3% and they do not look the same at all!

And this is before taking into account possible systematic errors. The statistical techniques used are advanced and model-based. This means that they depend on the validity of numerous particular assumptions about the form and nature of the relationships between the variables included in the analysis (using “logistic regression”). The methodology uses these assumptions for their convenience and power (more assumptions mean stronger conclusions, but then “garbage in, garbage out” threatens). Logistic regression is such a popular tool in so many applied fields because the model is so simple: the results are easy to interpret, and the calculation can often be left to the computer without user intervention. But there is no reason at all why the model should be exactly true; one can only hope that it is a useful approximation. Whether it is useful depends on the task for which one uses it. The current analysis uses logistic regression for purposes for which it was not designed.

The assumptions of the standard model are certainly not exactly met. It is not clear whether the researchers tested for failure of the assumptions (for example, by looking for interaction effects, i.e. violation of additivity). The danger is that failure of the assumptions can lead to systematic bias in the results, bias that affects the synthetic (“matched”) control group. The central assumption in logistic regression is the additivity of the effects of various factors on the log-odds scale (“odds” means probability divided by complementary probability; log means logarithm). This could be true as a first rough approximation, but it is certainly not exactly true. “All models are wrong, but some are useful”.

A good practice is to build models by analysing a first dataset and then to evaluate the finally chosen model on an independently collected second dataset. In this study not one but numerous models were tried out. The researchers seem to have chosen from countless possibilities by subjective assessment of plausibility and effectiveness. This is fine in an exploratory analysis. But the findings of such an exploration must be tested against new data (and there are no new data).

The result was a procedure to choose “nearest neighbour matches” with respect to a number of observed characteristics of the cases examined. Errors in the logistic regression used to choose matched controls can systematically bias the control group.

Further questions concern the actual selection of cases and controls at the beginning of the analysis. Not all families affected by the benefits scandal had to pay back a huge amount of subsidy. Mixing the hard-hit and the lightly hit weakens the effect of the scandal, both in size and in precision.

Another problem is that the pre-selection control population (families in general from which a child was removed) also contains victims of the benefits scandal (the study population). That brings the two groups closer together, and even more so after the matching process, which of course selectively finds matches among the subpopulation most likely to have been affected by the benefits scandal.
