Letter to the BMJ

Rapid response to:

John Launer: Thinking the unthinkable on Lucy Letby

BMJ 2023; 382 doi: https://doi.org/10.1136/bmj.p2197, published 26 September 2023, cite as: BMJ 2023;382:p2197

Dear Editor

I am a coauthor of the report of the Royal Statistical Society https://rss.org.uk/news-publication/news-publications/2022/section-group-reports/rss-publishes-report-on-dealing-with-uncertainty-i/. It is deeply distressing that the police investigation into the case of Lucy Letby and the subsequent trial made all of the mistakes in our book. The jury was never told how the police investigation arrived at that list of “suspicious” events and how it was further narrowed down to the list of charges. This is a case in which a target was painted around a suspect by investigators. In statistics, we call it confirmation bias. It is also often referred to as the Texas sharpshooter paradox.

Thanks to amateurs who report their work on Twitter and YouTube, we now know how the list of charges in the Lucy Letby case evolved. It is utterly scandalous that this history was not revealed to the court. Here is the broad picture. 

Doctors reported Lucy to the police, against the wishes of the hospital board.

They told the police the exact period she had been on the ward and gave them the files on all deaths in that period and on some of the incidents: namely, exactly and only those “arrests” at which Lucy had been present.

What qualifies as an incident, what is an arrest?

There is no medical category “arrest, resuscitation” under which such events are logged in hospital administration. Probably there were about five times as many such events when Lucy was not on duty, but nobody has ever looked. There is no medical definition of such an event. No formal criteria.

“Unexpected, unexplained, sudden” are also not defined in any formal way. Nor is “stable”.

Next the absolutely unqualified, long retired, paediatrician Dewi Evans, who has a business helping out in civil child custody cases, went through those medical files looking for anomalies about which he could fantasise a murder or murder attack. His ideas that milk was injected into the stomach or air into the veins were far-fetched, and later not confirmed by any other evidence. On the contrary, the actual evidence certainly contradicts the idea that Lucy Letby actually attacked any child. He never gave alternative medical explanations, as would have been the obligation of a forensic scientist. All the deaths had had a post-mortem and a coroner’s report. Every single event on the charge sheet has an absolutely normal explanation. Lucy was never seen doing anything wrong.

The medical experts for the prosecution merely confirmed Evans’ diagnosis; they also did not do the job of a forensic scientist.

The defence had no experts. They had brought in one paediatrician. But at the pre-trial hearing he said he wasn’t qualified in endocrinology, toxicology, etc etc etc. 

This was Texas sharpshooter, big time. Plus utterly incompetent defence. 

Richard Gill

Member of Royal Dutch Academy of Sciences

Past president of Dutch statistical society.

The post-it note

Is Lucy’s post-it note a confession? Whether you will see it as a confession or a cry of innocent anguish depends on whether *you* have a heart and a brain. If you read it carefully, you will see that Lucy does not say that she killed those babies. She says that *they said* she killed those babies. Yes, she does say she is evil. She thinks she is clearly a bad nurse who apparently couldn’t save those babies, despite her (possibly too energetic, and certainly not well supervised) attempts. More seriously, she had had an affair with an older married man, a doctor, who later dumped her and betrayed her. She spoke out about doctors’ mistakes and about the catastrophic hygienic circumstances in which she and her colleagues had to work. For two years, doctors had tried to have her taken off that ward, because she pissed them off. Her colleague nurses loved her for her forthrightness and lovely character. She is so sorry for the suffering she caused her parents and step-brothers. She is considering suicide. She has PTSD.

This deciphering of the note was created by https://x.com/chrisjclarkesq?s=21&t=1S47Jut6K2dqjKzr1sc-4A , known as Mycroft on ‘X’, that is the ‘X’ formerly known as Twitter.

How to lie with data

This spreadsheet was shown on TV both yesterday (Friday August 18, the day of the verdicts) and at the start of the trial of Lucy Letby. Apparently, Cheshire Constabulary find this absolutely damning evidence against Lucy. And indeed, many journalists seem to agree.

The 25 events are almost all of the events at which LL was present during the periods investigated. They are suspicious because she was under suspicion when the police started their investigations. Not surprisingly, most nurses are not present at many of these events. And of course, many nurses probably work far fewer hours than LL. Many are often on administrative duties.

The doctors on the ward are of course missing. Doctors were never investigated as suspects but from the start of police investigations apparently always believed to speak gospel truth. During cross-examination, during the trial, some of them have changed various parts of their stories. Of course, unlike Lucy, they do not lie, since they could never (under oath in court, or earlier, when being interviewed as witnesses by police) be saying untruths in order to deceive.

Back to the spreadsheet. When drawing conclusions from any data it is important to know how it was gathered. It is important to know what data is missing, but would be needed to draw even the most preliminary and tentative inferences.
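To make the point about data gathering concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the number of nurses, the number of events, and the assumption that each nurse is independently on shift for 30% of events are invented for illustration and are not taken from the trial data. It shows that if only those events at which one particular nurse was present are put on the chart, that nurse will show 100% attendance even when nobody did anything wrong.

```python
import random


def simulate(n_nurses=20, n_events=100, p_on_shift=0.3, seed=0):
    """Simulate a roster chart under purely random, innocent attendance.

    All parameters are hypothetical illustration values, not trial data.
    Returns two lists of attendance shares per nurse:
    one over all events, one over the "selected" events only.
    """
    rng = random.Random(seed)
    # presence[e][j] is True if nurse j was on shift at event e.
    presence = [
        [rng.random() < p_on_shift for _ in range(n_nurses)]
        for _ in range(n_events)
    ]
    suspect = 0  # the nurse who is already under suspicion

    # Biased selection: keep only events at which the suspect was present.
    selected = [row for row in presence if row[suspect]]

    def share(rows, nurse):
        # Fraction of the given events at which this nurse was present.
        return sum(row[nurse] for row in rows) / len(rows)

    all_events = [share(presence, j) for j in range(n_nurses)]
    chosen = [share(selected, j) for j in range(n_nurses)]
    return all_events, chosen


if __name__ == "__main__":
    all_events, chosen = simulate()
    print("Suspect, all events:      %.2f" % all_events[0])
    print("Suspect, selected events: %.2f" % chosen[0])
    print("Highest other nurse, selected events: %.2f" % max(chosen[1:]))
```

Over all events the suspect looks like everyone else; over the selected events the suspect is present every time, by construction. The chart then reflects the selection rule, not anything about the nurse.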

There was an NHS investigation into the raised rates of deaths and collapses at Countess of Chester Hospital (CoCH) in summer 2015 and summer 2016. It was published in 2017 by the Royal College of Paediatrics and Child Health (RCPCH). The investigation blamed the consultants for the appallingly low standard of care, and the terrible situation regarding hygiene. The RCPCH investigators actually wrote that nurse Lucy Letby could not be associated with the events, but that passage was redacted out of the published report for privacy reasons. We know that already, consultants had presented their fears to hospital management. One of them (successful TV doctor and Facebook influencer Dr Ravi Jayaram) was on TV yesterday proudly telling the world that he had been vindicated. Management was inclined not to believe them, and did not act on them, but they certainly came to the ears of the RCPCH. On publication of the report, four consultants had had enough, and went to the police with their suspicions that LL was a murderer.

Thanks to FOI requests and statistical analysis by independent scientists, we now know that the rate of events (deaths and collapses) is just as much raised when Lucy is not on the ward as it is when she is on the ward. A lot of medical information (as well as the state of the drains at CoCH) points to a seasonal virus epidemic.
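A comparison of this kind can be sketched as a rate ratio with an approximate confidence interval. The sketch below is only an illustration: the counts in the usage example are made up, the choice of "shifts" as the unit of exposure is an assumption, and the interval is the standard large-sample approximation on the log scale, not necessarily the method used by the independent analysts referred to above.

```python
import math


def rate_ratio(events_on, shifts_on, events_off, shifts_off, z=1.96):
    """Ratio of event rates (on the ward versus off) with an approximate 95% interval.

    Treats the event counts as Poisson. The standard error of the log rate
    ratio is then approximately sqrt(1/events_on + 1/events_off).
    Both event counts must be greater than zero.
    """
    rate_on = events_on / shifts_on
    rate_off = events_off / shifts_off
    rr = rate_on / rate_off
    se_log = math.sqrt(1.0 / events_on + 1.0 / events_off)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not the FOI data).
    rr, lower, upper = rate_ratio(events_on=10, shifts_on=100,
                                  events_off=30, shifts_off=300)
    print("rate ratio %.2f, approx. 95%% interval (%.2f, %.2f)" % (rr, lower, upper))
```

A ratio near 1 with an interval that comfortably includes 1 is what "just as much raised" would look like. The essential point is that the denominator (how many shifts were worked) must be known before any count of events can be interpreted.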

The elevated rate went back to normal after the hospital was down-graded (no longer accepting high risk patients), and when the drains were rebuilt, and when the senior consultant retired, all of which happened soon after the police investigation started. Incidentally, the rates of still-births and miscarriages show exactly the same pattern.

Lucy must certainly have been a witch in order to kill babies in the womb and even when she is far from the hospital.

Those familiar with miscarriages of justice involving serial killer nurses will be familiar with this police and prosecution tactic. Is it evil or is it just stupid? (cf. Hanlon’s razor). I think it is quite simply “learnt”. Police and prosecution learn what convinces jurors over the years, and that is why the same “mistakes” are made again and again. They work!

The Lucy Letby case

Note: [20 August 2023] This post is incomplete. It needs a prequel: the history of medical investigations into two “unexplained clusters” of deaths at the neonatal ward of the Countess of Chester Hospital. It needs many sequels: statistical evidence; how the cases were selected (the Texas sharpshooter paradox) and the origin of suspicions that a particular nurse might be a serial killer; the post-it note; the alleged insulin poisonings; the trouble with sewage backflow and the evidence of the plumber; the euthanasias. For the medical material, the site to visit is the magnificent https://rexvlucyletby2023.com/.

Lucy Letby, a young nurse, has been tried at Manchester Crown Court for 7 murders and 15 murder attempts on 17 newborn children in the neonatal ward at Countess of Chester Hospital, Chester, UK, in 2015 and 2016.

She was found:
– Guilty of 7 counts of murder (against 7 babies)
– Guilty of 7 counts of attempted murder (against 6 babies)
– Not guilty on 2 counts of attempted murder (against 2 of the 6 babies she *was* found guilty of attempting to murder).

No decision was reached on 6 counts of attempted murder against 6 different babies. However, for 2 of those 6 she was also found guilty on a different count of attempted murder. [Thanks to the commenter who corrected my numbers.]

The prosecution dropped one further murder charge just before the trial started, on the instruction of the judge. Several groups of alleged murders and murder attempts concern the same child, or twin or triplet siblings. All but one child was born pre-term. Several of them, extremely pre-term.

I’m not saying that I know that Lucy Letby is innocent. As a scientist, I am saying that this case is a major miscarriage of justice. Lucy did not have a fair trial. The similarities with the famous case of Lucia de Berk in the Netherlands are deeply disturbing.

The image below summarizes findings concerning the medical evidence. This was not my research. The graphic was given to me by a person who wishes to remain anonymous, in order to disseminate the research now fully documented on https://rexvlucyletby2023.com/, whose author and owner wishes to remain anonymous. Note that the defence has not called any expert witnesses at all (except for one person: the plumber). Possibly, they had not enough funds for this. Crowd-sourcing might be a smart way of getting the necessary work done for free, to be used at a subsequent appeal. That’s a dangerous tactic, and it seems to me that the defence has already taken a foolish step: they admitted that two babies received unauthorised doses of insulin, and their client was obliged to believe that too.

This blog post started in May 2023 as a first attempt by myself to blog about a case which I have been following for a long time. The information I report here was uncovered by others and is discussed on various internet fora. Links and sources are given below, some lead to yet more excellent sources. Everything here was communicated to the defence, but they declined to use it in court. Maybe they felt their hands were bound by pre-trial agreement between the trial parties as to what evidence would be brought to the attention of the jury, which witnesses, etc.

An extraordinary feature of UK criminal prosecution law is that if exculpatory evidence is in the possession of the defence, but not used in court, then it should not be used at a subsequent appeal, whether by the same defence team or a new one. This might explain why the defence team would not even inform their client of their knowledge of the existence of evidence which exonerated her. Even so, it is also against the law that they did not, as far as we know, disclose evidence which they had which was in her favour. The UK law on criminal court procedure is case law. New judges can always decide to depart from past judges’ rulings.

A very important issue is the rule that all expert evidence must be introduced before the trial starts. It is strictly forbidden to introduce new expert evidence once the trial is underway.

UK criminal trials are tightly scripted theatre. The jury is of course incommunicado, very close to its verdict, and I do not aim to influence the jury or their verdict. I aim to stimulate discussion of the case in advance of a likely appeal against a likely guilty verdict. I wish to support that small part of the UK population who are deeply concerned that this trial is going to end in an unjustified guilty verdict. Probably it will, but that will not be the end. So much information has come out in the 9 months of the trial so far, that a serious fight on behalf of Lucy Letby is now possible. Public opinion crystallised long ago against Lucy. It can be made fluid again, and maybe it can even be reversed, and this is what must happen if she is to get a fair re-trial.

As a concerned scientist who perceives a miscarriage of justice in the making, I attempted to communicate information not only to the defence but also to the prosecution, to the judge (via the clerk of the court), and to the Director of Public Prosecutions. That was a Kafkaesque experience which I will write about on another occasion. Personally, I tend to think that Lucy is innocent. That was however not my reason for attempting to contact the authorities. As a scientist, it was manifestly clear to me that she was not getting a fair trial. Science was being abused. I tried to communicate with the appropriate authorities. I failed to get any response. Therefore I had to “go public”.

Here is a short list of key medical/scientific issues, originally copied from an early version of the incredible and amazing website https://rexvlucyletby2023.com/, with occasional slight rephrasing and some small, hopefully correct, additions by myself. That site presents full scientific documentation and argumentation for all of the claims made there.

  1. Air embolism cannot be determined by imaging, and can only be determined soon after death, and requires the extraction of air from the circulatory system, and analysis of the composition of the air using gas chromatography.
  2. The coroner found a cause of death in 5 out of 7 of the alleged murder cases. Two of them appeared to be, in part, related to aggressive CPR, two appeared to be due to undiagnosed hypoxic-ischemic encephalopathy and myocarditis, one of the infants received no autopsy, and the other infant was determined to have died due to prematurity. It is highly unusual for the cause of death to be altered years after the fact and using methodology that is not supported by the coroner’s office.
  3. The two claims of insulin poisoning are not supported by the testing conducted, and the infants (who are still alive and well) did not have dangerously low or dangerously high blood glucose levels for any period of time. There are many physiological reasons that could explain their low blood glucose during the whole period. In one of the two cases, assumptions are being made on the basis of one test taken at a single time point, clearly inconsistent with the other medical readings, and contravening the manufacturer’s own instructions for use (see image below). The report detailing the conclusions from that single test violates the code of practice of the forensic science regulator. Moreover, it appears that some numerical error has been made in the necessary calculation, resulting in an outcome which is physiologically impossible (or the person responsible did not know about the so-called “hook effect”). The mismatch between C-peptide and insulin concentration does not prove that the excess insulin found must have been synthetic insulin. There are many other biological explanations for a mismatch. No testing was done to determine the origin of the insulin. Similarly, there are many innocent explanations for the detection of some insulin in a feeding bag.
  4. The air embolism hypothesis is confusing because it fails to explain why some children apparently perished and others did not, and it has not been supported by the minimal necessary measurements.
  5. In at least one case, Lucy is blamed for causing white matter brain injury. This claim is utterly dishonest. The infant who experienced this brain injury was born at 23 weeks gestation, and white matter brain injury is associated with such early births. Further, there is sufficient evidence that enterovirus and parechovirus infections are linked to white matter brain injury in neonates, resulting in cerebral palsy.
  6. At the time of the collapses and deaths of the infants, enterovirus and parechovirus had been reported in other hospitals. There is a history of outbreaks of these viruses in neonatal wards in hospitals around the world. They especially harm preterm infants who do not yet have a functioning immune system. It is reported that many parents of the infants were concerned that their ward had a virus (as was Lucy) and that Dr Gibbs denied this was so. To date we have seen no evidence to show they did any viral testing, and if they did what the results were.

Then a fact pertaining to my own scientific competence.

Both prosecution and defence were warned long ago about the statistical issues in such cases. Both have responded that they are not going to use any statistics. They are also not using the services of any statistician. Seems the RSS report https://rss.org.uk/news-publication/news-publications/2022/section-group-reports/rss-publishes-report-on-dealing-with-uncertainty-i/ has had the opposite effect to that intended. Amusingly, the same thing happened in the case of Lucia de Berk. At the appeal the prosecution stopped using statistics. She was convicted solely on the grounds of “irrefutable medical scientific evidence”. (Here, I’m quoting from the words both spoken by the judges and written down on the first page of their > 100 page report of the reasons and reasoning which had led to their unshakable conviction that Lucia de Berk was guilty. The longest judge’s summing up in Dutch legal history). I was one of the five coauthors of the RSS report. We were a “task force”, formally commissioned by the “Statistics and the Law” section of the society. I consider it the most important scientific work of my career. It took us two years to put together. We started the work in 2020; we had seen the Lucy Letby trial on the horizon since 2017 when police investigations started and the suspect being investigated was already common knowledge.

The UK does not have anything like such written reasons for a verdict, because a jury of ordinary folk are the ones who (legally) determine guilt or innocence. This is a clever device which makes fighting a conviction very difficult; no one can know what arguments the jury had in their mind, no one knows what, if anything, was the key fact that convinced them of guilt. Ordinary people are convinced by what seems to be a smoking gun; they then see all the other evidence through a filter. This is called “confirmation bias”. In the Lucy Letby case, the smoking gun was probably the post-it note, and the insulin then seems to clinch the matter. The prosecution cross-examination convinces those who already believe Lucy is guilty that she moreover is constantly lying. More on all this in later posts, I hope.

Back to the insulin. Here are the instructions on the insulin testing kit used for the trial, taken from this website http://pathlabs.rlbuht.nhs.uk/ccfram.htm, the actual file is http://pathlabs.rlbuht.nhs.uk/insulin.pdf. Notice the warning printed in red. Yes, it was printed in red, that was not something I changed later. (All this is not my discovery; the person who uncovered these facts wishes to remain anonymous).

The toxicological evidence used in the trial violates the code of practice of the UK’s Forensic Science Regulator (see link below). It should have been deemed inadmissible. Instead, the defence has not disputed it, and thereby obliged their own client Lucy to agree that there must have been a killer on the ward. The jury are instructed to believe that two babies were given insulin without authorization, endangering their lives. (The two babies in question are still very much alive, to this day. Probably now at primary school.)

The defence stated to me that they cannot inform Lucy of the alternative analysis of the insulin question. It appears to me that this violates their own code of practice. Do they feel bound by the weird rules of UK’s criminal prosecution practice? Their client, Lucy Letby, is herself essentially merely a piece of evidence, seized by the police from what they believe is a scene of crime. No one may tamper with it for the duration of her own trial, which is lasting 10 months! I think this constitutes an appalling violation of basic human rights. The UK laws on contempt of court are meant to guarantee a fair trial. But in the case of a 10-month trial on 22 charges of murder and attempted murder, they are guaranteeing an unfair trial.

Lucy’s solicitor refused to pass on a friendly personal letter of support to Lucy or to her parents because she had not instructed him to do so. Should one laugh or cry about that excuse? I have the impression that he is not very bright and that he may have been convinced she is guilty. If so, I hope he is changing his mind. In the UK, the solicitor does all the legwork and communication between the client and the defence team. The barrister does the cross-examinations and the court theatrics, but probably never builds up a personal relationship with his client. Lucy has been all this time in prison, in pre-trial detention, far from Manchester or Hereford. This might explain the extraordinarily weak defence which has been put up so far. But it might be deliberate.

One must take into account the fact that funding for legal support is meagre. The prosecution has been working on the case for 6 or so years, with unlimited resources. The defence has had a relatively very short time, with very limited resources. Probably the solicitor and the barrister already put in many more hours than they are paid for. There are no funds for expensive scientific witnesses. It is very possible that the defence team well understands that they cannot put up a serious defence during the 9 to 10 months of the trial, but that precisely in this time period, with a huge number of revelations being made outside the trial, material for a serious defence during an appeal has been “crowd-sourced”. It seems to me that this mass of high-quality independent scientific work provides plenty of grounds for an appeal, in the case that the jury hands down a guilty verdict.

Some links:

Sarrita Adams’ Science on Trial website

scienceontrial.com

Formerly: https://rexvlucyletby2023.com/


Scott McLachlan’s Law Health and Tech blog

LL Part 0: Scepticism in Action: Reflections on evidence presented in the Lucy Letby trial. https://lawhealthandtech.substack.com/p/scepticism-in-action

LL Part 1: Hospital Wastewater https://lawhealthandtech.substack.com/p/ll-part-1-hospital-wastewater

LL Part 2: An ‘Association’ https://lawhealthandtech.substack.com/p/ll-part-2-an-association

LL Part 3: Death already lived in the NICU Environment, https://lawhealthandtech.substack.com/p/ll-part-3-death-already-lived-in

LL Part 4: Outbreak in a New NICU: Build it and the pathogens will come…https://lawhealthandtech.substack.com/p/ll-part-4-outbreak-in-a-new-nicu

LL Part 5: The Demise of Child A https://lawhealthandtech.substack.com/p/ll-part-5-the-demise-of-child-a

LL Part 6: The Incredible Dr Dewi Evans https://lawhealthandtech.substack.com/p/ll-part-6-the-incredible-dr-dewi

LL Part 7: The Demise of Child C. https://lawhealthandtech.substack.com/p/ll-part-7-the-demise-of-child-c

LL Part 8: The Death of Child D. Had she been left or resumed on CPAP, she might still be alive today. https://lawhealthandtech.substack.com/p/ll-part-8-the-death-of-child-d


Peter Elston’s “Chimpinvestor” blog

Do Statistics Prove Accused Nurse Lucy Letby Innocent? https://www.chimpinvestor.com/post/do-statistics-prove-accused-nurse-lucy-letby-innocent This splendid and comprehensive blog post also has a large list of links to reports and data sets. Yet more data analysis can and should be done. This site gives anyone who wants to a quick-start. And after that, two more outstanding posts…

https://www.chimpinvestor.com/post/more-remarkable-statistics-in-the-lucy-letby-case

https://www.chimpinvestor.com/post/the-travesty-of-the-lucy-letby-verdicts


Data obtained from FOI requests

FOI requests provided some fantastic data sets https://www.whatdotheyknow.com/request/neonatal_deaths_and_fois#incoming-1255362 see especially https://www.whatdotheyknow.com/request/521287/response/1265224/attach/2/FOI%204568×1.xlsx?cookie_passthrough=1


How forensic science should be reported in court

Forensic Science Regulator: statutory code of practice https://www.gov.uk/government/publications/statutory-code-of-practice-for-forensic-science-activities


One of numerous enterovirus and parechovirus epidemics in neonatal wards

Cluster of human parechovirus infections as the predominant cause of sepsis in neonates and infants, Leicester, United Kingdom, 8 May to 2 August 2016 https://www.eurosurveillance.org/content/10.2807/1560-7917.ES.2016.21.34.30326


Someone commissioned a pretrial statistical and risk analysis – results not used in the trial

Lucy Letby Trial, Statistical and Risk Analysis Expert Input. Who commissioned this analysis, and what did it yield? (I can give you the answer after the verdict has come out). https://www.oldfieldconsultancy.co.uk/lucy-letby-trial-statistical-and-risk-analysis-expert-input/


The RSS (statistics and law section) report – not used in the trial

Royal Statistical Society: “Healthcare serial killer or coincidence?
Statistical issues in investigation of suspected medical misconduct” by the RSS Statistics and the Law Section, September 2022 https://rss.org.uk/news-publication/news-publications/2022/section-group-reports/rss-publishes-report-on-dealing-with-uncertainty-i/

At a pre-publication meeting of stake-holders held to gain feedback on our report, a senior West Midlands police inspector told me “we are not using statistics because they only make people confused”. Lucy’s solicitor and barrister knew well in advance of our report, were even given names of excellent UK experts whom they could consult, but did not bother to contact one of them. No statistics in our courts please, we are British! Yet the UK has the best applied statisticians and epidemiologists in the world.


Article in “Science” about my work on serial killer nurses

Unlucky Numbers: Richard Gill is fighting the shoddy statistics that put nurses in prison for serial murder. Science, Vol 379, Issue 6629, 2023. https://www.science.org/content/article/unlucky-numbers-fighting-murder-convictions-rest-shoddy-stats


Two subreddits on the Lucy Letby case

https://www.reddit.com/r/scienceLucyLetby/ (the Lucy Letby Science subreddit)

https://www.reddit.com/r/lucyletby/ (general)


Medical Ethics

John Gibbs, recently retired Consultant Paediatrician at the Countess of Chester Hospital, defined Medical Ethics as “Playing God with Life and Death decisions.” See the article “Medical Ethics” on page 6 of The Messenger (Monthly Newsletter of St Michael’s, Plas Newton, Chester), reporting on a talk by Dr John Gibbs, retiring paediatrician at CoCH. https://stmichaelschester.com/wp-content/uploads/2019/04/Messenger-April-2020.pdf. Audio: https://stmichaelschester.com/sermons/encounter-medical-ethics/


The state of forensic science in the UK

https://www.bbc.co.uk/sounds/play/m001k7vt?partner=uk.co.bbc&origin=share-mobile “The UK’s forensic science used to be considered the gold standard, but no longer. The risk of miscarriages of justice is growing. And now a new Westminster Commission is trying to find out what went wrong. Joshua talks to its co-chair, leading forensic scientist Dr Angela Gallop CBE, and to criminal defence barrister Katy Thorne KC.”


Criminal Procedure Rules and Criminal Practice Directions

Revised rules came out earlier this year, so maybe they do not apply to a trial which started earlier. Still, they express what the Lord Chief Justice of England and Wales presently wants to promote. https://www.judiciary.uk/guidance-and-resources/message-from-lord-burnett-lord-chief-justice-of-england-and-wales-new-criminal-practice-directions-2023/ . See especially Section 7 of his “Criminal Practice Directions (2023)” https://www.judiciary.uk/wp-content/uploads/2023/04/Criminal-Practice-Directions-2023-1-3.pdf


New expert evidence cannot be admitted once a trial is in progress

“The courts have indicated that they are prepared to refuse leave to the Defence to call expert evidence where they have failed to comply with CrimPR; for example by serving reports late in the proceedings, which raise new issues (Writtle v DPP [2009] EWHC 236). See also: R v Ensor [2010] 1 Cr. App. R.18 and Reed, Reed & Garmson [2009] EWCA Crim. 2698”. This quote comes from https://www.cps.gov.uk/legal-guidance/expert-evidence. Note, a judge is always allowed to break with precedent. The rule is not actually a permanent rule, it is merely a description of current practice. Current practice evolves when and if a new judge sees fit to break with precedent. Obviously, he would have to come up with good legal reasons why he believes he has to do that. It’s his prerogative, his free choice. That’s the essence of case law, aka common law.

CBS Statistieken, uitkeringsaffaire, uithuisplaatsingen

Deze saga zet zich voort met een nieuwe publicatie van het CBS, https://www.cbs.nl/nl-nl/achtergrond/2023/05/onzekerheidsmarges-onderzoek-toeslagenaffaire-en-jeugdbescherming. Nou ja, het kwam uit anderhalf maand geleden. Ik was met andere dingen bezig …

Hierbij een eerste indruk. Er worden nu betrouwbaarheidsintervallen bepaald en men ziet meteen dat de statistische onzekerheid enorm is. Natuurlijk, worden deze berekeningen gebaseerd op statistische veronderstellingen, en die zijn altijd betwistbaar. Maar op zijn minst kunnen ze geinterpreteerd worden op een pure beschrijvend-data manier als een gevoeligheids analyse. Een brede interval laat zien dat als de data een klein beetje anders was, het antwoord totaal anders zou zijn geweest. We weten zo wie zo dat er allerlei foutbronnen zijn; we weten dat de gegevens in de data bestanden van rijksinstellingen heel ver kunnen afliggen van de ervaringen van de burgers; dat ze afhangen van allerlei definities en afspraken die hun oorsprong hebben in bureaucratische administraties.

Een belangrijke resultaat is het plaatje hieronder, waarbij statistische onzekerheidsmarges toegevoegd zijn aan een plaatje uit de eerste (en omstreden) CBS rapport. Figuur 6.1.1.

Ik heb de “kleine letters” en de “nog kleinere kleine letters” meegenomen, niet om te lezen, maar om te laten zien dat er een hele technisch verhaal bijhoort.

De eerste indruk is dat het lijntje in het midden ongeveer plat is. Dus: de nare ingreep (gedupeerd zijn) in jaar “nul” geen sterke effect heeft. Men ziet over meerdere jaren een lichte toename bij dezelfde 4000 gezinnen van maatregelen van jeugdbescherming wat, zo te zien, beste toevallig had kunnen zijn. De hypothese van “geen impact” kan niet verworpen worden op grond van deze cijfers.

Maar, dat is niet de enige mogelijke uitleg van het plaatje, en die is net zo min te verwerpen. Dat hobbeltje in de grafiek zou ook “echt” kunnen zijn, en bovendien veroorzaakt door de klap wat de belastingdienst in “jaar nul” uitdeelde. Het ziet eruit als een stijging van een half procent per jaar, over meerdere jaren. De meest aannemelijke schatting is dat 20 tot 30 (of zelfs meer) echte dubbele slachtoffers zijn; dubbele slachtoffers in de zin dat gedupeerd zijn door de uitkeringsschandaal werkelijk leidde tot een uithuisplaatsing wat anders niet zou zijn gebeurd.

Het echte effect is gedempt en uitgesmeerd door alle tekortkomingen van het onderzoek. De conclusie moet zijn: het zijn zeker tientallen en mogelijk zelfs honderd.

Incidentally, I would like at some point to have one extra number that would let me evaluate the statistical uncertainty in the difference in height between these two values (blue and green).

There are roughly 4000 victims, and they have been paired one-to-one with comparable non-victims. In effect we are dealing with around 4000 matched pairs. For each member of each pair, CBS knows whether a youth protection action took place. So we actually have 4000 observations of pairs, each of which can take one of the four values (0, 0), (0, 1), (1, 0), (1, 1); call the two numbers (x, y). A “1” means an out-of-home placement (or something similar), a “0” means no out-of-home placement. We are interested in the mean of the x’s minus the mean of the y’s. That is the same as the mean of all the (x − y) values; each of them is equal to −1, 0, or +1. I would like to see the 2×2 table of counts of each of the four possible joint outcomes (x, y). I would like to compute the standard deviation of the (x − y) values. This would give us insight into how successful the matching was: if all is well, we should see a positive correlation between the outcomes of the two groups. A correlation of +1 would imply that the outcome is completely determined by the matching variables, which would mean: being a victim really made no difference. Come on, CBS!
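The calculation requested here can be sketched in a few lines of Python. This is only an illustration: the counts in the example are purely hypothetical (the real 2×2 table has not been published), and the function name `paired_difference_summary` and the convention that x belongs to the victim family and y to its matched control are assumptions of this sketch.

```python
import math

def paired_difference_summary(n00, n01, n10, n11):
    """Summarise matched pairs (x, y) given the four cell counts.

    n10 = pairs with x=1, y=0 (assumed: placement only in the victim family)
    n01 = pairs with x=0, y=1 (assumed: placement only in the control family)
    """
    n = n00 + n01 + n10 + n11
    # x - y is +1 for (1, 0), -1 for (0, 1) and 0 otherwise
    mean_diff = (n10 - n01) / n
    # (x - y)^2 equals 1 exactly for the discordant pairs
    var_diff = (n10 + n01) / n - mean_diff ** 2
    se_mean_diff = math.sqrt(var_diff / n)
    # Correlation between x and y within pairs (the "phi" coefficient)
    px = (n10 + n11) / n
    py = (n01 + n11) / n
    cov = n11 / n - px * py
    corr = cov / math.sqrt(px * (1 - px) * py * (1 - py))
    return mean_diff, se_mean_diff, corr

if __name__ == "__main__":
    # Hypothetical counts: 4000 pairs, 4% placements in each group
    mean_diff, se, corr = paired_difference_summary(3700, 140, 140, 20)
    print(f"mean difference  = {mean_diff:.4f}")
    print(f"standard error   = {se:.4f}")
    print(f"within-pair corr = {corr:.3f}")
```

Only the discordant pairs, (1, 0) and (0, 1), contribute to the difference and to its uncertainty, which is why the full 2×2 table is needed and the two marginal percentages alone are not enough.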

The bogeyman (Algemeen Dagblad, 26 January)

Date: 2023-01-26 09:38:02

Richard Gill. © Rob Voss

Professor Gill helped exonerate Lucia de B., and is now making mincemeat of the CBS report on benefits affair

Top statistician Richard Gill cracks down on the research conducted by Statistics Netherlands (CBS) into custodial placements of children of victims in the benefits affair. ‘CBS should never have come to the conclusion that this group of parents was not hit harder than other parents.’

Carla van der Wal 26-01-23, 06:00 Last update: 08:10

Emeritus professor Richard Gill would prefer to pick edible mushrooms in the woods and spend time with his grandchildren. Nevertheless, the top statistician in the Netherlands, who previously helped to exonerate the unjustly convicted Lucia de B., is now sinking his teeth into the benefits affair.

CBS should never have started the investigation into the custodial placement of children of victims in the benefits affair, says Gill. “And the conclusion that this group of parents has not been hit harder than other parents, CBS should never have drawn. It left many people thinking: only the tax authorities have failed, but fortunately there is nothing wrong with youth care. So all the fuss about ‘state kidnappings’ was unnecessary.”

After Statistics Netherlands calculated how many children of benefit parents were placed out of home (in the end it turned out to be 2090), it seemed that victims in the affair lost their children more often than similar parents who were not victims. The results were presented on November 1 last year, which Gill now denounces.

Gill is emeritus professor of mathematical statistics at Leiden University and in the past was an advisor to the methodology department of Statistics Netherlands. In the case of Lucia de B. he showed that the calculations that supposedly demonstrated that De B. had more deaths during her shifts were incorrect.

CBS abuses

There is a special reason that Gill is now sinking his teeth into the benefits affair – but more on that later. First, the CBS report. Gill states that Statistics Netherlands is not equipped for this type of research and points out that after two research methods were dropped, only one ‘not ideal, but only option’ remained. He also thinks, among other things, that the more severely affected victims in the benefits affair should be the focus of the investigation. He emphasizes that relatively mildly affected families most likely had to deal with much less drastic consequences. CBS itself also says that it would have liked to use information about the degree to which families were affected, but that none was available.

CBS also acknowledges some of the criticisms. “CBS itself mentioned a number of them as caveats in the report. On one point there seems to be a misunderstanding,” said a spokesperson, who also said that CBS still fully supports the conclusions. CBS will soon be discussing the methodology used with Gill, but in any case CBS sees itself as the right party to carry out the study. “CBS has the task of providing insight into social issues with reliable statistical information and data and has the necessary expertise and techniques. In this case there was a clear social need for statistical insight.”

Gill thinks otherwise and thinks it’s important to raise this. Because injustice keeps him awake at night. That was also a reason to offer his help when questions arose about the conviction of Lucia de B., who can simply be called Lucia de Berk again since her acquittal. In 2003 she was sentenced to life imprisonment.

Out-of-home placement

With the acquittal in 2010, Gill became not only a top statistician, but also a beacon of hope for people who experienced injustice. And José Booij, a mother of a child placed in care, contacted him many years ago.

Somewhere in Gill’s house in Apeldoorn there is still a box with papers from José. It contains diaries, newspaper clippings and diplomas of hers. She was a little different from other people. A doctor who fell for women, fled the Randstad and settled in Drenthe. There she became pregnant and had a baby. And she had a neighbour with whom she had a disagreement. “That neighbour had made all kinds of reports about José to the local police, said that something terrible would happen to the child.” After six weeks, José’s daughter was removed from home.

State kidnapping

“What happened to José at the time, I also call that a state kidnapping, just as the custodial placements among victims of the benefits affair are now called.” The woman continued to fight to get her child back. But gradually that fight drove her insane. She lost her job, she lost her home. She fled abroad. “Despite a court ruling that the child had to be returned to José, that did not happen. José eventually derailed. I now know that she has left information with more people in the Netherlands to ensure that it is available to her daughter when she is ready. But I can’t find José anymore. I heard she was seen in the south of the Netherlands after escaping from a psychiatric clinic in England.”

Demonstration by victims of the benefits affair. © ANP / ANP

And meanwhile he keeps that box. And Gill thinks of José, when he considers the investigation by the Central Bureau of Statistics into custodial placements of children of victims in the benefits affair. Gill makes mincemeat of it. “The only thing CBS can say is that the results suggest that the differences between the two groups that have been compared are quite small. There should be a lot more caution, and yet in the summary you see firm statements in bold type, such as: ‘Being duped does not increase the likelihood of child protection measures’. I suspect that CBS was put under pressure to conduct this study, or wanted to justify its existence. Perhaps there is an urge to be of service.”

Time for justice

Now is the time to put that right, Gill thinks. Research needs to be done to find out what’s really going on. “I had actually hoped that younger colleagues would have stood up by now, who would take up such matters.” But as long as that doesn’t happen, he’ll do it himself. Maybe it’s in his genes. It was Gill’s mother – he was born in England – who helped crack the Enigma code used by the Germans to communicate during World War II. Gill wasn’t surprised when he found out. He already suspected that his excellent mind was inherited not only from his father, but also from his mother.

Love

Yet in the end it was his wife – love for whom led him to settle in the Netherlands – who put him on this track. She pointed Gill to Lucia de Berk’s case and encouraged him to get to work. She may have regretted that. For example, when Gill threatened to burn his Dutch passport during a broadcast of The World Keeps on Turning Round (“De wereld draait door”) if the De Berk case was not reviewed. “She said, ‘You can’t say things like that!’”

In fact, he would like to enjoy his retirement with her now – he has been out of paid work for six years. Then he would spend his days in the woods looking for edible mushrooms. And spend a lot of time with his grandchildren. But now his calculations also help exonerate other nurses. Last year, Daniela Poggiali was released in Italy after Gill got involved in the case together with an Italian colleague. There are still cases waiting for him in England.

And here in the Netherlands there is the benefits affair, which, as far as Gill is concerned, needs more in-depth, thorough research to find out exactly what caused the custodial placements. “That is why I ended up with Pieter Omtzigt and Princess Laurentien, who are also involved in the benefits affair.” Among the people who express themselves diplomatically, he is happy to be the bad cop, the man who shakes things up, as he did when he threatened to set his passport on fire. But at the same time, he also hopes that a young statistician will emerge who is prepared to take up the torch.

CBS provided this site with an extensive explanation in response to Gill’s criticism. It recognizes the complexity of this type of research, but sees itself as the appropriate body to carry out that research. An appointment to speak with Gill has already been scheduled. “CBS always tries to explain as clearly and transparently as possible in its reports what has been investigated, how it was done and what the results are.”

Statistics Netherlands also points to nuances in the text of the report, for example in the passage that follows the heading ‘Being duped does not increase the chance of child protection measures’. “On an individual level, there may be a relationship between duping and youth protection, which is stated in several places in the report.” Even if ‘on average no evidence is found for a relationship between duping and youth protection’, as Statistics Netherlands notes.

Statistics Netherlands fully supports the research and the conclusions as stated in the report. It is pointed out, however, that there are still opportunities for follow-up research, as has also been indicated by Statistics Netherlands.

De boeman (Algemeen Dagblad, 26 januari)

Richard Gill. © Rob Voss

Hoogleraar Gill hielp bij vrijpleiten Lucia de B., en maakt nu gehakt van CBS-rapport toeslagenaffaire

Topstatisticus Richard Gill kraakt het onderzoek dat het Centraal Bureau voor de Statistiek (CBS) uitvoerde naar uithuisplaatsingen van kinderen van gedupeerden in de toeslagenaffaire. ‘De conclusie dat deze groep ouders niet harder is geraakt dan andere ouders, had het CBS nooit mogen trekken.’

Carla van der Wal 26-01-23, 06:00 Laatste update: 08:10

Het liefste zou emeritus hoogleraar Richard Gill eetbare paddenstoelen plukken in het bos, en tijd doorbrengen met zijn kleinkinderen. Toch bijt de topstatisticus van Nederland, die eerder hielp bij het vrijpleiten van de onterecht veroordeelde Lucia de B, zich nu vast in de toeslagenaffaire. 

Het CBS had nooit aan het onderzoek naar de uithuisplaatsing van kinderen van slachtoffers in de toeslagenaffaire moeten beginnen, zegt Gill. ,,En de conclusie dat deze groep ouders niet harder is geraakt dan andere ouders, had het CBS nooit mogen trekken. Die liet velen denken: alleen de belastingdienst heeft gefaald, maar er is gelukkig niets mis met jeugdzorg. Al die ophef over ‘staatsontvoeringen’ was dus onnodig.’’

Nadat het CBS becijferde hoeveel kinderen van toeslagenouders uit huis werden geplaatst (uiteindelijk bleken het er 2090), leek het of gedupeerden in de affaire vaker hun kinderen kwijtraakten dan soortgelijke ouders die geen slachtoffer waren. Op 1 november vorig jaar werden de resultaten gepresenteerd, die Gill nu hekelt.

Gill is emeritus hoogleraar mathematische statistiek aan de universiteit van Leiden en was in het verleden adviseur bij de afdeling methodologie van het CBS. In de zaak van Lucia de B. liet hij zien dat berekeningen die zouden aantonen dat De B. vaker sterfgevallen in haar diensten had, niet klopten.

Misstanden CBS 

Dat Gill zich nu vastbijt in de toeslagenaffaire heeft een bijzondere reden – maar daarover later meer. Eerst nog over het rapport van het CBS. Gill stelt dat het CBS niet is ingericht op dit type onderzoek en wijst erop dat nadat twee onderzoeksmethodes afvielen slechts één ‘niet ideale, maar enige optie’ overbleef. Ook vindt hij onder meer dat zwaarder getroffen gedupeerden in de toeslagenaffaire centraal zouden moeten staan bij het onderzoek. Hij benadrukt dat relatief licht geraakte gezinnen hoogstwaarschijnlijk met veel minder ingrijpende gevolgen te maken hebben gehad. Het CBS zegt overigens zelf ook dat het graag informatie over de mate van gedupeerdheid gebruikt, maar dat die er niet was.

Het CBS erkent ook sommige punten van kritiek. ,,Een aantal heeft het CBS zelf als kanttekening genoemd bij het rapport. Op een enkel punt lijkt sprake van een misverstand’’, aldus een woordvoerder, die ook zegt dat het CBS nog volledig achter de conclusies staat. Over de gebruikte methodologie gaat het CBS binnenkort met Gill in gesprek, maar het CBS ziet zich in elk geval wél als de juiste partij om het onderzoek uit te voeren. ,,Het CBS heeft als taak om met betrouwbare statistische informatie en data inzicht te geven in maatschappelijke vraagstukken en beschikt over de nodige expertise en technieken. In dit geval was een duidelijke maatschappelijke behoefte aan statistisch inzicht.’’

Gill denkt daar anders over en vindt het belangrijk dat aan te kaarten. Want hij ligt wakker van onrecht. Dat was ook reden om zijn hulp aan te bieden toen er vragen rezen over de veroordeling van Lucia de B., die sinds haar vrijspraak gewoon weer Lucia de Berk genoemd kan worden. In 2003 werd ze veroordeeld tot een levenslange gevangenisstraf.

Uithuisplaatsing

Door de vrijspraak in 2010 werd Gill naast een topstatisticus ook een baken van hoop voor mensen die onrecht ervaarden. En nam José Booij, een moeder van een uit huis geplaatst kind, vele jaren geleden contact met hem op.

Ergens in het huis van Gill in Apeldoorn staat nog een doos met papieren van José. Erin zitten dagboeken, krantenknipsels en diploma’s van haar. Ze was een beetje anders dan andere mensen. Een jurist die op vrouwen viel, de Randstad ontvluchtte en neerstreek in Drenthe. Daar werd ze zwanger, kreeg ze een kindje. En had ze een buurvrouw, met wie ze onenigheid had. ,,Die buurvrouw had allerlei meldingen over José gedaan bij de lokale politie, had gezegd dat met het kindje iets vreselijks zou gebeuren.” Na zes weken werd Josés dochtertje uit huis geplaatst.

Staatsontvoering 

,,Wat José indertijd is overkomen, dat noem ik ook een staatsontvoering, net zoals de uithuisplaatsingen onder slachtoffers van de toeslagenaffaire nu worden genoemd.” De vrouw bleef vechten om haar kind terug te krijgen. Maar gaandeweg dreef dat gevecht haar tot waanzin. Ze raakte haar werk kwijt, ze raakte haar huis kwijt. Ze vluchtte naar het buitenland. ,,Ondanks een oordeel van de rechter, dat het kind terug moest naar José, gebeurde dat niet. José is uiteindelijk ontspoord. Inmiddels weet ik dat ze bij meer mensen in Nederland informatie heeft achtergelaten, om te zorgen dat die beschikbaar is voor haar dochter, als die eraan toe is. Maar José heb ik niet meer kunnen vinden. Ik heb gehoord dat ze nog is gezien in het zuiden van Nederland, nadat ze was ontsnapt uit een psychiatrische kliniek in Engeland.”

Gedupeerden in de toeslagenaffaire demonstreren. © ANP / ANP

En ondertussen bewaart hij dus die doos. En denkt Gill aan José, als hij zich buigt over het onderzoek van het Centraal Bureau voor de Statistiek, naar uithuisplaatsingen van kinderen van slachtoffers in de toeslagenaffaire. Gill maakt er gehakt van. ,,Het enige wat het CBS kan zeggen, is dat de uitkomsten suggereren dat de verschillen tussen de twee groepen die zijn vergeleken vrij klein zijn. Er zou veel meer voorzichtigheid moeten zijn, en toch zie je in de samenvatting in vetgedrukte letters stellige samenvattingen, zoals: ‘Gedupeerdheid verhoogt de kans op kinderbeschermingsmaatregelen niet’. Ik vermoed dat het CBS onder druk is gezet om dit onderzoek te doen, of zijn bestaansrecht heeft willen verantwoorden. Wellicht is er sprake van een drang om dienstbaar te zijn.”

Tijd voor rechtvaardigheid

Nu is het tijd om dat recht te zetten, vindt Gill. Er moet onderzoek worden gedaan, om te kijken hoe het echt zit. ,,Ik had eigenlijk gehoopt dat er inmiddels jongere collega’s zouden zijn opgestaan, die dit soort zaken op zouden pakken.” Maar zolang dat niet gebeurt, doet hij het zelf wel. Misschien zit het wel in zijn genen. Het was Gills moeder – hij werd geboren in Engeland – die tijdens de Tweede Wereldoorlog meewerkte aan het kraken van de enigmacode, die door de Duitsers werd gebruikt om te communiceren. Gill verraste het niet, toen hij erachter kwam. Hij had al zo’n vermoeden dat zijn excellente verstand niet alleen een erfenis van zijn vader, maar ook zijn moeder was.

De liefde

Toch was het uiteindelijk zijn vrouw – de liefde zorgde dat hij in Nederland neerstreek – die hem op dit spoor heeft gezet. Zij wees Gill op de zaak van Lucia de Berk en stimuleerde hem ermee aan de slag te gaan. Misschien heeft ze dat wel eens betreurd. Bijvoorbeeld toen Gill tijdens opnames van De wereld draait door dreigde zijn Nederlandse paspoort te verbranden, als de zaak De Berk niet werd herzien. ,,Ze zei: dat kan je toch niet doen?”

Eigenlijk zou hij nu met haar van zijn pensioen willen genieten – hij is inmiddels zes jaar gestopt met zijn betaalde werk. Dan zou hij zijn dagen vullen in het bos, zoekend naar eetbare paddenstoelen. En veel tijd doorbrengen met zijn kleinkinderen. Maar nu helpen zijn berekeningen ook bij het vrijpleiten van andere verpleegkundigen. Vorig jaar werd Daniela Poggiali nog vrijgelaten in Italië, nadat Gill zich samen met een Italiaanse collega met de zaak bemoeide. In Engeland zijn nog zaken die op hem wachten.

En de toeslagenaffaire is er hier in Nederland dus, waar wat Gill betreft diepgravender, gedegen onderzoek naar moet komen, om uit te zoeken wat nu precies de uithuisplaatsingen veroorzaakte. ,,Ik ben daarom terechtgekomen bij Pieter Omtzigt en prinses Laurentien, die zich ook met de toeslagenaffaire bezighouden.” Tussen de mensen die zich diplomatiek uiten, wil hij best de bad cop zijn, de man die de boel opschudt, zoals hij deed toen hij dreigde zijn paspoort in de fik te steken. Maar tegelijkertijd hoopt hij toch vooral ook dat er een jonge statisticus opstaat, die bereid is de fakkel over te nemen.

Het CBS gaf deze site een uitgebreide toelichting, naar aanleiding van de kritiek van Gill. Het erkent de complexiteit van dit soort onderzoek, maar ziet zichzelf wél als aangewezen instantie om dat onderzoek uit te voeren. De afspraak om met Gill te spreken is al ingepland. ,,Het CBS tracht in de rapporten altijd zo duidelijk en transparant mogelijk uit te leggen wat onderzocht is, hoe dat is gedaan en wat de uitkomsten zijn.”

Ook wijst het CBS op nuanceringen in de tekst van het rapport, bijvoorbeeld na de zin boven een stuk tekst: ‘Gedupeerdheid verhoogt de kans op kinderbeschermingsmaatregelingen niet’. ,,Er kan op individueel niveau wél een relatie tussen dupering en jeugdbescherming zijn, dat staat op meerdere plekken in het rapport vermeld.” Ook als er ‘gemiddeld genomen geen bewijs gevonden wordt voor een relatie tussen dupering en jeugdbescherming’, zoals het CBS constateert.

Het CBS staat volledig achter het onderzoek en de conclusies zoals die in het rapport vermeld staan. Wel wordt erop gewezen dat er nog mogelijkheden zijn voor vervolgonderzoek, dat heeft het CBS ook aangegeven.

Nog meer over uitkeringsschandaal en uithuisplaatsingen

Hieronder volgt een poging (20 januari 2023, ‘s ochtends) om het kern van het verhaal op te schrijven in 500 woorden en Jip en Janneke taal. Het lukte niet.

Heeft het CBS de waarheid in pacht?

Velen werden wakker geschud door carabetier Peter Pannekoek’s woorden “1115 staatsontvoeringen”. Maar ze kunnen weer in slaap gesust zijn door het CBS rapport “Jeugdbescherming en de toeslagenaffaire – Kwantitatief onderzoek naar kinderbeschermingsmaatregelen bij kinderen van gedupeerden van de toeslagenaffaire”. Een van de belangrijkste conclusies (samenvatting, eerste bladzijde) luidt

Gedupeerdheid verhoogt de kans op kinderbeschermingsmaatregelen niet“.

Dat is een krachtige uitspraak. Geen enkel relativering, geen “kleine letters”. Geen melding dat het een uitspraak is die alleen gemaakt kan worden onder een hele reeks veronderstellingen. Helaas, een hele reeks veronderstellingen waarvan velen pertinent onwaar zijn.

Mijn antwoord: misschien geen 1115, maar misschien wel: 115

Nu munt het CBS uit in het doen van beschrijvend statistiek, wat ook hun wettelijke opdracht is. Ze dienen neutraal de feiten te ontsluiten en weer te geven die politiek en bestuur en burgers nodig hebben. Waar het CBS minder expertise in huis heeft, omdat het ook beslist niet tot hun taak behoort, is in het ontwarren van oorzaak en gevolg. Dat noemen we tegenwoordig “Causaliteit” en het is een uiterst actueel, belangrijk, subtiel, en complex onderwerp binnen het wetenschappelijk onderzoek; explosief gegroeid sinds Judea Pearls boek “Causality” uit 2000. Kan je causaliteit concluderen door het waarnemen van correlatie of associatie?

Voorbeeld. Lucia de B maakte vreselijk veel incidenten mee in haar diensten. Veel meer dan men zou hebben verwacht en dat leidde ook tot levenslange gevangenisstraf voor seriële moord. Pas later werd duidelijk dat haar aanwezigheid juist de reden was dat medisch onderzoekers bepaalde gebeurtenissen als incidenten karakteriseerden!

Maar kan geen associatie ook op causaliteit duiden? Jawel! Statistieken kunnen misleiden. Een aansprekend visuele representatie van statistieken des te meer. Mijn oog werd getrokken door Figuur 6.1.2 in het CBS rapport waarin we drie vrolijk gekleurde balkjes zijn, die de percentages 1%, 4% en 4% dienen te representeren. Zie je wel! De percentage uithuisplaatsingen bij de gedupeerden is exact wat je zou hebben verwacht, als al die gezinnen helemaal niet gedupeerd waren geweest!

Ik zou zeggen, dat kan geen toeval zijn. Na studie van het onderzoeksprotocol inclusief de vele door de team hanteerde algoritmes, wordt ook duidelijk dat het geen toeval is. Door de onderzoekskeuzes die het onderzoeksteam zich gedwongen voelde te maken is het verschil in uithuisplaatsingskans tussen “vergelijkbare” wel en niet gedupeerden systematisch verkleind. Het verschil is dus groter dan het lijkt (het lijkt nul te zijn, maar dat is het beslist niet). De juiste conclusie van het onderzoek had moeten zijn, ten eerste, dat er zeker tientallen uithuisplaatsingen “extra” plaatsvonden vanwege de affaire en mogelijk honderd (of zelfs een paar honderd). Een tweede conclusie had moeten zijn dat deze gedurfde pilot studie bewezen heeft dat een totaal ander onderzoeksopzet nodig is oude gestelde vraag te beantwoorden. Mogelijk, iets in de trant van het eerder verworpen onderzoeksvoorstel van Prof. Bart Tromp van de Universiteit Groningen. Overigens, is het nooit nodig om alle dossiers van de hele geschiedenis van alle gedupeerden door te pluizen. Door slim een aselecte steekproef in een verstandig gekozen deelpopulatie te nemen, kan men zich beperken tot het goed uitzoeken van relatief weinig gevallen.

Goede “Data Science” is onmogelijk zonder grote expertise te combineren uit drie gebieden tegelijkertijd: 1) algoritmes en computer mogelijkheden; 2) kansrekening en inferientiele statistiek (dwz het kwantificeren van de onzekerheid in de gevonden resultaten); 3) (last but not least!) vakspecifieke kennis van het beoogde toepassingsgebied; in dit geval psychologie, recht, bestuur.

De verantwoording van mijn claims ben ik momenteel aan het uitschrijven in mijn blog, https://gill1109.com/2023/01/18/de-statistiek-van-slachtoffers-van-uitkeringsschandaal/; het moet nog veel worden uitgebreid met nadere onderbouwing, verwijzingen, enzovoorts.

Ik denk aan een statistische simulatie om mijn punt te illustreren. Die twee getallen “4%” hebben foutbalken nodig van ongeveer +/- 1%. Lastig omdat ik rekening moet houden met de correlatie binnen de paren. We kunnen alleen maar raden hoe groot het is. Dus: meerdere simulaties met verschillende gissingen.

Yet more on the Dutch benefits scandal and child removals

This is a first attempt to summarise my claims in 500 words and simple language. It didn’t succeed.

Does CBS have direct access to the truth?

Many were shaken awake by comedian Peter Pannekoek’s words “1115 state kidnappings”. But they may have been lulled back to sleep by the CBS report “Youth protection and the benefits affair – Quantitative research into child protection measures in children of victims of the benefits affair”. One of the main conclusions (summary, first page) reads

“Being a victim of the benefits scandal does not increase the likelihood of child protection measures”.

That’s a powerful statement. No relativization whatsoever, no “small print”. No mention of it being a statement that can only be made under a slew of assumptions. Alas, a slew of assumptions many of which are patently untrue.

My answer: Maybe not 1115, but could well have been 115.

Now CBS excels at descriptive statistics, which is also its statutory task. It should neutrally disclose and present the facts that politicians, administrators and citizens need. Where CBS has less in-house expertise, because it is certainly not part of its task, is in disentangling cause and effect. This is what we call “Causality” today, and it is an extremely topical, important, subtle, and complex subject of scientific inquiry, one that has grown explosively since Judea Pearl’s 2000 book “Causality”. Can you infer causality by observing correlation or association?

Example. Lucia de B. experienced an awful lot of incidents during her shifts. Many more than one would have expected, and that led to life imprisonment for serial murder. Only later did it become clear that her presence was precisely the reason why medical investigators characterized certain events as incidents!

But can *no* association also indicate causality? Yes! Statistics can be misleading. An appealing visual representation of statistics all the more so. My eye was drawn to Figure 6.1.2 in the CBS report, in which we see three brightly colored bars, which are meant to represent the percentages 1%, 4% and 4%. See! The percentage of custodial placements among the victims is exactly what you would have expected if all those families had not been victimized at all!

I’d say that can’t be a coincidence. After studying the research protocol, including the many algorithms used by the team, it also becomes clear that this is no coincidence. Due to the research choices that the research team felt compelled to make, the difference in the probability of out-of-home placement between “comparable” victims and non-victims has been systematically reduced. So the difference is greater than it appears (it appears to be zero, but it is definitely not). The correct conclusion of the investigation should have been, first, that there were certainly dozens of “extra” custodial placements because of the affair and possibly a hundred (or even a few hundred). A second conclusion should have been that this bold pilot study has proven that a completely different research design is needed to answer the question that was posed. Possibly, something along the lines of the previously rejected research proposal of Prof. dr. Bart Tromp of the University of Groningen. Incidentally, it is never necessary to go through *all* files of the entire history of all victims. By smartly taking a random sample in a sensibly chosen sub-population, one can limit oneself to properly sorting out relatively few cases.

Good “Data Science” is impossible without combining great expertise from three areas at the same time: 1) algorithms and computing capabilities; 2) probability theory and inferential statistics (i.e. quantifying the uncertainty in the results found); 3) (last but not least!) subject-specific knowledge of the intended application area; in this case psychology, law, administration.

I am currently writing out the justification for my claims in my blog, https://gill1109.com/2023/01/18/de-statistiek-van-slachtoffers-van-toeslagsschandaal/; it still needs to be expanded a lot with further substantiation, references, and so on.

I’m thinking of a statistical simulation to illustrate my point. Those two numbers “4%” need error bars of about +/- 1%. Tricky because I must take account of the correlation within the pairs. We can only guess how big it is. So: several simulations with different guesses.
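A minimal sketch of such a simulation, in Python with only the standard library, might look as follows. Everything here is an assumption made for illustration: 4000 matched pairs, a 4% placement rate in both groups, a few guessed values `rho` for the unknown within-pair correlation, and the function names themselves. It is not the analysis that CBS performed.

```python
import math
import random

def simulate_sd_of_difference(n_pairs=4000, p=0.04, rho=0.0, n_reps=200, seed=1):
    """Monte Carlo estimate of the standard deviation of
    (number of placements among victims) - (number among matched controls)."""
    rng = random.Random(seed)
    # Joint cell probabilities for a pair of correlated 0/1 outcomes,
    # both with marginal probability p and correlation rho (0 <= rho <= 1)
    p11 = p * p + rho * p * (1 - p)   # placement in both families
    p10 = p - p11                     # placement in one family only
    diffs = []
    for _ in range(n_reps):
        diff = 0
        for _ in range(n_pairs):
            u = rng.random()
            if p11 <= u < p11 + p10:
                diff += 1             # victim family only
            elif p11 + p10 <= u < p11 + 2 * p10:
                diff -= 1             # control family only
        diffs.append(diff)
    mean = sum(diffs) / n_reps
    return math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n_reps - 1))

def theoretical_sd(n_pairs=4000, p=0.04, rho=0.0):
    """Exact value for comparison: Var(x - y) = 2 p (1 - p) (1 - rho) per pair."""
    return math.sqrt(n_pairs * 2 * p * (1 - p) * (1 - rho))

if __name__ == "__main__":
    for rho in (0.0, 0.25, 0.5):      # several guesses for the unknown correlation
        sim = simulate_sd_of_difference(rho=rho)
        print(f"rho={rho:.2f}  simulated SD={sim:.1f}  theoretical SD={theoretical_sd(rho=rho):.1f}")
```

The stronger the correlation within pairs, the smaller the chance variation in the difference between the two groups, which is why the guess for `rho` matters for the width of the error bars.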

Statistics of victims of the Dutch child-benefits scandal

Commentary on the CBS report

Author: prof.dr. (em.) Richard D. Gill

Mathematical Institute, Leiden University

Monday January 16, 2023

Richard Gill is emeritus professor of mathematical statistics at Leiden University. He is a member of the KNAW and former chairman of the Netherlands Statistical Society (VVS-OR)

=========================================

Mr. Pieter Omtzigt has asked me to give my expert opinion on the CBS report that examines whether the number of out-of-home placements of children by Dutch child protection authorities increased because their families had fallen victim to the child benefit scandal in the Netherlands.

The current note is preliminary and I intend to refine it further. My purpose is to stimulate discussion among relevant professionals of the methodology used by the CBS in this particular case. Feedback, please!

The report gives a clear (and short) account of creative statistical analysis of much complexity. The sophisticated nature of the analysis techniques, the urgency of the question, and the need to communicate the results to a general audience probably led to important “fine print” about the reliability of the results being omitted. The authors seem to me to be too confident in their findings.

Numerous choices had to be made by the CBS team to answer the research questions. Many preferable options are excluded due to data availability and confidentiality. Changing one of the many steps in the analysis through changes in criteria or methodology could lead to wildly different answers. The actual finding of two nearly equal percentages (both close to 4%) in the two groups of families is, in my opinion, “too good to be true”. It’s a fluke. Its striking character may have encouraged the authors to formulate their conclusions much more strongly than they are entitled to.

In this regard, I found it significant that the authors note that the datasets are so large that statistical uncertainty is unimportant. But this is simply not true. After constructing an artificial control group, they have two groups of size (in round numbers) 4000, and 4% of cases in each group, i.e. about 160. According to a rule of thumb calculation (Poisson variation), the statistical variation in each of those two numbers has a standard deviation of about the square root of 160, so about 12.6. That means that one of those numbers (160) could easily happen to be off by twice the standard deviation, which is about 25. The conclusion that the benefits scandal did not lead to more children being removed from home than would have been the case without it certainly cannot be drawn. Taking into account the statistical sampling error, it is quite possible that the number in the control group (those not afflicted by the benefits scandal) would have been 50 less. In that case, the study group experienced 50 more than they would have done, had they not been victims of the benefits scandal.

To make the numbers easier still, suppose there was an error of 40 cases too few in the light blue bar standing for 4%. 40 out of 4000 is 1 out of 100, 1%. Change the light blue bar from height 4% to height 3% and they don’t look the same at all!

And this is before taking into account possible systematic errors. The statistical techniques used are advanced and model-based. This means that they depend on the validity of many particular assumptions about the form and nature of the relationships between the variables included in the analysis (using “logistic regression”). The methodology uses these assumptions for its convenience and power (more assumptions mean stronger conclusions, but threaten “garbage in, garbage out”). Logistic regression is such a popular tool in so many applied fields because the model is so simple: the results are easy to interpret, and the calculation can often be left to the computer without user intervention. But there’s no reason why the model should be exactly true; one can only hope that it is a useful approximation. Whether it is useful depends on the task for which it is used. The current analysis uses logistic regression for purposes for which it was not designed.

The assumptions of the standard model of logistic regression are certainly not exactly met. It is not clear whether the researchers tested for failure of the assumptions (for example, by looking for interaction effects – violation of additivity). The danger is that the failure of the assumptions can lead to systematic bias in the results, bias that affects the synthetic (“matched”) control group. The central assumption in logistic regression is the additivity of effects of various factors on the log-odds scale (“odds” means probability divided by complementary probability; log means logarithm). This could be true to a first rough approximation, but it is certainly not exactly true. “All models are wrong, but some are useful”.
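To make the additivity assumption concrete, here is a small Python sketch. All the numbers (a 4% baseline, the two effect sizes and the interaction term) are hypothetical and chosen only for illustration; they are not estimates from the CBS report.

```python
import math

def log_odds(p):
    """Log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))

def inv_log_odds(z):
    """Convert a log-odds value back into a probability."""
    return 1 / (1 + math.exp(-z))

# Hypothetical values, for illustration only.
baseline = 0.04      # probability when neither factor is present
effect_a = 0.5       # shift in log-odds due to factor A
effect_b = 0.8       # shift in log-odds due to factor B
interaction = 0.6    # extra shift when A and B occur together

# Standard logistic regression: the effects simply add on the log-odds scale.
p_additive = inv_log_odds(log_odds(baseline) + effect_a + effect_b)

# If the two factors interact, the additive model misstates the probability.
p_with_interaction = inv_log_odds(
    log_odds(baseline) + effect_a + effect_b + interaction
)

print(f"additive model:   {p_additive:.3f}")
print(f"with interaction: {p_with_interaction:.3f}")
```

If the real world contains an interaction that the model leaves out, the model's predicted probabilities are off for exactly those families in which both factors are present, and that error is systematic rather than random.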

A good practice is to build models by analyzing a first data set and then evaluating the final chosen model on an independently collected second data set. In this study, not one but numerous models were tested. The researchers seem to have chosen from countless possibilities through subjective assessment of plausibility and effectiveness. This is fine in an exploratory analysis. But the findings of such an exploration must be tested against new data (and there is no new data).

The end result was a procedure to choose “nearest neighbour matches” with respect to a number of observed characteristics of the cases examined. Errors in the logistic regression used to choose matched controls can systematically bias the control group.
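The general idea of nearest neighbour matching can be sketched as follows. I assume, purely for illustration, that each family is summarised by a single score (such as a predicted probability from a logistic regression) and that matching is greedy, one-to-one and without replacement; the actual CBS procedure may well differ, and the scores below are invented.

```python
def nearest_neighbour_match(case_scores, control_scores):
    """Greedy 1:1 matching without replacement on a one-dimensional score.

    Returns a dict mapping each case index to the index of its matched control.
    Assumes there are at least as many controls as cases.
    """
    available = dict(enumerate(control_scores))
    matches = {}
    for i, score in enumerate(case_scores):
        # Pick the still-unused control whose score is closest to this case.
        j = min(available, key=lambda k: abs(available[k] - score))
        matches[i] = j
        del available[j]
    return matches

# Invented scores, for illustration only.
case_scores = [0.10, 0.35, 0.60]
control_scores = [0.05, 0.12, 0.30, 0.58, 0.90]

matches = nearest_neighbour_match(case_scores, control_scores)
print(matches)
```

The matched control group is entirely determined by the scores. If the model producing the scores is wrong in some systematic way, the wrong controls are chosen in an equally systematic way.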

Further big questions concern the actual selection of cases and controls at the beginning of the analysis. Not all families affected by the benefits scandal had to pay back a huge amount of subsidy. Mixing the hard-hit and the lightly hit dilutes the effect of the scandal, both in magnitude and accuracy, the latter because smaller samples lead to relatively less accurate determination of effect size.

Another problem is that the pre-selection control population (families in general from which a child was removed) also contains victims of the benefits scandal (the study population). That brings the two groups closer together, even more so after the family-by-family one-on-one matching process, which of course selectively finds matches among the subpopulation most likely to be affected by the benefits scandal.