Relationship between incidence of breathing obstruction and degree of muzzle shortness in pedigree dogs
This blog post is the result of rapid conversion from a preprint, typeset with LaTeX, posted on arXiv.org as https://arxiv.org/abs/2209.08934, and submitted to the journal PLoS ONE. I used pandoc to convert LaTeX to Word, then simply copy-pasted the content of the Word document into WordPress. After that, a few mathematical symbols and the numerical contents of the tables needed to be fixed by hand.
There has been much concern about health issues associated with the breeding of short-muzzled pedigree dogs. The Dutch government commissioned a scientific report, Fokken met Kortsnuitige Honden (Breeding of short-muzzled dogs), van Hagen (2019), and used it as the basis for rather stringent legislation, restricting breeding primarily on the basis of a single simple measurement of brachycephaly, the CFR: cranial-facial ratio. Van Hagen’s work is a literature study and it draws heavily on statistical results obtained in three publications: Njikam et al. (2009), Packer et al. (2015), and Liu et al. (2017). In this paper, I discuss some serious shortcomings of those three studies and, in particular, show that Packer et al. have drawn unwarranted conclusions from their study. In fact, new analyses using their data lead to an entirely different conclusion.
The present work was commissioned by “Stichting Ras en Recht” (SRR; Foundation Justice for Pedigree dogs) and focuses on the statistical research results of earlier papers summarized in the literature study Fokken met Kortsnuitige Honden (Breeding of short-muzzled – brachycephalic – dogs) by dr M. van Hagen (2019). That report is the final outcome of a study commissioned by the Netherlands Ministry of Agriculture, Nature, and Food Quality. It was used by the ministry to justify legislation restricting breeding of animals with extreme brachycephaly as measured by a low CFR, cranial-facial ratio.
An important part of van Hagen’s report is based on statistical analyses in three key papers: Njikam et al. (2009), Packer et al. (2015), and Liu et al. (2017). Notice: the paper Packer et al. (2015) reports results from two separate studies, called by the authors Study 1 and Study 2. The data analysed in Packer et al. (2015) study 1 was previously collected and analysed for other purposes in an earlier paper Packer et al. (2013) which does not need to be discussed here.
In this paper, I will focus on these statistical issues. My conclusion is that the cited papers have many serious statistical shortcomings, which were not recognised by van Hagen (2019). In fact, a reanalysis of the Study 2 data investigated in Packer et al. (2015) leads to conclusions completely opposite to those drawn by Packer et al., and completely opposite to the conclusions drawn by van Hagen. I conclude that Study 2 of Packer et al. badly needs to be followed up by a much larger replication study.
A very important question is just how generalisable are the results of those papers. There is no word on that issue in van Hagen (2019). I will start by discussing the paper which is most relevant to our question: Packer et al. (2015).
An important preparatory remark should be made concerning the term “BOAS”, brachycephalic obstructive airway syndrome. It is a syndrome, which means: a name for some associated characteristics. “Obstructed airways” means: difficulty in breathing. “Brachycephalic” means: having a (relatively) short muzzle. Having difficulty in breathing is a symptom sometimes caused by having obstructed airways; it is certainly the case that the medical condition is often associated with having a short muzzle. That does not mean that having a short muzzle causes the medical condition. In the past, dog breeders have selected dogs with a view to accentuating certain features, such as a short muzzle; unfortunately, in doing so they have sometimes also selected dogs with other, less favourable characteristics. The two features of dogs’ anatomies are associated, but one is not the cause of the other. “BOAS” really means: having obstructed airways and a short muzzle.
Packer et al. (2015) reports findings from two studies. The sample for the first study, “Study 1”, 700 animals, consisted of almost all dogs referred to the Royal Veterinary College Small Animal Referral Hospital (RVC-SAH) in a certain period in 2012. Exclusions were based on a small list of sensible criteria such as the dog being too sick to be moved or too aggressive to be handled. However, this is not the end of the story. In the next stage, those dogs who actually were diagnosed to have BOAS (brachycephalic obstructive airway syndrome) were singled out, together with all dogs whose owners reported respiratory difficulties, except when such difficulties could be explained by respiratory or cardiac disorders. This resulted in a small group of only 70 dogs considered by the researchers to have BOAS, and it involved dogs of 12 breeds only. Finally, all the other dogs of those breeds were added to the 70, ending up with 152 dogs of 13 (!) breeds. (The paper contains many other instances of carelessness).
To continue with the Packer et al. (2015) Study 1 reduced sample of 152 dogs, this sample is a sample of dogs with health problems so serious that they are referred to a specialist veterinary hospital. One might find a relation between BOAS and CFR (craniofacial ratio) in that special population which is not the same as the relation in general. Moreover, the overall risk of BOAS in this special population is by its construction higher than in general. Breeders of pedigree dogs generally exclude already sick dogs from their breeding programmes.
That first study was justly characterised by the authors as exploratory. They had originally used the big sample of 700 dogs for a quite different investigation, Packer et al. (2013). It is exploratory in the sense that they investigated a number of possible risk factors for BOAS besides CFR, and actually used the study to choose CFR as appearing to be the most influential risk factor, when each is taken on its own, according to a certain statistical analysis method, in which already a large number of prior assumptions had been built in. As I will repeat a few more times, the sample is too small to check those assumptions. I do not know if they also tried various simple transformations of the risk factors. Who knows, maybe the logarithm of a different variable would have done better than CFR.
In the second study (“Study 2”), they sampled anew, this time recruiting animals mainly directly from breeders but also from general practice. A critical selection criterion was a CFR smaller than 0.5, that number being the biggest CFR of a dog with BOAS from Study 1. They particularly targeted breeders of breeds with low CFR, especially those which had been poorly represented in the first study. Apparently, the Affenpinscher and Griffon Bruxellois are not often so sick that they get referred to the RVC-SAH; of the 700 dogs entering Study 1, there was, for instance, just 1 Affenpinscher and only 2 Griffon Bruxellois. Of course, these are also relatively rare breeds. Anyway, in Study 2, those numbers became 31 and 20. So: the second study population is not so badly biased towards sick animals as the first. Unfortunately, the sample is much, much smaller, and per breed, very small indeed, despite the augmentation of rarer breeds.
Now it is important to turn to technical comments concerning what perhaps seems to speak most clearly to the non-statistically schooled reader, namely, Figure 2 of Packer et al., which I reproduce here, together with the figure’s original caption.
In the abstract of their paper, they write “we show […] that BOAS risk increases sharply in a non-linear manner”. They do no such thing! They assume that the log odds of BOAS risk, that is, log(p/(1 - p)), depends exactly linearly on CFR and, moreover, with the same slope for all breeds. The small size of these studies forced them to make such an assumption. It is a conventional “convenience” assumption. Indeed, this is an exploratory analysis; moreover, the authors’ declared aim was to come up with a single risk factor for BOAS. They were forced to extrapolate from breeds which are represented in larger numbers to breeds of which they had seen far fewer animals. They use the whole sample to estimate just one number, namely the slope of log(p/(1 - p)) as an assumed linear function of CFR. Each small group of animals of each breed then moves that linear function up or down, which corresponds to moving the curves to the right or to the left. Those are not findings of the paper. They are conventional model assumptions imposed by the authors from the start for statistical convenience and statistical necessity and completely in tune with their motivations.
One indeed sees in the graphs that all those beautiful curves are essentially segments of the same curve, shifted horizontally. This has not been shown in the paper to be true. It was assumed by the authors of the paper to be true. Apparently, that assumption worked better for CFR than for the other possible criteria which they considered: that was demonstrated by the exploratory (the authors’ own characterisation!) Study 1. When one goes from Study 1 to Study 2, the curves shift a bit: it is definitely a different population now.
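Why a common slope on the log-odds scale forces every breed's curve to be a horizontal translate of one and the same curve can be checked numerically. The following is a minimal sketch: the slope and the two intercepts are invented for illustration and are not estimates from any of the papers discussed.

```python
import math

def logistic(z):
    """Inverse of the log-odds transform: p = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def risk(cfr, intercept, slope):
    """Model-predicted risk when log(p / (1 - p)) = intercept + slope * cfr."""
    return logistic(intercept + slope * cfr)

# Hypothetical values, chosen only for illustration (not estimates from any paper).
slope = -8.0                           # common slope shared by all breeds
intercept_a, intercept_b = 1.0, 2.5    # breed-specific intercepts

# Changing the intercept by d is the same as shifting CFR by d / slope.
shift = (intercept_b - intercept_a) / slope

cfr_grid = [i / 100 for i in range(0, 51, 5)]
max_gap = max(
    abs(risk(x, intercept_b, slope) - risk(x + shift, intercept_a, slope))
    for x in cfr_grid
)
print(f"horizontal shift = {shift:.4f}, largest discrepancy = {max_gap:.2e}")
```

The curve for breed B evaluated at a given CFR coincides with the curve for breed A evaluated at the shifted CFR, whatever values are chosen. The shape of the curves is therefore built into the model rather than learned from the data.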
There are strange features in the colour codes. Breeds which should be there are missing, and breeds which shouldn’t be there are. The authors have exchanged graphs (a) and (b)! This can be seen by comparing the minimum and maximum predicted risks from their Table 2.
Notice that these curves represent predictions for neutered dogs with breed mean neck girth, breed ideal body condition score (breed ideal body weight). I don’t know whose definition of ideal is being used here. The graphs are not graphs of probabilities for dog breeds, but model predictions for particular classes of dogs of various breeds. They depend strongly on whether or not the model assumptions are correct. The authors did not (and could not) check the model assumptions: the sample sizes are much too small.
By the way, breeders’ dogs are generally not neutered. Still, one-third of the dogs in the sample were neutered, so the “baseline” does represent a lot of animals. Notice that there is no indication whatsoever of statistical uncertainty in those graphics. The authors apparently did not find it necessary to add error bars or confidence bands to their plots. Had they done so, the pictures would have given a very, very different impression.
In their discussion, the authors write “Our results confirm that brachycephaly is a risk factor for BOAS and for the first time quantitatively demonstrate that more extreme brachycephalic conformations are at higher risk of BOAS than more moderate morphologies; BOAS risk increases sharply in a non-linear manner as relative muzzle length shortens”. I disagree strongly with their appraisal. The vaunted non-linearity is merely the consequence of a conventional, convenient (and untested) assumption of linearity in the much more sensible log-odds scale. They did not test this assumption and, most importantly, they did not test whether it held for each breed considered separately. They could not do that, because both of their studies were much, much too small. Notice that they themselves write, “we found some exceptional individuals that were unaffected by BOAS despite extreme brachycephaly” and it is clear that these exceptions were found in specific breeds. But they do not tell us which.
They also tell us that other predictors are important next to CFR. Once CFR and breed have been taken into account (in the way that they take it into account!), neck girth (NG) becomes very important.
They also write, “if society wanted to eliminate BOAS from the domestic dog population entirely then based on these data a quantitative limit of CFR no less than 0.5 would need to be imposed”. They point out that it is unlikely that society would accept this, and moreover, it would destroy many breeds which do not have problems with BOAS at all! They mention, “several approaches could be used towards breeding towards more moderate, lower-risk morphologies, each of which may have strengths and weaknesses and may be differentially supported by stakeholders involved in this issue”.
This paper definitely does not support imposing a single simple criterion for all dog breeds, much as its authors might have initially hoped that CFR could supply such a criterion.
In a separate section, I will test their model assumptions, and investigate the statistical reliability of their findings.
Now I turn to the other key paper, Liu et al. (2017). In this 8-author paper, the last and senior author, Jane Ladlow, is a very well-known authority in the field. This paper is based on a study involving 604 dogs of only three breeds, and those are the three breeds which are already known to be most severely affected by BOAS: bulldogs, French bulldogs, and pugs. They use a similar statistical methodology to Packer et al., but now they allow each breed to have a different shaped dependence on CFR. Interestingly, the effects of CFR on BOAS risk for pugs, bulldogs and French bulldogs are not statistically significant. Whether or not they are the same across those three breeds becomes, from the statistical point of view, an academic question.
The statistical competence and sophistication of this group of authors can be seen at a glance to be immeasurably higher than that of the group of authors of Packer et al. They do include indications of statistical uncertainty in their graphical illustrations. They state, “in our study with large numbers of dogs of the three breeds, we obtained supportive data on NGR (neck girth ratio: neck girth/chest girth), but only a weak association of BOAS status with CFR in a single breed.” Of course, part of that could be due to the fact that, in their study, CFR did not vary much within each of those three breeds, as they themselves point out. I did not yet re-analyse their data to check this. CFR was certainly highly variable in these three breeds in both of Packer et al.’s studies, see the figures above, and again in Liu et al. as is apparent from my Figure 2 below. But Liu et al. also point out that anyway, “anatomically, the CFR measurement cannot determine the main internal BOAS lesions along the upper airway”.
Another of their concluding remarks is the rather interesting “overall, the conformational and external factors as measured here contribute less than 50% of the variance that is seen in BOAS”. In other words, BOAS is not very well predicted by these shape factors. They conclude, “breeding toward [my emphasis] extreme brachycephalic features should be strictly avoided”. I should hope that nowadays, no recognised breeders deliberately try to make known risk features even more pronounced.
Liu et al. studied only bulldogs, French bulldogs and pugs. The CFRs of these breeds do show within-breed statistical variation. The study showed that a different anatomical measure was an excellent predictor of BOAS. Liu et al. moreover explain anatomically and medically why one should not expect CFR to be relevant for the health problems of those breeds of dogs.
It is absolutely not true that almost all of the animals in that study have BOAS. The study does not investigate BOS. The study was set up in order to investigate the exploratory findings and hypotheses of Packer et al. and it rejects them, as far as the three breeds they considered are concerned. Packer et al. hoped to find a simple relationship between CFR and BOAS for all brachycephalic dogs but their two studies are both much too small to verify their assumptions. Liu et al. show that for the three breeds studied, the relationship between measurements of body structure and the ill health associated with them varies between breeds.
In contradiction to the opinion of van Hagen (2019), there are no “contradictions” between the studies of Packer et al. and Liu et al. The first comes up with some guesses, based on tiny samples from each breed. The second investigates those guesses but discovers that they are wrong for the three breeds most afflicted with BOAS. Study 1 of Packer et al. is a study of sick animals, but Study 2 is a study of animals from the general population. Liu et al. is a study of animals from the general population. (To complicate matters, Njikam et al., Packer et al. and Liu et al. all use slightly different definitions or categorisations of BOAS.)
Njikam et al. (2009), like the later researchers in the field, fit logistic regression models. They exhibit various associations between illness and risk factors per breed. They do not quantify brachycephaly by CFR but by a similar measure, BRA, the ratio of width to length of the skull. CFR and BRA are approximately non-linear one-to-one functions of one another (this would be exact if skull length equalled skull width plus muzzle length, i.e., assuming a spherical cranium), so a threshold criterion in terms of one can be roughly translated into a threshold criterion in terms of the other. Their samples are again, unfortunately, very small (the title of their paper is very misleading).
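The parenthetical remark about the idealised skull can be made explicit. As a sketch, with symbols introduced here purely for illustration, assume that CFR is muzzle length M divided by cranial length, that BRA is skull width W divided by skull length L, that the cranium is spherical so that its length equals W, and that L = W + M. Then:

```latex
\[
\mathrm{CFR} = \frac{M}{W}, \qquad
\mathrm{BRA} = \frac{W}{L} = \frac{W}{W + M} = \frac{1}{1 + \mathrm{CFR}},
\qquad\text{so}\qquad
\mathrm{CFR} \ge c \iff \mathrm{BRA} \le \frac{1}{1 + c}.
\]
```

Under these idealised assumptions a CFR threshold of 0.5, for example, would correspond to a BRA threshold of 2/3. For real skulls the correspondence is only approximate.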
Their main interest is in genetic factors associated with BOAS apart from the genetic factors behind CFR, and indeed they find such factors! In other words, this study shows that BOAS is very complex. Its causes are multifactorial. They have no data at all on the breeds of primary interest to SRR: these breeds are not much afflicted by BOAS! It seems that van Hagen again has a reading of Njikam et al. which is not justified by that paper’s content.
Fortunately, the data sets used by the publications in PLoS ONE are available as “supplementary material” on the journal’s web pages. First of all, I would like to show a rather simple statistical graphic which shows that the relation between BOAS and CFR in Packer et al.’s Study 2 data looks nothing like what the authors hypothesized. First, here are the numbers: a table of numbers of animals with and without BOAS in groups split according to CFR as a percentage, in steps of 5%. The authors recruited animals mainly from breeders, with CFR less than 50%. It seems there were none in their sample with a CFR between 45% and 50%.
BOAS versus CFR group
This next figure is a simple “pyramid plot” of percentages with and without BOAS per CFR group. I am not taking into account the breed of these dogs, nor of other possible explanatory factors. However, as we will see, the suggestion given by the plot seems to be confirmed by more sophisticated analyses. And that suggestion is: BOAS has a roughly constant incidence of about 20% among dogs with a CFR between 20% and 45%. Below that level, BOAS incidence increases more or less linearly as CFR further decreases.
Be aware that the sample sizes on which these percentages are based are very, very small.
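To give a feeling for how little such small groups can tell us, here is a minimal sketch of the grouping into 5% CFR bins, with an approximate 95% Wilson score interval attached to each percentage. The records are made up for illustration and are not the Packer et al. data; the function names are likewise hypothetical.

```python
import math
from collections import defaultdict

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion k / n."""
    if n == 0:
        return (0.0, 1.0)
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def incidence_by_cfr_group(records, width=5):
    """Group (cfr_percent, has_boas) pairs into bins of `width` percentage points."""
    counts = defaultdict(lambda: [0, 0])          # bin start -> [n, cases]
    for cfr_pct, has_boas in records:
        start = int(cfr_pct // width) * width
        counts[start][0] += 1
        counts[start][1] += int(has_boas)
    return {start: (n, k, wilson_interval(k, n))
            for start, (n, k) in sorted(counts.items())}

# Made-up records for illustration only; these are NOT the Packer et al. data.
toy = [(7, True), (8, True), (9, False), (12, True), (13, False),
       (22, False), (23, False), (24, True), (31, False), (33, False)]

summary = incidence_by_cfr_group(toy)
for start, (n, k, (lo, hi)) in summary.items():
    print(f"CFR {start:2d}-{start + 5:2d}%: {k}/{n} with BOAS, "
          f"95% interval {lo:.2f} to {hi:.2f}")
```

With only a handful of animals per group the intervals span a large part of the range from 0 to 1. Even for 2 cases among 20 animals, `wilson_interval(2, 20)` runs from roughly 3% to roughly 30%.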
Could it be that the pattern shown in Figure 3 is caused by other important characteristics of the dogs, in particular, breed? In order to investigate this question, I first fitted a linear logistic regression model with only CFR, and then a smooth logistic regression model with only CFR. In the latter, the effect of CFR on BOAS is allowed to be any smooth function of CFR, rather than a function of a particular shape. The two fitted curves are shown in Figure 4. The solid line is the smooth fit; the dashed line is the fitted logistic curve.
This analysis confirms the impression of the pyramid plot. However, the next results which I obtained were dramatic. I added to the smooth model also Breed and Neutered-status, and also investigated some of the other variables which turned up in the papers I have cited. It turned out that “Breed” is not a useful explanatory factor. CFR is hardly significant. Possibly, just one particular breed is important: the Pug. The differences between the others are negligible (once we have taken account of CFR). The variable “neutered” remains somewhat important.
Here (Table 2) is the best model which I found. As far as I can see, the Pug is a rather different animal from all the others. On the logistic scale, even taking account of CFR, Neckgirth and Neuter status, being a Pug increases the log odds of BOAS by 2.5. Below a CFR of 20%, each decrease in CFR by 5 percentage points increases the log odds of BOAS by 1, so it multiplies the odds of BOAS by a factor of close to 3. In the appendix can be seen what happens when we allow each breed to have its own effect. We can no longer separate the influence of Breed from CFR and we cannot say anything about any individual breeds, except for one.
| (CFRpct - 20) * (CFRpct < 20) | -0.20*** | (0.05) |
| Breed == “Pug”:TRUE | 2.48*** | (0.71) |
*** p < 0.001; ** p < 0.01; * p < 0.05
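The two coefficients in the table rows above can be turned into statements about odds with a few lines of arithmetic. This is a minimal sketch using only those two printed coefficients; the intercept and the remaining terms of the model are not available here and are omitted, and the function names are hypothetical.

```python
import math

# Coefficients as printed in the surviving rows of the table above.
COEF_HINGE = -0.20   # per percentage point of CFR, active only below 20%
COEF_PUG = 2.48      # indicator for Breed == "Pug"

def hinge(cfr_pct):
    """The term (CFRpct - 20) * (CFRpct < 20): zero at or above 20%, negative below."""
    return (cfr_pct - 20.0) if cfr_pct < 20.0 else 0.0

def log_odds_contribution(cfr_pct, is_pug):
    """Contribution of these two terms only; intercept and other terms are omitted."""
    return COEF_HINGE * hinge(cfr_pct) + COEF_PUG * (1.0 if is_pug else 0.0)

# Lowering CFR from 15% to 10% adds 1.0 to the log odds, i.e. multiplies the odds by e.
delta = log_odds_contribution(10, False) - log_odds_contribution(15, False)
print(f"log-odds change: {delta:.2f}, odds multiplier: {math.exp(delta):.2f}")

# At or above 20% the hinge term is flat, so CFR has no effect in this model.
print(log_odds_contribution(30, False) == log_odds_contribution(45, False))

# Odds multiplier associated with being a Pug, other terms held fixed.
print(f"Pug odds multiplier: {math.exp(COEF_PUG):.1f}")
```

Note that these multipliers apply to the odds p/(1 - p), not directly to the incidence p. The two are close only when p is small.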
The pug is in a bad way. But we knew that before. Packer Study 2 data:
| Without BOAS | With BOAS |
The graphs of Packer et al. in Figure 1 are a fantasy. Reanalysis of their data shows that their model assumptions are wrong. We already knew that BOAS incidence, Breed, and CFR are closely related, and naturally they see that again in their data. But the actual, possibly breed-specific, relation between CFR and BOAS is completely different from what their fitted model suggests. In fact, the relation between CFR and BOAS seems to be much the same for all breeds, except possibly for the Pug.
The paper Packer et al. (2015) is rightly described by its authors as exploratory. This means: it generates interesting suggestions for further research. The later paper by Liu et al. (2017) is excellent follow-up research. It follows up on the suggestions of Packer et al., but in fact it does not find confirmation of their hypotheses. On the contrary, it gives strong evidence that they were false. Unfortunately, it only studies three breeds, and those breeds are breeds where we already know action should be taken. But already on the basis of a study of just those three breeds, it comes out strongly against taking one single simple criterion, the same for all breeds, as the basis for legislation on breeding.
Further research based on a reanalysis of the data of Packer et al. (2015) shows that the main assumptions of those authors were wrong and that, had they made more reasonable assumptions, completely different conclusions would have been drawn from their study.
The conclusion to be drawn from the works I have discussed is that it is unreasonable to suppose that a single simple criterion, the same for all breeds, can be a sound basis for legislation on breeding. Packer et al. clearly hoped to find support for this but failed: Liu et al. scuppered that dream. Reanalysis of their data with more sophisticated statistical tools shows that they should already have seen that they were betting on the wrong horse.
Below a CFR of 20%, a further decrease in CFR is associated with a higher incidence of BOAS. There is not enough data on every breed to see if this relationship is the same for all breeds. For Pugs, things are much worse. For some breeds, it might not be so bad.
Study 2 of Packer et al. (2015) needs to be replicated, with much larger sample sizes.
van Hagen MAE (2019) Fokken met Kortsnuitige Honden. Criteria ter handhaving van art. 3.4. Besluit Houders van dieren Fokken met Gezelschapsdieren. Departement Dier in Wetenschap en Maatschappij en het Expertisecentrum Genetica Gezelschapsdieren, Universiteit Utrecht. https://dspace.library.uu.nl/handle/1874/391544; English translation: https://www.uu.nl/sites/default/files/eng_breeding_short-muzzled_dogs_in_the_netherlands_expertisecentre_genetics_of_companionanimals_2019_translation_from_dutch.pdf
Liu N-C, Troconis EL, Kalmar L, Price DJ, Wright HE, Adams VJ, Sargan DR, Ladlow JF (2017) Conformational risk factors of brachycephalic obstructive airway syndrome (BOAS) in pugs, French bulldogs, and bulldogs. PLoS ONE 12 (8): e0181928. https://doi.org/10.1371/journal.pone.0181928
Njikam IN, Huault M, Pirson V, Detilleux J (2009) The influence of phylogenic origin on the occurrence of brachycephalic airway obstruction syndrome in a large retrospective study. International Journal of Applied Research in Veterinary Medicine 7(3) 138–143. http://www.jarvm.com/articles/Vol7Iss3/Nijkam%20138-143.pdf
Packer RMA, Hendricks A, Volk HA, Shihab NK, Burn CC (2013) How Long and Low Can You Go? Effect of Conformation on the Risk of Thoracolumbar Intervertebral Disc Extrusion in Domestic Dogs. PLoS ONE 8 (7): e69650. https://doi.org/10.1371/journal.pone.0069650
Packer RMA, Hendricks A, Tivers MS, Burn CC (2015) Impact of Facial Conformation on Canine Health: Brachycephalic Obstructive Airway Syndrome. PLoS ONE 10 (10): e0137496. https://doi.org/10.1371/journal.pone.0137496
| Breed:Cavalier King Charles Spaniel | 0.82 | (1.37) |
| Breed:Dogue de Bordeaux | -43.35 | (67108864.00) |
| Breed:Staffordshire Bull Terrier | -43.37 | (47453132.81) |
| Breed:Staffordshire Bull Terrier Cross | 2.36 | (2.07) |
| Num. smooth terms | 1 |
*** p < 0.001; ** p < 0.01; * p < 0.05
The above model (Table 4), allowing each breed to have its own separate “fixed” effect, is not a success. That was presumably the motivation to make “Breed” a random, not a fixed, effect in the Packer et al. publication: treating breed effects as drawn from a normal distribution and assuming the same effect of CFR for all breeds disguises the multicollinearity and lack of information in the data. The many breeds, most of them contributing only one or two animals, enabled the authors’ statistical software to compute an overall estimate of “variability between breeds”, but the result is pretty meaningless.
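The enormous standard errors in the table (for instance an estimate of -43.35 with a standard error of 67108864.00) are what one typically sees when a breed is represented only by animals without BOAS, so that the likelihood has no finite maximum. That reading is an assumption, not something stated in the table. A minimal sketch with a hypothetical group of two unaffected animals shows the mechanism:

```python
import math

def log_likelihood(beta, n_animals, n_cases):
    """Binomial log-likelihood for one group whose log odds of BOAS equal beta."""
    log_p = -math.log1p(math.exp(-beta))       # log of p = 1 / (1 + exp(-beta))
    log_q = -math.log1p(math.exp(beta))        # log of 1 - p
    return n_cases * log_p + (n_animals - n_cases) * log_q

# Hypothetical group: 2 animals, neither with BOAS (not taken from the data).
betas = [-1, -5, -10, -20, -40]
values = [log_likelihood(b, n_animals=2, n_cases=0) for b in betas]
for b, v in zip(betas, values):
    print(f"beta = {b:4d}  log-likelihood = {v:.3e}")
```

The log-likelihood keeps increasing towards zero as the coefficient becomes more negative, so the fitting algorithm stops at some large negative value where the curve is essentially flat. The reported estimate and its standard error then carry no information about that breed.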
Further inspection shows that many breeds are represented by only 1 or 2 animals in the study. Only five are present in something like reasonable numbers. These five are the Affenpinscher, Cavalier King Charles Spaniel, Griffon Bruxellois, Japanese Chin and Pug, in numbers 31, 11, 20, 10 and 32. I fitted a GLM (logistic regression) trying to explain BOAS in these 105 animals by their breed together with the variables CFR, BCR, and so on. Even then, the multicollinearity between all these variables is so strong that the best model did not include CFR at all. In fact: once BCS (Body Condition Score) was included, no other variable could be added without almost everything becoming statistically insignificant. Not surprisingly, it is good to have a good BCS. Being a Pug or a Japanese Chin is disastrous. Cavalier King Charles Spaniel is intermediate. Affenpinscher and Griffon Bruxellois have the least BOAS (and about the same amount, namely an incidence of 10%), even though the mean CFRs of these two breeds seem somewhat different (0.25, 0.15).
Had the authors presented p-values and error bars the paper would probably never have been published. The study should be repeated with a sample 10 times larger.
This work was partly funded by “Stichting Ras en Recht” (SRR; Foundation Justice for Pedigree dogs). The author accepted the commission by SRR to review statistical aspects of MAE van Hagen’s report “Breeding of short-muzzled dogs” under the condition that he would report his honest professional and scientific opinion on van Hagen’s literature study and its sources.