Slinging mud at El Diablo

A couple of months ago on this blog we discussed a study by Busby and Messiers entitled “Cancer near Trawsfynydd Nuclear Power Station in Wales, UK: A Cross-Sectional Cohort Study”, which had a couple of shortcomings…or, in other words…it was pretty bad. If you want to know why and how, have a look at “The Story that wasn’t one…” (link). The paper was published in the ‘Jacobs Journal of Epidemiology and Preventive Medicine’, and if you have never heard of that…you are not the only one! It is one of those new open access journals that features prominently on Beall’s list of predatory, or in other words if-you-pay-we-will-publish-anything, journals (link). What was also slightly dodgy about that whole study was that one of the authors (Busby, for the record) is on the journal’s Editorial Board, which may well have helped in getting a paper with such errors published. Anyway, that was a previous blog post, and if you are interested you can find it via the link above (or elsewhere on The Fun Police site).

Interestingly, the same author and Editorial Board member has recently published a new study in, of course, the same journal. It was published online on September 14, 2016, within two months of submission of the first draft. This implies that it was an exceptionally well-written and scientifically accurate first version of the paper, that the peer review and journal are very fast, or that the peer review may not have been up to scratch…or of course a combination of the three. It is open access, and you can find it by clicking here (link).

Enfin… The paper is entitled “Is there Evidence of Adverse Health Effects Near US Nuclear Installations? Infant Mortality in Coastal Communities near The Diablo Canyon Nuclear Power Station in California, 1989-2012”.
That is quite a mouthful, but in summary the study looks at whether there is a correlation between living in a coastal area close to the Diablo Canyon Nuclear Power Station and the number of infants dying in that area. The idea behind the study is not that bad. Some previous studies have shown that living close to a nuclear power station may be correlated with a number of increased disease risks, and some have proposed that releases from the nuclear power station may have something to do with that. Busby argues in this study that this risk is an underestimate of the true risk: whereas previous studies looked at the general population living in the vicinity of power stations, the real risk comes from contamination downwind and downstream of those stations. Specifically for power stations built near the sea, which is quite a lot of them, this would result in contamination of coastal areas with radioactive materials. Indeed, as you may have guessed, Diablo Canyon Nuclear Plant in California is such a place.

So, Busby got the 1989-2012 births and infant deaths by year and by zip code, summed them per area (zip code group) and divided the deaths by the births to obtain crude mortality rates per 1,000 births. For reasons unknown to me, since he had the annual data, he then decided to group these again into four 6-year periods. And then, finally, he divided them into a Coastal Group, which supposedly has the highest exposure, an Inland Group, which has lower exposure but is still relatively close to the power station, and as a third group he used the average numbers across all of California. The paper comes with a nice figure that maps out the coastal and inland groups of zip codes, so at least you can see what was done. Interestingly, there seem to be two sets of Inland “control areas”, which is a bit unclear, and not all coastal areas have been included either.
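For readers who like to see the bookkeeping, the rate calculation described above can be sketched in a few lines of Python. This is only my reading of the method, not the code used in the paper: the zip codes, the group assignment and all counts below are invented placeholders, and the function names are my own.

```python
from collections import defaultdict

# Hypothetical zip-to-group assignment; placeholder codes, not those in the paper.
ZIP_GROUP = {"00001": "coastal", "00002": "coastal", "00003": "inland"}


def period_of(year, start=1989, width=6):
    """Return (first_year, last_year) of the 6-year block containing `year`."""
    first = start + ((year - start) // width) * width
    return (first, first + width - 1)


def crude_rates(records):
    """records: iterable of (year, zip_code, births, infant_deaths).

    Sums births and deaths per (group, period) and returns
    {(group, period): infant deaths per 1,000 births}.
    """
    births = defaultdict(int)
    deaths = defaultdict(int)
    for year, zip_code, b, d in records:
        key = (ZIP_GROUP[zip_code], period_of(year))
        births[key] += b
        deaths[key] += d
    return {key: 1000 * deaths[key] / births[key] for key in births}


# Invented counts, for illustration only.
records = [
    (1989, "00001", 400, 3),
    (1990, "00002", 600, 2),
    (1989, "00003", 1000, 6),
    (2012, "00001", 500, 4),
]
rates = crude_rates(records)
```

Note that nothing in this calculation requires the 6-year grouping: setting `width=1` would give annual rates from the same data.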
The zip code grouping is all a bit ambiguous, but let’s give him the benefit of the doubt.

What first springs to mind when looking at the infant mortality rates is, quite worryingly, how high they are compared to European countries (i.e. 4-7 per 1,000 compared to 2-4ish in Europe, as you can see here). That, however, is a completely different story…

The general argument in the paper is that although these rates went down in the coastal areas, the inland areas, and California as a whole up to the year 2000, they kept going down until 2007-2012 in California as a whole, while in the coastal areas they started to increase again after 2000. In the “intermediate” inland areas the rate stayed fairly stable at about 4 per 1,000 from 2000 onwards. Busby’s argument then is that this had to have been the result of the releases from the nuclear power station, and he shows a nice and clear linear correlation between the infant mortality rate in the coastal areas in the four periods and the cumulative amount of tritium (as a marker of all releases from the plant) since the start of operations in 1986.

This, in a nutshell, is the story. Could it be true? Yes it could. Is it a bit flimsy? It most definitely is. Despite the fact that Busby had the data to look at this correlation on an annual level, and so could have looked at temporal patterns in a much better way, he chose not to, and instead used only four periods. I personally find that odd. Moreover, there is a table in the paper which shows the number of births in each of the included areas (stratified by coastal, inland or California as a whole) for the four periods. After just one glance you should notice that whereas in all but one of the coastal areas the number of births has steadily been decreasing over the time period, in most of the inland areas it has been increasing; in fact, the overall decrease is almost entirely the result of a mass exodus in the area of San Luis Obispo.
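Why do the birth counts matter? Because a crude rate is a ratio, and a ratio moves when its denominator moves. Here is a minimal sketch with invented numbers. The starting births and deaths are made up, holding the number of deaths constant is an assumption made purely for the sake of argument, and the 20% and 5% declines are used as round illustrative figures.

```python
def rate_per_1000(deaths, births):
    """Crude infant mortality rate: deaths per 1,000 births."""
    return 1000 * deaths / births


def rate_after_birth_decline(deaths, births_start, decline):
    """Rate if deaths stay constant while births fall by the fraction `decline`."""
    return rate_per_1000(deaths, births_start * (1 - decline))


births_start = 10_000  # invented starting number of births
deaths = 50            # invented and held constant (assumption)

baseline = rate_per_1000(deaths, births_start)                 # 5.0 per 1,000
steep = rate_after_birth_decline(deaths, births_start, 0.20)   # about 6.25
mild = rate_after_birth_decline(deaths, births_start, 0.05)    # about 5.26
```

In this toy setting the area with the larger drop in births shows the larger rise in the rate, without a single extra death. Whether deaths really stayed constant is of course exactly what one would have to check in the real data.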
Let’s think about those birth numbers a bit… In the coastal areas the population has decreased by about 20% over that time period, while in the inland areas it has decreased by less than 5%. If the number of infants dying remained more or less stable (for the sake of argument), the observed changes in rates would be exactly as shown in the figure, as a result of demographic changes only! Is that likely? Well, that depends on who is actually leaving (or entering) the areas. Indeed, although I don’t know the areas, it seems very plausible that much will have changed in the 24-year period.

Someone else pointed that out and said it was the result of Busby not taking the changes in Hispanic/White birth rates in the study area into account because, apparently, the infant mortality rate in Hispanics is higher. Busby shows the percentages of Hispanic births over time in the three areas and shows that this cannot be the explanation (conveniently, this is not aggregated by the same time periods, so it cannot be directly linked). This could easily have been modelled statistically, such that the effect of the percentage of Hispanic/White births was taken into account (and what about a, presumably, increased percentage of mixed-race births? How would that change the estimates?). This is done with multivariate regression methods, and if you don’t know what those are, for the purpose of this article it is easiest to remember that they are easy to do and that Busby has access to the data to do this.

There are only very few reasons why someone would not do multivariate regression models to adjust for important confounding factors (for confounding click here):

1) It is the start of the data analyses, and a researcher just wants to see what is going on at face value. Multivariate analyses will follow after.
2) Multivariate analyses are not needed because it is a randomized controlled trial and all confounders are randomly distributed across the groups (note: this is NOT the case here). And even then it is often done anyway, just in case.
3) You want to make a point, and it shows very clearly in a straightforward comparison, but not when you do better, multivariate analyses.

Indeed, point three is dodgy science, and problematic! I talked about this in my last blog post as well (reference to “Alcohol and ‘fact’ checking in Ireland” here). It unfortunately happens a lot, most notably because the simple result often fits so nicely with the point we are trying to make, so why look further…this is called confirmation bias. And I strongly suspect this is what has been done in this paper too. A point well made, so why spoil it by making it more complicated?! Busby has the annual data and he has the racial demographics of the births (at least at area level), so why not look at it?

But let’s go back to those differences in migration rates (remember, the 20% vs 5%) and think about multivariate analyses a bit further. Say, for example, that it is mostly young people moving out of the coastal areas and into the inland areas; what would happen then? The result would be that, with fewer births (we know that from the paper) and with infant mortality rates going up (we know that from the paper too), the percentage of births to older parents has probably gone up too. Maternal age is directly related to the infant mortality rate in the US (here are links to papers for the US: link1 and link2), so that could explain what we are seeing in the paper as well, and therefore we may not need the contamination from the nuclear plant as a factor to explain the observed patterns.

Can we find out?

I did a quick online search and was not able to find longitudinal data quickly (please let me know if you know how to get this), but the 2010(ish) median age for each of the areas in the study is easy to find.
That will work as an indication… So the range for the coastal areas is about 28 to 57 years of age (median ~44 years), and for the inland areas the range is about 27 to 45 years (median ~37 years). In other words…yes indeed, the population in the coastal areas is older on average than in the inland areas (at least in the 2007-2012 time stratum), and this could explain the observed differences.

My point here is not that the cause of these differences in infant mortality rates is definitely not radiation exposure; I don’t know that (although in my opinion this is highly unlikely). My point is that with relatively little effort I found a possible other explanation. And it is fairly straightforward to come up with possible other demographic factors as well; for example, maybe socio-economic status differs between the areas, or income (in fact, the same sources showed that in 2010ish median household income was comparable in both areas, at about 45k, but was much higher in California as a whole (58k), which may explain that difference). And maybe, if these are poorer areas, there is less investment in healthcare, or more people have less access to healthcare, just to name some other things. My main point is that all these other factors could easily have been included in the analyses, and these data are all available (probably for free).

Just to recap: there are three reasons why you would not do multivariate analyses when the data are available…and only one seems relevant to this particular study.

As a side note, while googling the area I found something that nicely illustrates the correlation-with-contamination issue as well. How about the following alternative hypothesis? I came across a report (full report link) entitled “Pismo Beach Fecal contamination source identification study”.
Pismo Beach is one of the coastal areas in the study. The report describes a study to identify the biological sources of fecal contamination, as well as the physical and environmental factors that influence the levels of bacteria in the ocean waters at Pismo Beach, and it was conducted in 2008 (you know, in the final 2007-2012 time block). So contamination must have been ongoing in the preceding years…I mean, you know…swimming in water polluted with fecal residue, let alone swallowing it…just sayin’…

Anyway, personally, I don’t think studies like this should be published (or at least not without properly addressing their shortcomings), especially not in a highly emotive area of environmental health such as this one. There are enough worries related to radiation and nuclear power as it is, and these things don’t help. A comparable worry has been ongoing in the UK in relation to childhood leukaemia clusters in areas surrounding nuclear power stations, and whether these could be the result of the release of radioactive contamination from the sites. The UK Independent Advisory Committee on Medical Aspects of Radiation in the Environment (COMARE) has conducted a thorough and exhaustive review of all available evidence and concluded that a more likely explanation, at least in part, is urban/rural population mixing and associated changes in viral loads (disclaimer: I am a member of that committee). If you are interested, the full report can be downloaded for free here (link).