The Story that wasn’t one....

Last week Sense about Science asked me to comment on a paper that was recently published in something called the Jacobs Journal of Epidemiology and Preventive Medicine. Never heard of that journal? Indeed, neither had I.

It turns out that this is one of the open access journals that features prominently on Beall’s list of predatory, or in other words if-you-pay-we-will-publish-anything, journals (link). That’s not a great start for any scientific paper, and it usually implies the paper was rejected by any number of better, peer-reviewed, scientific journals. But let’s not be too hasty to judge the paper before reading it. After all, through some miraculous mechanism of science dissemination this paper got picked up by the Daily Mail (here) and was thus read by quite a lot of people.

It’s a paper by Busby and de Messieres entitled ‘Cancer near Trawsfynydd Nuclear Power Station in Wales, UK: A Cross-Sectional Cohort Study’ and, according to the Daily Mail, it proves that living close to a nuclear power station, or more specifically downwind from one, is associated with massively increased cancer risks. The strongest risk was found for female breast cancer, with a five times higher risk than expected. So that’s quite serious….if it is true of course. Go and enjoy, have a look; it’s open access (here).

My initial and short response can be found on the Sense about Science website and essentially covers two issues: problems with the study design and the very small number of cases (see the comment here).

Now that I have a bit more time for writing (the new Fun Police website you are now looking at took up most of my time last weekend instead), let’s have a closer look at the paper together and see what we think….

….so what do we think?

….What the hell is a cross sectional cohort study????

Let’s start there because it is in the title.
Let me summarize two different epidemiological study designs for you:

- A cross-sectional study is, as the name implies, a study done at one point in time to get information about the population of interest by, figuratively (obviously…literally is unethical), cutting through the population and seeing how many cancers you find. It is, therefore, a ‘snapshot’.
- A cohort study on the other hand is a study where you take a clearly defined group of people and then follow them over time to see who develops the disease of interest. It’s called a cohort study because, again figuratively, it looks like a Roman army cohort that starts as a clearly defined group of soldiers which is then followed through until the end of the battle. What I described here is a prospective cohort study, but you can also conduct a retrospective cohort study in which you, for example, get information about everyone who was living in a certain area at a specific point in time (say 1970) and you see what happened to them until today.

So in other words, a cross-sectional study looks at one point in time and a cohort study follows a set of people over time. A cross-sectional cohort study, on the other hand, turns out to be a design that was proposed previously by Hudson et al. (link), but never gained much traction. The reason for that is, basically, that it is not very good. It’s an amalgamation of a cross-sectional study and a badly conducted cohort study.

Looking at the paper we can conclude that it is in fact a cross-sectional study. There is nothing wrong with this approach, but in terms of determining the causal relation between living downwind from a nuclear power station and cancer risk, it is a pretty weak epidemiological design.
The people who currently live in this area could have lived there for their whole life, or could have just moved into the area, or they could have smoked for twenty years but stopped five years ago (i.e. they now count as non-smokers), or they may simply not want to answer the door, just to name a few problems. Indeed, a retrospective cohort study would have been a much better idea.

Anyway, the investigators went to the houses of everyone living in the study area, i.e. downwind from the nuclear power station, and asked whether they had cancer (plus presumably other questions, but that is not very well described). That is fine, and you then get an estimate for the current prevalence of cancers in that area. The word prevalence is crucial here, because the investigators then compared this to the expected incidence rate of cancer (based on the population age distribution). Notice how that is a different word?

With respect to this particular study, it’s quite obvious where the trouble lies. A cross-sectional study is used to determine the prevalence (i.e. all current cases) of a certain disease in a population. It cannot be used to determine the incidence, since this is the number of new cases in a certain amount of time and you don’t have any time elapsing because you took a snapshot when you collected the data. What you can do, and what was done in the study by Busby, is to include only the new cases that have emerged in the last three years. That’s ok, you then just have the prevalence of the new cases. To get the incidence rate, however, you need to divide the number of new cases by the total population denominator.
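To make the difference between the two words a bit more concrete, here is a minimal sketch in Python. All numbers, names and the three-year window layout are hypothetical and only illustrate the definitions; none of them come from the paper.

```python
# All numbers are hypothetical and only illustrate the definitions.

def prevalence(current_cases, population_at_snapshot):
    """Proportion of people who have the disease at one point in time."""
    return current_cases / population_at_snapshot


def incidence_rate(new_cases, person_years):
    """New cases per person-year of follow-up."""
    return new_cases / person_years


WINDOW_YEARS = 3.0

# Years each person actually lived in the area during the three-year window.
# Residents found by a door-to-door snapshot (some moved in only recently):
found_at_snapshot = [3.0, 3.0, 1.0, 0.5, 3.0]
# People who moved away or died before the snapshot are invisible to it:
missed_by_snapshot = [2.0, 1.5]

# Naive denominator: pretend everyone found was there for the full window.
naive_person_years = len(found_at_snapshot) * WINDOW_YEARS
# Cohort denominator: time at risk actually contributed by everyone.
cohort_person_years = sum(found_at_snapshot) + sum(missed_by_snapshot)

new_cases = 2  # hypothetical number of new cases in the window

print("prevalence at snapshot:", prevalence(1, len(found_at_snapshot)))
print("naive rate:", incidence_rate(new_cases, naive_person_years))
print("cohort rate:", incidence_rate(new_cases, cohort_person_years))
```

The two rates differ only because the denominators differ. In a real snapshot you would also miss any cases among the people who left or died, so the numerator would be off as well.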
It gets a bit technical here, but essentially because you haven’t included all the people who lived in that area in the past three years (some will have moved away, some will have moved in, some may have died from other diseases, etcetera) you cannot determine the denominator, so you don’t know what number to divide your number of new cases by to get the correct incidence rates. If you had done a proper retrospective cohort study, you could have obtained the rate because you would have clearly defined the population for the denominator. I hope all this makes sense; the basic idea is that because the cancer incidence rates in the paper are incorrect, the comparison with the population expected incidence rates is, by definition, pointless.

However, for the sake of argument, let’s assume not a single person moved out of the area and not a single person moved into the area. Also, nobody died in the years prior to the cross-sectional snapshot, and let’s also assume that the interviewers interviewed everyone (do you see how unlikely that is…). This would mean that the calculated incidence rates are correct and we can compare them with the expected incidence rates. In this study, these expected rates have been obtained for England and Wales for 5-year age groups to account for the increased cancer risk with age (in fact, age is the biggest risk factor we know). Would this give us the correct expected rates? Unfortunately not….

It is very unlikely that this population living downwind from a nuclear power station in a remote, non-urban area in Wales is very representative of the general population. Cancer rates differ between socio-economic classes, are dependent on the prevalence of smoking in the population, and there are various other factors that could imply the rates are likely different.
So that’s not great…

It would have been quite easy to check this though; you could have applied the exact same methodology to a comparable control population (for example those living anywhere in the vicinity, but not downwind, of the nuclear power station would in this case have been a good choice) and checked that you do not observe any increased risks there. Additionally, you could have compared both populations directly, and you would not have had to deal with a somewhat dodgy comparison to expected national rates (note that this still would not have solved the problem with the calculation of the incidence rates from this cross-sectional data!). It baffles me that the researchers have not done this to be honest…

So in summary, the researchers have calculated the wrong incidence rates and then compared these to the wrong expected rates, and did not double check all this with a control population. That is pretty sloppy.

But since all this is pretty technical, would it have mattered? Actually not at all! We need not have bothered beyond reading the abstract…

…the study is based on only 22 cancer cases, with the main finding for breast cancer based on six female breast cancers. Five of these women were below sixty years of age (this is important since, you may remember, age is by far the biggest risk factor for the development of cancer), and of these five only 1 was a non-smoker (3 were smokers and for 1 the smoking status is unknown because she died). Smoking is a known and important risk factor for breast cancer as well. In other words, any other analysis would probably have found smoking to increase cancer risk…which is hardly something we did not know already.

Since epidemiology is based on statistical methods relying on big numbers, any population study based on so few cases is considered very weak. Statistically significant findings will occur, but just one or two additional or fewer cases will completely change the result. Would you trust a finding for a whole population based on just one or two people?
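As a rough illustration of how fragile a result based on six cases is, the sketch below recomputes the observed/expected ratio when one or two cases are added or removed. The six observed cases are from the paper; the expected count of 1.2 is my own assumption, simply back-calculated from the reported ‘five times higher than expected’, and treating the count as a Poisson variable is likewise a simplifying assumption rather than the authors’ method.

```python
import math

OBSERVED = 6    # female breast cancers reported in the paper
EXPECTED = 1.2  # ASSUMPTION: back-calculated from "five times higher than expected"


def ratio(observed, expected):
    """Observed number of cases divided by the expected number."""
    return observed / expected


def poisson_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu): chance of k or more cases if mu are expected."""
    below = sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k))
    return 1.0 - below


# Shift the observed count by one or two cases in either direction.
for cases in range(OBSERVED - 2, OBSERVED + 3):
    print(
        cases,
        "cases -> ratio",
        round(ratio(cases, EXPECTED), 2),
        "| P(at least this many by chance)",
        round(poisson_tail(cases, EXPECTED), 4),
    )
```

With an expected count this small, each single case moves the ratio by almost a whole unit, which is exactly why one or two misclassified or missed people can change the headline number so much.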
So does this mean that living downwind of that nuclear power station is not associated with increased cancer risk? Not at all…just as there is a lack of proof of an increased risk, there is also a lack of proof of the opposite. In fact, this study does not tell us anything...

Indeed, we need not have bothered reading the whole paper, and could have just stuck with the abstract. I am, therefore, profoundly sorry for wasting your time….

…although you have now learned not to trust articles in the Daily Mail…..at least not blindly.