
Should we trust low to moderate increased risks in observational epidemiology, like ever…?

A relatively short post this month, and it is also dealing with something we all know. However, sometimes it is important to reiterate the stuff everybody already knows, so that they remember it and, you know… so that everybody actually knows it. So welcome to the wonderful world of residual confounding.

Just to explain, in case epidemiology is not the driving force of your life: a confounder is a factor that is correlated with the exposure of interest and also with the disease you are interested in (and yes, it is not on the causal pathway and all that). In graphical form, the confounder has arrows pointing to both the exposure and the disease.

To give you an example: people who smoke tobacco are, on average, also more likely to drink alcohol, and both cause cancer (and smoking does not cause alcohol consumption or vice versa). So you can imagine that if you do a study on the effect of alcohol consumption on cancer risk, you also need to ask people about smoking, since there will be more people who smoke tobacco (which also causes cancer, as you may know) in the group of people who drink alcohol. If you don't ask this, then part of the cancer cases you think were caused by alcohol were actually caused by tobacco, and this distortion gets worse, on average, for people drinking more (precisely because the two are correlated).

Anyway, this implies that all confounding factors need to be taken into account when you do an epidemiological study. This is, however, not as easy as it sounds. Not only do you need to think of asking participants about them (so you need to have thought about this before you started collecting any data), but the fact that something could be a confounding factor also has to be known at the time you are supposed to ask about it (on account of you having to think about it before starting to collect data).
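To make the smoking/alcohol example concrete, here is a minimal simulation sketch in Python. All the numbers (prevalences, odds ratios) are made up purely for illustration; the point is only the mechanism: because drinking and smoking are correlated, the crude odds ratio for drinking is inflated, while stratifying on smoking recovers something close to the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # hypothetical cohort; all numbers below are made up

# Smoking and drinking are correlated: drinking is more common among smokers
smoke = rng.random(n) < 0.3
drink = rng.random(n) < np.where(smoke, 0.6, 0.3)

# Cancer risk: both habits raise the odds (assumed ORs of 2.0 and 1.5)
log_odds = -4.0 + np.log(2.0) * smoke + np.log(1.5) * drink
cancer = rng.random(n) < 1.0 / (1.0 + np.exp(-log_odds))

def odds_ratio(exposed, diseased):
    """Crude odds ratio from a 2x2 table."""
    a = np.sum(exposed & diseased)
    b = np.sum(exposed & ~diseased)
    c = np.sum(~exposed & diseased)
    d = np.sum(~exposed & ~diseased)
    return (a * d) / (b * c)

crude = odds_ratio(drink, cancer)
# Stratify on smoking: the OR for drinking within each smoking stratum
stratified = [odds_ratio(drink[smoke == s], cancer[smoke == s])
              for s in (False, True)]
print(f"crude OR: {crude:.2f}, within smoking strata: "
      f"{stratified[0]:.2f}, {stratified[1]:.2f}")
```

The crude OR lands well above the true drinking effect of 1.5, while the stratum-specific estimates sit near it; this is exactly the "you forgot to ask about smoking" problem.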
For (retrospective) observational epidemiology there is then an extra problem: if the information about the confounder was not collected, say, twenty years back, how are you going to estimate its effect at all? (One solution is through modelling, such as what we used here: <link>.)

If the effect of the exposure of interest on the disease is really big, then this is not necessarily a massive problem (if you are only interested in whether something is a true risk factor). For example, if you work on a building site, then working at heights is correlated with being exposed to dust, and both can cause premature death. However, if you are interested in the risk of falling from heights and mortality, measuring dust exposure, despite it being a confounder, is not that important. (If you think this is a stupid example, I am not entirely unsympathetic… and would very much welcome a better example in the comments below!)

But most of these big risk factors are kind of already known; what if we are studying things with low or moderate increased risks? By that I mean those with less than two-fold increased risks (odds ratios or relative risks below 2): say, for example, the health risks of air pollution, the studies of whether electromagnetic fields may cause cancer, basically any study on cancer risk and single food ingredients, etcetera. I did not choose these examples on a whim; they have low-to-moderate risks and the potential for confounding is very large (for example from socio-economic status, other nutrients, exercise, …).

Initially I just wanted to run some simulation studies to familiarize myself again with the impact of this (e.g. what if exercise wasn't measured in a study on cancer risk and, say, drinking green tea?). It really drove home how much of an issue this residual confounding is, so I thought I'd share it with you.
I simulated a case-control study of 1000 people (500ish cases and 500ish controls) and, for the sake of argument, assumed there was no measurement error. Now what if the exposure of interest was not a cause of the disease (i.e. its true OR = 1), but it was correlated with another factor that was – and this factor was not taken into account in the statistical model?

I simulated different correlations between the two factors (ranging from 10% to 80%, with 100% being perfect correlation) and also varied the odds ratio of the unmeasured factor from 1 (no effect either) to 2.2 (more than a two-fold increased risk). I then plotted the odds ratio of the (unrelated!) risk factor that I got from the logistic regression model.

Have a look at the figure below! The x-axis is the OR of the unmeasured confounding factor, the coloured lines are the different correlations, and the black line is the true OR (i.e. 1, because the exposure was not truly a risk factor).

Now I think this figure is pretty worrying! If the correlation between the two is moderate (40%, the green line), you start to see an increased risk where none exists from an OR of the unmeasured confounder of only about 1.4, and the effect gets much stronger with higher correlations. For example, a moderate correlation of 60% is really very common, and so is an OR of 1.5 (the yellow line), and this would give you a wrong point estimate of 1.3, or a 30% increased risk. As a comparison for that 30%: the increased risk of mortality from long-term exposure to outdoor sulphur (in PM2.5 specifically) is 14% (95% confidence interval 6%-23%) per 200 ng/m3 (reference to the ESCAPE study: Beelen et al. 2015). So the results of this simulation are quite realistic…

Now, more importantly, for detection of a statistically significant risk, the lower limit of the 95% confidence interval should be above 1.
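The core of the simulation above can be sketched in a few lines of Python (this is my own minimal reconstruction, not the exact code behind the figures; the prevalence, correlation, and confounder OR are example values). A binary exposure is made correlated with a binary confounder, the disease depends only on the confounder, and yet the crude odds ratio for the causally inert exposure comes out above 1:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000  # large source population, so sampling noise is small

p, corr = 0.3, 0.6  # prevalence and exposure-confounder correlation (assumed)
or_conf = 1.5       # odds ratio of the unmeasured confounder

# Binary confounder, and an exposure that copies it with probability `corr`
# (this construction gives a correlation of exactly `corr` between the two)
conf = rng.random(n) < p
expo = np.where(rng.random(n) < corr, conf, rng.random(n) < p)

# Disease depends ONLY on the confounder: the true OR for the exposure is 1.0
log_odds = -3.0 + np.log(or_conf) * conf
disease = rng.random(n) < 1.0 / (1.0 + np.exp(-log_odds))

# Crude 2x2 odds ratio for the exposure, ignoring the confounder entirely
a = np.sum(expo & disease)
b = np.sum(expo & ~disease)
c = np.sum(~expo & disease)
d = np.sum(~expo & ~disease)
crude_or = (a * d) / (b * c)
print(f"crude OR for a causally inert exposure: {crude_or:.2f}")  # well above 1
```

Sweeping `corr` and `or_conf` over a grid and plotting `crude_or` against them reproduces the kind of curves discussed here.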
I have done this for my simulated study in the figure below. In other words, for moderate correlations (>0.6) you only need a pretty small unmeasured confounder, of about OR = 1.4 (a 40% increased risk), to conclude that the exposure you were interested in (which, remember, does not actually cause anything) causes the disease. Even with a correlation of only 40% (the grey line), a missed confounder with an OR of about 1.8 (still moderate, really) would have you conclude that you just found a (new!) causal factor for the disease!

So is there a point to this? There is, of course… it's a warning that if we find odds ratios below about 2, we should really spend a lot more time thinking about residual confounding. In fact, I think this should be a mandatory part of the Discussion section of observational epidemiology papers. It is touched upon in most papers already, but usually the focus is really on the effect we observed (let's face it, a P-value below 5% is more likely to get published, unfortunately), with little discussion of other causes and residual confounding. In fact (and I am proud to say I used these words in a paper once) we should be more "epistemologically modest" in observational epidemiology and not make such big claims.

The real message, as such, is of course that next time you publish a paper, or next time you read a Daily Mail article about what causes cancer this time, you think about confounding…

…and that you think about The Fun Police's blog!…

Next time you attend a seminar or conference and someone presents a new finding with a low or moderate risk, please stand up, raise your hand and say "You should have a look at this blog by this guy…"
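For completeness, the "lower limit of the 95% CI above 1" check uses the standard Wald interval on the log odds ratio. Here it is applied to a hypothetical 2x2 table of the size discussed in this post (the cell counts are invented for illustration):

```python
import numpy as np

def or_with_ci(a, b, c, d):
    """Odds ratio with a Wald 95% CI from a 2x2 table (textbook formula)."""
    log_or = np.log((a * d) / (b * c))
    se = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (np.exp(log_or),
            np.exp(log_or - 1.96 * se),
            np.exp(log_or + 1.96 * se))

# A hypothetical confounded table from a study of ~1000 participants:
# exposed cases, exposed controls, unexposed cases, unexposed controls
or_hat, lo, hi = or_with_ci(300, 200, 230, 270)
print(f"OR = {or_hat:.2f} (95% CI {lo:.2f}-{hi:.2f})")
# lower limit clears 1: "statistically significant", even if the exposure
# itself is causally inert and the OR is driven by an unmeasured confounder
```

Note that with around a thousand participants the interval is tight enough that even a modest confounded OR comfortably excludes 1, which is exactly the trap described above.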