A case-control study compares two groups, cases and controls, in order to estimate the association between exposure and disease. By choosing persons with the adverse health outcome of interest (cases) and a properly constituted sample from the source population (controls), we compare exposure prevalence in the two groups. A cohort study has the same goal, estimating the association between exposure and disease, and also pursues it by comparing two groups, the exposed and the unexposed. Unfortunately, there is a temptation to select cases and controls in a manner that mimics the selection of exposed and unexposed subjects in a cohort study. The roles of the unexposed group in a cohort study and the control group in a case-control study, however, are entirely different. In a cohort study, the goal is to select an unexposed group that has the same baseline risk of disease as the exposed group, apart from any effect of the exposure itself. If that goal is met, then the disease experience of the unexposed group provides a valid estimate of the disease risk the exposed persons would have had if they had not been exposed (the counterfactual comparison). Cohort studies attempt to mimic a randomized trial or experiment in which the exposure of interest is manipulated to ensure, to the maximum extent possible, that the exposed and unexposed are identical in all respects other than the exposure of interest.
In a case-control study, by contrast, given a set of cases with the disease, the goal is to select controls who approximate the exposure prevalence in the study base, that is, the population experience that generated the cases. The key comparison for judging whether the control group is a good one is not between the cases and the controls, but between the controls and the study base they are intended to approximate. The available cases define the scope of the study base, namely the population experience that gave rise to that particular set of cases. Once the study base is clearly defined, the challenge for control selection is unbiased sampling from it. If this is done properly, then the case-control study will give results as valid as those that would have been obtained from a cohort study of the same population, subject to sampling error. It should be noted, however, that biases inherent in that underlying cohort, such as selection bias associated with exposure allocation, would be replicated in the case-control study sampled from that cohort.
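The equivalence with a cohort study can be sketched algebraically. The notation here is an assumption introduced for illustration, not taken from the text: the study base contains $N_1$ exposed and $N_0$ unexposed persons at the start of follow-up, who give rise to $A_1$ and $A_0$ cases, and controls are drawn from that starting population with a sampling fraction $f$ that does not depend on exposure, yielding $B_1 = f N_1$ exposed and $B_0 = f N_0$ unexposed controls in expectation. Under those assumptions, the exposure odds ratio from the case-control data reproduces the cohort risk ratio:

```latex
\[
\mathrm{OR}
= \frac{A_1 / A_0}{B_1 / B_0}
= \frac{A_1 / A_0}{f N_1 / (f N_0)}
= \frac{A_1 / N_1}{A_0 / N_0}
= \mathrm{RR}
\]
```

The sampling fraction cancels only because it is the same for exposed and unexposed persons. If control selection depends on exposure, $f$ no longer cancels and the odds ratio is biased. Other control-sampling schemes (for example, sampling among persons still at risk when each case occurs) target a rate ratio or odds ratio instead, but the same requirement of exposure-independent sampling applies.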
Consider two studies of the same issue, agricultural pesticide exposure and the development of Parkinson's disease. In the cohort study, we identify a large population of pesticide users and an unexposed cohort that is free of such exposure, then monitor and compare the incidence of Parkinson's disease in the two groups. In the case-control study, assume we have a roster of Parkinson's disease cases from a large referral center and select controls from the same geographic region as the cases, in order to compare the prevalence of exposure to agricultural pesticides in the two groups and thereby estimate the association. The methodologic challenge in the cohort study is to identify an unexposed cohort that is as similar as possible to the exposed group in all other factors that influence the risk of developing Parkinson's disease, such as demographic characteristics, tobacco use, and family disease history. Bias arises to the extent that our unexposed group does not generate a valid estimate of the disease risk the pesticide-exposed persons would have had absent that exposure.
Bias arises in case-control studies not because the cases and controls differ on characteristics other than exposure but because the selected controls do not accurately reflect exposure prevalence in the study base. In our efforts to choose appropriate controls for the referred Parkinson's disease cases, we need to first ask what defines the study base that generated those cases: what are the geographic scope and the socioeconomic and behavioral characteristics of the source population for these cases, which health care providers refer patients to this center, and so on. Once we fully understand the source of those cases, we seek to sample without bias from that study base. The critical comparison that defines whether we have succeeded in obtaining a valid control group is not the comparison of controls to Parkinson's disease cases but the comparison of controls to the study base that generated those cases. Only if the cases are effectively a random sample from the study base, that is, only if it is a foregone conclusion that there are no predictors of disease, would the goal of making controls as similar as possible to cases be appropriate.
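A small numerical sketch may make the point concrete. All numbers below are hypothetical assumptions chosen for illustration (a study base of 100,000 persons with 20% exposed, a baseline risk of 1%, and a true risk ratio of 2), and the function names are invented for this example. Expected counts are used in place of random sampling, so sampling error is ignored.

```python
from fractions import Fraction


def exposure_odds_ratio(cases_exp, cases_unexp, controls_exp, controls_unexp):
    """Exposure odds among cases divided by exposure odds among controls."""
    case_odds = Fraction(cases_exp) / Fraction(cases_unexp)
    control_odds = Fraction(controls_exp) / Fraction(controls_unexp)
    return case_odds / control_odds


def expected_controls(n_controls, exposure_prevalence):
    """Expected exposed and unexposed counts, ignoring sampling error."""
    exposed = n_controls * exposure_prevalence
    return exposed, n_controls - exposed


# Hypothetical study base: 20% exposed, risk ratio of 2 (assumed values).
base_exposed, base_unexposed = 20_000, 80_000
risk_unexposed = Fraction(1, 100)
risk_ratio = 2

cases_exposed = base_exposed * risk_unexposed * risk_ratio  # 400
cases_unexposed = base_unexposed * risk_unexposed           # 800

base_prevalence = Fraction(base_exposed, base_exposed + base_unexposed)
case_prevalence = cases_exposed / (cases_exposed + cases_unexposed)

# Controls that reflect exposure prevalence in the study base.
or_from_base = exposure_odds_ratio(
    cases_exposed, cases_unexposed, *expected_controls(1200, base_prevalence)
)

# Controls chosen to resemble the cases in exposure prevalence.
or_like_cases = exposure_odds_ratio(
    cases_exposed, cases_unexposed, *expected_controls(1200, case_prevalence)
)

print(or_from_base)   # 2
print(or_like_cases)  # 1
```

In this sketch, controls whose exposure prevalence matches the study base recover the assumed risk ratio of 2, whereas controls made to resemble the cases in exposure prevalence drive the estimate to the null value of 1, even though a real association exists.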
Properly selected controls have the same exposure prevalence, within the range of sampling error, as the study base. Selection bias distinctive to case-control studies arises when the cases and controls are not coherent relative to one another (Miettinen, 1985), i.e., the groups do not come from the same study base. Thus, the falsely seductive analogy from "exposed and unexposed should be alike in all respects except exposure" in cohort studies to "cases and controls should be alike in all respects except disease" in case-control studies is simply incorrect.