Synopsis of lecture for MPH candidates at Dept of Social and Preventive Medicine Universiti Malaya on Friday 3rd November 2006 by Professor Omar Hasan Kasule

1.0 BASICS  

The case-control study is popular because of its low cost, rapid results, and flexibility. It uses a small number of subjects. It is used for diseases (rare and non-rare) as well as non-disease situations. A case-control study can be exploratory or definitive. The variants of case-control studies are the case-base, the case-cohort, the case-only, and the crossover designs. In the case-base design, cases are all diseased individuals in the population and controls are a random sample of disease-free individuals in the same base population. The case-cohort design samples from a cohort (closed or open). The case-only design is used in genetic studies in which the control exposure distribution can be worked out theoretically. The crossover design is used for sporadic exposures. The same individual can serve as a case or as a control several times without any prejudice to the study.



The marginal totals, a+b and c+d, are fixed by design before data collection; thus prevalence cannot be computed. The source population for cases and controls must be the same. Cases are sourced from clinical records, hospital discharge records, disease registries, data from surveillance programs, employment records, and death certificates. Cases are either all cases of a disease or a sample thereof. Only incident cases (new cases) are selected. Controls must be from the same population base as the cases and must be like the cases in everything except having the disease being studied. Information comparability between the case series and the control series must be assured. Hospital, community, neighborhood, friend, dead, and relative controls are used. There is little gain in efficiency beyond a 1:2 case-control ratio unless control data are obtained at no cost. Confounding can be prevented or controlled by stratification and matching. Exposure information is obtained from interviews, hospital records, pharmacy records, vital records, disease registries, employment records, environmental data, genetic determinants, biomarkers, physical measurements, and laboratory measurements. A nested case-control study can be carried out within a follow-up study. In this case, blood and other biological specimens collected from the cohort at the start can be analyzed for exposure information when cases of disease appear.
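
As an illustration of controlling confounding by stratification, the sketch below pools stratum-specific 2x2 tables with the Mantel-Haenszel summary odds ratio. The lecture does not name this estimator; it is used here as one common choice. The cell layout (a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls), the function name, and the counts are all assumptions made for the example.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    strata: list of (a, b, c, d) tuples, one 2x2 table per level of the
    confounder, with a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls (assumed layout).
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts: within each stratum the odds ratio is 1,
# but the confounder is related to both exposure and disease.
strata = [(40, 10, 20, 5), (10, 40, 20, 80)]

# Crude table obtained by ignoring the confounder (summing the strata)
a, b, c, d = (sum(cells) for cells in zip(*strata))
crude_or = (a * d) / (b * c)
adjusted_or = mantel_haenszel_or(strata)

print(f"crude OR = {crude_or:.3f}, Mantel-Haenszel OR = {adjusted_or:.3f}")
```

In these made-up data the crude odds ratio suggests an association that disappears once the strata are analyzed separately and pooled, which is the pattern that stratification is meant to reveal.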



The results of a case-control study are set out in a 2x2 contingency table and the following parameters are computed: proportion exposed among the diseased, proportion exposed among the non-diseased, the exposure odds among the diseased, the exposure odds among the non-diseased, the exposure odds ratio (with confidence bounds), the attributable rate, the attributable rate%, and the population attributable rate. Matching enhances validity by controlling for confounding and increases efficiency. Overmatching results in underestimation of the effect measures. Matching increases costs, introduces selection bias, and does not allow study of the effect of the matching factor.
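
The quantities listed above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own worked example: the cell layout (a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls), the function name, and the counts are assumptions. The confidence bounds use Woolf's logit method, the attributable rate% uses the odds ratio in place of the relative risk, and the population figure uses a case-based formula usually credited to Miettinen; other formulas exist for each.

```python
import math


def case_control_summary(a, b, c, d, z=1.96):
    """Summary measures for a 2x2 case-control table.

    a = exposed cases,    b = unexposed cases    (a + b fixed by design)
    c = exposed controls, d = unexposed controls (c + d fixed by design)
    """
    p_cases = a / (a + b)         # proportion exposed among the diseased
    p_controls = c / (c + d)      # proportion exposed among the non-diseased
    odds_cases = a / b            # exposure odds among the diseased
    odds_controls = c / d         # exposure odds among the non-diseased
    odds_ratio = (a * d) / (b * c)

    # Woolf (logit) confidence bounds; z = 1.96 gives roughly 95% coverage
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)

    # Attributable fraction among the exposed, with OR standing in for RR
    ar_percent = 100 * (odds_ratio - 1) / odds_ratio
    # Population attributable fraction, case-based formula (assumed choice)
    par_percent = p_cases * ar_percent

    return {
        "p_cases": p_cases,
        "p_controls": p_controls,
        "odds_cases": odds_cases,
        "odds_controls": odds_controls,
        "odds_ratio": odds_ratio,
        "ci": (lower, upper),
        "ar_percent": ar_percent,
        "par_percent": par_percent,
    }


# Hypothetical study: 100 cases (40 exposed) and 100 controls (20 exposed)
summary = case_control_summary(40, 60, 20, 80)
for name, value in summary.items():
    print(name, value)
```

The function assumes that no cell is zero; with sparse tables a continuity correction or an exact method would be needed instead.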



The case-control study design has the following strengths/advantages: computation of the odds ratio (OR) as an approximation of the relative risk (RR), low cost, short duration, and convenience for subjects because they are contacted/interviewed only once. The case-control design has several disadvantages: RR is approximated and is not measured, Pr(E+|D+) is computed instead of the more informative Pr(D+|E+), rates are not obtained because marginal totals are artificial and not natural, being fixed by design, the time sequence between exposure and disease outcome is not clear, vulnerability to bias (misclassification, selection, and confounding), inability to study multiple outcomes, lack of precision in evaluating rare exposures, inability to validate historical exposure information, and inability to control for relevant confounding factors.
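
How well the OR approximates the RR depends on how common the disease is. The sketch below compares the two measures in two hypothetical full cohorts, where both can be computed directly. The function names and all counts are invented for illustration.

```python
def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Relative risk, computable only when whole-cohort denominators are known."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from the same four counts (cross-product ratio)."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical cohorts with 10,000 exposed and 10,000 unexposed subjects each
rare = (20, 9980, 10, 9990)          # risks of 0.2% and 0.1%
common = (4000, 6000, 2000, 8000)    # risks of 40% and 20%

rr_rare, or_rare = risk_ratio(*rare), odds_ratio(*rare)
rr_common, or_common = risk_ratio(*common), odds_ratio(*common)

print(f"rare disease:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
print(f"common disease: RR = {rr_common:.3f}, OR = {or_common:.3f}")
```

In both cohorts the RR is 2, but the OR is close to it only when the disease is rare; for the common disease the OR overstates the RR.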



The larger the sample size, the greater the power. Because controlling for confounding in the analysis reduces the power of a study, a larger sample size helps to offset that loss. For best results and ease of analysis, the number of cases should equal the number of controls. In actual practice the supply of cases is limited whereas controls are available in abundance. For a given number of cases, power increases as the number of controls increases, but not much marginal increase in power is obtained if the case:control ratio is higher than 1:6. Economic considerations play a part in determining the case:control ratio. Specific formulas are used to compute sample size under each of the following situations: unmatched design with equal numbers, unmatched design with unequal numbers, matched (1:1) design, and matched (1:many) design.
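
As a rough illustration of the unmatched case, the sketch below uses one common normal-approximation formula with a pooled variance (often attributed to Kelsey) to estimate the number of cases needed for r controls per case. The lecture does not say which formulas it has in mind, so the formula, the function name, and the input values (20% exposure among controls, a target odds ratio of 2, two-sided alpha of 0.05, power of 80%) are all assumptions.

```python
import math
from statistics import NormalDist


def cases_needed(p0, odds_ratio, r=1.0, alpha=0.05, power=0.80):
    """Approximate number of cases for an unmatched case-control study.

    p0: proportion exposed among controls
    odds_ratio: smallest odds ratio the study should detect
    r: number of controls per case
    """
    # Exposure proportion among cases implied by p0 and the odds ratio
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    # Exposure proportion pooled over cases and controls
    p_bar = (p1 + r * p0) / (1 + r)

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)

    n = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (r + 1)
         / (r * (p1 - p0) ** 2))
    return math.ceil(n)


# Cases required as the number of controls per case grows
for r in (1, 2, 4, 6):
    print(f"1:{r} case:control ratio -> {cases_needed(0.2, 2.0, r=r)} cases")
```

Under these assumptions the number of cases required falls as controls are added, but each additional control per case saves fewer cases than the one before, which is the diminishing return described above.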


Professor Omar Hasan Kasule Sr. November 2006