0710-Cross Sectional Design

Background material by Professor Omar Hasan Kasule for Year 3 PPSD session on 01st November 2007.


The cross-sectional study, also called the prevalence study or naturalistic sampling, aims to determine the prevalence of risk factors and the prevalence of disease at a point in time (calendar time or an event such as birth or death). Disease and exposure are ascertained simultaneously. A cross-sectional study can be descriptive, analytic, or both. It may be done once or may be repeated. Individual-based studies collect information on individuals. Group-based (ecologic) studies collect aggregate information about groups of individuals. Cross-sectional studies are used in community diagnosis, preliminary study of disease etiology, assessment of health status, disease surveillance, public health planning, and program evaluation. Their advantages are simplicity and rapid execution, so they provide answers quickly. Their disadvantages are: inability to establish etiology because the time sequence between exposure and outcome is unknown, inability to study diseases with low prevalence, high respondent bias, poor documentation of confounding factors, and over-representation of diseases of long duration.



The study may be based on the whole population or a sample. It may be based on individual sampling units or groups of individuals. The study sample is divided into 4 groups: a = exposed cases, b = unexposed cases, c = exposed noncases, and d = unexposed noncases. The total sample size is n = a + b + c + d; n is the only quantity fixed before data collection. The marginal totals are n1 = a + c (all exposed), n0 = b + d (all unexposed), m1 = a + b (all cases), and m0 = c + d (all noncases). None of the marginal totals is fixed. Sampling methods can be simple random sampling, cluster sampling, systematic sampling, and multi-stage sampling. Sample size is determined using specific formulas. Cases are identified from clinical examinations, interviews, or clinical records. Data are collected by clinical examination, questionnaires, personal interview, and review of clinical records.
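The layout of the four groups and their margins can be sketched in Python. This is a minimal illustration and not part of the original material: the function name `two_by_two` and the counts are hypothetical, and the margins are derived directly from the cell definitions (exposed total a + c, unexposed total b + d, cases a + b, noncases c + d).

```python
def two_by_two(a, b, c, d):
    """Return the margins of a cross-sectional 2x2 table.

    a = exposed cases,    b = unexposed cases,
    c = exposed noncases, d = unexposed noncases.
    """
    return {
        "n1": a + c,          # all exposed
        "n0": b + d,          # all unexposed
        "m1": a + b,          # all cases
        "m0": c + d,          # all noncases
        "n": a + b + c + d,   # total sample size (the only fixed quantity)
    }


# Hypothetical counts, for illustration only
margins = two_by_two(a=30, b=20, c=70, d=180)
print(margins)
```

Both pairs of margins add up to n (n1 + n0 = m1 + m0 = n), which is a quick check that the cells have been assigned to the right rows and columns.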



The following descriptive statistics can be computed from a cross-sectional study: mean, standard deviation, median, percentiles, quartiles, ratios, proportions, the prevalence of the risk factor, n1/n, and the prevalence of the disease, m1/n. The following analytic statistics can be computed: correlation coefficient, regression coefficient, odds ratio, and rate difference. Let p1 = a/n1 be the prevalence of disease among the exposed and p0 = b/n0 the prevalence among the unexposed. The prevalence difference is computed as p1 - p0 = a/n1 - b/n0. The prevalence ratio is computed as p1/p0 = (a/n1) / (b/n0). The prevalence odds ratio is computed as POR = {p1/(1 - p1)} / {p0/(1 - p0)} = ad/bc.
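The prevalence measures above can be worked through in a short Python sketch. It is illustrative only: the function name `prevalence_measures` and the counts are hypothetical, the exposed and unexposed totals are taken as a + c and b + d, and the odds are taken as p/(1 - p). It assumes no group is empty and that neither prevalence is exactly 0 or 1.

```python
def prevalence_measures(a, b, c, d):
    """Prevalence measures from a 2x2 table.

    a = exposed cases,    b = unexposed cases,
    c = exposed noncases, d = unexposed noncases.
    Assumes all four cells are non-zero (otherwise a division fails).
    """
    n1 = a + c            # all exposed
    n0 = b + d            # all unexposed
    n = n1 + n0           # total sample size
    p1 = a / n1           # disease prevalence among the exposed
    p0 = b / n0           # disease prevalence among the unexposed
    return {
        "exposure_prevalence": n1 / n,
        "disease_prevalence": (a + b) / n,
        "prevalence_difference": p1 - p0,
        "prevalence_ratio": p1 / p0,
        # odds = p / (1 - p); this ratio equals (a * d) / (b * c)
        "prevalence_odds_ratio": (p1 / (1 - p1)) / (p0 / (1 - p0)),
    }


# Hypothetical counts, for illustration only
result = prevalence_measures(a=30, b=20, c=70, d=180)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these counts p1 = 0.3 and p0 = 0.1, so the prevalence difference is 0.2, the prevalence ratio is 3, and the prevalence odds ratio is (30 x 180) / (20 x 70), about 3.86. The odds ratio is further from 1 than the prevalence ratio, which is the usual pattern when the outcome is not rare.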



Ecological studies, whether exploratory or analytic, study aggregate rather than individual information. Groups commonly used are schools, factories, and countries. Exposure is measured as an overall group index. Outcome is measured as rates, proportions, and means. The correlation and regression coefficients are used as effect measures. The advantages of ecological studies are low cost, convenience, and ease of analysis and interpretation. They have several weaknesses. They generate but cannot test hypotheses. They cannot be used in definitive etiological research. They suffer from the ecological fallacy (a relation observed at the aggregate level may not hold at the individual level). They lack data to control for confounding. Data are often inaccurate or incomplete. Collinearity is a common problem.
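As a rough sketch of how correlation and regression coefficients serve as effect measures in an ecological analysis, the Python example below computes the Pearson correlation and the least-squares slope from group-level data. The function name `pearson_and_slope` and the numbers are hypothetical and chosen only for illustration; each entry stands for one group (for example one school), not one individual.

```python
from math import sqrt


def pearson_and_slope(x, y):
    """Pearson correlation and least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    r = sxy / sqrt(sxx * syy)   # correlation coefficient
    slope = sxy / sxx           # regression coefficient
    return r, slope


# Hypothetical group-level data: one value per group
exposure_index = [10.0, 20.0, 30.0, 40.0]   # overall group exposure index
disease_rate = [2.0, 4.0, 5.0, 9.0]         # group outcome rate

r, slope = pearson_and_slope(exposure_index, disease_rate)
print(f"r = {r:.3f}, slope = {slope:.3f}")
```

A strong correlation across groups, as in this made-up example, describes the groups only. It does not show that exposed individuals within any group are the ones who develop the disease.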



Surveys involve more subjects than the usual epidemiological sample and are used for measurement of health and disease, assessment of needs, and assessment of service utilization and care. They may be population or sample surveys. Planning of surveys includes: literature survey, stating objectives, identifying and prioritizing the problem, formulating a hypothesis, defining the population, defining the sampling frame, determining sample size and sampling method, training study personnel, considering logistics (approvals, manpower, materials and equipment, finance, transport, communication, and accommodation), and preparing and pre-testing the study questionnaire. Surveys may be cross-sectional or longitudinal. The household is the usual sampling unit. Sampling may be simple random sampling, systematic sampling, stratified sampling, cluster sampling, or multistage sampling. Existing data may be used, or new data may be collected using a questionnaire (postal, telephone, diaries, and interview), physical examinations, direct observation, and laboratory investigations. The structure and contents of the survey report are determined by its potential readers. The report is used to communicate information and also

Professor Omar Hasan Kasule, Sr. November, 2007