Statisticians use data from a variety of sources: observational data come from cross-sectional, retrospective, and prospective studies; experimental data are derived from planned experiments and clinical trials. What are some illustrations of the types of data from each of these sources? Observational data are sometimes collected from naturally or routinely occurring situations and at other times for administrative purposes; examples are data from medical records, government agencies, or surveys. Experimental data are the results collected from formal intervention studies or clinical trials; some examples are survival data, the proportion of patients who recover from a medical procedure, and relapse rates after taking a new medication.

Most study designs contain one or more outcome variables that are specified explicitly. (Sometimes, a study design may not have an explicitly defined outcome variable but, rather, the outcome is implicit; however, the use of an implicit outcome variable is not a desirable practice.) Study outcome variables may range from counts of cases of illness or deaths to responses to an attitude questionnaire. In some disciplines, outcome variables are called dependent variables. The researcher may wish to relate these outcomes to disease risk factors such as exposure to toxic chemicals, electromagnetic radiation, or particular medications, or to some other factor that is thought to be associated with a particular health outcome.

In addition to outcome variables, study designs assess exposure factors. For example, exposure factors may include toxic chemicals and substances, ionizing radiation, and air pollution. Other types of exposure factors, more formally known as risk factors, include a lack of exercise, a high-fat diet, and smoking. In other disciplines, exposure factors sometimes are called independent variables. However, epidemiologists prefer to use the term exposure factor.
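
To make the distinction between outcome and exposure variables concrete, the short Python sketch below cross-tabulates one exposure factor against one outcome. The records, the field names smoker and ill, and the helper cross_tabulate are all hypothetical, invented only for illustration; they do not come from any real study.

```python
from collections import Counter

# Hypothetical records, invented purely for illustration (not real study data).
# "smoker" plays the role of the exposure factor (independent variable);
# "ill" plays the role of the outcome (dependent variable).
records = [
    {"smoker": True, "ill": True},
    {"smoker": True, "ill": False},
    {"smoker": True, "ill": True},
    {"smoker": False, "ill": False},
    {"smoker": False, "ill": True},
    {"smoker": False, "ill": False},
]


def cross_tabulate(rows, exposure, outcome):
    """Count the records in each (exposure, outcome) combination."""
    return Counter((row[exposure], row[outcome]) for row in rows)


table = cross_tabulate(records, "smoker", "ill")

# Report, for each exposure group, how many records show the outcome.
for exposed in (True, False):
    cases = table[(exposed, True)]
    total = cases + table[(exposed, False)]
    print(f"exposed={exposed}: {cases} of {total} with the outcome")
```

A table of this kind only counts how often exposure and outcome occur together; what can be inferred from it depends on the study design that produced the records.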

One important issue pertains to the time frame for data collection: whether information about exposure and outcome factors refers to a single point in time or involves looking backward or forward in time. These distinctions are important because, as we will learn, they affect both the types of analyses that we can perform and the confidence we can place in inferences drawn from those analyses. The following illustrations will clarify this issue.

