Biomarker discovery is one of the most difficult tasks in biology, partly because of the complexity of biological systems. The first problem is "normal biological variation" of 10-40%. This type of variation cannot be eliminated from an experiment and must therefore be accounted for in the statistical analysis. It arises from differences between individuals in genetic background, environment, diet, age, sex, and an almost limitless set of other variables. In well-controlled systems such as animal models, a number of these variables can be controlled, and the normal biological variation is reduced as a result. In a medical setting with human study subjects, this is neither possible nor practical. In fact, many commonly used medical tests define a normal range for the amount of an analyte, and values outside this range are considered a problem. Well-established tests such as cholesterol, blood glucose, and triglycerides all have a normal range, not a single normal value. The same is true for other biomarkers. During discovery, validation, and assay development, statistical analysis will be needed to determine the normal and disease concentration ranges for each biomarker.
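As a rough illustration of how a normal range might be estimated, the Python sketch below takes the central 95% of values measured in healthy controls (the 2.5th to 97.5th percentiles), a common nonparametric convention for reference intervals. The function names, the example concentrations, and the 95% cutoff are assumptions chosen for illustration, not values from a real assay, and a real reference range would normally be built from far more samples.

```python
def percentile(sorted_values, pct):
    """Percentile by linear interpolation between the two closest ranks."""
    if not sorted_values:
        raise ValueError("at least one value is required")
    rank = (len(sorted_values) - 1) * pct / 100.0
    lower = int(rank)                       # index at or below the rank
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def reference_interval(control_values, lower_pct=2.5, upper_pct=97.5):
    """Return (low, high) bounds covering the central share of control values."""
    ordered = sorted(control_values)
    return percentile(ordered, lower_pct), percentile(ordered, upper_pct)


def is_outside(value, interval):
    """True if a measurement falls outside the reference interval."""
    low, high = interval
    return value < low or value > high


# Hypothetical analyte concentrations from healthy controls (arbitrary units).
controls = [4.1, 4.8, 5.0, 5.2, 5.3, 5.5, 5.6, 5.9, 6.1, 6.4, 6.8, 7.2]
normal_range = reference_interval(controls)
print("estimated normal range:", normal_range)
print("9.0 outside range?", is_outside(9.0, normal_range))
```

The percentile approach makes no assumption about the shape of the distribution, which is convenient when an analyte is skewed. The same calculation can be repeated on the disease group to see how much the two ranges overlap.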
The impact of biological variation can be minimized by good study design and careful statistical analysis of the data. By carefully selecting the patients for the discovery phase of the experiment, one can focus the search on biomarkers that are directly related to the question of interest. To this end, it is advisable during the biomarker discovery experiments to select samples where the diagnosis is clear and the patient data are as uncomplicated as possible. Uncomplicated patient data means samples from patients with as few medical complications as possible other than those resulting from the disease state under investigation. For example, for the discovery experiment a sample from a person whose only medical condition is cancer is preferable to one from a cancer patient who also has heart disease, high blood pressure, or diabetes. This type of sample selection reduces the number of variables that have to be analyzed, the potential sources of biomarkers from other diseases, and the complexity of the data analysis. The initial discovery experiment can generally use a relatively small number of samples; a set of 6-15 diseased persons and an equal number of matched (age, sex, race, etc.) controls is a good number to work with, provided that these samples are well chosen. The goal of the discovery experiment is to find as many potential markers as possible. The primary validation experiment should be larger: from 30 to 50 samples and matched controls. The goal of the primary validation experiment is to test the discovered markers and select the best of them on which to focus further efforts.
The exact number of samples needed in both the discovery and primary validation phases of the project depends on the techniques used, the complexity of the disease, and the patient samples you are working with.
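To give a feel for how biological variation drives sample numbers, the following Python sketch applies the standard normal-approximation formula for comparing two group means, n = 2(z_alpha + z_beta)^2 (CV / difference)^2 per group. It is a simplified single-marker calculation, not a substitute for a proper power analysis: the function name, the two-sided 5% significance level, the 80% power, and the example effect size are all assumptions made for illustration, and testing many candidate markers at once would call for a stricter threshold.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(cv, relative_difference, alpha=0.05, power=0.80):
    """Approximate samples per group needed to detect a difference in means.

    cv                  -- coefficient of variation (0.2 means 20% variation)
    relative_difference -- difference to detect, as a fraction of the mean
    alpha               -- two-sided significance level (assumed default)
    power               -- desired statistical power (assumed default)
    """
    if cv <= 0 or relative_difference <= 0:
        raise ValueError("cv and relative_difference must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * (cv / relative_difference) ** 2
    return max(2, ceil(n))  # never report fewer than two samples per group


# Biological variation across the 10-40% range, with a hypothetical
# 30% difference between disease and control means.
for cv in (0.10, 0.20, 0.30, 0.40):
    n = samples_per_group(cv, relative_difference=0.30)
    print(f"CV {cv:.0%}: about {n} samples per group")
```

Because the required number grows with the square of the variation, doubling the coefficient of variation roughly quadruples the samples needed, which is one reason careful sample selection pays off in the discovery phase.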
The second issue the investigator will face is the immense number and diversity of biomolecules present in a biological system. These molecules range from simple organic or inorganic species such as glucose and Na+ to large, complex biomolecules such as lipids, proteins, and carbohydrates. To further complicate the picture, these biomolecules can be combined with each other to form lipopolysaccharides, glycoproteins, lipoproteins, etc. From this complicated array of molecules, the investigator looking for a biomarker must find the differences that highlight a particular condition. This might seem like an impossible task, but it is not. Successful biomarker discovery has aided the advancement of medicine over the last century.