Data Analysis

The proteomic patterns acquired on both MS instruments were analyzed using the ProteomeQuest bioinformatics tool, employing ASCII files consisting of m/z and intensity values [21]. As described earlier, the algorithm searches for a set of features at precise m/z values whose combined, normalized relative intensity values in n-space best segregate the data derived from the training set of spectra. The entire set of spectra acquired from the serum samples was divided into three data sets: (1) a training set that was used to discover the hidden diagnostic patterns; (2) a testing set; and (3) a validation set. The key subset of m/z values identified using the training set was used to classify, in a blinded fashion, the testing and validation sets; hence, the algorithm had no prior knowledge of the spectra in the testing and validation sets.
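The three-way split and the idea of a spectrum as a point in n-space can be illustrated with a minimal Python sketch. It is not the ProteomeQuest implementation: the function names, the dictionary layout of a spectrum, the random shuffling, and the min-max normalization are all assumptions made for illustration only.

```python
import random


def normalize(intensities):
    """Scale intensities to [0, 1] relative to the spectrum's own min and max.

    Min-max scaling is an assumed normalization, chosen only for illustration.
    """
    lo, hi = min(intensities), max(intensities)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in intensities]


def split_spectra(spectrum_ids, n_train, n_test, seed=0):
    """Shuffle spectrum IDs and partition them into training, testing,
    and validation sets (the validation set takes whatever remains)."""
    ids = list(spectrum_ids)
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    validation = ids[n_train + n_test:]
    return train, test, validation


def feature_vector(spectrum, key_mz):
    """Return the normalized intensities at the selected m/z bins.

    With n selected bins, the result is one point in n-space.
    `spectrum` is a hypothetical dict with parallel "mz" and "intensity" lists.
    """
    norm = dict(zip(spectrum["mz"], normalize(spectrum["intensity"])))
    return [norm[mz] for mz in key_mz]
```

Only the training set would be used to choose `key_mz`; the testing and validation spectra are then reduced to feature vectors with the same bins and classified without further tuning.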

The diagnostic models derived from the training sets were tested using blind serum-sample spectra obtained from 31 unaffected women and 63 women with ovarian cancer. The ovarian cancer diagnostic models were further validated using blind serum-sample spectra obtained from 37 unaffected women and 40 women with ovarian cancer. The results (Figure 4.9) clearly showed the ability of the bioinformatic algorithm to recognize feature sets from mass spectra acquired using the higher-resolution Qq-TOF MS that result in models statistically superior to those generated using the lower-resolution PBS-II TOF MS. These models were statistically superior not only in testing (sensitivity, P2 < .0001; specificity, P2 < 3 × 10^-19) but also in validation (sensitivity, P2 < 9 × 10^-9; specificity, P2 < 6 × 10^-6).

FIGURE 4.9 Histograms representing the testing and blinded validation results of (A) sensitivity and (B) specificity of the diagnostic models for MS data acquired on either a Qq-TOF or a PBS-II TOF mass spectrometer.

Four of the diagnostic models were found to be both 100% sensitive and 100% specific in their ability to discriminate serum samples taken from unaffected women from those taken from women suffering from ovarian cancer. These models were generated from data acquired using the Qq-TOF MS. Just as importantly, and key if this technology is to become a viable screening tool, no false-positive or false-negative classifications occurred using these models, giving each of them a positive predictive value (PPV) of 100% for the patient cohort employed in this study. No models generated from the PBS-II TOF MS data were both 100% sensitive and 100% specific.
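For reference, sensitivity, specificity, and PPV follow directly from the counts of true and false classifications. The short Python sketch below uses the standard definitions; the function name is hypothetical, and the example counts are the blinded validation set sizes given above (40 women with ovarian cancer, 37 unaffected women) with no misclassifications.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, PPV) from classification counts.

    tp/fn: cancer samples classified correctly/incorrectly.
    tn/fp: unaffected samples classified correctly/incorrectly.
    Assumes each denominator is nonzero.
    """
    sensitivity = tp / (tp + fn)  # fraction of cancer cases detected
    specificity = tn / (tn + fp)  # fraction of unaffected women cleared
    ppv = tp / (tp + fp)          # fraction of positive calls that are correct
    return sensitivity, specificity, ppv


# Validation set from the text, classified without error.
print(diagnostic_metrics(tp=40, fn=0, tn=37, fp=0))  # (1.0, 1.0, 1.0)
```

A PPV computed this way applies only to the cohort it was measured on, because it depends on the ratio of affected to unaffected samples in that cohort.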

Another key aspect of this study is found in the examination of the key m/z features that comprise the four best-performing patterns. One criticism of the use of proteomic patterns for diagnostic purposes is that the identity of the key m/z features is not known. At this point, it is debatable whether it is worth the effort to identify these features, as they may provide little aid in developing an alternative diagnostic platform. There is also the likelihood, however, that these key values represent proteins that provide exciting insights into the manifestation and progression of cancer, and therefore identifying them is most likely a worthwhile effort.

Examination of the four models that had 100% PPV for ovarian cancer reveals certain consistent features. Although the proteomic patterns generated from both healthy and cancer patients using the Qq-TOF MS are quite similar (Figure 4.10), careful inspection of the raw mass spectra reveals that key features within the binned m/z values 7060.121 and 8605.678 are indeed differentially abundant in a selection of the serum samples obtained from ovarian cancer patients as compared to unaffected individuals (Figure 4.10, insets). The results indicate that these MS peaks originate from species that may be consistent indicators of the presence of ovarian cancer and represent good candidates for key disease-progression indicators. The consistency of these peaks within spectra of serum acquired from patients with ovarian cancer provides an excellent target for identification. Efforts at identifying these low-molecular-weight components in serum are ongoing in our laboratory. Indeed, one of the overlooked powers of the proteomic pattern approach is its ability to screen hundreds of serum samples in a high-throughput manner and thereby quickly determine targets for further investigation. It must be reiterated, however, that distinguishing sera from an unaffected individual and an individual with ovarian cancer based on a single serum proteomic m/z feature alone is not possible across the entire serum study set. Correct diagnosis is only possible when the key m/z features and their intensities are analyzed as a whole.

FIGURE 4.10 Comparison of SELDI Qq-TOF mass spectra of serum from an ovarian cancer patient (panel A) and an unaffected individual (panel B). Insets show expanded m/z regions highlighting significant intensity differences of the peaks in the m/z bins 7060.121 and 8605.678, identified by the algorithm as belonging to the optimum discriminatory pattern.

While the sensitivity and specificity in a previous report using a lower-resolution PBS-II TOF MS to diagnose ovarian cancer were impressive, to screen for diseases of relatively low prevalence, such as ovarian cancer, a diagnostic test must exceed 99.6% sensitivity and specificity to minimize false positives while correctly detecting early-stage disease when it is present [32]. In blinded testing and validation studies, any one of the four best models generated using Qq-TOF MS data was able to correctly classify 22/22 women with stage I ovarian cancer, 81/81 women with stage II, III, and IV ovarian cancer, and 68/68 benign disease controls. It can be envisioned that in the near future a clinical test would simultaneously employ several combinations of highly accurate diagnostic proteomic patterns. Taken together, these patterns could achieve an even higher degree of accuracy in screening a large population, despite heterogeneity and potential variability in sample quality and handling. Hence, based on the present results, a high-resolution system such as the Qq-TOF MS employed in this study is preferred as a platform for clinical trials of serum proteomic patterns.
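The reason a low-prevalence disease demands such high specificity can be seen by writing PPV in terms of prevalence using Bayes' rule. The Python sketch below is illustrative only: the function name is hypothetical, and the prevalence of 1 in 2,500 is an assumed figure chosen for the example, not a value taken from this study.

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value of a test applied to a screened population.

    PPV = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed, illustrative prevalence: 1 affected woman per 2,500 screened.
ASSUMED_PREVALENCE = 1 / 2500

for spec in (0.95, 0.996, 1.0):
    ppv = ppv_at_prevalence(sensitivity=0.996, specificity=spec,
                            prevalence=ASSUMED_PREVALENCE)
    print(f"specificity={spec:.3f} -> PPV={ppv:.3f}")
```

Because unaffected women vastly outnumber affected women in a screening population, even a small false-positive rate produces many more false alarms than true detections, so PPV climbs steeply only as specificity approaches 100%.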
