# Statistical Decoding Methodology And Interpretation Of Biological Assay Results

A statistical approach essentially eliminates the false-positive problem that occurs in all biological assays. Statistics are applied to the decoding process to identify the active compounds from the pooled beads [5,10]. The beads are arrayed from the reaction block into the assay format in a random fashion. The total number of beads selected from each final pool was sufficient for an average of six copies of each compound to be present in a solution assay; lawn assays used an average of three copies of each compound. Statistical decoding relies on repetition of the mass codes to indicate activity. The code frequencies from the biological assay are compared with those obtained when a software program selects wells from a pool at random. When biological activity is due to a compound in the library, the frequency of repeating codes from the assay will differ from the random case. Because there is an average of six copies of each compound in the solution assay, an active compound is expected to be present in about six of the most biologically active wells. Decoding all the beads associated with a positive assay result (about five beads per well) generates a table of codes in which the common code corresponds to the active compound and occurs with an average frequency of six (the number of times a compound is present in the assay). The observed frequency of individual compounds across the wells therefore identifies the biologically active component.
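The code table described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used by the authors: the function name `code_frequencies`, the well labels, and the code values are all hypothetical, and a code is assumed to count once per well even when two beads in that well carry it.

```python
from collections import Counter


def code_frequencies(decoded_wells):
    """Count in how many wells each mass code appears.

    decoded_wells maps a well label to the list of codes read from its beads.
    Assumption: a code is counted once per well, even if two beads share it.
    """
    counts = Counter()
    for codes in decoded_wells.values():
        counts.update(set(codes))
    return counts


# Hypothetical decoding results: three active wells, five beads each.
wells = {
    "A3": [45, 12, 7, 88, 31],
    "C9": [45, 19, 63, 2, 54],
    "F1": [45, 12, 70, 5, 41],
}
freq = code_frequencies(wells)
print(freq.most_common(2))  # [(45, 3), (12, 2)]
```

In this toy data set, code 45 appears in every active well, which is the pattern that marks it as the candidate active compound.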

Because the beads are randomly mixed in the reaction well when they are dispensed for biological assay, the number of copies of each compound present in the assay follows a distribution rather than a fixed value. A code that occurs with a frequency greater than three is a clear indication that the corresponding compound is active in the assay, because this frequency is significantly higher than could be accounted for by chance alone. In this way, false positives, whether associated with the assay itself or with some unknown factor arising during synthesis or cleavage, are eliminated.
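One way to picture the copy-number distribution is with a Poisson model. This is an assumption introduced here for illustration (it holds approximately when beads are drawn at random from a pool much larger than the sample); the text does not state which distribution the authors used. The sketch below computes the chance that a given compound is missing from the assay entirely, for the solution assay (average of six copies) and the lawn assay (average of three copies).

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k copies when the average copy number is `mean`.

    Assumption: copy numbers are Poisson distributed (illustrative model only).
    """
    return mean**k * exp(-mean) / factorial(k)


def prob_at_least(k, mean):
    """Probability of k or more copies under the same Poisson assumption."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))


# Chance that a compound has no bead at all in the assay.
print(f"{poisson_pmf(0, 6):.4f}")  # 0.0025 (solution assay, mean of six)
print(f"{poisson_pmf(0, 3):.4f}")  # 0.0498 (lawn assay, mean of three)
```

Under this model, a compound is far less likely to be absent from the solution assay than from the lawn assay. `prob_at_least` can be used in the same way to ask how often a compound is present in enough copies to exceed a chosen frequency threshold.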

Since the beads are dispensed into the plates in a random fashion, the activity in the assay plate should display a random pattern. In-house software is used to verify that the active wells from the assay are randomly distributed. This should always be true, except when the activity is associated with monomer C. In that case, every compound in a pool contains the active monomer and the entire plate will be active.

The data and results obtained from the sequential processing routine are illustrated in Figure 8.14. The solution plates of library compounds are screened by biologists, and the data are processed through a series of algorithms to determine the wells that, upon decoding, are likely to yield a lead or a structure-activity relationship. The library pools were rank-ordered with a scoring function (Figure 8.14A) that takes into account several factors from the biological screening data. The higher the score, the more likely the pool contains an active compound. One pool stood out from the rest, and the individual assay values from that pool were ranked in turn, revealing 11 high-activity values significantly above those of the remaining wells (Figure 8.14B). A total of 58 beads from the corresponding wells were arrayed as single beads and chemically cleaved, and their codes were read by mass spectrometry. The data for these 11 wells were tabulated and compared with the result expected at random: a Monte Carlo prediction of the average number of repeating codes (MC (avg)) and its standard deviation (MC (std)), obtained by random selection of 11 wells from the assay. The assay results show that code 45 is present in all 11 wells and that the probability of a code repeating 11 times purely by chance is zero (Figure 8.14B). This compound was resynthesized on a 10-mg scale, purified by high-performance liquid chromatography (HPLC), and biologically rescreened to verify the activity. This strategy identifies an active compound without resynthesizing all compounds in the active wells or rescreening discrete beads from the active pool, because the statistical decoding methodology ensures activity.
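The Monte Carlo comparison can be sketched as follows. This is a simplified stand-in for the authors' software, whose details are not given in the text. The plate layout (96 wells, five beads per well, 80 compounds, giving an average of six copies each), the function names, and the choice to track the most frequent code in each random draw are all assumptions made for illustration.

```python
import random
import statistics
from collections import Counter


def simulate_plate(n_codes, n_wells, beads_per_well, rng):
    """Build a hypothetical plate: each bead carries a uniformly random code."""
    return [
        [rng.randrange(n_codes) for _ in range(beads_per_well)]
        for _ in range(n_wells)
    ]


def max_repeat(wells):
    """Largest number of wells that share any single code."""
    counts = Counter()
    for codes in wells:
        counts.update(set(codes))  # count each code once per well
    return max(counts.values())


def monte_carlo_repeats(plate, n_selected, n_trials, rng):
    """Mean and standard deviation of max_repeat over random well selections."""
    maxima = [
        max_repeat(rng.sample(plate, n_selected)) for _ in range(n_trials)
    ]
    return statistics.mean(maxima), statistics.pstdev(maxima)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
plate = simulate_plate(n_codes=80, n_wells=96, beads_per_well=5, rng=rng)
mc_avg, mc_std = monte_carlo_repeats(plate, n_selected=11, n_trials=2000, rng=rng)
print(f"MC (avg) = {mc_avg:.2f}, MC (std) = {mc_std:.2f}")
```

An observed repeat count that lies many standard deviations above the simulated average, such as a single code found in all 11 selected wells, would be very unlikely to arise from random selection alone.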