FIGURE 15.4 Hypothetical compound series having good in vitro potency, but low in vivo activity after oral administration. Low permeability is apparently responsible for low bioavailability.

Presenting the data with a ranking color scheme (e.g., green for high, yellow for moderate, red for low) is a useful way to format pharmaceutical profiling data and to gain insight into in vitro and in vivo properties. The results are easily visualized through color recognition. This information helps to diagnose the potential causes of undesirable properties and to identify what can be done to improve them. For example, a hypothetical series of compounds has good in vitro potency but no activity in vivo after oral administration. Correlating the in vivo bioavailability data with in vitro solubility, permeability, microsomal stability, and plasma stability through color recognition, as in Figure 15.4, quickly indicates that permeability is the major cause of poor oral bioavailability for the series. Strategies can then be formulated to overcome this hurdle early in discovery, such as increasing lipophilicity or using pro-drug approaches to enhance permeability. If the undesirable feature is due to properties that cannot be fixed, a "go/no go" decision can be made quickly to avoid investment in a series that is not likely to succeed.
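The color-ranking idea can be sketched in a few lines of Python. This is only an illustration: the property names, the numeric cutoffs, and the three-compound series are hypothetical values chosen for the example, not recommended thresholds. The sketch bins each measured property into red, yellow, or green and then reports which in vitro properties are red across the whole series.

```python
# Hypothetical cutoffs per property: (upper bound of "low", lower bound of "high").
# These numbers are illustrative assumptions, not recommended thresholds.
THRESHOLDS = {
    "solubility_ug_per_ml": (10.0, 60.0),
    "permeability_1e6_cm_per_s": (1.0, 10.0),
    "microsomal_stability_pct": (30.0, 70.0),
    "plasma_stability_pct": (50.0, 80.0),
    "bioavailability_pct": (10.0, 30.0),
}

def rank_color(prop, value):
    """Map a measured value to red (low), yellow (moderate), or green (high)."""
    low_max, high_min = THRESHOLDS[prop]
    if value < low_max:
        return "red"
    if value >= high_min:
        return "green"
    return "yellow"

def color_profile(compound):
    """Return the color for every property measured on one compound."""
    return {prop: rank_color(prop, value) for prop, value in compound.items()}

def shared_red_flags(compounds, outcome="bioavailability_pct"):
    """List in vitro properties that are red for every compound in the series.

    The in vivo outcome itself is excluded, since it is what is being diagnosed.
    """
    profiles = [color_profile(c) for c in compounds]
    return sorted(
        prop for prop in THRESHOLDS
        if prop != outcome and all(p.get(prop) == "red" for p in profiles)
    )

# Hypothetical series: potent in vitro, poorly bioavailable after oral dosing.
SERIES = [
    {"solubility_ug_per_ml": 80.0, "permeability_1e6_cm_per_s": 0.3,
     "microsomal_stability_pct": 85.0, "plasma_stability_pct": 95.0,
     "bioavailability_pct": 3.0},
    {"solubility_ug_per_ml": 65.0, "permeability_1e6_cm_per_s": 0.5,
     "microsomal_stability_pct": 75.0, "plasma_stability_pct": 90.0,
     "bioavailability_pct": 5.0},
    {"solubility_ug_per_ml": 40.0, "permeability_1e6_cm_per_s": 0.2,
     "microsomal_stability_pct": 80.0, "plasma_stability_pct": 88.0,
     "bioavailability_pct": 2.0},
]

liabilities = shared_red_flags(SERIES)
print(liabilities)
```

For this made-up series, permeability is the only in vitro property that is red for every compound, which mirrors the diagnosis described for Figure 15.4.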

Multivariate Data Analysis for Pharmaceutical Profiling

A common method of data analysis is to examine changes in one variable at a time (e.g., solubility) and relate these to the structure of the compound. The large volumes of data that are now produced from high throughput methods, as well as their complexity and correlation, suggest that there are more productive ways to derive information from the data. One useful approach for analyzing large and complex data sets is multivariate analysis [27,28]. With this approach an entire data set is simultaneously analyzed to derive increased information.

Many biological activities and druglike properties are governed by a set of strongly intercorrelated molecular and physicochemical properties. Multivariate analysis allows the inclusion of all theoretically calculated and experimentally measured descriptors for model building, QSAR, and quantitative structure-property relationships (QSPR) [29,30]:

1. Computationally derived descriptors (e.g., molecular weight, size, shape, polarity, electrostatic interactions, charge, lipophilicity, hydrogen bonding capacity, polar surface area).

2. In vitro experimentally measured physicochemical properties (e.g., solubility, permeability, stability, Log P, pKa) and biochemical characteristics (e.g., microsomal stability, CYP450 inhibition, plasma stability, plasma protein binding).

3. In vivo PK and tissue distribution (e.g., clearance, volume of distribution, half-life, mean residence time, fraction drug bound).

4. In vitro and in vivo biological data (e.g., enzyme or receptor assays, cell-based functional assays, efficacy animal models).

In general, experimentally measured descriptors provide more reliable data than calculated descriptors [31]. On the other hand, calculated parameters are advantageous for predicting the behavior of new molecules, even before they are actually synthesized.

A pharmaceutical-profiling program measures many diverse properties. Each compound is usually characterized by more than 10 different properties (descriptors/variables). Using multivariate analysis methods, these properties and their effect on compound performance can be correlated in order to address the goals for data analysis and interpretation in greater depth. Multivariate analysis treats the data as a multidimensional array. The data are mapped in multidimensional space to find the correlations and are then reduced to a two- or three-dimensional format that is easier for people to visualize. Multivariate analysis methods, such as principal components analysis (PCA), multiple linear regression (MLR), and partial least-squares (PLS), provide a low-dimensional, statistical representation of the data points with a few information-rich parameters [28]. Multivariate analysis tools are powerful for extracting underlying information, finding regularities in the data, and separating them from the "noise" [27]. These methods can also handle data sets that contain a few errors, outliers, or missing values [32].
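As a rough illustration of the dimension-reduction step, the sketch below runs a small PCA in plain Python: the columns are autoscaled, the covariance matrix is formed, and the first two principal components are extracted by power iteration with deflation. The data table and all function names are hypothetical, the code assumes no descriptor column is constant, and a real analysis would normally rely on a statistics package rather than this hand-rolled version.

```python
import math

# Hypothetical profiling data, one row per compound.
# Columns: log P, MW, solubility (ug/mL), permeability (1e-6 cm/s).
DATA = [
    [1.0, 250.0, 95.0, 0.5],
    [1.8, 310.0, 70.0, 2.0],
    [2.5, 340.0, 60.0, 1.5],
    [3.1, 420.0, 30.0, 6.0],
    [3.9, 450.0, 22.0, 4.0],
    [4.6, 520.0, 5.0, 9.0],
]

def autoscale(rows):
    """Mean-center each column and scale it to unit variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def covariance(X):
    """Covariance matrix of already centered data."""
    n, p = len(X), len(X[0])
    return [[sum(r[i] * r[j] for r in X) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_eigenpair(C, iters=500):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    p = len(C)
    v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-uniform start vector
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(C[i][j] * v[j] for j in range(p)) for i in range(p))
    return lam, v

def pca(rows, n_components=2):
    """Return scores, loadings, and explained-variance fractions."""
    X = autoscale(rows)
    C = covariance(X)
    p = len(C)
    total = sum(C[i][i] for i in range(p))
    loadings, explained = [], []
    for _ in range(n_components):
        lam, v = leading_eigenpair(C)
        loadings.append(v)
        explained.append(lam / total)
        # Deflate: remove the component just found before extracting the next.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    scores = [[sum(x[j] * v[j] for j in range(p)) for v in loadings] for x in X]
    return scores, loadings, explained

scores, loadings, explained = pca(DATA)
for row in scores:
    print([round(t, 2) for t in row])  # 2-D coordinates of each compound
print([round(e, 3) for e in explained])
```

Each compound ends up with two score coordinates that can be plotted directly, while the loadings indicate how strongly each original property contributes to each axis.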

Both qualitative and quantitative information can be derived from multivariate analysis. A qualitative classification model provides overviews of how the properties relate to each other and how changes can be made to molecules in a qualitative sense. The main goal for qualitative analysis is to assist understanding and put all the pieces of the puzzle together. A quantitative model focuses on accurate prediction of various desirable properties for designing new molecules and combinatorial libraries with good potency and druglike properties.

Quantitative multivariate models provide the advantage of predicting the properties of new molecules, even before they are synthesized. This provides synthetic direction for improving pharmaceutical properties and helps to prioritize the synthetic efforts. Winiwarter et al. [33] used multivariate analysis to develop a model for predicting in vivo human jejunal permeability using experimentally and theoretically derived descriptors. The statistical software SIMCA from Umetrics AB was used. The step-wise process for model building, which appears to be widely applicable, was as follows.

1. A critical but difficult to measure property (human jejunal permeability) was experimentally determined for a set of 22 compounds.

2. Several other properties or descriptors of each of these compounds were then experimentally measured or calculated.

3. PCA was performed on this compound set to demonstrate that the set was representative of drugs in general.

4. Compounds that were likely to have a different mechanism (paracellular or active transport) than the others (passive transcellular transport) were omitted.

5. The remaining compounds were divided into a training set and a test set. PLS models were generated on the training set.

6. Variables (measured properties and calculated descriptors) that were shown by these models not to correlate with the property of interest (human jejunal permeability) were dismissed as being of low relative importance, based on the variable importance in the projection (VIP) method.

7. A linear equation relating the important variables to the property of interest was derived. The models were evaluated using the test set. Once the model was verified, the final equation model was calculated using both the training and test sets.
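Steps 5 through 7 above can be sketched with a one-component PLS1 model in plain Python. This is a simplified illustration, not the SIMCA procedure used in [33]: the descriptor names, the data, the training/test split, and the VIP cutoff of 1.0 (a common rule of thumb) are all assumptions made for the example. The PCA check of step 3 and the mechanism-based exclusion of step 4 are not covered. For a single component, the VIP of descriptor j reduces to sqrt(p) * |w_j|, where p is the number of descriptors and w is the unit-length weight vector.

```python
import math

NAMES = ["psa", "hbd", "logd", "noise"]  # hypothetical descriptors
X = [[40, 1, 2.5, 0.3], [55, 1, 2.0, -0.7], [60, 2, 2.2, 0.5],
     [75, 2, 1.5, -0.1], [90, 3, 1.0, 0.9], [100, 3, 0.8, -0.4],
     [115, 4, 0.2, 0.2], [130, 5, -0.5, -0.8], [145, 5, -0.8, 0.6],
     [160, 6, -1.5, -0.3]]
Y = [0.9, 0.7, 0.65, 0.4, 0.1, 0.0, -0.3, -0.6, -0.75, -1.1]  # made-up response
TEST_IDX = [1, 4, 7]   # hypothetical hold-out compounds
VIP_CUTOFF = 1.0       # assumed rule-of-thumb threshold

def select(rows, cols):
    """Keep only the chosen descriptor columns."""
    return [[row[j] for j in cols] for row in rows]

def fit_pls1(rows, y):
    """One-component PLS1 on autoscaled descriptors and a centered response."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    y_mean = sum(y) / n
    Xs = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    yc = [v - y_mean for v in y]
    w = [sum(Xs[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]                      # unit-length weights
    t = [sum(a * b for a, b in zip(r, w)) for r in Xs]  # latent scores
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)
    return {"means": means, "sds": sds, "y_mean": y_mean, "w": w, "q": q}

def predict(model, rows):
    """Apply the linear model: y = y_mean + q * (scaled x . w)."""
    out = []
    for row in rows:
        xs = [(row[j] - model["means"][j]) / model["sds"][j]
              for j in range(len(row))]
        out.append(model["y_mean"] + model["q"] *
                   sum(a * b for a, b in zip(xs, model["w"])))
    return out

def vip(model):
    """VIP scores for a one-component model: sqrt(p) * |w_j|."""
    p = len(model["w"])
    return [math.sqrt(p) * abs(wj) for wj in model["w"]]

def rmse(pred, obs):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pred, obs)) / len(obs))

# Step 5: split into training and test sets, fit on the training set.
train_idx = [i for i in range(len(X)) if i not in TEST_IDX]
X_train, y_train = [X[i] for i in train_idx], [Y[i] for i in train_idx]
X_test, y_test = [X[i] for i in TEST_IDX], [Y[i] for i in TEST_IDX]
vip_scores = vip(fit_pls1(X_train, y_train))

# Step 6: drop descriptors of low relative importance.
kept_cols = [j for j, v in enumerate(vip_scores) if v >= VIP_CUTOFF]
kept = [NAMES[j] for j in kept_cols]

# Step 7: refit, evaluate on the test set, then refit on all compounds.
model = fit_pls1(select(X_train, kept_cols), y_train)
test_rmse = rmse(predict(model, select(X_test, kept_cols)), y_test)
final_model = fit_pls1(select(X, kept_cols), Y)
print(kept, round(test_rmse, 3))
```

A production model would typically use more than one latent component and cross-validation to choose that number; the single-component form is used here only to keep the arithmetic visible.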

Similar multivariate analysis approaches have been widely applied for model building and predicting druglike and physicochemical properties. For example, the following were predicted:

• Caco-2 permeability and intestinal drug absorption [31,34-41]

• Blood-brain barrier penetration [39,42,43]

• P-glycoprotein substrates [44,45]

Multivariate analysis can be used to predict multiple parameters at the same time and give insights into the overall profile of a compound series or library. It also allows the integration of QSAR and QSPR analysis, and thus simultaneous optimization of biological activity properties (in vitro and in vivo potency) and druglike properties (solubility, permeability, metabolism, PK) at early stages of drug discovery [44,45]. For example, the potency (IC50) and pharmaceutical-property data (solubility and permeability) can be analyzed at the same time by using simple descriptors (H-bonding, log P, molecular weight (MW), etc.). The correlation provides insights for modification of the molecules to make them more potent and more druglike, or to find a compromise between the two. Molecular descriptors should be simple and intuitive, if possible, so medicinal chemists can easily incorporate them to make changes in their compounds. Simple answers and directions tend to be very effective. For instance, more lipophilicity may be needed to make a compound series permeable, or increasing MW may increase potency, but decrease solubility. With the dynamic correlation of property variables, finding a balance between the two properties is necessary.

Figure 15.5 shows a hypothetical PLS model for a set of compounds. The analysis shows that the cell-based functional assays (isopropionic acid (IPA)-rat, IPA-human, and bioassay) do not correlate well with the receptor-binding assay. On the other hand, the cell-based assays correlate well with the pharmaceutical properties, such as solubility and permeability. The model shows that increased MW and hydrogen-bonding capacity will improve potency. That, however, will decrease solubility and permeability and make the compounds less druglike. So, for this series of compounds, it is important to find a balance between potency and druglike properties.

As new molecules are made, their in vitro pharmaceutical properties may be rapidly determined. By applying predictive multivariate models, other properties (e.g., in vivo permeability) that are more difficult to measure may be predicted. As the database increases, models become more precise and general. This iterative process provides useful information for drug and property design. Follow-up compounds and libraries can then be designed to have higher potency and better properties.

Multivariate data analysis can be coupled with data visualization programs, such as Spotfire, to enhance data visualization [46]. Many of these visualization programs are currently being applied for HTS data to develop SAR and for data mining. Similar methodologies can also be used to visualize pharmaceutical-profiling data and for SPR. These visualization tools can be used interactively by medicinal chemists to help them look for the important interactions and provide conceptual understanding of structure-property relationships.

[Figure: PLS loadings plot, w*c[1] versus w*c[2]]
