Use of Classification Regression Tree in Predicting Oral Absorption in Humans

Bibliographic Details
Published in: Journal of Chemical Information and Computer Sciences 2004-11, Vol.44 (6), p.2061-2069
Main Authors: Bai, Jane P. F.; Utis, Andrey; Crippen, Gordon; He, Han-Dan; Fischer, Volker; Tullman, Robert; Yin, He-Qun; Hsu, Cheng-Pang; Jiang, Lan; Hwang, Kin-Kai
Format: Article
Language: English
Description
Summary: The purpose of this study is to explore the use of classification regression trees (CART) in predicting, in the dose-independent range, the fraction of dose absorbed in humans. Since the results from clinical formulations in humans were used for training the model, a hypothetical state of drug molecules already dissolved in the intestinal fluid was adopted. Therefore, the molecular attributes affecting dissolution were not considered in the model. As a result, the model projects the highest achievable fraction of dose absorbed, providing a reference point for manipulating the formulations or solid states to optimize oral clinical efficacy. A set of approximately 1260 structures and their human oral pharmacokinetic data, including bioavailability and/or absorption and/or radio-labeled studies, was used, with 899 compounds as the training set and 362 as the test set. The numerical range of the fraction of dose absorbed, 0 to 1, was divided into 6 classes, each with a width of approximately 0.16. A set of 28 structural descriptors was used for modeling oral absorption without considering active transport. Then, a separate branch was created for modeling oral absorption involving active transport. The average absolute error (AAE) of the training set was 0.12, and those of five test sets ranged from 0.17 to 0.2. In terms of classification, two test sets of unpublished, proprietary compounds showed 79% to 86% correct prediction when predicted values falling within ± one class of the actual values were counted as correctly predicted. Overall, the computational errors from all the test sets of diverse structures were similar and reasonably acceptable. Compared with artificial membranes for ranking drug absorption potential, prediction by the CART model is considered fast and reasonably accurate for accelerating drug discovery. One can not only continuously improve the accuracy of CART computations by expanding the chemical space of the training set but also calculate the statistical errors associated with the individual decision paths derived from the training set, in order to decide whether to accept an individual computation for any test set.
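The two evaluation measures named in the summary (the AAE and the share of predictions within ± one class) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes six equal-width classes over 0 to 1, since the summary gives only an approximate class width, and it assumes the AAE is computed on the fraction scale. The function names and the sample values are hypothetical.

```python
N_CLASSES = 6  # the summary divides the 0-to-1 range into 6 classes


def to_class(fraction, n_classes=N_CLASSES):
    """Map a fraction absorbed in [0, 1] to a class index 0..n_classes-1.

    Assumption: equal-width bins; the exact bin edges are not stated.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction absorbed must lie in [0, 1]")
    # A value of exactly 1.0 is assigned to the top class.
    return min(int(fraction * n_classes), n_classes - 1)


def average_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over paired values (fraction scale)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def within_one_class_rate(actual, predicted):
    """Share of predictions whose class is at most one class from the actual."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(to_class(a) - to_class(p)) <= 1 for a, p in zip(actual, predicted)
    )
    return hits / len(actual)


# Hypothetical values for illustration only, not data from the study.
actual = [0.9, 0.5, 0.1, 0.3]
predicted = [0.8, 0.1, 0.2, 0.35]
aae = average_absolute_error(actual, predicted)
rate = within_one_class_rate(actual, predicted)
```

With these made-up values, three of the four predictions land within one class of the actual value; the second compound misses by three classes and dominates the AAE.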
ISSN: 0095-2338, 1549-9596, 1549-960X
DOI:10.1021/ci040023n