Development of a CSRML version of the Analog identification Methodology (AIM) fragments and their evaluation within the Generalised Read-Across (GenRA) approach

Bibliographic details
Published in: Computational Toxicology, 2023-02, Vol. 25, p. 100256, Article 100256
Main authors: Adams, Matthew; Hidle, Hannah; Chang, Daniel; Richard, Ann M.; Williams, Antony J.; Shah, Imran; Patlewicz, Grace
Format: Article
Language: English
Description
Summary:
• Developed a CSRML version of AIM fragments.
• Evaluated performance of AIM fragments relative to the AIM database.
• Compared AIM and ToxPrint fragments against a chemistry list of interest to EPA.
• Used AIM and ToxPrint fragments in the prediction of LD50 with GenRA.

The Analog Identification Methodology (AIM) was developed over 20 years ago to identify analogues to support read-across at the US Environmental Protection Agency (EPA). However, the current public version of the standalone tool, released in 2012, no longer runs on the Windows operating systems that Microsoft still supports. Additionally, the structural logic for analogue selection is based on older, customised Simplified Molecular-Input Line-Entry System (SMILES)-type features that are incompatible with modern cheminformatics tools. Given these limitations, a case study was undertaken to explore a more transparent, extensible method of implementing the AIM fragments using Chemical Subgraphs and Reactions Mark-up Language (CSRML). A CSRML file was developed to codify the original AIM fragments, and the extent to which the AIM fragments were faithfully replicated was assessed using the AIM Database. The overall mean performance of the CSRML-AIM across all fragments in terms of sensitivity, specificity, and Jaccard similarity was 89.5%, 99.9%, and 82.2%, respectively. A comparison of the AIM fragments with public ToxPrints on a large set of ∼25,000 substances of regulatory interest to EPA found the two fragment sets to be dissimilar, with an average maximum Jaccard score of 0.24 for AIM and 0.29 for ToxPrint fingerprints. Both fragment sets were then used as inputs to the automated read-across approach, Generalised Read-Across (GenRA), and the quality of fit in predicting rat acute oral toxicity LD50 values was evaluated with the coefficient of determination (R²) and root mean squared error (RMSE). The performance of AIM fragments was R² = 0.434 and RMSE = 0.663, whereas that of ToxPrints was R² = 0.477 and RMSE = 0.638. Bootstrap resampling with 100 iterations gave a mean R² and 95% confidence interval of 0.349 [0.319, 0.379] for AIM fragments and 0.377 [0.338, 0.412] for ToxPrints. Although AIM and ToxPrints performed similarly in predicting LD50, they differed in their performance at a local level, revealing that their features can offer complementary insights.
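
The evaluation measures named in the summary can be illustrated with a minimal Python sketch. It assumes that, for a single fragment, the original AIM assignment and the CSRML-AIM assignment are available as equal-length lists of 0/1 flags over the same substances, and that R² is computed as one minus the ratio of residual to total sum of squares; the record does not state the exact definitions used in the article. The function names `fragment_agreement` and `fit_quality` and all data values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def fragment_agreement(reference, predicted):
    """Sensitivity, specificity and Jaccard similarity for one fragment.

    reference, predicted: equal-length sequences of 0/1 flags, one per
    substance (assumed layout; 1 means the fragment was assigned).
    """
    if len(reference) != len(predicted):
        raise ValueError("flag lists must have the same length")
    tp = sum(1 for r, p in zip(reference, predicted) if r and p)
    fn = sum(1 for r, p in zip(reference, predicted) if r and not p)
    fp = sum(1 for r, p in zip(reference, predicted) if not r and p)
    tn = sum(1 for r, p in zip(reference, predicted) if not r and not p)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        # Jaccard: shared assignments over assignments made by either source
        "jaccard": tp / (tp + fp + fn) if tp + fp + fn else nan,
    }


def fit_quality(observed, predicted):
    """R^2 (1 - SS_res / SS_tot) and RMSE for paired numeric values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return {"r2": 1 - ss_res / ss_tot, "rmse": sqrt(ss_res / len(observed))}


# Toy data, not taken from the article
toy_reference = [1, 1, 1, 0, 0, 0, 0, 0]
toy_csrml = [1, 1, 0, 1, 0, 0, 0, 0]
print(fragment_agreement(toy_reference, toy_csrml))

toy_observed = [2.0, 3.0, 4.0, 5.0]
toy_predicted = [2.5, 3.0, 3.5, 5.0]
print(fit_quality(toy_observed, toy_predicted))
```

Averaging the per-fragment values from `fragment_agreement` over all fragments would give overall means of the kind quoted in the summary, although the article may weight or aggregate them differently.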
ISSN: 2468-1113
DOI: 10.1016/j.comtox.2022.100256