An evaluation of existing QSAR models and structural alerts and development of new ensemble models for genotoxicity using a newly compiled experimental dataset

Bibliographic details
Published in: Computational Toxicology, 2021-05, Vol. 18 (C), p. 100167, Article 100167
Main authors: Pradeep, Prachi; Judson, Richard; DeMarini, David M.; Keshava, Nagalakshmi; Martin, Todd M.; Dean, Jeffry; Gibbons, Catherine F.; Simha, Anita; Warren, Sarah H.; Gwinn, Maureen R.; Patlewicz, Grace
Format: Article
Language: English
Description
Abstract:
• Genotoxicity data were compiled for 9299 unique substances.
• A categorization scheme was developed to prioritize chemicals on the basis of their genotoxic potential.
• The predictive performance of a selection of (Q)SARs was evaluated against this categorization scheme.
• An ensemble in silico model was developed to mimic the categorization scheme.

Regulatory agencies worldwide face the challenge of performing risk-based prioritization of thousands of substances in commerce. In this study, a major effort was undertaken to compile a large genotoxicity dataset (54,805 records for 9299 substances) from several public sources (e.g., TOXNET, COSMOS, eChemPortal). The names and outcomes of the different assays were harmonized, and assays were annotated by type: gene mutation in Salmonella bacteria (Ames assay) and chromosome mutation (clastogenicity) in vitro or in vivo (chromosome aberration, micronucleus, and mouse lymphoma Tk+/− assays). This dataset was then evaluated to assess genotoxic potential using a categorization scheme, whereby a substance was considered genotoxic if it was positive in at least one Ames or clastogen study. The categorization dataset comprised 8442 chemicals, of which 2728 were genotoxic, 5585 were non-genotoxic, and 129 were inconclusive. QSAR models (TEST and VEGA) and selected OECD Toolbox structural alerts/profilers (e.g., OASIS DNA alerts for Ames and chromosomal aberrations) were used to make in silico predictions of genotoxicity potential. The individual QSAR tools and structural alerts achieved balanced accuracies of 57–73%. A Naïve Bayes consensus model was developed using combinations of QSAR model and structural alert predictions. The ‘best’ consensus model selected had a balanced accuracy of 81.2%, a sensitivity of 87.24%, and a specificity of 75.20%. This in silico scheme offers promise as a first step in ranking thousands of substances as part of a prioritization approach for genotoxicity.
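The categorization rule and the balanced accuracy figure in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only the rule for a genotoxic call (at least one positive Ames or clastogen study); how non-genotoxic and inconclusive substances were separated is not described there, so the handling below (non-genotoxic when there is at least one negative call and no positive call, inconclusive otherwise) is an assumption. The function names and the call labels "positive" and "negative" are hypothetical. Balanced accuracy is computed with its standard definition, the mean of sensitivity and specificity.

```python
from typing import Iterable


def categorize(ames_calls: Iterable[str], clastogen_calls: Iterable[str]) -> str:
    """Assign a genotoxicity category from harmonized study outcomes.

    Stated in the abstract: a substance is genotoxic if it is positive in
    at least one Ames or clastogen study.
    ASSUMPTION (not from the abstract): a substance with no positive call
    and at least one negative call is non-genotoxic; anything else
    (for example only equivocal calls, or no data) is inconclusive.
    """
    calls = [c.strip().lower() for c in list(ames_calls) + list(clastogen_calls)]
    if "positive" in calls:
        return "genotoxic"
    if "negative" in calls:
        return "non-genotoxic"
    return "inconclusive"


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Standard definition: the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


if __name__ == "__main__":
    # One positive Ames study is enough for a genotoxic call.
    print(categorize(["negative", "positive"], []))
    # Negative in both assay types, no positives.
    print(categorize(["negative"], ["negative"]))
    # Only an equivocal result: treated as inconclusive under the assumption above.
    print(categorize(["equivocal"], []))
    # Sensitivity and specificity reported for the selected consensus model.
    print(round(balanced_accuracy(87.24, 75.20), 2))
```

With the reported sensitivity (87.24%) and specificity (75.20%), the mean is 81.22%, which agrees with the balanced accuracy of 81.2% given in the abstract.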
ISSN:2468-1113
DOI:10.1016/j.comtox.2021.100167