Species translatable blood gene signature as a marker of exposure to smoking: Computational approaches of the top ranked teams in the sbv IMPROVER Systems Toxicology Challenge

Bibliographic details
Published in: Computational Toxicology, 2018-02, Vol. 5, pp. 25-30
Main authors: Saraç, Ömer Sinan, Kumar, Rahul, Dhanda, Sandeep Kumar, Balcı, Ali Tuğrul, Bilgen, İsmail, Romero, Roberto, Tarca, Adi L.
Format: Article
Language: English
Online access: Full text
Description
Summary:
• Common blood mRNAs predict current exposure to cigarette smoke in mouse and human.
• Organism specific signature is more accurate for prediction of current exposure.
• Best predictive methods found in previous challenges are proven again to be robust.

Crowdsourcing has been used to address computational challenges in systems biology and assess translation of findings across species. Sub-challenge 2 of the sbv IMPROVER Systems Toxicology Challenge was designed to determine whether a common set of genes can be used to identify exposure to cigarette smoke in both human and mouse. Participating teams used a training set of human and mouse blood gene expression data to derive parsimonious models (up to 40 genes) that classify subjects into exposure groups: smokers, former smokers, and never-smokers. Teams were ranked based on two classification performance metrics evaluated on a blinded test dataset. Prediction of current exposure to cigarette smoke in human and mouse by a common prediction model was achieved by the top ranked team (Team 219) with 89% balanced accuracy (BAC), while past exposure was predicted with only 57% BAC. The prediction model of the top ranked team was a random forest classifier trained on sets of genes that appeared best for each species separately with no overlap between species. By contrast, Team 264, ranked second (tied with Team 250), selected genes that were simultaneously predictive in both species and achieved 80% and 59% BAC when predicting current and past exposure, respectively. These performance values were lower than the 96.5% and 61% BAC estimates for current and past exposure, respectively, obtained by Team 264 (top ranked in sub-challenge 1) when using only human data. Unlike past exposure, current exposure to cigarette smoke can be accurately assessed in both human and mouse with a common prediction model based on blood mRNAs.
However, requiring a common gene signature to be predictive in both species resulted in a substantial decrease in balanced accuracy for prediction of current exposure to cigarette smoke (from 96.5% to 80%), suggesting species-specific responses exist.
ISSN: 2468-1113
DOI: 10.1016/j.comtox.2017.04.001