Sparse multitask regression for identifying common mechanism of response to therapeutic targets

Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Bioinformatics 2010-06, Vol.26 (12), p.i97-i105
Hauptverfasser: Zhang, Kai, Gray, Joe W., Parvin, Bahram
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response. Results: In this article, we propose a sparse, multitask regression model together with co-clustering analysis to explore the intrinsic grouping in associating the gene expression with phenotypic signatures. The global structure of association is captured by learning an intrinsic template that is shared among experimental conditions, with local perturbations introduced to integrate effects of therapeutic agents. We demonstrate the performance of our approach on both synthetic and experimental data. Synthetic data reveal that the multi-task regression has a superior reduction in the regression error when compared with traditional L1-and L2-regularized regression. On the other hand, experiments with cell cycle inhibitors over a panel of 14 breast cancer cell lines demonstrate the relevance of the computed molecular predictors with the cell cycle machinery, as well as the identification of hidden variables that are not captured by the baseline regression analysis. Accordingly, the system has identified CLCA2 as a hidden transcript and as a common mechanism of response for two therapeutic agents of CI-1040 and Iressa, which are currently in clinical use. Contact: b_parvin@lbl.gov
ISSN:1367-4803
1460-2059
1367-4811
DOI:10.1093/bioinformatics/btq181