Systematic Assessment of Analytical Methods for Drug Sensitivity Prediction from Cancer Cell Line Data

Full description

Bibliographic details
Published in:	Pacific Symposium on Biocomputing 2014, 2014, p.63-74
Main authors:	Neto, Elias; Margolin, Adam; Friend, Stephen; Jang, In; Guinney, Justin
Format:	Article
Language:	eng
Subjects:	Algorithms; Artificial Intelligence; Cell Line, Tumor; Computational Biology; Databases, Genetic - statistics & numerical data; Drug Resistance, Neoplasm - genetics; Gene Expression Profiling - statistics & numerical data; Humans; Models, Genetic; Neoplasms - drug therapy; Neoplasms - genetics; Pharmacogenetics - statistics & numerical data; Regression Analysis
Online access:	Full text

container_end_page	74
container_issue
container_start_page	63
container_title	Pacific Symposium on Biocomputing 2014
container_volume
creator	Neto, Elias; Margolin, Adam; Friend, Stephen; Jang, In; Guinney, Justin
description	Large-scale pharmacogenomic screens of cancer cell lines have emerged as an attractive pre-clinical system for identifying tumor genetic subtypes with selective sensitivity to targeted therapeutic strategies. Application of modern machine learning approaches to pharmacogenomic datasets has demonstrated the ability to infer genomic predictors of compound sensitivity. Such modeling approaches entail many analytical design choices; however, a systematic study evaluating the relative performance attributable to each design choice is not yet available. In this work, we evaluated over 110,000 different models, based on a multifactorial experimental design testing systematic combinations of modeling factors within several categories of modeling choices, including: type of algorithm, type of molecular feature data, compound being predicted, method of summarizing compound sensitivity values, and whether predictions are based on discretized or continuous response values. Our results suggest that model input data (type of molecular features and choice of compound) are the primary factors explaining model performance, followed by choice of algorithm. Our results also provide a statistically principled set of recommended modeling guidelines, including: using elastic net or ridge regression with input features from all genomic profiling platforms, most importantly gene expression features, to predict continuous-valued sensitivity scores summarized using the area under the dose-response curve, with pathway-targeted compounds most likely to yield the most accurate predictors. In addition, our study provides a publicly available resource of all modeling results, an open-source code base, and experimental design for researchers throughout the community to build on our results and assess novel methodologies or applications in related predictive modeling problems.
format	Article
fulltext	fulltext
identifier	EISSN: 2335-6936
ispartof	Pacific Symposium on Biocomputing 2014, 2014, p.63-74
issn	2335-6936
language	eng
recordid	cdi_gale_vrl_6127900014
source	MEDLINE; DOAB: Directory of Open Access Books
subjects	Algorithms; Artificial Intelligence; Cell Line, Tumor; Computational Biology; Databases, Genetic - statistics & numerical data; Drug Resistance, Neoplasm - genetics; Gene Expression Profiling - statistics & numerical data; Humans; Models, Genetic; Neoplasms - drug therapy; Neoplasms - genetics; Pharmacogenetics - statistics & numerical data; Regression Analysis
title	Systematic Assessment of Analytical Methods for Drug Sensitivity Prediction from Cancer Cell Line Data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T16%3A53%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Systematic%20Assessment%20of%20Analytical%20Methods%20for%20Drug%20Sensitivity%20Prediction%20from%20Cancer%20Cell%20Line%20Data&rft.jtitle=Pacific%20Symposium%20on%20Biocomputing%202014&rft.au=Neto,%20Elias&rft.date=2014&rft.spage=63&rft.epage=74&rft.pages=63-74&rft.eissn=2335-6936&rft_id=info:doi/&rft_dat=%3Cgale_pubme%3ECX6127900014%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&rft.eisbn=9789814583220&rft.eisbn_list=9814583227&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/24297534&rft_galeid=CX6127900014&rfr_iscdi=true