Semiparametric Regression of Multidimensional Genetic Pathway Data: Least-Squares Kernel Machines and Linear Mixed Models

We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machin...

Full description

Saved in:
Bibliographic details
Published in: Biometrics 2007-12, Vol.63 (4), p.1079-1088
Main authors: Liu, Dawei; Lin, Xihong; Ghosh, Debashis
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 1088
container_issue 4
container_start_page 1079
container_title Biometrics
container_volume 63
creator Liu, Dawei
Lin, Xihong
Ghosh, Debashis
description We consider a semiparametric regression model that relates a normal outcome to covariates and a genetic pathway, where the covariate effects are modeled parametrically and the pathway effect of multiple gene expressions is modeled parametrically or nonparametrically using least-squares kernel machines (LSKMs). This unified framework allows a flexible function for the joint effect of multiple genes within a pathway by specifying a kernel function and allows for the possibility that each gene expression effect might be nonlinear and the genes within the same pathway are likely to interact with each other in a complicated way. This semiparametric model also makes it possible to test for the overall genetic pathway effect. We show that the LSKM semiparametric regression can be formulated using a linear mixed model. Estimation and inference hence can proceed within the linear mixed model framework using standard mixed model software. Both the regression coefficients of the covariate effects and the LSKM estimator of the genetic pathway effect can be obtained using the best linear unbiased predictor in the corresponding linear mixed model formulation. The smoothing parameter and the kernel parameter can be estimated as variance components using restricted maximum likelihood. A score test is developed to test for the genetic pathway effect. Model/variable selection within the LSKM framework is discussed. The methods are illustrated using a prostate cancer data set and evaluated using simulations.
doi_str_mv 10.1111/j.1541-0420.2007.00799.x
format Article
fulltext fulltext
identifier ISSN: 0006-341X
ispartof Biometrics, 2007-12, Vol.63 (4), p.1079-1088
issn 0006-341X
1541-0420
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2665800
source Jstor Complete Legacy; Oxford University Press Journals All Titles (1996-Current); MEDLINE; Wiley Online Library Journals Frontfile Complete; JSTOR Mathematics & Statistics
subjects Algorithms
Biometrics
biometry
Biometry - methods
BLUPs
Computer Simulation
computer software
data collection
Data Interpretation, Statistical
Data smoothing
gene expression
Gene Expression Profiling - methods
genes
Genetics
Kernel function
Kernel functions
least squares
Linear Models
Linear regression
Measurement techniques
Medical genetics
Model/variable selection
Modeling
Models, Biological
Nonparametric models
Nonparametric regression
Parametric models
Penalized likelihood
Prostate cancer
prostatic neoplasms
Proteome - metabolism
Regression Analysis
REML
Research methodology
Sample Size
Score test
Sensitivity and Specificity
Signal Transduction - physiology
Simulations
Smoothing parameter
statistical models
Support vector machines
variance
title Semiparametric Regression of Multidimensional Genetic Pathway Data: Least-Squares Kernel Machines and Linear Mixed Models
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T12%3A13%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Semiparametric%20Regression%20of%20Multidimensional%20Genetic%20Pathway%20Data:%20Least-Squares%20Kernel%20Machines%20and%20Linear%20Mixed%20Models&rft.jtitle=Biometrics&rft.au=Liu,%20Dawei&rft.date=2007-12&rft.volume=63&rft.issue=4&rft.spage=1079&rft.epage=1088&rft.pages=1079-1088&rft.issn=0006-341X&rft.eissn=1541-0420&rft.coden=BIOMA5&rft_id=info:doi/10.1111/j.1541-0420.2007.00799.x&rft_dat=%3Cjstor_pubme%3E4541462%3C/jstor_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=213835705&rft_id=info:pmid/18078480&rft_jstor_id=4541462&rfr_iscdi=true
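
The description field states that the LSKM semiparametric regression can be written as a linear mixed model, with the covariate coefficients and the pathway effect obtained from the best linear unbiased predictor. A minimal sketch of that idea follows, under stated assumptions: the working model is taken to be y = X beta + h + e with h ~ N(0, tau K) and e ~ N(0, sigma^2 I), the kernel is a Gaussian kernel, and the ratio lam = sigma^2 / tau and the kernel scale rho are held fixed rather than estimated by restricted maximum likelihood as the abstract describes. The function names `gaussian_kernel` and `lskm_fit`, the simulated data, and all parameter values are hypothetical and are not taken from the article or from any accompanying software.

```python
import numpy as np


def gaussian_kernel(Z, rho):
    """Gaussian kernel K[i, j] = exp(-||z_i - z_j||^2 / rho) (assumed form)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(d2, 0.0) / rho)


def lskm_fit(y, X, K, lam):
    """Estimate beta and the pathway effect h for a fixed lam = sigma^2 / tau.

    Assumed mixed model: y = X beta + h + e, h ~ N(0, tau K), e ~ N(0, sigma^2 I).
    """
    n = y.shape[0]
    V = K + lam * np.eye(n)  # marginal covariance of y, up to the factor tau
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    # Generalized least-squares estimate of the covariate effects.
    beta = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)
    # Best linear unbiased predictor of the pathway effect.
    h = K @ np.linalg.solve(V, y - X @ beta)
    return beta, h


# Hypothetical simulated data: an intercept, one covariate, five "genes".
rng = np.random.default_rng(0)
n, p = 60, 5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, p))
y = (X @ np.array([1.0, 0.5])
     + np.sin(Z[:, 0]) * Z[:, 1]
     + rng.normal(scale=0.3, size=n))

K = gaussian_kernel(Z, rho=float(p))
beta_hat, h_hat = lskm_fit(y, X, K, lam=1.0)
print("beta_hat:", beta_hat)
```

In this sketch the kernel matrix plays the role of the random-effect covariance, so a different kernel changes only `K`. Estimating lam and rho as variance components would require an additional REML step that is not shown here.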