Regression modelling on stratified data with the lasso

We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized ver...

Bibliographic details
Published in: Biometrika 2017-03, Vol.104 (1), p.83-96
Main authors: OLLIER, E., VIALLON, V.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 96
container_issue 1
container_start_page 83
container_title Biometrika
container_volume 104
creator OLLIER, E.
VIALLON, V.
description We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice, and propose an approach that bypasses this at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. An empirical study confirms that our proposal generally outperforms the basic approach in the identification and description of the interactions. An illustration on gene expression data is provided.
doi_str_mv 10.1093/biomet/asw065
format Article
fullrecord <record><control><sourceid>jstor_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_01509933v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26363644</jstor_id><sourcerecordid>26363644</sourcerecordid><originalsourceid>FETCH-LOGICAL-c332t-6a25cab8c7c2a2e2921ac6fe29e013880678b665d719fcf983f894a4e00023ad3</originalsourceid><addsrcrecordid>eNo9kEFLAzEQhYMoWKtHj8JePaxNMtl0cyxFrbAgiJ7DNJu0KdtGkmDx35uyUuYw84Zvhscj5J7RJ0YVzNY-7G2eYTpS2VyQCRNS1NAwekkmlFJZgxDimtyktDtJ2cgJkR92E21KPhyqfejtMPjDpioi5YjZO2_7qseM1dHnbZW3thowpXBLrhwOyd799yn5enn-XK7q7v31bbnoagPAcy2RNwbXrZkbjtxyxRka6cpgKYO2pXLerouRfs6UM0614FolUNjijwP2MCWP498tDvo7-j3GXx3Q69Wi06cdZQ1VCuCHFbYeWRNDStG68wGj-hSQHgPSY0CFfxj5XcohnmEuoZQQ8AfMpWPc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Regression modelling on stratified data with the lasso</title><source>Jstor Complete Legacy</source><source>Oxford University Press Journals All Titles (1996-Current)</source><source>JSTOR Mathematics &amp; Statistics</source><creator>OLLIER, E. ; VIALLON, V.</creator><creatorcontrib>OLLIER, E. ; VIALLON, V.</creatorcontrib><description>We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice, and propose an approach that bypasses this at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. An empirical study confirms that our proposal generally outperforms the basic approach in the identification and description of the interactions. 
An illustration on gene expression data is provided.</description><identifier>ISSN: 0006-3444</identifier><identifier>EISSN: 1464-3510</identifier><identifier>DOI: 10.1093/biomet/asw065</identifier><language>eng</language><publisher>Biometrika Trust</publisher><subject>Data Analysis, Statistics and Probability ; Physics ; Statistics</subject><ispartof>Biometrika, 2017-03, Vol.104 (1), p.83-96</ispartof><rights>2017 Biometrika Trust</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c332t-6a25cab8c7c2a2e2921ac6fe29e013880678b665d719fcf983f894a4e00023ad3</citedby><cites>FETCH-LOGICAL-c332t-6a25cab8c7c2a2e2921ac6fe29e013880678b665d719fcf983f894a4e00023ad3</cites><orcidid>0000-0001-6925-2941</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/26363644$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/26363644$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,776,780,799,828,881,27901,27902,57992,57996,58225,58229</link.rule.ids><backlink>$$Uhttps://hal.science/hal-01509933$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>OLLIER, E.</creatorcontrib><creatorcontrib>VIALLON, V.</creatorcontrib><title>Regression modelling on stratified data with the lasso</title><title>Biometrika</title><description>We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. 
We show that the performance of a penalized version of this approach depends on this arbitrary choice, and propose an approach that bypasses this at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. An empirical study confirms that our proposal generally outperforms the basic approach in the identification and description of the interactions. An illustration on gene expression data is provided.</description><subject>Data Analysis, Statistics and Probability</subject><subject>Physics</subject><subject>Statistics</subject><issn>0006-3444</issn><issn>1464-3510</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><recordid>eNo9kEFLAzEQhYMoWKtHj8JePaxNMtl0cyxFrbAgiJ7DNJu0KdtGkmDx35uyUuYw84Zvhscj5J7RJ0YVzNY-7G2eYTpS2VyQCRNS1NAwekkmlFJZgxDimtyktDtJ2cgJkR92E21KPhyqfejtMPjDpioi5YjZO2_7qseM1dHnbZW3thowpXBLrhwOyd799yn5enn-XK7q7v31bbnoagPAcy2RNwbXrZkbjtxyxRka6cpgKYO2pXLerouRfs6UM0614FolUNjijwP2MCWP498tDvo7-j3GXx3Q69Wi06cdZQ1VCuCHFbYeWRNDStG68wGj-hSQHgPSY0CFfxj5XcohnmEuoZQQ8AfMpWPc</recordid><startdate>20170301</startdate><enddate>20170301</enddate><creator>OLLIER, E.</creator><creator>VIALLON, V.</creator><general>Biometrika Trust</general><general>Oxford University Press (OUP)</general><scope>AAYXX</scope><scope>CITATION</scope><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0001-6925-2941</orcidid></search><sort><creationdate>20170301</creationdate><title>Regression modelling on stratified data with the lasso</title><author>OLLIER, E. 
; VIALLON, V.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c332t-6a25cab8c7c2a2e2921ac6fe29e013880678b665d719fcf983f894a4e00023ad3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Data Analysis, Statistics and Probability</topic><topic>Physics</topic><topic>Statistics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>OLLIER, E.</creatorcontrib><creatorcontrib>VIALLON, V.</creatorcontrib><collection>CrossRef</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><jtitle>Biometrika</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>OLLIER, E.</au><au>VIALLON, V.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Regression modelling on stratified data with the lasso</atitle><jtitle>Biometrika</jtitle><date>2017-03-01</date><risdate>2017</risdate><volume>104</volume><issue>1</issue><spage>83</spage><epage>96</epage><pages>83-96</pages><issn>0006-3444</issn><eissn>1464-3510</eissn><abstract>We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice, and propose an approach that bypasses this at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. An empirical study confirms that our proposal generally outperforms the basic approach in the identification and description of the interactions. 
An illustration on gene expression data is provided.</abstract><pub>Biometrika Trust</pub><doi>10.1093/biomet/asw065</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0001-6925-2941</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0006-3444
ispartof Biometrika, 2017-03, Vol.104 (1), p.83-96
issn 0006-3444
1464-3510
language eng
recordid cdi_hal_primary_oai_HAL_hal_01509933v1
source Jstor Complete Legacy; Oxford University Press Journals All Titles (1996-Current); JSTOR Mathematics & Statistics
subjects Data Analysis, Statistics and Probability
Physics
Statistics
title Regression modelling on stratified data with the lasso
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T04%3A28%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Regression%20modelling%20on%20stratified%20data%20with%20the%20lasso&rft.jtitle=Biometrika&rft.au=OLLIER,%20E.&rft.date=2017-03-01&rft.volume=104&rft.issue=1&rft.spage=83&rft.epage=96&rft.pages=83-96&rft.issn=0006-3444&rft.eissn=1464-3510&rft_id=info:doi/10.1093/biomet/asw065&rft_dat=%3Cjstor_hal_p%3E26363644%3C/jstor_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_jstor_id=26363644&rfr_iscdi=true