Regression modelling on stratified data with the lasso
We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized ver...
Published in: | Biometrika 2017-03, Vol.104 (1), p.83-96 |
---|---|
Main authors: | OLLIER, E. ; VIALLON, V. |
Format: | Article |
Language: | eng |
Subjects: | Data Analysis, Statistics and Probability ; Physics ; Statistics |
Online access: | Full text |
container_end_page | 96 |
---|---|
container_issue | 1 |
container_start_page | 83 |
container_title | Biometrika |
container_volume | 104 |
creator | OLLIER, E. ; VIALLON, V. |
description | We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice, and propose an approach that bypasses this at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. An empirical study confirms that our proposal generally outperforms the basic approach in the identification and description of the interactions. An illustration on gene expression data is provided. |
doi_str_mv | 10.1093/biomet/asw065 |
format | Article |
fullrecord | <record><control><sourceid>jstor_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_01509933v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26363644</jstor_id><sourcerecordid>26363644</sourcerecordid><originalsourceid>FETCH-LOGICAL-c332t-6a25cab8c7c2a2e2921ac6fe29e013880678b665d719fcf983f894a4e00023ad3</originalsourceid><addsrcrecordid>eNo9kEFLAzEQhYMoWKtHj8JePaxNMtl0cyxFrbAgiJ7DNJu0KdtGkmDx35uyUuYw84Zvhscj5J7RJ0YVzNY-7G2eYTpS2VyQCRNS1NAwekkmlFJZgxDimtyktDtJ2cgJkR92E21KPhyqfejtMPjDpioi5YjZO2_7qseM1dHnbZW3thowpXBLrhwOyd799yn5enn-XK7q7v31bbnoagPAcy2RNwbXrZkbjtxyxRka6cpgKYO2pXLerouRfs6UM0614FolUNjijwP2MCWP498tDvo7-j3GXx3Q69Wi06cdZQ1VCuCHFbYeWRNDStG68wGj-hSQHgPSY0CFfxj5XcohnmEuoZQQ8AfMpWPc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Regression modelling on stratified data with the lasso</title><source>Jstor Complete Legacy</source><source>Oxford University Press Journals All Titles (1996-Current)</source><source>JSTOR Mathematics & Statistics</source><creator>OLLIER, E. ; VIALLON, V.</creator><creatorcontrib>OLLIER, E. ; VIALLON, V.</creatorcontrib><description>We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice, and propose an approach that bypasses this at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. An empirical study confirms that our proposal generally outperforms the basic approach in the identification and description of the interactions. 
An illustration on gene expression data is provided.</description><identifier>ISSN: 0006-3444</identifier><identifier>EISSN: 1464-3510</identifier><identifier>DOI: 10.1093/biomet/asw065</identifier><language>eng</language><publisher>Biometrika Trust</publisher><subject>Data Analysis, Statistics and Probability ; Physics ; Statistics</subject><ispartof>Biometrika, 2017-03, Vol.104 (1), p.83-96</ispartof><rights>2017 Biometrika Trust</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c332t-6a25cab8c7c2a2e2921ac6fe29e013880678b665d719fcf983f894a4e00023ad3</citedby><cites>FETCH-LOGICAL-c332t-6a25cab8c7c2a2e2921ac6fe29e013880678b665d719fcf983f894a4e00023ad3</cites><orcidid>0000-0001-6925-2941</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/26363644$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/26363644$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,776,780,799,828,881,27901,27902,57992,57996,58225,58229</link.rule.ids><backlink>$$Uhttps://hal.science/hal-01509933$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>OLLIER, E.</creatorcontrib><creatorcontrib>VIALLON, V.</creatorcontrib><title>Regression modelling on stratified data with the lasso</title><title>Biometrika</title><description>We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. 
We show that the performance of a penalized version of this approach depends on this arbitrary choice, and propose an approach that bypasses this at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. An empirical study confirms that our proposal generally outperforms the basic approach in the identification and description of the interactions. An illustration on gene expression data is provided.</description><subject>Data Analysis, Statistics and Probability</subject><subject>Physics</subject><subject>Statistics</subject><issn>0006-3444</issn><issn>1464-3510</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><recordid>eNo9kEFLAzEQhYMoWKtHj8JePaxNMtl0cyxFrbAgiJ7DNJu0KdtGkmDx35uyUuYw84Zvhscj5J7RJ0YVzNY-7G2eYTpS2VyQCRNS1NAwekkmlFJZgxDimtyktDtJ2cgJkR92E21KPhyqfejtMPjDpioi5YjZO2_7qseM1dHnbZW3thowpXBLrhwOyd799yn5enn-XK7q7v31bbnoagPAcy2RNwbXrZkbjtxyxRka6cpgKYO2pXLerouRfs6UM0614FolUNjijwP2MCWP498tDvo7-j3GXx3Q69Wi06cdZQ1VCuCHFbYeWRNDStG68wGj-hSQHgPSY0CFfxj5XcohnmEuoZQQ8AfMpWPc</recordid><startdate>20170301</startdate><enddate>20170301</enddate><creator>OLLIER, E.</creator><creator>VIALLON, V.</creator><general>Biometrika Trust</general><general>Oxford University Press (OUP)</general><scope>AAYXX</scope><scope>CITATION</scope><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0001-6925-2941</orcidid></search><sort><creationdate>20170301</creationdate><title>Regression modelling on stratified data with the lasso</title><author>OLLIER, E. 
; VIALLON, V.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c332t-6a25cab8c7c2a2e2921ac6fe29e013880678b665d719fcf983f894a4e00023ad3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>Data Analysis, Statistics and Probability</topic><topic>Physics</topic><topic>Statistics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>OLLIER, E.</creatorcontrib><creatorcontrib>VIALLON, V.</creatorcontrib><collection>CrossRef</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><jtitle>Biometrika</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>OLLIER, E.</au><au>VIALLON, V.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Regression modelling on stratified data with the lasso</atitle><jtitle>Biometrika</jtitle><date>2017-03-01</date><risdate>2017</risdate><volume>104</volume><issue>1</issue><spage>83</spage><epage>96</epage><pages>83-96</pages><issn>0006-3444</issn><eissn>1464-3510</eissn><abstract>We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a reference stratum. We show that the performance of a penalized version of this approach depends on this arbitrary choice, and propose an approach that bypasses this at almost no additional computational cost. Regarding model selection consistency, our proposal mimics the strategy based on an optimal and covariate-specific choice for the reference stratum. An empirical study confirms that our proposal generally outperforms the basic approach in the identification and description of the interactions. 
An illustration on gene expression data is provided.</abstract><pub>Biometrika Trust</pub><doi>10.1093/biomet/asw065</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0001-6925-2941</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0006-3444 |
ispartof | Biometrika, 2017-03, Vol.104 (1), p.83-96 |
issn | 0006-3444 ; 1464-3510 |
language | eng |
recordid | cdi_hal_primary_oai_HAL_hal_01509933v1 |
source | Jstor Complete Legacy; Oxford University Press Journals All Titles (1996-Current); JSTOR Mathematics & Statistics |
subjects | Data Analysis, Statistics and Probability ; Physics ; Statistics |
title | Regression modelling on stratified data with the lasso |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T04%3A28%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Regression%20modelling%20on%20stratified%20data%20with%20the%20lasso&rft.jtitle=Biometrika&rft.au=OLLIER,%20E.&rft.date=2017-03-01&rft.volume=104&rft.issue=1&rft.spage=83&rft.epage=96&rft.pages=83-96&rft.issn=0006-3444&rft.eissn=1464-3510&rft_id=info:doi/10.1093/biomet/asw065&rft_dat=%3Cjstor_hal_p%3E26363644%3C/jstor_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_jstor_id=26363644&rfr_iscdi=true |