Predictive Variable Selection in Generalized Linear Models

Here we extend the predictive method for model selection of Laud and Ibrahim to the generalized linear model. This prescription avoids the need to directly specify prior probabilities of models and prior densities for the parameters. Instead, a prior prediction for the response induces the required prio...

Detailed description

Bibliographic details
Published in: Journal of the American Statistical Association 2002-09, Vol.97 (459), p.859-871
Main authors: Meyer, Mary C; Laud, Purushottam W
Format: Article
Language: eng
Online access: Full text
container_end_page 871
container_issue 459
container_start_page 859
container_title Journal of the American Statistical Association
container_volume 97
creator Meyer, Mary C
Laud, Purushottam W
description Here we extend the predictive method for model selection of Laud and Ibrahim to the generalized linear model. This prescription avoids the need to directly specify prior probabilities of models and prior densities for the parameters. Instead, a prior prediction for the response induces the required priors. We propose normal and conjugate priors for generalized linear models, each using a single prior prediction for the mean response to induce suitable priors for each variable-subset model. In this way, an informative prior is used to select a subset of variables. In addition to producing a ranking of models by size of the predictive criterion, the standard deviation of the criterion is used as a calibration number to produce a set of equally good models. A straightforward Markov chain Monte Carlo algorithm is used to accomplish the necessary computations. We illustrate this method with real and simulated datasets and compare results with the Bayes factors and the Akaike information and Bayes information model selection criteria. The simulation results confirm the efficacy of the method, because the correct model is known. An illustrative application demonstrates selection of important predictors of success in identifying the sentinel lymph node during surgical treatment of breast cancer. A forward selection procedure is described to avoid a full search over the 2^18 possible models in this case.
doi_str_mv 10.1198/016214502388618654
format Article
publisher Alexandria, VA: Taylor & Francis
rights American Statistical Association 2002; 2003 INIST-CNRS
eissn 1537-274X
coden JSTNAL
date 2002-09-01
tpages 13
toplevel peer_reviewed; online_resources
jstor_id 3085727
linktohtml https://www.jstor.org/stable/3085727
linktopdf https://www.jstor.org/stable/pdf/3085727
fulltext fulltext
identifier ISSN: 0162-1459
ispartof Journal of the American Statistical Association, 2002-09, Vol.97 (459), p.859-871
issn 0162-1459
1537-274X
language eng
recordid cdi_pascalfrancis_primary_13915897
source JSTOR Mathematics & Statistics; Jstor Complete Legacy; Taylor & Francis:Master (3349 titles)
subjects Body mass index
Breast cancer
Calibration
Conjugate prior
Dyes
Exact sciences and technology
Generalized linear model
Generalized linear models
Gibbs sampling
Inference from stochastic processes; time series analysis
L criterion
Linear inference, regression
Linear models
Logistic regression
Mathematics
Model testing
Modeling
Nonparametric inference
Normal prior
Parametric inference
Predictive distribution
Predictive modeling
Probabilities
Probability and statistics
Regression analysis
Sciences and techniques of general use
Statistical analysis
Statistical methods
Statistics
Theory and Methods
Tumors
title Predictive Variable Selection in Generalized Linear Models
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T20%3A13%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pasca&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predictive%20Variable%20Selection%20in%20Generalized%20Linear%20Models&rft.jtitle=Journal%20of%20the%20American%20Statistical%20Association&rft.au=Meyer,%20Mary%20C&rft.date=2002-09-01&rft.volume=97&rft.issue=459&rft.spage=859&rft.epage=871&rft.pages=859-871&rft.issn=0162-1459&rft.eissn=1537-274X&rft.coden=JSTNAL&rft_id=info:doi/10.1198/016214502388618654&rft_dat=%3Cjstor_pasca%3E3085727%3C/jstor_pasca%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=274775898&rft_id=info:pmid/&rft_jstor_id=3085727&rfr_iscdi=true
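
The abstract mentions a forward selection procedure that avoids searching every variable subset. As a rough illustration only, the sketch below shows a generic greedy forward selection in Python. It is not the authors' procedure: `forward_select` and `toy_criterion` are hypothetical names, and the toy criterion merely stands in for a real model selection criterion (smaller is better). With p candidate variables, this kind of search evaluates at most 1 + p(p+1)/2 models rather than all 2^p subsets.

```python
from typing import Callable, FrozenSet, Tuple


def forward_select(
    n_vars: int,
    criterion: Callable[[FrozenSet[int]], float],
) -> Tuple[FrozenSet[int], float, int]:
    """Greedy forward selection; smaller criterion values are better.

    Returns the selected subset, its criterion value, and the number of
    models that were evaluated along the way.
    """
    current: FrozenSet[int] = frozenset()
    best = criterion(current)
    evaluated = 1
    while len(current) < n_vars:
        # Score every model obtained by adding one unused variable.
        candidates = [current | {j} for j in range(n_vars) if j not in current]
        scores = [criterion(c) for c in candidates]
        evaluated += len(candidates)
        step_best = min(scores)
        if step_best >= best:
            break  # no single addition improves the criterion
        best = step_best
        current = candidates[scores.index(step_best)]
    return current, best, evaluated


def toy_criterion(subset: FrozenSet[int]) -> float:
    """Hypothetical stand-in criterion: variables 1 and 4 matter,
    and every included variable costs one unit."""
    true_vars = {1, 4}
    return 10.0 * len(true_vars - subset) + 1.0 * len(subset)


selected, value, evaluated = forward_select(6, toy_criterion)
print(sorted(selected), value, evaluated, 2 ** 6)
```

In this toy run the greedy search scores 16 models, compared with 64 for an exhaustive search over six variables. Greedy selection can miss the best subset when variables are only useful jointly, which is the usual trade-off for the reduced search.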