Using the Whole Cohort in the Analysis of Case-Cohort Data

Case-cohort data analyses often ignore valuable information on cohort members not sampled as cases or controls. The Atherosclerosis Risk in Communities (ARIC) study investigators, for example, typically report data for just the 10%-15% of subjects sampled for substudies of their cohort of 15,972 par...

Bibliographic details
Published in: American journal of epidemiology 2009-06, Vol.169 (11), p.1398-1405
Main authors: Breslow, Norman E., Lumley, Thomas, Ballantyne, Christie M., Chambless, Lloyd E., Kulich, Michal
Format: Article
Language: eng
Online access: Full text
container_end_page 1405
container_issue 11
container_start_page 1398
container_title American journal of epidemiology
container_volume 169
creator Breslow, Norman E.
Lumley, Thomas
Ballantyne, Christie M.
Chambless, Lloyd E.
Kulich, Michal
description Case-cohort data analyses often ignore valuable information on cohort members not sampled as cases or controls. The Atherosclerosis Risk in Communities (ARIC) study investigators, for example, typically report data for just the 10%-15% of subjects sampled for substudies of their cohort of 15,972 participants. Remaining subjects contribute to stratified sampling weights only. Analysis methods implemented in the freely available R statistical system (http://cran.r-project.org/) make better use of the data through adjustment of the sampling weights via calibration or estimation. By reanalyzing data from an ARIC study of coronary heart disease and simulations based on data from the National Wilms Tumor Study, the authors demonstrate that such adjustment can dramatically improve the precision of hazard ratios estimated for baseline covariates known for all subjects. Adjustment can also improve precision for partially missing covariates, those known for substudy participants only, when their values may be imputed with reasonable accuracy for the remaining cohort members. Links are provided to software, data sets, and tutorials showing in detail the steps needed to carry out the adjusted analyses. Epidemiologists are encouraged to consider use of these methods to enhance the accuracy of results reported from case-cohort analyses.
doi_str_mv 10.1093/aje/kwp055
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2768499</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/aje/kwp055</oup_id><sourcerecordid>1721190931</sourcerecordid><originalsourceid>FETCH-LOGICAL-c560t-77cd5d3fcb4ddad374ddd6dff9fa782107adba3e3eec7f5751dd6018c499542a3</originalsourceid><addsrcrecordid>eNqFkctKAzEUhoMoWi8bH0AGQRfCaC5N0nEhSL2C4EZxGU5zsVOnk5rMKL690Q71stDVgfwf30nyI7RN8CHBBTuCiT16ep1hzpdQj_SlyAXlYhn1MMY0L6iga2g9xgnGhBQcr6I1UjAuGR300PF9LOvHrBnb7GHsK5sN_diHJivrz7PTGqq3WMbMu2wI0eZdfAYNbKIVB1W0W93cQPcX53fDq_zm9vJ6eHqTay5wk0upDTfM6VHfGDBMpmGEca5wIAeUYAlmBMwya7V0XHKSYkwGul8UvE-BbaCTuXfWjqbWaFs3ASo1C-UUwpvyUKqfSV2O1aN_UVSKQZIkwX4nCP65tbFR0zJqW1VQW99GJSSlgmH5L0ixEFxSnsDdX-DEtyH9VWIYLwhl7MN2MId08DEG6xZXJlh9FKdScWpeXIJ3vj_yC-2aSsBeB0DUULkAtS7jgqNEEM7T4gXn29lfC98BTeCt9g</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>235912337</pqid></control><display><type>article</type><title>Using the Whole Cohort in the Analysis of Case-Cohort Data</title><source>MEDLINE</source><source>Oxford University Press Journals All Titles (1996-Current)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Alma/SFX Local Collection</source><creator>Breslow, Norman E. ; Lumley, Thomas ; Ballantyne, Christie M. ; Chambless, Lloyd E. ; Kulich, Michal</creator><creatorcontrib>Breslow, Norman E. ; Lumley, Thomas ; Ballantyne, Christie M. ; Chambless, Lloyd E. ; Kulich, Michal</creatorcontrib><description>Case-cohort data analyses often ignore valuable information on cohort members not sampled as cases or controls. The Atherosclerosis Risk in Communities (ARIC) study investigators, for example, typically report data for just the 10%-15% of subjects sampled for substudies of their cohort of 15,972 participants. Remaining subjects contribute to stratified sampling weights only. 
Analysis methods implemented in the freely available R statistical system (http://cran.r-project.org/) make better use of the data through adjustment of the sampling weights via calibration or estimation. By reanalyzing data from an ARIC study of coronary heart disease and simulations based on data from the National Wilms Tumor Study, the authors demonstrate that such adjustment can dramatically improve the precision of hazard ratios estimated for baseline covariates known for all subjects. Adjustment can also improve precision for partially missing covariates, those known for substudy participants only, when their values may be imputed with reasonable accuracy for the remaining cohort members. Links are provided to software, data sets, and tutorials showing in detail the steps needed to carry out the adjusted analyses. Epidemiologists are encouraged to consider use of these methods to enhance the accuracy of results reported from case-cohort analyses.</description><identifier>ISSN: 0002-9262</identifier><identifier>EISSN: 1476-6256</identifier><identifier>DOI: 10.1093/aje/kwp055</identifier><identifier>PMID: 19357328</identifier><identifier>CODEN: AJEPAS</identifier><language>eng</language><publisher>Cary, NC: Oxford University Press</publisher><subject>Analysis. Health state ; Biological and medical sciences ; Biomarkers - analysis ; Calibration ; Cohort Studies ; Coronary Artery Disease - epidemiology ; Coronary Artery Disease - ethnology ; Coronary Artery Disease - genetics ; Epidemiologic Methods ; Epidemiology ; Female ; General aspects ; Genotype ; Humans ; Linear Models ; Male ; Medical research ; Medical sciences ; Miscellaneous ; Observation ; Practice of Epidemiology ; Proportional Hazards Models ; Public health. Hygiene ; Public health. 
Hygiene-occupational medicine ; Research methodology ; Risk Factors ; Sampling Studies ; Statistical analysis ; Statistical methods</subject><ispartof>American journal of epidemiology, 2009-06, Vol.169 (11), p.1398-1405</ispartof><rights>American Journal of Epidemiology © The Author 2009. Published by the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org. 2009</rights><rights>2009 INIST-CNRS</rights><rights>American Journal of Epidemiology © The Author 2009. Published by the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c560t-77cd5d3fcb4ddad374ddd6dff9fa782107adba3e3eec7f5751dd6018c499542a3</citedby><cites>FETCH-LOGICAL-c560t-77cd5d3fcb4ddad374ddd6dff9fa782107adba3e3eec7f5751dd6018c499542a3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,776,780,881,1578,27903,27904</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=21615512$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19357328$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Breslow, Norman E.</creatorcontrib><creatorcontrib>Lumley, Thomas</creatorcontrib><creatorcontrib>Ballantyne, Christie M.</creatorcontrib><creatorcontrib>Chambless, Lloyd E.</creatorcontrib><creatorcontrib>Kulich, Michal</creatorcontrib><title>Using the Whole Cohort in the Analysis of Case-Cohort Data</title><title>American journal of epidemiology</title><addtitle>Am J Epidemiol</addtitle><description>Case-cohort data 
analyses often ignore valuable information on cohort members not sampled as cases or controls. The Atherosclerosis Risk in Communities (ARIC) study investigators, for example, typically report data for just the 10%-15% of subjects sampled for substudies of their cohort of 15,972 participants. Remaining subjects contribute to stratified sampling weights only. Analysis methods implemented in the freely available R statistical system (http://cran.r-project.org/) make better use of the data through adjustment of the sampling weights via calibration or estimation. By reanalyzing data from an ARIC study of coronary heart disease and simulations based on data from the National Wilms Tumor Study, the authors demonstrate that such adjustment can dramatically improve the precision of hazard ratios estimated for baseline covariates known for all subjects. Adjustment can also improve precision for partially missing covariates, those known for substudy participants only, when their values may be imputed with reasonable accuracy for the remaining cohort members. Links are provided to software, data sets, and tutorials showing in detail the steps needed to carry out the adjusted analyses. Epidemiologists are encouraged to consider use of these methods to enhance the accuracy of results reported from case-cohort analyses.</description><subject>Analysis. 
Health state</subject><subject>Biological and medical sciences</subject><subject>Biomarkers - analysis</subject><subject>Calibration</subject><subject>Cohort Studies</subject><subject>Coronary Artery Disease - epidemiology</subject><subject>Coronary Artery Disease - ethnology</subject><subject>Coronary Artery Disease - genetics</subject><subject>Epidemiologic Methods</subject><subject>Epidemiology</subject><subject>Female</subject><subject>General aspects</subject><subject>Genotype</subject><subject>Humans</subject><subject>Linear Models</subject><subject>Male</subject><subject>Medical research</subject><subject>Medical sciences</subject><subject>Miscellaneous</subject><subject>Observation</subject><subject>Practice of Epidemiology</subject><subject>Proportional Hazards Models</subject><subject>Public health. Hygiene</subject><subject>Public health. Hygiene-occupational medicine</subject><subject>Research methodology</subject><subject>Risk Factors</subject><subject>Sampling Studies</subject><subject>Statistical analysis</subject><subject>Statistical methods</subject><issn>0002-9262</issn><issn>1476-6256</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkctKAzEUhoMoWi8bH0AGQRfCaC5N0nEhSL2C4EZxGU5zsVOnk5rMKL690Q71stDVgfwf30nyI7RN8CHBBTuCiT16ep1hzpdQj_SlyAXlYhn1MMY0L6iga2g9xgnGhBQcr6I1UjAuGR300PF9LOvHrBnb7GHsK5sN_diHJivrz7PTGqq3WMbMu2wI0eZdfAYNbKIVB1W0W93cQPcX53fDq_zm9vJ6eHqTay5wk0upDTfM6VHfGDBMpmGEca5wIAeUYAlmBMwya7V0XHKSYkwGul8UvE-BbaCTuXfWjqbWaFs3ASo1C-UUwpvyUKqfSV2O1aN_UVSKQZIkwX4nCP65tbFR0zJqW1VQW99GJSSlgmH5L0ixEFxSnsDdX-DEtyH9VWIYLwhl7MN2MId08DEG6xZXJlh9FKdScWpeXIJ3vj_yC-2aSsBeB0DUULkAtS7jgqNEEM7T4gXn29lfC98BTeCt9g</recordid><startdate>20090601</startdate><enddate>20090601</enddate><creator>Breslow, Norman E.</creator><creator>Lumley, Thomas</creator><creator>Ballantyne, Christie M.</creator><creator>Chambless, Lloyd E.</creator><creator>Kulich, 
Michal</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QP</scope><scope>7T2</scope><scope>7TK</scope><scope>7U7</scope><scope>7U9</scope><scope>C1K</scope><scope>H94</scope><scope>K9.</scope><scope>NAPCQ</scope><scope>7U1</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20090601</creationdate><title>Using the Whole Cohort in the Analysis of Case-Cohort Data</title><author>Breslow, Norman E. ; Lumley, Thomas ; Ballantyne, Christie M. ; Chambless, Lloyd E. ; Kulich, Michal</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c560t-77cd5d3fcb4ddad374ddd6dff9fa782107adba3e3eec7f5751dd6018c499542a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Analysis. Health state</topic><topic>Biological and medical sciences</topic><topic>Biomarkers - analysis</topic><topic>Calibration</topic><topic>Cohort Studies</topic><topic>Coronary Artery Disease - epidemiology</topic><topic>Coronary Artery Disease - ethnology</topic><topic>Coronary Artery Disease - genetics</topic><topic>Epidemiologic Methods</topic><topic>Epidemiology</topic><topic>Female</topic><topic>General aspects</topic><topic>Genotype</topic><topic>Humans</topic><topic>Linear Models</topic><topic>Male</topic><topic>Medical research</topic><topic>Medical sciences</topic><topic>Miscellaneous</topic><topic>Observation</topic><topic>Practice of Epidemiology</topic><topic>Proportional Hazards Models</topic><topic>Public health. Hygiene</topic><topic>Public health. 
Hygiene-occupational medicine</topic><topic>Research methodology</topic><topic>Risk Factors</topic><topic>Sampling Studies</topic><topic>Statistical analysis</topic><topic>Statistical methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Breslow, Norman E.</creatorcontrib><creatorcontrib>Lumley, Thomas</creatorcontrib><creatorcontrib>Ballantyne, Christie M.</creatorcontrib><creatorcontrib>Chambless, Lloyd E.</creatorcontrib><creatorcontrib>Kulich, Michal</creatorcontrib><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Health and Safety Science Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Toxicology Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Environmental Sciences and Pollution Management</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Nursing &amp; Allied Health Premium</collection><collection>Risk Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>American journal of epidemiology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Breslow, Norman E.</au><au>Lumley, Thomas</au><au>Ballantyne, Christie M.</au><au>Chambless, Lloyd E.</au><au>Kulich, Michal</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Using the Whole Cohort in the Analysis of Case-Cohort Data</atitle><jtitle>American journal of epidemiology</jtitle><addtitle>Am J 
Epidemiol</addtitle><date>2009-06-01</date><risdate>2009</risdate><volume>169</volume><issue>11</issue><spage>1398</spage><epage>1405</epage><pages>1398-1405</pages><issn>0002-9262</issn><eissn>1476-6256</eissn><coden>AJEPAS</coden><abstract>Case-cohort data analyses often ignore valuable information on cohort members not sampled as cases or controls. The Atherosclerosis Risk in Communities (ARIC) study investigators, for example, typically report data for just the 10%-15% of subjects sampled for substudies of their cohort of 15,972 participants. Remaining subjects contribute to stratified sampling weights only. Analysis methods implemented in the freely available R statistical system (http://cran.r-project.org/) make better use of the data through adjustment of the sampling weights via calibration or estimation. By reanalyzing data from an ARIC study of coronary heart disease and simulations based on data from the National Wilms Tumor Study, the authors demonstrate that such adjustment can dramatically improve the precision of hazard ratios estimated for baseline covariates known for all subjects. Adjustment can also improve precision for partially missing covariates, those known for substudy participants only, when their values may be imputed with reasonable accuracy for the remaining cohort members. Links are provided to software, data sets, and tutorials showing in detail the steps needed to carry out the adjusted analyses. Epidemiologists are encouraged to consider use of these methods to enhance the accuracy of results reported from case-cohort analyses.</abstract><cop>Cary, NC</cop><pub>Oxford University Press</pub><pmid>19357328</pmid><doi>10.1093/aje/kwp055</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0002-9262
ispartof American journal of epidemiology, 2009-06, Vol.169 (11), p.1398-1405
issn 0002-9262
1476-6256
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2768499
source MEDLINE; Oxford University Press Journals All Titles (1996-Current); EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection
subjects Analysis. Health state
Biological and medical sciences
Biomarkers - analysis
Calibration
Cohort Studies
Coronary Artery Disease - epidemiology
Coronary Artery Disease - ethnology
Coronary Artery Disease - genetics
Epidemiologic Methods
Epidemiology
Female
General aspects
Genotype
Humans
Linear Models
Male
Medical research
Medical sciences
Miscellaneous
Observation
Practice of Epidemiology
Proportional Hazards Models
Public health. Hygiene
Public health. Hygiene-occupational medicine
Research methodology
Risk Factors
Sampling Studies
Statistical analysis
Statistical methods
title Using the Whole Cohort in the Analysis of Case-Cohort Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T07%3A26%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Using%20the%20Whole%20Cohort%20in%20the%20Analysis%20of%20Case-Cohort%20Data&rft.jtitle=American%20journal%20of%20epidemiology&rft.au=Breslow,%20Norman%20E.&rft.date=2009-06-01&rft.volume=169&rft.issue=11&rft.spage=1398&rft.epage=1405&rft.pages=1398-1405&rft.issn=0002-9262&rft.eissn=1476-6256&rft.coden=AJEPAS&rft_id=info:doi/10.1093/aje/kwp055&rft_dat=%3Cproquest_pubme%3E1721190931%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=235912337&rft_id=info:pmid/19357328&rft_oup_id=10.1093/aje/kwp055&rfr_iscdi=true