Estimation of the average correlation coefficient for stratified bivariate data
If the relationship between two ordered categorical variables X and Y is influenced by a third categorical variable with K levels, the Cochran–Mantel–Haenszel (CMH) correlation statistic QC is a useful stratum‐adjusted summary statistic for testing the null hypothesis of no association between X and...
Published in: | Statistics in medicine 1999-03, Vol.18 (5), p.567-580 |
---|---|
Main authors: | Rubenstein, Linda M.; Davis, Charles S. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 580 |
---|---|
container_issue | 5 |
container_start_page | 567 |
container_title | Statistics in medicine |
container_volume | 18 |
creator | Rubenstein, Linda M.; Davis, Charles S. |
description | If the relationship between two ordered categorical variables X and Y is influenced by a third categorical variable with K levels, the Cochran–Mantel–Haenszel (CMH) correlation statistic QC is a useful stratum‐adjusted summary statistic for testing the null hypothesis of no association between X and Y. Although motivated by and developed for the case of K I×J contingency tables, the correlation statistic QC is also applicable when X and Y are continuous variables. In this paper we derive a corresponding estimator of the average correlation coefficient for K I×J tables. We also study two estimates of the variance of the average correlation coefficient. The first is a restricted variance based on the variances of the observed cell frequencies under the null hypothesis of no association. The second is an unrestricted variance based on an asymptotic variance derived by Brown and Benedetti. The estimator of the average correlation coefficient works well in tables with balanced and unbalanced margins, for equal and unequal stratum‐specific sample sizes, when correlation coefficients are constant over strata, and when correlation coefficients vary across strata. When the correlation coefficients are zero, close to zero, or the cell frequencies are small, the confidence intervals based on the restricted variance are preferred. For larger correlations and larger cell frequencies, the unrestricted confidence intervals give superior performance.
We also apply the CMH statistic and proposed estimators to continuous non‐normal data sampled from bivariate gamma distributions. We compare our methods to statistics for data sampled from normal distributions. The size and power of the CMH and normal theory statistics are comparable. When the stratum‐specific sample sizes are small and the distributions are skewed, the proposed estimator is superior to the normal theory estimator. When the correlation coefficient is zero or close to zero, the restricted confidence intervals provide the best performance. None of the confidence intervals studied provides acceptable performances across all correlation coefficients, sample sizes and non‐normal distributions. Copyright © 1999 John Wiley & Sons, Ltd. |
doi_str_mv | 10.1002/(SICI)1097-0258(19990315)18:5<567::AID-SIM52>3.0.CO;2-F |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_69705537</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>69705537</sourcerecordid><originalsourceid>FETCH-LOGICAL-c5012-41a64714fe43a785b55c760538b223e3bc50608cf3d7500013f56d3b6c656843</originalsourceid><addsrcrecordid>eNqFkE1v1DAQhiMEokvhL6AcEGoPWfyRsZMFgarQLZEKe2hFuI0cxwZDdlPsbKH_Hi9ZChJInEaaefzq9ZMkryiZU0LYs6OLuqqPKSllRhgUR7QsS8IpHNNiAS9AyMXipH6dXdRvgb3kczKvVs9ZtryTzG7f3E1mhEmZCUnhIHkQwmdCKAUm7ycHlDBSFpTNktVpGN1ajW7YpINNx08mVdfGq48m1YP3pp9OejDWOu3MZkzt4NMw-niwznRp666Vd2o0aadG9TC5Z1UfzKP9PEwul6eX1ZvsfHVWVyfnmQZCWZZTJXJJc2tyrmQBLYCWggAvWsa44W3EBCm05Z0EEotzC6LjrdACRJHzw-TpFHvlh69bE0Zcu6BN36uNGbYBRSkJAJcRbCZQ-yEEbyxe-fhff4OU4E414k417rThThv-Uo20QMCoGjGqxp-qkSPBaoUMlzH58b7Ctl2b7o_cyW0EnuwBFbTqrVcb7cJvTlIauYh9mLBvrjc3f9X7b7t_lZsWMTqbol0YzffbaOW_oJBcAjbvzrApRNO8LwFL_gMFALSy</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>69705537</pqid></control><display><type>article</type><title>Estimation of the average correlation coefficient for stratified bivariate data</title><source>MEDLINE</source><source>Wiley Online Library Journals Frontfile Complete</source><creator>Rubenstein, Linda M. ; Davis, Charles S.</creator><creatorcontrib>Rubenstein, Linda M. ; Davis, Charles S.</creatorcontrib><description>If the relationship between two ordered categorical variables X and Y is influenced by a third categorical variable with K levels, the Cochran–Mantel–Haenszel (CMH) correlation statistic QC is a useful stratum‐adjusted summary statistic for testing the null hypothesis of no association between X and Y. Although motivated by and developed for the case of K I×J contingency tables, the correlation statistic QC is also applicable when X and Y are continuous variables. 
In this paper we derive a corresponding estimator of the average correlation coefficient for K I×J tables. We also study two estimates of the variance of the average correlation coefficient. The first is a restricted variance based on the variances of the observed cell frequencies under the null hypothesis of no association. The second is an unrestricted variance based on an asymptotic variance derived by Brown and Benedetti. The estimator of the average correlation coefficient works well in tables with balanced and unbalanced margins, for equal and unequal stratum‐specific sample sizes, when correlation coefficients are constant over strata, and when correlation coefficients vary across strata. When the correlation coefficients are zero, close to zero, or the cell frequencies are small, the confidence intervals based on the restricted variance are preferred. For larger correlations and larger cell frequencies, the unrestricted confidence intervals give superior performance.
We also apply the CMH statistic and proposed estimators to continuous non‐normal data sampled from bivariate gamma distributions. We compare our methods to statistics for data sampled from normal distributions. The size and power of the CMH and normal theory statistics are comparable. When the stratum‐specific sample sizes are small and the distributions are skewed, the proposed estimator is superior to the normal theory estimator. When the correlation coefficient is zero or close to zero, the restricted confidence intervals provide the best performance. None of the confidence intervals studied provides acceptable performances across all correlation coefficients, sample sizes and non‐normal distributions. Copyright © 1999 John Wiley & Sons, Ltd.</description><identifier>ISSN: 0277-6715</identifier><identifier>EISSN: 1097-0258</identifier><identifier>DOI: 10.1002/(SICI)1097-0258(19990315)18:5<567::AID-SIM52>3.0.CO;2-F</identifier><identifier>PMID: 10209812</identifier><language>eng</language><publisher>Chichester, UK: John Wiley & Sons, Ltd</publisher><subject>Aged ; Aged, 80 and over ; Binomial Distribution ; Biological and medical sciences ; Computerized, statistical medical data processing and models in biomedicine ; Data Collection - methods ; Data Collection - statistics & numerical data ; Effect Modifier, Epidemiologic ; Humans ; Male ; Medical sciences ; Medical statistics ; Rural Health ; Sample Size ; Statistics, Nonparametric</subject><ispartof>Statistics in medicine, 1999-03, Vol.18 (5), p.567-580</ispartof><rights>Copyright © 1999 John Wiley & Sons, Ltd.</rights><rights>1999 
INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c5012-41a64714fe43a785b55c760538b223e3bc50608cf3d7500013f56d3b6c656843</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2F%28SICI%291097-0258%2819990315%2918%3A5%3C567%3A%3AAID-SIM52%3E3.0.CO%3B2-F$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2F%28SICI%291097-0258%2819990315%2918%3A5%3C567%3A%3AAID-SIM52%3E3.0.CO%3B2-F$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1416,27923,27924,45573,45574</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=1711981$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/10209812$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Rubenstein, Linda M.</creatorcontrib><creatorcontrib>Davis, Charles S.</creatorcontrib><title>Estimation of the average correlation coefficient for stratified bivariate data</title><title>Statistics in medicine</title><addtitle>Statist. Med</addtitle><description>If the relationship between two ordered categorical variables X and Y is influenced by a third categorical variable with K levels, the Cochran–Mantel–Haenszel (CMH) correlation statistic QC is a useful stratum‐adjusted summary statistic for testing the null hypothesis of no association between X and Y. Although motivated by and developed for the case of K I×J contingency tables, the correlation statistic QC is also applicable when X and Y are continuous variables. In this paper we derive a corresponding estimator of the average correlation coefficient for K I×J tables. 
We also study two estimates of the variance of the average correlation coefficient. The first is a restricted variance based on the variances of the observed cell frequencies under the null hypothesis of no association. The second is an unrestricted variance based on an asymptotic variance derived by Brown and Benedetti. The estimator of the average correlation coefficient works well in tables with balanced and unbalanced margins, for equal and unequal stratum‐specific sample sizes, when correlation coefficients are constant over strata, and when correlation coefficients vary across strata. When the correlation coefficients are zero, close to zero, or the cell frequencies are small, the confidence intervals based on the restricted variance are preferred. For larger correlations and larger cell frequencies, the unrestricted confidence intervals give superior performance.
We also apply the CMH statistic and proposed estimators to continuous non‐normal data sampled from bivariate gamma distributions. We compare our methods to statistics for data sampled from normal distributions. The size and power of the CMH and normal theory statistics are comparable. When the stratum‐specific sample sizes are small and the distributions are skewed, the proposed estimator is superior to the normal theory estimator. When the correlation coefficient is zero or close to zero, the restricted confidence intervals provide the best performance. None of the confidence intervals studied provides acceptable performances across all correlation coefficients, sample sizes and non‐normal distributions. Copyright © 1999 John Wiley & Sons, Ltd.</description><subject>Aged</subject><subject>Aged, 80 and over</subject><subject>Binomial Distribution</subject><subject>Biological and medical sciences</subject><subject>Computerized, statistical medical data processing and models in biomedicine</subject><subject>Data Collection - methods</subject><subject>Data Collection - statistics & numerical data</subject><subject>Effect Modifier, Epidemiologic</subject><subject>Humans</subject><subject>Male</subject><subject>Medical sciences</subject><subject>Medical statistics</subject><subject>Rural Health</subject><subject>Sample Size</subject><subject>Statistics, 
Nonparametric</subject><issn>0277-6715</issn><issn>1097-0258</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1999</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkE1v1DAQhiMEokvhL6AcEGoPWfyRsZMFgarQLZEKe2hFuI0cxwZDdlPsbKH_Hi9ZChJInEaaefzq9ZMkryiZU0LYs6OLuqqPKSllRhgUR7QsS8IpHNNiAS9AyMXipH6dXdRvgb3kczKvVs9ZtryTzG7f3E1mhEmZCUnhIHkQwmdCKAUm7ycHlDBSFpTNktVpGN1ajW7YpINNx08mVdfGq48m1YP3pp9OejDWOu3MZkzt4NMw-niwznRp666Vd2o0aadG9TC5Z1UfzKP9PEwul6eX1ZvsfHVWVyfnmQZCWZZTJXJJc2tyrmQBLYCWggAvWsa44W3EBCm05Z0EEotzC6LjrdACRJHzw-TpFHvlh69bE0Zcu6BN36uNGbYBRSkJAJcRbCZQ-yEEbyxe-fhff4OU4E414k417rThThv-Uo20QMCoGjGqxp-qkSPBaoUMlzH58b7Ctl2b7o_cyW0EnuwBFbTqrVcb7cJvTlIauYh9mLBvrjc3f9X7b7t_lZsWMTqbol0YzffbaOW_oJBcAjbvzrApRNO8LwFL_gMFALSy</recordid><startdate>19990315</startdate><enddate>19990315</enddate><creator>Rubenstein, Linda M.</creator><creator>Davis, Charles S.</creator><general>John Wiley & Sons, Ltd</general><general>Wiley</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>19990315</creationdate><title>Estimation of the average correlation coefficient for stratified bivariate data</title><author>Rubenstein, Linda M. 
; Davis, Charles S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c5012-41a64714fe43a785b55c760538b223e3bc50608cf3d7500013f56d3b6c656843</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1999</creationdate><topic>Aged</topic><topic>Aged, 80 and over</topic><topic>Binomial Distribution</topic><topic>Biological and medical sciences</topic><topic>Computerized, statistical medical data processing and models in biomedicine</topic><topic>Data Collection - methods</topic><topic>Data Collection - statistics & numerical data</topic><topic>Effect Modifier, Epidemiologic</topic><topic>Humans</topic><topic>Male</topic><topic>Medical sciences</topic><topic>Medical statistics</topic><topic>Rural Health</topic><topic>Sample Size</topic><topic>Statistics, Nonparametric</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Rubenstein, Linda M.</creatorcontrib><creatorcontrib>Davis, Charles S.</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Statistics in medicine</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Rubenstein, Linda M.</au><au>Davis, Charles S.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Estimation of the average correlation coefficient for stratified bivariate data</atitle><jtitle>Statistics in medicine</jtitle><addtitle>Statist. 
Med</addtitle><date>1999-03-15</date><risdate>1999</risdate><volume>18</volume><issue>5</issue><spage>567</spage><epage>580</epage><pages>567-580</pages><issn>0277-6715</issn><eissn>1097-0258</eissn><abstract>If the relationship between two ordered categorical variables X and Y is influenced by a third categorical variable with K levels, the Cochran–Mantel–Haenszel (CMH) correlation statistic QC is a useful stratum‐adjusted summary statistic for testing the null hypothesis of no association between X and Y. Although motivated by and developed for the case of K I×J contingency tables, the correlation statistic QC is also applicable when X and Y are continuous variables. In this paper we derive a corresponding estimator of the average correlation coefficient for K I×J tables. We also study two estimates of the variance of the average correlation coefficient. The first is a restricted variance based on the variances of the observed cell frequencies under the null hypothesis of no association. The second is an unrestricted variance based on an asymptotic variance derived by Brown and Benedetti. The estimator of the average correlation coefficient works well in tables with balanced and unbalanced margins, for equal and unequal stratum‐specific sample sizes, when correlation coefficients are constant over strata, and when correlation coefficients vary across strata. When the correlation coefficients are zero, close to zero, or the cell frequencies are small, the confidence intervals based on the restricted variance are preferred. For larger correlations and larger cell frequencies, the unrestricted confidence intervals give superior performance.
We also apply the CMH statistic and proposed estimators to continuous non‐normal data sampled from bivariate gamma distributions. We compare our methods to statistics for data sampled from normal distributions. The size and power of the CMH and normal theory statistics are comparable. When the stratum‐specific sample sizes are small and the distributions are skewed, the proposed estimator is superior to the normal theory estimator. When the correlation coefficient is zero or close to zero, the restricted confidence intervals provide the best performance. None of the confidence intervals studied provides acceptable performances across all correlation coefficients, sample sizes and non‐normal distributions. Copyright © 1999 John Wiley & Sons, Ltd.</abstract><cop>Chichester, UK</cop><pub>John Wiley & Sons, Ltd</pub><pmid>10209812</pmid><doi>10.1002/(SICI)1097-0258(19990315)18:5<567::AID-SIM52>3.0.CO;2-F</doi><tpages>14</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0277-6715 |
ispartof | Statistics in medicine, 1999-03, Vol.18 (5), p.567-580 |
issn | 0277-6715; 1097-0258 |
language | eng |
recordid | cdi_proquest_miscellaneous_69705537 |
source | MEDLINE; Wiley Online Library Journals Frontfile Complete |
subjects | Aged; Aged, 80 and over; Binomial Distribution; Biological and medical sciences; Computerized, statistical medical data processing and models in biomedicine; Data Collection - methods; Data Collection - statistics & numerical data; Effect Modifier, Epidemiologic; Humans; Male; Medical sciences; Medical statistics; Rural Health; Sample Size; Statistics, Nonparametric |
title | Estimation of the average correlation coefficient for stratified bivariate data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T14%3A00%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Estimation%20of%20the%20average%20correlation%20coefficient%20for%20stratified%20bivariate%20data&rft.jtitle=Statistics%20in%20medicine&rft.au=Rubenstein,%20Linda%20M.&rft.date=1999-03-15&rft.volume=18&rft.issue=5&rft.spage=567&rft.epage=580&rft.pages=567-580&rft.issn=0277-6715&rft.eissn=1097-0258&rft_id=info:doi/10.1002/(SICI)1097-0258(19990315)18:5%3C567::AID-SIM52%3E3.0.CO;2-F&rft_dat=%3Cproquest_cross%3E69705537%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=69705537&rft_id=info:pmid/10209812&rfr_iscdi=true |
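
Note on the CMH correlation statistic QC named in the description: the record gives no formulas, so the following is only a minimal sketch of the standard textbook form of the statistic for K I×J tables, not the authors' code. The function name `cmh_correlation_statistic` and the default integer scores 1..I and 1..J are assumptions made for illustration. The sketch does not implement the paper's estimator of the average correlation coefficient or either of its variance estimates, which the abstract does not specify.

```python
import numpy as np


def cmh_correlation_statistic(tables, row_scores=None, col_scores=None):
    """Textbook CMH correlation statistic for a list of K I x J count tables.

    Hypothetical helper for illustration. Scores default to 1..I and 1..J
    (an assumption; other scorings are possible). Every stratum must have
    more than one observation and non-degenerate margins.
    """
    tables = [np.asarray(t, dtype=float) for t in tables]
    n_rows, n_cols = tables[0].shape
    a = (np.arange(1, n_rows + 1, dtype=float) if row_scores is None
         else np.asarray(row_scores, dtype=float))
    c = (np.arange(1, n_cols + 1, dtype=float) if col_scores is None
         else np.asarray(col_scores, dtype=float))

    numerator = 0.0  # sum over strata of sum_ij a_i c_j (n_ij - m_ij)
    variance = 0.0   # sum over strata of the null variance of that quantity
    for t in tables:
        n = t.sum()
        row = t.sum(axis=1)
        col = t.sum(axis=0)
        expected = np.outer(row, col) / n  # m_ij under no association
        numerator += a @ (t - expected) @ c
        s_aa = (a ** 2) @ row - (a @ row) ** 2 / n
        s_cc = (c ** 2) @ col - (c @ col) ** 2 / n
        variance += s_aa * s_cc / (n - 1)

    # Compared with a chi-square distribution with 1 degree of freedom
    # under the null hypothesis of no association.
    return numerator ** 2 / variance
```

With a single stratum the statistic reduces to (n - 1) r², where r is the Pearson correlation of the row and column scores, which gives a quick sanity check for the sketch.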