Getting Rid of Dichotomous Sex Estimations: Why Logistic Regression Should be Preferred Over Discriminant Function Analysis
Sex estimation is an important part of creating a biological profile for skeletal remains in forensics. The commonly used methods for developing sex estimation equations are discriminant function analysis (DFA) and logistic regression (LogR). LogR equations provide a probability of the predicted sex...
Published in: | Journal of forensic sciences 2020-09, Vol.65 (5), p.1685-1691 |
---|---|
Main authors: | Bartholdy, Bjørn Peare ; Sandoval, Elena ; Hoogland, Menno L. P. ; Schrader, Sarah A. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1691 |
---|---|
container_issue | 5 |
container_start_page | 1685 |
container_title | Journal of forensic sciences |
container_volume | 65 |
creator | Bartholdy, Bjørn Peare ; Sandoval, Elena ; Hoogland, Menno L. P. ; Schrader, Sarah A. |
description | Sex estimation is an important part of creating a biological profile for skeletal remains in forensics. The commonly used methods for developing sex estimation equations are discriminant function analysis (DFA) and logistic regression (LogR). LogR equations provide a probability of the predicted sex, while DFA relies on cutoff points to segregate males and females, resulting in a rigid dichotomization of the sexes. This is problematic because sexual dimorphism exists along a continuum and there can be considerable overlap in trait expression between the sexes. In this study, we used humeral measurements to compare the performance of DFA and LogR and found them to be very similar under multiple conditions. The overall cross‐validated (leave‐one‐out) accuracy of DFA (75.76–95.14%) was slightly higher than LogR (75.76–93.82%) for simple and multiple variable equations, and also performed better under varying sample sizes (94.03% vs. 93.78%). Three of five DFA equations outperformed LogR under the B index, while all five LogR equations outperformed the DFA equations under the Q index. Both methods saw an improvement in overall accuracy (DFA: 86.74–95.79%; LogR: 86.74–95.76%) when individuals with a classification probability lower than 0.80 were excluded. Additionally, we propose a method for calculating additional cutoff points (PMarks) based on posterior probability values. In conclusion, we recommend using LogR over DFA due to the increased flexibility, robusticity, and benefits for future users of the statistical models; however, if DFA is preferred, use of the proposed PMarks facilitates future analysis while avoiding unnecessary dichotomization. |
doi_str_mv | 10.1111/1556-4029.14482 |
format | Article |
fullrecord | PMID: 32521059 ; EISSN: 1556-4029 ; publisher: United States: Wiley Subscription Services, Inc ; rights: 2020 The Authors. Journal of Forensic Sciences published by Wiley Periodicals LLC on behalf of American Academy of Forensic Sciences. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). ; peer reviewed ; open access (free to read) ; ORCID: https://orcid.org/0000-0003-3985-1016 ; total pages: 7 ; PDF: https://onlinelibrary.wiley.com/doi/pdf/10.1111%2F1556-4029.14482 ; HTML: https://onlinelibrary.wiley.com/doi/full/10.1111%2F1556-4029.14482 ; PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32521059 |
fulltext | fulltext |
identifier | ISSN: 0022-1198 |
ispartof | Journal of forensic sciences, 2020-09, Vol.65 (5), p.1685-1691 |
issn | 0022-1198 ; 1556-4029 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7497157 |
source | MEDLINE; Wiley Online Library Journals Frontfile Complete |
subjects | Adult ; Aged ; Aged, 80 and over ; anthropometrics ; Conditional probability ; Discriminant Analysis ; discriminant function ; Female ; Forensic Anthropology - methods ; Function analysis ; Gender differences ; Human remains ; Humans ; humerus ; Humerus - anatomy & histology ; linear discriminant analysis ; Logistic Models ; logistic regression ; Male ; Middle Aged ; Regression analysis ; Sex ; Sex Determination by Skeleton - methods ; sex estimation ; Sexes ; Sexual dimorphism ; Statistical analysis ; Statistical models ; Technical Note ; Technical Notes ; Young Adult |
title | Getting Rid of Dichotomous Sex Estimations: Why Logistic Regression Should be Preferred Over Discriminant Function Analysis |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T06%3A32%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Getting%20Rid%20of%20Dichotomous%20Sex%20Estimations:%20Why%20Logistic%20Regression%20Should%20be%20Preferred%20Over%20Discriminant%20Function%20Analysis&rft.jtitle=Journal%20of%20forensic%20sciences&rft.au=Bartholdy,%20Bj%C3%B8rn%20Peare&rft.date=2020-09&rft.volume=65&rft.issue=5&rft.spage=1685&rft.epage=1691&rft.pages=1685-1691&rft.issn=0022-1198&rft.eissn=1556-4029&rft_id=info:doi/10.1111/1556-4029.14482&rft_dat=%3Cproquest_pubme%3E2412220511%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2437859257&rft_id=info:pmid/32521059&rfr_iscdi=true |
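
The description above contrasts a probability output (LogR) with a hard cutoff, and mentions excluding individuals whose classification probability falls below 0.80. The following is a minimal sketch of that idea, not the authors' equations: the coefficients `INTERCEPT` and `SLOPE`, the function names, and the single-measurement setup are all hypothetical and chosen only to illustrate the mechanics. The last function inverts the logistic equation to find the measurement that corresponds to a chosen probability, which is loosely in the spirit of the probability-based cutoff points (PMarks) the abstract proposes for DFA, but it should not be read as the published procedure.

```python
import math

# Hypothetical coefficients for a single-measurement logistic equation.
# They are NOT taken from the article; they only illustrate the mechanics.
INTERCEPT = -45.0  # b0 (assumed)
SLOPE = 1.0        # b1 per mm of the measurement (assumed)


def probability_male(measurement_mm: float) -> float:
    """Logistic model: P(male | x) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    z = INTERCEPT + SLOPE * measurement_mm
    return 1.0 / (1.0 + math.exp(-z))


def classify(measurement_mm: float, threshold: float = 0.80) -> str:
    """Return a sex estimate only when its probability reaches the threshold."""
    p_male = probability_male(measurement_mm)
    if p_male >= threshold:
        return "male"
    if 1.0 - p_male >= threshold:
        return "female"
    # Neither sex is supported strongly enough: avoid forcing a dichotomy.
    return "indeterminate"


def cutoff_for_probability(p_male: float) -> float:
    """Measurement at which the model yields P(male) = p_male (inverse logit)."""
    return (math.log(p_male / (1.0 - p_male)) - INTERCEPT) / SLOPE


if __name__ == "__main__":
    for x in (40.0, 45.0, 50.0):
        print(x, round(probability_male(x), 4), classify(x))
```

With these assumed coefficients, a measurement at the 0.5 point is reported as indeterminate rather than being forced into one sex, which is the practical benefit of reporting a probability instead of a single sectioning point.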