Machine Learning Outperforms Regression Analysis to Predict Next-Season Major League Baseball Player Injuries: Epidemiology and Validation of 13,982 Player-Years From Performance and Injury Profile Trends, 2000-2017
Background: Machine learning (ML) allows for the development of a predictive algorithm capable of imbibing historical data on a Major League Baseball (MLB) player to accurately project the player's future availability. Purpose: To determine the validity of an ML model in predicting the next-sea...
Published in: | Orthopaedic journal of sports medicine 2020-11, Vol.8 (11), p.2325967120963046-2325967120963046 |
---|---|
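Main authors: | Karnuta, Jaret M. ; Luu, Bryan C. ; Haeberle, Heather S. ; Saluan, Paul M. ; Frangiamore, Salvatore J. ; Stearns, Kim L. ; Farrow, Lutul D. ; Nwachukwu, Benedict U. ; Verma, Nikhil N. ; Makhni, Eric C. ; Schickendantz, Mark S. ; Ramkumar, Prem N. |
Format: | Article |
Language: | eng |
Subjects: | Epidemiology ; Machine learning ; Orthopedics ; Professional baseball ; Sports injuries ; Sports medicine |
Online access: | Full text |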
container_end_page | 2325967120963046 |
---|---|
container_issue | 11 |
container_start_page | 2325967120963046 |
container_title | Orthopaedic journal of sports medicine |
container_volume | 8 |
creator | Karnuta, Jaret M. ; Luu, Bryan C. ; Haeberle, Heather S. ; Saluan, Paul M. ; Frangiamore, Salvatore J. ; Stearns, Kim L. ; Farrow, Lutul D. ; Nwachukwu, Benedict U. ; Verma, Nikhil N. ; Makhni, Eric C. ; Schickendantz, Mark S. ; Ramkumar, Prem N. |
description | Background:
Machine learning (ML) allows for the development of a predictive algorithm capable of imbibing historical data on a Major League Baseball (MLB) player to accurately project the player's future availability.
Purpose:
To determine the validity of an ML model in predicting the next-season injury risk and anatomic injury location for both position players and pitchers in the MLB.
Study Design:
Descriptive epidemiology study.
Methods:
Using 4 online baseball databases, we compiled MLB player data, including age, performance metrics, and injury history. A total of 84 ML algorithms were developed. The output of each algorithm reported whether the player would sustain an injury the following season as well as the injury’s anatomic site. The area under the receiver operating characteristic curve (AUC) primarily determined validation.
Results:
Player data were generated from 1931 position players and 1245 pitchers, with a mean follow-up of 4.40 years (13,982 player-years) between the years of 2000 and 2017. Injured players spent a total of 108,656 days on the disabled list, with a mean of 34.21 total days per player. The mean AUC for predicting next-season injuries was 0.76 among position players and 0.65 among pitchers using the top 3 ensemble classification. Back injuries had the highest AUC among both position players and pitchers, at 0.73. Advanced ML models outperformed logistic regression in 13 of 14 cases.
Conclusion:
Advanced ML models generally outperformed logistic regression and demonstrated fair capability in predicting publicly reportable next-season injuries, including the anatomic region for position players, although not for pitchers. |
doi_str_mv | 10.1177/2325967120963046 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7672741</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_2325967120963046</sage_id><sourcerecordid>2482560919</sourcerecordid><originalsourceid>FETCH-LOGICAL-c462t-d4dd1368c32b9a7181023dd167e14e3153e3a0b51580fea1318f8cdcce18e6fa3</originalsourceid><addsrcrecordid>eNp1Uk1v1DAQtRCIVkvvnJAlLhwa8EdiJxyQStVCpS1dQUHiFHmdSeqVY2_tBHV_af8OTncppRK-jDXz3ps3mkHoJSVvKZXyHeOsqISkjFSCk1w8QftTKptyTx_899BBjCuSXlnQisvnaI9zllMiyD66PVf6yjjAc1DBGdfhi3FYQ2h96CP-Cl2AGI13-Mgpu4km4sHjRYDG6AF_gZsh-wYqpvq5WvkwqXQj4I8qwlJZixdWbSDgM7cag4H4Hp-sTQO98dZ3G6xcg38oaxo1TC18iyk_rEq2o2U_k6eIT4Pv8WJrSTkNd7Q7xU1y4ltjAV8GcE08xCxNmTFC5Qv0rFU2wsEuztD305PL48_Z_OLT2fHRPNO5YEPW5E1DuSg1Z8tKSVpSwnhKCQk0B04LDlyRZUGLkrSgKKdlW-pGa6AliFbxGfqw1V2Pyx4aDW4IytbrYHoVNrVXpv634sxV3flftRSSyZwmgTc7geCvR4hD3ZuowVrlwI-xZrnIBZFlsjJDrx9BV34MaS8TqmSFIFXa7wyRLUoHH2OA9t4MJfV0OPXjw0mUVw-HuCf8OZMEyLaAqDr42_W_gr8B1jfMKA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2482560919</pqid></control><display><type>article</type><title>Machine Learning Outperforms Regression Analysis to Predict Next-Season Major League Baseball Player Injuries: Epidemiology and Validation of 13,982 Player-Years From Performance and Injury Profile Trends, 2000-2017</title><source>DOAJ Directory of Open Access Journals</source><source>Sage Journals GOLD Open Access 2024</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><creator>Karnuta, Jaret M. ; Luu, Bryan C. ; Haeberle, Heather S. ; Saluan, Paul M. ; Frangiamore, Salvatore J. ; Stearns, Kim L. ; Farrow, Lutul D. ; Nwachukwu, Benedict U. ; Verma, Nikhil N. ; Makhni, Eric C. ; Schickendantz, Mark S. ; Ramkumar, Prem N.</creator><creatorcontrib>Karnuta, Jaret M. ; Luu, Bryan C. ; Haeberle, Heather S. 
; Saluan, Paul M. ; Frangiamore, Salvatore J. ; Stearns, Kim L. ; Farrow, Lutul D. ; Nwachukwu, Benedict U. ; Verma, Nikhil N. ; Makhni, Eric C. ; Schickendantz, Mark S. ; Ramkumar, Prem N.</creatorcontrib><description>Background:
Machine learning (ML) allows for the development of a predictive algorithm capable of imbibing historical data on a Major League Baseball (MLB) player to accurately project the player's future availability.
Purpose:
To determine the validity of an ML model in predicting the next-season injury risk and anatomic injury location for both position players and pitchers in the MLB.
Study Design:
Descriptive epidemiology study.
Methods:
Using 4 online baseball databases, we compiled MLB player data, including age, performance metrics, and injury history. A total of 84 ML algorithms were developed. The output of each algorithm reported whether the player would sustain an injury the following season as well as the injury’s anatomic site. The area under the receiver operating characteristic curve (AUC) primarily determined validation.
Results:
Player data were generated from 1931 position players and 1245 pitchers, with a mean follow-up of 4.40 years (13,982 player-years) between the years of 2000 and 2017. Injured players spent a total of 108,656 days on the disabled list, with a mean of 34.21 total days per player. The mean AUC for predicting next-season injuries was 0.76 among position players and 0.65 among pitchers using the top 3 ensemble classification. Back injuries had the highest AUC among both position players and pitchers, at 0.73. Advanced ML models outperformed logistic regression in 13 of 14 cases.
Conclusion:
Advanced ML models generally outperformed logistic regression and demonstrated fair capability in predicting publicly reportable next-season injuries, including the anatomic region for position players, although not for pitchers.</description><identifier>ISSN: 2325-9671</identifier><identifier>EISSN: 2325-9671</identifier><identifier>DOI: 10.1177/2325967120963046</identifier><identifier>PMID: 33241060</identifier><language>eng</language><publisher>Los Angeles, CA: SAGE Publications</publisher><subject>Epidemiology ; Machine learning ; Orthopedics ; Professional baseball ; Sports injuries ; Sports medicine</subject><ispartof>Orthopaedic journal of sports medicine, 2020-11, Vol.8 (11), p.2325967120963046-2325967120963046</ispartof><rights>The Author(s) 2020</rights><rights>The Author(s) 2020.</rights><rights>The Author(s) 2020. This work is licensed under the Creative Commons Attribution – Non-Commercial – No Derivatives License https://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>The Author(s) 2020 2020 SAGE Publications</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c462t-d4dd1368c32b9a7181023dd167e14e3153e3a0b51580fea1318f8cdcce18e6fa3</citedby><cites>FETCH-LOGICAL-c462t-d4dd1368c32b9a7181023dd167e14e3153e3a0b51580fea1318f8cdcce18e6fa3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7672741/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7672741/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,21965,27852,27923,27924,44944,45332,53790,53792</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33241060$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Karnuta, Jaret M.</creatorcontrib><creatorcontrib>Luu, Bryan C.</creatorcontrib><creatorcontrib>Haeberle, Heather S.</creatorcontrib><creatorcontrib>Saluan, Paul M.</creatorcontrib><creatorcontrib>Frangiamore, Salvatore J.</creatorcontrib><creatorcontrib>Stearns, Kim L.</creatorcontrib><creatorcontrib>Farrow, Lutul D.</creatorcontrib><creatorcontrib>Nwachukwu, Benedict U.</creatorcontrib><creatorcontrib>Verma, Nikhil N.</creatorcontrib><creatorcontrib>Makhni, Eric C.</creatorcontrib><creatorcontrib>Schickendantz, Mark S.</creatorcontrib><creatorcontrib>Ramkumar, Prem N.</creatorcontrib><title>Machine Learning Outperforms Regression Analysis to Predict Next-Season Major League Baseball Player Injuries: Epidemiology and Validation of 13,982 Player-Years From Performance and Injury Profile Trends, 
2000-2017</title><title>Orthopaedic journal of sports medicine</title><addtitle>Orthop J Sports Med</addtitle><description>Background:
Machine learning (ML) allows for the development of a predictive algorithm capable of imbibing historical data on a Major League Baseball (MLB) player to accurately project the player's future availability.
Purpose:
To determine the validity of an ML model in predicting the next-season injury risk and anatomic injury location for both position players and pitchers in the MLB.
Study Design:
Descriptive epidemiology study.
Methods:
Using 4 online baseball databases, we compiled MLB player data, including age, performance metrics, and injury history. A total of 84 ML algorithms were developed. The output of each algorithm reported whether the player would sustain an injury the following season as well as the injury’s anatomic site. The area under the receiver operating characteristic curve (AUC) primarily determined validation.
Results:
Player data were generated from 1931 position players and 1245 pitchers, with a mean follow-up of 4.40 years (13,982 player-years) between the years of 2000 and 2017. Injured players spent a total of 108,656 days on the disabled list, with a mean of 34.21 total days per player. The mean AUC for predicting next-season injuries was 0.76 among position players and 0.65 among pitchers using the top 3 ensemble classification. Back injuries had the highest AUC among both position players and pitchers, at 0.73. Advanced ML models outperformed logistic regression in 13 of 14 cases.
Conclusion:
Advanced ML models generally outperformed logistic regression and demonstrated fair capability in predicting publicly reportable next-season injuries, including the anatomic region for position players, although not for pitchers.</description><subject>Epidemiology</subject><subject>Machine learning</subject><subject>Orthopedics</subject><subject>Professional baseball</subject><subject>Sports injuries</subject><subject>Sports medicine</subject><issn>2325-9671</issn><issn>2325-9671</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>AFRWT</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNp1Uk1v1DAQtRCIVkvvnJAlLhwa8EdiJxyQStVCpS1dQUHiFHmdSeqVY2_tBHV_af8OTncppRK-jDXz3ps3mkHoJSVvKZXyHeOsqISkjFSCk1w8QftTKptyTx_899BBjCuSXlnQisvnaI9zllMiyD66PVf6yjjAc1DBGdfhi3FYQ2h96CP-Cl2AGI13-Mgpu4km4sHjRYDG6AF_gZsh-wYqpvq5WvkwqXQj4I8qwlJZixdWbSDgM7cag4H4Hp-sTQO98dZ3G6xcg38oaxo1TC18iyk_rEq2o2U_k6eIT4Pv8WJrSTkNd7Q7xU1y4ltjAV8GcE08xCxNmTFC5Qv0rFU2wsEuztD305PL48_Z_OLT2fHRPNO5YEPW5E1DuSg1Z8tKSVpSwnhKCQk0B04LDlyRZUGLkrSgKKdlW-pGa6AliFbxGfqw1V2Pyx4aDW4IytbrYHoVNrVXpv634sxV3flftRSSyZwmgTc7geCvR4hD3ZuowVrlwI-xZrnIBZFlsjJDrx9BV34MaS8TqmSFIFXa7wyRLUoHH2OA9t4MJfV0OPXjw0mUVw-HuCf8OZMEyLaAqDr42_W_gr8B1jfMKA</recordid><startdate>20201101</startdate><enddate>20201101</enddate><creator>Karnuta, Jaret M.</creator><creator>Luu, Bryan C.</creator><creator>Haeberle, Heather S.</creator><creator>Saluan, Paul M.</creator><creator>Frangiamore, Salvatore J.</creator><creator>Stearns, Kim L.</creator><creator>Farrow, Lutul D.</creator><creator>Nwachukwu, Benedict U.</creator><creator>Verma, Nikhil N.</creator><creator>Makhni, Eric C.</creator><creator>Schickendantz, Mark S.</creator><creator>Ramkumar, Prem N.</creator><general>SAGE Publications</general><general>Sage Publications 
Ltd</general><scope>AFRWT</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7RV</scope><scope>7X7</scope><scope>7XB</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>K9.</scope><scope>KB0</scope><scope>M0S</scope><scope>NAPCQ</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20201101</creationdate><title>Machine Learning Outperforms Regression Analysis to Predict Next-Season Major League Baseball Player Injuries: Epidemiology and Validation of 13,982 Player-Years From Performance and Injury Profile Trends, 2000-2017</title><author>Karnuta, Jaret M. ; Luu, Bryan C. ; Haeberle, Heather S. ; Saluan, Paul M. ; Frangiamore, Salvatore J. ; Stearns, Kim L. ; Farrow, Lutul D. ; Nwachukwu, Benedict U. ; Verma, Nikhil N. ; Makhni, Eric C. ; Schickendantz, Mark S. 
; Ramkumar, Prem N.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c462t-d4dd1368c32b9a7181023dd167e14e3153e3a0b51580fea1318f8cdcce18e6fa3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Epidemiology</topic><topic>Machine learning</topic><topic>Orthopedics</topic><topic>Professional baseball</topic><topic>Sports injuries</topic><topic>Sports medicine</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Karnuta, Jaret M.</creatorcontrib><creatorcontrib>Luu, Bryan C.</creatorcontrib><creatorcontrib>Haeberle, Heather S.</creatorcontrib><creatorcontrib>Saluan, Paul M.</creatorcontrib><creatorcontrib>Frangiamore, Salvatore J.</creatorcontrib><creatorcontrib>Stearns, Kim L.</creatorcontrib><creatorcontrib>Farrow, Lutul D.</creatorcontrib><creatorcontrib>Nwachukwu, Benedict U.</creatorcontrib><creatorcontrib>Verma, Nikhil N.</creatorcontrib><creatorcontrib>Makhni, Eric C.</creatorcontrib><creatorcontrib>Schickendantz, Mark S.</creatorcontrib><creatorcontrib>Ramkumar, Prem N.</creatorcontrib><collection>Sage Journals GOLD Open Access 2024</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Nursing & Allied Health Database</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central 
Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Nursing & Allied Health Database (Alumni Edition)</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Nursing & Allied Health Premium</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Orthopaedic journal of sports medicine</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Karnuta, Jaret M.</au><au>Luu, Bryan C.</au><au>Haeberle, Heather S.</au><au>Saluan, Paul M.</au><au>Frangiamore, Salvatore J.</au><au>Stearns, Kim L.</au><au>Farrow, Lutul D.</au><au>Nwachukwu, Benedict U.</au><au>Verma, Nikhil N.</au><au>Makhni, Eric C.</au><au>Schickendantz, Mark S.</au><au>Ramkumar, Prem N.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Machine Learning Outperforms Regression Analysis to Predict Next-Season Major League Baseball Player Injuries: Epidemiology and Validation of 13,982 Player-Years From Performance and Injury Profile Trends, 2000-2017</atitle><jtitle>Orthopaedic journal of sports medicine</jtitle><addtitle>Orthop J Sports Med</addtitle><date>2020-11-01</date><risdate>2020</risdate><volume>8</volume><issue>11</issue><spage>2325967120963046</spage><epage>2325967120963046</epage><pages>2325967120963046-2325967120963046</pages><issn>2325-9671</issn><eissn>2325-9671</eissn><abstract>Background:
Machine learning (ML) allows for the development of a predictive algorithm capable of imbibing historical data on a Major League Baseball (MLB) player to accurately project the player's future availability.
Purpose:
To determine the validity of an ML model in predicting the next-season injury risk and anatomic injury location for both position players and pitchers in the MLB.
Study Design:
Descriptive epidemiology study.
Methods:
Using 4 online baseball databases, we compiled MLB player data, including age, performance metrics, and injury history. A total of 84 ML algorithms were developed. The output of each algorithm reported whether the player would sustain an injury the following season as well as the injury’s anatomic site. The area under the receiver operating characteristic curve (AUC) primarily determined validation.
Results:
Player data were generated from 1931 position players and 1245 pitchers, with a mean follow-up of 4.40 years (13,982 player-years) between the years of 2000 and 2017. Injured players spent a total of 108,656 days on the disabled list, with a mean of 34.21 total days per player. The mean AUC for predicting next-season injuries was 0.76 among position players and 0.65 among pitchers using the top 3 ensemble classification. Back injuries had the highest AUC among both position players and pitchers, at 0.73. Advanced ML models outperformed logistic regression in 13 of 14 cases.
Conclusion:
Advanced ML models generally outperformed logistic regression and demonstrated fair capability in predicting publicly reportable next-season injuries, including the anatomic region for position players, although not for pitchers.</abstract><cop>Los Angeles, CA</cop><pub>SAGE Publications</pub><pmid>33241060</pmid><doi>10.1177/2325967120963046</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2325-9671 |
ispartof | Orthopaedic journal of sports medicine, 2020-11, Vol.8 (11), p.2325967120963046-2325967120963046 |
issn | 2325-9671 2325-9671 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7672741 |
source | DOAJ Directory of Open Access Journals; Sage Journals GOLD Open Access 2024; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central |
subjects | Epidemiology ; Machine learning ; Orthopedics ; Professional baseball ; Sports injuries ; Sports medicine |
title | Machine Learning Outperforms Regression Analysis to Predict Next-Season Major League Baseball Player Injuries: Epidemiology and Validation of 13,982 Player-Years From Performance and Injury Profile Trends, 2000-2017 |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T08%3A05%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20Learning%20Outperforms%20Regression%20Analysis%20to%20Predict%20Next-Season%20Major%20League%20Baseball%20Player%20Injuries:%20Epidemiology%20and%20Validation%20of%2013,982%20Player-Years%20From%20Performance%20and%20Injury%20Profile%20Trends,%202000-2017&rft.jtitle=Orthopaedic%20journal%20of%20sports%20medicine&rft.au=Karnuta,%20Jaret%20M.&rft.date=2020-11-01&rft.volume=8&rft.issue=11&rft.spage=2325967120963046&rft.epage=2325967120963046&rft.pages=2325967120963046-2325967120963046&rft.issn=2325-9671&rft.eissn=2325-9671&rft_id=info:doi/10.1177/2325967120963046&rft_dat=%3Cproquest_pubme%3E2482560919%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2482560919&rft_id=info:pmid/33241060&rft_sage_id=10.1177_2325967120963046&rfr_iscdi=true |
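
The summary figures in the Results paragraph of the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers quoted in this record, not part of the authors' analysis; the variable names are illustrative, and the reading that both means are taken over all players (position players plus pitchers) is an inference from the arithmetic rather than something the record states.

```python
# Figures quoted in the abstract (Results paragraph).
position_players = 1931
pitchers = 1245
player_years = 13_982      # total follow-up, in player-years
total_dl_days = 108_656    # total days on the disabled list

# Assumption: both reported means are taken over all players combined.
players = position_players + pitchers
mean_follow_up = player_years / players    # years of follow-up per player
mean_dl_days = total_dl_days / players     # disabled-list days per player

print(f"players: {players}")
print(f"mean follow-up: {mean_follow_up:.2f} years")
print(f"mean disabled-list days per player: {mean_dl_days:.2f}")
```

Under that assumption the computed means round to the reported 4.40 years and 34.21 days, which suggests the per-player mean of disabled-list days is averaged over the full cohort of 3,176 players rather than over injured players only.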