A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in-play forecasting of goals in football?
Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of a match. The present paper tackles two neglected...
Published in: | Social network analysis and mining 2022-12, Vol.12 (1), p.23-23, Article 23 |
---|---|
Main authors: | Wunderlich, Fabian; Memmert, Daniel |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 23 |
---|---|
container_issue | 1 |
container_start_page | 23 |
container_title | Social network analysis and mining |
container_volume | 12 |
creator | Wunderlich, Fabian; Memmert, Daniel |
description | Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of a match. The present paper tackles two neglected facets of forecasting in football: Forecasts on the total number of goals and in-play forecasting (forecasts based on within-match information). Sentiment analysis techniques were used to extract the information reflected in almost two million tweets from more than 400 Premier League matches. By means of wordclouds and timely analysis of several tweet-based features, the Twitter communication over the full course of matches and shortly before and after goals was visualized and systematically analysed. Moreover, several forecasting models including a random forest model have been used to obtain in-play forecasts. Results suggest that in-play forecasting of goals is highly challenging, and in-play information does not improve forecasting accuracy. An additional analysis of goals from more than 30,000 matches from the main European football leagues supports the notion that the predictive value of in-play information is highly limited compared to pre-game information. This is a relevant result for coaches, match analysts and broadcasters who should not overestimate the value of in-play information. The present study also sheds light on how the perception and behaviour of Twitter users change over the course of a football match. A main result is that the sentiment of Twitter users decreases when the match progresses, which might be caused by an unjustified high expectation of football fans before the match. |
doi_str_mv | 10.1007/s13278-021-00842-z |
format | Article |
fullrecord | Publisher: Vienna: Springer Vienna; PMID: 34976228; EISSN: 1869-5469; ORCID: https://orcid.org/0000-0002-7445-6858; Rights: The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”); peer reviewed; free to read |
fulltext | fulltext |
identifier | ISSN: 1869-5450 |
ispartof | Social network analysis and mining, 2022-12, Vol.12 (1), p.23-23, Article 23 |
issn | 1869-5450; 1869-5469 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8714875 |
source | ProQuest Central Essentials; ProQuest Central (Alumni Edition); ProQuest Central Student; ProQuest Central Korea; ProQuest Central UK/Ireland; SpringerLink Journals - AutoHoldings; ProQuest Central |
subjects | Analysis; Applications of Graph Theory and Complex Networks; Big Data; Collaboration; Computer Science; Data analysis; Data mining; Data Mining and Knowledge Discovery; Economics; Football; Forecasting; Gambling; Game Theory; Humanities; Information sources; Investigations; Law; Machine learning; Mathematical models; Methodology of the Social Sciences; Objectives; Original; Original Article; Prediction markets; Sentiment analysis; Soccer; Social and Behav. Sciences; Social networks; Statistics for Social Sciences; Teams |
title | A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in-play forecasting of goals in football? |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T12%3A01%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20big%20data%20analysis%20of%20Twitter%20data%20during%20premier%20league%20matches:%20do%20tweets%20contain%20information%20valuable%20for%20in-play%20forecasting%20of%20goals%20in%20football?&rft.jtitle=Social%20network%20analysis%20and%20mining&rft.au=Wunderlich,%20Fabian&rft.date=2022-12-01&rft.volume=12&rft.issue=1&rft.spage=23&rft.epage=23&rft.pages=23-23&rft.artnum=23&rft.issn=1869-5450&rft.eissn=1869-5469&rft_id=info:doi/10.1007/s13278-021-00842-z&rft_dat=%3Cproquest_pubme%3E2919537693%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2919537693&rft_id=info:pmid/34976228&rfr_iscdi=true |