A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in-play forecasting of goals in football?

Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of a match. The present paper tackles two neglected...

Bibliographic details
Published in: Social network analysis and mining 2022-12, Vol.12 (1), p.23-23, Article 23
Main authors: Wunderlich, Fabian; Memmert, Daniel
Format: Article
Language: eng
Online access: Full text
container_end_page 23
container_issue 1
container_start_page 23
container_title Social network analysis and mining
container_volume 12
creator Wunderlich, Fabian
Memmert, Daniel
description Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of a match. The present paper tackles two neglected facets of forecasting in football: Forecasts on the total number of goals and in-play forecasting (forecasts based on within-match information). Sentiment analysis techniques were used to extract the information reflected in almost two million tweets from more than 400 Premier League matches. By means of wordclouds and timely analysis of several tweet-based features, the Twitter communication over the full course of matches and shortly before and after goals was visualized and systematically analysed. Moreover, several forecasting models including a random forest model have been used to obtain in-play forecasts. Results suggest that in-play forecasting of goals is highly challenging, and in-play information does not improve forecasting accuracy. An additional analysis of goals from more than 30,000 matches from the main European football leagues supports the notion that the predictive value of in-play information is highly limited compared to pre-game information. This is a relevant result for coaches, match analysts and broadcasters who should not overestimate the value of in-play information. The present study also sheds light on how the perception and behaviour of Twitter users change over the course of a football match. A main result is that the sentiment of Twitter users decreases when the match progresses, which might be caused by an unjustified high expectation of football fans before the match.
doi_str_mv 10.1007/s13278-021-00842-z
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8714875</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2919537693</sourcerecordid><originalsourceid>FETCH-LOGICAL-c474t-cfcbd4903a2efa92a8a82202d89b74e000e777923c205826c3c420a06c704a313</originalsourceid><addsrcrecordid>eNp9Uctu1TAQjRCIVqU_wAJZYsMm4FdimwWoqnhJldiUtTVxnNSVE19sp9Xtj_R3cUi5PBasPDrnzBnPnKp6TvBrgrF4kwijQtaYkhpjyWl996g6JrJVdcNb9fhQN_ioOk3pGmNMMGMKt0-rI8aVaCmVx9X9GerciHrIgGAGv08uoTCgy1uXs40b0S_RzSPaRTu5gnkL42LRBNlc2fQW9QHlW2tzQibMGdyM3DyEWHgXZnQDfoHOW1SgQtQ7D_u1tgZSXm3LtDGAT4UseMgdeP_-WfVkKJg9fXhPqm8fP1yef64vvn76cn52URsueK7NYLqeK8yA2gEUBQmSUkx7qTrBbVnaCiEUZYbiRtLWMMMpBtwagTkwwk6qd5vvbukm2xs75whe76KbIO51AKf_ZmZ3pcdwo6UgXIqmGLx6MIjh-2JT1pNLxnoPsw1L0rQlLZXrvYv05T_S67DEcvSiUkQ1TLSKFRXdVCaGlKIdDp8hWK_R6y16XaLXP6PXd6XpxZ9rHFp-BV0EbBOk3Zqljb9n_8f2B56BvJU</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2919537693</pqid></control><display><type>article</type><title>A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in-play forecasting of goals in football?</title><source>ProQuest Central Essentials</source><source>ProQuest Central (Alumni Edition)</source><source>ProQuest Central Student</source><source>ProQuest Central Korea</source><source>ProQuest Central UK/Ireland</source><source>SpringerLink Journals - AutoHoldings</source><source>ProQuest Central</source><creator>Wunderlich, Fabian ; Memmert, Daniel</creator><creatorcontrib>Wunderlich, Fabian ; Memmert, Daniel</creatorcontrib><description>Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of a match. 
The present paper tackles two neglected facets of forecasting in football: Forecasts on the total number of goals and in-play forecasting (forecasts based on within-match information). Sentiment analysis techniques were used to extract the information reflected in almost two million tweets from more than 400 Premier League matches. By means of wordclouds and timely analysis of several tweet-based features, the Twitter communication over the full course of matches and shortly before and after goals was visualized and systematically analysed. Moreover, several forecasting models including a random forest model have been used to obtain in-play forecasts. Results suggest that in-play forecasting of goals is highly challenging, and in-play information does not improve forecasting accuracy. An additional analysis of goals from more than 30,000 matches from the main European football leagues supports the notion that the predictive value of in-play information is highly limited compared to pre-game information. This is a relevant result for coaches, match analysts and broadcasters who should not overestimate the value of in-play information. The present study also sheds light on how the perception and behaviour of Twitter users change over the course of a football match. 
A main result is that the sentiment of Twitter users decreases when the match progresses, which might be caused by an unjustified high expectation of football fans before the match.</description><identifier>ISSN: 1869-5450</identifier><identifier>EISSN: 1869-5469</identifier><identifier>DOI: 10.1007/s13278-021-00842-z</identifier><identifier>PMID: 34976228</identifier><language>eng</language><publisher>Vienna: Springer Vienna</publisher><subject>Analysis ; Applications of Graph Theory and Complex Networks ; Big Data ; Collaboration ; Computer Science ; Data analysis ; Data mining ; Data Mining and Knowledge Discovery ; Economics ; Football ; Forecasting ; Gambling ; Game Theory ; Humanities ; Information sources ; Investigations ; Law ; Machine learning ; Mathematical models ; Methodology of the Social Sciences ; Objectives ; Original ; Original Article ; Prediction markets ; Sentiment analysis ; Soccer ; Social and Behav. Sciences ; Social networks ; Statistics for Social Sciences ; Teams</subject><ispartof>Social network analysis and mining, 2022-12, Vol.12 (1), p.23-23, Article 23</ispartof><rights>The Author(s) 2021</rights><rights>The Author(s) 2021.</rights><rights>The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c474t-cfcbd4903a2efa92a8a82202d89b74e000e777923c205826c3c420a06c704a313</citedby><cites>FETCH-LOGICAL-c474t-cfcbd4903a2efa92a8a82202d89b74e000e777923c205826c3c420a06c704a313</cites><orcidid>0000-0002-7445-6858</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s13278-021-00842-z$$EPDF$$P50$$Gspringer$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2919537693?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>230,314,780,784,885,21387,21388,21389,21390,23255,27923,27924,33529,33530,33702,33703,33743,33744,34004,34005,34313,34314,41487,42556,43658,43786,43804,43952,44066,51318,64384,64386,64388,72340</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34976228$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wunderlich, Fabian</creatorcontrib><creatorcontrib>Memmert, Daniel</creatorcontrib><title>A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in-play forecasting of goals in football?</title><title>Social network analysis and mining</title><addtitle>Soc. Netw. Anal. Min</addtitle><addtitle>Soc Netw Anal Min</addtitle><description>Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of a match. 
The present paper tackles two neglected facets of forecasting in football: Forecasts on the total number of goals and in-play forecasting (forecasts based on within-match information). Sentiment analysis techniques were used to extract the information reflected in almost two million tweets from more than 400 Premier League matches. By means of wordclouds and timely analysis of several tweet-based features, the Twitter communication over the full course of matches and shortly before and after goals was visualized and systematically analysed. Moreover, several forecasting models including a random forest model have been used to obtain in-play forecasts. Results suggest that in-play forecasting of goals is highly challenging, and in-play information does not improve forecasting accuracy. An additional analysis of goals from more than 30,000 matches from the main European football leagues supports the notion that the predictive value of in-play information is highly limited compared to pre-game information. This is a relevant result for coaches, match analysts and broadcasters who should not overestimate the value of in-play information. The present study also sheds light on how the perception and behaviour of Twitter users change over the course of a football match. 
A main result is that the sentiment of Twitter users decreases when the match progresses, which might be caused by an unjustified high expectation of football fans before the match.</description><subject>Analysis</subject><subject>Applications of Graph Theory and Complex Networks</subject><subject>Big Data</subject><subject>Collaboration</subject><subject>Computer Science</subject><subject>Data analysis</subject><subject>Data mining</subject><subject>Data Mining and Knowledge Discovery</subject><subject>Economics</subject><subject>Football</subject><subject>Forecasting</subject><subject>Gambling</subject><subject>Game Theory</subject><subject>Humanities</subject><subject>Information sources</subject><subject>Investigations</subject><subject>Law</subject><subject>Machine learning</subject><subject>Mathematical models</subject><subject>Methodology of the Social Sciences</subject><subject>Objectives</subject><subject>Original</subject><subject>Original Article</subject><subject>Prediction markets</subject><subject>Sentiment analysis</subject><subject>Soccer</subject><subject>Social and Behav. 
Sciences</subject><subject>Social networks</subject><subject>Statistics for Social Sciences</subject><subject>Teams</subject><issn>1869-5450</issn><issn>1869-5469</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp9Uctu1TAQjRCIVqU_wAJZYsMm4FdimwWoqnhJldiUtTVxnNSVE19sp9Xtj_R3cUi5PBasPDrnzBnPnKp6TvBrgrF4kwijQtaYkhpjyWl996g6JrJVdcNb9fhQN_ioOk3pGmNMMGMKt0-rI8aVaCmVx9X9GerciHrIgGAGv08uoTCgy1uXs40b0S_RzSPaRTu5gnkL42LRBNlc2fQW9QHlW2tzQibMGdyM3DyEWHgXZnQDfoHOW1SgQtQ7D_u1tgZSXm3LtDGAT4UseMgdeP_-WfVkKJg9fXhPqm8fP1yef64vvn76cn52URsueK7NYLqeK8yA2gEUBQmSUkx7qTrBbVnaCiEUZYbiRtLWMMMpBtwagTkwwk6qd5vvbukm2xs75whe76KbIO51AKf_ZmZ3pcdwo6UgXIqmGLx6MIjh-2JT1pNLxnoPsw1L0rQlLZXrvYv05T_S67DEcvSiUkQ1TLSKFRXdVCaGlKIdDp8hWK_R6y16XaLXP6PXd6XpxZ9rHFp-BV0EbBOk3Zqljb9n_8f2B56BvJU</recordid><startdate>20221201</startdate><enddate>20221201</enddate><creator>Wunderlich, Fabian</creator><creator>Memmert, Daniel</creator><general>Springer Vienna</general><general>Springer Nature 
B.V</general><scope>C6C</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>0-V</scope><scope>3V.</scope><scope>7XB</scope><scope>88J</scope><scope>8BJ</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ALSLI</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FQK</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JBE</scope><scope>JQ2</scope><scope>K7-</scope><scope>M2R</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-7445-6858</orcidid></search><sort><creationdate>20221201</creationdate><title>A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in-play forecasting of goals in football?</title><author>Wunderlich, Fabian ; Memmert, Daniel</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c474t-cfcbd4903a2efa92a8a82202d89b74e000e777923c205826c3c420a06c704a313</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Analysis</topic><topic>Applications of Graph Theory and Complex Networks</topic><topic>Big Data</topic><topic>Collaboration</topic><topic>Computer Science</topic><topic>Data analysis</topic><topic>Data mining</topic><topic>Data Mining and Knowledge Discovery</topic><topic>Economics</topic><topic>Football</topic><topic>Forecasting</topic><topic>Gambling</topic><topic>Game Theory</topic><topic>Humanities</topic><topic>Information sources</topic><topic>Investigations</topic><topic>Law</topic><topic>Machine learning</topic><topic>Mathematical models</topic><topic>Methodology of the Social Sciences</topic><topic>Objectives</topic><topic>Original</topic><topic>Original 
Article</topic><topic>Prediction markets</topic><topic>Sentiment analysis</topic><topic>Soccer</topic><topic>Social and Behav. Sciences</topic><topic>Social networks</topic><topic>Statistics for Social Sciences</topic><topic>Teams</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wunderlich, Fabian</creatorcontrib><creatorcontrib>Memmert, Daniel</creatorcontrib><collection>Springer Nature OA/Free Journals</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Social Sciences Premium Collection</collection><collection>ProQuest Central (Corporate)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Social Science Database (Alumni Edition)</collection><collection>International Bibliography of the Social Sciences (IBSS)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Social Science Premium Collection</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection (ProQuest)</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>International Bibliography of the Social Sciences</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>International Bibliography of the Social Sciences</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Social Science Database</collection><collection>Advanced Technologies &amp; Aerospace 
Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Social network analysis and mining</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wunderlich, Fabian</au><au>Memmert, Daniel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in-play forecasting of goals in football?</atitle><jtitle>Social network analysis and mining</jtitle><stitle>Soc. Netw. Anal. Min</stitle><addtitle>Soc Netw Anal Min</addtitle><date>2022-12-01</date><risdate>2022</risdate><volume>12</volume><issue>1</issue><spage>23</spage><epage>23</epage><pages>23-23</pages><artnum>23</artnum><issn>1869-5450</issn><eissn>1869-5469</eissn><abstract>Data-related analysis in football increasingly benefits from Big Data approaches and machine learning methods. One relevant application of data analysis in football is forecasting, which relies on understanding and accurately modelling the process of a match. The present paper tackles two neglected facets of forecasting in football: Forecasts on the total number of goals and in-play forecasting (forecasts based on within-match information). Sentiment analysis techniques were used to extract the information reflected in almost two million tweets from more than 400 Premier League matches. 
By means of wordclouds and timely analysis of several tweet-based features, the Twitter communication over the full course of matches and shortly before and after goals was visualized and systematically analysed. Moreover, several forecasting models including a random forest model have been used to obtain in-play forecasts. Results suggest that in-play forecasting of goals is highly challenging, and in-play information does not improve forecasting accuracy. An additional analysis of goals from more than 30,000 matches from the main European football leagues supports the notion that the predictive value of in-play information is highly limited compared to pre-game information. This is a relevant result for coaches, match analysts and broadcasters who should not overestimate the value of in-play information. The present study also sheds light on how the perception and behaviour of Twitter users change over the course of a football match. A main result is that the sentiment of Twitter users decreases when the match progresses, which might be caused by an unjustified high expectation of football fans before the match.</abstract><cop>Vienna</cop><pub>Springer Vienna</pub><pmid>34976228</pmid><doi>10.1007/s13278-021-00842-z</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-7445-6858</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1869-5450
ispartof Social network analysis and mining, 2022-12, Vol.12 (1), p.23-23, Article 23
issn 1869-5450
1869-5469
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8714875
source ProQuest Central Essentials; ProQuest Central (Alumni Edition); ProQuest Central Student; ProQuest Central Korea; ProQuest Central UK/Ireland; SpringerLink Journals - AutoHoldings; ProQuest Central
subjects Analysis
Applications of Graph Theory and Complex Networks
Big Data
Collaboration
Computer Science
Data analysis
Data mining
Data Mining and Knowledge Discovery
Economics
Football
Forecasting
Gambling
Game Theory
Humanities
Information sources
Investigations
Law
Machine learning
Mathematical models
Methodology of the Social Sciences
Objectives
Original
Original Article
Prediction markets
Sentiment analysis
Soccer
Social and Behav. Sciences
Social networks
Statistics for Social Sciences
Teams
title A big data analysis of Twitter data during premier league matches: do tweets contain information valuable for in-play forecasting of goals in football?
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T12%3A01%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20big%20data%20analysis%20of%20Twitter%20data%20during%20premier%20league%20matches:%20do%20tweets%20contain%20information%20valuable%20for%20in-play%20forecasting%20of%20goals%20in%20football?&rft.jtitle=Social%20network%20analysis%20and%20mining&rft.au=Wunderlich,%20Fabian&rft.date=2022-12-01&rft.volume=12&rft.issue=1&rft.spage=23&rft.epage=23&rft.pages=23-23&rft.artnum=23&rft.issn=1869-5450&rft.eissn=1869-5469&rft_id=info:doi/10.1007/s13278-021-00842-z&rft_dat=%3Cproquest_pubme%3E2919537693%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2919537693&rft_id=info:pmid/34976228&rfr_iscdi=true