Crowdsourcing and machine learning approaches for extracting entities indicating potential foodborne outbreaks from social media

Foodborne outbreaks are a serious but preventable threat to public health that often lead to illness, loss of life, significant economic loss, and the erosion of consumer confidence. Understanding how consumers respond when interacting with foods, as well as extracting information from posts on soci...

Bibliographic details
Published in: Scientific reports 2021-11, Vol.11 (1), p.21678-12, Article 21678
Main authors: Tao, Dandan, Zhang, Dongyu, Hu, Ruofan, Rundensteiner, Elke, Feng, Hao
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 12
container_issue 1
container_start_page 21678
container_title Scientific reports
container_volume 11
creator Tao, Dandan
Zhang, Dongyu
Hu, Ruofan
Rundensteiner, Elke
Feng, Hao
description Foodborne outbreaks are a serious but preventable threat to public health that often lead to illness, loss of life, significant economic loss, and the erosion of consumer confidence. Understanding how consumers respond when interacting with foods, as well as extracting information from posts on social media may provide new means of reducing the risks and curtailing the outbreaks. In recent years, Twitter has been employed as a new tool for identifying unreported foodborne illnesses. However, there is a huge gap between the identification of sporadic illnesses and the early detection of a potential outbreak. In this work, the dual-task BERTweet model was developed to identify unreported foodborne illnesses and extract foodborne-illness-related entities from Twitter. Unlike previous methods, our model leveraged the mutually beneficial relationships between the two tasks. The results showed that the F1-score of relevance prediction was 0.87, and the F1-score of entity extraction was 0.61. Key elements such as time, location, and food detected from sentences indicating foodborne illnesses were used to analyze potential foodborne outbreaks in massive historical tweets. A case study on tweets indicating foodborne illnesses showed that the discovered trend is consistent with the true outbreaks that occurred during the same period.
doi_str_mv 10.1038/s41598-021-00766-w
format Article
fullrecord <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_687c6d57923345e88d848bcfb42e1505</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_687c6d57923345e88d848bcfb42e1505</doaj_id><sourcerecordid>2593361013</sourcerecordid><originalsourceid>FETCH-LOGICAL-c540t-9f135235f9a7e35847bc3004c428d8a3f5bdefdb5977f32cf70f34ce222704743</originalsourceid><addsrcrecordid>eNp9kk1v1DAQhiNERau2f4ADisQ5YHvs2LkgoRUflSpxoWfL8cfWS9YOtpctN3463k0p7aW-2HrnnWc8mmma1xi9wwjE-0wxG0SHCO4Q4n3f7V80ZwRR1hEg5OWj92lzmfMG1cPIQPHwqjkFyoEDYWfNn1WKe5PjLmkf1q0Kpt0qfeuDbSerUjiK85xiFW1uXUytvStJ6XKI2FB88VX3wXitjtocy0FWUzVHM8ZUUXFXxmTVjwpIcdvmqA_xrTVeXTQnTk3ZXt7f583N50_fV1-7629frlYfrzvNKCrd4DAwAswNiltggvJRA0JUUyKMUODYaKwzIxs4d0C048gB1ZYQwhHlFM6bq4VrotrIOfmtSr9lVF4ehZjWUqXi9WRlL7juDeMDAaDMilqAilG7kRKLGWKV9WFhzbuxNqFrv0lNT6BPI8HfynX8JQXrxcD7Cnh7D0jx587mIjd1AqH2LwkbAHqMMFQXWVw6xZyTdQ8VMJKHJZDLEsi6BPK4BHJfk948_ttDyr-RVwMshlxDYW3T_9rPYP8C58rAkA</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2593361013</pqid></control><display><type>article</type><title>Crowdsourcing and machine learning approaches for extracting entities indicating potential foodborne outbreaks from social media</title><source>Nature Open Access</source><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Springer Nature OA Free Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Free Full-Text Journals in Chemistry</source><creator>Tao, Dandan ; Zhang, Dongyu ; Hu, Ruofan ; Rundensteiner, Elke ; Feng, Hao</creator><creatorcontrib>Tao, Dandan ; Zhang, Dongyu ; Hu, Ruofan ; Rundensteiner, Elke ; Feng, Hao</creatorcontrib><description>Foodborne outbreaks are a serious but preventable threat to public health that often lead to illness, loss of life, significant economic loss, 
and the erosion of consumer confidence. Understanding how consumers respond when interacting with foods, as well as extracting information from posts on social media may provide new means of reducing the risks and curtailing the outbreaks. In recent years, Twitter has been employed as a new tool for identifying unreported foodborne illnesses. However, there is a huge gap between the identification of sporadic illnesses and the early detection of a potential outbreak. In this work, the dual-task BERTweet model was developed to identify unreported foodborne illnesses and extract foodborne-illness-related entities from Twitter. Unlike previous methods, our model leveraged the mutually beneficial relationships between the two tasks. The results showed that the F1-score of relevance prediction was 0.87, and the F1-score of entity extraction was 0.61. Key elements such as time, location, and food detected from sentences indicating foodborne illnesses were used to analyze potential foodborne outbreaks in massive historical tweets. 
A case study on tweets indicating foodborne illnesses showed that the discovered trend is consistent with the true outbreaks that occurred during the same period.</description><identifier>ISSN: 2045-2322</identifier><identifier>EISSN: 2045-2322</identifier><identifier>DOI: 10.1038/s41598-021-00766-w</identifier><identifier>PMID: 34737325</identifier><language>eng</language><publisher>London: Nature Publishing Group UK</publisher><subject>631/326/2521 ; 639/705/258 ; 692/499 ; 692/699/255/1318 ; Contact Tracing - methods ; Crowdsourcing - methods ; Disease Outbreaks - prevention &amp; control ; Foodborne diseases ; Foodborne Diseases - epidemiology ; Foodborne Diseases - etiology ; Humanities and Social Sciences ; Humans ; Learning algorithms ; Machine Learning ; Models, Theoretical ; multidisciplinary ; Outbreaks ; Population Surveillance - methods ; Public health ; Public Health - methods ; Public Health - trends ; Risk reduction ; Science ; Science (multidisciplinary) ; Social Media - trends ; Social networks</subject><ispartof>Scientific reports, 2021-11, Vol.11 (1), p.21678-12, Article 21678</ispartof><rights>The Author(s) 2021</rights><rights>2021. The Author(s).</rights><rights>The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c540t-9f135235f9a7e35847bc3004c428d8a3f5bdefdb5977f32cf70f34ce222704743</citedby><cites>FETCH-LOGICAL-c540t-9f135235f9a7e35847bc3004c428d8a3f5bdefdb5977f32cf70f34ce222704743</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8568976/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8568976/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,2100,27923,27924,41119,42188,51575,53790,53792</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34737325$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Tao, Dandan</creatorcontrib><creatorcontrib>Zhang, Dongyu</creatorcontrib><creatorcontrib>Hu, Ruofan</creatorcontrib><creatorcontrib>Rundensteiner, Elke</creatorcontrib><creatorcontrib>Feng, Hao</creatorcontrib><title>Crowdsourcing and machine learning approaches for extracting entities indicating potential foodborne outbreaks from social media</title><title>Scientific reports</title><addtitle>Sci Rep</addtitle><addtitle>Sci Rep</addtitle><description>Foodborne outbreaks are a serious but preventable threat to public health that often lead to illness, loss of life, significant economic loss, and the erosion of consumer confidence. Understanding how consumers respond when interacting with foods, as well as extracting information from posts on social media may provide new means of reducing the risks and curtailing the outbreaks. 
In recent years, Twitter has been employed as a new tool for identifying unreported foodborne illnesses. However, there is a huge gap between the identification of sporadic illnesses and the early detection of a potential outbreak. In this work, the dual-task BERTweet model was developed to identify unreported foodborne illnesses and extract foodborne-illness-related entities from Twitter. Unlike previous methods, our model leveraged the mutually beneficial relationships between the two tasks. The results showed that the F1-score of relevance prediction was 0.87, and the F1-score of entity extraction was 0.61. Key elements such as time, location, and food detected from sentences indicating foodborne illnesses were used to analyze potential foodborne outbreaks in massive historical tweets. A case study on tweets indicating foodborne illnesses showed that the discovered trend is consistent with the true outbreaks that occurred during the same period.</description><subject>631/326/2521</subject><subject>639/705/258</subject><subject>692/499</subject><subject>692/699/255/1318</subject><subject>Contact Tracing - methods</subject><subject>Crowdsourcing - methods</subject><subject>Disease Outbreaks - prevention &amp; control</subject><subject>Foodborne diseases</subject><subject>Foodborne Diseases - epidemiology</subject><subject>Foodborne Diseases - etiology</subject><subject>Humanities and Social Sciences</subject><subject>Humans</subject><subject>Learning algorithms</subject><subject>Machine Learning</subject><subject>Models, Theoretical</subject><subject>multidisciplinary</subject><subject>Outbreaks</subject><subject>Population Surveillance - methods</subject><subject>Public health</subject><subject>Public Health - methods</subject><subject>Public Health - trends</subject><subject>Risk reduction</subject><subject>Science</subject><subject>Science (multidisciplinary)</subject><subject>Social Media - trends</subject><subject>Social 
networks</subject><issn>2045-2322</issn><issn>2045-2322</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>DOA</sourceid><recordid>eNp9kk1v1DAQhiNERau2f4ADisQ5YHvs2LkgoRUflSpxoWfL8cfWS9YOtpctN3463k0p7aW-2HrnnWc8mmma1xi9wwjE-0wxG0SHCO4Q4n3f7V80ZwRR1hEg5OWj92lzmfMG1cPIQPHwqjkFyoEDYWfNn1WKe5PjLmkf1q0Kpt0qfeuDbSerUjiK85xiFW1uXUytvStJ6XKI2FB88VX3wXitjtocy0FWUzVHM8ZUUXFXxmTVjwpIcdvmqA_xrTVeXTQnTk3ZXt7f583N50_fV1-7629frlYfrzvNKCrd4DAwAswNiltggvJRA0JUUyKMUODYaKwzIxs4d0C048gB1ZYQwhHlFM6bq4VrotrIOfmtSr9lVF4ehZjWUqXi9WRlL7juDeMDAaDMilqAilG7kRKLGWKV9WFhzbuxNqFrv0lNT6BPI8HfynX8JQXrxcD7Cnh7D0jx587mIjd1AqH2LwkbAHqMMFQXWVw6xZyTdQ8VMJKHJZDLEsi6BPK4BHJfk948_ttDyr-RVwMshlxDYW3T_9rPYP8C58rAkA</recordid><startdate>20211104</startdate><enddate>20211104</enddate><creator>Tao, Dandan</creator><creator>Zhang, Dongyu</creator><creator>Hu, Ruofan</creator><creator>Rundensteiner, Elke</creator><creator>Feng, Hao</creator><general>Nature Publishing Group UK</general><general>Nature Publishing Group</general><general>Nature 
Portfolio</general><scope>C6C</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2P</scope><scope>M7P</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20211104</creationdate><title>Crowdsourcing and machine learning approaches for extracting entities indicating potential foodborne outbreaks from social media</title><author>Tao, Dandan ; Zhang, Dongyu ; Hu, Ruofan ; Rundensteiner, Elke ; Feng, Hao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c540t-9f135235f9a7e35847bc3004c428d8a3f5bdefdb5977f32cf70f34ce222704743</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>631/326/2521</topic><topic>639/705/258</topic><topic>692/499</topic><topic>692/699/255/1318</topic><topic>Contact Tracing - methods</topic><topic>Crowdsourcing - methods</topic><topic>Disease Outbreaks - prevention &amp; control</topic><topic>Foodborne diseases</topic><topic>Foodborne Diseases - epidemiology</topic><topic>Foodborne Diseases - etiology</topic><topic>Humanities and Social Sciences</topic><topic>Humans</topic><topic>Learning algorithms</topic><topic>Machine Learning</topic><topic>Models, 
Theoretical</topic><topic>multidisciplinary</topic><topic>Outbreaks</topic><topic>Population Surveillance - methods</topic><topic>Public health</topic><topic>Public Health - methods</topic><topic>Public Health - trends</topic><topic>Risk reduction</topic><topic>Science</topic><topic>Science (multidisciplinary)</topic><topic>Social Media - trends</topic><topic>Social networks</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tao, Dandan</creatorcontrib><creatorcontrib>Zhang, Dongyu</creatorcontrib><creatorcontrib>Hu, Ruofan</creatorcontrib><creatorcontrib>Rundensteiner, Elke</creatorcontrib><creatorcontrib>Feng, Hao</creatorcontrib><collection>Springer Nature OA Free Journals</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science 
Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Science Database</collection><collection>Biological Science Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>Scientific reports</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tao, Dandan</au><au>Zhang, Dongyu</au><au>Hu, Ruofan</au><au>Rundensteiner, Elke</au><au>Feng, Hao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Crowdsourcing and machine learning approaches for extracting entities indicating potential foodborne outbreaks from social media</atitle><jtitle>Scientific reports</jtitle><stitle>Sci Rep</stitle><addtitle>Sci Rep</addtitle><date>2021-11-04</date><risdate>2021</risdate><volume>11</volume><issue>1</issue><spage>21678</spage><epage>12</epage><pages>21678-12</pages><artnum>21678</artnum><issn>2045-2322</issn><eissn>2045-2322</eissn><abstract>Foodborne outbreaks are a serious but preventable threat to public health that often lead to illness, loss of 
life, significant economic loss, and the erosion of consumer confidence. Understanding how consumers respond when interacting with foods, as well as extracting information from posts on social media may provide new means of reducing the risks and curtailing the outbreaks. In recent years, Twitter has been employed as a new tool for identifying unreported foodborne illnesses. However, there is a huge gap between the identification of sporadic illnesses and the early detection of a potential outbreak. In this work, the dual-task BERTweet model was developed to identify unreported foodborne illnesses and extract foodborne-illness-related entities from Twitter. Unlike previous methods, our model leveraged the mutually beneficial relationships between the two tasks. The results showed that the F1-score of relevance prediction was 0.87, and the F1-score of entity extraction was 0.61. Key elements such as time, location, and food detected from sentences indicating foodborne illnesses were used to analyze potential foodborne outbreaks in massive historical tweets. A case study on tweets indicating foodborne illnesses showed that the discovered trend is consistent with the true outbreaks that occurred during the same period.</abstract><cop>London</cop><pub>Nature Publishing Group UK</pub><pmid>34737325</pmid><doi>10.1038/s41598-021-00766-w</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2045-2322
ispartof Scientific reports, 2021-11, Vol.11 (1), p.21678-12, Article 21678
issn 2045-2322
2045-2322
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_687c6d57923345e88d848bcfb42e1505
source Nature Open Access; MEDLINE; DOAJ Directory of Open Access Journals; Springer Nature OA Free Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central; Free Full-Text Journals in Chemistry
subjects 631/326/2521
639/705/258
692/499
692/699/255/1318
Contact Tracing - methods
Crowdsourcing - methods
Disease Outbreaks - prevention & control
Foodborne diseases
Foodborne Diseases - epidemiology
Foodborne Diseases - etiology
Humanities and Social Sciences
Humans
Learning algorithms
Machine Learning
Models, Theoretical
multidisciplinary
Outbreaks
Population Surveillance - methods
Public health
Public Health - methods
Public Health - trends
Risk reduction
Science
Science (multidisciplinary)
Social Media - trends
Social networks
title Crowdsourcing and machine learning approaches for extracting entities indicating potential foodborne outbreaks from social media
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T13%3A33%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Crowdsourcing%20and%20machine%20learning%20approaches%20for%20extracting%20entities%20indicating%20potential%20foodborne%20outbreaks%20from%20social%20media&rft.jtitle=Scientific%20reports&rft.au=Tao,%20Dandan&rft.date=2021-11-04&rft.volume=11&rft.issue=1&rft.spage=21678&rft.epage=12&rft.pages=21678-12&rft.artnum=21678&rft.issn=2045-2322&rft.eissn=2045-2322&rft_id=info:doi/10.1038/s41598-021-00766-w&rft_dat=%3Cproquest_doaj_%3E2593361013%3C/proquest_doaj_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2593361013&rft_id=info:pmid/34737325&rft_doaj_id=oai_doaj_org_article_687c6d57923345e88d848bcfb42e1505&rfr_iscdi=true