Crowdsourcing and machine learning approaches for extracting entities indicating potential foodborne outbreaks from social media
Foodborne outbreaks are a serious but preventable threat to public health that often lead to illness, loss of life, significant economic loss, and the erosion of consumer confidence. Understanding how consumers respond when interacting with foods, as well as extracting information from posts on soci...
Published in: | Scientific reports 2021-11, Vol.11 (1), p.21678-12, Article 21678 |
---|---|
Main authors: | Tao, Dandan; Zhang, Dongyu; Hu, Ruofan; Rundensteiner, Elke; Feng, Hao |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 12 |
---|---|
container_issue | 1 |
container_start_page | 21678 |
container_title | Scientific reports |
container_volume | 11 |
creator | Tao, Dandan; Zhang, Dongyu; Hu, Ruofan; Rundensteiner, Elke; Feng, Hao |
description | Foodborne outbreaks are a serious but preventable threat to public health that often lead to illness, loss of life, significant economic loss, and the erosion of consumer confidence. Understanding how consumers respond when interacting with foods, as well as extracting information from posts on social media may provide new means of reducing the risks and curtailing the outbreaks. In recent years, Twitter has been employed as a new tool for identifying unreported foodborne illnesses. However, there is a huge gap between the identification of sporadic illnesses and the early detection of a potential outbreak. In this work, the dual-task BERTweet model was developed to identify unreported foodborne illnesses and extract foodborne-illness-related entities from Twitter. Unlike previous methods, our model leveraged the mutually beneficial relationships between the two tasks. The results showed that the F1-score of relevance prediction was 0.87, and the F1-score of entity extraction was 0.61. Key elements such as time, location, and food detected from sentences indicating foodborne illnesses were used to analyze potential foodborne outbreaks in massive historical tweets. A case study on tweets indicating foodborne illnesses showed that the discovered trend is consistent with the true outbreaks that occurred during the same period. |
doi_str_mv | 10.1038/s41598-021-00766-w |
format | Article |
pmid | 34737325 |
publisher | London: Nature Publishing Group UK |
date | 2021-11-04 |
tpages | 12 |
rights | The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
toplevel | peer_reviewed; online_resources |
oa | free_for_read |
linktohtml | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8568976/ |
linktopdf | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8568976/pdf/ |
backlink | https://www.ncbi.nlm.nih.gov/pubmed/34737325 |
fulltext | fulltext |
identifier | ISSN: 2045-2322 |
ispartof | Scientific reports, 2021-11, Vol.11 (1), p.21678-12, Article 21678 |
issn | 2045-2322 (ISSN); 2045-2322 (EISSN) |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_687c6d57923345e88d848bcfb42e1505 |
source | Nature Open Access; MEDLINE; DOAJ Directory of Open Access Journals; Springer Nature OA Free Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central; Free Full-Text Journals in Chemistry |
subjects | 631/326/2521; 639/705/258; 692/499; 692/699/255/1318; Contact Tracing - methods; Crowdsourcing - methods; Disease Outbreaks - prevention & control; Foodborne diseases; Foodborne Diseases - epidemiology; Foodborne Diseases - etiology; Humanities and Social Sciences; Humans; Learning algorithms; Machine Learning; Models, Theoretical; multidisciplinary; Outbreaks; Population Surveillance - methods; Public health; Public Health - methods; Public Health - trends; Risk reduction; Science; Science (multidisciplinary); Social Media - trends; Social networks |
title | Crowdsourcing and machine learning approaches for extracting entities indicating potential foodborne outbreaks from social media |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T13%3A33%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Crowdsourcing%20and%20machine%20learning%20approaches%20for%20extracting%20entities%20indicating%20potential%20foodborne%20outbreaks%20from%20social%20media&rft.jtitle=Scientific%20reports&rft.au=Tao,%20Dandan&rft.date=2021-11-04&rft.volume=11&rft.issue=1&rft.spage=21678&rft.epage=12&rft.pages=21678-12&rft.artnum=21678&rft.issn=2045-2322&rft.eissn=2045-2322&rft_id=info:doi/10.1038/s41598-021-00766-w&rft_dat=%3Cproquest_doaj_%3E2593361013%3C/proquest_doaj_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2593361013&rft_id=info:pmid/34737325&rft_doaj_id=oai_doaj_org_article_687c6d57923345e88d848bcfb42e1505&rfr_iscdi=true |
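
The description reports F1-scores of 0.87 (relevance prediction) and 0.61 (entity extraction). As a reading aid, the sketch below shows how an F1-score is obtained from true-positive, false-positive, and false-negative counts. It is only an illustration: the record does not say how the authors computed their scores (for example, token-level versus entity-level matching, or which averaging was used), and the function name `f1_score` and the counts passed to it are hypothetical.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score, the harmonic mean of precision and recall.

    Algebraically, F1 = 2*TP / (2*TP + FP + FN).
    Returns 0.0 when there are no positives at all, to avoid dividing by zero.
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts, chosen only to illustrate the arithmetic;
# they are NOT taken from the article.
example = f1_score(tp=87, fp=13, fn=13)
print(round(example, 2))
```

With these made-up counts, precision and recall are both 87/100, so their harmonic mean is the same value. Real evaluations usually have unequal precision and recall, in which case F1 falls between the two and sits closer to the lower one.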