Influenza-like illness surveillance on Twitter through automated learning of naïve language

Bibliographic details
Published in:	PLoS ONE 2013-12, Vol. 8 (12), p. e82489
Main authors:	Gesualdo, Francesco; Stilo, Giovanni; Agricola, Eleonora; Gonfiantini, Michaela V; Pandolfi, Elisabetta; Velardi, Paola; Tozzi, Alberto E
Format:	Article
Language:	English
Subjects:	Algorithms; Boolean algebra; Communication; Computer Simulation; Correlation coefficient; Correlation coefficients; Disease control; Disease prevention; Hospitals; Humans; Influenza; Influenza, Human - epidemiology; Informatics; Internet; Keywords; Language; Localization; Pandemics; Population Surveillance - methods; Public health; Social networks; Surveillance; Surveillance systems; Swine flu; Terminology; Terminology as Topic; Trends
Online access:	Full text

Description
Abstract:	Twitter has the potential to be a timely and cost-effective source of data for syndromic surveillance. When speaking of an illness, Twitter users often report a combination of symptoms, rather than a suspected or final diagnosis, using naïve, everyday language. We developed a minimally trained algorithm that exploits the abundance of health-related web pages to identify all jargon expressions related to a specific technical term. We then translated an influenza case definition into a Boolean query, each symptom being described by a technical term and all related jargon expressions, as identified by the algorithm. Subsequently, we monitored all tweets that reported a combination of symptoms satisfying the case definition query. In order to geolocalize messages, we defined 3 localization strategies based on codes associated with each tweet. We found a high correlation coefficient between the trend of our influenza-positive tweets and ILI trends identified by US traditional surveillance systems.
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0082489
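
Illustration: the abstract describes turning a case definition into a Boolean query in which each symptom is matched by its technical term or any related everyday expression. The Python sketch below shows one way such a query could be evaluated against a single message. The symptom lexicon, the case definition (fever AND (cough OR sore throat)), and the function names are assumptions made for illustration; they are not taken from the article, whose expression lists are learned from health-related web pages.

```python
import re

# Hypothetical lexicon: technical term -> everyday expressions.
# Illustrative assumptions only, not the article's learned output.
SYMPTOM_TERMS = {
    "fever": ["fever", "high temperature", "burning up"],
    "cough": ["cough", "coughing", "hacking"],
    "sore throat": ["sore throat", "throat hurts", "scratchy throat"],
}


def _compile(expressions):
    """Build one case-insensitive, whole-word pattern for a symptom."""
    alternation = "|".join(re.escape(e) for e in expressions)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


PATTERNS = {term: _compile(exprs) for term, exprs in SYMPTOM_TERMS.items()}


def symptoms_in(text):
    """Return the set of technical terms whose expressions occur in text."""
    return {term for term, pattern in PATTERNS.items() if pattern.search(text)}


def matches_case_definition(text):
    """Assumed case definition: fever AND (cough OR sore throat)."""
    found = symptoms_in(text)
    return "fever" in found and bool(found & {"cough", "sore throat"})
```

A message counts as positive only when the whole Boolean combination holds, so a post mentioning a single symptom is not counted. Negation, irony, and reports about other people are not handled in this sketch.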