The reliability of tweets as a supplementary method of seasonal influenza surveillance

Bibliographic details
Published in: Journal of medical Internet research 2014-11, Vol. 16 (11), p. e250-e250
Main authors: Aslam, Anoshé A, Tsou, Ming-Hsiang, Spitzberg, Brian H, An, Li, Gawron, J Mark, Gupta, Dipak K, Peddecord, K Michael, Nagel, Anna C, Allen, Christopher, Yang, Jiue-An, Lindsay, Suzanne
Format: Article
Language: English
Description
Summary: Existing influenza surveillance in the United States focuses on the collection of data from sentinel physicians and hospitals; however, the compilation and distribution of reports are usually delayed by up to 2 weeks. With the growing popularity of social media, the Internet has become a source for syndromic surveillance because of the large amounts of data available. In this study, tweets, or posts of 140 characters or fewer, from the website Twitter were collected and analyzed for their potential as surveillance for seasonal influenza. There were three aims: (1) to improve the correlation of tweets with sentinel-provided influenza-like illness (ILI) rates by city through filtering and a machine-learning classifier, (2) to observe correlations of tweets with emergency department ILI rates by city, and (3) to explore correlations of tweets with laboratory-confirmed influenza cases in San Diego. Tweets containing the keyword "flu" were collected within a 17-mile radius of 11 US cities selected for population and availability of ILI data. At the end of the collection period, 159,802 tweets were used for correlation analyses with sentinel-provided ILI and emergency department ILI rates as reported by the corresponding city or county health department. Two separate methods were used to observe correlations between tweets and ILI rates: filtering the tweets by type (non-retweets, retweets, tweets with a URL, tweets without a URL), and using a machine-learning classifier that determined whether a tweet was "valid", that is, from a user who was likely ill with the flu. Correlations varied by city, but general trends were observed. Non-retweets and tweets without a URL had higher and more significant (P
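The filter-by-type step described in the summary can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `RT @` retweet heuristic, the `(week, text)` tweet representation, the function names, and all data values are assumptions made up for illustration, and Pearson's r is used only as one plausible correlation measure, since the record does not say which coefficient the study applied.

```python
from collections import Counter
from math import sqrt


def tweet_types(text):
    """Return the two filter categories a tweet falls into.

    Assumption: a retweet starts with "RT @" and a URL is any http(s) link.
    """
    is_retweet = text.startswith("RT @")
    has_url = "http://" in text or "https://" in text
    return {
        "retweet" if is_retweet else "non-retweet",
        "with-url" if has_url else "without-url",
    }


def weekly_counts(tweets, category, weeks):
    """Count tweets of one category per week, in the order given by `weeks`."""
    counts = Counter(
        week for week, text in tweets if category in tweet_types(text)
    )
    return [counts[week] for week in weeks]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: (week number, tweet text). Not from the study.
tweets = [
    (1, "I have the flu"),
    (2, "flu is awful"),
    (2, "stuck in bed with the flu"),
    (3, "day three of this flu"),
    (3, "flu again, staying home"),
    (3, "my whole office has the flu"),
    (3, "RT @news: flu season update http://example.com"),
    (4, "still recovering from the flu"),
    (4, "flu finally fading"),
]
weeks = [1, 2, 3, 4]
ili_rates = [1.0, 2.0, 3.0, 2.0]  # made-up weekly ILI percentages

for category in ("non-retweet", "without-url"):
    series = weekly_counts(tweets, category, weeks)
    print(category, series, round(pearson_r(series, ili_rates), 3))
```

Computing a separate weekly series for each category makes it possible to compare how closely each tweet type tracks the reported ILI rates for a city.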
ISSN: 1438-8871, 1439-4456
DOI: 10.2196/jmir.3532