What Can Digital Disease Detection Learn from (an External Revision to) Google Flu Trends?


Bibliographic Details
Published in: American Journal of Preventive Medicine, 2014-09, Vol. 47 (3), p. 341-347
Main authors: Santillana, Mauricio, PhD, MS; Zhang, D. Wendong, MA; Althouse, Benjamin M., PhD, ScM; Ayers, John W., PhD, MA
Format: Article
Language: English
Description
Summary:
Background: Google Flu Trends (GFT) claimed to generate real-time, valid predictions of population influenza-like illness (ILI) using search queries, heralding acclaim and replication across public health. However, recent studies have questioned the validity of GFT.
Purpose: To propose an alternative methodology that better realizes the potential of GFT, with collateral value for digital disease detection broadly.
Methods: Our alternative method automatically selects specific queries to monitor and autonomously updates the model each week as new information about CDC-reported ILI becomes available, as developed in 2013. Root mean squared errors (RMSEs) and Pearson correlations comparing predicted ILI (proportion of patient visits indicative of ILI) with subsequently observed ILI were used to judge model performance.
Results: During the height of the H1N1 pandemic (August 2 to December 22, 2009) and the 2012–2013 season (September 30, 2012, to April 12, 2013), GFT's predictions had RMSEs of 0.023 and 0.022 (i.e., hypothetically, if GFT predicted 0.061 ILI one week, it is expected to err by 0.023) and correlations of r = 0.916 and 0.927. Our alternative method had RMSEs of 0.006 and 0.009, and correlations of r = 0.961 and 0.919 for the same periods. Critically, during these important periods, the alternative method yielded more accurate ILI predictions every week, and was typically more accurate during other influenza seasons.
Conclusions: GFT may be inaccurate, but improved methodologic underpinnings can yield accurate predictions. Applying similar methods elsewhere can improve digital disease detection, with broader transparency, improved accuracy, and real-world public health impacts.
ISSN: 0749-3797, 1873-2607
DOI: 10.1016/j.amepre.2014.05.020
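
Illustrative note: the summary judges model performance by the root mean squared error (RMSE) and the Pearson correlation between predicted and subsequently observed ILI. The following is a minimal Python sketch of those two standard metrics only. The function names (`rmse`, `pearson_r`) and the weekly values are hypothetical and made up for illustration; they are not data or code from the article, and the sketch does not reproduce the authors' query-selection or weekly-updating model.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_o)


if __name__ == "__main__":
    # Hypothetical weekly ILI proportions (made up for illustration).
    predicted_ili = [0.020, 0.030, 0.050, 0.040]
    observed_ili = [0.022, 0.028, 0.047, 0.043]
    print("RMSE:", rmse(predicted_ili, observed_ili))
    print("Pearson r:", pearson_r(predicted_ili, observed_ili))
```

RMSE is expressed in the same units as ILI (a proportion of patient visits), which is why the summary can read an RMSE of 0.023 as the typical size of a weekly error. Pearson r measures only how closely the two series move together, so a method can have a lower RMSE without having a higher correlation.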