Mining Twitter Data to Augment NASA GPM Validation
The Twitter data stream is an important new source of real-time and historical global information for potentially augmenting the validation program of NASA's Global Precipitation Measurement (GPM) mission. There have been other similar uses of Twitter, though mostly related to natural hazards m...
Gespeichert in:
Hauptverfasser: | , , , , , |
---|---|
Format: | Other |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The Twitter data stream is an important new source of real-time and historical global information for potentially augmenting the validation program of NASA's Global Precipitation Measurement (GPM) mission. There have been other similar uses of Twitter, though mostly related to natural hazards monitoring and management. The validation of satellite precipitation estimates is challenging, because many regions lack data or access to data, especially outside of the U.S. and in remote and developing areas. The time-varying set of "precipitation" tweets can be thought of as an organic network of rain gauges, potentially providing a widespread view of precipitation occurrence. Twitter provides a large source of crowd for crowdsourcing. During a 24-hour period in the middle of the snow storm this past March in the U.S. Northeast, we collected more than 13,000 relevant precipitation tweets with exact geolocation. The overall objective of our project is to determine the extent to which processed tweets can provide additional information that improves the validation of GPM data. Though our current effort focuses on tweets and precipitation, our approach is general and applicable to other social media and other geophysical measurements. Specifically, we have developed an operational infrastructure for processing tweets, in a format suitable for analysis with GPM data; engaged with potential participants, both passive and active, to "enrich" the Twitter stream; and inter-compared "precipitation" tweet data, ground station data, and GPM retrievals. In this presentation, we detail the technical capabilities of our tweet processing infrastructure, including data abstraction, feature extraction, search engine, context-awareness, real-time processing, and high volume (big) data processing; various means for "enriching" the Twitter stream; and results of inter-comparisons. Our project should bring a new kind of visibility to Twitter and engender a new kind of appreciation of the value of Twitter by the science research communities. |
---|