Community search signatures as foundation features for human-centered geospatial modeling
Main authors: | , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Aggregated relative search frequencies offer a unique composite signal
reflecting people's habits, concerns, interests, intents, and general
information needs, which are not found in other readily available datasets.
Temporal search trends have been successfully used in time series modeling
across a variety of domains such as infectious diseases, unemployment rates,
and retail sales. However, most existing applications require curating
specialized datasets of individual keywords, queries, or query clusters, and
the search data need to be temporally aligned with the outcome variable of
interest. We propose a novel approach for generating an aggregated and
anonymized representation of search interest as foundation features at the
community level for geospatial modeling. We benchmark these features using
spatial datasets across multiple domains. In zip codes with a population
greater than 3000 that cover over 95% of the contiguous US population, our
models for predicting missing values in a 20% set of holdout counties achieve
an average $R^2$ score of 0.74 across 21 health variables, and 0.80 across 6
demographic and environmental variables. Our results demonstrate that these
search features can be used for spatial predictions without strict temporal
alignment, and that the resulting models outperform spatial interpolation and
state-of-the-art methods using satellite imagery features. |
DOI: | 10.48550/arxiv.2410.22721 |
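
The summary reports $R^2$ scores on a 20% set of holdout counties. The sketch below illustrates one way such a spatial holdout evaluation could be set up. It is not the authors' code. It assumes each row is a zip code tagged with a county identifier, and the names `county_holdout_split` and `r2_score` are hypothetical helpers introduced only for illustration. The record does not describe the model or the exact split procedure, so both are left out or assumed here.

```python
import random


def county_holdout_split(county_ids, holdout_frac=0.2, seed=0):
    """Split row indices so that all rows of a held-out county go to the test set.

    Assumption: rows are zip codes and `county_ids[i]` is the county of row i.
    """
    counties = sorted(set(county_ids))
    rng = random.Random(seed)
    rng.shuffle(counties)
    # Hold out a fraction of counties (at least one), not a fraction of rows.
    n_holdout = max(1, round(holdout_frac * len(counties)))
    holdout = set(counties[:n_holdout])
    train = [i for i, c in enumerate(county_ids) if c not in holdout]
    test = [i for i, c in enumerate(county_ids) if c in holdout]
    return train, test


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy data: ten zip codes spread over five counties.
    county_ids = ["A", "A", "B", "B", "C", "C", "D", "D", "E", "E"]
    train_idx, test_idx = county_holdout_split(county_ids)
    print("train rows:", train_idx)
    print("test rows:", test_idx)
```

Splitting by county rather than by row keeps neighboring zip codes from the same county out of the training set, so the score reflects prediction for areas with no observed values.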