Determining Research Priorities Using Machine Learning
Main authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Summary: | We summarize our exploratory investigation into whether Machine Learning (ML)
techniques applied to publicly available professional text can substantially
augment strategic planning for astronomy. We find that an approach based on
Latent Dirichlet Allocation (LDA) using content drawn from astronomy journal
papers can be used to infer high-priority research areas. While the LDA models
are challenging to interpret, we find that they may be strongly associated with
meaningful keywords and scientific papers which allow for human interpretation
of the topic models.
Significant correlation is found between the results of applying these models
to the previous decade of astronomical research ("1998-2010" corpus) and the
contents of the science frontier panel report which contains high-priority
research areas identified by the 2010 National Academies' Astronomy and
Astrophysics Decadal Survey ("DS2010" corpus). Significant correlations also
exist between model results of the 1998-2010 corpus and the submitted
whitepapers to the Decadal Survey ("whitepapers" corpus). Importantly, we
derive predictive metrics based on these results which can provide leading
indicators of which content modeled by the topic models will become highly
cited in the future. Using these metrics and the associations between papers
and topic models, it is possible to identify important papers for planners to
consider.
A preliminary version of our work was presented by Thronson et al. (2021) and
Thomas et al. (2022). |
---|---|
DOI: | 10.48550/arxiv.2407.02533 |
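
The summary above names Latent Dirichlet Allocation (LDA) as the modeling approach but does not describe the implementation. The following is a minimal sketch of LDA fitted with a collapsed Gibbs sampler, using only the Python standard library. It is not the authors' pipeline: the function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the four tiny token lists are all hypothetical and chosen purely for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative only).

    docs: list of documents, each a list of string tokens.
    Returns (theta, top_words): per-document topic proportions and
    the three highest-count words for each topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    # Count tables updated during sampling.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_vocab for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token
                # (unnormalized; random.choices normalizes the weights).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [vocab[v] for v in sorted(range(n_vocab), key=lambda v: -topic_word[k][v])[:3]]
        for k in range(n_topics)
    ]
    return theta, top_words


# Hypothetical toy corpus; real input would be tokenized journal text.
docs = [
    ["galaxy", "redshift", "cluster", "galaxy", "redshift"],
    ["exoplanet", "transit", "atmosphere", "exoplanet"],
    ["galaxy", "cluster", "lensing", "redshift"],
    ["transit", "atmosphere", "exoplanet", "spectrum"],
]
theta, top_words = fit_lda(docs, n_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

Each row of `theta` gives one document's estimated mix of topics, and `top_words` lists the keywords most associated with each topic, which is the kind of keyword association the summary describes as enabling human interpretation. A real analysis would more likely use an established library and a much larger corpus.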