Active Learning to Overcome Sample Selection Bias: Application to Photometric Variable Star Classification

Bibliographic details
Published in: Astrophysical journal. Letters 2012-01, Vol.744 (2), p.1-19
Main authors: Richards, Joseph W; Starr, Dan L; Brink, Henrik; Miller, Adam A; Bloom, Joshua S; Butler, Nathaniel R; James, J Berian; Long, James P; Rice, John
Format: Article
Language: English
Online access: Full text
Description
Summary: Despite the great promise of machine-learning algorithms to classify and predict astrophysical parameters for the vast numbers of astrophysical sources and transients observed in large-scale surveys, the peculiarities of the training data often manifest as strongly biased predictions on the data of interest. Typically, training sets are derived from historical surveys of brighter, more nearby objects than those from more extensive, deeper surveys (testing data). We explore possible remedies to sample selection bias, including importance weighting, co-training, and active learning (AL). We argue that AL -- where the data whose inclusion in the training set would most improve predictions on the testing set are queried for manual follow-up -- is an effective approach and is appropriate for many astronomical applications. To aid with manual labeling of variable stars, we developed a Web interface that allows for easy light-curve visualization and querying of external databases.
ISSN: 2041-8205, 0004-637X, 2041-8213, 1538-4357
DOI:10.1088/0004-637X/744/2/192