Validation of an efficient non-negative matrix factorization method and its preliminary application in Central California

Bibliographic details
Published in: Atmospheric environment (1994), 2006-04, Vol. 40 (11), p. 1991-2001
Main authors: Liang, Jinyou; Fairley, David
Format: Article
Language: English
Online access: Full text
Description
Summary: Positive matrix factorization (PMF) techniques have been applied in many environmental studies. The commercial version of the PMF method has a relatively moderate practical limit on the size of the input data matrix, since the computer memory and time it needs increase quadratically with the number of elements of the solution matrices. To extend the application of PMF techniques to large data sets, we exercised alternative methods that demand less computer memory and time. One such method, called non-negative matrix factorization (NMF) here, is extremely memory efficient compared with the commercial PMF method. Both the NMF and PMF methods are sensitive to the initialization of the solution matrices; random initialization usually starts with a large prediction error and requires a number of model runs with different random seeds. A novel chemical mass balance method (ROC) is introduced here to provide a reasonable initialization for the NMF method for large data sets. Both the NMF and ROC methods were validated with an ideal Cross example and the benchmark example of the commercial PMF method. The NMF-ROC method was further evaluated, in terms of computer time and prediction error, in a preliminary application to a data set of particle-phase polar organic compounds analyzed for a number of samples collected in Central California during the California Regional PM10/PM2.5 Air Quality Study (CRPAQS, 1999–2001). The NMF-ROC method was demonstrated to perform better than the NMF, PMF and PMF-ROC methods on the CRPAQS data set. This performance enhancement is expected to be magnified for larger data sets.
ISSN: 1352-2310, 1873-2844
DOI: 10.1016/j.atmosenv.2005.11.035
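
The summary's point that a randomly initialized factorization must be run with several seeds can be illustrated with a minimal sketch. The record does not state which NMF algorithm the article uses, so the code below assumes the generic multiplicative-update scheme for the Frobenius-norm objective; the function name `nmf_multiplicative`, the synthetic matrix `X`, and all parameter values are hypothetical. The ROC initialization is not reproduced here because the record does not describe it.

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=500, seed=0, eps=1e-12):
    """Approximate non-negative X (n x m) as G @ F with G, F >= 0.

    Assumption: generic multiplicative updates, not necessarily the
    algorithm used in the article. Only X, G and F are kept in memory.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Random non-negative starting point; the result depends on the seed.
    G = rng.random((n, k)) + eps
    F = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Element-wise updates keep both factors non-negative.
        F *= (G.T @ X) / (G.T @ G @ F + eps)
        G *= (X @ F.T) / (G @ F @ F.T + eps)
    err = np.linalg.norm(X - G @ F)  # Frobenius-norm prediction error
    return G, F, err

# Synthetic (hypothetical) data: 30 samples, 8 species, 3 sources.
data_rng = np.random.default_rng(42)
X = data_rng.random((30, 3)) @ data_rng.random((3, 8))

# Run several seeds and keep the one with the smallest prediction error.
errors = {seed: nmf_multiplicative(X, k=3, seed=seed)[2] for seed in range(5)}
best_seed = min(errors, key=errors.get)
print(best_seed, errors[best_seed])
```

Replacing the random starting matrices with an informed initial guess, which is the role the summary assigns to ROC, would remove the need for the loop over seeds.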