A technique for analyzing optimal relationships among multiple sets of data fields. Part I: The method

Bibliographic details
Published in: Monthly Weather Review 1994-01, Vol. 122 (11), p. 2482-2493
Main authors: Chen, Jeng-Ming; Chang, C-P; Harr, Patrick A
Format: Article
Language: English
Online access: Full text
Description
Abstract: A multiple-set canonical correlation analysis (MCCA), which can be used to study atmospheric motions by analyzing the relationships among more than two sets of data fields, is proposed. By using the product or squared product of correlation matrices as the optimization criterion, this method generalizes the two-set canonical correlation analysis (CCA) and reduces the complications associated with the supermatrix approaches previously proposed in statistical textbooks. The final optimization equations can be greatly simplified to involve weighting functions of one field at a time. Furthermore, excluding or emphasizing correlations between special field pairs based on physical considerations can be easily implemented. The method is identical to a supermatrix approach based on maximizing the product of canonical correlation coefficients when the individual canonical correlation matrices are perfectly diagonal. This would be true for idealized data that contain only orthogonal motion systems so that all datasets are perfectly correlated. In such a case, all supermatrix methods will also converge to the same solution. In real cases, cross-component correlations will occur, and their largest values, called largest residual correlations (LRCs), are a crude measure of the validity of the approximation. When LRCs are small compared to the corresponding canonical correlation coefficients, the results are reliable. Otherwise, solutions of different methods diverge and are all doubtful. A statistical textbook example illustrates that solutions obtained are comparable to those from the supermatrix methods, and the relative LRCs are about 20%. A meteorological application example shows that, compared to the two-set CCA, the proposed MCCA gives a more powerful concentration of variance in the leading modes and higher canonical correlation coefficients. The resultant relative LRCs are small throughout all leading modes, apparently because meteorological data contain highly correlated variations. The proposed technique may also be applied to the singular-value decomposition analysis to allow a multiple-set singular-value decomposition analysis to be used on more than two sets of data fields.
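
For orientation, the two-set CCA that the abstract says MCCA generalizes can be sketched in a few lines of NumPy. This is the standard textbook two-set procedure (QR whitening followed by an SVD), not the authors' MCCA algorithm, which this record does not spell out. The function names `cca` and `largest_residual` are hypothetical, and `largest_residual` is only an assumed reading of the "largest residual correlation" idea: the largest off-diagonal entry of the correlation matrix between two sets of canonical variates.

```python
import numpy as np

def cca(X, Y):
    """Standard two-set CCA (not MCCA). X: (n, p), Y: (n, q); rows are samples.

    Assumes n >= max(p, q) and full column rank in both sets.
    Returns weights A (p, k), B (q, k) and correlations s (k,), k = min(p, q).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Thin QR gives an orthonormal basis for each centered data set.
    Qx, Rx = np.linalg.qr(X)
    Qy, Ry = np.linalg.qr(Y)
    # Singular values of Qx^T Qy are the canonical correlations.
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    A = np.linalg.solve(Rx, U)      # weighting functions for X
    B = np.linalg.solve(Ry, Vt.T)   # weighting functions for Y
    return A, B, np.clip(s, 0.0, 1.0)

def largest_residual(U1, U2):
    """Hypothetical helper: largest off-diagonal |correlation| between the
    columns of two canonical-variate matrices with the same number of columns."""
    k = U1.shape[1]
    C = np.corrcoef(U1, U2, rowvar=False)[:k, k:]   # cross-correlation block
    off_diag = C - np.diag(np.diag(C))
    return np.abs(off_diag).max()
```

In the two-set case the cross-correlation matrix of the canonical variates is diagonal, so `largest_residual` returns zero up to rounding. With more than two sets, no single choice of weights can in general diagonalize every pairwise matrix at once, which is the situation the abstract's LRC diagnostic is meant to quantify.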
ISSN: 0027-0644
DOI: 10.1175/1520-0493(1994)122<2482:atfaor>2.0.co;2