An Analysis of Statistical Techniques Applying to Multi-Feature Similarity Comparison between Corpora


Bibliographic details
Published in: Applied Mechanics and Materials, 2011-07, Vol. 66-68, p. 2323-2329
Main authors: Ge, Shi Li; Chen, Xiao Xiao; Lin, Min
Format: Article
Language: English
Online access: Full text
Description
Abstract: Statistical techniques applied to multi-feature similarity comparison belong to the family of goodness-of-fit tests, which includes the chi-square test, the rank correlation test, and the Kolmogorov-Smirnov test (K-S test). Experiments show that both the chi-square independence test and the rank correlation test are sensitive to sample size: as the sample grows, the former readily yields a significant difference and the latter readily yields a significant correlation. Neither result reflects the actual degree of multi-feature similarity between corpora. Only the K-S test, which quantifies a distance between the empirical distribution functions of two samples, achieves the highest statistical effectiveness.
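The two quantities contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure, which this record does not describe. The function names `ks_statistic` and `chi_square_statistic`, and the toy feature counts in the usage note, are assumptions made purely for illustration. The sketch computes the two-sample K-S statistic as the largest gap between two empirical distribution functions, and the Pearson chi-square statistic for a 2 x k table of feature counts.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        f_a = bisect_right(a, x) / len(a)  # share of sample_a that is <= x
        f_b = bisect_right(b, x) / len(b)  # share of sample_b that is <= x
        d = max(d, abs(f_a - f_b))
    return d


def chi_square_statistic(counts_a, counts_b):
    """Pearson chi-square statistic for a 2 x k table of feature counts."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand = total_a + total_b
    stat = 0.0
    for obs_a, obs_b in zip(counts_a, counts_b):
        col = obs_a + obs_b
        if col == 0:
            continue  # a feature absent from both corpora contributes nothing
        for obs, row in ((obs_a, total_a), (obs_b, total_b)):
            expected = row * col / grand
            stat += (obs - expected) ** 2 / expected
    return stat
```

With hypothetical counts such as `[30, 50, 20]` and `[25, 50, 25]`, multiplying every count by 100 multiplies the chi-square statistic by 100 even though the proportions are unchanged, which is consistent with the sample-size sensitivity the abstract reports. The K-S statistic, by contrast, depends only on the two empirical distribution functions, so repeating each observation the same number of times leaves it unchanged.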
ISSN: 1660-9336, 1662-7482
DOI:10.4028/www.scientific.net/AMM.66-68.2323