An Empirical Study on Document Similarity Comparison Evaluation Between Machine Learning Techniques and Human Experts

Bibliographic Details
Published in:	Tehnički vjesnik 2024-10, Vol.31 (5), p.1668-1679
First author:	Jang, Won-Jung
Format:	Article
Language:	English
Subjects:	Algorithms; ANN model; Computational linguistics; count-based model; document similarity; ensemble learning model; Language processing; Machine learning; Methods; Natural language interfaces; Neural networks

Description
Abstract:	Current machine-learning training focuses solely on accuracy. Rather than measuring only the accuracy of machine learning, this study examines the weights of other dimensions. By comparing the decision-making of machine learning models and humans in various fields, it examines how well an organization's vision is propagated to its lower levels, and the results evaluated by humans and by machine learning models are compared from multiple perspectives. Words were represented numerically using count-based models (Bag of Words, TF-IDF), artificial neural network (ANN) models (Word2Vec, GloVe), and a vision propagation measurement (VPMS) model that combines the two approaches. Each representation was used to calculate the similarity between documents, and the resulting similarities were compared with the results measured by an expert group. The findings can serve as an evaluation metric for how effectively the vision of the upper organization is disseminated to lower-level organizations. They could also be used in developing algorithms such as customer segmentation for target marketing using text data. The study makes two key contributions: (i) an extensive empirical comparison of document similarity analysis by different ML techniques versus human experts, and (ii) a new VPMS model that outperforms existing methods.
ISSN:	1330-3651 1848-6339
DOI:	10.17559/TV-20231011001013
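
Illustration: the abstract names TF-IDF as one of the count-based representations used to score similarity between documents. The sketch below shows how such a score could be computed with TF-IDF weights and cosine similarity. It is not the paper's VPMS model or its actual pipeline. The function names, the smoothed IDF formula, the tokenizer, and the example texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Smoothed IDF is an assumption; the paper's weighting is not given.
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical texts: an upper-level vision statement and two lower-level plans.
docs = [
    "Deliver reliable and sustainable energy to every customer",
    "Our team will deliver reliable energy to each customer in the region",
    "The cafeteria menu changes every Monday",
]
vectors = tfidf_vectors(docs)
sim_a = cosine_similarity(vectors[0], vectors[1])  # vision vs. aligned plan
sim_b = cosine_similarity(vectors[0], vectors[2])  # vision vs. unrelated text
print(f"aligned plan: {sim_a:.3f}, unrelated text: {sim_b:.3f}")
```

In a study like the one described, scores of this kind would be compared against ratings from an expert group. The ANN-based representations the abstract mentions (Word2Vec, GloVe) would replace the sparse vectors here with dense embeddings.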