A Text Mining Discovery of Similarities and Dissimilarities Among Sacred Scriptures

The careful examination of sacred texts gives valuable insights into human psychology, different ideas regarding the organization of societies as well as into terms like truth and God. To improve and deepen our understanding of sacred texts, their comparison, and their separation is crucial. For thi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2021-02
Hauptverfasser: Peuriekeu, Younous Mofenjou, Noyum, Victoire Djimna, Feudjio, Cyrille, Alkan Goktug, Fokoue, Ernest
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The careful examination of sacred texts gives valuable insights into human psychology, different ideas regarding the organization of societies as well as into terms like truth and God. To improve and deepen our understanding of sacred texts, their comparison, and their separation is crucial. For this purpose, we use our data set has nine sacred scriptures. This work deals with the separation of the Quran, the Asian scriptures Tao-Te-Ching, the Buddhism, the Yogasutras, and the Upanishads as well as the four books from the Bible, namely the Book of Proverbs, the Book of Ecclesiastes, the Book of Ecclesiasticus, and the Book of Wisdom. These scriptures are analyzed based on the natural language processing NLP creating the mathematical representation of the corpus in terms of frequencies called document term matrix (DTM). After this analysis, machine learning methods like supervised and unsupervised learning are applied to perform classification. Here we use the Multinomial Naive Bayes (MNB), the Super Vector Machine (SVM), the Random Forest (RF), and the K-nearest Neighbors (KNN). We obtain that among these methods MNB is able to predict the class of a sacred text with an accuracy of about 85.84 %.
ISSN:2331-8422