A review of techniques for semantic understanding of the text with term weighting

Bibliographic details
Main authors: Tripathi, Chanchla A., Panchbhai, Vishal V., Damahe, Lalit B., Shirole, Mahesh R., Tiwari, Shruti, Rathi, Raunak, Varma, Prateek
Format: Conference proceedings
Language: English
Online access: Full text
Description
Summary: Text mining, also called text analytics, is the process of extracting important facts and insights from large volumes of unstructured text, and it is becoming increasingly significant in today's data-driven business climate. A key component of text mining is term weighting, which helps locate significant keywords or phrases in a text document. By assigning weights to these terms, text mining algorithms can identify and analyze documents more precisely, which can lead to important insights and knowledge discovery. This article reviews methods proposed in the literature to help computers understand language and text. It covers the mathematical underpinnings of term weighting schemes, categorizes them broadly into supervised and statistical methods, and provides examples from both groups. Special emphasis is given to the Vector-Space Model and its variations, which are at the core of many other techniques covered in the paper. The review highlights the need for machines to understand language and material effectively without engaging in plagiarism or other unethical activities. Overall, the study lays the groundwork for further research in this field by offering a thorough overview of contemporary language and text recognition methods.
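As a rough illustration of the term weighting and Vector-Space Model ideas named in the summary, the sketch below computes TF-IDF weights and compares documents by cosine similarity. It is a minimal example under stated assumptions, not the paper's own method: TF-IDF is chosen here only as a common statistical weighting scheme, and the function names `tf_idf` and `cosine` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: weight = (term count / document length) * ln(N / df),
    one common TF-IDF variant among many.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus (hypothetical), tokenized by whitespace.
docs = [
    "text mining finds insights in text".split(),
    "term weighting ranks terms in text".split(),
    "vector space model compares documents".split(),
]
vectors = tf_idf(docs)
```

In this toy corpus, a term that occurs in only one document (such as "mining") receives a higher weight than a term shared across documents (such as "text"), and documents with no terms in common have a cosine similarity of zero.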
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0224550