Research on Web Text Representation and the Similarity Based on Improved VSM in Uyghur Web Information Retrieval

Bibliographic details
Main authors: Tohti, T, Hamdulla, A, Musajan, W
Format: Conference paper
Language: English
Description
Abstract: In information retrieval based on the vector space model (VSM), Web documents are represented as vectors, the weights of the index terms serve as the main basis for computing the similarity between the user query and the Web documents, and the query results are ranked by similarity. In this paper, the index term weights are adjusted with a position weighting factor, and the similarity between the user query and the Web documents is measured by considering term weight, position, mutual distance, and order, as well as the contribution of Uyghur word similarity. Experiments in a Uyghur search engine show that the improved method clearly improves the accuracy, recall, and ranking capability of the Web information retrieval system.
DOI:10.1109/CCPR.2010.5659262
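
The abstract describes ranking Web documents by vector-space similarity with position-adjusted index term weights. A minimal sketch of that general idea follows, assuming cosine similarity and a simple per-field position factor. The factor values, the field names `title` and `body`, and the function names are hypothetical and are not taken from the paper; the sketch does not model the mutual distance, order, or Uyghur word similarity components mentioned in the abstract.

```python
import math
from collections import defaultdict

# Hypothetical position factors; the abstract does not state the actual values.
POSITION_FACTOR = {"title": 3.0, "body": 1.0}


def weighted_vector(fields):
    """Build a term -> weight mapping from {field_name: [tokens]}.

    Each occurrence of a term adds the position factor of the field
    it appears in (unknown fields default to 1.0).
    """
    vec = defaultdict(float)
    for field, tokens in fields.items():
        factor = POSITION_FACTOR.get(field, 1.0)
        for tok in tokens:
            vec[tok] += factor
    return dict(vec)


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_tokens, docs):
    """Return document ids sorted by descending similarity to the query."""
    q = weighted_vector({"body": query_tokens})
    scored = [(cosine(q, weighted_vector(f)), doc_id) for doc_id, f in docs.items()]
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


docs = {
    "a": {"title": ["uyghur", "search"], "body": ["engine"]},
    "b": {"title": ["web"], "body": ["uyghur", "search", "engine", "engine"]},
}
ranking = rank(["uyghur", "search"], docs)
```

In this toy data, document `a` carries the query terms in its title, so the position factor raises their weights and it ranks ahead of document `b`, where the same terms appear only in the body.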