Method and apparatus for cleaning data sets for a search process


Bibliographic details
Main authors: TURNER ROSS, AGRAWAL ASHISH KUMAR, HEINONEN JARKKO
Format: Patent
Language: English
Description
Summary: An approach is provided for cleaning data sets for a search process. The cleanup platform determines one or more reference documents associated with at least one region. Next, the cleanup platform processes and/or facilitates a processing of the one or more reference documents to determine a frequency distribution of one or more candidate stop words with respect to the at least one region. Then, the cleanup platform causes, at least in part, selection of one or more stop words applicable to the at least one region from the one or more candidate stop words based, at least in part, on one or more frequency distribution criteria. Additionally, the cleanup platform processes and/or facilitates a processing of at least one data set associated with a search process to generate at least one enhanced data set by filtering the one or more stop words from the at least one data set.
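
The steps in the summary can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the tokenizer, the sample documents, and the single criterion used here (the fraction of a region's reference documents that contain a word, compared against a hypothetical threshold `min_doc_fraction`) are all assumptions chosen for illustration. The abstract itself does not say which frequency distribution criteria are applied.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def document_frequency(reference_docs):
    """Return, for each candidate word, the fraction of reference documents containing it."""
    counts = Counter()
    for doc in reference_docs:
        # Count each word once per document so long documents do not dominate.
        counts.update(set(tokenize(doc)))
    total = len(reference_docs)
    return {word: n / total for word, n in counts.items()}


def select_stop_words(reference_docs, min_doc_fraction=0.5):
    """Select candidates whose document frequency meets the assumed threshold criterion."""
    freq = document_frequency(reference_docs)
    return {word for word, fraction in freq.items() if fraction >= min_doc_fraction}


def clean_data_set(data_set, stop_words):
    """Produce the enhanced data set by filtering stop words out of every entry."""
    cleaned = []
    for entry in data_set:
        kept = [token for token in tokenize(entry) if token not in stop_words]
        cleaned.append(" ".join(kept))
    return cleaned


# Hypothetical reference documents for one region.
reference_docs = [
    "Main Street Cafe",
    "Elm Street Bakery",
    "Oak Street Books",
    "Harbor Road Cafe",
]

# Hypothetical data set used by a search process.
data_set = ["Corner Cafe Main Street", "Street Food Market"]

stop_words = select_stop_words(reference_docs, min_doc_fraction=0.6)
enhanced = clean_data_set(data_set, stop_words)

print(stop_words)
print(enhanced)
```

In this sample, "street" appears in three of the four reference documents and is the only word to pass the 0.6 threshold, so it is the only word filtered from the data set. Running the selection separately per region would yield a region-specific stop word list, which matches the abstract's phrase "with respect to the at least one region".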