NLP Versus IR Approaches to Fuzzy Name Searching in Digital Libraries

Bibliographic details
Main authors: Wu, Paul Horng-Jyh; Na, Jin-Cheon; Khoo, Christopher S. G.
Format: Conference proceedings
Language: English
Online access: Full text
Description
Summary: Name search is an important search function in digital library systems and in various types of information retrieval systems, such as directory search systems, electronic phonebooks and yellow pages. The paper discusses two main approaches to fuzzy name matching, the natural language processing (NLP) approach and the information retrieval (IR) approach, and proposes a hybrid approach. If person names are considered a (sub-)language, a name search system can be developed using NLP apparatus, including a dictionary, a thesaurus and a grammatical schema. If, on the other hand, names are treated as (free) text, an entirely different system may be built that incorporates indexing, retrieval, relevance ranking and other IR techniques. These two schools of thought have somewhat different sets of techniques, originating from different theoretical concerns and research traditions. A selective combination of their complementary features is likely to be more effective for fuzzy name matching. Two principles, position attribute identity (PAI) and position transition likelihood (PTL), are proposed to incorporate aspects of both approaches. Both principles have been implemented in a hybrid NLP-IR system called Friendly Name Search (FNS), used in real-world multilingual directory searches on the Singapore Yellowpages website.
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-540-30230-8_14