Design of a language-independent parallel string matching unit for NLP

In natural language processing applications, string matching is the main time-consuming operation due to the large size of lexicon. Data dependence is minimal in string matching operations, and hence it is ideal for parallelization. A dedicated hardware for string matching that uses memory interleav...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Murty, V.S., Reghu Raj, P.C., Raman, S.
Format:	Tagungsbericht
Sprache:	eng
Schlagworte:	Application software Approximate match Computer science Field programmable gate arrays Hardware Interleaved codes interleaved lexicon language independence Natural language processing Natural languages parallel matching Pattern matching Pattern recognition Random access memory
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	In natural language processing applications, string matching is the main time-consuming operation due to the large size of lexicon. Data dependence is minimal in string matching operations, and hence it is ideal for parallelization. A dedicated hardware for string matching that uses memory interleaving and parallel processing techniques can relieve the host CPU from this burden, thereby making the system suitable for real-time applications. This paper reports the FPGA design of such a system with m parallel matching units. The time complexity of the proposed algorithm is O (log 2 n), where n is the total number of lexical entries. This has been achieved by a proper selection of the value of m. A special memory organization technique, which reduces the storage space by nearly 70%, has been adopted for storing lexical entries. The techniques used for matching and storage of lexical entries make the system language independent
DOI:	10.1109/CAMP.2003.1598159