Optimizing Regular Expression Matching with SR-NFA on Multi-Core Systems

Conventionally, regular expression matching (REM) has been performed by sequentially comparing the regular expression (regex) to the input stream, which can be slow due to excessive backtracking [21]. Alternatively, the regex can be converted to a deterministic finite automaton (DFA) for efficient m...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Yang, Y-H E., Prasanna, V. K.
Format:	Tagungsbericht
Sprache:	eng
Schlagworte:	Automata bit-level parallelism Complexity theory Doped fiber amplifiers Explosions multi-core processor Multicore processing NFA Regular expression thread-level parallelism Throughput Vectors
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Conventionally, regular expression matching (REM) has been performed by sequentially comparing the regular expression (regex) to the input stream, which can be slow due to excessive backtracking [21]. Alternatively, the regex can be converted to a deterministic finite automaton (DFA) for efficient matching, which however may require an extremely large state transition table (STT) due to exponential state explosion [17, 27]. We propose the segmented regex-NFA (SR-NFA) architecture, where the regex is first compiled into modular nondeterministic finite automata (NFA), then partitioned, optimized, and matched efficiently on modern multi-core processors. SR-NFA offers attack-resilient multi-gigabit per second matching throughput, does not suffer from either backtracking or state explosion, and can be rapidly constructed. For regex sets that construct a DFA with moderate state explosion, i.e., on average 200k states in the STT, the proposed SR-NFA is 367k times faster to construct and update and use 23k times less memory than the DFA approach. Running on an 8-core 2.6 GHz Opteron platform, our prototype achieves 2.2 Gbps average matching throughput for regex sets with up to 4,000 SR-NFA states per regex set.
ISSN:	1089-795X 2641-7944
DOI:	10.1109/PACT.2011.73