On-line synthesis of parsers for string events
Published in: | Journal of computer languages (Online) 2021-02, Vol.62, p.101022, Article 101022 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | A string event is the occurrence of a specific pattern in the textual output of a program. The capture and treatment of string events have several applications, such as log anonymization, error handling and user notification. However, there is currently no systematic approach to identifying and treating string events. This paper formally defines string events and presents the theory and practice of a general framework to handle them. The framework encompasses an example-based user interface for specifying string patterns and a grammar synthesizer that enables efficient parsing of such patterns. We demonstrate the effectiveness of this framework by using it to implement Zhefuscator, a system that redacts occurrences of sensitive information in database logs. Zhefuscator is implemented as an extension to the Java Virtual Machine (JVM). It intercepts patterns of interest on the fly and does not require changes to the source code of the protected program. It can infer log formats and capture string events with minimal performance overhead. As an illustration, it is up to 14x faster than an equivalent brute-force approach, converging to a definitive grammar after observing fewer than 10 examples from typical logs. |
---|---|
ISSN: | 2590-1184, 2665-9182 |
DOI: | 10.1016/j.cola.2021.101022 |