Cnosso, a novel method for business document automation based on open information extraction
Published in: | Expert Systems with Applications 2024-07, Vol.245, p.123038, Article 123038 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | The state of the art in automated processing of unstructured business documents has evolved from manual labor to advanced AI systems in the span of mere decades. Such systems rely on learning techniques, rule or clause sets, and neural models, used alone or in combination, to perform the extraction. For example, rule-based processes operate on the perceived layout or positioning of the information, whereas model-based frameworks adopt a semantic, and often uninspectable, approach. Verb-Based Semantic Role Labeling (VBSRL) is a novel system, presented in an earlier paper, that uses a hybrid foundation to inform the extraction phase via a set of rules modeling natural language. We propose a new VBSRL-based document processing method, supported by innovative architectural choices, which has been implemented for the Italian language and evaluated with promising results. Even in its infancy, the first implementation of this system shows better results than comparable IE solutions, obtaining an aggregate average F-measure of nearly 79%.
• Automating business document analysis is crucial and time-consuming in enterprises.
• Classification and information extraction for unstructured documents are hard tasks.
• Document processing method via pre-processing, normalization and post-processing.
• Information Extraction as Conceptual Dependency Theory plus Semantic Role Labeling.
• Performance in a real-case scenario shows better results than comparable IE solutions. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2023.123038 |