Analyzer: A Complex System for Data Analysis

Bibliographic details
Published in: Computer journal 2012-05, Vol.55 (5), p.590-615
Main authors: Starka, J., Svoboda, M., Sochna, J., Schejbal, J., Mlynkova, I., Bednarek, D.
Format: Article
Language: English
Online access: Full text
Description
Abstract: Recently, the eXtensible Markup Language (XML) has achieved the leading role among languages for data representation and, thus, we can witness a massive boom of corresponding techniques for managing XML data. Most of the processing techniques, however, suffer from various bottlenecks that worsen their time and/or space efficiency. We assume that the main reason is that they consider XML collections too globally, involving all their possible features, although real-world data are often much simpler. Even though some techniques do restrict the input data, the restrictions are mostly unnatural. This paper aims to introduce Analyzer, a complex framework for performing statistical analyses of real-world documents. Exploiting the results of such analyses is a classical way to optimize data processing in many areas. Although this intent is legitimate, ad hoc and dedicated analyses soon become obsolete, are usually built on insufficiently extensive collections, and are difficult to repeat. Analyzer is an easily extensible framework that helps the user with gathering documents, managing analyses, and browsing computed reports.
ISSN: 0010-4620, 1460-2067
DOI: 10.1093/comjnl/bxr103