Semi-Automated Wrappers Using Rule Trees

Bibliographic details
Main authors: Iasinschi, A., Cosulschi, M.
Format: Conference proceeding
Language: English
Description
Summary: In this paper we describe the concept of a semi-automated wrapper for extracting information from semi-structured pages, usually part of data-intensive e-commerce web sites. The process is based on creating extraction rules visually, using the DOM tree associated with an XHTML document to help the user make the right decisions. The extraction rules defined have a natural tree structure. Based on the designed model, the wrapper can then be used to navigate through the site and extract the relevant data.
DOI: 10.1109/SYNASC.2008.67
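
Illustration: the summary states that the extraction rules have a natural tree structure and are defined against the DOM tree of an XHTML document. The sketch below shows one way such a rule tree could be represented and applied. It is not the authors' implementation: the `Rule` class, the `apply_rule` function, the path expressions, and the sample page are all hypothetical, and the sample markup omits the XHTML namespace to keep the paths short.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One node of a hypothetical rule tree (not taken from the paper)."""
    name: str   # label for the extracted record or field
    path: str   # ElementTree path, relative to the parent rule's match
    children: list = field(default_factory=list)  # nested Rule objects


def apply_rule(rule, node):
    """Apply `rule` below `node` and return one entry per matched element."""
    results = []
    for match in node.findall(rule.path):
        if rule.children:
            # Inner rule: evaluate each child rule relative to this match.
            record = {}
            for child in rule.children:
                values = apply_rule(child, match)
                record[child.name] = values[0] if len(values) == 1 else values
            results.append(record)
        else:
            # Leaf rule: collect the text content of the matched element.
            results.append("".join(match.itertext()).strip())
    return results


# Made-up, well-formed sample page (assumption: no XHTML namespace declared).
PAGE = """
<html><body>
  <div class="product"><h2>Laptop</h2><span class="price">999</span></div>
  <div class="product"><h2>Phone</h2><span class="price">499</span></div>
</body></html>
"""

# Rule tree: one record rule with two leaf rules for its fields.
RULES = Rule("product", ".//div[@class='product']", [
    Rule("name", "h2"),
    Rule("price", "span[@class='price']"),
])

records = apply_rule(RULES, ET.fromstring(PAGE.strip()))
print(records)
```

Because each child rule is evaluated relative to the element matched by its parent, the rule tree mirrors the nesting of the DOM fragment it targets, which is one plausible reading of the "natural tree structure" mentioned in the summary.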