Towards creating a knowledge base for World-Wide Web documents

Bibliographic details
Main authors:	Lambrix, P.; Shahmehri, N.; Aberg, J.
Format:	Conference proceeding
Language:	English
Subjects:	Data mining; HTML; Information retrieval; Information science; Internet; Search engines; TECHNOLOGY; TEKNIKVETENSKAP; Text analysis; Web search; Web sites

Description
Summary:	The lack of organization of information on the web makes information retrieval inefficient. Several approaches for improvement have been suggested. We propose to use a document knowledge base that contains semantic and structural information about the retrievable documents, extracted from the documents themselves. We show that using such a knowledge base offers a number of advantages, including advanced query functionality. We also discuss the creation of such a knowledge base, and in particular we show how structural information can be extracted automatically from HTML documents and added to the document knowledge base.
DOI:	10.1109/IIS.1997.645367
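
The summary mentions automatically extracting structural information from HTML documents. As a rough illustration only, and not the authors' method, the sketch below uses Python's standard `html.parser` module to collect a page's title, headings, and link targets. The names `StructureExtractor` and `extract_structure`, and the choice of which elements count as "structural", are assumptions made for this example.

```python
from html.parser import HTMLParser


class StructureExtractor(HTMLParser):
    """Hypothetical helper: collects title, headings, and link targets."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []  # (level, text) pairs in document order
        self.links = []     # href values in document order
        self._open = None   # title/heading tag whose text is being collected
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.HEADINGS:
            self._open = tag
            self._buf = []
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Only keep text that sits inside an open title or heading.
        if self._open is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._open:
            return
        text = " ".join("".join(self._buf).split())  # normalize whitespace
        if tag == "title":
            self.title = text
        else:
            self.headings.append((int(tag[1]), text))
        self._open = None


def extract_structure(html):
    """Return a plain dict that could serve as one knowledge-base entry."""
    parser = StructureExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "headings": parser.headings,
        "links": parser.links,
    }
```

A record produced this way could be stored per document and queried on structure, for example "documents with a level-2 heading containing a given word"; how the paper's knowledge base actually represents and queries such information is not described in this record.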