Compact Encodings for All Local Path Information in Web Taxonomies with Application to WordNet


Bibliographic details
Main authors: Strunjaš-Yoshikawa, Svetlana, Annexstein, Fred S., Berman, Kenneth A.
Format: Conference proceedings
Language: English
Online access: Full text
Description
Summary: We consider the problem of finding a compact labelling for large, rooted web taxonomies that can be used to encode all local path information for each taxonomy element. This research is motivated by the problem of developing standards for taxonomic data, and addresses the data-intensive problem of evaluating semantic similarities between taxonomic elements. Evaluating such similarities often requires the processing of large common ancestor sets between elements. We propose a new class of compact labelling schemes, designed for directed acyclic graphs, and tailored for applications to large web taxonomies. Our labelling schemes significantly reduce the complexity of evaluating similarities among taxonomy elements by enabling the gleaning of inferences from the labels alone, without searching the data structure. We provide an analysis of the label lengths for the proposed schemes based on structural properties of the taxonomy. Finally, we provide supporting empirical evidence for the quality of these schemes by evaluating the performance on the WordNet taxonomy.
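
As a rough illustration of what answering a query "from the labels alone" can mean, the sketch below uses a deliberately naive label: the full ancestor set of each node in a made-up toy DAG. It is not the compact scheme proposed in the paper; the toy taxonomy, the names `label` and `common_ancestors`, and the set-based encoding are all assumptions chosen for illustration.

```python
from functools import lru_cache

# Hypothetical toy taxonomy (child -> parents). It is a DAG rather than a
# tree, because "dog" and "cat" each have two parents.
PARENTS = {
    "entity": [],
    "animal": ["entity"],
    "pet": ["entity"],
    "canine": ["animal"],
    "dog": ["canine", "pet"],
    "cat": ["animal", "pet"],
}


@lru_cache(maxsize=None)
def label(node):
    """Naive label (an assumption, not the paper's encoding):
    the node itself plus all of its ancestors."""
    result = {node}
    for parent in PARENTS[node]:
        result |= label(parent)
    return frozenset(result)


# Labels are computed once, up front.
LABELS = {node: label(node) for node in PARENTS}


def common_ancestors(a, b):
    """Answer from the two labels alone; no graph traversal at query time."""
    return LABELS[a] & LABELS[b]


print(sorted(common_ancestors("dog", "cat")))
```

Labels of this naive kind can grow as large as the ancestor sets themselves, which is the cost that a compact labelling scheme aims to reduce; the sketch only shows the query pattern.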
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/11611257_49