Geographic Knowledge Graph Attribute Normalization: Improving the Accuracy by Fusing Optimal Granularity Clustering and Co-Occurrence Analysis

Bibliographic Details
Published in:	ISPRS International Journal of Geo-Information 2022-07, Vol.11 (7), p.360
Main authors:	Yin, Chuan; Zhang, Binyu; Liu, Wanzeng; Du, Mingyi; Luo, Nana; Zhai, Xi; Ba, Tu
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; Altitude; Analysis; attribute normalization; Clustering; co-occurrence analysis; geographic knowledge graph; Geography; Graphs; Information services; Knowledge representation; Lakes; Methods; Ontology; optimal clustering granularity; Recall; Semantics; Similarity; Synonymy; Target detection
Online access:	Full text

Description
Abstract:	Expanding the entity attribute information of a geographic knowledge graph is essentially a matter of fusing encyclopedic knowledge from the Internet. However, this knowledge lacks structured attribute information, and synonymy and polysemy are pervasive. These problems reduce the quality of the knowledge graph and make semantic retrieval incomplete and inaccurate. We therefore normalize the attributes of a geographic knowledge graph using optimal granularity clustering and co-occurrence analysis, and use the structure and semantic relations of entity attributes to identify synonymy and correlation between attributes. Specifically: (1) We design a classification system for geographic attributes in which a community discovery algorithm classifies the attribute names, and the optimal clustering granularity is identified by the marker target detection algorithm. (2) We complete the fine-grained identification of attribute relations by analyzing the co-occurrence relations of the attributes and applying rule inference. (3) Finally, we verify the performance of the system by manual discrimination, using the case of “landscape, forest, field, lake and grass”. The results show the following: (1) The average precision for spatial relations was 0.974 and the average recall was 0.937; the average precision for data relations was 0.977 and the average recall was 0.998. (2) The average F1 was 0.473 for the similarity results, 0.735 for the co-occurrence analysis results, and 0.934 for the rule-based modification results, showing that the accuracy is greater than 90%. Compared with traditional methods that focus only on similarity, the system recognizes synonymous attributes more accurately and can also identify near-synonymous attributes. Integrating our system with attribute normalization can greatly improve both processing efficiency and accuracy.
ISSN:	2220-9964
DOI:	10.3390/ijgi11070360
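
The abstract reports precision, recall, and F1 for attribute relations that were checked by manual discrimination. As a rough illustration of how such scores are usually computed, the sketch below compares a set of predicted synonymous attribute pairs with a manually labeled set. It is not the authors' evaluation code: the function name `precision_recall_f1` and every attribute pair are hypothetical, chosen only for the example.

```python
def precision_recall_f1(predicted, gold):
    """Return (precision, recall, F1) for predicted items against gold items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: each pair is stored as a frozenset so that
# ("altitude", "elevation") and ("elevation", "altitude") count as the same pair.
predicted_pairs = {
    frozenset({"altitude", "elevation"}),
    frozenset({"area", "surface area"}),
    frozenset({"depth", "length"}),  # a wrong prediction
}
gold_pairs = {
    frozenset({"altitude", "elevation"}),
    frozenset({"area", "surface area"}),
    frozenset({"max depth", "maximum depth"}),  # missed by the prediction
}

precision, recall, f1 = precision_recall_f1(predicted_pairs, gold_pairs)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Note that an average F1 taken over several categories, as reported in the abstract, is generally not the same as the F1 computed from the averaged precision and recall.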