An Ad-hoc graph node vector embedding algorithm for general knowledge graphs using Kinetica-Graph
Main authors: | , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | This paper discusses how to generate general graph node embeddings from
knowledge graph representations. The embedding space is composed of several
sub-features that mimic both local affinity and remote structural relevance.
These sub-feature dimensions are defined by indicators that we hypothesize
capture nodal similarities: hop-based topological patterns, the number of
overlapping labels, transition probabilities (Markov-chain probabilities),
and the cluster indices computed by our recursive spectral bisection (RSB)
algorithm. These measures are flattened into a single one-dimensional vector,
each occupying its own sub-component range, so that the full range of vector
similarity functions can be used to find similar nodes. Our novel loss function
defines the error as the sum of pairwise squared differences, taken over a
randomly selected sample of graph nodes, between the assumed embeddings and the
ground-truth estimates. The ground truth is estimated as a combination of
pairwise Jaccard similarity and the number of overlapping labels. Finally, we
demonstrate a multivariate stochastic gradient descent (SGD) algorithm that
computes the weighting factors among the sub-vector spaces to minimize the
average error using random sampling. |
---|---|
DOI: | 10.48550/arxiv.2407.15906 |
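
The loss and weight fitting described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation and not the Kinetica-Graph API. The toy graph, the two sub-feature blocks (neighbor indicators and label indicators), the per-block dot-product similarity, the plain sum used to combine Jaccard similarity with the label-overlap count, and all function names are hypothetical choices made here for the example. The hop-pattern, Markov-chain, and RSB cluster sub-features are omitted.

```python
import itertools
import random

def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def ground_truth(i, j, neighbors, labels):
    # Assumption: the "combination" is a plain sum of the neighbor-set
    # Jaccard similarity and the number of labels the two nodes share.
    return jaccard(neighbors[i], neighbors[j]) + len(labels[i] & labels[j])

def block_similarities(i, j, blocks):
    # One similarity per sub-feature block (assumed here: a dot product).
    return [sum(x * y for x, y in zip(b[i], b[j])) for b in blocks]

def predict(weights, sims):
    # Weighted sum of the per-block similarities.
    return sum(w * s for w, s in zip(weights, sims))

def mean_squared_error(weights, pairs, blocks, neighbors, labels):
    errors = [
        (predict(weights, block_similarities(i, j, blocks))
         - ground_truth(i, j, neighbors, labels)) ** 2
        for i, j in pairs
    ]
    return sum(errors) / len(errors)

def sgd_weights(blocks, neighbors, labels, steps=2000, lr=0.05, seed=0):
    """Fit one weight per sub-feature block from randomly sampled node pairs."""
    rng = random.Random(seed)
    nodes = sorted(neighbors)
    weights = [0.0] * len(blocks)
    for _ in range(steps):
        i, j = rng.sample(nodes, 2)  # one random pair per step
        sims = block_similarities(i, j, blocks)
        residual = predict(weights, sims) - ground_truth(i, j, neighbors, labels)
        # Gradient of residual**2 with respect to weight k is 2 * residual * sims[k].
        weights = [w - lr * 2.0 * residual * s for w, s in zip(weights, sims)]
    return weights

# Hypothetical four-node toy graph.
NEIGHBORS = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
LABELS = {"a": {"person"}, "b": {"person"}, "c": {"person", "admin"}, "d": {"admin"}}

# Block 0: neighbor indicators over (a, b, c, d). Block 1: label indicators
# over (person, admin). Both are stand-ins for the paper's sub-features.
TOPOLOGY = {"a": [0, 1, 1, 0], "b": [1, 0, 1, 0], "c": [1, 1, 0, 1], "d": [0, 0, 1, 0]}
LABEL_VEC = {"a": [1, 0], "b": [1, 0], "c": [1, 1], "d": [0, 1]}
BLOCKS = [TOPOLOGY, LABEL_VEC]

all_pairs = list(itertools.combinations(sorted(NEIGHBORS), 2))
before = mean_squared_error([0.0, 0.0], all_pairs, BLOCKS, NEIGHBORS, LABELS)
weights = sgd_weights(BLOCKS, NEIGHBORS, LABELS)
after = mean_squared_error(weights, all_pairs, BLOCKS, NEIGHBORS, LABELS)
print("weights:", weights)
print("mean squared error before:", before, "after:", after)
```

Each step samples a single node pair, so the update is a stochastic estimate of the gradient of the average squared error. Fitting only one scalar weight per block keeps the sub-component ranges of the flattened vector intact, which is the reading of "weighting factors among the sub-vector spaces" assumed in this sketch.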