Selective Network Discovery via Deep Reinforcement Learning on Embedded Spaces
Format: Article
Language: English
Abstract: Complex networks are often either too large for full exploration, partially accessible, or partially observed. Downstream learning tasks on these incomplete networks can produce low-quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific downstream learning tasks given resource collection constraints are of great interest. In this paper, we formulate the task-specific network discovery problem in an incomplete network setting as a sequential decision-making problem. Our downstream task is selective harvesting, the optimal collection of vertices with a particular attribute. We propose a framework, called Network Actor Critic (NAC), which learns a policy and notion of future reward in an offline setting via a deep reinforcement learning algorithm. The NAC paradigm utilizes a task-specific network embedding to reduce the state space complexity. A detailed comparative analysis of popular network embeddings is presented with respect to their role in supporting offline planning. Furthermore, a quantitative study is presented on several synthetic and real benchmarks using NAC and several baselines. We show that offline models of reward and network discovery policies lead to significantly improved performance when compared to competitive online discovery algorithms. Finally, we outline learning regimes where planning is critical in addressing sparse and changing reward signals.
DOI: 10.48550/arxiv.1909.07294
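As an illustration of the sequential decision process the abstract describes, the sketch below runs a selective-harvesting loop on a partially observed graph: each step embeds the observed frontier, a policy picks one vertex to query, and a reward of 1 is received when that vertex carries the target attribute. Everything here (the `embed_state` features, the `LinearActorCritic` stand-in, the `harvest` query model) is an illustrative assumption, not the authors' NAC architecture or the embeddings they compare, and it updates online for brevity rather than learning the policy and value offline as the paper does.

```python
# Illustrative sketch only: a toy hand-crafted embedding and a linear
# actor-critic stand in for the paper's deep NAC model and learned network
# embeddings. All names and design choices here are hypothetical.
import networkx as nx
import numpy as np


def embed_state(G_obs, frontier):
    """Toy task-specific embedding: per-frontier-vertex features
    (observed degree, fraction of neighbors known to be positive)."""
    feats = []
    for v in frontier:
        nbrs = list(G_obs.neighbors(v))
        pos = sum(1 for u in nbrs if G_obs.nodes[u].get("label") == 1)
        feats.append([G_obs.degree(v), pos / max(len(nbrs), 1)])
    return np.asarray(feats, dtype=float)


class LinearActorCritic:
    """Minimal linear policy/value pair (the paper uses deep networks)."""

    def __init__(self, dim, lr=0.05):
        self.w_pi = np.zeros(dim)  # policy weights
        self.w_v = np.zeros(dim)   # value (baseline) weights
        self.lr = lr

    def act(self, feats):
        scores = feats @ self.w_pi
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        return np.random.choice(len(feats), p=probs), probs

    def update(self, feats, idx, probs, reward):
        # One-step advantage actor-critic update on the chosen frontier vertex.
        advantage = reward - feats[idx] @ self.w_v
        grad_logp = feats[idx] - probs @ feats
        self.w_pi += self.lr * advantage * grad_logp
        self.w_v += self.lr * advantage * feats[idx]


def harvest(G_true, agent, seed, budget):
    """Selective harvesting: query one frontier vertex per step; reward is 1
    when the queried vertex carries the target attribute (label == 1)."""
    observed = nx.Graph()
    observed.add_node(seed, label=G_true.nodes[seed]["label"])
    observed.add_edges_from((seed, u) for u in G_true.neighbors(seed))
    harvested, total = {seed}, 0.0
    for _ in range(budget):
        frontier = [v for v in observed.nodes if v not in harvested]
        if not frontier:
            break
        feats = embed_state(observed, frontier)
        idx, probs = agent.act(feats)
        v = frontier[idx]
        reward = float(G_true.nodes[v]["label"] == 1)
        # Querying v reveals its label and neighborhood (the incomplete-network
        # query model sketched in the abstract).
        observed.add_node(v, label=G_true.nodes[v]["label"])
        observed.add_edges_from((v, u) for u in G_true.neighbors(v))
        harvested.add(v)
        total += reward
        agent.update(feats, idx, probs, reward)
    return total


if __name__ == "__main__":
    # Tiny demo: two planted communities, the first labeled positive.
    G = nx.planted_partition_graph(2, 20, 0.3, 0.02, seed=0)
    for v in G.nodes:
        G.nodes[v]["label"] = 1 if v < 20 else 0
    agent = LinearActorCritic(dim=2)
    print("positives harvested:", harvest(G, agent, seed=0, budget=30))
```

In the paper's setting the policy and value model would instead be trained offline on related networks and then deployed under the query budget; the online updates above only keep the sketch self-contained.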