Truncated Matrix Power Iteration for Differentiable DAG Learning
Format: Article
Language: English
Abstract: Recovering underlying Directed Acyclic Graph (DAG) structures from observational data is highly challenging due to the combinatorial nature of the DAG-constrained optimization problem. Recently, DAG learning has been cast as a continuous optimization problem by characterizing the DAG constraint as a smooth equality constraint, generally based on polynomials over adjacency matrices. Existing methods place very small coefficients on the high-order polynomial terms for stabilization, arguing that large coefficients on higher-order terms are harmful because the matrix powers explode numerically. On the contrary, we discover that large coefficients on higher-order terms are beneficial for DAG learning when the spectral radii of the adjacency matrices are small, and that larger coefficients on higher-order terms approximate the DAG constraint much better than small ones. Based on this observation, we propose a novel DAG learning method with an efficient truncated matrix power iteration to approximate geometric-series-based DAG constraints. Empirically, our method outperforms the previous state of the art in various settings, often by a factor of $3$ or more in terms of structural Hamming distance.
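The geometric-series constraint referenced in the abstract rests on the fact that, for a non-negative matrix $A$, $\operatorname{tr}(A^k)$ sums the weights of all length-$k$ directed cycles, so $\operatorname{tr}\big(\sum_{k=1}^{K} A^k\big) = 0$ exactly when the graph is acyclic (for any $K \ge d$, since any cycle in a $d$-node graph has length at most $d$). Below is a minimal NumPy sketch of such a truncated penalty, computed with the doubling identity $S_{2n} = S_n + A^n S_n$ for $S_n = \sum_{k=1}^{n} A^k$. The function name `tmpi_penalty`, the choice $A = W \circ W$, and the truncation order $2^m$ are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def tmpi_penalty(W: np.ndarray, m: int = 5) -> float:
    """Truncated geometric-series acyclicity penalty (illustrative sketch).

    Computes h(W) = tr(sum_{k=1}^{2**m} A^k) with A = W * W (elementwise),
    using the doubling identity S_{2n} = S_n + A^n @ S_n, so only O(m)
    matrix products are needed instead of 2**m. The penalty is zero iff
    the graph encoded by W is acyclic (assuming 2**m >= number of nodes).
    """
    A = W * W          # elementwise square: non-negative edge weights
    S = A.copy()       # S_1 = sum of the first 2**0 = 1 powers of A
    P = A.copy()       # P_1 = A**1, the highest power included so far
    for _ in range(m):
        S = S + P @ S  # S_{2n} = S_n + A^n S_n
        P = P @ P      # P_{2n} = (A^n)^2
    return float(np.trace(S))

# Example: a 3-node graph with a 2-cycle between nodes 0 and 1.
W = np.array([[0.0, 0.5, 0.0],
              [0.3, 0.0, 0.0],
              [0.0, 0.2, 0.0]])
print(tmpi_penalty(W))   # > 0: the 0 <-> 1 cycle is detected
W[1, 0] = 0.0            # break the cycle
print(tmpi_penalty(W))   # 0.0: acyclic graph
```

With the doubling identity, summing the first $2^m$ matrix powers costs only $2m$ matrix multiplications rather than $2^m$, which is the kind of saving that makes a truncated power iteration practical inside a differentiable learning loop.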
DOI: 10.48550/arxiv.2208.14571