A novel data enhancement approach to DAG learning with small data samples

Bibliographic details
Published in: Applied Intelligence (Dordrecht, Netherlands), 2023-11, Vol. 53 (22), pp. 27589-27607
Main authors: Huang, Xiaoling; Guo, Xianjie; Li, Yuling; Yu, Kui
Format: Article
Language: English
Description
Abstract: Learning a directed acyclic graph (DAG) from observational data plays a crucial role in causal inference and machine learning. However, observational data are often scarce in real-world applications, and current DAG learning methods may perform poorly when only small data samples are available. Data enhancement is recognized as a key technique for improving the generalization ability of models trained on small data samples, but because it is inherently difficult to sample from a small dataset and still generate high-quality new samples, this approach has not been widely used in DAG learning. To alleviate this problem, we propose a data enhancement-based DAG learning (DE-DAG) approach. Specifically, DE-DAG first presents an integrated strategy that couples DAG learning with data sampling, then constructs a sample-level adaptive distance computation algorithm to select high-quality samples from the sampled datasets, and finally performs DAG learning on an enhanced dataset consisting of the selected high-quality samples and the original data samples. Experimental results on benchmark datasets demonstrate that the proposed approach outperforms state-of-the-art baselines.
ISSN: 0924-669X, 1573-7497
DOI: 10.1007/s10489-023-04999-2
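
The abstract outlines a three-step pipeline: draw new candidate samples from the small original dataset, keep only the candidates that a distance criterion judges to be high quality, and run a DAG learner on the union of the kept candidates and the original data. The paper's integrated sampling strategy and sample-level adaptive distance computation are not reproduced here; the sketch below is only a minimal illustration of that general pipeline, assuming jittered bootstrap resampling for the sampling step and a plain nearest-neighbor Euclidean distance with a fixed keep ratio for the selection step. The function and parameter names are hypothetical and not taken from the paper.

```python
import numpy as np


def enhance_dataset(X, n_candidates=500, noise_scale=0.05, keep_ratio=0.5, rng=None):
    """Toy data-enhancement sketch (not the DE-DAG algorithm itself).

    X            : (n, d) array of original observations
    n_candidates : number of candidate samples to draw
    noise_scale  : Gaussian jitter, relative to each feature's std. dev.
    keep_ratio   : fraction of candidates retained as "high quality"
    Returns the original data stacked with the retained candidates.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape

    # 1) Sampling step: bootstrap rows of X and perturb them with small noise.
    idx = rng.integers(0, n, size=n_candidates)
    jitter = rng.normal(0.0, noise_scale * X.std(axis=0, keepdims=True),
                        size=(n_candidates, d))
    candidates = X[idx] + jitter

    # 2) Selection step: score each candidate by its distance to the nearest
    #    original sample and keep the closest keep_ratio fraction.
    dists = np.linalg.norm(candidates[:, None, :] - X[None, :, :], axis=-1)
    nearest = dists.min(axis=1)
    n_keep = int(keep_ratio * n_candidates)
    kept = candidates[np.argsort(nearest)[:n_keep]]

    # 3) The enhanced dataset (original + retained candidates) would then be
    #    passed to any off-the-shelf DAG structure learner.
    return np.vstack([X, kept])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy 3-variable linear system with a small sample size.
    a = rng.normal(size=(50, 1))
    b = 2.0 * a + rng.normal(scale=0.1, size=(50, 1))
    c = a - b + rng.normal(scale=0.1, size=(50, 1))
    X = np.hstack([a, b, c])
    X_enhanced = enhance_dataset(X, rng=1)
    print(X.shape, "->", X_enhanced.shape)  # (50, 3) -> (300, 3)
```

The point of the selection step, in any such scheme, is to keep the retained candidates close to the original observations so that the augmented data do not distort the dependence structure the DAG learner is supposed to recover; the paper's adaptive distance computation serves that role in a more refined, sample-level way than the fixed threshold used above.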