MoleMCL: a multi-level contrastive learning framework for molecular pre-training

Abstract Motivation Molecular representation learning plays an indispensable role in crucial tasks such as property prediction and drug design. Despite the notable achievements of molecular pre-training models, current methods often fail to capture both the structural and feature semantics of molecu...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Bioinformatics (Oxford, England) England), 2024-03, Vol.40 (4)
Hauptverfasser:	Zhang, Xinyi, Xu, Yanni, Jiang, Changzhi, Shen, Lian, Liu, Xiangrong
Format:	Artikel
Sprache:	eng
Schlagworte:	Availability Drug development Graphical representations Graphs Learning Molecular modelling Molecular structure Original Paper Parameters Perturbation Semantics Training
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Abstract Motivation Molecular representation learning plays an indispensable role in crucial tasks such as property prediction and drug design. Despite the notable achievements of molecular pre-training models, current methods often fail to capture both the structural and feature semantics of molecular graphs. Moreover, while graph contrastive learning has unveiled new prospects, existing augmentation techniques often struggle to retain their core semantics. To overcome these limitations, we propose a gradient-compensated encoder parameter perturbation approach, ensuring efficient and stable feature augmentation. By merging enhancement strategies grounded in attribute masking and parameter perturbation, we introduce MoleMCL, a new MOLEcular pre-training model based on multi-level contrastive learning. Results Experimental results demonstrate that MoleMCL adeptly dissects the structure and feature semantics of molecular graphs, surpassing current state-of-the-art models in molecular prediction tasks, paving a novel avenue for molecular modeling. Availability and implementation The code and data underlying this work are available in GitHub at https://github.com/BioSequenceAnalysis/MoleMCL.
ISSN:	1367-4811 1367-4803 1367-4811
DOI:	10.1093/bioinformatics/btae164