A pre-trained multi-representation fusion network for molecular property prediction
In the field of machine learning and cheminformatics, the prediction of molecular properties holds significant importance. Molecules can be represented in various formats, including 1D SMILES string, 2D graph, and 3D conformation. Numerous models have been proposed for different representations to a...
Gespeichert in:
Veröffentlicht in: | Information fusion 2024-03, Vol.103, p.102092, Article 102092 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | In the field of machine learning and cheminformatics, the prediction of molecular properties holds significant importance. Molecules can be represented in various formats, including 1D SMILES string, 2D graph, and 3D conformation. Numerous models have been proposed for different representations to accomplish molecular property prediction. However, most recent works have focused on one or two representations or combining embedding vectors from different perspectives in an unsophisticated manner. To address this issue, we present PremuNet, a novel pre-trained multi-representation fusion network for molecular property prediction. PremuNet can extract comprehensive molecular information from multiple views and combine them interactively through pre-training and fine-tuning. The framework of PremuNet consists of two branches: a Transformer-GNN branch that extracts SMILES and graph information, and a Fusion Net branch that extracts topology and geometry information, called PremuNet-L and PremuNet-H respectively. We employ masked self-supervised methods to enable the model to learn information fusion and achieve enhanced performance in downstream tasks. The proposed model has been evaluated on eight molecular property prediction tasks, including five classification and three regression tasks, and attained state-of-the-art performance in most cases. Additionally, we conduct the ablation studies to demonstrate the effect of each view and the branch combination approaches.
•A novel molecular property prediction framework with multi-model fusion network.•We introduce new information fusion schemes with graph neural network structure.•Our model outperforms the baselines in eight molecular property prediction tasks. |
---|---|
ISSN: | 1566-2535 1872-6305 |
DOI: | 10.1016/j.inffus.2023.102092 |