A Novel Molecular Representation Learning for Molecular Property Prediction with a Multiple SMILES-Based Augmentation

Detailed description

Bibliographic details
Published in: Computational Intelligence and Neuroscience, 2022-01, Vol. 2022, p. 8464452-11
Main authors: Li, Chunyan; Feng, Jihua; Liu, Shihu; Yao, Junfeng
Format: Article
Language: English
Online access: Full text
Description
Summary: Deep learning has driven rapid progress in molecular representation learning for tasks such as molecular property prediction. Predicting molecular properties is a crucial task in drug discovery, where the goal is to find drugs with good pharmacological activity and pharmacokinetic properties. SMILES strings are widely used as character-level input to deep neural network models, an approach inspired by natural language processing techniques. However, deep learning models are hindered by the nonunique nature of SMILES: a single molecule can be written as many different valid strings. To efficiently learn molecular features along all message paths, in this paper we encode multiple SMILES for every molecule as an automated data augmentation for molecular property prediction, which alleviates the overfitting caused by the small size of molecular property datasets. With the multiple SMILES-based augmentation, we obtained better molecular representations and showed superior performance on molecular property prediction tasks.
ISSN: 1687-5265, 1687-5273
DOI: 10.1155/2022/8464452
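
The summary's central idea, that one molecule has many valid SMILES strings and that these can serve as augmented training inputs, can be illustrated with a small sketch. This is not the authors' implementation. It is a toy writer that assumes an acyclic molecule with single bonds only and atoms from the SMILES organic subset (so no brackets, ring closures, or bond symbols are needed). The function names `write_smiles` and `enumerate_smiles` are hypothetical and chosen for illustration.

```python
import random


def write_smiles(atoms, bonds, start, rng):
    """Write one SMILES string for an acyclic, single-bonded molecule.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs; assumed to form a tree (no rings)
    start: index of the atom the string begins with
    rng:   random.Random instance used to shuffle the branch order
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def visit(atom, parent):
        # Depth-first traversal away from the atom we came from.
        children = [n for n in neighbors[atom] if n != parent]
        rng.shuffle(children)
        out = atoms[atom]
        # Every child except the last is written as a parenthesised branch.
        for child in children[:-1]:
            out += "(" + visit(child, atom) + ")"
        # The last child continues the main chain without parentheses.
        if children:
            out += visit(children[-1], atom)
        return out

    return visit(start, None)


def enumerate_smiles(atoms, bonds, n_samples=50, seed=0):
    """Sample distinct SMILES strings by randomising start atom and branch order."""
    rng = random.Random(seed)
    found = set()
    for _ in range(n_samples):
        start = rng.randrange(len(atoms))
        found.add(write_smiles(atoms, bonds, start, rng))
    return sorted(found)


# Isopropanol: a central carbon (index 1) bonded to two carbons and one oxygen.
atoms = ["C", "C", "C", "O"]
bonds = [(0, 1), (1, 2), (1, 3)]
variants = enumerate_smiles(atoms, bonds)
print(variants)
```

Each string in `variants` describes the same molecule, so all of them would share one property label in an augmented training set. Real molecules with rings, multiple bonds, charges, or stereochemistry need a full cheminformatics toolkit rather than this simplified writer.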