A Semi-Supervised Autoencoder-Based Approach for Protein Function Prediction
Published in: | IEEE Journal of Biomedical and Health Informatics, 2022-10, Vol. 26 (10), pp. 4957-4965 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Since the development of next-generation sequencing techniques, protein sequences have become abundantly available, but determining the functional characteristics of these proteins remains costly and time-consuming. As a result, the gap between the number of protein sequences and the number with known functions keeps growing. Machine-learning methods have been developed to close this gap. This work proposes a deep-learning-based approach that predicts protein function from protein sequences. A set of autoencoders is trained in a semi-supervised manner on protein sequences, with each autoencoder corresponding to exactly one protein function. In particular, 932 autoencoders for 932 biological processes and 585 autoencoders for 585 molecular functions are trained separately. For each protein sample, the reconstruction losses from all autoencoders are used as features to classify the sequence into its corresponding functions. The proposed model is evaluated on test protein samples and achieves promising results. The method can readily be extended to any number of functions for which enough supporting protein sequences are available. All relevant code, data, and trained models are available at https://github.com/richadhanuka/PFP-Autoencoders. |
---|---|
ISSN: | 2168-2194 2168-2208 |
DOI: | 10.1109/JBHI.2022.3163150 |
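
The abstract's core idea (one autoencoder per function, with each sample's reconstruction losses used as features) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a 20-dimensional amino-acid frequency encoding, swaps the deep autoencoders for a rank-1 linear autoencoder fitted by power iteration, and uses synthetic toy sequences with two hypothetical labels, `function_A` and `function_B`. All function names here are made up for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def encode(seq):
    # Assumed encoding (not from the paper): amino-acid frequency vector.
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_autoencoder(X, rng, iters=100):
    # Stand-in for a deep autoencoder: a rank-1 linear autoencoder with tied
    # weights, fitted by power iteration on the mean-centered training data.
    d = len(X[0])
    mu = [sum(x[j] for x in X) / len(X) for j in range(d)]
    centered = [[a - m for a, m in zip(x, mu)] for x in X]
    w = [rng.gauss(0, 1) for _ in range(d)]
    norm = dot(w, w) ** 0.5
    w = [v / norm for v in w]
    for _ in range(iters):
        codes = [dot(c, w) for c in centered]  # encode every sample
        new_w = [sum(z * c[j] for z, c in zip(codes, centered)) for j in range(d)]
        norm = dot(new_w, new_w) ** 0.5
        if norm == 0:  # degenerate data: keep the current direction
            break
        w = [v / norm for v in new_w]
    return mu, w

def reconstruction_loss(model, x):
    # Mean squared error between x and its encode-then-decode reconstruction.
    mu, w = model
    c = [a - m for a, m in zip(x, mu)]
    z = dot(c, w)  # one-dimensional latent code
    return sum((a - z * b) ** 2 for a, b in zip(c, w)) / len(x)

def sample_sequence(favored, rng, length=60):
    # Synthetic toy data: about 80% of residues come from `favored`.
    return "".join(
        rng.choice(favored if rng.random() < 0.8 else AMINO_ACIDS)
        for _ in range(length)
    )

rng = random.Random(0)
FAVORED = {"function_A": "AL", "function_B": "KE"}  # hypothetical functions

# Train one autoencoder per function, using only that function's sequences.
models = {}
for name, favored in FAVORED.items():
    train = [encode(sample_sequence(favored, rng)) for _ in range(30)]
    models[name] = fit_autoencoder(train, rng)

def loss_features(seq):
    # One reconstruction loss per autoencoder; this vector is the feature
    # representation that a downstream classifier would consume.
    x = encode(seq)
    return [reconstruction_loss(models[name], x) for name in FAVORED]

feats_a = loss_features(sample_sequence("AL", rng))
feats_b = loss_features(sample_sequence("KE", rng))
print("function_A-like sample:", feats_a)
print("function_B-like sample:", feats_b)
```

In this toy setting a sequence is reconstructed with a much lower loss by the autoencoder trained on its own function than by the other one, which is what makes the loss vector informative. With 932 or 585 autoencoders the same vector would simply have 932 or 585 entries. The classifier applied to those features is not specified in the abstract and is left out here.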