GraphLoc: a graph neural network model for predicting protein subcellular localization from immunohistochemistry images
Published in: | Bioinformatics (Oxford, England), 2022-10, Vol.38 (21), p.4941-4948 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Summary: | MOTIVATION: Recognition of protein subcellular distribution patterns and identification of location biomarker proteins in cancer tissues are important for understanding protein functions and related diseases. Immunohistochemical (IHC) images enable visualizing the distribution of proteins at the tissue level, providing an important resource for protein localization studies. In the past decades, several image-based protein subcellular location prediction methods have been developed, but their prediction accuracy still has considerable room for improvement owing to the complexity of protein patterns resulting from multi-label proteins and the variation of location patterns across cell types or states. RESULTS: Here, we propose a multi-label multi-instance model based on deep graph convolutional neural networks, GraphLoc, to recognize protein subcellular location patterns. GraphLoc builds a graph of multiple IHC images for one protein, learns protein-level representations by graph convolutions and predicts multi-label information by a dynamic threshold method. Our results show that GraphLoc is a promising model for image-based protein subcellular location prediction with model interpretability. Furthermore, we apply GraphLoc to the identification of candidate location biomarkers and potential members for protein networks. A large portion of the predicted results have supporting evidence in the existing literature, and the new candidates also provide guidance for further experimental screening. AVAILABILITY AND IMPLEMENTATION: The dataset and code are available at: www.csbio.sjtu.edu.cn/bioinf/GraphLoc. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
---|---|
ISSN: | 1367-4803, 1367-4811 |
DOI: | 10.1093/bioinformatics/btac634 |
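
The summary describes a pipeline of three steps: build a graph over the IHC images of one protein, apply graph convolutions to obtain a protein-level representation, and select multiple labels with a dynamic threshold. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. All names, the fully connected toy graph, the weight values, and in particular the threshold rule (keep every label whose score lies within a fixed margin of the top score) are assumptions made for illustration; the record does not specify how GraphLoc builds its graph or defines its threshold.

```python
import math

def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix (self-loops added)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(a_norm, features, weights):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]

def mean_pool(node_features):
    """Average node (image) features into one protein-level vector."""
    n = len(node_features)
    return [sum(col) / n for col in zip(*node_features)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dynamic_threshold(scores, margin=0.2):
    """Assumed rule: keep labels scoring within `margin` of the best label."""
    top = max(scores)
    return [i for i, s in enumerate(scores) if s >= top - margin]

# Toy input: three images of one protein, each with a 2-dimensional feature vector.
features = [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]]
# Assumed graph: every image is connected to every other image.
adj = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

# Hypothetical fixed weights; a real model would learn these.
w_hidden = [[1.0, -1.0], [0.5, 1.0]]            # 2 input features -> 2 hidden units
w_out = [[2.0, -2.0, 1.5], [1.0, 1.0, 1.0]]     # 2 hidden units -> 3 location classes

hidden = gcn_layer(normalized_adjacency(adj), features, w_hidden)
protein_vector = mean_pool(hidden)
logits = matmul([protein_vector], w_out)[0]
scores = [sigmoid(z) for z in logits]
predicted = dynamic_threshold(scores, margin=0.2)
print(predicted)
```

Because the cut-off follows the top score instead of being fixed, a protein with one dominant location receives a single label while a protein with several similarly scored locations receives several. That is the property a multi-label predictor needs; the margin value here is arbitrary.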