Deep learning is combined with massive-scale citizen science to improve large-scale image classification

Bibliographic details
Published in: Nature Biotechnology 2018-10, Vol. 36 (9), p. 820-828
Main authors: Sullivan, Devin P, Winsnes, Casper F, Åkesson, Lovisa, Hjelmare, Martin, Wiking, Mikaela, Schutten, Rutger, Campbell, Linzi, Leifsson, Hjalti, Rhodes, Scott, Nordgren, Andie, Smith, Kevin, Revaz, Bernard, Finnbogason, Bergur, Szantner, Attila, Lundberg, Emma
Format: Article
Language: English
Description
Summary: Pattern recognition in imaging data by >300,000 players of a global, online, commercial computer game is combined with deep learning to improve the accuracy of annotation of subcellular protein localization. Pattern recognition and classification of images are key challenges throughout the life sciences. We combined two approaches for large-scale classification of fluorescence microscopy images. First, using the publicly available data set from the Cell Atlas of the Human Protein Atlas (HPA), we integrated an image-classification task into a mainstream video game (EVE Online) as a mini-game, named Project Discovery. Participation by 322,006 gamers over 1 year provided nearly 33 million classifications of subcellular localization patterns, including patterns that were not previously annotated by the HPA. Second, we used deep learning to build an automated Localization Cellular Annotation Tool (Loc-CAT). This tool classifies proteins into 29 subcellular localization patterns and can deal efficiently with multi-localization proteins, performing robustly across different cell types. Combining the annotations of gamers and deep learning, we applied transfer learning to create a boosted learner that can characterize subcellular protein distribution with an F1 score of 0.72. We found that engaging players of commercial computer games provided data that augmented deep learning and enabled scalable and readily improved image classification.
ISSN:1087-0156
1546-1696
DOI:10.1038/nbt.4225
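
Note on the reported metric: the summary gives an F1 score of 0.72 for a task in which one protein can carry several localization labels. The following is a minimal sketch of how an F1 score can be computed for such multi-label predictions. It assumes micro-averaging over all (sample, label) pairs, which the summary does not specify. The function name `f1_micro` and the example labels are hypothetical and are not taken from the article or from Loc-CAT.

```python
def f1_micro(true_labels, predicted_labels):
    """Micro-averaged F1 over per-sample label sets.

    Assumption: micro-averaging is used here purely for illustration;
    the article's own averaging scheme is not stated in the summary.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not annotated
        fn += len(truth - pred)   # labels annotated but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for three images (multi-localization allowed).
truth = [{"nucleus", "cytosol"}, {"mitochondria"}, {"golgi"}]
pred = [{"nucleus"}, {"mitochondria", "cytosol"}, {"golgi"}]
score = f1_micro(truth, pred)
print(f"micro F1 = {score:.2f}")
```

In this toy case there are three correct labels, one spurious label, and one missed label, so precision and recall are both 3/4 and the F1 score equals 0.75.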