Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Summary: Featurizing microscopy images for use in biological research remains a
significant challenge, especially for large-scale experiments spanning millions
of images. This work explores the scaling properties of weakly supervised
classifiers and self-supervised masked autoencoders (MAEs) when training with
increasingly larger model backbones and microscopy datasets. Our results show
that ViT-based MAEs outperform weakly supervised classifiers on a variety of
tasks, achieving as much as an 11.5% relative improvement when recalling known
biological relationships curated from public databases. Additionally, we
develop a new channel-agnostic MAE architecture (CA-MAE) that allows images
with different numbers and orderings of channels to be input at inference time.
We demonstrate that CA-MAEs generalize effectively by inferring and evaluating
on a microscopy image dataset (JUMP-CP) generated under different experimental
conditions and with a different channel structure than our pretraining data
(RPI-93M). Our findings motivate continued research into scaling
self-supervised learning on microscopy data in order to create powerful
foundation models of cellular biology that have the potential to catalyze
advancements in drug discovery and beyond.
DOI: 10.48550/arxiv.2404.10242
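
The abstract does not spell out how the CA-MAE achieves channel-agnostic input. The sketch below is a minimal illustration of one way such behavior is commonly obtained in ViT-style models: tokenizing each channel independently with a shared single-channel patch projection, so the token sequence length simply scales with the number of channels. The class name `ChannelAgnosticPatchEmbed` and all parameters are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of a channel-agnostic patch embedding (not the paper's
# exact CA-MAE implementation): each channel is embedded independently with
# shared weights, so any number or ordering of channels can be encoded.
import torch
import torch.nn as nn


class ChannelAgnosticPatchEmbed(nn.Module):
    """Embed every channel of a microscopy image as its own set of patch tokens."""

    def __init__(self, img_size=256, patch_size=16, embed_dim=768):
        super().__init__()
        self.num_patches_per_channel = (img_size // patch_size) ** 2
        # Single-channel patch projection shared across channels, so the model
        # never bakes in a fixed channel count or channel order.
        self.proj = nn.Conv2d(1, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        # x: (batch, channels, height, width) with an arbitrary number of channels
        b, c, h, w = x.shape
        # Fold channels into the batch dimension and embed each one independently.
        tokens = self.proj(x.reshape(b * c, 1, h, w))      # (b*c, D, h/p, w/p)
        tokens = tokens.flatten(2).transpose(1, 2)         # (b*c, N, D)
        # Concatenate the per-channel token sequences back per image.
        return tokens.reshape(b, c * tokens.shape[1], -1)  # (b, c*N, D)


# Example: the same module handles 6-channel and 5-channel image batches,
# e.g. pretraining-style data versus JUMP-CP-style channel structures.
embed = ChannelAgnosticPatchEmbed()
print(embed(torch.randn(2, 6, 256, 256)).shape)  # torch.Size([2, 1536, 768])
print(embed(torch.randn(2, 5, 256, 256)).shape)  # torch.Size([2, 1280, 768])
```

In a full MAE these per-channel tokens would then receive positional (and possibly channel) embeddings before masking and encoding; those details are beyond what the abstract describes and are omitted here.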