ESM Atlas v0 random sample of high confidence predicted protein structures

Bibliographic details
Main authors: Lin, Zeming; Akin, Halil; Rao, Roshan; Hie, Brian; Zhu, Zhongkai; Lu, Wenting; Smetanin, Nikita; Verkuil, Robert; Kabeli, Ori; Shmueli, Yaniv; dos Santos Costa, Allan; Fazel-Zarandi, Maryam; Sercu, Tom; Candido, Salvatore; Rives, Alexander
Format: Dataset
Language: English
Description
Summary: A random sample of the 225M high-confidence predictions in the ESM Atlas v0 dataset, introduced in "Evolutionary-scale prediction of atomic level protein structure with a language model." All predictions can be accessed in the ESM Metagenomic Atlas (https://esmatlas.com) open science resource, released on 2022-11-01. High confidence is defined as mean pLDDT > 0.7 and pTM > 0.7, which corresponds to ∼36% of the 617M proteins folded in total. This is the random sample used for the analysis in the paper and for the visualization on the esmatlas.com Explore page. Sample size: 999,520, based on 999,996 unique randomly sampled IDs, with 0.05% lost to missing data in the processing pipeline.
DOI:10.5281/zenodo.7623626
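
The confidence filter and the counts quoted in the summary can be checked with a few lines of Python. This is only a minimal sketch: the record layout (`Prediction`, `mean_plddt`, `ptm`, `record_id`) is a hypothetical stand-in and not the dataset's actual schema, and pLDDT is assumed to be on a 0-1 scale, as the thresholds in the summary suggest.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    # Hypothetical record layout; the real files may use different field names.
    record_id: str
    mean_plddt: float  # assumed to be on a 0-1 scale
    ptm: float


def is_high_confidence(p: Prediction, threshold: float = 0.7) -> bool:
    """Apply the filter from the summary: mean pLDDT > 0.7 and pTM > 0.7."""
    return p.mean_plddt > threshold and p.ptm > threshold


# Share of high-confidence predictions among all folded proteins (225M of 617M).
high_conf_fraction = 225e6 / 617e6

# Entries lost between the sampled IDs and the final sample.
sampled_ids = 999_996
sample_size = 999_520
missing = sampled_ids - sample_size
missing_fraction = missing / sampled_ids

if __name__ == "__main__":
    print(f"high-confidence share: {high_conf_fraction:.1%}")
    print(f"missing entries: {missing} ({missing_fraction:.2%})")
```

Both comparisons are strict, so a prediction with pTM exactly 0.7 is excluded under this reading of the definition.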