Data (part 2) from Illuminating the functional landscape of the dark proteome across the Animal Tree of Life through natural language processing models


Bibliographic details
Main author: Martínez-Redondo, Gemma I.
Format: Dataset
Language: English
Description
Summary: Part 2 contains:
longest_isoforms_gopredsim_seqvec_F_to_O.tar.gz : GOPredSim GO annotation using the SeqVec model for the longest isoform of species whose code's first letter ranges from F to O.
IC_longest_isoforms_gopredsim_prott5.tar.gz : Information content (IC) of the unfiltered GOPredSim-ProtT5 GO annotation of the longest isoforms of all species.
IC_longest_isoforms_gopredsim_seqvec.tar.gz : IC of the unfiltered GOPredSim-SeqVec GO annotation of the longest isoforms of all species.
IC_all_isoforms_eggnog_filtered_go_terms_cdhit.tar.gz : IC of the eggNOG-mapper GO annotation of all isoforms of a subset of 102 species. Isoforms removed by CD-HIT are filtered out of this file (but are included in the original eggNOG-mapper output).
IC_all_isoforms_gopredsim_prott5.tar.gz : IC of the GOPredSim-ProtT5 GO annotation of all isoforms of a subset of 102 species.
IC_all_isoforms_gopredsim_seqvec.tar.gz : IC of the GOPredSim-SeqVec GO annotation of all isoforms of a subset of 102 species.
DOI: 10.5281/zenodo.10717483
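
Several of the archives listed above hold the information content (IC) of GO annotations. This record does not say how the IC values were computed, so the following is only a minimal sketch of one common definition: the IC of a term is the negative base-2 logarithm of its relative frequency among all annotations. The function name `information_content`, the toy protein names, and the placeholder GO identifiers are hypothetical, and the sketch does not propagate annotations to ancestor terms, which the dataset's own pipeline may well do.

```python
import math
from collections import Counter


def information_content(annotations):
    """Return {GO term: IC}, with IC = -log2(relative frequency of the term).

    `annotations` maps a protein ID to the collection of GO terms assigned to it.
    Assumption: frequencies are taken over all (protein, term) pairs, with no
    propagation to ancestor terms in the ontology.
    """
    # Count each term once per protein that carries it.
    counts = Counter(
        term for terms in annotations.values() for term in set(terms)
    )
    total = sum(counts.values())
    return {term: -math.log2(n / total) for term, n in counts.items()}


# Toy input with placeholder identifiers (not taken from the dataset).
toy_annotations = {
    "protA": {"GO:0000001", "GO:0000002"},
    "protB": {"GO:0000001"},
    "protC": {"GO:0000001"},
}

ic = information_content(toy_annotations)
for term, value in sorted(ic.items()):
    print(term, round(value, 3))
```

Under this definition, rarer terms receive higher IC values, which is why IC is often read as a measure of how specific an annotation is.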