Machine-learning approach expands the repertoire of anti-CRISPR protein families

Bibliographic details
Published in: Nature Communications, 2020-07, Vol. 11 (1), p. 3784, Article 3784
Main authors: Gussow, Ayal B., Park, Allyson E., Borges, Adair L., Shmakov, Sergey A., Makarova, Kira S., Wolf, Yuri I., Bondy-Denomy, Joseph, Koonin, Eugene V.
Format: Article
Language: English
Online access: Full text
Description
Summary: The CRISPR-Cas are adaptive bacterial and archaeal immunity systems that have been harnessed for the development of powerful genome editing and engineering tools. In the incessant host-parasite arms race, viruses evolved multiple anti-defense mechanisms including diverse anti-CRISPR proteins (Acrs) that specifically inhibit CRISPR-Cas and therefore have enormous potential for application as modulators of genome editing tools. Most Acrs are small and highly variable proteins, which makes their bioinformatic prediction a formidable task. We present a machine-learning approach for comprehensive Acr prediction. The model shows high predictive power when tested against an unseen test set and was employed to predict 2,500 candidate Acr families. Experimental validation of top candidates revealed two unknown Acrs (AcrIC9, IC10), and three other top candidates were coincidentally identified and found to possess anti-CRISPR activity. These results substantially expand the repertoire of predicted Acrs and provide a resource for experimental Acr discovery. CRISPR-Cas is a host adaptive immunity system and viruses harbor diverse anti-CRISPR proteins (Acrs). Here, the authors develop a random forest machine-learning approach to predict Acrs, identifying 2,500 candidate Acr families, which expand the current repertoire of predicted Acrs by two orders of magnitude.
ISSN: 2041-1723
DOI: 10.1038/s41467-020-17652-0