Advances in machine learning for directed evolution


Bibliographic details
Published in: Current Opinion in Structural Biology, 2021-08, Vol. 69, pp. 11-18
Main authors: Wittmann, Bruce J; Johnston, Kadina E; Wu, Zachary; Arnold, Frances H
Format: Article
Language: English
Online access: Full text
Description
Abstract: Machine learning (ML) can expedite directed evolution by allowing researchers to move expensive experimental screens in silico. Gathering sequence-function data for training ML models, however, can still be costly. In contrast, raw protein sequence data is widely available. Recent advances in ML approaches use protein sequences to augment limited sequence-function data for directed evolution. We highlight contributions in a growing effort to use sequences to reduce or eliminate the amount of sequence-function data needed for effective in silico screening. We also highlight approaches that use ML models trained on sequences to generate new functional sequence diversity, focusing on strategies that use these generative models to efficiently explore vast regions of protein space.
ISSN: 0959-440X, 1879-033X
DOI: 10.1016/j.sbi.2021.01.008