VARIANT CALLER

Processes and systems for reading variants from a genome sample relative to a reference genomic sequence are provided. An exemplary process includes collecting a set of reads and generating a k-mer graph from the reads. For example, the k-mer graph can be constructed to represent all possible substring...

Detailed description

Bibliographic details
Main authors:	GIBIANSKY, ANDREW LEONIDOVICH; HAQUE, IMRAN SAEEDUL; ROBERTSON, ALEXANDER DE JONG; MAGUIRE, JARED ROBERT
Format:	Patent
Language:	eng ; fre
Subjects:	CHEMISTRY; COMBINATORIAL CHEMISTRY; INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS; INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES; LIBRARIES, e.g. CHEMICAL LIBRARIES, IN SILICO LIBRARIES; MEASURING; METALLURGY; PHYSICS; TESTING

Description
Abstract:	Processes and systems for reading variants from a genome sample relative to a reference genomic sequence are provided. An exemplary process includes collecting a set of reads and generating a k-mer graph from the reads. For example, the k-mer graph can be constructed to represent all possible substrings of the collected reads. The k-mer graph may be reduced to a contiguous graph, and a set of possible haplotypes generated from the contiguous graph. The process may further generate an error table that provides a filter for common sequencer errors. The process may then generate a set of diplotypes based on the set of haplotypes and the generated error table, and score the set of diplotypes to identify variants from the reference genome. Scoring the diplotypes may include determining a posterior probability for each of the diplotypes, with the highest-scoring diplotype(s) reported as the result.
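
The first and last steps of the abstract (building a k-mer graph from the reads, then ranking diplotypes by posterior probability) can be illustrated with a minimal Python sketch. It is not the patented method. The function names, the choice of (k-1)-mer nodes with counted edges, the flat prior, and the example log-likelihood values are all assumptions made for illustration. The contiguous-graph reduction, haplotype enumeration, and error table are not shown.

```python
from collections import defaultdict
import math


def build_kmer_graph(reads, k):
    """Return {node: {successor: count}} over (k-1)-mer nodes.

    Every length-k substring of every read adds one edge from its
    (k-1)-base prefix to its (k-1)-base suffix. The count records how
    often that k-mer was observed across all reads.
    (Graph layout is an assumption; the source does not specify one.)
    """
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def diplotype_posteriors(log_likelihoods, log_priors=None):
    """Normalize per-diplotype log-likelihoods into posterior probabilities.

    log_likelihoods maps a diplotype (a pair of haplotype labels) to
    log P(reads | diplotype). A flat prior is used when none is given
    (an assumption made only for this sketch).
    """
    if log_priors is None:
        log_priors = {d: 0.0 for d in log_likelihoods}
    joint = {d: ll + log_priors[d] for d, ll in log_likelihoods.items()}
    peak = max(joint.values())  # subtract the maximum for numerical stability
    total = sum(math.exp(v - peak) for v in joint.values())
    return {d: math.exp(v - peak) / total for d, v in joint.items()}


def best_diplotype(posteriors):
    """Return the diplotype with the highest posterior probability."""
    return max(posteriors, key=posteriors.get)


if __name__ == "__main__":
    # Two overlapping toy reads; real input would come from a sequencer.
    graph = build_kmer_graph(["ACGTAC", "CGTACG"], k=3)
    for node, successors in sorted(graph.items()):
        print(node, dict(successors))

    # Hypothetical log-likelihoods for three candidate diplotypes.
    loglik = {("ref", "ref"): -12.0, ("ref", "alt"): -3.0, ("alt", "alt"): -9.0}
    post = diplotype_posteriors(loglik)
    print(best_diplotype(post), post)
```

In a fuller pipeline the log-likelihoods would be computed from the reads and the error table rather than supplied by hand, and the prior could encode expected variant frequencies.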