Third-generation sequence alignment method based on longest path search

Bibliographic details
Main author: WEI ZEGANG
Format: Patent
Language: Chinese; English
Description
Summary: The invention provides a third-generation sequence alignment method based on longest path search. The method comprises the following steps: first, a hash index of the reference genome sequence is constructed; each k-mer of the sequence to be aligned is then extracted, and all positions of that k-mer in the genome are looked up through the hash index. Each matched k-mer is taken as a node, and a k-mer l-neighborhood directed acyclic graph is constructed; whether two nodes are connected by an edge, and the direction of that edge, is determined from the position information of the matched k-mers in the sequence to be aligned. Isolated nodes and small isolated subgraphs are filtered out. A dynamic scoring strategy is then designed: the predecessor nodes of each node are determined, the maximum score among the predecessors is selected, and the scoring path is recorded. The longest path is selected; the sequence to be aligned and the reference genome can then be divided into seed regions and non-seed regions; and for a non-seed region, a detailed base comparison r
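The indexing, graph construction, and longest-path steps named in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the edge rule (a hit `v` follows a hit `u` when it starts 1 to `l` bases later in both the read and the reference), and the unit score per node are all assumptions made for illustration, since the record does not specify them. Only isolated nodes are filtered here; the removal of small isolated subgraphs and the base-level comparison of non-seed regions are omitted.

```python
from collections import defaultdict


def build_hash_index(reference, k):
    """Map every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def matched_kmers(read, index, k):
    """Return (read_pos, ref_pos) for every k-mer hit, ordered by read position."""
    hits = []
    for i in range(len(read) - k + 1):
        for j in index.get(read[i:i + k], ()):
            hits.append((i, j))
    return hits


def longest_path(hits, l):
    """Longest chain of hits under the assumed l-neighborhood edge rule."""
    n = len(hits)
    preds = [[] for _ in range(n)]
    has_edge = [False] * n
    for v in range(n):
        for u in range(v):
            dr = hits[v][0] - hits[u][0]  # offset in the read
            dg = hits[v][1] - hits[u][1]  # offset in the reference
            if 0 < dr <= l and 0 < dg <= l:  # assumed edge rule
                preds[v].append(u)
                has_edge[u] = has_edge[v] = True
    if not any(has_edge):
        return []

    # Edges always point to a larger read position, so the input order
    # of the hits is already a topological order of the graph.
    score = [0] * n
    back = [-1] * n
    for v in range(n):
        if not has_edge[v]:
            continue  # isolated node: filtered out
        best = max(preds[v], key=lambda u: score[u], default=-1)
        score[v] = 1 + (score[best] if best >= 0 else 0)  # assumed unit score
        back[v] = best

    # Backtrack from the best-scoring node along the recorded predecessors.
    end = max(range(n), key=lambda v: score[v])
    path = []
    while end != -1:
        path.append(hits[end])
        end = back[end]
    return path[::-1]
```

For example, with `index = build_hash_index("ACGTTGCAAGGC", 4)` and `hits = matched_kmers("TTGCAAGG", index, 4)`, calling `longest_path(hits, 3)` returns the chain of matched k-mers that would serve as seeds. The stretches of read and reference between consecutive seeds correspond to the non-seed regions mentioned in the summary.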