False positive gene mutation filtering method for targeted capture of gene sequencing data

Bibliographic details
Main authors: WANG SHENJIE, WANG MIAO, WANG XUWEN, GUAN YANFANG, HAN BO, ZHANG XUANPING, WANG JIAYIN, LIU TAO
Format: Patent
Language: chi; eng
Description
Summary: The invention discloses a method for filtering false positive gene mutations from targeted-capture gene sequencing data. The method comprises the following steps: preprocessing the gene mutation detection data; following the tri-training method, selecting three different supervised learning algorithms to construct three different initial classifiers H1, H2 and H3, that is, three learners generated from the initial training set by three different supervised learning algorithms; training H1, H2 and H3 to obtain an extended training set, and updating the model accordingly; and labeling the unlabeled sample set U with the trained model, then filtering according to the labeling result. The method addresses the problem that traditional methods cannot effectively cope with batch differences.

本发明公开了一种针对靶向捕获基因测序数据的假阳性基因突变过滤方法,对基因突变的检测数据进行预处理;基于三重训练法选择三个不同的监督学习算法构造三个不同的初始分类器H1,H2,H3,即选用三个不同的监督学习自动机并基于初始训练集生成的学习器;对H1,H2,H3进行训练得到扩充训练集,由此对模型进行更新;使用训练的模型对未标记样本集U进行标记,根据标记结果完成过滤。本发明解决了传统方法无法有效应对批次差异的问题。
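
The steps in the summary can be illustrated with a minimal tri-training sketch. It is not the patented implementation: the abstract does not name the three supervised algorithms, so a nearest-centroid rule, a 1-nearest-neighbor rule and a decision stump stand in for H1, H2 and H3. The two features per variant call (allele fraction and a strand-bias score), the sample values and all function names are hypothetical. The sketch also uses a simplified update rule: a sample from U is added to one classifier's training set whenever the other two classifiers agree on its label, without the error-rate checks that full tri-training normally applies.

```python
from collections import Counter

def _dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

class NearestCentroid:
    """Stand-in for H1: assign the label of the closest class mean."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self
    def predict(self, X):
        return [min(self.centroids, key=lambda c: _dist(x, self.centroids[c]))
                for x in X]

class OneNearestNeighbor:
    """Stand-in for H2: assign the label of the closest training sample."""
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self
    def predict(self, X):
        return [self.y[min(range(len(self.X)), key=lambda i: _dist(x, self.X[i]))]
                for x in X]

class DecisionStump:
    """Stand-in for H3: one feature threshold, binary labels 0 and 1."""
    def fit(self, X, y):
        best = None
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                for lo, hi in ((0, 1), (1, 0)):
                    errors = sum((lo if x[f] <= thr else hi) != t
                                 for x, t in zip(X, y))
                    if best is None or errors < best[0]:
                        best = (errors, f, thr, lo, hi)
        _, self.f, self.thr, self.lo, self.hi = best
        return self
    def predict(self, X):
        return [self.lo if x[self.f] <= self.thr else self.hi for x in X]

def tri_train(X_lab, y_lab, X_unlab, rounds=3):
    """Simplified tri-training: each learner is retrained on the labeled set
    plus the unlabeled samples on which the other two learners agree."""
    learners = [NearestCentroid(), OneNearestNeighbor(), DecisionStump()]
    for h in learners:
        h.fit(X_lab, y_lab)
    for _ in range(rounds):
        # Predictions are taken once per round, before any learner is updated.
        preds = [h.predict(X_unlab) for h in learners]
        for i, h in enumerate(learners):
            j, k = [m for m in range(3) if m != i]
            extra = [(x, preds[j][n]) for n, x in enumerate(X_unlab)
                     if preds[j][n] == preds[k][n]]
            h.fit(list(X_lab) + [x for x, _ in extra],
                  list(y_lab) + [p for _, p in extra])
    return learners

def majority_label(learners, X):
    """Label each sample by majority vote of the three learners."""
    votes = zip(*(h.predict(X) for h in learners))
    return [Counter(v).most_common(1)[0][0] for v in votes]

def filter_false_positives(learners, calls):
    """Keep only the calls labeled 1 (true mutation); drop those labeled 0."""
    return [c for c, lab in zip(calls, majority_label(learners, calls)) if lab == 1]

# Hypothetical data: (allele fraction, strand-bias score); 1 = true, 0 = false positive.
X_lab = [(0.40, 0.1), (0.35, 0.2), (0.02, 0.8), (0.03, 0.9)]
y_lab = [1, 1, 0, 0]
X_unlab = [(0.38, 0.15), (0.45, 0.05), (0.01, 0.85), (0.04, 0.7)]

learners = tri_train(X_lab, y_lab, X_unlab)
print(filter_false_positives(learners, X_unlab))
```

Using three structurally different learners is what lets their disagreements carry information: a sample that one learner misclassifies can still enter its training set with the label the other two agree on.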