Multicenter Next-Generation Sequencing Studies between Theory and Practice: Harmonization of Data Analysis Using Real-World Myelodysplastic Syndrome Data


Bibliographic Details
Published in: The Journal of molecular diagnostics : JMD 2021-03, Vol.23 (3), p.347-357
Main Authors: Sandmann, Sarah, de Graaf, Aniek O, Tobiasson, Magnus, Kosmider, Olivier, Abáigar, María, Clappier, Emmanuelle, Gallì, Anna, van der Reijden, Bert A, Malcovati, Luca, Fenaux, Pierre, Díez-Campelo, María, Fontenay, Michaela, Hellström-Lindberg, Eva, Jansen, Joop H, Dugas, Martin
Format: Article
Language: eng
Subjects:
Online Access: Full text
container_end_page 357
container_issue 3
container_start_page 347
container_title The Journal of molecular diagnostics : JMD
container_volume 23
creator Sandmann, Sarah
de Graaf, Aniek O
Tobiasson, Magnus
Kosmider, Olivier
Abáigar, María
Clappier, Emmanuelle
Gallì, Anna
van der Reijden, Bert A
Malcovati, Luca
Fenaux, Pierre
Díez-Campelo, María
Fontenay, Michaela
Hellström-Lindberg, Eva
Jansen, Joop H
Dugas, Martin
description In the age of personalized medicine, genetic testing by means of targeted sequencing has taken a key role. However, when comparing different sets of targeted sequencing data, these are often characterized by a considerable lack of harmonization. Laboratories follow their own best practices, analyzing their own target regions. The question on how to best integrate data from different sites remains unanswered. Studying the example of myelodysplastic syndrome (MDS), we analyzed 11 targeted sequencing sets, collected from six different centers (n = 831). An intersecting target region of 43,076 bp (30 genes) was identified; whereas, the original target regions covered up to 499,097 bp (117 genes). Considering a region of interest in the context of MDS, a target region of 55,969 bp (31 genes) was identified. For each gene, coverage and sequencing data quality was evaluated, calculating a sequencing score. Analyses revealed huge differences between different data sets as well as between different genes. Analysis of the relation between sequencing score and mutation frequency in MDS revealed that most genes with high frequency in MDS could be sequenced without expecting low coverage or quality. Still, no gene appeared consistently unproblematic for all data sets. To allow for comparable results in a multicenter setting analyzing MDS, we propose to use a predefined target region of interest and to perform centralized data analysis using harmonized criteria.
doi_str_mv 10.1016/j.jmoldx.2020.12.001
format Article
fullrecord <record><control><sourceid>proquest_swepu</sourceid><recordid>TN_cdi_proquest_miscellaneous_2473414260</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2473414260</sourcerecordid><originalsourceid>FETCH-LOGICAL-p299t-a865c43d16257e176ea3813b579b0297b2d27852a09ebc56cd4de1fb0fa031bf3</originalsourceid><addsrcrecordid>eNp1kc9u1DAQhy0EoqXwBgj5yCWL_8VZc6sKtEgtILYVx8iOJ-DFsVM7URvehLfFZbfAhdOMRt_3m5GN0HNKVpRQ-Wq72g7R29sVI6yM2IoQ-gAdUiV41awpffhPf4Ce5LwtgBCSPUYHnPNaibo-RD8vZj-5DsIECX-A26k6hQBJTy4GvIHrGULnwle8mWbrIGMD0w1AwJffIKYF62Dxp6S7u4jX-EynIQb3Y2fHHr_Rk8bHQfslu4yv8l3SZ9C--hKTt_hiAR_tkkevc0nAmyXYFAf47T1Fj3rtMzzb1yN09e7t5clZdf7x9P3J8Xk1MqWmSq9l3QluqWR1A7SRoPmaclM3yhCmGsMsa9Y100SB6WrZWWGB9ob0mnBqen6Eql1uvoFxNu2Y3KDT0kbt2v3oe-mgFbKWVBVe_ZcfU7R_pXuRiuKV62RxX-7cApanzVM7uNyB9zpAnHPLRMMFFUySgr7Yo7MZwP5Zc_91_BeShKGm</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2473414260</pqid></control><display><type>article</type><title>Multicenter Next-Generation Sequencing Studies between Theory and Practice: Harmonization of Data Analysis Using Real-World Myelodysplastic Syndrome Data</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals Complete</source><source>Alma/SFX Local Collection</source><source>SWEPUB Freely available online</source><source>EZB Electronic Journals Library</source><creator>Sandmann, Sarah ; de Graaf, Aniek O ; Tobiasson, Magnus ; Kosmider, Olivier ; Abáigar, María ; Clappier, Emmanuelle ; Gallì, Anna ; van der Reijden, Bert A ; Malcovati, Luca ; Fenaux, Pierre ; Díez-Campelo, María ; Fontenay, Michaela ; Hellström-Lindberg, Eva ; Jansen, Joop H ; Dugas, Martin</creator><creatorcontrib>Sandmann, Sarah ; de Graaf, Aniek O ; Tobiasson, Magnus ; Kosmider, Olivier ; Abáigar, María ; Clappier, Emmanuelle ; Gallì, Anna ; van der Reijden, Bert A ; Malcovati, Luca ; Fenaux, Pierre ; Díez-Campelo, María ; Fontenay, Michaela ; 
Hellström-Lindberg, Eva ; Jansen, Joop H ; Dugas, Martin</creatorcontrib><description>In the age of personalized medicine, genetic testing by means of targeted sequencing has taken a key role. However, when comparing different sets of targeted sequencing data, these are often characterized by a considerable lack of harmonization. Laboratories follow their own best practices, analyzing their own target regions. The question on how to best integrate data from different sites remains unanswered. Studying the example of myelodysplastic syndrome (MDS), we analyzed 11 targeted sequencing sets, collected from six different centers (n = 831). An intersecting target region of 43,076 bp (30 genes) was identified; whereas, the original target regions covered up to 499,097 bp (117 genes). Considering a region of interest in the context of MDS, a target region of 55,969 bp (31 genes) was identified. For each gene, coverage and sequencing data quality was evaluated, calculating a sequencing score. Analyses revealed huge differences between different data sets as well as between different genes. Analysis of the relation between sequencing score and mutation frequency in MDS revealed that most genes with high frequency in MDS could be sequenced without expecting low coverage or quality. Still, no gene appeared consistently unproblematic for all data sets. 
To allow for comparable results in a multicenter setting analyzing MDS, we propose to use a predefined target region of interest and to perform centralized data analysis using harmonized criteria.</description><identifier>ISSN: 1943-7811</identifier><identifier>ISSN: 1525-1578</identifier><identifier>EISSN: 1943-7811</identifier><identifier>DOI: 10.1016/j.jmoldx.2020.12.001</identifier><identifier>PMID: 33359455</identifier><language>eng</language><publisher>United States</publisher><subject>Algorithms ; Alleles ; Biomarkers ; Data Interpretation, Statistical ; Gene Frequency ; Genetic Testing ; High-Throughput Nucleotide Sequencing - methods ; High-Throughput Nucleotide Sequencing - standards ; Humans ; Medicin och hälsovetenskap ; Mutation ; Myelodysplastic Syndromes - diagnosis ; Myelodysplastic Syndromes - genetics ; Myelodysplastic Syndromes - therapy ; Reproducibility of Results ; Sensitivity and Specificity</subject><ispartof>The Journal of molecular diagnostics : JMD, 2021-03, Vol.23 (3), p.347-357</ispartof><rights>Copyright © 2021 Association for Molecular Pathology and American Society for Investigative Pathology. Published by Elsevier Inc. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,552,780,784,885,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33359455$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttp://kipublications.ki.se/Default.aspx?queryparsed=id:146192576$$DView record from Swedish Publication Index$$Hfree_for_read</backlink></links><search><creatorcontrib>Sandmann, Sarah</creatorcontrib><creatorcontrib>de Graaf, Aniek O</creatorcontrib><creatorcontrib>Tobiasson, Magnus</creatorcontrib><creatorcontrib>Kosmider, Olivier</creatorcontrib><creatorcontrib>Abáigar, María</creatorcontrib><creatorcontrib>Clappier, Emmanuelle</creatorcontrib><creatorcontrib>Gallì, Anna</creatorcontrib><creatorcontrib>van der Reijden, Bert A</creatorcontrib><creatorcontrib>Malcovati, Luca</creatorcontrib><creatorcontrib>Fenaux, Pierre</creatorcontrib><creatorcontrib>Díez-Campelo, María</creatorcontrib><creatorcontrib>Fontenay, Michaela</creatorcontrib><creatorcontrib>Hellström-Lindberg, Eva</creatorcontrib><creatorcontrib>Jansen, Joop H</creatorcontrib><creatorcontrib>Dugas, Martin</creatorcontrib><title>Multicenter Next-Generation Sequencing Studies between Theory and Practice: Harmonization of Data Analysis Using Real-World Myelodysplastic Syndrome Data</title><title>The Journal of molecular diagnostics : JMD</title><addtitle>J Mol Diagn</addtitle><description>In the age of personalized medicine, genetic testing by means of targeted sequencing has taken a key role. However, when comparing different sets of targeted sequencing data, these are often characterized by a considerable lack of harmonization. Laboratories follow their own best practices, analyzing their own target regions. 
The question on how to best integrate data from different sites remains unanswered. Studying the example of myelodysplastic syndrome (MDS), we analyzed 11 targeted sequencing sets, collected from six different centers (n = 831). An intersecting target region of 43,076 bp (30 genes) was identified; whereas, the original target regions covered up to 499,097 bp (117 genes). Considering a region of interest in the context of MDS, a target region of 55,969 bp (31 genes) was identified. For each gene, coverage and sequencing data quality was evaluated, calculating a sequencing score. Analyses revealed huge differences between different data sets as well as between different genes. Analysis of the relation between sequencing score and mutation frequency in MDS revealed that most genes with high frequency in MDS could be sequenced without expecting low coverage or quality. Still, no gene appeared consistently unproblematic for all data sets. To allow for comparable results in a multicenter setting analyzing MDS, we propose to use a predefined target region of interest and to perform centralized data analysis using harmonized criteria.</description><subject>Algorithms</subject><subject>Alleles</subject><subject>Biomarkers</subject><subject>Data Interpretation, Statistical</subject><subject>Gene Frequency</subject><subject>Genetic Testing</subject><subject>High-Throughput Nucleotide Sequencing - methods</subject><subject>High-Throughput Nucleotide Sequencing - standards</subject><subject>Humans</subject><subject>Medicin och hälsovetenskap</subject><subject>Mutation</subject><subject>Myelodysplastic Syndromes - diagnosis</subject><subject>Myelodysplastic Syndromes - genetics</subject><subject>Myelodysplastic Syndromes - therapy</subject><subject>Reproducibility of Results</subject><subject>Sensitivity and 
Specificity</subject><issn>1943-7811</issn><issn>1525-1578</issn><issn>1943-7811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>D8T</sourceid><recordid>eNp1kc9u1DAQhy0EoqXwBgj5yCWL_8VZc6sKtEgtILYVx8iOJ-DFsVM7URvehLfFZbfAhdOMRt_3m5GN0HNKVpRQ-Wq72g7R29sVI6yM2IoQ-gAdUiV41awpffhPf4Ce5LwtgBCSPUYHnPNaibo-RD8vZj-5DsIECX-A26k6hQBJTy4GvIHrGULnwle8mWbrIGMD0w1AwJffIKYF62Dxp6S7u4jX-EynIQb3Y2fHHr_Rk8bHQfslu4yv8l3SZ9C--hKTt_hiAR_tkkevc0nAmyXYFAf47T1Fj3rtMzzb1yN09e7t5clZdf7x9P3J8Xk1MqWmSq9l3QluqWR1A7SRoPmaclM3yhCmGsMsa9Y100SB6WrZWWGB9ob0mnBqen6Eql1uvoFxNu2Y3KDT0kbt2v3oe-mgFbKWVBVe_ZcfU7R_pXuRiuKV62RxX-7cApanzVM7uNyB9zpAnHPLRMMFFUySgr7Yo7MZwP5Zc_91_BeShKGm</recordid><startdate>20210301</startdate><enddate>20210301</enddate><creator>Sandmann, Sarah</creator><creator>de Graaf, Aniek O</creator><creator>Tobiasson, Magnus</creator><creator>Kosmider, Olivier</creator><creator>Abáigar, María</creator><creator>Clappier, Emmanuelle</creator><creator>Gallì, Anna</creator><creator>van der Reijden, Bert A</creator><creator>Malcovati, Luca</creator><creator>Fenaux, Pierre</creator><creator>Díez-Campelo, María</creator><creator>Fontenay, Michaela</creator><creator>Hellström-Lindberg, Eva</creator><creator>Jansen, Joop H</creator><creator>Dugas, Martin</creator><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>7X8</scope><scope>ADTPV</scope><scope>AOWAS</scope><scope>D8T</scope><scope>ZZAVC</scope></search><sort><creationdate>20210301</creationdate><title>Multicenter Next-Generation Sequencing Studies between Theory and Practice: Harmonization of Data Analysis Using Real-World Myelodysplastic Syndrome Data</title><author>Sandmann, Sarah ; de Graaf, Aniek O ; Tobiasson, Magnus ; Kosmider, Olivier ; Abáigar, María ; Clappier, Emmanuelle ; Gallì, Anna ; van der Reijden, Bert A ; Malcovati, Luca ; Fenaux, Pierre ; 
Díez-Campelo, María ; Fontenay, Michaela ; Hellström-Lindberg, Eva ; Jansen, Joop H ; Dugas, Martin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-p299t-a865c43d16257e176ea3813b579b0297b2d27852a09ebc56cd4de1fb0fa031bf3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Alleles</topic><topic>Biomarkers</topic><topic>Data Interpretation, Statistical</topic><topic>Gene Frequency</topic><topic>Genetic Testing</topic><topic>High-Throughput Nucleotide Sequencing - methods</topic><topic>High-Throughput Nucleotide Sequencing - standards</topic><topic>Humans</topic><topic>Medicin och hälsovetenskap</topic><topic>Mutation</topic><topic>Myelodysplastic Syndromes - diagnosis</topic><topic>Myelodysplastic Syndromes - genetics</topic><topic>Myelodysplastic Syndromes - therapy</topic><topic>Reproducibility of Results</topic><topic>Sensitivity and Specificity</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sandmann, Sarah</creatorcontrib><creatorcontrib>de Graaf, Aniek O</creatorcontrib><creatorcontrib>Tobiasson, Magnus</creatorcontrib><creatorcontrib>Kosmider, Olivier</creatorcontrib><creatorcontrib>Abáigar, María</creatorcontrib><creatorcontrib>Clappier, Emmanuelle</creatorcontrib><creatorcontrib>Gallì, Anna</creatorcontrib><creatorcontrib>van der Reijden, Bert A</creatorcontrib><creatorcontrib>Malcovati, Luca</creatorcontrib><creatorcontrib>Fenaux, Pierre</creatorcontrib><creatorcontrib>Díez-Campelo, María</creatorcontrib><creatorcontrib>Fontenay, Michaela</creatorcontrib><creatorcontrib>Hellström-Lindberg, Eva</creatorcontrib><creatorcontrib>Jansen, Joop H</creatorcontrib><creatorcontrib>Dugas, Martin</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>MEDLINE - Academic</collection><collection>SwePub</collection><collection>SwePub Articles</collection><collection>SWEPUB Freely available online</collection><collection>SwePub Articles full text</collection><jtitle>The Journal of molecular diagnostics : JMD</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sandmann, Sarah</au><au>de Graaf, Aniek O</au><au>Tobiasson, Magnus</au><au>Kosmider, Olivier</au><au>Abáigar, María</au><au>Clappier, Emmanuelle</au><au>Gallì, Anna</au><au>van der Reijden, Bert A</au><au>Malcovati, Luca</au><au>Fenaux, Pierre</au><au>Díez-Campelo, María</au><au>Fontenay, Michaela</au><au>Hellström-Lindberg, Eva</au><au>Jansen, Joop H</au><au>Dugas, Martin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multicenter Next-Generation Sequencing Studies between Theory and Practice: Harmonization of Data Analysis Using Real-World Myelodysplastic Syndrome Data</atitle><jtitle>The Journal of molecular diagnostics : JMD</jtitle><addtitle>J Mol Diagn</addtitle><date>2021-03-01</date><risdate>2021</risdate><volume>23</volume><issue>3</issue><spage>347</spage><epage>357</epage><pages>347-357</pages><issn>1943-7811</issn><issn>1525-1578</issn><eissn>1943-7811</eissn><abstract>In the age of personalized medicine, genetic testing by means of targeted sequencing has taken a key role. However, when comparing different sets of targeted sequencing data, these are often characterized by a considerable lack of harmonization. Laboratories follow their own best practices, analyzing their own target regions. The question on how to best integrate data from different sites remains unanswered. Studying the example of myelodysplastic syndrome (MDS), we analyzed 11 targeted sequencing sets, collected from six different centers (n = 831). 
An intersecting target region of 43,076 bp (30 genes) was identified; whereas, the original target regions covered up to 499,097 bp (117 genes). Considering a region of interest in the context of MDS, a target region of 55,969 bp (31 genes) was identified. For each gene, coverage and sequencing data quality was evaluated, calculating a sequencing score. Analyses revealed huge differences between different data sets as well as between different genes. Analysis of the relation between sequencing score and mutation frequency in MDS revealed that most genes with high frequency in MDS could be sequenced without expecting low coverage or quality. Still, no gene appeared consistently unproblematic for all data sets. To allow for comparable results in a multicenter setting analyzing MDS, we propose to use a predefined target region of interest and to perform centralized data analysis using harmonized criteria.</abstract><cop>United States</cop><pmid>33359455</pmid><doi>10.1016/j.jmoldx.2020.12.001</doi><tpages>11</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1943-7811
ispartof The Journal of molecular diagnostics : JMD, 2021-03, Vol.23 (3), p.347-357
issn 1943-7811
1525-1578
1943-7811
language eng
recordid cdi_proquest_miscellaneous_2473414260
source MEDLINE; Elsevier ScienceDirect Journals Complete; Alma/SFX Local Collection; SWEPUB Freely available online; EZB Electronic Journals Library
subjects Algorithms
Alleles
Biomarkers
Data Interpretation, Statistical
Gene Frequency
Genetic Testing
High-Throughput Nucleotide Sequencing - methods
High-Throughput Nucleotide Sequencing - standards
Humans
Medicin och hälsovetenskap
Mutation
Myelodysplastic Syndromes - diagnosis
Myelodysplastic Syndromes - genetics
Myelodysplastic Syndromes - therapy
Reproducibility of Results
Sensitivity and Specificity
title Multicenter Next-Generation Sequencing Studies between Theory and Practice: Harmonization of Data Analysis Using Real-World Myelodysplastic Syndrome Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T16%3A26%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_swepu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multicenter%20Next-Generation%20Sequencing%20Studies%20between%20Theory%20and%20Practice:%20Harmonization%20of%20Data%20Analysis%20Using%20Real-World%20Myelodysplastic%20Syndrome%20Data&rft.jtitle=The%20Journal%20of%20molecular%20diagnostics%20:%20JMD&rft.au=Sandmann,%20Sarah&rft.date=2021-03-01&rft.volume=23&rft.issue=3&rft.spage=347&rft.epage=357&rft.pages=347-357&rft.issn=1943-7811&rft.eissn=1943-7811&rft_id=info:doi/10.1016/j.jmoldx.2020.12.001&rft_dat=%3Cproquest_swepu%3E2473414260%3C/proquest_swepu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2473414260&rft_id=info:pmid/33359455&rfr_iscdi=true
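
The description above mentions an intersecting target region shared by all sequencing sets. As an illustration only, the following minimal sketch shows one way such an intersection could be computed from per-panel interval lists. It is not the authors' pipeline: the function names (`merge`, `intersect`, `common_target`, `total_bp`), the half-open coordinate convention, and the toy coordinates are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Assumption: half-open [start, end) intervals, as used in BED files.
Interval = Tuple[int, int]
Panel = Dict[str, List[Interval]]  # chromosome name -> target intervals


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge those that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, merged interval lists with a two-pointer sweep."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def common_target(panels: List[Panel]) -> Panel:
    """Return the region covered by every panel (expects at least one panel)."""
    common = {chrom: merge(ivs) for chrom, ivs in panels[0].items()}
    for panel in panels[1:]:
        common = {
            chrom: intersect(common[chrom], merge(panel[chrom]))
            for chrom in common
            if chrom in panel
        }
    # Drop chromosomes where nothing is shared.
    return {chrom: ivs for chrom, ivs in common.items() if ivs}


def total_bp(region: Panel) -> int:
    """Total number of base pairs covered by a region."""
    return sum(end - start for ivs in region.values() for start, end in ivs)


# Toy data (hypothetical coordinates, not taken from the study).
panel_a = {"chr1": [(100, 200), (150, 300)], "chr2": [(0, 50)]}
panel_b = {"chr1": [(250, 400), (120, 180)]}
shared = common_target([panel_a, panel_b])
print(shared, total_bp(shared))
```

With real panels, the interval lists would typically be read from each center's target BED file, and the same reduction would be applied across all sets before comparing coverage or quality per gene.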