K-mer applied in Mycobacterium tuberculosis genome cluster analysis

Abstract According to studies carried out, approximately 10 million people developed tuberculosis in 2018. Of this total, 1.5 million people died from the disease. To study the behavior of the genome sequences of Mycobacterium tuberculosis (MTB), the bacterium responsible for the development of tube...

Detailed description

Bibliographic details
Published in: Brazilian journal of biology 2024, Vol.84, p.1-8
Main authors: Ferreira, Leila Maria, Sáfadi, Thelma, Ferreira, Juliano Lino
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 8
container_issue
container_start_page 1
container_title Brazilian journal of biology
container_volume 84
creator Ferreira, Leila Maria
Sáfadi, Thelma
Ferreira, Juliano Lino
description Abstract According to studies, approximately 10 million people developed tuberculosis in 2018, and 1.5 million of them died from the disease. To study the behavior of the genome sequences of Mycobacterium tuberculosis (MTB), the bacterium that causes tuberculosis (TB), an analysis was performed using k-mers (DNA word frequencies). The values of k ranged from 1 to 10. Because the analysis was performed on the full length of the sequences, each of which comprises approximately 4 million base pairs, the analysis was interrupted for k values above 10 as a consequence of the program's capacity. The aim of this work was to examine the phylogenetic tree formed for each value of k. The results showed distinct groups for some values of k, taking the threshold line into account. However, in all groupings, the multidrug-resistant (MDR) and extensively drug-resistant (XDR) strains remained together and separated from the other strains.
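The k-mer (DNA word frequency) idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' program or pipeline: the function names `kmer_frequencies` and `euclidean`, the toy sequences, and the choice of Euclidean distance are all assumptions made for illustration. The sketch counts overlapping words of length k, normalises them to frequencies over the 4^k possible DNA words, and compares the resulting profiles. Since the number of possible words grows as 4^k, k = 10 already gives 1,048,576 words per profile.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k):
    """Relative frequency of each DNA word of length k, using overlapping windows."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed word order so that profiles of different sequences are comparable.
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing symbols other than A, C, G, T (e.g. N) are ignored.
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def euclidean(u, v):
    """Euclidean distance between two frequency profiles (an assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy sequences; real MTB genomes have about 4 million base pairs.
toy = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAA", "s3": "GGGGCCCCGG"}
profiles = {name: kmer_frequencies(s, 2) for name, s in toy.items()}
d12 = euclidean(profiles["s1"], profiles["s2"])
d13 = euclidean(profiles["s1"], profiles["s3"])
```

A distance matrix built this way for each k could then be passed to a hierarchical clustering routine to obtain a tree, with groups read off at a chosen threshold line. Which clustering method and threshold the article used is not stated in this record.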
doi_str_mv 10.1590/1519-6984.258258
format Article
fullrecord <record><control><sourceid>proquest_sciel</sourceid><recordid>TN_cdi_scielo_journals_S1519_69842024000100344</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><scielo_id>S1519_69842024000100344</scielo_id><sourcerecordid>2682257189</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2538-cc91b74125794459165d4a42a2eac1dbf9d6c0060a3af7c6dacc8218676eec223</originalsourceid><addsrcrecordid>eNpdUU1LxDAQDaLg-nH3WPDipeskzedRFr9wxYN6DmmaSpe0qUl72H9vS3UPwsAMM-893mMQusKwxkzBLWZY5VxJuiZMTnWEVpgLmdNCsONp_juforOUdgCEQSFXaPOSty5mpu9946qs6bLXvQ2lsYOLzdhmw1i6aEcfUpOyL9eF1mXWj2mYSZ3x-2l_gU5q45O7_O3n6PPh_mPzlG_fHp83d9vcElbI3FqFS0ExYUJRyhTmrKKGEkOcsbgqa1VxC8DBFKYWllfGWkmw5II7ZwkpztF60U22cT7oXRjjZCHp9zmdntMRIBQAMEBB6US4WQh9DN-jS4Num2Sd96ZzYUyacEkmN1iqCXr9D3pQJwILzICrGQULysaQUnS17mPTmrjXGPT8Bn1wopc3FD_DCHbM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2717150699</pqid></control><display><type>article</type><title>K-mer applied in Mycobacterium tuberculosis genome cluster analysis</title><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Ferreira, Leila Maria ; Sáfadi, Thelma ; Ferreira, Juliano Lino</creator><creatorcontrib>Ferreira, Leila Maria ; Sáfadi, Thelma ; Ferreira, Juliano Lino</creatorcontrib><description>Abstract According to studies carried out, approximately 10 million people developed tuberculosis in 2018. Of this total, 1.5 million people died from the disease. To study the behavior of the genome sequences of Mycobacterium tuberculosis (MTB), the bacterium responsible for the development of tuberculosis (TB), an analysis was performed using k-mers (DNA word frequency). 
The k values ranged from 1 to 10, because the analysis was performed on the full length of the sequences, where each sequence is composed of approximately 4 million base pairs, k values above 10, the analysis is interrupted, as consequence of the program's capacity. The aim of this work was to verify the formation of the phylogenetic tree in each k-mer analyzed. The results showed the formation of distinct groups in some k-mers analyzed, taking into account the threshold line. However, in all groups, the multidrug-resistant (MDR) and extensively drug-resistant (XDR) strains remained together and separated from the other strains. RESUMO De acordo com estudos realizados, cerca de 10 milhões de pessoas desenvolveram tuberculose em 2018. Desse total, 1,5 milhão de pessoas morreram devido à doença. Procurando estudar o comportamento das sequências do genoma da Mycobacteruim tuberculosis (MTB), bactéria responsável por desenvolver a Tuberculose (TB), foi realizada uma análise aplicando o k-mer (frequência de palavras do DNA). Os valores de k variaram de 1 a 10, pois devido a análise ter sido feita no comprimento total das sequencias, onde cada sequencia é composta por aproximadamente 4 milhões de pares de bases, valores de k acima de 10, a análise é interrompida, como consequência da capacidade do programa. O intuito do trabalho foi de verificar a formação da árvore filogenética em cada k-mer analisado. Os resultados obtidos evidenciaram a formação de grupos distintos em alguns k-mers analisados, levando-se em consideração a linha de corte. 
Entretanto, em todos os grupos formados as cepas multidroga resistente (MDR) e extensivamente resistente à droga (XDR) permaneceram juntas e separadas das demais cepas.</description><identifier>ISSN: 1519-6984</identifier><identifier>ISSN: 1678-4375</identifier><identifier>EISSN: 1678-4375</identifier><identifier>DOI: 10.1590/1519-6984.258258</identifier><language>eng</language><publisher>São Carlos: Instituto Internacional de Ecologia</publisher><subject>BIOLOGY ; Cluster analysis ; Coronaviruses ; Deoxyribonucleic acid ; DNA ; Drug resistance ; Fatalities ; Gene sequencing ; Genomes ; HIV ; Human immunodeficiency virus ; Multidrug resistance ; Mycobacterium tuberculosis ; Phylogeny ; Tuberculosis</subject><ispartof>Brazilian journal of biology, 2024, Vol.84, p.1-8</ispartof><rights>2024. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>This work is licensed under a Creative Commons Attribution 4.0 International License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c2538-cc91b74125794459165d4a42a2eac1dbf9d6c0060a3af7c6dacc8218676eec223</cites><orcidid>0000-0003-1723-8253 ; 0000-0002-4918-300X ; 0000-0002-8502-4444</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,776,780,860,881,4010,27900,27901,27902</link.rule.ids></links><search><creatorcontrib>Ferreira, Leila Maria</creatorcontrib><creatorcontrib>Sáfadi, Thelma</creatorcontrib><creatorcontrib>Ferreira, Juliano Lino</creatorcontrib><title>K-mer applied in Mycobacterium tuberculosis genome cluster analysis</title><title>Brazilian journal of biology</title><addtitle>Braz. J. 
Biol</addtitle><description>Abstract According to studies carried out, approximately 10 million people developed tuberculosis in 2018. Of this total, 1.5 million people died from the disease. To study the behavior of the genome sequences of Mycobacterium tuberculosis (MTB), the bacterium responsible for the development of tuberculosis (TB), an analysis was performed using k-mers (DNA word frequency). The k values ranged from 1 to 10, because the analysis was performed on the full length of the sequences, where each sequence is composed of approximately 4 million base pairs, k values above 10, the analysis is interrupted, as consequence of the program's capacity. The aim of this work was to verify the formation of the phylogenetic tree in each k-mer analyzed. The results showed the formation of distinct groups in some k-mers analyzed, taking into account the threshold line. However, in all groups, the multidrug-resistant (MDR) and extensively drug-resistant (XDR) strains remained together and separated from the other strains. RESUMO De acordo com estudos realizados, cerca de 10 milhões de pessoas desenvolveram tuberculose em 2018. Desse total, 1,5 milhão de pessoas morreram devido à doença. Procurando estudar o comportamento das sequências do genoma da Mycobacteruim tuberculosis (MTB), bactéria responsável por desenvolver a Tuberculose (TB), foi realizada uma análise aplicando o k-mer (frequência de palavras do DNA). Os valores de k variaram de 1 a 10, pois devido a análise ter sido feita no comprimento total das sequencias, onde cada sequencia é composta por aproximadamente 4 milhões de pares de bases, valores de k acima de 10, a análise é interrompida, como consequência da capacidade do programa. O intuito do trabalho foi de verificar a formação da árvore filogenética em cada k-mer analisado. Os resultados obtidos evidenciaram a formação de grupos distintos em alguns k-mers analisados, levando-se em consideração a linha de corte. 
Entretanto, em todos os grupos formados as cepas multidroga resistente (MDR) e extensivamente resistente à droga (XDR) permaneceram juntas e separadas das demais cepas.</description><subject>BIOLOGY</subject><subject>Cluster analysis</subject><subject>Coronaviruses</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>Drug resistance</subject><subject>Fatalities</subject><subject>Gene sequencing</subject><subject>Genomes</subject><subject>HIV</subject><subject>Human immunodeficiency virus</subject><subject>Multidrug resistance</subject><subject>Mycobacterium tuberculosis</subject><subject>Phylogeny</subject><subject>Tuberculosis</subject><issn>1519-6984</issn><issn>1678-4375</issn><issn>1678-4375</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>8G5</sourceid><sourceid>BENPR</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNpdUU1LxDAQDaLg-nH3WPDipeskzedRFr9wxYN6DmmaSpe0qUl72H9vS3UPwsAMM-893mMQusKwxkzBLWZY5VxJuiZMTnWEVpgLmdNCsONp_juforOUdgCEQSFXaPOSty5mpu9946qs6bLXvQ2lsYOLzdhmw1i6aEcfUpOyL9eF1mXWj2mYSZ3x-2l_gU5q45O7_O3n6PPh_mPzlG_fHp83d9vcElbI3FqFS0ExYUJRyhTmrKKGEkOcsbgqa1VxC8DBFKYWllfGWkmw5II7ZwkpztF60U22cT7oXRjjZCHp9zmdntMRIBQAMEBB6US4WQh9DN-jS4Num2Sd96ZzYUyacEkmN1iqCXr9D3pQJwILzICrGQULysaQUnS17mPTmrjXGPT8Bn1wopc3FD_DCHbM</recordid><startdate>2024</startdate><enddate>2024</enddate><creator>Ferreira, Leila Maria</creator><creator>Sáfadi, Thelma</creator><creator>Ferreira, Juliano Lino</creator><general>Instituto Internacional de 
Ecologia</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QG</scope><scope>7QL</scope><scope>7SN</scope><scope>7SS</scope><scope>7T7</scope><scope>7U9</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>CLZPN</scope><scope>COVID</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>H94</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2O</scope><scope>M7N</scope><scope>M7P</scope><scope>MBDVC</scope><scope>P64</scope><scope>PATMY</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PYCSY</scope><scope>Q9U</scope><scope>7X8</scope><scope>GPN</scope><orcidid>https://orcid.org/0000-0003-1723-8253</orcidid><orcidid>https://orcid.org/0000-0002-4918-300X</orcidid><orcidid>https://orcid.org/0000-0002-8502-4444</orcidid></search><sort><creationdate>2024</creationdate><title>K-mer applied in Mycobacterium tuberculosis genome cluster analysis</title><author>Ferreira, Leila Maria ; Sáfadi, Thelma ; Ferreira, Juliano Lino</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2538-cc91b74125794459165d4a42a2eac1dbf9d6c0060a3af7c6dacc8218676eec223</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>BIOLOGY</topic><topic>Cluster analysis</topic><topic>Coronaviruses</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>Drug resistance</topic><topic>Fatalities</topic><topic>Gene 
sequencing</topic><topic>Genomes</topic><topic>HIV</topic><topic>Human immunodeficiency virus</topic><topic>Multidrug resistance</topic><topic>Mycobacterium tuberculosis</topic><topic>Phylogeny</topic><topic>Tuberculosis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ferreira, Leila Maria</creatorcontrib><creatorcontrib>Sáfadi, Thelma</creatorcontrib><creatorcontrib>Ferreira, Juliano Lino</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Animal Behavior Abstracts</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Industrial and Applied Microbiology Abstracts (Microbiology A)</collection><collection>Virology and AIDS Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>Agricultural &amp; Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science Collection</collection><collection>Environmental Sciences and Pollution 
Management</collection><collection>ProQuest One Community College</collection><collection>Latin America &amp; Iberia Database</collection><collection>Coronavirus Research Database</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Research Library</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biological Science Database</collection><collection>Research Library (Corporate)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Environmental Science Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Environmental Science Collection</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>SciELO</collection><jtitle>Brazilian journal of biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ferreira, Leila Maria</au><au>Sáfadi, Thelma</au><au>Ferreira, Juliano 
Lino</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>K-mer applied in Mycobacterium tuberculosis genome cluster analysis</atitle><jtitle>Brazilian journal of biology</jtitle><addtitle>Braz. J. Biol</addtitle><date>2024</date><risdate>2024</risdate><volume>84</volume><spage>1</spage><epage>8</epage><pages>1-8</pages><issn>1519-6984</issn><issn>1678-4375</issn><eissn>1678-4375</eissn><abstract>Abstract According to studies carried out, approximately 10 million people developed tuberculosis in 2018. Of this total, 1.5 million people died from the disease. To study the behavior of the genome sequences of Mycobacterium tuberculosis (MTB), the bacterium responsible for the development of tuberculosis (TB), an analysis was performed using k-mers (DNA word frequency). The k values ranged from 1 to 10, because the analysis was performed on the full length of the sequences, where each sequence is composed of approximately 4 million base pairs, k values above 10, the analysis is interrupted, as consequence of the program's capacity. The aim of this work was to verify the formation of the phylogenetic tree in each k-mer analyzed. The results showed the formation of distinct groups in some k-mers analyzed, taking into account the threshold line. However, in all groups, the multidrug-resistant (MDR) and extensively drug-resistant (XDR) strains remained together and separated from the other strains. RESUMO De acordo com estudos realizados, cerca de 10 milhões de pessoas desenvolveram tuberculose em 2018. Desse total, 1,5 milhão de pessoas morreram devido à doença. Procurando estudar o comportamento das sequências do genoma da Mycobacteruim tuberculosis (MTB), bactéria responsável por desenvolver a Tuberculose (TB), foi realizada uma análise aplicando o k-mer (frequência de palavras do DNA). 
Os valores de k variaram de 1 a 10, pois devido a análise ter sido feita no comprimento total das sequencias, onde cada sequencia é composta por aproximadamente 4 milhões de pares de bases, valores de k acima de 10, a análise é interrompida, como consequência da capacidade do programa. O intuito do trabalho foi de verificar a formação da árvore filogenética em cada k-mer analisado. Os resultados obtidos evidenciaram a formação de grupos distintos em alguns k-mers analisados, levando-se em consideração a linha de corte. Entretanto, em todos os grupos formados as cepas multidroga resistente (MDR) e extensivamente resistente à droga (XDR) permaneceram juntas e separadas das demais cepas.</abstract><cop>São Carlos</cop><pub>Instituto Internacional de Ecologia</pub><doi>10.1590/1519-6984.258258</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0003-1723-8253</orcidid><orcidid>https://orcid.org/0000-0002-4918-300X</orcidid><orcidid>https://orcid.org/0000-0002-8502-4444</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1519-6984
ispartof Brazilian journal of biology, 2024, Vol.84, p.1-8
issn 1519-6984
1678-4375
1678-4375
language eng
recordid cdi_scielo_journals_S1519_69842024000100344
source DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects BIOLOGY
Cluster analysis
Coronaviruses
Deoxyribonucleic acid
DNA
Drug resistance
Fatalities
Gene sequencing
Genomes
HIV
Human immunodeficiency virus
Multidrug resistance
Mycobacterium tuberculosis
Phylogeny
Tuberculosis
title K-mer applied in Mycobacterium tuberculosis genome cluster analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T00%3A35%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_sciel&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=K-mer%20applied%20in%20Mycobacterium%20tuberculosis%20genome%20cluster%20analysis&rft.jtitle=Brazilian%20journal%20of%20biology&rft.au=Ferreira,%20Leila%20Maria&rft.date=2024&rft.volume=84&rft.spage=1&rft.epage=8&rft.pages=1-8&rft.issn=1519-6984&rft.eissn=1678-4375&rft_id=info:doi/10.1590/1519-6984.258258&rft_dat=%3Cproquest_sciel%3E2682257189%3C/proquest_sciel%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2717150699&rft_id=info:pmid/&rft_scielo_id=S1519_69842024000100344&rfr_iscdi=true