K-mer applied in Mycobacterium tuberculosis genome cluster analysis
Abstract According to studies carried out, approximately 10 million people developed tuberculosis in 2018. Of this total, 1.5 million people died from the disease. To study the behavior of the genome sequences of Mycobacterium tuberculosis (MTB), the bacterium responsible for the development of tube...
Published in: | Brazilian journal of biology 2024, Vol.84, p.1-8 |
---|---|
Main authors: | Ferreira, Leila Maria; Sáfadi, Thelma; Ferreira, Juliano Lino |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 8 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | Brazilian journal of biology |
container_volume | 84 |
creator | Ferreira, Leila Maria; Sáfadi, Thelma; Ferreira, Juliano Lino |
description | Abstract: According to published studies, approximately 10 million people developed tuberculosis in 2018, and 1.5 million of them died from the disease. To study the behavior of the genome sequences of Mycobacterium tuberculosis (MTB), the bacterium that causes tuberculosis (TB), an analysis was performed using k-mers (DNA word frequencies). The k values ranged from 1 to 10. Because the analysis was performed on the full length of the sequences, each of approximately 4 million base pairs, the analysis is interrupted for k values above 10 as a consequence of the program's capacity. The aim of this work was to examine how the phylogenetic tree forms for each k-mer analyzed. The results showed distinct groups for some of the k-mers analyzed, taking the threshold line into account. However, in all groupings, the multidrug-resistant (MDR) and extensively drug-resistant (XDR) strains remained together and separated from the other strains. |
doi_str_mv | 10.1590/1519-6984.258258 |
format | Article |
fullrecord | K-mer applied in Mycobacterium tuberculosis genome cluster analysis. Ferreira, Leila Maria; Sáfadi, Thelma; Ferreira, Juliano Lino. Brazilian journal of biology (Braz. J. Biol), 2024, Vol.84, p.1-8. Publisher: São Carlos: Instituto Internacional de Ecologia. ISSN: 1519-6984; EISSN: 1678-4375. DOI: 10.1590/1519-6984.258258. Peer reviewed; open access (free to read). Rights: This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). Author ORCIDs: https://orcid.org/0000-0003-1723-8253; https://orcid.org/0000-0002-4918-300X; https://orcid.org/0000-0002-8502-4444 |
fulltext | fulltext |
identifier | ISSN: 1519-6984 |
ispartof | Brazilian journal of biology, 2024, Vol.84, p.1-8 |
issn | 1519-6984 (ISSN); 1678-4375 (EISSN) |
language | eng |
recordid | cdi_scielo_journals_S1519_69842024000100344 |
source | DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | BIOLOGY; Cluster analysis; Coronaviruses; Deoxyribonucleic acid; DNA; Drug resistance; Fatalities; Gene sequencing; Genomes; HIV; Human immunodeficiency virus; Multidrug resistance; Mycobacterium tuberculosis; Phylogeny; Tuberculosis |
title | K-mer applied in Mycobacterium tuberculosis genome cluster analysis |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T00%3A35%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_sciel&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=K-mer%20applied%20in%20Mycobacterium%20tuberculosis%20genome%20cluster%20analysis&rft.jtitle=Brazilian%20journal%20of%20biology&rft.au=Ferreira,%20Leila%20Maria&rft.date=2024&rft.volume=84&rft.spage=1&rft.epage=8&rft.pages=1-8&rft.issn=1519-6984&rft.eissn=1678-4375&rft_id=info:doi/10.1590/1519-6984.258258&rft_dat=%3Cproquest_sciel%3E2682257189%3C/proquest_sciel%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2717150699&rft_id=info:pmid/&rft_scielo_id=S1519_69842024000100344&rfr_iscdi=true |
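
The abstract describes the k-mer approach only in words: count the frequency of every DNA word of length k in each genome, then compare genomes by those frequencies. The following is a minimal Python sketch of that idea, not the authors' program. The function names, the toy sequences, and the choice of Euclidean distance are assumptions made purely for illustration; the record does not say which software or distance measure was used. The last line prints the number of possible DNA words for k = 10, which shows how quickly the frequency table grows with k; whether that growth is what limits the authors' program is not stated in the record.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_frequencies(sequence, k):
    """Relative frequency of every possible DNA word of length k.

    Hypothetical helper: windows containing characters other than
    A, C, G, T (for example N) are skipped.
    """
    sequence = sequence.upper()
    alphabet = set("ACGT")
    windows = (sequence[i:i + k] for i in range(len(sequence) - k + 1))
    counts = Counter(w for w in windows if set(w) <= alphabet)
    total = sum(counts.values())
    # Fixed ordering over all 4**k words so that vectors are comparable.
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return [counts[w] / total for w in words]


def euclidean(u, v):
    """Euclidean distance between two frequency vectors (an assumed choice)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy sequences, invented for illustration only (real genomes are ~4 million bp).
toy = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAA", "s3": "GGGGCCCCGG"}
k = 2
profiles = {name: kmer_frequencies(seq, k) for name, seq in toy.items()}

d12 = euclidean(profiles["s1"], profiles["s2"])
d13 = euclidean(profiles["s1"], profiles["s3"])

print(len(profiles["s1"]))  # 16 possible words for k = 2
print(d12 < d13)            # True: s1 is closer to s2 than to s3
print(4 ** 10)              # 1048576 possible words for k = 10
```

A pairwise distance matrix built this way could then be passed to a hierarchical clustering routine and cut at a threshold to obtain groups, which is the kind of tree-and-threshold reading the abstract refers to.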