Genome sequence comparison under a new form of tri-nucleotide representation based on bio-chemical properties of nucleotides
• A new tri-nucleotide representation is proposed for genome sequence comparison.
• The representation is non-degenerate and is based on the bio-chemical properties of the nucleotides.
• A simple Euclidean distance measure is applied for sequence comparison.
• The method does not depend on the alignment of the seque...
Published in: | Gene 2020-03, Vol.730, p.144257-144257, Article 144257 |
---|---|
Main authors: | Das, Subhram ; Das, Arijit ; Mondal, Bingshati ; Dey, Nilanjan ; Bhattacharya, D.K. ; Tibarewala, D.N. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 144257 |
---|---|
container_issue | |
container_start_page | 144257 |
container_title | Gene |
container_volume | 730 |
creator | Das, Subhram ; Das, Arijit ; Mondal, Bingshati ; Dey, Nilanjan ; Bhattacharya, D.K. ; Tibarewala, D.N. |
description | • A new tri-nucleotide representation is proposed for genome sequence comparison. • The representation is non-degenerate and is based on the bio-chemical properties of the nucleotides. • A simple Euclidean distance measure is applied for sequence comparison. • The method does not depend on the alignment of the sequences. • The results of the proposed method are verified for all possible genome sequences.
Genetic sequence analysis, genome sequence classification, and the inference of evolutionary relationships between species from their biological sequences are emerging research areas in bioinformatics. Several methods have already been applied to DNA sequence comparison under tri-nucleotide representation. In this paper, a new form of tri-nucleotide representation is proposed for sequence comparison. The comparison does not depend on the alignment of the sequences. The representation takes the bio-chemical properties of the nucleotides into account. The novelty of the method is that sequences of unequal length are represented by vectors of the same length, and each tri-nucleotide formed from a given sequence has a unique representation. The proposed method is validated on several data sets of mammals, viruses, and bacteria. The results are further compared with those obtained by other methods, such as the probabilistic method, the natural vector method, the Fourier power spectrum method, the multiple encoding vector method, and the feature frequency profiles method. Moreover, the method produces an accurate phylogeny in all cases. The authors also show that the proposed method has lower time complexity. |
doi_str_mv | 10.1016/j.gene.2019.144257 |
format | Article |
publisher | Amsterdam: Elsevier B.V |
pmid | 31759983 |
date | 2020-03-10 |
tpages | 12 |
rights | Copyright © 2019 Elsevier B.V. All rights reserved. |
orcidid | https://orcid.org/0000-0003-2899-4433 |
toplevel | peer_reviewed ; online_resources |
fulltext | fulltext |
identifier | ISSN: 0378-1119 |
ispartof | Gene, 2020-03, Vol.730, p.144257-144257, Article 144257 |
issn | 0378-1119 1879-0038 |
language | eng |
recordid | cdi_proquest_miscellaneous_2317963879 |
source | MEDLINE; Web of Science - Science Citation Index Expanded - 2020; Access via ScienceDirect (Elsevier) |
subjects | Algorithms ; Alignment-free method ; Animals ; Bacteria - genetics ; Base Sequence ; Chromosome Mapping - methods ; Cluster Analysis ; Computational Biology - methods ; Evolutionary relationship ; Genetics & Heredity ; Genomics - methods ; Humans ; Life Sciences & Biomedicine ; Mammals - genetics ; Nucleotides - chemistry ; Nucleotides - genetics ; Phylogeny ; Science & Technology ; Sequence Alignment ; Sequence Analysis, DNA - methods ; Sequence comparison ; Tri-nucleotide ; Trinucleotide Repeats - genetics ; Viruses - genetics |
title | Genome sequence comparison under a new form of tri-nucleotide representation based on bio-chemical properties of nucleotides |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-14T20%3A39%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_webof&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genome%20sequence%20comparison%20under%20a%20new%20form%20of%20tri-nucleotide%20representation%20based%20on%20bio-chemical%20properties%20of%20nucleotides&rft.jtitle=Gene&rft.au=Das,%20Subhram&rft.date=2020-03-10&rft.volume=730&rft.spage=144257&rft.epage=144257&rft.pages=144257-144257&rft.artnum=144257&rft.issn=0378-1119&rft.eissn=1879-0038&rft_id=info:doi/10.1016/j.gene.2019.144257&rft_dat=%3Cproquest_webof%3E2317963879%3C/proquest_webof%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2317963879&rft_id=info:pmid/31759983&rft_els_id=S0378111919309163&rfr_iscdi=true |
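
The description above says that sequences of unequal length are mapped to vectors of the same length and then compared with a simple Euclidean distance, without alignment. The record does not give the authors' encoding, so the following is only a minimal Python sketch of the general idea under a stated assumption: it uses plain overlapping tri-nucleotide frequencies (64 components) in place of the paper's bio-chemical property based representation. The function names `trinucleotide_vector` and `euclidean_distance` and the example sequences are hypothetical.

```python
from itertools import product
from math import sqrt

NUCLEOTIDES = "ACGT"
# All 64 tri-nucleotides in a fixed order, so every vector shares one layout.
TRIPLETS = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]
INDEX = {t: i for i, t in enumerate(TRIPLETS)}


def trinucleotide_vector(seq):
    """Return 64 relative frequencies of overlapping tri-nucleotides.

    ASSUMPTION: plain frequencies stand in for the paper's bio-chemical
    encoding, which this catalog record does not specify.
    """
    seq = seq.upper()
    counts = [0] * len(TRIPLETS)
    total = 0
    for i in range(len(seq) - 2):
        idx = INDEX.get(seq[i:i + 3])
        if idx is not None:  # skip windows containing ambiguous bases such as N
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [c / total for c in counts]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    # Two made-up sequences of different lengths yield vectors of equal length.
    a = trinucleotide_vector("ATGCGATACGCTTGA")
    b = trinucleotide_vector("ATGCGATACGCTTGAGGCTA")
    print(len(a), len(b))
    print(euclidean_distance(a, b))
```

Because every sequence is reduced to the same 64 components, a pairwise distance matrix over many genomes can be built without aligning them; such a matrix is the usual input to a tree-building step, which this sketch leaves out.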