The breakdown of the word symmetry in the human genome

Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10 nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word le...
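As an illustration of what "word symmetry" means here: Chargaff's second parity rule says that, within a single strand, a word and its reverse complement occur with roughly equal frequency. The full abstract names Pearson's correlation coefficient and total variational distance among the measures used. The sketch below computes both for a toy sequence. It is a minimal sketch, not the authors' implementation: the function names are hypothetical, only overlapping words over the alphabet ACGT are counted, and the paper's novel word symmetry distance is not reproduced because its definition is not given in this record.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Translation table for complementing a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word (hypothetical helper)."""
    return word.translate(COMPLEMENT)[::-1]


def word_frequencies(seq, k):
    """Relative frequencies of all 4**k words of length k (overlapping windows)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid words of length k")
    return {w: counts[w] / total for w in words}


def symmetry_measures(seq, k):
    """Pearson correlation and total variation distance between the
    frequencies of words and of their reverse complements."""
    freq = word_frequencies(seq, k)
    x = [freq[w] for w in freq]
    y = [freq[reverse_complement(w)] for w in freq]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # Correlation is undefined when all word frequencies are equal.
    pearson = cov / (sx * sy) if sx > 0 and sy > 0 else float("nan")
    tvd = 0.5 * sum(abs(a - b) for a, b in zip(x, y))
    return pearson, tvd


if __name__ == "__main__":
    half = "AACGTTAGC"
    # A sequence followed by its own reverse complement is exactly symmetric.
    symmetric = half + reverse_complement(half)
    print(symmetry_measures(symmetric, 2))
```

Perfect symmetry corresponds to a correlation of 1 and a distance of 0. On real data the values have to be judged against control sequences, which is the comparison the abstract describes.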

Bibliographic details
Published in: Journal of theoretical biology 2013-10, Vol.335, p.153-159
Main authors: Afreixo, Vera; Bastos, Carlos A.C.; Garcia, Sara P.; Rodrigues, João M.O.S.; Pinho, Armando J.; Ferreira, Paulo J.S.G.
Format: Article
Language: eng
container_end_page 159
container_issue
container_start_page 153
container_title Journal of theoretical biology
container_volume 335
creator Afreixo, Vera
Bastos, Carlos A.C.
Garcia, Sara P.
Rodrigues, João M.O.S.
Pinho, Armando J.
Ferreira, Paulo J.S.G.
description Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10 nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistically significant in the human genome for word lengths up to 6 nucleotides. For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought.
doi_str_mv 10.1016/j.jtbi.2013.06.032
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1676362327</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0022519313003044</els_id><sourcerecordid>1676362327</sourcerecordid><originalsourceid>FETCH-LOGICAL-c413t-670ce8fd2d27cb41ac7fee202daf1a9e28e691e36a5b019925e8547fc8cd51473</originalsourceid><addsrcrecordid>eNqNkE1P3DAURS3UCoaPP8CizbKbhPfsxE4kNtWohUpILApry7FfwMMkpnamaP59PQztEnX1pKtzr54OY-cIFQLKi1W1mntfcUBRgaxA8AO2QOiasm1q_MAWAJyXDXbiiB2ntAKArhbykB1x0QrkChdM3j1S0UcyTy68TEUYijkHLyG6Im3Hkea4Lfz0Gj5uRjMVDzSFkU7Zx8GsE5293RN2__3b3fK6vLm9-rH8elPaGsVcSgWW2sFxx5XtazRWDUQcuDMDmo54S7JDEtI0PWDX8Yby62qwrXUN1kqcsC_73ecYfm0ozXr0ydJ6bSYKm6RRKikkF_w_0Jp3ba1QQUb5HrUxpBRp0M_RjyZuNYLeqdUrvVOrd2o1SJ3V5tKnt_1NP5L7V_nrMgOf98BggjYP0Sd9_zMvNNm7kKJuMnG5Jygr--0p6mQ9TZacj2Rn7YJ_74M_a7qRqQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1429847170</pqid></control><display><type>article</type><title>The breakdown of the word symmetry in the human genome</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><creator>Afreixo, Vera ; Bastos, Carlos A.C. ; Garcia, Sara P. ; Rodrigues, João M.O.S. ; Pinho, Armando J. ; Ferreira, Paulo J.S.G.</creator><creatorcontrib>Afreixo, Vera ; Bastos, Carlos A.C. ; Garcia, Sara P. ; Rodrigues, João M.O.S. ; Pinho, Armando J. ; Ferreira, Paulo J.S.G.</creatorcontrib><description>Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? 
This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistical significant in the human genome for word lengths up to 6nucleotides. For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought.</description><identifier>ISSN: 0022-5193</identifier><identifier>EISSN: 1095-8541</identifier><identifier>DOI: 10.1016/j.jtbi.2013.06.032</identifier><identifier>PMID: 23831271</identifier><language>eng</language><publisher>England: Elsevier Ltd</publisher><subject>chromosomes ; Chromosomes, Human - genetics ; Chromosomes, Human - metabolism ; correlation ; Equivalence testing ; genome ; Genome, Human - physiology ; Humans ; Models, Genetic ; Oligonucleotide composition ; Single strand symmetry ; transcriptome ; Transcriptome - physiology ; Word symmetry distance</subject><ispartof>Journal of theoretical biology, 2013-10, Vol.335, p.153-159</ispartof><rights>2013 Elsevier Ltd</rights><rights>2013 Elsevier Ltd. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c413t-670ce8fd2d27cb41ac7fee202daf1a9e28e691e36a5b019925e8547fc8cd51473</citedby><cites>FETCH-LOGICAL-c413t-670ce8fd2d27cb41ac7fee202daf1a9e28e691e36a5b019925e8547fc8cd51473</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0022519313003044$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23831271$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Afreixo, Vera</creatorcontrib><creatorcontrib>Bastos, Carlos A.C.</creatorcontrib><creatorcontrib>Garcia, Sara P.</creatorcontrib><creatorcontrib>Rodrigues, João M.O.S.</creatorcontrib><creatorcontrib>Pinho, Armando J.</creatorcontrib><creatorcontrib>Ferreira, Paulo J.S.G.</creatorcontrib><title>The breakdown of the word symmetry in the human genome</title><title>Journal of theoretical biology</title><addtitle>J Theor Biol</addtitle><description>Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. 
We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistical significant in the human genome for word lengths up to 6nucleotides. For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought.</description><subject>chromosomes</subject><subject>Chromosomes, Human - genetics</subject><subject>Chromosomes, Human - metabolism</subject><subject>correlation</subject><subject>Equivalence testing</subject><subject>genome</subject><subject>Genome, Human - physiology</subject><subject>Humans</subject><subject>Models, Genetic</subject><subject>Oligonucleotide composition</subject><subject>Single strand symmetry</subject><subject>transcriptome</subject><subject>Transcriptome - physiology</subject><subject>Word symmetry distance</subject><issn>0022-5193</issn><issn>1095-8541</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkE1P3DAURS3UCoaPP8CizbKbhPfsxE4kNtWohUpILApry7FfwMMkpnamaP59PQztEnX1pKtzr54OY-cIFQLKi1W1mntfcUBRgaxA8AO2QOiasm1q_MAWAJyXDXbiiB2ntAKArhbykB1x0QrkChdM3j1S0UcyTy68TEUYijkHLyG6Im3Hkea4Lfz0Gj5uRjMVDzSFkU7Zx8GsE5293RN2__3b3fK6vLm9-rH8elPaGsVcSgWW2sFxx5XtazRWDUQcuDMDmo54S7JDEtI0PWDX8Yby62qwrXUN1kqcsC_73ecYfm0ozXr0ydJ6bSYKm6RRKikkF_w_0Jp3ba1QQUb5HrUxpBRp0M_RjyZuNYLeqdUrvVOrd2o1SJ3V5tKnt_1NP5L7V_nrMgOf98BggjYP0Sd9_zMvNNm7kKJuMnG5Jygr--0p6mQ9TZacj2Rn7YJ_74M_a7qRqQ</recordid><startdate>20131021</startdate><enddate>20131021</enddate><creator>Afreixo, Vera</creator><creator>Bastos, Carlos A.C.</creator><creator>Garcia, Sara P.</creator><creator>Rodrigues, João M.O.S.</creator><creator>Pinho, Armando J.</creator><creator>Ferreira, Paulo J.S.G.</creator><general>Elsevier 
Ltd</general><scope>FBQ</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope></search><sort><creationdate>20131021</creationdate><title>The breakdown of the word symmetry in the human genome</title><author>Afreixo, Vera ; Bastos, Carlos A.C. ; Garcia, Sara P. ; Rodrigues, João M.O.S. ; Pinho, Armando J. ; Ferreira, Paulo J.S.G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c413t-670ce8fd2d27cb41ac7fee202daf1a9e28e691e36a5b019925e8547fc8cd51473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>chromosomes</topic><topic>Chromosomes, Human - genetics</topic><topic>Chromosomes, Human - metabolism</topic><topic>correlation</topic><topic>Equivalence testing</topic><topic>genome</topic><topic>Genome, Human - physiology</topic><topic>Humans</topic><topic>Models, Genetic</topic><topic>Oligonucleotide composition</topic><topic>Single strand symmetry</topic><topic>transcriptome</topic><topic>Transcriptome - physiology</topic><topic>Word symmetry distance</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Afreixo, Vera</creatorcontrib><creatorcontrib>Bastos, Carlos A.C.</creatorcontrib><creatorcontrib>Garcia, Sara P.</creatorcontrib><creatorcontrib>Rodrigues, João M.O.S.</creatorcontrib><creatorcontrib>Pinho, Armando J.</creatorcontrib><creatorcontrib>Ferreira, Paulo J.S.G.</creatorcontrib><collection>AGRIS</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Technology Research 
Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><jtitle>Journal of theoretical biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Afreixo, Vera</au><au>Bastos, Carlos A.C.</au><au>Garcia, Sara P.</au><au>Rodrigues, João M.O.S.</au><au>Pinho, Armando J.</au><au>Ferreira, Paulo J.S.G.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The breakdown of the word symmetry in the human genome</atitle><jtitle>Journal of theoretical biology</jtitle><addtitle>J Theor Biol</addtitle><date>2013-10-21</date><risdate>2013</risdate><volume>335</volume><spage>153</spage><epage>159</epage><pages>153-159</pages><issn>0022-5193</issn><eissn>1095-8541</eissn><abstract>Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistical significant in the human genome for word lengths up to 6nucleotides. 
For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought.</abstract><cop>England</cop><pub>Elsevier Ltd</pub><pmid>23831271</pmid><doi>10.1016/j.jtbi.2013.06.032</doi><tpages>7</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0022-5193
ispartof Journal of theoretical biology, 2013-10, Vol.335, p.153-159
issn 0022-5193
1095-8541
language eng
recordid cdi_proquest_miscellaneous_1676362327
source MEDLINE; Elsevier ScienceDirect Journals
subjects chromosomes
Chromosomes, Human - genetics
Chromosomes, Human - metabolism
correlation
Equivalence testing
genome
Genome, Human - physiology
Humans
Models, Genetic
Oligonucleotide composition
Single strand symmetry
transcriptome
Transcriptome - physiology
Word symmetry distance
title The breakdown of the word symmetry in the human genome
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T06%3A21%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20breakdown%20of%20the%20word%20symmetry%20in%20the%20human%20genome&rft.jtitle=Journal%20of%20theoretical%20biology&rft.au=Afreixo,%20Vera&rft.date=2013-10-21&rft.volume=335&rft.spage=153&rft.epage=159&rft.pages=153-159&rft.issn=0022-5193&rft.eissn=1095-8541&rft_id=info:doi/10.1016/j.jtbi.2013.06.032&rft_dat=%3Cproquest_cross%3E1676362327%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1429847170&rft_id=info:pmid/23831271&rft_els_id=S0022519313003044&rfr_iscdi=true