The breakdown of the word symmetry in the human genome
Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10 nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word le...
Published in: | Journal of theoretical biology 2013-10, Vol.335, p.153-159 |
---|---|
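Main authors: | Afreixo, Vera; Bastos, Carlos A.C.; Garcia, Sara P.; Rodrigues, João M.O.S.; Pinho, Armando J.; Ferreira, Paulo J.S.G. |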
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 159 |
---|---|
container_issue | |
container_start_page | 153 |
container_title | Journal of theoretical biology |
container_volume | 335 |
creator | Afreixo, Vera; Bastos, Carlos A.C.; Garcia, Sara P.; Rodrigues, João M.O.S.; Pinho, Armando J.; Ferreira, Paulo J.S.G. |
description | Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10 nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistically significant in the human genome for word lengths up to 6 nucleotides. For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought. |
doi_str_mv | 10.1016/j.jtbi.2013.06.032 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1676362327</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0022519313003044</els_id><sourcerecordid>1676362327</sourcerecordid><originalsourceid>FETCH-LOGICAL-c413t-670ce8fd2d27cb41ac7fee202daf1a9e28e691e36a5b019925e8547fc8cd51473</originalsourceid><addsrcrecordid>eNqNkE1P3DAURS3UCoaPP8CizbKbhPfsxE4kNtWohUpILApry7FfwMMkpnamaP59PQztEnX1pKtzr54OY-cIFQLKi1W1mntfcUBRgaxA8AO2QOiasm1q_MAWAJyXDXbiiB2ntAKArhbykB1x0QrkChdM3j1S0UcyTy68TEUYijkHLyG6Im3Hkea4Lfz0Gj5uRjMVDzSFkU7Zx8GsE5293RN2__3b3fK6vLm9-rH8elPaGsVcSgWW2sFxx5XtazRWDUQcuDMDmo54S7JDEtI0PWDX8Yby62qwrXUN1kqcsC_73ecYfm0ozXr0ydJ6bSYKm6RRKikkF_w_0Jp3ba1QQUb5HrUxpBRp0M_RjyZuNYLeqdUrvVOrd2o1SJ3V5tKnt_1NP5L7V_nrMgOf98BggjYP0Sd9_zMvNNm7kKJuMnG5Jygr--0p6mQ9TZacj2Rn7YJ_74M_a7qRqQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1429847170</pqid></control><display><type>article</type><title>The breakdown of the word symmetry in the human genome</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><creator>Afreixo, Vera ; Bastos, Carlos A.C. ; Garcia, Sara P. ; Rodrigues, João M.O.S. ; Pinho, Armando J. ; Ferreira, Paulo J.S.G.</creator><creatorcontrib>Afreixo, Vera ; Bastos, Carlos A.C. ; Garcia, Sara P. ; Rodrigues, João M.O.S. ; Pinho, Armando J. ; Ferreira, Paulo J.S.G.</creatorcontrib><description>Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? 
This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistical significant in the human genome for word lengths up to 6nucleotides. For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought.</description><identifier>ISSN: 0022-5193</identifier><identifier>EISSN: 1095-8541</identifier><identifier>DOI: 10.1016/j.jtbi.2013.06.032</identifier><identifier>PMID: 23831271</identifier><language>eng</language><publisher>England: Elsevier Ltd</publisher><subject>chromosomes ; Chromosomes, Human - genetics ; Chromosomes, Human - metabolism ; correlation ; Equivalence testing ; genome ; Genome, Human - physiology ; Humans ; Models, Genetic ; Oligonucleotide composition ; Single strand symmetry ; transcriptome ; Transcriptome - physiology ; Word symmetry distance</subject><ispartof>Journal of theoretical biology, 2013-10, Vol.335, p.153-159</ispartof><rights>2013 Elsevier Ltd</rights><rights>2013 Elsevier Ltd. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c413t-670ce8fd2d27cb41ac7fee202daf1a9e28e691e36a5b019925e8547fc8cd51473</citedby><cites>FETCH-LOGICAL-c413t-670ce8fd2d27cb41ac7fee202daf1a9e28e691e36a5b019925e8547fc8cd51473</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0022519313003044$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23831271$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Afreixo, Vera</creatorcontrib><creatorcontrib>Bastos, Carlos A.C.</creatorcontrib><creatorcontrib>Garcia, Sara P.</creatorcontrib><creatorcontrib>Rodrigues, João M.O.S.</creatorcontrib><creatorcontrib>Pinho, Armando J.</creatorcontrib><creatorcontrib>Ferreira, Paulo J.S.G.</creatorcontrib><title>The breakdown of the word symmetry in the human genome</title><title>Journal of theoretical biology</title><addtitle>J Theor Biol</addtitle><description>Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. 
We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistical significant in the human genome for word lengths up to 6nucleotides. For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought.</description><subject>chromosomes</subject><subject>Chromosomes, Human - genetics</subject><subject>Chromosomes, Human - metabolism</subject><subject>correlation</subject><subject>Equivalence testing</subject><subject>genome</subject><subject>Genome, Human - physiology</subject><subject>Humans</subject><subject>Models, Genetic</subject><subject>Oligonucleotide composition</subject><subject>Single strand symmetry</subject><subject>transcriptome</subject><subject>Transcriptome - physiology</subject><subject>Word symmetry distance</subject><issn>0022-5193</issn><issn>1095-8541</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkE1P3DAURS3UCoaPP8CizbKbhPfsxE4kNtWohUpILApry7FfwMMkpnamaP59PQztEnX1pKtzr54OY-cIFQLKi1W1mntfcUBRgaxA8AO2QOiasm1q_MAWAJyXDXbiiB2ntAKArhbykB1x0QrkChdM3j1S0UcyTy68TEUYijkHLyG6Im3Hkea4Lfz0Gj5uRjMVDzSFkU7Zx8GsE5293RN2__3b3fK6vLm9-rH8elPaGsVcSgWW2sFxx5XtazRWDUQcuDMDmo54S7JDEtI0PWDX8Yby62qwrXUN1kqcsC_73ecYfm0ozXr0ydJ6bSYKm6RRKikkF_w_0Jp3ba1QQUb5HrUxpBRp0M_RjyZuNYLeqdUrvVOrd2o1SJ3V5tKnt_1NP5L7V_nrMgOf98BggjYP0Sd9_zMvNNm7kKJuMnG5Jygr--0p6mQ9TZacj2Rn7YJ_74M_a7qRqQ</recordid><startdate>20131021</startdate><enddate>20131021</enddate><creator>Afreixo, Vera</creator><creator>Bastos, Carlos A.C.</creator><creator>Garcia, Sara P.</creator><creator>Rodrigues, João M.O.S.</creator><creator>Pinho, Armando J.</creator><creator>Ferreira, Paulo J.S.G.</creator><general>Elsevier 
Ltd</general><scope>FBQ</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope></search><sort><creationdate>20131021</creationdate><title>The breakdown of the word symmetry in the human genome</title><author>Afreixo, Vera ; Bastos, Carlos A.C. ; Garcia, Sara P. ; Rodrigues, João M.O.S. ; Pinho, Armando J. ; Ferreira, Paulo J.S.G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c413t-670ce8fd2d27cb41ac7fee202daf1a9e28e691e36a5b019925e8547fc8cd51473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>chromosomes</topic><topic>Chromosomes, Human - genetics</topic><topic>Chromosomes, Human - metabolism</topic><topic>correlation</topic><topic>Equivalence testing</topic><topic>genome</topic><topic>Genome, Human - physiology</topic><topic>Humans</topic><topic>Models, Genetic</topic><topic>Oligonucleotide composition</topic><topic>Single strand symmetry</topic><topic>transcriptome</topic><topic>Transcriptome - physiology</topic><topic>Word symmetry distance</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Afreixo, Vera</creatorcontrib><creatorcontrib>Bastos, Carlos A.C.</creatorcontrib><creatorcontrib>Garcia, Sara P.</creatorcontrib><creatorcontrib>Rodrigues, João M.O.S.</creatorcontrib><creatorcontrib>Pinho, Armando J.</creatorcontrib><creatorcontrib>Ferreira, Paulo J.S.G.</creatorcontrib><collection>AGRIS</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Technology Research 
Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><jtitle>Journal of theoretical biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Afreixo, Vera</au><au>Bastos, Carlos A.C.</au><au>Garcia, Sara P.</au><au>Rodrigues, João M.O.S.</au><au>Pinho, Armando J.</au><au>Ferreira, Paulo J.S.G.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The breakdown of the word symmetry in the human genome</atitle><jtitle>Journal of theoretical biology</jtitle><addtitle>J Theor Biol</addtitle><date>2013-10-21</date><risdate>2013</risdate><volume>335</volume><spage>153</spage><epage>159</epage><pages>153-159</pages><issn>0022-5193</issn><eissn>1095-8541</eissn><abstract>Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistical significant in the human genome for word lengths up to 6nucleotides. 
For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought.</abstract><cop>England</cop><pub>Elsevier Ltd</pub><pmid>23831271</pmid><doi>10.1016/j.jtbi.2013.06.032</doi><tpages>7</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0022-5193 |
ispartof | Journal of theoretical biology, 2013-10, Vol.335, p.153-159 |
issn | 0022-5193; 1095-8541 |
language | eng |
recordid | cdi_proquest_miscellaneous_1676362327 |
source | MEDLINE; Elsevier ScienceDirect Journals |
subjects | chromosomes; Chromosomes, Human - genetics; Chromosomes, Human - metabolism; correlation; Equivalence testing; genome; Genome, Human - physiology; Humans; Models, Genetic; Oligonucleotide composition; Single strand symmetry; transcriptome; Transcriptome - physiology; Word symmetry distance |
title | The breakdown of the word symmetry in the human genome |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T06%3A21%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20breakdown%20of%20the%20word%20symmetry%20in%20the%20human%20genome&rft.jtitle=Journal%20of%20theoretical%20biology&rft.au=Afreixo,%20Vera&rft.date=2013-10-21&rft.volume=335&rft.spage=153&rft.epage=159&rft.pages=153-159&rft.issn=0022-5193&rft.eissn=1095-8541&rft_id=info:doi/10.1016/j.jtbi.2013.06.032&rft_dat=%3Cproquest_cross%3E1676362327%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1429847170&rft_id=info:pmid/23831271&rft_els_id=S0022519313003044&rfr_iscdi=true |
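
The description field above names Pearson's correlation coefficient and a total variational distance among the measures used to evaluate word symmetry, that is, how closely the frequency of each word of length k matches the frequency of its reverse complement on the same strand. The following is a minimal illustrative sketch of those two measures only, not the authors' procedure. The function names (`reverse_complement`, `word_frequencies`, `symmetry_measures`) are hypothetical, words containing symbols other than A, C, G, T are simply skipped as an assumption, and the paper's novel word symmetry distance and equivalence tests are not reproduced because this record does not define them.

```python
from collections import Counter
from math import sqrt

VALID = set("ACGT")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word (A<->T, C<->G, reversed)."""
    return word.translate(COMPLEMENT)[::-1]


def word_frequencies(seq, k):
    """Relative frequencies of overlapping words of length k in seq.

    Assumption: words containing anything other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= VALID
    )
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def symmetry_measures(seq, k):
    """Compare each word's frequency with that of its reverse complement.

    Returns (pearson_r, total_variation_distance). pearson_r is NaN when
    all frequencies are equal, since the correlation is then undefined.
    """
    freq = word_frequencies(seq, k)
    # Close the word set under reverse complement so both vectors sum to 1.
    words = sorted(set(freq) | {reverse_complement(w) for w in freq})
    x = [freq.get(w, 0.0) for w in words]
    y = [freq.get(reverse_complement(w), 0.0) for w in words]

    tvd = 0.5 * sum(abs(a - b) for a, b in zip(x, y))

    n = len(words)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    r = cov / sqrt(vx * vy) if vx > 0 and vy > 0 else float("nan")
    return r, tvd


if __name__ == "__main__":
    # A sequence followed by its own reverse complement is exactly symmetric.
    s = "AAACG"
    print(symmetry_measures(s + reverse_complement(s), 2))
    # A homopolymer is maximally asymmetric for k = 1.
    print(symmetry_measures("AAAA", 1))
```

Under exact symmetry the correlation approaches 1 and the distance is 0. Judging whether an observed deviation is significant would need the statistical tests and random control sequences that the description mentions, which this sketch does not attempt.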