Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human
DNA methylation plays a significant role in transcriptional regulation by repressing activity. Change of the DNA methylation level is an important factor affecting the expression of target genes and downstream phenotypes. Because current experimental technologies can only assay a small proportion of...
Published in: | International journal of molecular sciences 2017-02, Vol.18 (2), p.420-420 |
---|---|
Main authors: | Wu, Chengchao ; Yao, Shixin ; Li, Xinghao ; Chen, Chujia ; Hu, Xuehai |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 420 |
---|---|
container_issue | 2 |
container_start_page | 420 |
container_title | International journal of molecular sciences |
container_volume | 18 |
creator | Wu, Chengchao ; Yao, Shixin ; Li, Xinghao ; Chen, Chujia ; Hu, Xuehai |
description | DNA methylation plays a significant role in transcriptional regulation by repressing activity. Change of the DNA methylation level is an important factor affecting the expression of target genes and downstream phenotypes. Because current experimental technologies can only assay a small proportion of CpG sites in the human genome, it is urgent to develop reliable computational models for predicting genome-wide DNA methylation. Here, we propose a novel algorithm that extracts sequence complexity features (seven features) and develop a support-vector-machine-based prediction model that integrates these with previously reported DNA composition features (trinucleotide frequency and GC content, 65 features), trained on the methylation profiles of human embryonic stem cells. Prediction results from 22 human chromosomes with windows of varying size showed that the 600-bp window achieved the best average accuracy of 94.7%. Moreover, comparisons with two existing methods showed that our model performed better, and cross-species predictions on mouse data demonstrated that our model has some ability to generalize. Finally, a statistical test of the experimental data and the predicted data on functional regions annotated by ChromHMM found that six out of 10 regions were consistent, which suggests reliable prediction of unassayed CpG sites. Accordingly, we believe that our novel model will be useful and reliable in predicting DNA methylation. |
doi_str_mv | 10.3390/ijms18020420 |
format | Article |
publisher | Switzerland: MDPI AG |
pmid | 28212312 |
date | 2017-02-16 |
addtitle | Int J Mol Sci |
rights | Copyright MDPI AG 2017; 2017 by the authors |
oa | free_for_read |
toplevel | peer_reviewed; online_resources |
linktohtml | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5343954/ |
linktopdf | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5343954/pdf/ |
backlink | https://www.ncbi.nlm.nih.gov/pubmed/28212312 |
fulltext | fulltext |
identifier | ISSN: 1422-0067 |
ispartof | International journal of molecular sciences, 2017-02, Vol.18 (2), p.420-420 |
issn | 1422-0067 (ISSN); 1661-6596 (ISSN); 1422-0067 (EISSN) |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5343954 |
source | MDPI - Multidisciplinary Digital Publishing Institute; MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central |
subjects | Animal models ; Animals ; Assaying ; Base Composition ; Chromosomes ; Computational Biology - methods ; Computer applications ; CpG Islands ; Datasets as Topic ; Deoxyribonucleic acid ; DNA ; DNA Methylation ; Embryo cells ; Epigenomics - methods ; Gene expression ; Gene Expression Profiling ; Gene regulation ; Genes ; Genome, Human ; Genome-Wide Association Study ; Genomes ; Humans ; Mathematical models ; Models, Genetic ; Nucleotide sequence ; Prediction models ; Reproducibility of Results ; ROC Curve ; Species Specificity ; Stem cell transplantation ; Stem cells ; Transcription |
title | Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-16T11%3A25%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genome-Wide%20Prediction%20of%20DNA%20Methylation%20Using%20DNA%20Composition%20and%20Sequence%20Complexity%20in%20Human&rft.jtitle=International%20journal%20of%20molecular%20sciences&rft.au=Wu,%20Chengchao&rft.date=2017-02-16&rft.volume=18&rft.issue=2&rft.spage=420&rft.epage=420&rft.pages=420-420&rft.issn=1422-0067&rft.eissn=1422-0067&rft_id=info:doi/10.3390/ijms18020420&rft_dat=%3Cproquest_pubme%3E1891861563%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1878420674&rft_id=info:pmid/28212312&rfr_iscdi=true |
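
The description above mentions 65 DNA composition features (trinucleotide frequency and GC content) computed over a sequence window. As a rough illustration only, the following Python sketch shows one way such a vector could be built: 64 trinucleotide frequencies plus one GC-content value. The function name `composition_features`, the use of overlapping trinucleotides, the normalization by the number of valid trinucleotides, and the skipping of ambiguous bases such as `N` are all assumptions made for this sketch; the record does not specify how the authors computed these features.

```python
from itertools import product
from typing import List

# All 64 possible trinucleotides over the alphabet A, C, G, T, in a fixed order.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def composition_features(window: str) -> List[float]:
    """Return 65 values: 64 trinucleotide frequencies followed by GC content.

    Hypothetical helper for illustration; overlapping trinucleotides are
    counted, and any trinucleotide containing a non-ACGT character is skipped.
    """
    seq = window.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    # Frequencies sum to 1 when at least one valid trinucleotide was seen.
    freqs = [counts[t] / total if total else 0.0 for t in TRINUCLEOTIDES]
    # GC content is computed over unambiguous bases only (an assumption).
    acgt = sum(seq.count(base) for base in "ACGT")
    gc = (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0
    return freqs + [gc]


features = composition_features("ACGTACGT")
```

In a setting like the one described, a vector of this kind would be computed for the window around each CpG site (for example, 600 bp) and combined with the seven sequence complexity features before being passed to a classifier; those further steps are not sketched here because the record gives no details about them.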