Analyzing genomic data using tensor-based orthogonal polynomials with application to synthetic RNAs

Abstract An important goal in molecular biology is to quantify both the patterns across a genomic sequence and the relationship between phenotype and underlying sequence. We propose a multivariate tensor-based orthogonal polynomial approach to characterize nucleotides or amino acids in a given seque...

Detailed description

Bibliographic details
Published in: NAR genomics and bioinformatics 2020-12, Vol.2 (4), p.lqaa101-lqaa101
Main authors: Nafees, Saba; Rice, Sean H; Wakeman, Catherine A
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page lqaa101
container_issue 4
container_start_page lqaa101
container_title NAR genomics and bioinformatics
container_volume 2
creator Nafees, Saba
Rice, Sean H
Wakeman, Catherine A
description Abstract An important goal in molecular biology is to quantify both the patterns across a genomic sequence and the relationship between phenotype and underlying sequence. We propose a multivariate tensor-based orthogonal polynomial approach to characterize nucleotides or amino acids in a given sequence and map corresponding phenotypes onto the sequence space. We have applied this method to a previously published case of small transcription activating RNAs. Covariance patterns along the sequence showcased strong correlations between nucleotides at the ends of the sequence. However, when the phenotype is projected onto the sequence space, this pattern does not emerge. When doing second order analysis and quantifying the functional relationship between the phenotype and pairs of sites along the sequence, we identified sites with high regressions spread across the sequence, indicating potential intramolecular binding. In addition to quantifying interactions between different parts of a sequence, the method quantifies sequence–phenotype interactions at first and higher order levels. We discuss the strengths and constraints of the method and compare it to computational methods such as machine learning approaches. An accompanying command line tool to compute these polynomials is provided. We show proof of concept of this approach and demonstrate its potential application to other biological systems.
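The first-order quantities the abstract mentions (covariance patterns between sites, and the relationship between phenotype and individual sites) can be illustrated with a minimal sketch. This is not the authors' command line tool: the function names, the one-hot encoding over an assumed `ACGU` alphabet, and the toy sequences and phenotype values are all hypothetical, and only the mean-centering step of an orthogonal polynomial construction is shown; higher-order orthogonalization is omitted.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet (hypothetical choice for this sketch)


def one_hot(nt):
    """Return a 4-dimensional indicator vector for one nucleotide."""
    return [1.0 if nt == a else 0.0 for a in ALPHABET]


def site_means(seqs):
    """Mean indicator vector at each site, i.e. nucleotide frequencies."""
    n = len(seqs)
    means = []
    for i in range(len(seqs[0])):
        total = [0.0] * 4
        for s in seqs:
            for k, v in enumerate(one_hot(s[i])):
                total[k] += v
        means.append([t / n for t in total])
    return means


def centered(seqs):
    """First-order terms: indicator vectors minus the per-site mean."""
    means = site_means(seqs)
    return [[[v - m for v, m in zip(one_hot(s[i]), means[i])]
             for i in range(len(s))] for s in seqs]


def site_covariance(seqs, i, j):
    """4x4 covariance matrix between sites i and j (mean outer product)."""
    c = centered(seqs)
    n = len(seqs)
    cov = [[0.0] * 4 for _ in range(4)]
    for row in c:
        for a, b in product(range(4), repeat=2):
            cov[a][b] += row[i][a] * row[j][b] / n
    return cov


def phenotype_covariance(seqs, phenotypes, i):
    """Covariance of the phenotype with each nucleotide indicator at site i."""
    c = centered(seqs)
    n = len(seqs)
    f_bar = sum(phenotypes) / n
    return [sum((f - f_bar) * row[i][a] for f, row in zip(phenotypes, c)) / n
            for a in range(4)]


seqs = ["ACGU", "ACGA", "UCGA", "UGGA"]  # made-up toy sequences
phenotypes = [1.0, 2.0, 3.0, 4.0]        # made-up phenotype values
cov_03 = site_covariance(seqs, 0, 3)     # first site vs. last site
phen_0 = phenotype_covariance(seqs, phenotypes, 0)
```

Here `cov_03` plays the role of a site-by-site covariance pattern between the two ends of the toy sequences, and `phen_0` shows how the phenotype covaries with each nucleotide at the first site. Pure-Python lists are used so the sketch has no dependencies; an array library would be the natural choice for real data.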
doi_str_mv 10.1093/nargab/lqaa101
format Article
addtitle NAR Genom Bioinform
date 2020-12-01
eissn 2631-9268
pmid 33575645
publisher England: Oxford University Press
rights The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 2020
orcidid https://orcid.org/0000-0002-3292-7703
https://orcid.org/0000-0003-0311-6669
lds50 peer_reviewed
oa free_for_read
sourcetype Open Access Repository
genre article
ristype JOUR
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7731874/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7731874/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/33575645
fulltext fulltext
identifier ISSN: 2631-9268
ispartof NAR genomics and bioinformatics, 2020-12, Vol.2 (4), p.lqaa101-lqaa101
issn 2631-9268
2631-9268
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7731874
source DOAJ Directory of Open Access Journals; Oxford Journals Open Access Collection; PubMed Central
subjects Methods
title Analyzing genomic data using tensor-based orthogonal polynomials with application to synthetic RNAs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T23%3A11%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Analyzing%20genomic%20data%20using%20tensor-based%20orthogonal%20polynomials%20with%20application%20to%20synthetic%20RNAs&rft.jtitle=NAR%20genomics%20and%20bioinformatics&rft.au=Nafees,%20Saba&rft.date=2020-12-01&rft.volume=2&rft.issue=4&rft.spage=lqaa101&rft.epage=lqaa101&rft.pages=lqaa101-lqaa101&rft.issn=2631-9268&rft.eissn=2631-9268&rft_id=info:doi/10.1093/nargab/lqaa101&rft_dat=%3Cproquest_pubme%3E2489251276%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2489251276&rft_id=info:pmid/33575645&rft_oup_id=10.1093/nargab/lqaa101&rfr_iscdi=true