t-Distributed Stochastic Neighbor Embedding (t-SNE): A tool for eco-physiological transcriptomic analysis

High-throughput RNA sequencing (RNA-Seq) has transformed the ecophysiological assessment of individual plankton species and communities. However, the technology generates complex data consisting of millions of short-read sequences that can be difficult to analyze and interpret. New bioinformatics wo...

Bibliographic details
Published in: Marine genomics 2020-06, Vol.51, p.100723-100723, Article 100723
Main authors: Cieslak, Matthew C., Castelfranco, Ann M., Roncalli, Vittoria, Lenz, Petra H., Hartline, Daniel K.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 100723
container_issue
container_start_page 100723
container_title Marine genomics
container_volume 51
creator Cieslak, Matthew C.
Castelfranco, Ann M.
Roncalli, Vittoria
Lenz, Petra H.
Hartline, Daniel K.
description High-throughput RNA sequencing (RNA-Seq) has transformed the ecophysiological assessment of individual plankton species and communities. However, the technology generates complex data consisting of millions of short-read sequences that can be difficult to analyze and interpret. New bioinformatics workflows are needed to guide experimentation, environmental sampling, and to develop and test hypotheses. One complexity-reducing tool that has been used successfully in other fields is “t-distributed Stochastic Neighbor Embedding” (t-SNE). Its application to transcriptomic data from marine pelagic and benthic systems has yet to be explored. The present study demonstrates an application for evaluating RNA-Seq data using previously published, conventionally analyzed studies on the copepods Calanus finmarchicus and Neocalanus flemingeri. In one application, gene expression profiles were compared among different developmental stages. In another, they were compared among experimental conditions. In a third, they were compared among environmental samples from different locations. The profile categories identified by t-SNE were validated by reference to published results using differential gene expression and Gene Ontology (GO) analyses. The analyses demonstrate how individual samples can be evaluated for differences in global gene expression, as well as differences in expression related to specific biological processes, such as lipid metabolism and responses to stress. As RNA-Seq data from plankton species and communities become more common, t-SNE analysis should provide a powerful tool for determining trends and classifying samples into groups with similar transcriptional physiology, independent of collection site or time.
doi_str_mv 10.1016/j.margen.2019.100723
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2439416199</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1874778719301746</els_id><sourcerecordid>2439416199</sourcerecordid><originalsourceid>FETCH-LOGICAL-c441t-63e32be561305fbe85be81c1b8d80fa9c25bf8fb455eeb83d8d404792aef126b3</originalsourceid><addsrcrecordid>eNqFkUtP3DAQgK0KVCjtP6iqHOkhi1-JHQ5IK7q0SIgeaM-WH5Ndr5J4sb1I_PuaBjjCwRpr5psZaT6EvhK8IJi0Z9vFqOMapgXFpCspLCj7gI6JFG0tuJAH__-8FkKKI_QppS3GLRUSf0RHjAjJWcOOkc_1D59y9GafwVV3OdiNTtnb6hb8emNCrFajAef8tK5Oc313u_p-Xi2rHMJQ9aUKNtS7zWPyYQhrb_VQ5ainZKPf5TCWOXrSQymnz-iw10OCL8_xBP29Wv25_FXf_P55fbm8qS3nJNctA0YNNC1huOkNyKY8YomRTuJed5Y2ppe94U0DYCRz0nHMRUc19IS2hp2g03nuLob7PaSsRp8sDIOeIOyTopx1nLSk695HGcVMYi6bgvIZtTGkFKFXu-iLgEdFsHryobZq9qGefKjZR2n79rxhb0Zwr00vAgpwMQNQTvLgIapkPUwWnI9gs3LBv73hH4oOnhk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2320380485</pqid></control><display><type>article</type><title>t-Distributed Stochastic Neighbor Embedding (t-SNE): A tool for eco-physiological transcriptomic analysis</title><source>Elsevier ScienceDirect Journals Complete</source><creator>Cieslak, Matthew C. ; Castelfranco, Ann M. ; Roncalli, Vittoria ; Lenz, Petra H. ; Hartline, Daniel K.</creator><creatorcontrib>Cieslak, Matthew C. ; Castelfranco, Ann M. ; Roncalli, Vittoria ; Lenz, Petra H. ; Hartline, Daniel K.</creatorcontrib><description>High-throughput RNA sequencing (RNA-Seq) has transformed the ecophysiological assessment of individual plankton species and communities. However, the technology generates complex data consisting of millions of short-read sequences that can be difficult to analyze and interpret. New bioinformatics workflows are needed to guide experimentation, environmental sampling, and to develop and test hypotheses. 
One complexity-reducing tool that has been used successfully in other fields is “t-distributed Stochastic Neighbor Embedding” (t-SNE). Its application to transcriptomic data from marine pelagic and benthic systems has yet to be explored. The present study demonstrates an application for evaluating RNA-Seq data using previously published, conventionally analyzed studies on the copepods Calanus finmarchicus and Neocalanus flemingeri. In one application, gene expression profiles were compared among different developmental stages. In another, they were compared among experimental conditions. In a third, they were compared among environmental samples from different locations. The profile categories identified by t-SNE were validated by reference to published results using differential gene expression and Gene Ontology (GO) analyses. The analyses demonstrate how individual samples can be evaluated for differences in global gene expression, as well as differences in expression related to specific biological processes, such as lipid metabolism and responses to stress. 
As RNA-Seq data from plankton species and communities become more common, t-SNE analysis should provide a powerful tool for determining trends and classifying samples into groups with similar transcriptional physiology, independent of collection site or time.</description><identifier>ISSN: 1874-7787</identifier><identifier>EISSN: 1876-7478</identifier><identifier>DOI: 10.1016/j.margen.2019.100723</identifier><identifier>PMID: 31784353</identifier><language>eng</language><publisher>Netherlands: Elsevier B.V</publisher><subject>Bioinformatics ; Calanus finmarchicus ; Copepod ; ecophysiology ; gene expression ; gene expression regulation ; gene ontology ; genomics ; lipid metabolism ; Omics ; plankton ; RNA ; RNA-Seq ; sequence analysis ; species ; transcription (genetics) ; transcriptomics ; Zooplankton</subject><ispartof>Marine genomics, 2020-06, Vol.51, p.100723-100723, Article 100723</ispartof><rights>2019 The Author(s)</rights><rights>Copyright © 2019 The Author(s). Published by Elsevier B.V. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c441t-63e32be561305fbe85be81c1b8d80fa9c25bf8fb455eeb83d8d404792aef126b3</citedby><cites>FETCH-LOGICAL-c441t-63e32be561305fbe85be81c1b8d80fa9c25bf8fb455eeb83d8d404792aef126b3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1874778719301746$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65534</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31784353$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Cieslak, Matthew C.</creatorcontrib><creatorcontrib>Castelfranco, Ann M.</creatorcontrib><creatorcontrib>Roncalli, Vittoria</creatorcontrib><creatorcontrib>Lenz, Petra H.</creatorcontrib><creatorcontrib>Hartline, Daniel K.</creatorcontrib><title>t-Distributed Stochastic Neighbor Embedding (t-SNE): A tool for eco-physiological transcriptomic analysis</title><title>Marine genomics</title><addtitle>Mar Genomics</addtitle><description>High-throughput RNA sequencing (RNA-Seq) has transformed the ecophysiological assessment of individual plankton species and communities. However, the technology generates complex data consisting of millions of short-read sequences that can be difficult to analyze and interpret. New bioinformatics workflows are needed to guide experimentation, environmental sampling, and to develop and test hypotheses. One complexity-reducing tool that has been used successfully in other fields is “t-distributed Stochastic Neighbor Embedding” (t-SNE). Its application to transcriptomic data from marine pelagic and benthic systems has yet to be explored. 
The present study demonstrates an application for evaluating RNA-Seq data using previously published, conventionally analyzed studies on the copepods Calanus finmarchicus and Neocalanus flemingeri. In one application, gene expression profiles were compared among different developmental stages. In another, they were compared among experimental conditions. In a third, they were compared among environmental samples from different locations. The profile categories identified by t-SNE were validated by reference to published results using differential gene expression and Gene Ontology (GO) analyses. The analyses demonstrate how individual samples can be evaluated for differences in global gene expression, as well as differences in expression related to specific biological processes, such as lipid metabolism and responses to stress. As RNA-Seq data from plankton species and communities become more common, t-SNE analysis should provide a powerful tool for determining trends and classifying samples into groups with similar transcriptional physiology, independent of collection site or time.</description><subject>Bioinformatics</subject><subject>Calanus finmarchicus</subject><subject>Copepod</subject><subject>ecophysiology</subject><subject>gene expression</subject><subject>gene expression regulation</subject><subject>gene ontology</subject><subject>genomics</subject><subject>lipid metabolism</subject><subject>Omics</subject><subject>plankton</subject><subject>RNA</subject><subject>RNA-Seq</subject><subject>sequence analysis</subject><subject>species</subject><subject>transcription 
(genetics)</subject><subject>transcriptomics</subject><subject>Zooplankton</subject><issn>1874-7787</issn><issn>1876-7478</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNqFkUtP3DAQgK0KVCjtP6iqHOkhi1-JHQ5IK7q0SIgeaM-WH5Ndr5J4sb1I_PuaBjjCwRpr5psZaT6EvhK8IJi0Z9vFqOMapgXFpCspLCj7gI6JFG0tuJAH__-8FkKKI_QppS3GLRUSf0RHjAjJWcOOkc_1D59y9GafwVV3OdiNTtnb6hb8emNCrFajAef8tK5Oc313u_p-Xi2rHMJQ9aUKNtS7zWPyYQhrb_VQ5ainZKPf5TCWOXrSQymnz-iw10OCL8_xBP29Wv25_FXf_P55fbm8qS3nJNctA0YNNC1huOkNyKY8YomRTuJed5Y2ppe94U0DYCRz0nHMRUc19IS2hp2g03nuLob7PaSsRp8sDIOeIOyTopx1nLSk695HGcVMYi6bgvIZtTGkFKFXu-iLgEdFsHryobZq9qGefKjZR2n79rxhb0Zwr00vAgpwMQNQTvLgIapkPUwWnI9gs3LBv73hH4oOnhk</recordid><startdate>202006</startdate><enddate>202006</enddate><creator>Cieslak, Matthew C.</creator><creator>Castelfranco, Ann M.</creator><creator>Roncalli, Vittoria</creator><creator>Lenz, Petra H.</creator><creator>Hartline, Daniel K.</creator><general>Elsevier B.V</general><scope>6I.</scope><scope>AAFTH</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>7S9</scope><scope>L.6</scope></search><sort><creationdate>202006</creationdate><title>t-Distributed Stochastic Neighbor Embedding (t-SNE): A tool for eco-physiological transcriptomic analysis</title><author>Cieslak, Matthew C. ; Castelfranco, Ann M. ; Roncalli, Vittoria ; Lenz, Petra H. 
; Hartline, Daniel K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c441t-63e32be561305fbe85be81c1b8d80fa9c25bf8fb455eeb83d8d404792aef126b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Bioinformatics</topic><topic>Calanus finmarchicus</topic><topic>Copepod</topic><topic>ecophysiology</topic><topic>gene expression</topic><topic>gene expression regulation</topic><topic>gene ontology</topic><topic>genomics</topic><topic>lipid metabolism</topic><topic>Omics</topic><topic>plankton</topic><topic>RNA</topic><topic>RNA-Seq</topic><topic>sequence analysis</topic><topic>species</topic><topic>transcription (genetics)</topic><topic>transcriptomics</topic><topic>Zooplankton</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cieslak, Matthew C.</creatorcontrib><creatorcontrib>Castelfranco, Ann M.</creatorcontrib><creatorcontrib>Roncalli, Vittoria</creatorcontrib><creatorcontrib>Lenz, Petra H.</creatorcontrib><creatorcontrib>Hartline, Daniel K.</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>AGRICOLA</collection><collection>AGRICOLA - Academic</collection><jtitle>Marine genomics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cieslak, Matthew C.</au><au>Castelfranco, Ann M.</au><au>Roncalli, Vittoria</au><au>Lenz, Petra H.</au><au>Hartline, Daniel K.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>t-Distributed Stochastic Neighbor Embedding (t-SNE): A tool for eco-physiological transcriptomic analysis</atitle><jtitle>Marine genomics</jtitle><addtitle>Mar 
Genomics</addtitle><date>2020-06</date><risdate>2020</risdate><volume>51</volume><spage>100723</spage><epage>100723</epage><pages>100723-100723</pages><artnum>100723</artnum><issn>1874-7787</issn><eissn>1876-7478</eissn><abstract>High-throughput RNA sequencing (RNA-Seq) has transformed the ecophysiological assessment of individual plankton species and communities. However, the technology generates complex data consisting of millions of short-read sequences that can be difficult to analyze and interpret. New bioinformatics workflows are needed to guide experimentation, environmental sampling, and to develop and test hypotheses. One complexity-reducing tool that has been used successfully in other fields is “t-distributed Stochastic Neighbor Embedding” (t-SNE). Its application to transcriptomic data from marine pelagic and benthic systems has yet to be explored. The present study demonstrates an application for evaluating RNA-Seq data using previously published, conventionally analyzed studies on the copepods Calanus finmarchicus and Neocalanus flemingeri. In one application, gene expression profiles were compared among different developmental stages. In another, they were compared among experimental conditions. In a third, they were compared among environmental samples from different locations. The profile categories identified by t-SNE were validated by reference to published results using differential gene expression and Gene Ontology (GO) analyses. The analyses demonstrate how individual samples can be evaluated for differences in global gene expression, as well as differences in expression related to specific biological processes, such as lipid metabolism and responses to stress. 
As RNA-Seq data from plankton species and communities become more common, t-SNE analysis should provide a powerful tool for determining trends and classifying samples into groups with similar transcriptional physiology, independent of collection site or time.</abstract><cop>Netherlands</cop><pub>Elsevier B.V</pub><pmid>31784353</pmid><doi>10.1016/j.margen.2019.100723</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1874-7787
ispartof Marine genomics, 2020-06, Vol.51, p.100723-100723, Article 100723
issn 1874-7787
1876-7478
language eng
recordid cdi_proquest_miscellaneous_2439416199
source Elsevier ScienceDirect Journals Complete
subjects Bioinformatics
Calanus finmarchicus
Copepod
ecophysiology
gene expression
gene expression regulation
gene ontology
genomics
lipid metabolism
Omics
plankton
RNA
RNA-Seq
sequence analysis
species
transcription (genetics)
transcriptomics
Zooplankton
title t-Distributed Stochastic Neighbor Embedding (t-SNE): A tool for eco-physiological transcriptomic analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T15%3A36%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=t-Distributed%20Stochastic%20Neighbor%20Embedding%20(t-SNE):%20A%20tool%20for%20eco-physiological%20transcriptomic%20analysis&rft.jtitle=Marine%20genomics&rft.au=Cieslak,%20Matthew%20C.&rft.date=2020-06&rft.volume=51&rft.spage=100723&rft.epage=100723&rft.pages=100723-100723&rft.artnum=100723&rft.issn=1874-7787&rft.eissn=1876-7478&rft_id=info:doi/10.1016/j.margen.2019.100723&rft_dat=%3Cproquest_cross%3E2439416199%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2320380485&rft_id=info:pmid/31784353&rft_els_id=S1874778719301746&rfr_iscdi=true