Defining diversity, specialization, and gene specificity in transcriptomes through information theory

The transcriptome is a set of genes transcribed in a given tissue under specific conditions and can be characterized by a list of genes with their corresponding frequencies of transcription. Transcriptome changes can be measured by counting gene tags from mRNA libraries or by measuring light signals...

Bibliographic details
Published in: Proceedings of the National Academy of Sciences - PNAS 2008-07, Vol.105 (28), p.9709-9714
Main authors: Martinez, Octavio; Reyes-Valdes, M. Humberto
Format: Article
Language: eng
Online access: Full text
container_end_page 9714
container_issue 28
container_start_page 9709
container_title Proceedings of the National Academy of Sciences - PNAS
container_volume 105
creator Martinez, Octavio
Reyes-Valdes, M. Humberto
description The transcriptome is a set of genes transcribed in a given tissue under specific conditions and can be characterized by a list of genes with their corresponding frequencies of transcription. Transcriptome changes can be measured by counting gene tags from mRNA libraries or by measuring light signals in DNA microarrays. In any case, it is difficult to completely comprehend the global changes that occur in the transcriptome, given that thousands of gene expression measurements are involved. We propose an approach to define and estimate the diversity and specialization of transcriptomes and gene specificity. We define transcriptome diversity as the Shannon entropy of its frequency distribution. Gene specificity is defined as the mutual information between the tissues and the corresponding transcript, allowing detection of either housekeeping or highly specific genes and clarifying the meaning of these concepts in the literature. Tissue specialization is measured by average gene specificity. We introduce the formulae using a simple example and show their application in two datasets of gene expression in human tissues. Visualization of the positions of transcriptomes in a system of diversity and specialization coordinates makes it possible to understand at a glance their interrelations, summarizing in a powerful way which transcriptomes are richer in diversity of expressed genes, or which are relatively more specialized. The framework presented enlightens the relation among transcriptomes, allowing a better understanding of their changes through the development of the organism or in response to environmental stimuli.
doi_str_mv 10.1073/pnas.0803479105
format Article
fullrecord <record><control><sourceid>jstor_pnas_</sourceid><recordid>TN_cdi_pnas_primary_105_28_9709</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>25463038</jstor_id><sourcerecordid>25463038</sourcerecordid><originalsourceid>FETCH-LOGICAL-c618t-c1eb5bfca41d0be46718e5a006418840ca7bbe9f782a0dcce557995c5251be9a3</originalsourceid><addsrcrecordid>eNp9ks1v1DAQxSMEotvCmRMQ9YA4NO04thP7goTKp1SJA_RsOc4k61ViBzupWP56vOyqCxyQLFma95vnGT1n2TMClwRqejU5HS9BAGW1JMAfZCsCkhQVk_AwWwGUdSFYyU6y0xg3ACC5gMfZCREVVFLIVYbvsLPOuj5v7R2GaOftRR4nNFYP9qeerXcXuXZt3qPDvdBZk6jcunwO2kUT7DT7EWM-r4Nf-nVSOh_G372phj5sn2SPOj1EfHq4z7LbD--_XX8qbr58_Hz99qYwFRFzYQg2vOmMZqSFBllVE4FcA1SMCMHA6LppUHa1KDW0xiDntZTc8JKTVNf0LHuz952WZsTWoEsjDmoKdtRhq7y26m_F2bXq_Z0qGaOCyGTw6mAQ_PcF46xGGw0Og3bol6hKEJICJwk8_wfc-CW4tFxiCEuHQYKu9pAJPsaA3f0kBNQuP7XLTx3zSx0v_lzgyB8CS8DLA7DrPNpxVQola9gRr_9PqG4Zhhl_zAl9vkc3cfbhni05qyhQcXys017pPtiobr-m9Wj6SKRilNJfsNvE3A</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>201401440</pqid></control><display><type>article</type><title>Defining diversity, specialization, and gene specificity in transcriptomes through information theory</title><source>MEDLINE</source><source>Jstor Complete Legacy</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Martinez, Octavio ; Reyes-Valdes, MHumberto</creator><creatorcontrib>Martinez, Octavio ; Reyes-Valdes, MHumberto</creatorcontrib><description>The transcriptome is a set of genes transcribed in a given tissue under specific conditions and can be characterized by a list of genes with their corresponding frequencies of transcription. Transcriptome changes can be measured by counting gene tags from mRNA libraries or by measuring light signals in DNA microarrays. 
In any case, it is difficult to completely comprehend the global changes that occur in the transcriptome, given that thousands of gene expression measurements are involved. We propose an approach to define and estimate the diversity and specialization of transcriptomes and gene specificity. We define transcriptome diversity as the Shannon entropy of its frequency distribution. Gene specificity is defined as the mutual information between the tissues and the corresponding transcript, allowing detection of either housekeeping or highly specific genes and clarifying the meaning of these concepts in the literature. Tissue specialization is measured by average gene specificity. We introduce the formulae using a simple example and show their application in two datasets of gene expression in human tissues. Visualization of the positions of transcriptomes in a system of diversity and specialization coordinates makes it possible to understand at a glance their interrelations, summarizing in a powerful way which transcriptomes are richer in diversity of expressed genes, or which are relatively more specialized. 
The framework presented enlightens the relation among transcriptomes, allowing a better understanding of their changes through the development of the organism or in response to environmental stimuli.</description><identifier>ISSN: 0027-8424</identifier><identifier>EISSN: 1091-6490</identifier><identifier>DOI: 10.1073/pnas.0803479105</identifier><identifier>PMID: 18606989</identifier><language>eng</language><publisher>United States: National Academy of Sciences</publisher><subject>Biological Sciences ; Datasets ; Deoxyribonucleic acid ; DNA ; Entropy ; Estimates ; Gene expression ; Gene Expression Profiling - methods ; Gene Frequency ; Genes ; Genes - physiology ; Genetic diversity ; Genetic Variation ; Housekeeping ; Humans ; Information Theory ; Pancreas ; Ribonucleic acid ; RNA ; Salivary glands ; Scatter plots ; Terminology as Topic ; Tissue Distribution ; Tissues ; Transcriptomes</subject><ispartof>Proceedings of the National Academy of Sciences - PNAS, 2008-07, Vol.105 (28), p.9709-9714</ispartof><rights>Copyright 2008 The National Academy of Sciences of the United States of America</rights><rights>Copyright National Academy of Sciences Jul 15, 2008</rights><rights>2008 by The National Academy of Sciences of the 
USA</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c618t-c1eb5bfca41d0be46718e5a006418840ca7bbe9f782a0dcce557995c5251be9a3</citedby><cites>FETCH-LOGICAL-c618t-c1eb5bfca41d0be46718e5a006418840ca7bbe9f782a0dcce557995c5251be9a3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://www.pnas.org/content/105/28.cover.gif</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/25463038$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/25463038$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,727,780,784,803,885,27915,27916,53782,53784,58008,58241</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/18606989$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Martinez, Octavio</creatorcontrib><creatorcontrib>Reyes-Valdes, MHumberto</creatorcontrib><title>Defining diversity, specialization, and gene specificity in transcriptomes through information theory</title><title>Proceedings of the National Academy of Sciences - PNAS</title><addtitle>Proc Natl Acad Sci U S A</addtitle><description>The transcriptome is a set of genes transcribed in a given tissue under specific conditions and can be characterized by a list of genes with their corresponding frequencies of transcription. Transcriptome changes can be measured by counting gene tags from mRNA libraries or by measuring light signals in DNA microarrays. In any case, it is difficult to completely comprehend the global changes that occur in the transcriptome, given that thousands of gene expression measurements are involved. We propose an approach to define and estimate the diversity and specialization of transcriptomes and gene specificity. We define transcriptome diversity as the Shannon entropy of its frequency distribution. 
Gene specificity is defined as the mutual information between the tissues and the corresponding transcript, allowing detection of either housekeeping or highly specific genes and clarifying the meaning of these concepts in the literature. Tissue specialization is measured by average gene specificity. We introduce the formulae using a simple example and show their application in two datasets of gene expression in human tissues. Visualization of the positions of transcriptomes in a system of diversity and specialization coordinates makes it possible to understand at a glance their interrelations, summarizing in a powerful way which transcriptomes are richer in diversity of expressed genes, or which are relatively more specialized. The framework presented enlightens the relation among transcriptomes, allowing a better understanding of their changes through the development of the organism or in response to environmental stimuli.</description><subject>Biological Sciences</subject><subject>Datasets</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>Entropy</subject><subject>Estimates</subject><subject>Gene expression</subject><subject>Gene Expression Profiling - methods</subject><subject>Gene Frequency</subject><subject>Genes</subject><subject>Genes - physiology</subject><subject>Genetic diversity</subject><subject>Genetic Variation</subject><subject>Housekeeping</subject><subject>Humans</subject><subject>Information Theory</subject><subject>Pancreas</subject><subject>Ribonucleic acid</subject><subject>RNA</subject><subject>Salivary glands</subject><subject>Scatter plots</subject><subject>Terminology as Topic</subject><subject>Tissue 
Distribution</subject><subject>Tissues</subject><subject>Transcriptomes</subject><issn>0027-8424</issn><issn>1091-6490</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2008</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9ks1v1DAQxSMEotvCmRMQ9YA4NO04thP7goTKp1SJA_RsOc4k61ViBzupWP56vOyqCxyQLFma95vnGT1n2TMClwRqejU5HS9BAGW1JMAfZCsCkhQVk_AwWwGUdSFYyU6y0xg3ACC5gMfZCREVVFLIVYbvsLPOuj5v7R2GaOftRR4nNFYP9qeerXcXuXZt3qPDvdBZk6jcunwO2kUT7DT7EWM-r4Nf-nVSOh_G372phj5sn2SPOj1EfHq4z7LbD--_XX8qbr58_Hz99qYwFRFzYQg2vOmMZqSFBllVE4FcA1SMCMHA6LppUHa1KDW0xiDntZTc8JKTVNf0LHuz952WZsTWoEsjDmoKdtRhq7y26m_F2bXq_Z0qGaOCyGTw6mAQ_PcF46xGGw0Og3bol6hKEJICJwk8_wfc-CW4tFxiCEuHQYKu9pAJPsaA3f0kBNQuP7XLTx3zSx0v_lzgyB8CS8DLA7DrPNpxVQola9gRr_9PqG4Zhhl_zAl9vkc3cfbhni05qyhQcXys017pPtiobr-m9Wj6SKRilNJfsNvE3A</recordid><startdate>20080715</startdate><enddate>20080715</enddate><creator>Martinez, Octavio</creator><creator>Reyes-Valdes, MHumberto</creator><general>National Academy of Sciences</general><general>National Acad Sciences</general><scope>FBQ</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QG</scope><scope>7QL</scope><scope>7QP</scope><scope>7QR</scope><scope>7SN</scope><scope>7SS</scope><scope>7T5</scope><scope>7TK</scope><scope>7TM</scope><scope>7TO</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>5PM</scope></search><sort><creationdate>20080715</creationdate><title>Defining diversity, specialization, and gene specificity in transcriptomes through information theory</title><author>Martinez, Octavio ; Reyes-Valdes, 
MHumberto</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c618t-c1eb5bfca41d0be46718e5a006418840ca7bbe9f782a0dcce557995c5251be9a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Biological Sciences</topic><topic>Datasets</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>Entropy</topic><topic>Estimates</topic><topic>Gene expression</topic><topic>Gene Expression Profiling - methods</topic><topic>Gene Frequency</topic><topic>Genes</topic><topic>Genes - physiology</topic><topic>Genetic diversity</topic><topic>Genetic Variation</topic><topic>Housekeeping</topic><topic>Humans</topic><topic>Information Theory</topic><topic>Pancreas</topic><topic>Ribonucleic acid</topic><topic>RNA</topic><topic>Salivary glands</topic><topic>Scatter plots</topic><topic>Terminology as Topic</topic><topic>Tissue Distribution</topic><topic>Tissues</topic><topic>Transcriptomes</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Martinez, Octavio</creatorcontrib><creatorcontrib>Reyes-Valdes, MHumberto</creatorcontrib><collection>AGRIS</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Animal Behavior Abstracts</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Immunology Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Virology and 
AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Martinez, Octavio</au><au>Reyes-Valdes, MHumberto</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Defining diversity, specialization, and gene specificity in transcriptomes through information theory</atitle><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle><addtitle>Proc Natl Acad Sci U S A</addtitle><date>2008-07-15</date><risdate>2008</risdate><volume>105</volume><issue>28</issue><spage>9709</spage><epage>9714</epage><pages>9709-9714</pages><issn>0027-8424</issn><eissn>1091-6490</eissn><abstract>The transcriptome is a set of genes transcribed in a given tissue under specific conditions and can be characterized by a list of genes with their corresponding frequencies of transcription. Transcriptome changes can be measured by counting gene tags from mRNA libraries or by measuring light signals in DNA microarrays. In any case, it is difficult to completely comprehend the global changes that occur in the transcriptome, given that thousands of gene expression measurements are involved. We propose an approach to define and estimate the diversity and specialization of transcriptomes and gene specificity. We define transcriptome diversity as the Shannon entropy of its frequency distribution. 
Gene specificity is defined as the mutual information between the tissues and the corresponding transcript, allowing detection of either housekeeping or highly specific genes and clarifying the meaning of these concepts in the literature. Tissue specialization is measured by average gene specificity. We introduce the formulae using a simple example and show their application in two datasets of gene expression in human tissues. Visualization of the positions of transcriptomes in a system of diversity and specialization coordinates makes it possible to understand at a glance their interrelations, summarizing in a powerful way which transcriptomes are richer in diversity of expressed genes, or which are relatively more specialized. The framework presented enlightens the relation among transcriptomes, allowing a better understanding of their changes through the development of the organism or in response to environmental stimuli.</abstract><cop>United States</cop><pub>National Academy of Sciences</pub><pmid>18606989</pmid><doi>10.1073/pnas.0803479105</doi><tpages>6</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0027-8424
ispartof Proceedings of the National Academy of Sciences - PNAS, 2008-07, Vol.105 (28), p.9709-9714
issn 0027-8424
1091-6490
language eng
recordid cdi_pnas_primary_105_28_9709
source MEDLINE; Jstor Complete Legacy; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects Biological Sciences
Datasets
Deoxyribonucleic acid
DNA
Entropy
Estimates
Gene expression
Gene Expression Profiling - methods
Gene Frequency
Genes
Genes - physiology
Genetic diversity
Genetic Variation
Housekeeping
Humans
Information Theory
Pancreas
Ribonucleic acid
RNA
Salivary glands
Scatter plots
Terminology as Topic
Tissue Distribution
Tissues
Transcriptomes
title Defining diversity, specialization, and gene specificity in transcriptomes through information theory
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T05%3A19%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pnas_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Defining%20diversity,%20specialization,%20and%20gene%20specificity%20in%20transcriptomes%20through%20information%20theory&rft.jtitle=Proceedings%20of%20the%20National%20Academy%20of%20Sciences%20-%20PNAS&rft.au=Martinez,%20Octavio&rft.date=2008-07-15&rft.volume=105&rft.issue=28&rft.spage=9709&rft.epage=9714&rft.pages=9709-9714&rft.issn=0027-8424&rft.eissn=1091-6490&rft_id=info:doi/10.1073/pnas.0803479105&rft_dat=%3Cjstor_pnas_%3E25463038%3C/jstor_pnas_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=201401440&rft_id=info:pmid/18606989&rft_jstor_id=25463038&rfr_iscdi=true
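
The three quantities named in the description (diversity as Shannon entropy, gene specificity as mutual information between tissues and a transcript, and tissue specialization as average gene specificity) can be illustrated with a minimal sketch. The record does not reproduce the paper's formulae, so the equal weighting of tissues, the base-2 logarithm, the function names, and the toy counts below are all assumptions made for illustration only.

```python
import math


def relative_frequencies(counts):
    """Convert one tissue's raw transcript counts into relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def diversity(freqs):
    """Shannon entropy (in bits) of one transcriptome's frequency distribution."""
    return -sum(p * math.log2(p) for p in freqs if p > 0)


def gene_specificity(freqs_by_tissue):
    """Per-gene specificity across tissues.

    Assumption: tissues are weighted equally, so for gene i the value is
    (1/t) * sum_j (p_ij / mean_i) * log2(p_ij / mean_i), where mean_i is the
    gene's average frequency over the t tissues. It is 0 for a gene expressed
    at the same frequency everywhere and log2(t) for a gene seen in one tissue.
    """
    t = len(freqs_by_tissue)
    n_genes = len(freqs_by_tissue[0])
    result = []
    for i in range(n_genes):
        mean_p = sum(tissue[i] for tissue in freqs_by_tissue) / t
        s = 0.0
        if mean_p > 0:
            for tissue in freqs_by_tissue:
                ratio = tissue[i] / mean_p
                if ratio > 0:  # terms with zero frequency contribute nothing
                    s += ratio * math.log2(ratio)
            s /= t
        result.append(s)
    return result


def specialization(freqs, specificities):
    """Average gene specificity, weighted by the tissue's own frequencies."""
    return sum(p * s for p, s in zip(freqs, specificities))


# Hypothetical toy data: 3 genes counted in 2 tissues (not from the paper).
toy_counts = {
    "tissue_a": [50, 50, 0],
    "tissue_b": [50, 0, 50],
}
toy_freqs = {name: relative_frequencies(c) for name, c in toy_counts.items()}
toy_specificity = gene_specificity(list(toy_freqs.values()))
toy_diversity = {name: diversity(f) for name, f in toy_freqs.items()}
toy_specialization = {
    name: specialization(f, toy_specificity) for name, f in toy_freqs.items()
}
```

In this toy case the first gene is expressed equally in both tissues and gets specificity 0 (housekeeping-like), while the other two are each confined to one tissue and get the maximum value of 1 bit. Plotting each tissue's diversity against its specialization would give the kind of coordinate view the description mentions.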