Fuzzy c-means clustering with prior biological knowledge
We propose a novel semi-supervised clustering method called GO Fuzzy c-means, which enables the simultaneous use of biological knowledge and gene expression data in a probabilistic clustering algorithm. Our method is based on the fuzzy c-means clustering algorithm and utilizes the Gene Ontology anno...
Published in: | Journal of biomedical informatics 2009-02, Vol.42 (1), p.74-81 |
---|---|
Main authors: | Tari, Luis ; Baral, Chitta ; Kim, Seungchan |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 81 |
---|---|
container_issue | 1 |
container_start_page | 74 |
container_title | Journal of biomedical informatics |
container_volume | 42 |
creator | Tari, Luis ; Baral, Chitta ; Kim, Seungchan |
description | We propose a novel semi-supervised clustering method called GO Fuzzy c-means, which enables the simultaneous use of biological knowledge and gene expression data in a probabilistic clustering algorithm. Our method is based on the fuzzy c-means clustering algorithm and utilizes the Gene Ontology annotations as prior knowledge to guide the process of grouping functionally related genes. Unlike traditional clustering methods, our method is capable of assigning genes to multiple clusters, which is a more appropriate representation of the behavior of genes. Two datasets of yeast (Saccharomyces cerevisiae) expression profiles were applied to compare our method with other state-of-the-art clustering methods. Our experiments show that our method can produce far better biologically meaningful clusters even with the use of a small percentage of Gene Ontology annotations. In addition, our experiments further indicate that the utilization of prior knowledge in our method can predict gene functions effectively. The source code is freely available at http://sysbio.fulton.asu.edu/gofuzzy/. |
doi_str_mv | 10.1016/j.jbi.2008.05.009 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2673503</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1532046408000798</els_id><sourcerecordid>66950562</sourcerecordid><originalsourceid>FETCH-LOGICAL-c480t-507d20314781151e7aba8bc255bb0855788a2dbf60b0506d3b878851c1efe3473</originalsourceid><addsrcrecordid>eNqFkU1LxDAQhoMofqz-AC_Sk7fWSbuTZhEEEb9A8KLnkKSza9Zuo0mr6K83sosfFz0lJO-88848jO1zKDhwcTQv5sYVJYAsAAuAyRrb5liVOYwlrH_dxXiL7cQ4B-AcUWyyLS5xgnU92WbyYnh_f8tsviDdxcy2Q-wpuG6Wvbr-IXsKzofMON_6mbO6zR47_9pSM6NdtjHVbaS91Tli9xfnd2dX-c3t5fXZ6U1uU4Y-R6ibEio-rmVqzqnWRktjS0RjQCLWUuqyMVMBBhBEUxmZnpBbTlOqxnU1YidL36fBLKix1PVBtyoFW-jwprx26vdP5x7UzL-oUtQVQpUMDlcGwT8PFHu1cNFS2-qO_BCVEBMEFOW_wjQGVpB2OmJ8KbTBxxho-pWGg_oEo-YqgVGfYBSgSmBSzcHPMb4rViSS4HgpoLTMF0dBReuos9S4QLZXjXd_2H8AKO2ePw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>20353015</pqid></control><display><type>article</type><title>Fuzzy c-means clustering with prior biological knowledge</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Tari, Luis ; Baral, Chitta ; Kim, Seungchan</creator><creatorcontrib>Tari, Luis ; Baral, Chitta ; Kim, Seungchan</creatorcontrib><description>We propose a novel semi-supervised clustering method called GO Fuzzy c-means, which enables the simultaneous use of biological knowledge and gene expression data in a probabilistic clustering algorithm. Our method is based on the fuzzy c-means clustering algorithm and utilizes the Gene Ontology annotations as prior knowledge to guide the process of grouping functionally related genes. 
Unlike traditional clustering methods, our method is capable of assigning genes to multiple clusters, which is a more appropriate representation of the behavior of genes. Two datasets of yeast (
Saccharomyces cerevisiae) expression profiles were applied to compare our method with other state-of-the-art clustering methods. Our experiments show that our method can produce far better biologically meaningful clusters even with the use of a small percentage of Gene Ontology annotations. In addition, our experiments further indicate that the utilization of prior knowledge in our method can predict gene functions effectively. The source code is freely available at
http://sysbio.fulton.asu.edu/gofuzzy/.</description><identifier>ISSN: 1532-0464</identifier><identifier>EISSN: 1532-0480</identifier><identifier>DOI: 10.1016/j.jbi.2008.05.009</identifier><identifier>PMID: 18595779</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Algorithms ; Cluster Analysis ; Computational Biology ; Databases, Genetic ; Fuzzy c-means clustering ; Fuzzy Logic ; Gene expression data ; Gene Expression Profiling - methods ; Gene function prediction ; Gene Ontology ; Genes - physiology ; Genes, Fungal - physiology ; Internet ; Normal Distribution ; Oligonucleotide Array Sequence Analysis ; Reproducibility of Results ; Saccharomyces cerevisiae ; Saccharomyces cerevisiae - genetics ; Saccharomyces cerevisiae yeast ; Semi-supervised clustering ; Software</subject><ispartof>Journal of biomedical informatics, 2009-02, Vol.42 (1), p.74-81</ispartof><rights>2008 Elsevier Inc.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c480t-507d20314781151e7aba8bc255bb0855788a2dbf60b0506d3b878851c1efe3473</citedby><cites>FETCH-LOGICAL-c480t-507d20314781151e7aba8bc255bb0855788a2dbf60b0506d3b878851c1efe3473</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1532046408000798$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,776,780,881,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/18595779$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Tari, Luis</creatorcontrib><creatorcontrib>Baral, Chitta</creatorcontrib><creatorcontrib>Kim, Seungchan</creatorcontrib><title>Fuzzy c-means clustering with prior biological knowledge</title><title>Journal of 
biomedical informatics</title><addtitle>J Biomed Inform</addtitle><description>We propose a novel semi-supervised clustering method called GO Fuzzy c-means, which enables the simultaneous use of biological knowledge and gene expression data in a probabilistic clustering algorithm. Our method is based on the fuzzy c-means clustering algorithm and utilizes the Gene Ontology annotations as prior knowledge to guide the process of grouping functionally related genes. Unlike traditional clustering methods, our method is capable of assigning genes to multiple clusters, which is a more appropriate representation of the behavior of genes. Two datasets of yeast (
Saccharomyces cerevisiae) expression profiles were applied to compare our method with other state-of-the-art clustering methods. Our experiments show that our method can produce far better biologically meaningful clusters even with the use of a small percentage of Gene Ontology annotations. In addition, our experiments further indicate that the utilization of prior knowledge in our method can predict gene functions effectively. The source code is freely available at
http://sysbio.fulton.asu.edu/gofuzzy/.</description><subject>Algorithms</subject><subject>Cluster Analysis</subject><subject>Computational Biology</subject><subject>Databases, Genetic</subject><subject>Fuzzy c-means clustering</subject><subject>Fuzzy Logic</subject><subject>Gene expression data</subject><subject>Gene Expression Profiling - methods</subject><subject>Gene function prediction</subject><subject>Gene Ontology</subject><subject>Genes - physiology</subject><subject>Genes, Fungal - physiology</subject><subject>Internet</subject><subject>Normal Distribution</subject><subject>Oligonucleotide Array Sequence Analysis</subject><subject>Reproducibility of Results</subject><subject>Saccharomyces cerevisiae</subject><subject>Saccharomyces cerevisiae - genetics</subject><subject>Saccharomyces cerevisiae yeast</subject><subject>Semi-supervised clustering</subject><subject>Software</subject><issn>1532-0464</issn><issn>1532-0480</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkU1LxDAQhoMofqz-AC_Sk7fWSbuTZhEEEb9A8KLnkKSza9Zuo0mr6K83sosfFz0lJO-88848jO1zKDhwcTQv5sYVJYAsAAuAyRrb5liVOYwlrH_dxXiL7cQ4B-AcUWyyLS5xgnU92WbyYnh_f8tsviDdxcy2Q-wpuG6Wvbr-IXsKzofMON_6mbO6zR47_9pSM6NdtjHVbaS91Tli9xfnd2dX-c3t5fXZ6U1uU4Y-R6ibEio-rmVqzqnWRktjS0RjQCLWUuqyMVMBBhBEUxmZnpBbTlOqxnU1YidL36fBLKix1PVBtyoFW-jwprx26vdP5x7UzL-oUtQVQpUMDlcGwT8PFHu1cNFS2-qO_BCVEBMEFOW_wjQGVpB2OmJ8KbTBxxho-pWGg_oEo-YqgVGfYBSgSmBSzcHPMb4rViSS4HgpoLTMF0dBReuos9S4QLZXjXd_2H8AKO2ePw</recordid><startdate>20090201</startdate><enddate>20090201</enddate><creator>Tari, Luis</creator><creator>Baral, Chitta</creator><creator>Kim, Seungchan</creator><general>Elsevier 
Inc</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>M7N</scope><scope>P64</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20090201</creationdate><title>Fuzzy c-means clustering with prior biological knowledge</title><author>Tari, Luis ; Baral, Chitta ; Kim, Seungchan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c480t-507d20314781151e7aba8bc255bb0855788a2dbf60b0506d3b878851c1efe3473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Algorithms</topic><topic>Cluster Analysis</topic><topic>Computational Biology</topic><topic>Databases, Genetic</topic><topic>Fuzzy c-means clustering</topic><topic>Fuzzy Logic</topic><topic>Gene expression data</topic><topic>Gene Expression Profiling - methods</topic><topic>Gene function prediction</topic><topic>Gene Ontology</topic><topic>Genes - physiology</topic><topic>Genes, Fungal - physiology</topic><topic>Internet</topic><topic>Normal Distribution</topic><topic>Oligonucleotide Array Sequence Analysis</topic><topic>Reproducibility of Results</topic><topic>Saccharomyces cerevisiae</topic><topic>Saccharomyces cerevisiae - genetics</topic><topic>Saccharomyces cerevisiae yeast</topic><topic>Semi-supervised clustering</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tari, Luis</creatorcontrib><creatorcontrib>Baral, Chitta</creatorcontrib><creatorcontrib>Kim, Seungchan</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of biomedical informatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Tari, Luis</au><au>Baral, Chitta</au><au>Kim, Seungchan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Fuzzy c-means clustering with prior biological knowledge</atitle><jtitle>Journal of biomedical informatics</jtitle><addtitle>J Biomed Inform</addtitle><date>2009-02-01</date><risdate>2009</risdate><volume>42</volume><issue>1</issue><spage>74</spage><epage>81</epage><pages>74-81</pages><issn>1532-0464</issn><eissn>1532-0480</eissn><abstract>We propose a novel semi-supervised clustering method called GO Fuzzy c-means, which enables the simultaneous use of biological knowledge and gene expression data in a probabilistic clustering algorithm. Our method is based on the fuzzy c-means clustering algorithm and utilizes the Gene Ontology annotations as prior knowledge to guide the process of grouping functionally related genes. Unlike traditional clustering methods, our method is capable of assigning genes to multiple clusters, which is a more appropriate representation of the behavior of genes. Two datasets of yeast (
Saccharomyces cerevisiae) expression profiles were applied to compare our method with other state-of-the-art clustering methods. Our experiments show that our method can produce far better biologically meaningful clusters even with the use of a small percentage of Gene Ontology annotations. In addition, our experiments further indicate that the utilization of prior knowledge in our method can predict gene functions effectively. The source code is freely available at
http://sysbio.fulton.asu.edu/gofuzzy/.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>18595779</pmid><doi>10.1016/j.jbi.2008.05.009</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1532-0464 |
ispartof | Journal of biomedical informatics, 2009-02, Vol.42 (1), p.74-81 |
issn | 1532-0464 ; 1532-0480 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2673503 |
source | MEDLINE; Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Algorithms ; Cluster Analysis ; Computational Biology ; Databases, Genetic ; Fuzzy c-means clustering ; Fuzzy Logic ; Gene expression data ; Gene Expression Profiling - methods ; Gene function prediction ; Gene Ontology ; Genes - physiology ; Genes, Fungal - physiology ; Internet ; Normal Distribution ; Oligonucleotide Array Sequence Analysis ; Reproducibility of Results ; Saccharomyces cerevisiae ; Saccharomyces cerevisiae - genetics ; Saccharomyces cerevisiae yeast ; Semi-supervised clustering ; Software |
title | Fuzzy c-means clustering with prior biological knowledge |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T00%3A11%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Fuzzy%20c-means%20clustering%20with%20prior%20biological%20knowledge&rft.jtitle=Journal%20of%20biomedical%20informatics&rft.au=Tari,%20Luis&rft.date=2009-02-01&rft.volume=42&rft.issue=1&rft.spage=74&rft.epage=81&rft.pages=74-81&rft.issn=1532-0464&rft.eissn=1532-0480&rft_id=info:doi/10.1016/j.jbi.2008.05.009&rft_dat=%3Cproquest_pubme%3E66950562%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=20353015&rft_id=info:pmid/18595779&rft_els_id=S1532046408000798&rfr_iscdi=true |
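
The abstract describes the method as built on fuzzy c-means, in which each gene receives a graded membership in every cluster rather than a single label. As a rough illustration only, the sketch below implements plain fuzzy c-means in Python. It does not include the Gene Ontology prior that the paper adds, and it is not the authors' code. The function name `fuzzy_c_means`, its parameters (`m`, `n_iter`, `tol`, `seed`), and the toy points are assumptions made for this example.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (no prior knowledge); returns (centers, memberships).

    Hypothetical helper for illustration; not the GO Fuzzy c-means of the paper.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers are membership-weighted means with weights u_ij ** m.
        centers = []
        for j in range(n_clusters):
            weights = [row[j] ** m for row in u]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])

        # Memberships: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        exponent = 2.0 / (m - 1.0)
        new_u = []
        for p in points:
            # Distances are clamped away from zero to avoid division by zero.
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            new_u.append([
                1.0 / sum((dj / dk) ** exponent for dk in dists)
                for dj in dists
            ])

        # Stop once no membership value moved by more than tol.
        change = max(
            abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb)
        )
        u = new_u
        if change < tol:
            break
    return centers, u


# Toy usage: two well-separated groups of 2-D points (made-up data).
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_memberships = fuzzy_c_means(toy, n_clusters=2)
```

Each row of the returned membership matrix sums to 1, so a point lying between two groups can carry substantial weight in both. This is the property the abstract points to when it says genes can be assigned to multiple clusters.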