Mining Bucket Order-Preserving SubMatrices in Gene Expression Data

The Order-Preserving SubMatrices (OPSMs) are employed to discover significant biological associations between genes and experiment conditions. Herein, we propose a new relaxed OPSM model by considering the linearity relaxation, which is called the Bucket OPSM (BOPSM) model. An efficient method calle...
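As an illustration of the strict model that the abstract relaxes, the sketch below checks the commonly used OPSM condition: a set of rows (genes) and an ordering of columns (conditions) form an OPSM if every selected row is strictly increasing along that column ordering. The function name `is_order_preserving` and the toy matrix are assumptions made for this example and do not come from the paper. The bucket relaxation (BOPSM) is not implemented here, because the record does not give its formal definition.

```python
from typing import Sequence


def is_order_preserving(
    matrix: Sequence[Sequence[float]],
    rows: Sequence[int],
    cols: Sequence[int],
) -> bool:
    """Return True if every selected row is strictly increasing along `cols`.

    `cols` is a column ordering (a permutation of a column subset), and
    `rows` are the indices of the genes that must all follow that ordering.
    """
    for r in rows:
        values = [matrix[r][c] for c in cols]
        # Any adjacent pair that fails to increase breaks the induced order.
        if any(a >= b for a, b in zip(values, values[1:])):
            return False
    return True


# Hypothetical toy expression matrix: 3 genes (rows) x 4 conditions (columns).
expression = [
    [0.2, 0.9, 0.5, 0.7],
    [1.0, 3.0, 1.5, 2.0],
    [0.8, 0.1, 0.6, 0.3],
]

column_order = [0, 2, 3, 1]
print(is_order_preserving(expression, rows=[0, 1], cols=column_order))
print(is_order_preserving(expression, rows=[0, 1, 2], cols=column_order))
```

Genes 0 and 1 have different magnitudes but rank the four conditions identically, so they support the same pattern; gene 2 ranks them in the opposite order and is excluded. Strict comparison is what makes this model sensitive to noise, which is the motivation the abstract gives for relaxed variants.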

Bibliographic Details
Published in:	IEEE transactions on knowledge and data engineering 2012-12, Vol.24 (12), p.2218-2231
Main authors:	Qiong Fang; Ng, Wilfred; Jianlin Feng; Yuliang Li
Format:	Article
Language:	eng
Subjects:	biclustering; Biological; Biological system modeling; bucket order; Buckets; Data mining; Data models; Gene expression; Genes; Itemsets; linearity relaxation; Mines; Mining; OPSM; Order-preserving submatrix; Similarity; similarity relaxation; State of the art; Strategy; Studies

container_end_page	2231
container_issue	12
container_start_page	2218
container_title	IEEE transactions on knowledge and data engineering
container_volume	24
creator	Qiong Fang ; Ng, Wilfred ; Jianlin Feng ; Yuliang Li
description	The Order-Preserving SubMatrices (OPSMs) are employed to discover significant biological associations between genes and experiment conditions. Herein, we propose a new relaxed OPSM model by considering the linearity relaxation, which is called the Bucket OPSM (BOPSM) model. An efficient method called ApriBopsm is developed to exhaustively mine such BOPSM patterns. We further generalize the BOPSM model by incorporating the similarity relaxation strategy. We develop a generalized BOPSM model called GeBOPSM and adopt a pattern growing method called SeedGrowth to mine GeBOPSM patterns. Informally, the SeedGrowth algorithm adopts two different growing strategies on rows and columns in order to expand a seed BOPSM into a maximal GeBOPSM pattern. We conduct a series of experiments using both synthetic and biological datasets to study the effectiveness of our proposed relaxed models and the efficiency of the relevant mining methods. The BOPSM model is shown to be able to capture the characteristics of noisy OPSM patterns, and is superior to the strict counterparts. ApriBopsm is also significantly more efficient than OPC-Tree, which is the state-of-the-art OPSM mining method. Compared to all the current relaxed OPSM models, the GeBOPSM model achieves the best performance in terms of the number of mined quality patterns.
doi_str_mv	10.1109/TKDE.2011.180
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TKDE_2011_180</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5989809</ieee_id><sourcerecordid>2797716301</sourcerecordid><originalsourceid>FETCH-LOGICAL-c318t-26e96b7c4a70b510955f36871900b177ae06f03c4dddd13efe13bda3fdea779a3</originalsourceid><addsrcrecordid>eNpd0D1PwzAQBmALgUQpjEwskVhYUnxxHNsj_aAgWhWJMltOckEubVLsBMG_x1ERA7fc6fTodHoJuQQ6AqDqdv00nY0SCjACSY_IADiXcQIKjsNMU4hTlopTcub9hlIqhYQBGS9tbeu3aNwV79hGK1eii58denSf_f6ly5emdbZAH9k6mmON0exrH4C3TR1NTWvOyUllth4vfvuQvN7P1pOHeLGaP07uFnHBQLZxkqHKclGkRtCch385r1gmBShKcxDCIM0qyoq0DAUMKwSWl4ZVJRohlGFDcnO4u3fNR4e-1TvrC9xuTY1N5zUkCciMc5oEev2PbprO1eE7DQCpSgUDEVR8UIVrvHdY6b2zO-O-NVDdJ6r7RHWfqA6JBn918BYR_yxXUkmq2A9nInAZ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1114947317</pqid></control><display><type>article</type><title>Mining Bucket Order-Preserving SubMatrices in Gene Expression Data</title><source>IEEE Electronic Library (IEL)</source><creator>Qiong Fang ; Ng, Wilfred ; Jianlin Feng ; Yuliang Li</creator><creatorcontrib>Qiong Fang ; Ng, Wilfred ; Jianlin Feng ; Yuliang Li</creatorcontrib><description>The Order-Preserving SubMatrices (OPSMs) are employed to discover significant biological associations between genes and experiment conditions. Herein, we propose a new relaxed OPSM model by considering the linearity relaxation, which is called the Bucket OPSM (BOPSM) model. An efficient method called ApriBopsm is developed to exhaustively mine such BOPSM patterns. We further generalize the BOPSM model by incorporating the similarity relaxation strategy. We develop a generalized BOPSM model called GeBOPSM and adopt a pattern growing method called SeedGrowth to mine GeBOPSM patterns. 
Informally, the SeedGrowth algorithm adopts two different growing strategies on rows and columns in order to expand a seed BOPSM into a maximal GeBOPSM pattern. We conduct a series of experiments using both synthetic and biological datasets to study the effectiveness of our proposed relaxed models and the efficiency of the relevant mining methods. The BOPSM model is shown to be able to capture the characteristics of noisy OPSM patterns, and is superior to the strict counterparts. ApriBopsm is also significantly more efficient than OPC-Tree, which is the state-of-the-art OPSM mining method. Compared to all the current relaxed OPSM models, the GeBOPSM model achieves the best performance in terms of the number of mined quality patterns.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2011.180</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>biclustering ; Biological ; Biological system modeling ; bucket order ; Buckets ; Data mining ; Data models ; Gene expression ; Genes ; Itemsets ; linearity relaxation ; Mines ; Mining ; OPSM ; Order-preserving submatrix ; Similarity ; similarity relaxation ; State of the art ; Strategy ; Studies</subject><ispartof>IEEE transactions on knowledge and data engineering, 2012-12, Vol.24 (12), p.2218-2231</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Dec 2012</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c318t-26e96b7c4a70b510955f36871900b177ae06f03c4dddd13efe13bda3fdea779a3</citedby><cites>FETCH-LOGICAL-c318t-26e96b7c4a70b510955f36871900b177ae06f03c4dddd13efe13bda3fdea779a3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5989809$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27924,27925,54758</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5989809$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Qiong Fang</creatorcontrib><creatorcontrib>Ng, Wilfred</creatorcontrib><creatorcontrib>Jianlin Feng</creatorcontrib><creatorcontrib>Yuliang Li</creatorcontrib><title>Mining Bucket Order-Preserving SubMatrices in Gene Expression Data</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>The Order-Preserving SubMatrices (OPSMs) are employed to discover significant biological associations between genes and experiment conditions. Herein, we propose a new relaxed OPSM model by considering the linearity relaxation, which is called the Bucket OPSM (BOPSM) model. An efficient method called ApriBopsm is developed to exhaustively mine such BOPSM patterns. We further generalize the BOPSM model by incorporating the similarity relaxation strategy. We develop a generalized BOPSM model called GeBOPSM and adopt a pattern growing method called SeedGrowth to mine GeBOPSM patterns. Informally, the SeedGrowth algorithm adopts two different growing strategies on rows and columns in order to expand a seed BOPSM into a maximal GeBOPSM pattern. 
We conduct a series of experiments using both synthetic and biological datasets to study the effectiveness of our proposed relaxed models and the efficiency of the relevant mining methods. The BOPSM model is shown to be able to capture the characteristics of noisy OPSM patterns, and is superior to the strict counterparts. ApriBopsm is also significantly more efficient than OPC-Tree, which is the state-of-the-art OPSM mining method. Compared to all the current relaxed OPSM models, the GeBOPSM model achieves the best performance in terms of the number of mined quality patterns.</description><subject>biclustering</subject><subject>Biological</subject><subject>Biological system modeling</subject><subject>bucket order</subject><subject>Buckets</subject><subject>Data mining</subject><subject>Data models</subject><subject>Gene expression</subject><subject>Genes</subject><subject>Itemsets</subject><subject>linearity relaxation</subject><subject>Mines</subject><subject>Mining</subject><subject>OPSM</subject><subject>Order-preserving submatrix</subject><subject>Similarity</subject><subject>similarity relaxation</subject><subject>State of the art</subject><subject>Strategy</subject><subject>Studies</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpd0D1PwzAQBmALgUQpjEwskVhYUnxxHNsj_aAgWhWJMltOckEubVLsBMG_x1ERA7fc6fTodHoJuQQ6AqDqdv00nY0SCjACSY_IADiXcQIKjsNMU4hTlopTcub9hlIqhYQBGS9tbeu3aNwV79hGK1eii58denSf_f6ly5emdbZAH9k6mmON0exrH4C3TR1NTWvOyUllth4vfvuQvN7P1pOHeLGaP07uFnHBQLZxkqHKclGkRtCch385r1gmBShKcxDCIM0qyoq0DAUMKwSWl4ZVJRohlGFDcnO4u3fNR4e-1TvrC9xuTY1N5zUkCciMc5oEev2PbprO1eE7DQCpSgUDEVR8UIVrvHdY6b2zO-O-NVDdJ6r7RHWfqA6JBn918BYR_yxXUkmq2A9nInAZ</recordid><startdate>20121201</startdate><enddate>20121201</enddate><creator>Qiong Fang</creator><creator>Ng, Wilfred</creator><creator>Jianlin Feng</creator><creator>Yuliang 
Li</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20121201</creationdate><title>Mining Bucket Order-Preserving SubMatrices in Gene Expression Data</title><author>Qiong Fang ; Ng, Wilfred ; Jianlin Feng ; Yuliang Li</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c318t-26e96b7c4a70b510955f36871900b177ae06f03c4dddd13efe13bda3fdea779a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>biclustering</topic><topic>Biological</topic><topic>Biological system modeling</topic><topic>bucket order</topic><topic>Buckets</topic><topic>Data mining</topic><topic>Data models</topic><topic>Gene expression</topic><topic>Genes</topic><topic>Itemsets</topic><topic>linearity relaxation</topic><topic>Mines</topic><topic>Mining</topic><topic>OPSM</topic><topic>Order-preserving submatrix</topic><topic>Similarity</topic><topic>similarity relaxation</topic><topic>State of the art</topic><topic>Strategy</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Qiong Fang</creatorcontrib><creatorcontrib>Ng, Wilfred</creatorcontrib><creatorcontrib>Jianlin Feng</creatorcontrib><creatorcontrib>Yuliang Li</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications 
Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Qiong Fang</au><au>Ng, Wilfred</au><au>Jianlin Feng</au><au>Yuliang Li</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Mining Bucket Order-Preserving SubMatrices in Gene Expression Data</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>2012-12-01</date><risdate>2012</risdate><volume>24</volume><issue>12</issue><spage>2218</spage><epage>2231</epage><pages>2218-2231</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>The Order-Preserving SubMatrices (OPSMs) are employed to discover significant biological associations between genes and experiment conditions. Herein, we propose a new relaxed OPSM model by considering the linearity relaxation, which is called the Bucket OPSM (BOPSM) model. An efficient method called ApriBopsm is developed to exhaustively mine such BOPSM patterns. We further generalize the BOPSM model by incorporating the similarity relaxation strategy. We develop a generalized BOPSM model called GeBOPSM and adopt a pattern growing method called SeedGrowth to mine GeBOPSM patterns. Informally, the SeedGrowth algorithm adopts two different growing strategies on rows and columns in order to expand a seed BOPSM into a maximal GeBOPSM pattern. 
We conduct a series of experiments using both synthetic and biological datasets to study the effectiveness of our proposed relaxed models and the efficiency of the relevant mining methods. The BOPSM model is shown to be able to capture the characteristics of noisy OPSM patterns, and is superior to the strict counterparts. ApriBopsm is also significantly more efficient than OPC-Tree, which is the state-of-the-art OPSM mining method. Compared to all the current relaxed OPSM models, the GeBOPSM model achieves the best performance in terms of the number of mined quality patterns.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TKDE.2011.180</doi><tpages>14</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1041-4347
ispartof	IEEE transactions on knowledge and data engineering, 2012-12, Vol.24 (12), p.2218-2231
issn	1041-4347 ; 1558-2191
language	eng
recordid	cdi_crossref_primary_10_1109_TKDE_2011_180
source	IEEE Electronic Library (IEL)
subjects	biclustering ; Biological ; Biological system modeling ; bucket order ; Buckets ; Data mining ; Data models ; Gene expression ; Genes ; Itemsets ; linearity relaxation ; Mines ; Mining ; OPSM ; Order-preserving submatrix ; Similarity ; similarity relaxation ; State of the art ; Strategy ; Studies
title	Mining Bucket Order-Preserving SubMatrices in Gene Expression Data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T19%3A47%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mining%20Bucket%20Order-Preserving%20SubMatrices%20in%20Gene%20Expression%20Data&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Qiong%20Fang&rft.date=2012-12-01&rft.volume=24&rft.issue=12&rft.spage=2218&rft.epage=2231&rft.pages=2218-2231&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2011.180&rft_dat=%3Cproquest_RIE%3E2797716301%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1114947317&rft_id=info:pmid/&rft_ieee_id=5989809&rfr_iscdi=true