Accelerating Expectation-Maximization Algorithms with Frequent Updates

Expectation Maximization is a popular approach for parameter estimation in many applications such as image understanding, document classification, or genome data analysis. Despite the popularity of EM algorithms, it is challenging to efficiently implement these algorithms in a distributed environment.

Detailed Description

Bibliographic Details
Main Authors: Jiangtao Yin, Yanfeng Zhang, Lixin Gao
Format: Conference Proceedings
Language: English
Subjects:
Online Access: Order full text
container_end_page 283
container_start_page 275
creator Jiangtao Yin
Yanfeng Zhang
Lixin Gao
description Expectation Maximization is a popular approach for parameter estimation in many applications such as image understanding, document classification, or genome data analysis. Despite the popularity of EM algorithms, it is challenging to efficiently implement these algorithms in a distributed environment. In particular, many EM algorithms that frequently update the parameters have been shown to be much more efficient than their concurrent counterparts. Accordingly, we propose two approaches to parallelize such EM algorithms in a distributed environment so as to scale to massive data sets. We prove that both approaches maintain the convergence properties of the EM algorithms. Based on the approaches, we design and implement a distributed framework, FreEM, to support the implementation of frequent updates for the EM algorithms. We show its efficiency through three well-known EM applications: k-means clustering, fuzzy c-means clustering and parameter estimation for the Gaussian Mixture model. We evaluate our framework on both a local cluster of machines and the Amazon EC2 cloud. Our evaluation shows that the EM algorithms with frequent updates implemented on FreEM can run much faster than those implementations with traditional concurrent updates.
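The description above contrasts EM algorithms that update parameters frequently (after individual data points or small batches) with their "concurrent" counterparts, which recompute all parameters once per full pass. As a rough single-machine illustration of that distinction, here is a minimal sketch using k-means, the simplest of the three applications the paper names. This is not the paper's FreEM framework; both function names and the update scheme shown are invented for this example.

```python
def batch_kmeans_step(points, centroids):
    """Concurrent update: assign every point first, then recompute all
    centroids once at the end of the pass (classic batch k-means)."""
    k = len(centroids)
    sums = [[0.0, 0.0] for _ in range(k)]
    counts = [0] * k
    for x, y in points:
        # Index of the nearest centroid (squared Euclidean distance).
        j = min(range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
        sums[j][0] += x
        sums[j][1] += y
        counts[j] += 1
    # New centroid = mean of its assigned points; keep old one if empty.
    return [(sums[j][0] / counts[j], sums[j][1] / counts[j]) if counts[j]
            else centroids[j]
            for j in range(k)]


def online_kmeans_pass(points, centroids, counts):
    """Frequent update: move the nearest centroid immediately after *each*
    point, so later assignments already see the refined parameters."""
    centroids = [list(c) for c in centroids]
    for x, y in points:
        j = min(range(len(centroids)),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
        counts[j] += 1
        eta = 1.0 / counts[j]  # per-centroid step size: running mean of its points
        centroids[j][0] += eta * (x - centroids[j][0])
        centroids[j][1] += eta * (y - centroids[j][1])
    return [tuple(c) for c in centroids], counts
```

The parallelization challenge the abstract refers to is visible even here: the batch step is trivially data-parallel (partial sums can be merged across workers), while the frequent-update variant creates a dependency between consecutive points, which is what makes distributing it while preserving convergence non-trivial.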
doi_str_mv 10.1109/CLUSTER.2012.81
format Conference Proceeding
isbn 9781467324229
1467324221
eisbn 0769548075
9780769548074
coden IEEPAD
publisher IEEE
date 2012-01-01
fulltext fulltext_linktorsrc
identifier ISSN: 1552-5244
ispartof 2012 IEEE International Conference on Cluster Computing, 2012, p.275-283
issn 1552-5244
2168-9253
language eng
recordid cdi_ieee_primary_6337789
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Acceleration
Algorithm design and analysis
Clustering algorithms
Convergence
Frequency control
Linear programming
Synchronization
title Accelerating Expectation-Maximization Algorithms with Frequent Updates
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T14%3A17%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Accelerating%20Expectation-Maximization%20Algorithms%20with%20Frequent%20Updates&rft.btitle=2012%20IEEE%20International%20Conference%20on%20Cluster%20Computing&rft.au=Jiangtao%20Yin&rft.date=2012-01-01&rft.spage=275&rft.epage=283&rft.pages=275-283&rft.issn=1552-5244&rft.eissn=2168-9253&rft.isbn=9781467324229&rft.isbn_list=1467324221&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CLUSTER.2012.81&rft_dat=%3Cieee_6IE%3E6337789%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=0769548075&rft.eisbn_list=9780769548074&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6337789&rfr_iscdi=true