Audio-Visual Active Speaker Tracking in Cluttered Indoors Environments

We propose a system for detecting the active speaker in cluttered and reverberant environments where more than one person speaks and moves. Rather than using only audio information, the system utilizes audiovisual information from multiple acoustic and video sensors that feed separate audio and video tracking modules. The audio module operates using a particle filter (PF) and an information-theoretic framework to provide accurate acoustic source location under reverberant conditions. The video subsystem combines in 3-D a number of 2-D trackers based on a variation of Stauffer's adaptive background algorithm with spatiotemporal adaptation of the learning parameters and a Kalman tracker in a feedback configuration. Extensive experiments show that gains are to be expected when fusion of the separate modalities is performed to detect the active speaker.

Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE transactions on cybernetics 2009-02, Vol.39 (1), p.7-15
Main Authors: Talantzis, Fotios, Pnevmatikakis, Aristodemos, Constantinides, Anthony G.
Format: Article
Language: English
Subjects:
Online Access: Order full text
container_end_page 15
container_issue 1
container_start_page 7
container_title IEEE transactions on cybernetics
container_volume 39
creator Talantzis, Fotios
Pnevmatikakis, Aristodemos
Constantinides, Anthony G.
description We propose a system for detecting the active speaker in cluttered and reverberant environments where more than one person speaks and moves. Rather than using only audio information, the system utilizes audiovisual information from multiple acoustic and video sensors that feed separate audio and video tracking modules. The audio module operates using a particle filter (PF) and an information-theoretic framework to provide accurate acoustic source location under reverberant conditions. The video subsystem combines in 3-D a number of 2-D trackers based on a variation of Stauffer's adaptive background algorithm with spatiotemporal adaptation of the learning parameters and a Kalman tracker in a feedback configuration. Extensive experiments show that gains are to be expected when fusion of the separate modalities is performed to detect the active speaker.
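The abstract describes the audio module only at a high level: a particle filter (PF) tracking the acoustic source position. As an illustration of the general bootstrap-PF technique — not the authors' information-theoretic formulation, whose likelihood model and state space are not given here — a minimal sketch tracking a 2-D source from noisy position observations might look like:

```python
import numpy as np

def particle_filter_track(observations, n_particles=500, motion_std=0.1,
                          obs_std=0.3, seed=0):
    """Bootstrap particle filter over 2-D source position.

    observations: array of shape (T, 2) of noisy position estimates
                  (e.g. derived from per-frame acoustic localization).
    Returns an array (T, 2) of weighted-mean position estimates.
    """
    rng = np.random.default_rng(seed)
    # Initialize particles uniformly over the assumed room extent.
    particles = rng.uniform(-1.0, 1.0, size=(n_particles, 2))
    weights = np.full(n_particles, 1.0 / n_particles)
    estimates = []
    for z in observations:
        # Predict: random-walk motion model for the moving speaker.
        particles += rng.normal(0.0, motion_std, size=particles.shape)
        # Update: Gaussian likelihood of the observation given each particle.
        d2 = np.sum((particles - z) ** 2, axis=1)
        weights *= np.exp(-0.5 * d2 / obs_std ** 2)
        weights += 1e-300          # avoid total weight collapse
        weights /= weights.sum()
        estimates.append(weights @ particles)
        # Systematic resampling when the effective sample size drops.
        if 1.0 / np.sum(weights ** 2) < n_particles / 2:
            positions = (rng.random() + np.arange(n_particles)) / n_particles
            idx = np.minimum(np.searchsorted(np.cumsum(weights), positions),
                             n_particles - 1)
            particles = particles[idx]
            weights = np.full(n_particles, 1.0 / n_particles)
    return np.array(estimates)
```

All parameter values (room extent, motion and observation noise, resampling threshold) are placeholder assumptions for illustration; the paper's actual observation model operates on reverberant multi-microphone audio rather than direct position measurements.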
doi_str_mv 10.1109/TSMCB.2008.2009558
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 1083-4419
ispartof IEEE transactions on cybernetics, 2009-02, Vol.39 (1), p.7-15
issn 1083-4419
2168-2267
1941-0492
2168-2275
language eng
recordid cdi_proquest_miscellaneous_66831126
source IEEE Electronic Library (IEL)
subjects Acoustic sensors
Acoustic signal detection
Feeds
Indoor environments
Information theory
Kalman filters
Loudspeakers
Particle filters
particle filters (PFs)
person tracking
Position measurement
Sensor systems
Spatiotemporal phenomena
title Audio-Visual Active Speaker Tracking in Cluttered Indoors Environments
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T02%3A37%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Audio-Visual%20Active%20Speaker%20Tracking%20in%20Cluttered%20Indoors%20Environments&rft.jtitle=IEEE%20transactions%20on%20cybernetics&rft.au=Talantzis,%20Fotios&rft.date=2009-02-01&rft.volume=39&rft.issue=1&rft.spage=7&rft.epage=15&rft.pages=7-15&rft.issn=1083-4419&rft.eissn=1941-0492&rft.coden=ITSCFI&rft_id=info:doi/10.1109/TSMCB.2008.2009558&rft_dat=%3Cproquest_RIE%3E2295557831%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=857503183&rft_id=info:pmid/19150757&rft_ieee_id=4694076&rfr_iscdi=true