Security monitoring using microphone arrays and audio classification

This paper proposes a security monitoring system that can detect and classify the location and nature of different sounds within a room. The system is reliable and robust even in the presence of reverberation and in low signal-to-noise-ratio (SNR) environments. The authors describe a novel algorithm for audio classification which first classifies an audio segment as speech or nonspeech and then classifies nonspeech segments into a particular audio type. To classify a segment as speech or nonspeech, the algorithm divides the segment into frames, estimates the presence of pitch in each frame, and calculates a pitch ratio (PR) parameter; this PR parameter discriminates speech segments from nonspeech segments. The decision threshold for the PR parameter is adaptive, to accommodate different environments. A time-delay neural network then classifies nonspeech segments into an audio type. The algorithm's performance is evaluated on a library of audio segments containing both speech and nonspeech sounds, such as windows breaking and footsteps, under different SNR conditions with and without reverberation. Using 0.4-s audio segments, the algorithm achieves an average classification accuracy of 94.5% on the reverberant library and 95.1% on the nonreverberant library.
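The pitch-ratio idea described in the abstract — frame the segment, flag frames that contain pitch, and take the fraction of pitched frames as the PR parameter — can be illustrated with a minimal Python sketch. This is a reconstruction for illustration only, not the authors' implementation: the frame length, the lag range, the voicing test (a normalized-autocorrelation peak), and the fixed 0.5 threshold are all assumptions here; in the paper the threshold is adapted to the environment.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def has_pitch(frame, sr, fmin=60.0, fmax=400.0, peak_thresh=0.3):
    """Crude voicing test: a strong normalized-autocorrelation peak in the
    60-400 Hz lag range suggests a pitched (voiced) frame."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1 :]
    if ac[0] <= 0:          # silent frame: no energy, no pitch
        return False
    ac = ac / ac[0]         # normalize so the lag-0 value is 1
    lo, hi = int(sr / fmax), int(sr / fmin)
    return bool(ac[lo:hi].max() > peak_thresh)

def pitch_ratio(x, sr, frame_len=512, hop=256):
    """Fraction of frames judged pitched: the PR parameter."""
    frames = frame_signal(x, frame_len, hop)
    return sum(has_pitch(f, sr) for f in frames) / len(frames)

def is_speech(x, sr, threshold=0.5):
    """Label a segment speech when its pitch ratio exceeds a threshold
    (fixed here; adaptive in the paper)."""
    return pitch_ratio(x, sr) > threshold
```

A pitched test tone yields a pitch ratio near 1, while white noise yields a ratio near 0, which is the separation the PR parameter relies on before the nonspeech segments are passed to the neural network stage.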


Bibliographic Details
Published in: IEEE Transactions on Instrumentation and Measurement, 2006-08, Vol. 55 (4), p. 1025-1032
Authors: Abu-El-Quran, A.R., Goubran, R.A., Chan, A.D.C.
Format: Article
Language: English
doi 10.1109/TIM.2006.876394
format Article
issn 0018-9456
eissn 1557-9662
source IEEE Electronic Library (IEL)
subjects Algorithms
Audio classification
beamforming
Classification
Classification algorithms
feature extraction
Frames
Libraries
Microphone arrays
Monitoring
Monitoring systems
Neural networks
Performance evaluation
Reverberation
Robustness
Security
security monitoring
Segments
Speech
speech processing