Security monitoring using microphone arrays and audio classification

This paper proposes a security monitoring system that can detect and classify the location and nature of different sounds within a room. The system is reliable and robust even in the presence of reverberation and in low signal-to-noise-ratio (SNR) environments. The authors describe a novel algorithm for audio classification which first classifies an audio segment as speech or nonspeech and then classifies nonspeech segments into a particular audio type. To classify a segment as speech or nonspeech, the algorithm divides the segment into frames, estimates the presence of pitch in each frame, and calculates a pitch ratio (PR) parameter; this PR parameter discriminates speech segments from nonspeech segments. The decision threshold for the PR parameter is adaptive, to accommodate different environments. A time-delay neural network then classifies nonspeech segments into an audio type. The algorithm's performance is evaluated on a library of audio segments containing both speech and nonspeech sounds, such as windows breaking and footsteps, under different SNR conditions with and without reverberation. Using 0.4-s audio segments, the algorithm achieves an average classification accuracy of 94.5% on the reverberant library and 95.1% on the nonreverberant library.
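The pitch-ratio idea described in the abstract — frame the segment, flag frames that contain pitch, and take the fraction of pitched frames as the PR parameter — can be illustrated with a minimal Python sketch. This is a reconstruction for illustration only, not the authors' implementation: the frame length, the lag range, the voicing test (a normalized-autocorrelation peak), and the fixed 0.5 threshold are all assumptions here; in the paper the threshold is adapted to the environment.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def has_pitch(frame, sr, fmin=60.0, fmax=400.0, peak_thresh=0.3):
    """Crude voicing test: a strong normalized-autocorrelation peak in the
    60-400 Hz lag range suggests a pitched (voiced) frame."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1 :]
    if ac[0] <= 0:          # silent frame: no energy, no pitch
        return False
    ac = ac / ac[0]         # normalize so the lag-0 value is 1
    lo, hi = int(sr / fmax), int(sr / fmin)
    return bool(ac[lo:hi].max() > peak_thresh)

def pitch_ratio(x, sr, frame_len=512, hop=256):
    """Fraction of frames judged pitched: the PR parameter."""
    frames = frame_signal(x, frame_len, hop)
    return sum(has_pitch(f, sr) for f in frames) / len(frames)

def is_speech(x, sr, threshold=0.5):
    """Label a segment speech when its pitch ratio exceeds a threshold
    (fixed here; adaptive in the paper)."""
    return pitch_ratio(x, sr) > threshold
```

A pitched test tone yields a pitch ratio near 1, while white noise yields a ratio near 0, which is the separation the PR parameter relies on before the nonspeech segments are passed to the neural network stage.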


Bibliographic Details
Published in: IEEE Transactions on Instrumentation and Measurement, 2006-08, Vol. 55 (4), p. 1025-1032
Authors: Abu-El-Quran, A.R., Goubran, R.A., Chan, A.D.C.
Format: Article
Language: English
doi 10.1109/TIM.2006.876394
format Article
issn 0018-9456
eissn 1557-9662
source IEEE Electronic Library (IEL)
subjects Algorithms
Audio classification
beamforming
Classification
Classification algorithms
feature extraction
Frames
Libraries
Microphone arrays
Monitoring
Monitoring systems
Neural networks
Performance evaluation
Reverberation
Robustness
Security
security monitoring
Segments
Speech
speech processing