Security monitoring using microphone arrays and audio classification
In this paper, the authors propose a security monitoring system that can detect and classify the location and nature of different sounds within a room. This system is reliable and robust even in the presence of reverberation and in low signal-to-noise ratio (SNR) environments. They describe a novel algorithm...
Saved in:
Published in: | IEEE transactions on instrumentation and measurement 2006-08, Vol.55 (4), p.1025-1032 |
---|---|
Main authors: | Abu-El-Quran, A.R., Goubran, R.A., Chan, A.D.C. |
Format: | Article |
Language: | eng |
Keywords: | Audio classification, beamforming, feature extraction, microphone arrays, security monitoring, speech processing |
Online access: | Order full text |
container_end_page | 1032 |
---|---|
container_issue | 4 |
container_start_page | 1025 |
container_title | IEEE transactions on instrumentation and measurement |
container_volume | 55 |
creator | Abu-El-Quran, A.R. Goubran, R.A. Chan, A.D.C. |
description | In this paper, the authors propose a security monitoring system that can detect and classify the location and nature of different sounds within a room. This system is reliable and robust even in the presence of reverberation and in low signal-to-noise ratio (SNR) environments. They describe a novel algorithm for audio classification, which, first, classifies an audio segment as speech or nonspeech and, second, classifies nonspeech audio segments into a particular audio type. To classify an audio segment as speech or nonspeech, this algorithm divides the audio segment into frames, estimates the presence of pitch in each frame, and calculates a pitch ratio (PR) parameter; it is this PR parameter that is used to discriminate speech audio segments from nonspeech audio segments. The discerning threshold for the PR parameter is adaptive to accommodate different environments. A time-delayed neural network is employed to further classify nonspeech audio segments into an audio type. The performance of this novel audio classification algorithm is evaluated using a library of audio segments. This library includes both speech segments and nonspeech segments, such as windows breaking and footsteps. Evaluation is performed under different SNR environments, both with and without reverberation. Using 0.4-s audio segments, the proposed algorithm can achieve an average classification accuracy of 94.5% for the reverberant library and 95.1% for the nonreverberant library. |
doi_str_mv | 10.1109/TIM.2006.876394 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0018-9456 |
ispartof | IEEE transactions on instrumentation and measurement, 2006-08, Vol.55 (4), p.1025-1032 |
issn | 0018-9456 1557-9662 |
language | eng |
recordid | cdi_ieee_primary_1658350 |
source | IEEE Electronic Library (IEL) |
subjects | Algorithms Audio classification beamforming Classification Classification algorithms feature extraction Frames Libraries Microphone arrays Monitoring Monitoring systems Neural networks Performance evaluation Reverberation Robustness Security security monitoring Segments Speech speech processing |
title | Security monitoring using microphone arrays and audio classification |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T05%3A50%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Security%20monitoring%20using%20microphone%20arrays%20and%20audio%20classification&rft.jtitle=IEEE%20transactions%20on%20instrumentation%20and%20measurement&rft.au=Abu-El-Quran,%20A.R.&rft.date=2006-08-01&rft.volume=55&rft.issue=4&rft.spage=1025&rft.epage=1032&rft.pages=1025-1032&rft.issn=0018-9456&rft.eissn=1557-9662&rft.coden=IEIMAO&rft_id=info:doi/10.1109/TIM.2006.876394&rft_dat=%3Cproquest_RIE%3E2340320961%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=865155083&rft_id=info:pmid/&rft_ieee_id=1658350&rfr_iscdi=true |
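The first stage of the abstract's pipeline (divide a 0.4-s segment into frames, test each frame for the presence of pitch, and threshold the resulting pitch ratio to separate speech from nonspeech) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the autocorrelation-based pitch test, the 60–400 Hz lag range, the 30-ms frame length, the 0.4 peak-strength cutoff, and the fixed 0.5 PR threshold (the paper's threshold is adaptive) are all placeholders chosen for the example.

```python
import numpy as np

def frame_has_pitch(frame, fs, fmin=60.0, fmax=400.0, strength=0.4):
    """Heuristic pitched/unpitched test via normalized autocorrelation.

    A frame is called 'pitched' if the autocorrelation peak within a
    plausible pitch-lag range is strong relative to lag zero. The
    60-400 Hz range and the 0.4 cutoff are illustrative assumptions,
    not values taken from the paper.
    """
    frame = frame - np.mean(frame)
    if not np.any(frame):
        return False
    # Nonnegative-lag autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), min(int(fs / fmin), len(ac) - 1)
    if lo >= hi:
        return False
    return np.max(ac[lo:hi + 1]) / ac[0] > strength

def pitch_ratio(segment, fs, frame_ms=30):
    """PR parameter: fraction of frames in the segment that contain pitch."""
    n = int(fs * frame_ms / 1000)
    frames = [segment[i:i + n] for i in range(0, len(segment) - n + 1, n)]
    if not frames:
        return 0.0
    return sum(frame_has_pitch(f, fs) for f in frames) / len(frames)

def is_speech(segment, fs, threshold=0.5):
    """Stage 1: label the segment as speech if its PR exceeds the threshold.

    The paper adapts this threshold to the environment; a fixed 0.5 is
    used here only as a placeholder.
    """
    return pitch_ratio(segment, fs) > threshold

# Example: a 0.4-s segment of a 150-Hz tone (pitched) vs. white noise.
fs = 8000
t = np.arange(int(0.4 * fs)) / fs
tone = np.sin(2 * np.pi * 150 * t)
noise = np.random.default_rng(0).standard_normal(len(t))
print(is_speech(tone, fs))   # pitched -> high PR -> True
print(is_speech(noise, fs))  # unpitched -> low PR -> False
```

Speech-labeled segments would pass to the localization stage, while nonspeech segments would go to the paper's second-stage time-delayed neural network for classification into an audio type (e.g., window breaking or footsteps); that network is not sketched here.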