Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP

In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL...

Detailed description

Bibliographic details
Published in: IEICE Transactions on Information and Systems 2013/12/01, Vol.E96.D(12), pp.2888-2891
Main authors: SONG, Ji-Hyun; LEE, Sangmin
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 2891
container_issue 12
container_start_page 2888
container_title IEICE Transactions on Information and Systems
container_volume E96.D
creator SONG, Ji-Hyun
LEE, Sangmin
description In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variances of the speech and noise components of the GNL distribution are estimated using higher-order moments. After in-depth analysis of the estimated variances, a feature that is useful for discriminating between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. To account for the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of the proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms.
doi_str_mv 10.1587/transinf.E96.D.2888
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1551037850</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1551037850</sourcerecordid><originalsourceid>FETCH-LOGICAL-c591t-5933c4b8615c5c4c81a356f71095e86c569bbbaebdedaedcfb43dd208710235c3</originalsourceid><addsrcrecordid>eNpdkE9PGzEQxa0KpAboJ-hlL5W4bPCfeOM9hgQoVQo9tL1as97Z1Mixg-0gwaevQ2iEepo3o9-bJz1CPjM6ZlJNL3IEn6wfxldtM16MuVLqAxmx6UTWTDTsiIxoy5paScE_kpOUHihlijM5IqvfwRqsZibbJ5ufqwVmLDr46hIS9lURN-gxgrMvZb0LcQ2uXsLGQbEtbMrRdttXw603IW5ChGz9qpoH39vdHVz1ffbjjBwP4BJ-epun5Nf11c_513p5f3M7ny1rI1uWa9kKYSadapg00kyMYiBkM0wZbSWqxsim7boOsOuxB-zN0E1E33OqCsGFNOKUnO__bmJ43GLKem2TQefAY9gmzaRkVEyVpAUVe9TEkFLEQW-iXUN81ozqXa36X6261KoXeldrcX15C4BkwA0FMTYdrFxRwRVvCvdtzz2kDCs8ABCzNQ7__834u5ADZP5A1OjFXzJnl5Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1551037850</pqid></control><display><type>article</type><title>Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP</title><source>J-STAGE日本語サイト (Free Access)</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>SONG, Ji-Hyun ; LEE, Sangmin</creator><creatorcontrib>SONG, Ji-Hyun ; LEE, Sangmin</creatorcontrib><description>In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variance of the speech and noise of the GNL distribution are estimated using higher-order moments. After in-depth analysis of estimated variances, a feature that is useful for discrimination between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. 
To consider the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of our proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms.</description><identifier>ISSN: 0916-8532</identifier><identifier>EISSN: 1745-1361</identifier><identifier>DOI: 10.1587/transinf.E96.D.2888</identifier><language>eng</language><publisher>Oxford: The Institute of Electronics, Information and Communication Engineers</publisher><subject>Algorithms ; Applied sciences ; conditional maximum a posteriori ; Correlation ; Detection, estimation, filtering, equalization, prediction ; Discrimination ; Exact sciences and technology ; generalized normal-Laplace distribution ; higher order moments ; Information, signal and communications theory ; Noise ; Probability density functions ; Radiocommunications ; Receivers ; Signal and communications theory ; Signal processing ; Signal, noise ; Speech ; Speech processing ; Telecommunications ; Telecommunications and information theory ; Transmitters. 
Receivers ; Voice ; voice activity detection</subject><ispartof>IEICE Transactions on Information and Systems, 2013/12/01, Vol.E96.D(12), pp.2888-2891</ispartof><rights>2013 The Institute of Electronics, Information and Communication Engineers</rights><rights>2015 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c591t-5933c4b8615c5c4c81a356f71095e86c569bbbaebdedaedcfb43dd208710235c3</citedby><cites>FETCH-LOGICAL-c591t-5933c4b8615c5c4c81a356f71095e86c569bbbaebdedaedcfb43dd208710235c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,1877,4010,27900,27901,27902</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=28032826$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>SONG, Ji-Hyun</creatorcontrib><creatorcontrib>LEE, Sangmin</creatorcontrib><title>Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP</title><title>IEICE Transactions on Information and Systems</title><addtitle>IEICE Trans. Inf. &amp; Syst.</addtitle><description>In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variance of the speech and noise of the GNL distribution are estimated using higher-order moments. After in-depth analysis of estimated variances, a feature that is useful for discrimination between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. 
To consider the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of our proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>conditional maximum a posteriori</subject><subject>Correlation</subject><subject>Detection, estimation, filtering, equalization, prediction</subject><subject>Discrimination</subject><subject>Exact sciences and technology</subject><subject>generalized normal-Laplace distribution</subject><subject>higher order moments</subject><subject>Information, signal and communications theory</subject><subject>Noise</subject><subject>Probability density functions</subject><subject>Radiocommunications</subject><subject>Receivers</subject><subject>Signal and communications theory</subject><subject>Signal processing</subject><subject>Signal, noise</subject><subject>Speech</subject><subject>Speech processing</subject><subject>Telecommunications</subject><subject>Telecommunications and information theory</subject><subject>Transmitters. 
Receivers</subject><subject>Voice</subject><subject>voice activity detection</subject><issn>0916-8532</issn><issn>1745-1361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><recordid>eNpdkE9PGzEQxa0KpAboJ-hlL5W4bPCfeOM9hgQoVQo9tL1as97Z1Mixg-0gwaevQ2iEepo3o9-bJz1CPjM6ZlJNL3IEn6wfxldtM16MuVLqAxmx6UTWTDTsiIxoy5paScE_kpOUHihlijM5IqvfwRqsZibbJ5ufqwVmLDr46hIS9lURN-gxgrMvZb0LcQ2uXsLGQbEtbMrRdttXw603IW5ChGz9qpoH39vdHVz1ffbjjBwP4BJ-epun5Nf11c_513p5f3M7ny1rI1uWa9kKYSadapg00kyMYiBkM0wZbSWqxsim7boOsOuxB-zN0E1E33OqCsGFNOKUnO__bmJ43GLKem2TQefAY9gmzaRkVEyVpAUVe9TEkFLEQW-iXUN81ozqXa36X6261KoXeldrcX15C4BkwA0FMTYdrFxRwRVvCvdtzz2kDCs8ABCzNQ7__834u5ADZP5A1OjFXzJnl5Q</recordid><startdate>2013</startdate><enddate>2013</enddate><creator>SONG, Ji-Hyun</creator><creator>LEE, Sangmin</creator><general>The Institute of Electronics, Information and Communication Engineers</general><general>Oxford University Press</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>2013</creationdate><title>Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP</title><author>SONG, Ji-Hyun ; LEE, Sangmin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c591t-5933c4b8615c5c4c81a356f71095e86c569bbbaebdedaedcfb43dd208710235c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>conditional maximum a posteriori</topic><topic>Correlation</topic><topic>Detection, estimation, filtering, equalization, prediction</topic><topic>Discrimination</topic><topic>Exact sciences and technology</topic><topic>generalized normal-Laplace distribution</topic><topic>higher order 
moments</topic><topic>Information, signal and communications theory</topic><topic>Noise</topic><topic>Probability density functions</topic><topic>Radiocommunications</topic><topic>Receivers</topic><topic>Signal and communications theory</topic><topic>Signal processing</topic><topic>Signal, noise</topic><topic>Speech</topic><topic>Speech processing</topic><topic>Telecommunications</topic><topic>Telecommunications and information theory</topic><topic>Transmitters. Receivers</topic><topic>Voice</topic><topic>voice activity detection</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>SONG, Ji-Hyun</creatorcontrib><creatorcontrib>LEE, Sangmin</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEICE Transactions on Information and Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>SONG, Ji-Hyun</au><au>LEE, Sangmin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP</atitle><jtitle>IEICE Transactions on Information and Systems</jtitle><addtitle>IEICE Trans. Inf. 
&amp; Syst.</addtitle><date>2013</date><risdate>2013</risdate><volume>E96.D</volume><issue>12</issue><spage>2888</spage><epage>2891</epage><pages>2888-2891</pages><issn>0916-8532</issn><eissn>1745-1361</eissn><abstract>In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variance of the speech and noise of the GNL distribution are estimated using higher-order moments. After in-depth analysis of estimated variances, a feature that is useful for discrimination between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. To consider the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of our proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms.</abstract><cop>Oxford</cop><pub>The Institute of Electronics, Information and Communication Engineers</pub><doi>10.1587/transinf.E96.D.2888</doi><tpages>4</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0916-8532
ispartof IEICE Transactions on Information and Systems, 2013/12/01, Vol.E96.D(12), pp.2888-2891
issn 0916-8532
1745-1361
language eng
recordid cdi_proquest_miscellaneous_1551037850
source J-STAGE日本語サイト (Free Access); EZB-FREE-00999 freely available EZB journals
subjects Algorithms
Applied sciences
conditional maximum a posteriori
Correlation
Detection, estimation, filtering, equalization, prediction
Discrimination
Exact sciences and technology
generalized normal-Laplace distribution
higher order moments
Information, signal and communications theory
Noise
Probability density functions
Radiocommunications
Receivers
Signal and communications theory
Signal processing
Signal, noise
Speech
Speech processing
Telecommunications
Telecommunications and information theory
Transmitters. Receivers
Voice
voice activity detection
title Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T08%3A26%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Voice%20Activity%20Detection%20Based%20on%20Generalized%20Normal-Laplace%20Distribution%20Incorporating%20Conditional%20MAP&rft.jtitle=IEICE%20Transactions%20on%20Information%20and%20Systems&rft.au=SONG,%20Ji-Hyun&rft.date=2013&rft.volume=E96.D&rft.issue=12&rft.spage=2888&rft.epage=2891&rft.pages=2888-2891&rft.issn=0916-8532&rft.eissn=1745-1361&rft_id=info:doi/10.1587/transinf.E96.D.2888&rft_dat=%3Cproquest_cross%3E1551037850%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1551037850&rft_id=info:pmid/&rfr_iscdi=true
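
Illustrative sketch (not part of the catalog record). The description field outlines three steps: estimate speech and noise variances from higher-order moments, derive a feature and compare it to a threshold, and let the previous frame's result influence the decision. The Python sketch below shows one way such a pipeline could look under strong simplifying assumptions. It models a frame as an independent zero-mean Laplace (speech) plus Gaussian (noise) sum rather than the paper's generalized normal-Laplace distribution, and it uses the standard fact that the fourth cumulant of such a sum equals 3 times the squared Laplace variance. The function names, the feature (share of frame power attributed to the Laplace part), the two thresholds, and the synthetic frame generator are all hypothetical and are not taken from the paper.

```python
import math
import random


def estimate_variances(frame):
    """Split the power of a non-empty, zero-mean frame into (speech_var, noise_var).

    Assumption: frame = Laplace (speech) + Gaussian (noise), independent.
    Then m2 = speech_var + noise_var and the fourth cumulant
    k4 = m4 - 3*m2^2 = 3*speech_var^2 (the Gaussian part contributes 0).
    """
    n = len(frame)
    m2 = sum(x * x for x in frame) / n
    m4 = sum(x ** 4 for x in frame) / n
    k4 = m4 - 3.0 * m2 * m2
    # Clip so both variance estimates stay non-negative.
    speech_var = min(math.sqrt(max(k4, 0.0) / 3.0), m2)
    return speech_var, m2 - speech_var


def vad_decisions(frames, thr_after_noise=0.5, thr_after_speech=0.3):
    """Return one boolean per frame (True = speech).

    The threshold depends on the previous decision, a simple stand-in for
    a decision rule conditioned on the previous frame. Both thresholds are
    hypothetical, untuned values.
    """
    decisions = []
    prev_speech = False
    for frame in frames:
        speech_var, noise_var = estimate_variances(frame)
        total = speech_var + noise_var
        # Feature: share of frame power attributed to the Laplace part.
        feature = speech_var / total if total > 0.0 else 0.0
        threshold = thr_after_speech if prev_speech else thr_after_noise
        prev_speech = feature > threshold
        decisions.append(prev_speech)
    return decisions


def synth_frame(n, speech_std, noise_std, rng):
    """Hypothetical test signal: Laplace 'speech' plus Gaussian 'noise'."""
    scale = speech_std / math.sqrt(2.0)  # Laplace variance = 2 * scale^2
    return [
        scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        + rng.gauss(0.0, noise_std)
        for _ in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    frames = [
        synth_frame(20000, 0.0, 1.0, rng),  # noise only
        synth_frame(20000, 2.0, 1.0, rng),  # speech-like plus noise
    ]
    print(vad_decisions(frames))
```

Fourth-moment estimates are noisy on short frames, so a practical detector would likely smooth the moments across frames; the long synthetic frames here only keep the demonstration stable.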