Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP
In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL...
Published in: | IEICE Transactions on Information and Systems 2013/12/01, Vol.E96.D(12), pp.2888-2891 |
---|---|
Main authors: | SONG, Ji-Hyun ; LEE, Sangmin |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 2891 |
---|---|
container_issue | 12 |
container_start_page | 2888 |
container_title | IEICE Transactions on Information and Systems |
container_volume | E96.D |
creator | SONG, Ji-Hyun LEE, Sangmin |
description | In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variance of the speech and noise of the GNL distribution are estimated using higher-order moments. After in-depth analysis of estimated variances, a feature that is useful for discrimination between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. To consider the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of our proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms. |
doi_str_mv | 10.1587/transinf.E96.D.2888 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1551037850</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1551037850</sourcerecordid><originalsourceid>FETCH-LOGICAL-c591t-5933c4b8615c5c4c81a356f71095e86c569bbbaebdedaedcfb43dd208710235c3</originalsourceid><addsrcrecordid>eNpdkE9PGzEQxa0KpAboJ-hlL5W4bPCfeOM9hgQoVQo9tL1as97Z1Mixg-0gwaevQ2iEepo3o9-bJz1CPjM6ZlJNL3IEn6wfxldtM16MuVLqAxmx6UTWTDTsiIxoy5paScE_kpOUHihlijM5IqvfwRqsZibbJ5ufqwVmLDr46hIS9lURN-gxgrMvZb0LcQ2uXsLGQbEtbMrRdttXw603IW5ChGz9qpoH39vdHVz1ffbjjBwP4BJ-epun5Nf11c_513p5f3M7ny1rI1uWa9kKYSadapg00kyMYiBkM0wZbSWqxsim7boOsOuxB-zN0E1E33OqCsGFNOKUnO__bmJ43GLKem2TQefAY9gmzaRkVEyVpAUVe9TEkFLEQW-iXUN81ozqXa36X6261KoXeldrcX15C4BkwA0FMTYdrFxRwRVvCvdtzz2kDCs8ABCzNQ7__834u5ADZP5A1OjFXzJnl5Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1551037850</pqid></control><display><type>article</type><title>Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP</title><source>J-STAGE日本語サイト (Free Access)</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>SONG, Ji-Hyun ; LEE, Sangmin</creator><creatorcontrib>SONG, Ji-Hyun ; LEE, Sangmin</creatorcontrib><description>In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variance of the speech and noise of the GNL distribution are estimated using higher-order moments. After in-depth analysis of estimated variances, a feature that is useful for discrimination between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. 
To consider the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of our proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms.</description><identifier>ISSN: 0916-8532</identifier><identifier>EISSN: 1745-1361</identifier><identifier>DOI: 10.1587/transinf.E96.D.2888</identifier><language>eng</language><publisher>Oxford: The Institute of Electronics, Information and Communication Engineers</publisher><subject>Algorithms ; Applied sciences ; conditional maximum a posteriori ; Correlation ; Detection, estimation, filtering, equalization, prediction ; Discrimination ; Exact sciences and technology ; generalized normal-Laplace distribution ; higher order moments ; Information, signal and communications theory ; Noise ; Probability density functions ; Radiocommunications ; Receivers ; Signal and communications theory ; Signal processing ; Signal, noise ; Speech ; Speech processing ; Telecommunications ; Telecommunications and information theory ; Transmitters. 
Receivers ; Voice ; voice activity detection</subject><ispartof>IEICE Transactions on Information and Systems, 2013/12/01, Vol.E96.D(12), pp.2888-2891</ispartof><rights>2013 The Institute of Electronics, Information and Communication Engineers</rights><rights>2015 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c591t-5933c4b8615c5c4c81a356f71095e86c569bbbaebdedaedcfb43dd208710235c3</citedby><cites>FETCH-LOGICAL-c591t-5933c4b8615c5c4c81a356f71095e86c569bbbaebdedaedcfb43dd208710235c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,1877,4010,27900,27901,27902</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=28032826$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>SONG, Ji-Hyun</creatorcontrib><creatorcontrib>LEE, Sangmin</creatorcontrib><title>Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP</title><title>IEICE Transactions on Information and Systems</title><addtitle>IEICE Trans. Inf. & Syst.</addtitle><description>In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variance of the speech and noise of the GNL distribution are estimated using higher-order moments. After in-depth analysis of estimated variances, a feature that is useful for discrimination between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. 
To consider the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of our proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>conditional maximum a posteriori</subject><subject>Correlation</subject><subject>Detection, estimation, filtering, equalization, prediction</subject><subject>Discrimination</subject><subject>Exact sciences and technology</subject><subject>generalized normal-Laplace distribution</subject><subject>higher order moments</subject><subject>Information, signal and communications theory</subject><subject>Noise</subject><subject>Probability density functions</subject><subject>Radiocommunications</subject><subject>Receivers</subject><subject>Signal and communications theory</subject><subject>Signal processing</subject><subject>Signal, noise</subject><subject>Speech</subject><subject>Speech processing</subject><subject>Telecommunications</subject><subject>Telecommunications and information theory</subject><subject>Transmitters. 
Receivers</subject><subject>Voice</subject><subject>voice activity detection</subject><issn>0916-8532</issn><issn>1745-1361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><recordid>eNpdkE9PGzEQxa0KpAboJ-hlL5W4bPCfeOM9hgQoVQo9tL1as97Z1Mixg-0gwaevQ2iEepo3o9-bJz1CPjM6ZlJNL3IEn6wfxldtM16MuVLqAxmx6UTWTDTsiIxoy5paScE_kpOUHihlijM5IqvfwRqsZibbJ5ufqwVmLDr46hIS9lURN-gxgrMvZb0LcQ2uXsLGQbEtbMrRdttXw603IW5ChGz9qpoH39vdHVz1ffbjjBwP4BJ-epun5Nf11c_513p5f3M7ny1rI1uWa9kKYSadapg00kyMYiBkM0wZbSWqxsim7boOsOuxB-zN0E1E33OqCsGFNOKUnO__bmJ43GLKem2TQefAY9gmzaRkVEyVpAUVe9TEkFLEQW-iXUN81ozqXa36X6261KoXeldrcX15C4BkwA0FMTYdrFxRwRVvCvdtzz2kDCs8ABCzNQ7__834u5ADZP5A1OjFXzJnl5Q</recordid><startdate>2013</startdate><enddate>2013</enddate><creator>SONG, Ji-Hyun</creator><creator>LEE, Sangmin</creator><general>The Institute of Electronics, Information and Communication Engineers</general><general>Oxford University Press</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>2013</creationdate><title>Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP</title><author>SONG, Ji-Hyun ; LEE, Sangmin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c591t-5933c4b8615c5c4c81a356f71095e86c569bbbaebdedaedcfb43dd208710235c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>conditional maximum a posteriori</topic><topic>Correlation</topic><topic>Detection, estimation, filtering, equalization, prediction</topic><topic>Discrimination</topic><topic>Exact sciences and technology</topic><topic>generalized normal-Laplace distribution</topic><topic>higher order 
moments</topic><topic>Information, signal and communications theory</topic><topic>Noise</topic><topic>Probability density functions</topic><topic>Radiocommunications</topic><topic>Receivers</topic><topic>Signal and communications theory</topic><topic>Signal processing</topic><topic>Signal, noise</topic><topic>Speech</topic><topic>Speech processing</topic><topic>Telecommunications</topic><topic>Telecommunications and information theory</topic><topic>Transmitters. Receivers</topic><topic>Voice</topic><topic>voice activity detection</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>SONG, Ji-Hyun</creatorcontrib><creatorcontrib>LEE, Sangmin</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEICE Transactions on Information and Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>SONG, Ji-Hyun</au><au>LEE, Sangmin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP</atitle><jtitle>IEICE Transactions on Information and Systems</jtitle><addtitle>IEICE Trans. Inf. 
& Syst.</addtitle><date>2013</date><risdate>2013</risdate><volume>E96.D</volume><issue>12</issue><spage>2888</spage><epage>2891</epage><pages>2888-2891</pages><issn>0916-8532</issn><eissn>1745-1361</eissn><abstract>In this paper, we propose a novel voice activity detection (VAD) algorithm based on the generalized normal-Laplace (GNL) distribution to provide enhanced performance in adverse noise environments. Specifically, the probability density function (PDF) of a noisy speech signal is represented by the GNL distribution; the variance of the speech and noise of the GNL distribution are estimated using higher-order moments. After in-depth analysis of estimated variances, a feature that is useful for discrimination between speech and noise at low SNRs is derived and compared to a threshold to detect speech activity. To consider the inter-frame correlation of speech activity, the result from the previous frame is employed in the decision rule of the proposed VAD algorithm. The performance of our proposed VAD algorithm is evaluated in terms of receiver operating characteristics (ROC) and detection accuracy. Results show that the proposed method yields better results than conventional VAD algorithms.</abstract><cop>Oxford</cop><pub>The Institute of Electronics, Information and Communication Engineers</pub><doi>10.1587/transinf.E96.D.2888</doi><tpages>4</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0916-8532 |
ispartof | IEICE Transactions on Information and Systems, 2013/12/01, Vol.E96.D(12), pp.2888-2891 |
issn | 0916-8532 1745-1361 |
language | eng |
recordid | cdi_proquest_miscellaneous_1551037850 |
source | J-STAGE日本語サイト (Free Access); EZB-FREE-00999 freely available EZB journals |
subjects | Algorithms Applied sciences conditional maximum a posteriori Correlation Detection, estimation, filtering, equalization, prediction Discrimination Exact sciences and technology generalized normal-Laplace distribution higher order moments Information, signal and communications theory Noise Probability density functions Radiocommunications Receivers Signal and communications theory Signal processing Signal, noise Speech Speech processing Telecommunications Telecommunications and information theory Transmitters. Receivers Voice voice activity detection |
title | Voice Activity Detection Based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T08%3A26%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Voice%20Activity%20Detection%20Based%20on%20Generalized%20Normal-Laplace%20Distribution%20Incorporating%20Conditional%20MAP&rft.jtitle=IEICE%20Transactions%20on%20Information%20and%20Systems&rft.au=SONG,%20Ji-Hyun&rft.date=2013&rft.volume=E96.D&rft.issue=12&rft.spage=2888&rft.epage=2891&rft.pages=2888-2891&rft.issn=0916-8532&rft.eissn=1745-1361&rft_id=info:doi/10.1587/transinf.E96.D.2888&rft_dat=%3Cproquest_cross%3E1551037850%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1551037850&rft_id=info:pmid/&rfr_iscdi=true |
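The abstract states that a per-frame feature is compared to a threshold and that the previous frame's result enters the decision rule. The record gives no formulas, so the following is only a minimal sketch of that general idea, not the authors' GNL feature or their conditional MAP rule. The function name `vad_with_previous_frame`, the two thresholds, and the feature values are all hypothetical, chosen purely for illustration.

```python
from typing import List, Sequence


def vad_with_previous_frame(
    features: Sequence[float],
    thr_after_speech: float,
    thr_after_noise: float,
) -> List[int]:
    """Label each frame 1 (speech) or 0 (noise).

    Hypothetical simplification: the threshold applied to the current
    frame depends on the decision made for the previous frame. This is
    an assumption for illustration, not the rule from the paper.
    """
    decisions: List[int] = []
    prev = 0  # assumption: the stream starts in noise
    for value in features:
        # Pick the threshold according to the previous frame's decision.
        threshold = thr_after_speech if prev == 1 else thr_after_noise
        prev = 1 if value > threshold else 0
        decisions.append(prev)
    return decisions


# Made-up feature values; a lower threshold after a speech frame makes
# the detector slower to drop out of speech.
example = vad_with_previous_frame(
    [0.1, 0.6, 0.45, 0.45, 0.2, 0.45],
    thr_after_speech=0.4,
    thr_after_noise=0.5,
)
print(example)  # [0, 1, 1, 1, 0, 0]
```

In this sketch the value 0.45 is labeled speech when it follows a speech frame and noise when it follows a noise frame, which is one simple way inter-frame correlation can influence a threshold decision.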