Two-level Noise Robust and Block Featured PNN Model for Speaker Recognition in Real Environment
Speaker recognition is gaining popularity in a device and application-specific verification and validation to avoid complex textual passwords and keep remembering them. Various devices and applications have adapted speaker-based verification to ensure online and offline access. However, speaker reco...
Published in: | Wireless personal communications 2022, Vol.125 (4), p.3741-3771 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 3771 |
---|---|
container_issue | 4 |
container_start_page | 3741 |
container_title | Wireless personal communications |
container_volume | 125 |
creator | Juneja, Kapil |
description | Speaker recognition is gaining popularity for device- and application-specific verification and validation, since it avoids the need to create and remember complex textual passwords. Various devices and applications have adopted speaker-based verification to secure online and offline access. However, speaker recognition is affected by multiple device- and environment-specific disturbances. In this paper, the Two-level noise-robust PNN model (2LNR-PNN) is presented for effective recognition of the speaker. The noise is handled at both the pre-processing level and the featureset generation stage. High-level noise and situational turbulence are addressed using spectral subtraction and the GMM method. The noise-rectified signal is processed under frequency- and window-based computation to extract the MFCC, LPC, and statistical features. This composite featureset is processed by a Probabilistic Neural Network (PNN) to identify the speaker. The proposed model was evaluated on the THUYG-20 SRE Corpus and a self-collected real-time dataset. Separate experiments are conducted under different noise conditions with car, fan, white, cafeteria and babble noises. The experiments are validated against various feature processors, machine learning and deep learning models. The analytical observations are collected using accuracy, EER and FRR measures. The proposed model reports an average accuracy of over 80% and a maximum FRR of 0.2 in varied noises under 1 dB, 5 dB and 9 dB SNR conditions. The proposed model outperformed the compared machine learning and deep learning models with a significant performance gain. |
doi_str_mv | 10.1007/s11277-022-09734-7 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2700751583</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2700751583</sourcerecordid><originalsourceid>FETCH-LOGICAL-c249t-931e9dab7bd3f23dea3da68302ac150929b2ebc41e87da8a5a83ba78599c7a8b3</originalsourceid><addsrcrecordid>eNp9kF1LwzAUhoMoOKd_wKuA19F8tEtzqWNTYU6ZE7wLaXs6unXJTNqJ_97MCt55dTjwPu85PAhdMnrNKJU3gTEuJaGcE6qkSIg8QgOWSk4ykbwfowFVXJERZ_wUnYWwpjRiig-QXn460sAeGjx3dQC8cHkXWmxsie8aV2zwFEzbeSjxy3yOn1wZk5Xz-HUHZgMeL6BwK1u3tbO4tnE1DZ7Yfe2d3YJtz9FJZZoAF79ziN6mk-X4gcye7x_HtzNS8ES1RAkGqjS5zEtRcVGCEaUZZYJyU7D08HzOIS8SBpksTWZSk4ncyCxVqpAmy8UQXfW9O-8-OgitXrvO23hScxkNpSzNREzxPlV4F4KHSu98vTX-SzOqDyJ1L1JHkfpHpJYREj0UYtiuwP9V_0N9A2midjk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2700751583</pqid></control><display><type>article</type><title>Two-level Noise Robust and Block Featured PNN Model for Speaker Recognition in Real Environment</title><source>Springer Nature - Complete Springer Journals</source><creator>Juneja, Kapil</creator><creatorcontrib>Juneja, Kapil</creatorcontrib><description>Speaker recognition is gaining popularity in a device and application-specific verification and validation to avoid complex textual passwords and keep remembering them. Various devices and applications have adapted speaker-based verification to ensure online and offline access. However, speaker recognition is also affected by multiple devices and environment-specific disturbances. In this paper, the Two-level noise-robust PNN model (2LNR-PNN) is presented for the significant recognition of the speaker. The noise is handled during the pre-processing level and the featureset generation stage. The high-level noise and situational turbulence were addressed in this work using spectral subtraction and the GMM method. 
This rectified noise is processed under frequency and window-based computation to extract the MFCC, LPC, and statistical features. This composite featureset is processed under Probabilistic Neural Network (PNN) for identifying the speaker. The proposed model has experimented on THUYG-20 SRE Corpus and self-collected real-time dataset. The separate experiments are conducted in different noise conditions with car, fan, white, cafeteria and babble noises. The experiments are validated against various feature processors, machine learning and deep learning models. The analytical observations are collected using accuracy, EER and FRR measures. The proposed model claims an average accuracy of over 80% and a maximum FRR of 0.2 in varied noises with 1db, 5db and 9db SNR conditions. The proposed model outperformed the experimented machine learning and deep learning models with a significant performance gain.</description><identifier>ISSN: 0929-6212</identifier><identifier>EISSN: 1572-834X</identifier><identifier>DOI: 10.1007/s11277-022-09734-7</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Accuracy ; Communications Engineering ; Computer Communication Networks ; Deep learning ; Engineering ; Feature extraction ; Machine learning ; Networks ; Neural networks ; Noise ; Robustness ; Signal,Image and Speech Processing ; Speech recognition ; Statistical analysis ; Subtraction ; Verification</subject><ispartof>Wireless personal communications, 2022, Vol.125 (4), p.3741-3771</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022</rights><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 
2022.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c249t-931e9dab7bd3f23dea3da68302ac150929b2ebc41e87da8a5a83ba78599c7a8b3</citedby><cites>FETCH-LOGICAL-c249t-931e9dab7bd3f23dea3da68302ac150929b2ebc41e87da8a5a83ba78599c7a8b3</cites><orcidid>0000-0002-6351-3351</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11277-022-09734-7$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11277-022-09734-7$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51297</link.rule.ids></links><search><creatorcontrib>Juneja, Kapil</creatorcontrib><title>Two-level Noise Robust and Block Featured PNN Model for Speaker Recognition in Real Environment</title><title>Wireless personal communications</title><addtitle>Wireless Pers Commun</addtitle><description>Speaker recognition is gaining popularity in a device and application-specific verification and validation to avoid complex textual passwords and keep remembering them. Various devices and applications have adapted speaker-based verification to ensure online and offline access. However, speaker recognition is also affected by multiple devices and environment-specific disturbances. In this paper, the Two-level noise-robust PNN model (2LNR-PNN) is presented for the significant recognition of the speaker. The noise is handled during the pre-processing level and the featureset generation stage. The high-level noise and situational turbulence were addressed in this work using spectral subtraction and the GMM method. This rectified noise is processed under frequency and window-based computation to extract the MFCC, LPC, and statistical features. 
This composite featureset is processed under Probabilistic Neural Network (PNN) for identifying the speaker. The proposed model has experimented on THUYG-20 SRE Corpus and self-collected real-time dataset. The separate experiments are conducted in different noise conditions with car, fan, white, cafeteria and babble noises. The experiments are validated against various feature processors, machine learning and deep learning models. The analytical observations are collected using accuracy, EER and FRR measures. The proposed model claims an average accuracy of over 80% and a maximum FRR of 0.2 in varied noises with 1db, 5db and 9db SNR conditions. The proposed model outperformed the experimented machine learning and deep learning models with a significant performance gain.</description><subject>Accuracy</subject><subject>Communications Engineering</subject><subject>Computer Communication Networks</subject><subject>Deep learning</subject><subject>Engineering</subject><subject>Feature extraction</subject><subject>Machine learning</subject><subject>Networks</subject><subject>Neural networks</subject><subject>Noise</subject><subject>Robustness</subject><subject>Signal,Image and Speech Processing</subject><subject>Speech recognition</subject><subject>Statistical 
analysis</subject><subject>Subtraction</subject><subject>Verification</subject><issn>0929-6212</issn><issn>1572-834X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9kF1LwzAUhoMoOKd_wKuA19F8tEtzqWNTYU6ZE7wLaXs6unXJTNqJ_97MCt55dTjwPu85PAhdMnrNKJU3gTEuJaGcE6qkSIg8QgOWSk4ykbwfowFVXJERZ_wUnYWwpjRiig-QXn460sAeGjx3dQC8cHkXWmxsie8aV2zwFEzbeSjxy3yOn1wZk5Xz-HUHZgMeL6BwK1u3tbO4tnE1DZ7Yfe2d3YJtz9FJZZoAF79ziN6mk-X4gcye7x_HtzNS8ES1RAkGqjS5zEtRcVGCEaUZZYJyU7D08HzOIS8SBpksTWZSk4ncyCxVqpAmy8UQXfW9O-8-OgitXrvO23hScxkNpSzNREzxPlV4F4KHSu98vTX-SzOqDyJ1L1JHkfpHpJYREj0UYtiuwP9V_0N9A2midjk</recordid><startdate>2022</startdate><enddate>2022</enddate><creator>Juneja, Kapil</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-6351-3351</orcidid></search><sort><creationdate>2022</creationdate><title>Two-level Noise Robust and Block Featured PNN Model for Speaker Recognition in Real Environment</title><author>Juneja, Kapil</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c249t-931e9dab7bd3f23dea3da68302ac150929b2ebc41e87da8a5a83ba78599c7a8b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Communications Engineering</topic><topic>Computer Communication Networks</topic><topic>Deep learning</topic><topic>Engineering</topic><topic>Feature extraction</topic><topic>Machine learning</topic><topic>Networks</topic><topic>Neural networks</topic><topic>Noise</topic><topic>Robustness</topic><topic>Signal,Image and Speech Processing</topic><topic>Speech recognition</topic><topic>Statistical analysis</topic><topic>Subtraction</topic><topic>Verification</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Juneja, 
Kapil</creatorcontrib><collection>CrossRef</collection><jtitle>Wireless personal communications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Juneja, Kapil</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Two-level Noise Robust and Block Featured PNN Model for Speaker Recognition in Real Environment</atitle><jtitle>Wireless personal communications</jtitle><stitle>Wireless Pers Commun</stitle><date>2022</date><risdate>2022</risdate><volume>125</volume><issue>4</issue><spage>3741</spage><epage>3771</epage><pages>3741-3771</pages><issn>0929-6212</issn><eissn>1572-834X</eissn><abstract>Speaker recognition is gaining popularity in a device and application-specific verification and validation to avoid complex textual passwords and keep remembering them. Various devices and applications have adapted speaker-based verification to ensure online and offline access. However, speaker recognition is also affected by multiple devices and environment-specific disturbances. In this paper, the Two-level noise-robust PNN model (2LNR-PNN) is presented for the significant recognition of the speaker. The noise is handled during the pre-processing level and the featureset generation stage. The high-level noise and situational turbulence were addressed in this work using spectral subtraction and the GMM method. This rectified noise is processed under frequency and window-based computation to extract the MFCC, LPC, and statistical features. This composite featureset is processed under Probabilistic Neural Network (PNN) for identifying the speaker. The proposed model has experimented on THUYG-20 SRE Corpus and self-collected real-time dataset. The separate experiments are conducted in different noise conditions with car, fan, white, cafeteria and babble noises. The experiments are validated against various feature processors, machine learning and deep learning models. 
The analytical observations are collected using accuracy, EER and FRR measures. The proposed model claims an average accuracy of over 80% and a maximum FRR of 0.2 in varied noises with 1db, 5db and 9db SNR conditions. The proposed model outperformed the experimented machine learning and deep learning models with a significant performance gain.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11277-022-09734-7</doi><tpages>31</tpages><orcidid>https://orcid.org/0000-0002-6351-3351</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0929-6212 |
ispartof | Wireless personal communications, 2022, Vol.125 (4), p.3741-3771 |
issn | 0929-6212 1572-834X |
language | eng |
recordid | cdi_proquest_journals_2700751583 |
source | Springer Nature - Complete Springer Journals |
subjects | Accuracy Communications Engineering Computer Communication Networks Deep learning Engineering Feature extraction Machine learning Networks Neural networks Noise Robustness Signal,Image and Speech Processing Speech recognition Statistical analysis Subtraction Verification |
title | Two-level Noise Robust and Block Featured PNN Model for Speaker Recognition in Real Environment |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T06%3A08%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Two-level%20Noise%20Robust%20and%20Block%20Featured%20PNN%20Model%20for%20Speaker%20Recognition%20in%20Real%20Environment&rft.jtitle=Wireless%20personal%20communications&rft.au=Juneja,%20Kapil&rft.date=2022&rft.volume=125&rft.issue=4&rft.spage=3741&rft.epage=3771&rft.pages=3741-3771&rft.issn=0929-6212&rft.eissn=1572-834X&rft_id=info:doi/10.1007/s11277-022-09734-7&rft_dat=%3Cproquest_cross%3E2700751583%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2700751583&rft_id=info:pmid/&rfr_iscdi=true |
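
The abstract names a Probabilistic Neural Network (PNN) as the classifier and FRR as one of the evaluation measures. As a rough illustration only, the following Python sketch shows a generic textbook PNN (a Gaussian Parzen-window score per class, followed by an arg-max decision) and a simple false rejection rate calculation. It is not the 2LNR-PNN model from the article: the function names `pnn_classify` and `false_rejection_rate`, the smoothing parameter `sigma`, the two-dimensional toy vectors standing in for the MFCC/LPC/statistical featureset, and the threshold are all assumptions made for this example.

```python
import math


def pnn_classify(train, x, sigma=0.5):
    """Classify feature vector x with a generic PNN (illustrative sketch).

    train: dict mapping a speaker label to a list of feature vectors.
    sigma: assumed Gaussian smoothing parameter (not taken from the article).
    Returns the best label and the per-class scores.
    """
    scores = {}
    for label, patterns in train.items():
        total = 0.0
        for p in patterns:
            # Pattern layer: Gaussian kernel on the squared Euclidean distance.
            d2 = sum((a - b) ** 2 for a, b in zip(p, x))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation layer: average kernel response for this class.
        scores[label] = total / len(patterns)
    # Decision layer: pick the class with the highest score.
    best = max(scores, key=scores.get)
    return best, scores


def false_rejection_rate(genuine_scores, threshold):
    """Fraction of genuine trials whose score falls below the threshold."""
    rejected = sum(1 for s in genuine_scores if s < threshold)
    return rejected / len(genuine_scores)


# Toy data: hypothetical 2-D features for two hypothetical speakers.
train = {
    "spk_a": [[0.0, 0.0], [0.2, 0.1]],
    "spk_b": [[1.0, 1.0], [0.9, 1.2]],
}

label, class_scores = pnn_classify(train, [0.1, 0.0])
print(label)  # spk_a
print(false_rejection_rate([0.9, 0.8, 0.4, 0.7, 0.95], threshold=0.5))  # 0.2
```

In this sketch each stored training vector acts as one pattern unit, so training amounts to storing the vectors; the only tunable quantity is `sigma`, which would normally be chosen by validation.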