Data Driven Neural Speech Enhancement for Smart Healthcare in Consumer Electronics Applications
This paper presents the practical response and performance-aware development of online speech enhancement from a consumer electronics perspective. To improve the efficiency of human-machine interaction, speech can play a vital role as a transmission medium in the Internet of Medical Things (IoMT). How...
Published in: | IEEE transactions on consumer electronics 2024-05, Vol.70 (2), p.4828-4838 |
---|---|
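Main authors: | Paikrao, Pavan D. ; Mukherjee, Amrit ; Ghosh, Uttam ; Goswami, Pratik ; Novak, Milan ; Kumar Jain, Deepak ; Al-Numay, Mohammed S. ; Narwade, Pradeep |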
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 4838 |
---|---|
container_issue | 2 |
container_start_page | 4828 |
container_title | IEEE transactions on consumer electronics |
container_volume | 70 |
creator | Paikrao, Pavan D. ; Mukherjee, Amrit ; Ghosh, Uttam ; Goswami, Pratik ; Novak, Milan ; Kumar Jain, Deepak ; Al-Numay, Mohammed S. ; Narwade, Pradeep |
description | This paper presents the practical response and performance-aware development of online speech enhancement from a consumer electronics perspective. To improve the efficiency of human-machine interaction, speech can play a vital role as a transmission medium in the Internet of Medical Things (IoMT). However, some intelligent speech recognition systems cannot preserve the confidentiality of speech data. Additionally, preserving privacy is onerous, especially for model training and real-time speech recognition. The recent development of big data-oriented wireless technologies associated with edge computing, interconnected IoMT devices, and big data analytics has created great demand for connected human-machine interaction in applications such as automated cars, health monitoring, and consumer personal health care monitoring systems. Although big data-oriented wireless technologies serve these applications, the challenge of ignoring emotional care remains. First, this paper explains how to build a neural network-based architecture that can enhance the speech in multichannel first-order Ambisonics mixtures and reduce the need for human intervention through ambient intelligence (AmI). This will improve overall system performance in medical settings. Second, we demonstrate the effectiveness of different noise estimation techniques on the proposed modulation domain processing (MDP) applications in smart hospitals, including electronic medical documentation, disease diagnosis, and evaluation. The proposed approach outperforms the enhancement of the conventional modulation domain in the cortex on several objective evaluation parameters such as Log Likelihood Ratio (LLR), Weighted Spectral Slope (WSS), Perceptual Evaluation of Speech Quality (PESQ), Csig, and segmental SNR (SNRseg). Different noise estimators are used to determine what effect the system has on different spectral modification parameters, such as the over-subtraction factor and the modification domain. The experimental results show that the MDP system achieves better performance in terms of SNRseg scores (49%) for the state-of-the-art consumer electronics perspective in a health care system. The proposed framework would greatly contribute to personalized communication health monitoring by consumers in a noisy environment. |
doi_str_mv | 10.1109/TCE.2024.3387740 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_10496958</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10496958</ieee_id><sourcerecordid>3098886861</sourcerecordid><originalsourceid>FETCH-LOGICAL-c175t-22e6ca34fbedf51064c0801bcafff6d7acf00c5a246211133ed34843885712c63</originalsourceid><addsrcrecordid>eNpNkD1PwzAQhi0EEqWwMzBYYk7xd5yxSgNFqmBomS3XPaupUic4DhL_nlTtwHTDPe-d3gehR0pmlJLiZVNWM0aYmHGu81yQKzShUupMUJZfowkhhc44UfwW3fX9gRAqJNMTZBY2WbyI9Q8E_AFDtA1edwBuj6uwt8HBEULCvo14fbQx4SXYJu2djYDrgMs29MMRIq4acCm2oXY9nnddUzub6nF5j268bXp4uMwp-nqtNuUyW32-vZfzVeZoLlPGGChnufBb2HlJiRKOaEK3znrv1S63zhPipGVCMUop57DjQguutcwpc4pP0fP5bhfb7wH6ZA7tEMP40vCxutZKKzpS5Ey52PZ9BG-6WI-1fg0l5qTRjBrNSaO5aBwjT-dIDQD_cFGoQmr-B9vVbk0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3098886861</pqid></control><display><type>article</type><title>Data Driven Neural Speech Enhancement for Smart Healthcare in Consumer Electronics Applications</title><source>IEEE/IET Electronic Library (IEL)</source><creator>Paikrao, Pavan D. ; Mukherjee, Amrit ; Ghosh, Uttam ; Goswami, Pratik ; Novak, Milan ; Kumar Jain, Deepak ; Al-Numay, Mohammed S. ; Narwade, Pradeep</creator><creatorcontrib>Paikrao, Pavan D. ; Mukherjee, Amrit ; Ghosh, Uttam ; Goswami, Pratik ; Novak, Milan ; Kumar Jain, Deepak ; Al-Numay, Mohammed S. ; Narwade, Pradeep</creatorcontrib><description>This paper presents the practical response and performance-aware development of online speech enhancement from a consumer electronic perspective. To improve the efficiency of human-machine interaction, speech can play a vital role as a transmission medium on the Internet of Medical Things (IoM). However, some intelligent speech recognition systems cannot preserve the confidentiality of speech data. 
Additionally, the preservation of privacy is onerous, especially for model training and speech recognition in real-time. The recent development of big data-oriented wireless technologies associated with edge computing, interconnected devices of the Internet of Medical Things (IoMT), and big data analytics has great demand for connected human-machine interaction for many applications like automated cars, health monitoring, and consumer personal health care monitoring systems. Although big data-oriented wireless technologies serve these applications, the challenge remains of ignoring emotional care. This paper starts by explaining how to make a neural network-based architecture that can improve the speech of multichannel first-order Ambisonics mixtures and lower the need for human intervention through ambient intelligence (AmI). This will make the system work better overall in medical situations. Second, we demonstrate the effectiveness of different noise estimation techniques on proposed modulation domain processing (MDP) applications in smart hospitals, including electronic medical documentation, disease diagnosis, and evaluation. The proposed approach outperforms the enhancement of the conventional modulation domain in the cortex with several objective evaluation parameters such as Log Likelihood Ratio (LLR), Weighted Spectral Slope (WSS), Perceptual Evaluation of Speech Quality (PESQ), Csig and segmental (SNR seg.) Different noise estimators are used to figure out what effect the system has on different spectral modification parameters, like the over-subtraction factor and the modification domain. The experimental results show that the MDP system achieves better performance in terms of SNRseg. scores (49%) for the state-of-the-art consumer electronics perspective in a health care system. 
The proposed framework would greatly contribute to personalized communication health monitoring by consumers in a noisy environment.</description><identifier>ISSN: 0098-3063</identifier><identifier>EISSN: 1558-4127</identifier><identifier>DOI: 10.1109/TCE.2024.3387740</identifier><identifier>CODEN: ITCEDA</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Ambient intelligence ; Big Data ; big data computing ; Consumer electronics ; Consumer healthcare system ; Edge computing ; Electronics ; Health care ; Human-computer interaction ; Internet of medical things ; Likelihood ratio ; Medical electronics ; Medical services ; Modulation ; Monitoring ; Neural networks ; Noise measurement ; Parameter estimation ; Parameter modification ; Personal health ; Real time ; Real-time systems ; Speech ; Speech enhancement ; Speech processing ; Speech recognition ; Telemedicine ; Voice recognition</subject><ispartof>IEEE transactions on consumer electronics, 2024-05, Vol.70 (2), p.4828-4838</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c175t-22e6ca34fbedf51064c0801bcafff6d7acf00c5a246211133ed34843885712c63</cites><orcidid>0000-0003-1698-8888 ; 0000-0002-4226-4536 ; 0000-0001-8576-2060 ; 0000-0003-3636-8002 ; 0000-0002-6714-5568 ; 0000-0003-2419-7189</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10496958$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,796,27923,27924,54757</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10496958$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Paikrao, Pavan D.</creatorcontrib><creatorcontrib>Mukherjee, Amrit</creatorcontrib><creatorcontrib>Ghosh, Uttam</creatorcontrib><creatorcontrib>Goswami, Pratik</creatorcontrib><creatorcontrib>Novak, Milan</creatorcontrib><creatorcontrib>Kumar Jain, Deepak</creatorcontrib><creatorcontrib>Al-Numay, Mohammed S.</creatorcontrib><creatorcontrib>Narwade, Pradeep</creatorcontrib><title>Data Driven Neural Speech Enhancement for Smart Healthcare in Consumer Electronics Applications</title><title>IEEE transactions on consumer electronics</title><addtitle>T-CE</addtitle><description>This paper presents the practical response and performance-aware development of online speech enhancement from a consumer electronic perspective. To improve the efficiency of human-machine interaction, speech can play a vital role as a transmission medium on the Internet of Medical Things (IoM). However, some intelligent speech recognition systems cannot preserve the confidentiality of speech data. Additionally, the preservation of privacy is onerous, especially for model training and speech recognition in real-time. 
The recent development of big data-oriented wireless technologies associated with edge computing, interconnected devices of the Internet of Medical Things (IoMT), and big data analytics has great demand for connected human-machine interaction for many applications like automated cars, health monitoring, and consumer personal health care monitoring systems. Although big data-oriented wireless technologies serve these applications, the challenge remains of ignoring emotional care. This paper starts by explaining how to make a neural network-based architecture that can improve the speech of multichannel first-order Ambisonics mixtures and lower the need for human intervention through ambient intelligence (AmI). This will make the system work better overall in medical situations. Second, we demonstrate the effectiveness of different noise estimation techniques on proposed modulation domain processing (MDP) applications in smart hospitals, including electronic medical documentation, disease diagnosis, and evaluation. The proposed approach outperforms the enhancement of the conventional modulation domain in the cortex with several objective evaluation parameters such as Log Likelihood Ratio (LLR), Weighted Spectral Slope (WSS), Perceptual Evaluation of Speech Quality (PESQ), Csig and segmental (SNR seg.) Different noise estimators are used to figure out what effect the system has on different spectral modification parameters, like the over-subtraction factor and the modification domain. The experimental results show that the MDP system achieves better performance in terms of SNRseg. scores (49%) for the state-of-the-art consumer electronics perspective in a health care system. 
The proposed framework would greatly contribute to personalized communication health monitoring by consumers in a noisy environment.</description><subject>Ambient intelligence</subject><subject>Big Data</subject><subject>big data computing</subject><subject>Consumer electronics</subject><subject>Consumer healthcare system</subject><subject>Edge computing</subject><subject>Electronics</subject><subject>Health care</subject><subject>Human-computer interaction</subject><subject>Internet of medical things</subject><subject>Likelihood ratio</subject><subject>Medical electronics</subject><subject>Medical services</subject><subject>Modulation</subject><subject>Monitoring</subject><subject>Neural networks</subject><subject>Noise measurement</subject><subject>Parameter estimation</subject><subject>Parameter modification</subject><subject>Personal health</subject><subject>Real time</subject><subject>Real-time systems</subject><subject>Speech</subject><subject>Speech enhancement</subject><subject>Speech processing</subject><subject>Speech recognition</subject><subject>Telemedicine</subject><subject>Voice recognition</subject><issn>0098-3063</issn><issn>1558-4127</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkD1PwzAQhi0EEqWwMzBYYk7xd5yxSgNFqmBomS3XPaupUic4DhL_nlTtwHTDPe-d3gehR0pmlJLiZVNWM0aYmHGu81yQKzShUupMUJZfowkhhc44UfwW3fX9gRAqJNMTZBY2WbyI9Q8E_AFDtA1edwBuj6uwt8HBEULCvo14fbQx4SXYJu2djYDrgMs29MMRIq4acCm2oXY9nnddUzub6nF5j268bXp4uMwp-nqtNuUyW32-vZfzVeZoLlPGGChnufBb2HlJiRKOaEK3znrv1S63zhPipGVCMUop57DjQguutcwpc4pP0fP5bhfb7wH6ZA7tEMP40vCxutZKKzpS5Ey52PZ9BG-6WI-1fg0l5qTRjBrNSaO5aBwjT-dIDQD_cFGoQmr-B9vVbk0</recordid><startdate>20240501</startdate><enddate>20240501</enddate><creator>Paikrao, Pavan D.</creator><creator>Mukherjee, Amrit</creator><creator>Ghosh, Uttam</creator><creator>Goswami, Pratik</creator><creator>Novak, Milan</creator><creator>Kumar Jain, 
Deepak</creator><creator>Al-Numay, Mohammed S.</creator><creator>Narwade, Pradeep</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0003-1698-8888</orcidid><orcidid>https://orcid.org/0000-0002-4226-4536</orcidid><orcidid>https://orcid.org/0000-0001-8576-2060</orcidid><orcidid>https://orcid.org/0000-0003-3636-8002</orcidid><orcidid>https://orcid.org/0000-0002-6714-5568</orcidid><orcidid>https://orcid.org/0000-0003-2419-7189</orcidid></search><sort><creationdate>20240501</creationdate><title>Data Driven Neural Speech Enhancement for Smart Healthcare in Consumer Electronics Applications</title><author>Paikrao, Pavan D. ; Mukherjee, Amrit ; Ghosh, Uttam ; Goswami, Pratik ; Novak, Milan ; Kumar Jain, Deepak ; Al-Numay, Mohammed S. 
; Narwade, Pradeep</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c175t-22e6ca34fbedf51064c0801bcafff6d7acf00c5a246211133ed34843885712c63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Ambient intelligence</topic><topic>Big Data</topic><topic>big data computing</topic><topic>Consumer electronics</topic><topic>Consumer healthcare system</topic><topic>Edge computing</topic><topic>Electronics</topic><topic>Health care</topic><topic>Human-computer interaction</topic><topic>Internet of medical things</topic><topic>Likelihood ratio</topic><topic>Medical electronics</topic><topic>Medical services</topic><topic>Modulation</topic><topic>Monitoring</topic><topic>Neural networks</topic><topic>Noise measurement</topic><topic>Parameter estimation</topic><topic>Parameter modification</topic><topic>Personal health</topic><topic>Real time</topic><topic>Real-time systems</topic><topic>Speech</topic><topic>Speech enhancement</topic><topic>Speech processing</topic><topic>Speech recognition</topic><topic>Telemedicine</topic><topic>Voice recognition</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Paikrao, Pavan D.</creatorcontrib><creatorcontrib>Mukherjee, Amrit</creatorcontrib><creatorcontrib>Ghosh, Uttam</creatorcontrib><creatorcontrib>Goswami, Pratik</creatorcontrib><creatorcontrib>Novak, Milan</creatorcontrib><creatorcontrib>Kumar Jain, Deepak</creatorcontrib><creatorcontrib>Al-Numay, Mohammed S.</creatorcontrib><creatorcontrib>Narwade, Pradeep</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE/IET Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research 
Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on consumer electronics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Paikrao, Pavan D.</au><au>Mukherjee, Amrit</au><au>Ghosh, Uttam</au><au>Goswami, Pratik</au><au>Novak, Milan</au><au>Kumar Jain, Deepak</au><au>Al-Numay, Mohammed S.</au><au>Narwade, Pradeep</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Data Driven Neural Speech Enhancement for Smart Healthcare in Consumer Electronics Applications</atitle><jtitle>IEEE transactions on consumer electronics</jtitle><stitle>T-CE</stitle><date>2024-05-01</date><risdate>2024</risdate><volume>70</volume><issue>2</issue><spage>4828</spage><epage>4838</epage><pages>4828-4838</pages><issn>0098-3063</issn><eissn>1558-4127</eissn><coden>ITCEDA</coden><abstract>This paper presents the practical response and performance-aware development of online speech enhancement from a consumer electronic perspective. To improve the efficiency of human-machine interaction, speech can play a vital role as a transmission medium on the Internet of Medical Things (IoM). However, some intelligent speech recognition systems cannot preserve the confidentiality of speech data. Additionally, the preservation of privacy is onerous, especially for model training and speech recognition in real-time. The recent development of big data-oriented wireless technologies associated with edge computing, interconnected devices of the Internet of Medical Things (IoMT), and big data analytics has great demand for connected human-machine interaction for many applications like automated cars, health monitoring, and consumer personal health care monitoring systems. 
Although big data-oriented wireless technologies serve these applications, the challenge remains of ignoring emotional care. This paper starts by explaining how to make a neural network-based architecture that can improve the speech of multichannel first-order Ambisonics mixtures and lower the need for human intervention through ambient intelligence (AmI). This will make the system work better overall in medical situations. Second, we demonstrate the effectiveness of different noise estimation techniques on proposed modulation domain processing (MDP) applications in smart hospitals, including electronic medical documentation, disease diagnosis, and evaluation. The proposed approach outperforms the enhancement of the conventional modulation domain in the cortex with several objective evaluation parameters such as Log Likelihood Ratio (LLR), Weighted Spectral Slope (WSS), Perceptual Evaluation of Speech Quality (PESQ), Csig and segmental (SNR seg.) Different noise estimators are used to figure out what effect the system has on different spectral modification parameters, like the over-subtraction factor and the modification domain. The experimental results show that the MDP system achieves better performance in terms of SNRseg. scores (49%) for the state-of-the-art consumer electronics perspective in a health care system. The proposed framework would greatly contribute to personalized communication health monitoring by consumers in a noisy environment.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TCE.2024.3387740</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0003-1698-8888</orcidid><orcidid>https://orcid.org/0000-0002-4226-4536</orcidid><orcidid>https://orcid.org/0000-0001-8576-2060</orcidid><orcidid>https://orcid.org/0000-0003-3636-8002</orcidid><orcidid>https://orcid.org/0000-0002-6714-5568</orcidid><orcidid>https://orcid.org/0000-0003-2419-7189</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0098-3063 |
ispartof | IEEE transactions on consumer electronics, 2024-05, Vol.70 (2), p.4828-4838 |
issn | 0098-3063 ; 1558-4127 |
language | eng |
recordid | cdi_ieee_primary_10496958 |
source | IEEE/IET Electronic Library (IEL) |
subjects | Ambient intelligence ; Big Data ; big data computing ; Consumer electronics ; Consumer healthcare system ; Edge computing ; Electronics ; Health care ; Human-computer interaction ; Internet of medical things ; Likelihood ratio ; Medical electronics ; Medical services ; Modulation ; Monitoring ; Neural networks ; Noise measurement ; Parameter estimation ; Parameter modification ; Personal health ; Real time ; Real-time systems ; Speech ; Speech enhancement ; Speech processing ; Speech recognition ; Telemedicine ; Voice recognition |
title | Data Driven Neural Speech Enhancement for Smart Healthcare in Consumer Electronics Applications |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T17%3A32%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Data%20Driven%20Neural%20Speech%20Enhancement%20for%20Smart%20Healthcare%20in%20Consumer%20Electronics%20Applications&rft.jtitle=IEEE%20transactions%20on%20consumer%20electronics&rft.au=Paikrao,%20Pavan%20D.&rft.date=2024-05-01&rft.volume=70&rft.issue=2&rft.spage=4828&rft.epage=4838&rft.pages=4828-4838&rft.issn=0098-3063&rft.eissn=1558-4127&rft.coden=ITCEDA&rft_id=info:doi/10.1109/TCE.2024.3387740&rft_dat=%3Cproquest_RIE%3E3098886861%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3098886861&rft_id=info:pmid/&rft_ieee_id=10496958&rfr_iscdi=true |