Smartphone Recordings are Comparable to “Gold Standard” Recordings for Acoustic Measurements of Voice

The purpose of this study was to assess the relationship and comparability of cepstral and spectral measures of voice obtained from a high-cost “flat” microphone and precision sound level meter (SLM) vs. high-end and entry level models of commonly and currently used smartphones (iPhone i12 and iSE;...

Bibliographic details
Published in: Journal of voice 2023-04
Main authors: Awan, Shaheen N., Shaikh, Mohsin Ahmed, Awan, Jordan A., Abdalla, Ibrahim, Lim, Kelvin O., Misono, Stephanie
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title Journal of voice
container_volume
creator Awan, Shaheen N.
Shaikh, Mohsin Ahmed
Awan, Jordan A.
Abdalla, Ibrahim
Lim, Kelvin O.
Misono, Stephanie
description The purpose of this study was to assess the relationship and comparability of cepstral and spectral measures of voice obtained from a high-cost “flat” microphone and precision sound level meter (SLM) vs. high-end and entry level models of commonly and currently used smartphones (iPhone i12 and iSE; Samsung s21 and s9 smartphones). Device comparisons were also conducted in different settings (sound-treated booth vs. typical “quiet” office room) and at different mouth-to-microphone distances (15 and 30 cm). The SLM and smartphone devices were used to record a series of speech and vowel samples from a prerecorded diverse set of 24 speakers representing a wide range of sex, age, fundamental frequency (F0), and voice quality types. Recordings were analyzed for the following measures: smoothed cepstral peak prominence (CPP in dB); the low vs high spectral ratio (L/H Ratio in dB); and the Cepstral Spectral Index of Dysphonia (CSID). A strong device effect was observed for L/H Ratio (dB) in both vowel and sentence contexts and for CSID in the sentence context. In contrast, device had a weak effect on CPP (dB), regardless of context. Recording distance was observed to have a small-to-moderate effect on measures of CPP and CSID but had a negligible effect on L/H Ratio. With the exception of L/H Ratio in the vowel context, setting was observed to have a strong effect on all three measures. While these aforementioned effects resulted in significant differences between measures obtained with SLM vs. smartphone devices, the intercorrelations of the measurements were extremely strong (r's > 0.90), indicating that all devices were able to capture the range of voice characteristics represented in the voice sample corpus. 
Regression modeling showed that acoustic measurements obtained from smartphone recordings could be successfully converted to comparable measurements obtained by a "gold standard" (precision SLM recordings conducted in a sound-treated booth at 15 cm) with small degrees of error. These findings indicate that a variety of commonly available modern smartphones can be used to collect high quality voice recordings usable for informative acoustic analysis. While device, setting, and distance can have significant effects on acoustic measurements, these effects are predictable and can be accounted for using regression modeling.
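The conversion idea described in the abstract can be illustrated with a minimal sketch. It assumes a single-predictor ordinary least-squares fit that maps a smartphone-derived measurement onto the corresponding reference ("gold standard") measurement; the abstract does not specify the form of the regression models, so this is not the authors' method. The function names (`fit_line`, `pearson_r`, `convert`) and all numeric values are hypothetical and chosen only for illustration; none of them come from the study.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


def convert(values, slope, intercept):
    """Map smartphone-derived values onto the reference scale."""
    return [intercept + slope * v for v in values]


# Made-up illustrative CPP values in dB (hypothetical, NOT data from the study).
phone_cpp = [3.0, 5.0, 7.0, 9.0, 11.0]
reference_cpp = [4.1, 5.9, 8.2, 9.8, 12.0]

slope, intercept = fit_line(phone_cpp, reference_cpp)
predicted = convert(phone_cpp, slope, intercept)
r = pearson_r(phone_cpp, reference_cpp)

# Root-mean-square error of the converted values against the reference values.
rmse = mean((p - t) ** 2 for p, t in zip(predicted, reference_cpp)) ** 0.5

print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f} rmse={rmse:.3f}")
```

A systematic offset between devices changes the slope and intercept but leaves the correlation high, which is why strongly correlated measurements can be converted with small error. A fuller model would presumably also need terms for setting and distance, which the abstract reports as having significant effects.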
doi_str_mv 10.1016/j.jvoice.2023.01.031
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10545813</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0892199723000310</els_id><sourcerecordid>2797147957</sourcerecordid><originalsourceid>FETCH-LOGICAL-c333t-c20181274abff9b9c7135f35df23a704c0ab1b49a85f30df4c0b08d3cb78c4833</originalsourceid><addsrcrecordid>eNp9UcFu1DAQtRAV3Rb-ACEfuSSM4wTbF1C1goLUqhIFrpZjj1uvknixsytx64fQn-uX1KstVblwGmnmvTdv5hHymkHNgL1_t6pX2xgs1g00vAZWA2fPyIJJwau2k_I5WYBUTcWUEofkKOcVADRl-oIccgFMSWgXJFyOJs3r6zgh_YY2Jhemq0xNQrqM49ok0w9I50jvbv6cxsHRy9lMziR3d3P7lOBjoic2bvIcLD1HkzcJR5zmTKOnP3c-X5IDb4aMrx7qMfnx-dP35Zfq7OL06_LkrLKc87myDTDJGtGa3nvVKysY7zzvnG-4EdBaMD3rW2Vk6YLzpdGDdNz2QtpWcn5MPu5115t-RGeLiWQGvU6hXPpbRxP0v5MpXOuruNUMuvI3tlN4-6CQ4q8N5lmPIVscBjNhuVA3QgnWCtWJAm33UJtizgn94x4GeheTXul9THoXkwamS0yF9uapx0fS31wK4MMegOVT24BJZxtwsuhCQjtrF8P_N9wDWXWpxg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2797147957</pqid></control><display><type>article</type><title>Smartphone Recordings are Comparable to “Gold Standard” Recordings for Acoustic Measurements of Voice</title><source>Elsevier ScienceDirect Journals</source><creator>Awan, Shaheen N. ; Shaikh, Mohsin Ahmed ; Awan, Jordan A. ; Abdalla, Ibrahim ; Lim, Kelvin O. ; Misono, Stephanie</creator><creatorcontrib>Awan, Shaheen N. ; Shaikh, Mohsin Ahmed ; Awan, Jordan A. ; Abdalla, Ibrahim ; Lim, Kelvin O. ; Misono, Stephanie</creatorcontrib><description>The purpose of this study was to assess the relationship and comparability of cepstral and spectral measures of voice obtained from a high-cost “flat” microphone and precision sound level meter (SLM) vs. high-end and entry level models of commonly and currently used smartphones (iPhone i12 and iSE; Samsung s21 and s9 smartphones). Device comparisons were also conducted in different settings (sound-treated booth vs. 
typical “quiet” office room) and at different mouth-to-microphone distances (15 and 30 cm). The SLM and smartphone devices were used to record a series of speech and vowel samples from a prerecorded diverse set of 24 speakers representing a wide range of sex, age, fundamental frequency (F0), and voice quality types. Recordings were analyzed for the following measures: smoothed cepstral peak prominence (CPP in dB); the low vs high spectral ratio (L/H Ratio in dB); and the Cepstral Spectral Index of Dysphonia (CSID). A strong device effect was observed for L/H Ratio (dB) in both vowel and sentence contexts and for CSID in the sentence context. In contrast, device had a weak effect on CPP (dB), regardless of context. Recording distance was observed to have a small-to-moderate effect on measures of CPP and CSID but had a negligible effect on L/H Ratio. With the exception of L/H Ratio in the vowel context, setting was observed to have a strong effect on all three measures. While these aforementioned effects resulted in significant differences between measures obtained with SLM vs. smartphone devices, the intercorrelations of the measurements were extremely strong (r's &gt; 0.90), indicating that all devices were able to capture the range of voice characteristics represented in the voice sample corpus. Regression modeling showed that acoustic measurements obtained from smartphone recordings could be successfully converted to comparable measurements obtained by a "gold standard" (precision SLM recordings conducted in a sound-treated booth at 15 cm) with small degrees of error. These findings indicate that a variety of commonly available modern smartphones can be used to collect high quality voice recordings usable for informative acoustic analysis. 
While device, setting, and distance can have significant effects on acoustic measurements, these effects are predictable and can be accounted for using regression modeling.</description><identifier>ISSN: 0892-1997</identifier><identifier>ISSN: 1873-4588</identifier><identifier>EISSN: 1873-4588</identifier><identifier>DOI: 10.1016/j.jvoice.2023.01.031</identifier><identifier>PMID: 37019804</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>cepstral analysis ; frequency response ; smartphones ; spectral analysis ; voice evaluation</subject><ispartof>Journal of voice, 2023-04</ispartof><rights>2023 The Voice Foundation</rights><rights>Copyright © 2023 The Voice Foundation. Published by Elsevier Inc. All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c333t-c20181274abff9b9c7135f35df23a704c0ab1b49a85f30df4c0b08d3cb78c4833</citedby><cites>FETCH-LOGICAL-c333t-c20181274abff9b9c7135f35df23a704c0ab1b49a85f30df4c0b08d3cb78c4833</cites><orcidid>0000-0002-5052-1547 ; 0000-0002-0379-506X ; 0000-0003-3331-4156 ; 0000-0001-9404-7499</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S0892199723000310$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>230,314,776,780,881,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37019804$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Awan, Shaheen N.</creatorcontrib><creatorcontrib>Shaikh, Mohsin Ahmed</creatorcontrib><creatorcontrib>Awan, Jordan A.</creatorcontrib><creatorcontrib>Abdalla, Ibrahim</creatorcontrib><creatorcontrib>Lim, Kelvin O.</creatorcontrib><creatorcontrib>Misono, Stephanie</creatorcontrib><title>Smartphone Recordings are 
Comparable to “Gold Standard” Recordings for Acoustic Measurements of Voice</title><title>Journal of voice</title><addtitle>J Voice</addtitle><description>The purpose of this study was to assess the relationship and comparability of cepstral and spectral measures of voice obtained from a high-cost “flat” microphone and precision sound level meter (SLM) vs. high-end and entry level models of commonly and currently used smartphones (iPhone i12 and iSE; Samsung s21 and s9 smartphones). Device comparisons were also conducted in different settings (sound-treated booth vs. typical “quiet” office room) and at different mouth-to-microphone distances (15 and 30 cm). The SLM and smartphone devices were used to record a series of speech and vowel samples from a prerecorded diverse set of 24 speakers representing a wide range of sex, age, fundamental frequency (F0), and voice quality types. Recordings were analyzed for the following measures: smoothed cepstral peak prominence (CPP in dB); the low vs high spectral ratio (L/H Ratio in dB); and the Cepstral Spectral Index of Dysphonia (CSID). A strong device effect was observed for L/H Ratio (dB) in both vowel and sentence contexts and for CSID in the sentence context. In contrast, device had a weak effect on CPP (dB), regardless of context. Recording distance was observed to have a small-to-moderate effect on measures of CPP and CSID but had a negligible effect on L/H Ratio. With the exception of L/H Ratio in the vowel context, setting was observed to have a strong effect on all three measures. While these aforementioned effects resulted in significant differences between measures obtained with SLM vs. smartphone devices, the intercorrelations of the measurements were extremely strong (r's &gt; 0.90), indicating that all devices were able to capture the range of voice characteristics represented in the voice sample corpus. 
Regression modeling showed that acoustic measurements obtained from smartphone recordings could be successfully converted to comparable measurements obtained by a "gold standard" (precision SLM recordings conducted in a sound-treated booth at 15 cm) with small degrees of error. These findings indicate that a variety of commonly available modern smartphones can be used to collect high quality voice recordings usable for informative acoustic analysis. While device, setting, and distance can have significant effects on acoustic measurements, these effects are predictable and can be accounted for using regression modeling.</description><subject>cepstral analysis</subject><subject>frequency response</subject><subject>smartphones</subject><subject>spectral analysis</subject><subject>voice evaluation</subject><issn>0892-1997</issn><issn>1873-4588</issn><issn>1873-4588</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9UcFu1DAQtRAV3Rb-ACEfuSSM4wTbF1C1goLUqhIFrpZjj1uvknixsytx64fQn-uX1KstVblwGmnmvTdv5hHymkHNgL1_t6pX2xgs1g00vAZWA2fPyIJJwau2k_I5WYBUTcWUEofkKOcVADRl-oIccgFMSWgXJFyOJs3r6zgh_YY2Jhemq0xNQrqM49ok0w9I50jvbv6cxsHRy9lMziR3d3P7lOBjoic2bvIcLD1HkzcJR5zmTKOnP3c-X5IDb4aMrx7qMfnx-dP35Zfq7OL06_LkrLKc87myDTDJGtGa3nvVKysY7zzvnG-4EdBaMD3rW2Vk6YLzpdGDdNz2QtpWcn5MPu5115t-RGeLiWQGvU6hXPpbRxP0v5MpXOuruNUMuvI3tlN4-6CQ4q8N5lmPIVscBjNhuVA3QgnWCtWJAm33UJtizgn94x4GeheTXul9THoXkwamS0yF9uapx0fS31wK4MMegOVT24BJZxtwsuhCQjtrF8P_N9wDWXWpxg</recordid><startdate>20230403</startdate><enddate>20230403</enddate><creator>Awan, Shaheen N.</creator><creator>Shaikh, Mohsin Ahmed</creator><creator>Awan, Jordan A.</creator><creator>Abdalla, Ibrahim</creator><creator>Lim, Kelvin O.</creator><creator>Misono, Stephanie</creator><general>Elsevier 
Inc</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-5052-1547</orcidid><orcidid>https://orcid.org/0000-0002-0379-506X</orcidid><orcidid>https://orcid.org/0000-0003-3331-4156</orcidid><orcidid>https://orcid.org/0000-0001-9404-7499</orcidid></search><sort><creationdate>20230403</creationdate><title>Smartphone Recordings are Comparable to “Gold Standard” Recordings for Acoustic Measurements of Voice</title><author>Awan, Shaheen N. ; Shaikh, Mohsin Ahmed ; Awan, Jordan A. ; Abdalla, Ibrahim ; Lim, Kelvin O. ; Misono, Stephanie</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c333t-c20181274abff9b9c7135f35df23a704c0ab1b49a85f30df4c0b08d3cb78c4833</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>cepstral analysis</topic><topic>frequency response</topic><topic>smartphones</topic><topic>spectral analysis</topic><topic>voice evaluation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Awan, Shaheen N.</creatorcontrib><creatorcontrib>Shaikh, Mohsin Ahmed</creatorcontrib><creatorcontrib>Awan, Jordan A.</creatorcontrib><creatorcontrib>Abdalla, Ibrahim</creatorcontrib><creatorcontrib>Lim, Kelvin O.</creatorcontrib><creatorcontrib>Misono, Stephanie</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of voice</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Awan, Shaheen N.</au><au>Shaikh, Mohsin Ahmed</au><au>Awan, Jordan A.</au><au>Abdalla, Ibrahim</au><au>Lim, Kelvin O.</au><au>Misono, Stephanie</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Smartphone Recordings are 
Comparable to “Gold Standard” Recordings for Acoustic Measurements of Voice</atitle><jtitle>Journal of voice</jtitle><addtitle>J Voice</addtitle><date>2023-04-03</date><risdate>2023</risdate><issn>0892-1997</issn><issn>1873-4588</issn><eissn>1873-4588</eissn><abstract>The purpose of this study was to assess the relationship and comparability of cepstral and spectral measures of voice obtained from a high-cost “flat” microphone and precision sound level meter (SLM) vs. high-end and entry level models of commonly and currently used smartphones (iPhone i12 and iSE; Samsung s21 and s9 smartphones). Device comparisons were also conducted in different settings (sound-treated booth vs. typical “quiet” office room) and at different mouth-to-microphone distances (15 and 30 cm). The SLM and smartphone devices were used to record a series of speech and vowel samples from a prerecorded diverse set of 24 speakers representing a wide range of sex, age, fundamental frequency (F0), and voice quality types. Recordings were analyzed for the following measures: smoothed cepstral peak prominence (CPP in dB); the low vs high spectral ratio (L/H Ratio in dB); and the Cepstral Spectral Index of Dysphonia (CSID). A strong device effect was observed for L/H Ratio (dB) in both vowel and sentence contexts and for CSID in the sentence context. In contrast, device had a weak effect on CPP (dB), regardless of context. Recording distance was observed to have a small-to-moderate effect on measures of CPP and CSID but had a negligible effect on L/H Ratio. With the exception of L/H Ratio in the vowel context, setting was observed to have a strong effect on all three measures. While these aforementioned effects resulted in significant differences between measures obtained with SLM vs. 
smartphone devices, the intercorrelations of the measurements were extremely strong (r's &gt; 0.90), indicating that all devices were able to capture the range of voice characteristics represented in the voice sample corpus. Regression modeling showed that acoustic measurements obtained from smartphone recordings could be successfully converted to comparable measurements obtained by a "gold standard" (precision SLM recordings conducted in a sound-treated booth at 15 cm) with small degrees of error. These findings indicate that a variety of commonly available modern smartphones can be used to collect high quality voice recordings usable for informative acoustic analysis. While device, setting, and distance can have significant effects on acoustic measurements, these effects are predictable and can be accounted for using regression modeling.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>37019804</pmid><doi>10.1016/j.jvoice.2023.01.031</doi><orcidid>https://orcid.org/0000-0002-5052-1547</orcidid><orcidid>https://orcid.org/0000-0002-0379-506X</orcidid><orcidid>https://orcid.org/0000-0003-3331-4156</orcidid><orcidid>https://orcid.org/0000-0001-9404-7499</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0892-1997
ispartof Journal of voice, 2023-04
issn 0892-1997
1873-4588
1873-4588
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10545813
source Elsevier ScienceDirect Journals
subjects cepstral analysis
frequency response
smartphones
spectral analysis
voice evaluation
title Smartphone Recordings are Comparable to “Gold Standard” Recordings for Acoustic Measurements of Voice
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T21%3A32%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Smartphone%20Recordings%20are%20Comparable%20to%20%E2%80%9CGold%20Standard%E2%80%9D%20Recordings%20for%20Acoustic%20Measurements%20of%20Voice&rft.jtitle=Journal%20of%20voice&rft.au=Awan,%20Shaheen%20N.&rft.date=2023-04-03&rft.issn=0892-1997&rft.eissn=1873-4588&rft_id=info:doi/10.1016/j.jvoice.2023.01.031&rft_dat=%3Cproquest_pubme%3E2797147957%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2797147957&rft_id=info:pmid/37019804&rft_els_id=S0892199723000310&rfr_iscdi=true