Preech: A System for Privacy-Preserving Speech Transcription

New Advances in machine learning have made Automated Speech Recognition (ASR) systems practical and more scalable. These systems, however, pose serious privacy threats as speech is a rich source of sensitive acoustic and textual information. Although offline and open-source ASR eliminates the privac...

Detailed description

Bibliographic details
Main authors:	Ahmed, Shimaa; Chowdhury, Amrita Roy; Fawaz, Kassem; Ramanathan, Parmesh
Format:	Article
Language:	eng
Subjects:	Computer Science - Cryptography and Security; Computer Science - Sound
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Ahmed, Shimaa ; Chowdhury, Amrita Roy ; Fawaz, Kassem ; Ramanathan, Parmesh
description	New Advances in machine learning have made Automated Speech Recognition (ASR) systems practical and more scalable. These systems, however, pose serious privacy threats as speech is a rich source of sensitive acoustic and textual information. Although offline and open-source ASR eliminates the privacy risks, its transcription performance is inferior to that of cloud-based ASR systems, especially for real-world use cases. In this paper, we propose Pr$\epsilon\epsilon$ch, an end-to-end speech transcription system which lies at an intermediate point in the privacy-utility spectrum. It protects the acoustic features of the speakers' voices and protects the privacy of the textual content at an improved performance relative to offline ASR. Additionally, Pr$\epsilon\epsilon$ch provides several control knobs to allow customizable utility-usability-privacy trade-off. It relies on cloud-based services to transcribe a speech file after applying a series of privacy-preserving operations on the user's side. We perform a comprehensive evaluation of Pr$\epsilon\epsilon$ch, using diverse real-world datasets, that demonstrates its effectiveness. Pr$\epsilon\epsilon$ch provides transcriptions at a 2% to 32.25% (mean 17.34%) relative improvement in word error rate over Deep Speech, while fully obfuscating the speakers' voice biometrics and allowing only a differentially private view of the textual content.
doi_str_mv	10.48550/arxiv.1909.04198
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1909_04198</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1909_04198</sourcerecordid><originalsourceid>FETCH-LOGICAL-a678-e8e29248311c0d2d23ff1b8608274808aa0484c22343ef67434bde5623ff247e3</originalsourceid><addsrcrecordid>eNotj81qwkAUhWfTRVEfoKvOCySdn5vkRroR6R8ICmYfxsmdOqAx3Egwb9_GdnUW5-NwPiGetEoBs0y9OL7FIdWlKlMFusRH8bpjIn9cypXcj_2VzjJcWO44Ds6PyW_ZEw-x_Zb7buJkxa7tPcfuGi_tXDwEd-pp8Z8zUb2_VevPZLP9-FqvNonLC0wIyZQG0GrtVWMaY0PQB8wVmgJQoXMKELwxFiyFvAALh4ayfOIMFGRn4vlv9n6_7jieHY_1pFHfNewPY-dBQA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Preech: A System for Privacy-Preserving Speech Transcription</title><source>arXiv.org</source><creator>Ahmed, Shimaa ; Chowdhury, Amrita Roy ; Fawaz, Kassem ; Ramanathan, Parmesh</creator><creatorcontrib>Ahmed, Shimaa ; Chowdhury, Amrita Roy ; Fawaz, Kassem ; Ramanathan, Parmesh</creatorcontrib><description>New Advances in machine learning have made Automated Speech Recognition (ASR) systems practical and more scalable. These systems, however, pose serious privacy threats as speech is a rich source of sensitive acoustic and textual information. Although offline and open-source ASR eliminates the privacy risks, its transcription performance is inferior to that of cloud-based ASR systems, especially for real-world use cases. In this paper, we propose Pr$\epsilon\epsilon$ch, an end-to-end speech transcription system which lies at an intermediate point in the privacy-utility spectrum. It protects the acoustic features of the speakers' voices and protects the privacy of the textual content at an improved performance relative to offline ASR. Additionally, Pr$\epsilon\epsilon$ch provides several control knobs to allow customizable utility-usability-privacy trade-off. 
It relies on cloud-based services to transcribe a speech file after applying a series of privacy-preserving operations on the user's side. We perform a comprehensive evaluation of Pr$\epsilon\epsilon$ch, using diverse real-world datasets, that demonstrates its effectiveness. Pr$\epsilon\epsilon$ch provides transcriptions at a 2% to 32.25% (mean 17.34%) relative improvement in word error rate over Deep Speech, while fully obfuscating the speakers' voice biometrics and allowing only a differentially private view of the textual content.</description><identifier>DOI: 10.48550/arxiv.1909.04198</identifier><language>eng</language><subject>Computer Science - Cryptography and Security ; Computer Science - Sound</subject><creationdate>2019-09</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1909.04198$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1909.04198$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Ahmed, Shimaa</creatorcontrib><creatorcontrib>Chowdhury, Amrita Roy</creatorcontrib><creatorcontrib>Fawaz, Kassem</creatorcontrib><creatorcontrib>Ramanathan, Parmesh</creatorcontrib><title>Preech: A System for Privacy-Preserving Speech Transcription</title><description>New Advances in machine learning have made Automated Speech Recognition (ASR) systems practical and more scalable. These systems, however, pose serious privacy threats as speech is a rich source of sensitive acoustic and textual information. 
Although offline and open-source ASR eliminates the privacy risks, its transcription performance is inferior to that of cloud-based ASR systems, especially for real-world use cases. In this paper, we propose Pr$\epsilon\epsilon$ch, an end-to-end speech transcription system which lies at an intermediate point in the privacy-utility spectrum. It protects the acoustic features of the speakers' voices and protects the privacy of the textual content at an improved performance relative to offline ASR. Additionally, Pr$\epsilon\epsilon$ch provides several control knobs to allow customizable utility-usability-privacy trade-off. It relies on cloud-based services to transcribe a speech file after applying a series of privacy-preserving operations on the user's side. We perform a comprehensive evaluation of Pr$\epsilon\epsilon$ch, using diverse real-world datasets, that demonstrates its effectiveness. Pr$\epsilon\epsilon$ch provides transcriptions at a 2% to 32.25% (mean 17.34%) relative improvement in word error rate over Deep Speech, while fully obfuscating the speakers' voice biometrics and allowing only a differentially private view of the textual content.</description><subject>Computer Science - Cryptography and Security</subject><subject>Computer Science - Sound</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj81qwkAUhWfTRVEfoKvOCySdn5vkRroR6R8ICmYfxsmdOqAx3Egwb9_GdnUW5-NwPiGetEoBs0y9OL7FIdWlKlMFusRH8bpjIn9cypXcj_2VzjJcWO44Ds6PyW_ZEw-x_Zb7buJkxa7tPcfuGi_tXDwEd-pp8Z8zUb2_VevPZLP9-FqvNonLC0wIyZQG0GrtVWMaY0PQB8wVmgJQoXMKELwxFiyFvAALh4ayfOIMFGRn4vlv9n6_7jieHY_1pFHfNewPY-dBQA</recordid><startdate>20190909</startdate><enddate>20190909</enddate><creator>Ahmed, Shimaa</creator><creator>Chowdhury, Amrita Roy</creator><creator>Fawaz, Kassem</creator><creator>Ramanathan, 
Parmesh</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20190909</creationdate><title>Preech: A System for Privacy-Preserving Speech Transcription</title><author>Ahmed, Shimaa ; Chowdhury, Amrita Roy ; Fawaz, Kassem ; Ramanathan, Parmesh</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a678-e8e29248311c0d2d23ff1b8608274808aa0484c22343ef67434bde5623ff247e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Computer Science - Cryptography and Security</topic><topic>Computer Science - Sound</topic><toplevel>online_resources</toplevel><creatorcontrib>Ahmed, Shimaa</creatorcontrib><creatorcontrib>Chowdhury, Amrita Roy</creatorcontrib><creatorcontrib>Fawaz, Kassem</creatorcontrib><creatorcontrib>Ramanathan, Parmesh</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ahmed, Shimaa</au><au>Chowdhury, Amrita Roy</au><au>Fawaz, Kassem</au><au>Ramanathan, Parmesh</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Preech: A System for Privacy-Preserving Speech Transcription</atitle><date>2019-09-09</date><risdate>2019</risdate><abstract>New Advances in machine learning have made Automated Speech Recognition (ASR) systems practical and more scalable. These systems, however, pose serious privacy threats as speech is a rich source of sensitive acoustic and textual information. Although offline and open-source ASR eliminates the privacy risks, its transcription performance is inferior to that of cloud-based ASR systems, especially for real-world use cases. In this paper, we propose Pr$\epsilon\epsilon$ch, an end-to-end speech transcription system which lies at an intermediate point in the privacy-utility spectrum. 
It protects the acoustic features of the speakers' voices and protects the privacy of the textual content at an improved performance relative to offline ASR. Additionally, Pr$\epsilon\epsilon$ch provides several control knobs to allow customizable utility-usability-privacy trade-off. It relies on cloud-based services to transcribe a speech file after applying a series of privacy-preserving operations on the user's side. We perform a comprehensive evaluation of Pr$\epsilon\epsilon$ch, using diverse real-world datasets, that demonstrates its effectiveness. Pr$\epsilon\epsilon$ch provides transcriptions at a 2% to 32.25% (mean 17.34%) relative improvement in word error rate over Deep Speech, while fully obfuscating the speakers' voice biometrics and allowing only a differentially private view of the textual content.</abstract><doi>10.48550/arxiv.1909.04198</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.1909.04198
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_1909_04198
source	arXiv.org
subjects	Computer Science - Cryptography and Security ; Computer Science - Sound
title	Preech: A System for Privacy-Preserving Speech Transcription
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T03%3A39%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Preech:%20A%20System%20for%20Privacy-Preserving%20Speech%20Transcription&rft.au=Ahmed,%20Shimaa&rft.date=2019-09-09&rft_id=info:doi/10.48550/arxiv.1909.04198&rft_dat=%3Carxiv_GOX%3E1909_04198%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true