Open Source Toolkit for Speech to Text Translation

In this paper we introduce an open source toolkit for speech translation. While there already exists a wide variety of open source tools for the essential tasks of a speech translation system, our goal is to provide an easy to use recipe for the complete pipeline of translating speech. We provide a...

Detailed description

Bibliographic details
Published in: Prague bulletin of mathematical linguistics, 2018-10, Vol. 111 (1), p. 125-135
Main authors: Zenkel, Thomas; Sperber, Matthias; Niehues, Jan; Müller, Markus; Pham, Ngoc-Quan; Stüker, Sebastian; Waibel, Alex
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 135
container_issue 1
container_start_page 125
container_title Prague bulletin of mathematical linguistics
container_volume 111
creator Zenkel, Thomas
Sperber, Matthias
Niehues, Jan
Müller, Markus
Pham, Ngoc-Quan
Stüker, Sebastian
Waibel, Alex
description In this paper we introduce an open source toolkit for speech translation. While there already exists a wide variety of open source tools for the essential tasks of a speech translation system, our goal is to provide an easy to use recipe for the complete pipeline of translating speech. We provide a Docker container with a ready to use pipeline of the following components: a neural speech recognition system, a sentence segmentation system and an attention-based translation system. We provide recipes for training and evaluating models for the task of translating English lectures and TED talks to German. Additionally, we provide pre-trained models for this task. With this toolkit we hope to facilitate the development of speech translation systems and to encourage researchers to improve the overall performance of speech translation systems.
doi_str_mv 10.2478/pralin-2018-0011
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2167894001</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2167894001</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2145-b2383094b2e590ccb511e9c1149c26c5dff7857a4b33dbbe7f3bbec8cd64aeb63</originalsourceid><addsrcrecordid>eNp1UD1PwzAQtRBIlMLOaIk54K84zoRQxZdUqUPDbNnOBVJCHOxU0H-PS5Bg4Ya7N7z37u4hdE7JJROFuhqC6do-Y4SqjBBKD9CMKiIyIiQ7_IOP0UmMG0Kk4pLOEFsN0OO13wYHuPK-e21H3PiA1wOAe8GjxxV8jrgKpo-dGVvfn6KjxnQRzn7mHD3d3VaLh2y5un9c3Cwzx6jIM8u44qQUlkFeEudsTimUjlJROiZdXjdNofLCCMt5bS0UDU_dKVdLYcBKPkcXk-8Q_PsW4qg36cw-rdSMykKVIv2ZWGRiueBjDNDoIbRvJuw0JXqfjJ6S0ftk9D6ZJLmeJB-mGyHU8By2uwR-_f-TfhfL-Rc9YGwD</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2167894001</pqid></control><display><type>article</type><title>Open Source Toolkit for Speech to Text Translation</title><source>EZB-FREE-00999 freely available EZB journals</source><creator>Zenkel, Thomas ; Sperber, Matthias ; Niehues, Jan ; Müller, Markus ; Pham, Ngoc-Quan ; Stüker, Sebastian ; Waibel, Alex</creator><creatorcontrib>Zenkel, Thomas ; Sperber, Matthias ; Niehues, Jan ; Müller, Markus ; Pham, Ngoc-Quan ; Stüker, Sebastian ; Waibel, Alex</creatorcontrib><description>In this paper we introduce an open source toolkit for speech translation. While there already exists a wide variety of open source tools for the essential tasks of a speech translation system, our goal is to provide an easy to use recipe for the complete pipeline of translating speech. We provide a Docker container with a ready to use pipeline of the following components: a neural speech recognition system, a sentence segmentation system and an attention-based translation system. We provide recipes for training and evaluating models for the task of translating English lectures and TED talks to German. 
Additionally, we provide pre-trained models for this task. With this toolkit we hope to facilitate the development of speech translation systems and to encourage researchers to improve the overall performance of speech translation systems.</description><identifier>ISSN: 1804-0462</identifier><identifier>ISSN: 0032-6585</identifier><identifier>EISSN: 1804-0462</identifier><identifier>DOI: 10.2478/pralin-2018-0011</identifier><language>eng</language><publisher>Prague: Sciendo</publisher><subject>English language ; German language ; Open source software ; Segmentation ; Speech recognition ; Text-to-speech ; Translation</subject><ispartof>Prague bulletin of mathematical linguistics, 2018-10, Vol.111 (1), p.125-135</ispartof><rights>2018. This work is published under http://creativecommons.org/licenses/by-nc-nd/3.0 (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2145-b2383094b2e590ccb511e9c1149c26c5dff7857a4b33dbbe7f3bbec8cd64aeb63</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Zenkel, Thomas</creatorcontrib><creatorcontrib>Sperber, Matthias</creatorcontrib><creatorcontrib>Niehues, Jan</creatorcontrib><creatorcontrib>Müller, Markus</creatorcontrib><creatorcontrib>Pham, Ngoc-Quan</creatorcontrib><creatorcontrib>Stüker, Sebastian</creatorcontrib><creatorcontrib>Waibel, Alex</creatorcontrib><title>Open Source Toolkit for Speech to Text Translation</title><title>Prague bulletin of mathematical linguistics</title><description>In this paper we introduce an open source toolkit for speech translation. 
While there already exists a wide variety of open source tools for the essential tasks of a speech translation system, our goal is to provide an easy to use recipe for the complete pipeline of translating speech. We provide a Docker container with a ready to use pipeline of the following components: a neural speech recognition system, a sentence segmentation system and an attention-based translation system. We provide recipes for training and evaluating models for the task of translating English lectures and TED talks to German. Additionally, we provide pre-trained models for this task. With this toolkit we hope to facilitate the development of speech translation systems and to encourage researchers to improve the overall performance of speech translation systems.</description><subject>English language</subject><subject>German language</subject><subject>Open source software</subject><subject>Segmentation</subject><subject>Speech recognition</subject><subject>Text-to-speech</subject><subject>Translation</subject><issn>1804-0462</issn><issn>0032-6585</issn><issn>1804-0462</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AIMQZ</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNp1UD1PwzAQtRBIlMLOaIk54K84zoRQxZdUqUPDbNnOBVJCHOxU0H-PS5Bg4Ya7N7z37u4hdE7JJROFuhqC6do-Y4SqjBBKD9CMKiIyIiQ7_IOP0UmMG0Kk4pLOEFsN0OO13wYHuPK-e21H3PiA1wOAe8GjxxV8jrgKpo-dGVvfn6KjxnQRzn7mHD3d3VaLh2y5un9c3Cwzx6jIM8u44qQUlkFeEudsTimUjlJROiZdXjdNofLCCMt5bS0UDU_dKVdLYcBKPkcXk-8Q_PsW4qg36cw-rdSMykKVIv2ZWGRiueBjDNDoIbRvJuw0JXqfjJ6S0ftk9D6ZJLmeJB-mGyHU8By2uwR-_f-TfhfL-Rc9YGwD</recordid><startdate>20181001</startdate><enddate>20181001</enddate><creator>Zenkel, Thomas</creator><creator>Sperber, Matthias</creator><creator>Niehues, Jan</creator><creator>Müller, Markus</creator><creator>Pham, 
Ngoc-Quan</creator><creator>Stüker, Sebastian</creator><creator>Waibel, Alex</creator><general>Sciendo</general><general>Institute of Formal and Applied Linguistics</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7T9</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AIMQZ</scope><scope>ALSLI</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BYOGL</scope><scope>CCPQU</scope><scope>CPGLG</scope><scope>CRLPW</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>LIQON</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20181001</creationdate><title>Open Source Toolkit for Speech to Text Translation</title><author>Zenkel, Thomas ; Sperber, Matthias ; Niehues, Jan ; Müller, Markus ; Pham, Ngoc-Quan ; Stüker, Sebastian ; Waibel, Alex</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2145-b2383094b2e590ccb511e9c1149c26c5dff7857a4b33dbbe7f3bbec8cd64aeb63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>English language</topic><topic>German language</topic><topic>Open source software</topic><topic>Segmentation</topic><topic>Speech recognition</topic><topic>Text-to-speech</topic><topic>Translation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zenkel, Thomas</creatorcontrib><creatorcontrib>Sperber, Matthias</creatorcontrib><creatorcontrib>Niehues, Jan</creatorcontrib><creatorcontrib>Müller, Markus</creatorcontrib><creatorcontrib>Pham, Ngoc-Quan</creatorcontrib><creatorcontrib>Stüker, Sebastian</creatorcontrib><creatorcontrib>Waibel, Alex</creatorcontrib><collection>CrossRef</collection><collection>Linguistics and Language Behavior Abstracts (LLBA)</collection><collection>ProQuest 
SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest One Literature</collection><collection>Social Science Premium Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>East Europe, Central Europe Database</collection><collection>ProQuest One Community College</collection><collection>Linguistics Collection</collection><collection>Linguistics Database</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>ProQuest One Literature - U.S. Customers Only</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><jtitle>Prague bulletin of mathematical linguistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zenkel, Thomas</au><au>Sperber, Matthias</au><au>Niehues, Jan</au><au>Müller, Markus</au><au>Pham, Ngoc-Quan</au><au>Stüker, Sebastian</au><au>Waibel, Alex</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Open Source Toolkit for Speech to Text Translation</atitle><jtitle>Prague bulletin of mathematical 
linguistics</jtitle><date>2018-10-01</date><risdate>2018</risdate><volume>111</volume><issue>1</issue><spage>125</spage><epage>135</epage><pages>125-135</pages><issn>1804-0462</issn><issn>0032-6585</issn><eissn>1804-0462</eissn><abstract>In this paper we introduce an open source toolkit for speech translation. While there already exists a wide variety of open source tools for the essential tasks of a speech translation system, our goal is to provide an easy to use recipe for the complete pipeline of translating speech. We provide a Docker container with a ready to use pipeline of the following components: a neural speech recognition system, a sentence segmentation system and an attention-based translation system. We provide recipes for training and evaluating models for the task of translating English lectures and TED talks to German. Additionally, we provide pre-trained models for this task. With this toolkit we hope to facilitate the development of speech translation systems and to encourage researchers to improve the overall performance of speech translation systems.</abstract><cop>Prague</cop><pub>Sciendo</pub><doi>10.2478/pralin-2018-0011</doi><tpages>11</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1804-0462
ispartof Prague bulletin of mathematical linguistics, 2018-10, Vol.111 (1), p.125-135
issn 1804-0462
0032-6585
1804-0462
language eng
recordid cdi_proquest_journals_2167894001
source EZB-FREE-00999 freely available EZB journals
subjects English language
German language
Open source software
Segmentation
Speech recognition
Text-to-speech
Translation
title Open Source Toolkit for Speech to Text Translation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T20%3A19%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Open%20Source%20Toolkit%20for%20Speech%20to%20Text%20Translation&rft.jtitle=Prague%20bulletin%20of%20mathematical%20linguistics&rft.au=Zenkel,%20Thomas&rft.date=2018-10-01&rft.volume=111&rft.issue=1&rft.spage=125&rft.epage=135&rft.pages=125-135&rft.issn=1804-0462&rft.eissn=1804-0462&rft_id=info:doi/10.2478/pralin-2018-0011&rft_dat=%3Cproquest_cross%3E2167894001%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2167894001&rft_id=info:pmid/&rfr_iscdi=true