Comparison between handwritten word and speech record in real-time using CNN architectures

This paper presents a system that compares spoken and handwritten words using deep learning techniques. Ten words are captured through an audio function, and the same words are written by hand and captured by a webcam, in order to verify if the t...

Detailed description

Bibliographic details
Published in:	International journal of electrical and computer engineering (Malacca, Malacca), 2020-08, Vol.10 (4), p.4313
Main authors:	Pinzón-Arenas, Javier Orlando ; Jiménez-Moreno, Robinson
Format:	Article
Language:	eng
Subjects:	Handwriting ; Machine learning ; Neural networks ; Real time ; Voice recognition ; Webcams ; Words (language)
Online access:	Full text

container_end_page
container_issue	4
container_start_page	4313
container_title	International journal of electrical and computer engineering (Malacca, Malacca)
container_volume	10
creator	Pinzón-Arenas, Javier Orlando ; Jiménez-Moreno, Robinson
description	This paper presents a system that compares spoken and handwritten words using deep learning techniques. Ten words are captured through an audio function, and the same words are written by hand and captured by a webcam, in order to verify whether the two inputs match and to show whether or not the result is the required word. A different CNN architecture was used for each function: for voice recognition, a CNN identifies complete words from features obtained with mel-frequency cepstral coefficients, while for handwriting, a Faster R-CNN both locates and identifies the captured word. To implement the system, an easy-to-use graphical interface was developed that combines the two neural networks. Tests performed in real time yielded an overall accuracy of 95.24%, with the comparison completed in less than 200 ms, which shows the good performance of the implemented system.
doi_str_mv	10.11591/ijece.v10i4.pp4313-4321
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2397066661</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2397066661</sourcerecordid><originalsourceid>FETCH-LOGICAL-c147t-25801ddff5087c883a5468432c6caa5b63859140add35501e67fdce73a9181c63</originalsourceid><addsrcrecordid>eNotkMtOAzEMRSMEElXpP0RiPSWZPLtEI15SVTawYROliYemameGJEPF35M-vPG1Zdm-ByFMyZxSsaAPYQsO5r-UBD4fBs4oqzir6RWa1Kquq1oofV000brSiuhbNEsprAnnihMlxQR9Nf1-sDGkvsNryAeADm9s5w8x5Fz0oY8elxqnAcBtcAR37ISuKLurctgDHlPovnGzWmEb3SZkcHmMkO7QTWt3CWaXPEWfz08fzWu1fH95ax6XlaNc5fKjJtT7thVEK6c1s4JLXVw46awVa8l0ccqJ9Z4JQShI1XoHitkF1dRJNkX3571D7H9GSNls-zF25aSp2UIRWYKWKX2ecrFPKUJrhhj2Nv4ZSswJpjnBNCeY5gzTHGGyf_KWawk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2397066661</pqid></control><display><type>article</type><title>Comparison between handwritten word and speech record in real-time using CNN architectures</title><source>EZB-FREE-00999 freely available EZB journals</source><creator>Pinzón-Arenas, Javier Orlando ; Jiménez-Moreno, Robinson</creator><creatorcontrib>Pinzón-Arenas, Javier Orlando ; Jiménez-Moreno, Robinson</creatorcontrib><description>This paper presents the development of a system of comparison between words spoken and written by means of deep learning techniques. There are used 10 words acquired by means of an audio function and, these same words, are written by hand and acquired by a webcam, in such a way as to verify if the two data match and show whether or not it is the required word. For this, 2 different CNN architectures were used for each function, where for voice recognition, a suitable CNN was used to identify complete words by means of their features obtained with mel frequency cepstral coefficients, while for handwriting, a faster R-CNN was used, so that it both locates and identifies the captured word. 
To implement the system, an easy-to-use graphical interface was developed, which unites the two neural networks for its operation. With this, tests were performed in real-time, obtaining a general accuracy of 95.24%, allowing showing the good performance of the implemented system, adding the response speed factor, being less than 200 ms in making the comparison.</description><identifier>ISSN: 2088-8708</identifier><identifier>EISSN: 2722-2578</identifier><identifier>EISSN: 2088-8708</identifier><identifier>DOI: 10.11591/ijece.v10i4.pp4313-4321</identifier><language>eng</language><publisher>Yogyakarta: IAES Institute of Advanced Engineering and Science</publisher><subject>Handwriting ; Machine learning ; Neural networks ; Real time ; Voice recognition ; Webcams ; Words (language)</subject><ispartof>International journal of electrical and computer engineering (Malacca, Malacca), 2020-08, Vol.10 (4), p.4313</ispartof><rights>Copyright IAES Institute of Advanced Engineering and Science Aug 2020</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27903,27904</link.rule.ids></links><search><creatorcontrib>Pinzón-Arenas, Javier Orlando</creatorcontrib><creatorcontrib>Jiménez-Moreno, Robinson</creatorcontrib><title>Comparison between handwritten word and speech record in real-time using CNN architectures</title><title>International journal of electrical and computer engineering (Malacca, Malacca)</title><description>This paper presents the development of a system of comparison between words spoken and written by means of deep learning techniques. There are used 10 words acquired by means of an audio function and, these same words, are written by hand and acquired by a webcam, in such a way as to verify if the two data match and show whether or not it is the required word. 
For this, 2 different CNN architectures were used for each function, where for voice recognition, a suitable CNN was used to identify complete words by means of their features obtained with mel frequency cepstral coefficients, while for handwriting, a faster R-CNN was used, so that it both locates and identifies the captured word. To implement the system, an easy-to-use graphical interface was developed, which unites the two neural networks for its operation. With this, tests were performed in real-time, obtaining a general accuracy of 95.24%, allowing showing the good performance of the implemented system, adding the response speed factor, being less than 200 ms in making the comparison.</description><subject>Handwriting</subject><subject>Machine learning</subject><subject>Neural networks</subject><subject>Real time</subject><subject>Voice recognition</subject><subject>Webcams</subject><subject>Words (language)</subject><issn>2088-8708</issn><issn>2722-2578</issn><issn>2088-8708</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNotkMtOAzEMRSMEElXpP0RiPSWZPLtEI15SVTawYROliYemameGJEPF35M-vPG1Zdm-ByFMyZxSsaAPYQsO5r-UBD4fBs4oqzir6RWa1Kquq1oofV000brSiuhbNEsprAnnihMlxQR9Nf1-sDGkvsNryAeADm9s5w8x5Fz0oY8elxqnAcBtcAR37ISuKLurctgDHlPovnGzWmEb3SZkcHmMkO7QTWt3CWaXPEWfz08fzWu1fH95ax6XlaNc5fKjJtT7thVEK6c1s4JLXVw46awVa8l0ccqJ9Z4JQShI1XoHitkF1dRJNkX3571D7H9GSNls-zF25aSp2UIRWYKWKX2ecrFPKUJrhhj2Nv4ZSswJpjnBNCeY5gzTHGGyf_KWawk</recordid><startdate>20200801</startdate><enddate>20200801</enddate><creator>Pinzón-Arenas, Javier Orlando</creator><creator>Jiménez-Moreno, Robinson</creator><general>IAES Institute of Advanced Engineering and 
Science</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BVBZV</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L6V</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20200801</creationdate><title>Comparison between handwritten word and speech record in real-time using CNN architectures</title><author>Pinzón-Arenas, Javier Orlando ; Jiménez-Moreno, Robinson</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c147t-25801ddff5087c883a5468432c6caa5b63859140add35501e67fdce73a9181c63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Handwriting</topic><topic>Machine learning</topic><topic>Neural networks</topic><topic>Real time</topic><topic>Voice recognition</topic><topic>Webcams</topic><topic>Words (language)</topic><toplevel>online_resources</toplevel><creatorcontrib>Pinzón-Arenas, Javier Orlando</creatorcontrib><creatorcontrib>Jiménez-Moreno, Robinson</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>East & South Asia Database</collection><collection>ProQuest One 
Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><jtitle>International journal of electrical and computer engineering (Malacca, Malacca)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Pinzón-Arenas, Javier Orlando</au><au>Jiménez-Moreno, Robinson</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Comparison between handwritten word and speech record in real-time using CNN architectures</atitle><jtitle>International journal of electrical and computer engineering (Malacca, Malacca)</jtitle><date>2020-08-01</date><risdate>2020</risdate><volume>10</volume><issue>4</issue><spage>4313</spage><pages>4313-</pages><issn>2088-8708</issn><eissn>2722-2578</eissn><eissn>2088-8708</eissn><abstract>This paper presents the development of a system of comparison between words spoken and written by means of deep learning techniques. There are used 10 words acquired by means of an audio function and, these same words, are written by hand and acquired by a webcam, in such a way as to verify if the two data match and show whether or not it is the required word. 
For this, 2 different CNN architectures were used for each function, where for voice recognition, a suitable CNN was used to identify complete words by means of their features obtained with mel frequency cepstral coefficients, while for handwriting, a faster R-CNN was used, so that it both locates and identifies the captured word. To implement the system, an easy-to-use graphical interface was developed, which unites the two neural networks for its operation. With this, tests were performed in real-time, obtaining a general accuracy of 95.24%, allowing showing the good performance of the implemented system, adding the response speed factor, being less than 200 ms in making the comparison.</abstract><cop>Yogyakarta</cop><pub>IAES Institute of Advanced Engineering and Science</pub><doi>10.11591/ijece.v10i4.pp4313-4321</doi></addata></record>
fulltext	fulltext
identifier	ISSN: 2088-8708
ispartof	International journal of electrical and computer engineering (Malacca, Malacca), 2020-08, Vol.10 (4), p.4313
issn	2088-8708 2722-2578 2088-8708
language	eng
recordid	cdi_proquest_journals_2397066661
source	EZB-FREE-00999 freely available EZB journals
subjects	Handwriting ; Machine learning ; Neural networks ; Real time ; Voice recognition ; Webcams ; Words (language)
title	Comparison between handwritten word and speech record in real-time using CNN architectures
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T08%3A49%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparison%20between%20handwritten%20word%20and%20speech%20record%20in%20real-time%20using%20CNN%20architectures&rft.jtitle=International%20journal%20of%20electrical%20and%20computer%20engineering%20(Malacca,%20Malacca)&rft.au=Pinz%C3%B3n-Arenas,%20Javier%20Orlando&rft.date=2020-08-01&rft.volume=10&rft.issue=4&rft.spage=4313&rft.pages=4313-&rft.issn=2088-8708&rft.eissn=2722-2578&rft_id=info:doi/10.11591/ijece.v10i4.pp4313-4321&rft_dat=%3Cproquest_cross%3E2397066661%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2397066661&rft_id=info:pmid/&rfr_iscdi=true
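
The description field above outlines a final step in which the word recognized from speech is compared with the word recognized from handwriting, and the result is checked against the required word. The following is a minimal sketch of that decision logic only, not the authors' implementation. The `Prediction` structure, the `compare_words` function, the `min_score` threshold, and the example words are all hypothetical; the record does not list the ten words or describe how the two network outputs are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Top-1 output of one recognizer (hypothetical structure)."""
    label: str    # recognized word
    score: float  # classifier confidence in [0, 1]


def compare_words(spoken: Prediction, written: Prediction,
                  required: str, min_score: float = 0.5) -> dict:
    """Report whether both recognizers agree and whether they hit the required word.

    min_score is an assumed confidence cutoff, not a value from the paper.
    """
    confident = spoken.score >= min_score and written.score >= min_score
    same_word = spoken.label == written.label
    return {
        "confident": confident,
        "match": confident and same_word,
        "is_required": confident and same_word and spoken.label == required,
    }


# In the described system, these predictions would come from the MFCC-based CNN
# (speech) and the Faster R-CNN (handwriting); here they are hard-coded examples.
result = compare_words(Prediction("house", 0.97), Prediction("house", 0.91),
                       required="house")
print(result)
```

Keeping the comparison separate from the two recognizers means either network could be swapped out without touching the matching rule.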