A Transformer Architecture for Online Gesture Recognition of Mathematical Expressions



Bibliographic Details
Main Authors: Ramo, Mirco; Silvestre, Guénolé C. M
Format: Article
Language: English
Online Access: Order full text
creator Ramo, Mirco; Silvestre, Guénolé C. M
description In Proc. of AICS 2022, Cork, Ireland (Communications in Computer and Information Science). The Transformer architecture is shown to provide a powerful framework as an end-to-end model for building expression trees from online handwritten gestures corresponding to glyph strokes. In particular, the attention mechanism was successfully used to encode, learn and enforce the underlying syntax of expressions, creating latent representations that are correctly decoded to the exact mathematical expression tree and providing robustness to ablated inputs and unseen glyphs. For the first time, the encoder is fed with spatio-temporal data tokens potentially forming an infinitely large vocabulary, which finds applications beyond that of online gesture recognition. A new supervised dataset of online handwriting gestures is provided for training models on generic handwriting recognition tasks, and a new metric is proposed for evaluating the syntactic correctness of the output expression trees. A small Transformer model suitable for edge inference was successfully trained to an average normalised Levenshtein accuracy of 94%, resulting in a valid postfix RPN tree representation for 94% of predictions.
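The description reports two evaluation notions: a normalised Levenshtein accuracy over predicted token sequences, and the validity of the output as a postfix (RPN) expression. The paper's exact formulations are not reproduced in this record; the sketch below shows one common way to compute both, and the function names and the example arity table are illustrative assumptions, not the authors' code.

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance between two sequences,
    # using a rolling single-row table.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def normalised_levenshtein_accuracy(pred, target):
    # One common normalisation: 1 - distance / length of the longer sequence,
    # so identical sequences score 1.0 and disjoint ones approach 0.0.
    if not pred and not target:
        return 1.0
    return 1.0 - levenshtein(pred, target) / max(len(pred), len(target))

def is_valid_postfix(tokens, arity):
    # Stack simulation for syntactic validity of a postfix (RPN) sequence:
    # each operator pops `arity[op]` operands and pushes one result; the
    # sequence is valid iff the stack never underflows and exactly one
    # value remains at the end. Leaf glyphs have arity 0.
    depth = 0
    for t in tokens:
        n = arity.get(t, 0)
        if depth < n:
            return False
        depth += 1 - n
    return depth == 1
```

For example, with a hypothetical arity table `{"+": 2}`, the sequence `["2", "3", "+"]` is a valid postfix encoding of `2 + 3`, while `["2", "+"]` underflows and is rejected.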
doi_str_mv 10.48550/arxiv.2211.02643
format Article
creationdate 2022-11-04
rights http://creativecommons.org/licenses/by/4.0
oa free_for_read
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2211.02643
language eng
recordid cdi_arxiv_primary_2211_02643
source arXiv.org
subjects Computer Science - Computation and Language
Computer Science - Computer Vision and Pattern Recognition
title A Transformer Architecture for Online Gesture Recognition of Mathematical Expressions
url https://arxiv.org/abs/2211.02643