SignRing: Continuous American Sign Language Recognition Using IMU Rings and Virtual IMU Data

Sign language is a natural language widely used by Deaf and hard of hearing (DHH) individuals. Advanced wearables have been developed to recognize sign language automatically, but they are limited by the lack of labeled data, which leads to small vocabularies and unsatisfactory performance even when laborious effort is put into data collection. Here we propose SignRing, an IMU-based system that breaks through traditional data augmentation by using online videos to generate virtual IMU (v-IMU) data, pushing the boundary of wearable-based systems to a vocabulary size of 934 with sentences of up to 16 glosses. The v-IMU data is generated by reconstructing 3D hand movements from two-view videos and calculating 3-axis acceleration data. With it, SignRing achieves a word error rate (WER) of 6.3% with a mix of half v-IMU and half IMU training data (2,339 samples each) and a WER of 14.7% with 100% v-IMU training data (6,048 samples), compared with a baseline WER of 8.3% (trained on 2,339 samples of IMU data). Comparisons between v-IMU and IMU data demonstrate the reliability and generalizability of the v-IMU data. This interdisciplinary work spans wearable sensor development, computer vision, deep learning, and linguistics, and can provide valuable insights to researchers with similar objectives.
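The abstract describes v-IMU generation only at a high level: reconstruct 3D hand movements from two-view videos, then calculate 3-axis acceleration. As a minimal sketch of that last step (not the authors' implementation; the smoothing method, window size, and 30 Hz sampling rate are illustrative assumptions), acceleration can be derived from a reconstructed keypoint trajectory by smoothing followed by double numerical differentiation:

```python
import numpy as np
from scipy.signal import savgol_filter

def virtual_acceleration(positions: np.ndarray, fs: float = 30.0) -> np.ndarray:
    """Hypothetical sketch: derive 3-axis 'virtual IMU' acceleration from a
    reconstructed 3D keypoint trajectory (shape [T, 3], metres) sampled at
    fs Hz. This is one plausible way to compute acceleration from positions,
    not the SignRing pipeline; it also ignores gravity and sensor orientation,
    which a real ring IMU would observe.
    """
    # Smooth each axis to suppress reconstruction jitter before
    # differentiating (Savitzky-Golay; window/order are arbitrary choices).
    smoothed = savgol_filter(positions, window_length=9, polyorder=3, axis=0)

    dt = 1.0 / fs
    # Double second-order central differences give acceleration in m/s^2.
    return np.gradient(np.gradient(smoothed, dt, axis=0), dt, axis=0)

# Example: a synthetic 2-second circular wrist trajectory at 30 fps.
t = np.linspace(0.0, 2.0, 60)
traj = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), 0.1 * t], axis=1)
print(virtual_acceleration(traj).shape)  # (60, 3)
```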
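The reported metric, word error rate (WER), is the standard edit-distance measure over gloss sequences: substitutions plus deletions plus insertions, divided by the reference length. A generic implementation, not taken from the paper:

```python
def word_error_rate(reference: list[str], hypothesis: list[str]) -> float:
    """Standard WER: (substitutions + deletions + insertions) / len(reference),
    computed via dynamic-programming edit distance over gloss tokens."""
    n, m = len(reference), len(hypothesis)
    # d[i][j] = edit distance between reference[:i] and hypothesis[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[n][m] / max(n, 1)

print(word_error_rate(["I", "GO", "STORE"], ["I", "GO", "HOME"]))  # ~0.333
```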

Bibliographic Details
Published in: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2023-09, Vol. 7 (3), pp. 1-29, Article 107
Authors: Li, Jiyang; Huang, Lin; Shah, Siddharth; Jones, Sean J.; Jin, Yincheng; Wang, Dingran; Russell, Adam; Choi, Seokmin; Gao, Yang; Yuan, Junsong; Jin, Zhanpeng
Format: Article
Language: English
Subjects: Auditory feedback; Empirical studies in HCI; Human computer interaction (HCI); Human-centered computing; Interaction techniques; Ubiquitous and mobile computing; Ubiquitous and mobile computing theory, concepts and paradigms; Ubiquitous computing
Online access: Full text
DOI: 10.1145/3610881
ISSN: 2474-9567