SignRing: Continuous American Sign Language Recognition Using IMU Rings and Virtual IMU Data

Sign language is a natural language widely used by Deaf and hard of hearing (DHH) individuals. Advanced wearables have been developed to recognize sign language automatically, but they are limited by the lack of labeled data, which leads to small vocabularies and unsatisfactory performance even when laborious effort is put into data collection. Here we propose SignRing, an IMU-based system that breaks through traditional data augmentation by using online videos to generate virtual IMU (v-IMU) data, pushing the boundary of wearable-based systems to a vocabulary size of 934 with sentences of up to 16 glosses. The v-IMU data is generated by reconstructing 3D hand movements from two-view videos and calculating 3-axis acceleration data. With it, SignRing achieves a word error rate (WER) of 6.3% with a mix of half v-IMU and half IMU training data (2,339 samples each) and a WER of 14.7% with 100% v-IMU training data (6,048 samples), compared with a baseline WER of 8.3% (trained on 2,339 samples of IMU data). Comparisons between v-IMU and IMU data demonstrate the reliability and generalizability of the v-IMU data. This interdisciplinary work spans wearable sensor development, computer vision, deep learning, and linguistics, and can provide valuable insights to researchers with similar objectives.
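The abstract describes v-IMU generation only at a high level: reconstruct 3D hand movements from two-view videos, then calculate 3-axis acceleration. As a minimal sketch of that last step (not the authors' implementation; the smoothing method, window size, and 30 Hz sampling rate are illustrative assumptions), acceleration can be derived from a reconstructed keypoint trajectory by smoothing followed by double numerical differentiation:

```python
import numpy as np
from scipy.signal import savgol_filter

def virtual_acceleration(positions: np.ndarray, fs: float = 30.0) -> np.ndarray:
    """Hypothetical sketch: derive 3-axis 'virtual IMU' acceleration from a
    reconstructed 3D keypoint trajectory (shape [T, 3], metres) sampled at
    fs Hz. This is one plausible way to compute acceleration from positions,
    not the SignRing pipeline; it also ignores gravity and sensor orientation,
    which a real ring IMU would observe.
    """
    # Smooth each axis to suppress reconstruction jitter before
    # differentiating (Savitzky-Golay; window/order are arbitrary choices).
    smoothed = savgol_filter(positions, window_length=9, polyorder=3, axis=0)

    dt = 1.0 / fs
    # Double second-order central differences give acceleration in m/s^2.
    return np.gradient(np.gradient(smoothed, dt, axis=0), dt, axis=0)

# Example: a synthetic 2-second circular wrist trajectory at 30 fps.
t = np.linspace(0.0, 2.0, 60)
traj = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), 0.1 * t], axis=1)
print(virtual_acceleration(traj).shape)  # (60, 3)
```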
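The reported metric, word error rate (WER), is the standard edit-distance measure over gloss sequences: substitutions plus deletions plus insertions, divided by the reference length. A generic implementation, not taken from the paper:

```python
def word_error_rate(reference: list[str], hypothesis: list[str]) -> float:
    """Standard WER: (substitutions + deletions + insertions) / len(reference),
    computed via dynamic-programming edit distance over gloss tokens."""
    n, m = len(reference), len(hypothesis)
    # d[i][j] = edit distance between reference[:i] and hypothesis[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[n][m] / max(n, 1)

print(word_error_rate(["I", "GO", "STORE"], ["I", "GO", "HOME"]))  # ~0.333
```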

Bibliographic Details
Published in: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2023-09, Vol. 7 (3), pp. 1-29, Article 107
Authors: Li, Jiyang; Huang, Lin; Shah, Siddharth; Jones, Sean J.; Jin, Yincheng; Wang, Dingran; Russell, Adam; Choi, Seokmin; Gao, Yang; Yuan, Junsong; Jin, Zhanpeng
Format: Article
Language: English
Subjects: Auditory feedback; Empirical studies in HCI; Human computer interaction (HCI); Human-centered computing; Interaction techniques; Ubiquitous and mobile computing; Ubiquitous and mobile computing theory, concepts and paradigms; Ubiquitous computing
Online access: Full text
DOI: 10.1145/3610881
ISSN: 2474-9567