SignRing: Continuous American Sign Language Recognition Using IMU Rings and Virtual IMU Data
Sign language is a natural language widely used by Deaf and hard of hearing (DHH) individuals. Advanced wearables are developed to recognize sign language automatically. However, they are limited by the lack of labeled data, which leads to a small vocabulary and unsatisfactory performance even thoug...
Published in: | Proceedings of ACM on interactive, mobile, wearable and ubiquitous technologies, 2023-09, Vol.7 (3), p.1-29, Article 107 |
---|---|
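Main authors: | Li, Jiyang ; Huang, Lin ; Shah, Siddharth ; Jones, Sean J. ; Jin, Yincheng ; Wang, Dingran ; Russell, Adam ; Choi, Seokmin ; Gao, Yang ; Yuan, Junsong ; Jin, Zhanpeng |
Format: | Article |
Language: | eng |
Subjects: | Auditory feedback ; Empirical studies in HCI ; Human computer interaction (HCI) ; Human-centered computing ; Interaction techniques ; Ubiquitous and mobile computing ; Ubiquitous and mobile computing theory, concepts and paradigms ; Ubiquitous computing |
Online access: | Full text |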
container_end_page | 29 |
---|---|
container_issue | 3 |
container_start_page | 1 |
container_title | Proceedings of ACM on interactive, mobile, wearable and ubiquitous technologies |
container_volume | 7 |
creator | Li, Jiyang ; Huang, Lin ; Shah, Siddharth ; Jones, Sean J. ; Jin, Yincheng ; Wang, Dingran ; Russell, Adam ; Choi, Seokmin ; Gao, Yang ; Yuan, Junsong ; Jin, Zhanpeng |
description | Sign language is a natural language widely used by Deaf and hard of hearing (DHH) individuals. Advanced wearables are developed to recognize sign language automatically. However, they are limited by the lack of labeled data, which leads to a small vocabulary and unsatisfactory performance even though laborious efforts are put into data collection. Here we propose SignRing, an IMU-based system that breaks through the traditional data augmentation method, makes use of online videos to generate the virtual IMU (v-IMU) data, and pushes the boundary of wearable-based systems by reaching the vocabulary size of 934 with sentences up to 16 glosses. The v-IMU data is generated by reconstructing 3D hand movements from two-view videos and calculating 3-axis acceleration data, by which we are able to achieve a word error rate (WER) of 6.3% with a mix of half v-IMU and half IMU training data (2339 samples for each), and a WER of 14.7% with 100% v-IMU training data (6048 samples), compared with the baseline performance of the 8.3% WER (trained with 2339 samples of IMU data). We have conducted comparisons between v-IMU and IMU data to demonstrate the reliability and generalizability of the v-IMU data. This interdisciplinary work covers various areas such as wearable sensor development, computer vision techniques, deep learning, and linguistics, which can provide valuable insights to researchers with similar research objectives. |
doi_str_mv | 10.1145/3610881 |
format | Article |
addtitle | ACM IMWUT |
collection | CrossRef |
creationdate | 2023-09-27 |
linktopdf | https://dl.acm.org/doi/pdf/10.1145/3610881 |
orcid | 0009-0002-0167-6144 ; 0000-0002-7901-8793 ; 0000-0003-3660-938X ; 0000-0002-1812-4234 ; 0009-0008-1135-2745 ; 0000-0001-6811-0183 ; 0000-0002-0692-8972 ; 0000-0002-6715-8676 ; 0000-0003-4284-6728 ; 0009-0004-2887-3513 ; 0000-0002-3020-3736 |
publisher | New York, NY, USA: ACM |
rights | ACM |
toplevel | peer_reviewed ; online_resources |
fulltext | fulltext |
identifier | ISSN: 2474-9567 |
ispartof | Proceedings of ACM on interactive, mobile, wearable and ubiquitous technologies, 2023-09, Vol.7 (3), p.1-29, Article 107 |
issn | 2474-9567 (ISSN) ; 2474-9567 (EISSN) |
language | eng |
recordid | cdi_crossref_primary_10_1145_3610881 |
source | ACM Digital Library Complete |
subjects | Auditory feedback ; Empirical studies in HCI ; Human computer interaction (HCI) ; Human-centered computing ; Interaction techniques ; Ubiquitous and mobile computing ; Ubiquitous and mobile computing theory, concepts and paradigms ; Ubiquitous computing |
title | SignRing: Continuous American Sign Language Recognition Using IMU Rings and Virtual IMU Data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T02%3A55%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SignRing:%20Continuous%20American%20Sign%20Language%20Recognition%20Using%20IMU%20Rings%20and%20Virtual%20IMU%20Data&rft.jtitle=Proceedings%20of%20ACM%20on%20interactive,%20mobile,%20wearable%20and%20ubiquitous%20technologies&rft.au=Li,%20Jiyang&rft.date=2023-09-27&rft.volume=7&rft.issue=3&rft.spage=1&rft.epage=29&rft.pages=1-29&rft.artnum=107&rft.issn=2474-9567&rft.eissn=2474-9567&rft_id=info:doi/10.1145/3610881&rft_dat=%3Cacm_cross%3E3610881%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
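
The description above mentions two computations: deriving 3-axis acceleration from reconstructed 3D hand movements, and scoring recognition with word error rate (WER). The following is a minimal Python sketch of both, not the authors' pipeline. It assumes hand positions are already available as evenly sampled (x, y, z) tuples and approximates acceleration with a central second difference; it ignores gravity, sensor orientation, and filtering, which a real IMU comparison would need. The function names and the example glosses are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def virtual_acceleration(positions: Sequence[Vec3], dt: float) -> List[Vec3]:
    """Approximate acceleration from evenly sampled 3D positions.

    Uses the central second difference
        a[t] = (p[t+1] - 2*p[t] + p[t-1]) / dt**2
    so the result has len(positions) - 2 entries.
    Assumption: positions are in a fixed world frame; gravity is not added.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    accelerations: List[Vec3] = []
    for prev, cur, nxt in zip(positions, positions[1:], positions[2:]):
        ax, ay, az = (
            (n - 2.0 * c + p) / (dt * dt) for p, c, n in zip(prev, cur, nxt)
        )
        accelerations.append((ax, ay, az))
    return accelerations


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """WER = (substitutions + deletions + insertions) / len(reference).

    The edit count is the Levenshtein distance over gloss tokens.
    """
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev_row[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev_row = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        row = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            row.append(min(prev_row[j] + 1,          # deletion
                           row[j - 1] + 1,           # insertion
                           prev_row[j - 1] + cost))  # match or substitution
        prev_row = row
    return prev_row[-1] / len(reference)


if __name__ == "__main__":
    # Hypothetical trajectory x(t) = t**2 sampled at dt = 1 (constant acceleration 2).
    track = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
    print(virtual_acceleration(track, dt=1.0))
    # Hypothetical gloss sequences with one substituted gloss out of four.
    print(word_error_rate("I GO STORE TOMORROW".split(), "I GO HOME TOMORROW".split()))
```

The second difference is chosen only because it is the simplest way to turn positions into acceleration; video-derived trajectories are noisy, so a practical version would likely smooth the positions first.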