PATCH: A Plug-in Framework of Non-blocking Inference for Distributed Multimodal System



Bibliographic Details
Published in: Proceedings of ACM on interactive, mobile, wearable and ubiquitous technologies, 2023-09, Vol.7 (3), p.1-24, Article 130
Main Authors: Wang, Juexing; Wang, Guangjing; Zhang, Xiao; Liu, Li; Zeng, Huacheng; Xiao, Li; Cao, Zhichao; Gu, Lin; Li, Tianxing
Format: Article
Language: English
Subjects: Architectures; Cloud computing; Computer systems organization; Computing methodologies; Distributed architectures; Human-centered computing; Learning paradigms; Machine learning; Multi-task learning; Ubiquitous and mobile computing; Ubiquitous and mobile computing systems and tools
Online Access: Full text
Abstract: Recent advancements in deep learning have shown that multimodal inference can be particularly useful in tasks like autonomous driving, human health, and production line monitoring. However, deploying state-of-the-art multimodal models in distributed IoT systems poses unique challenges, since the sensor data from low-cost edge devices can get corrupted, lost, or delayed before reaching the cloud. These problems are magnified in the presence of asymmetric data generation rates from different sensor modalities, wireless network dynamics, or unpredictable sensor behavior, leading to either increased latency or degraded inference accuracy, which could affect the normal operation of the system with severe consequences such as human injury or car accidents. In this paper, we propose PATCH, a framework of speculative inference to adapt to these complex scenarios. PATCH serves as a plug-in module for existing multimodal models, enabling speculative inference with these off-the-shelf deep learning models. PATCH consists of 1) a Masked-AutoEncoder-based cross-modality imputation module that imputes missing data using partially available sensor data, 2) a lightweight feature pair ranking module that effectively limits the search space for the optimal imputation configuration with low computation overhead, and 3) a data alignment module that aligns heterogeneous multimodal data streams without relying on accurate timestamps or external synchronization mechanisms. We implement PATCH in nine popular multimodal models using five public datasets and one self-collected dataset. The experimental results show that PATCH achieves up to 13% mean accuracy improvement over the state-of-the-art method while using only 10% of the training data and reducing the training overhead by 73% compared to the original cost of retraining the model.
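The record does not include PATCH's actual architecture, but the masking idea behind its cross-modality imputation module can be illustrated in a few lines: the features of a missing modality are replaced by a mask token, and an encoder-decoder reconstructs the full feature vector from the modalities that did arrive. The sketch below is hypothetical throughout; the dimensions, the random stand-in weights, and the `impute` helper are illustrative assumptions, not PATCH's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 2 modalities, 4 features each.
FEAT = 4
MASK_TOKEN = np.zeros(FEAT)  # a learned vector in a real masked autoencoder; fixed here

# Random weights stand in for a trained encoder/decoder.
W_enc = rng.normal(size=(2 * FEAT, 8))
W_dec = rng.normal(size=(8, 2 * FEAT))

def impute(modalities):
    """Substitute a mask token for each missing modality (None),
    then encode/decode to reconstruct features for all modalities."""
    x = np.concatenate([m if m is not None else MASK_TOKEN
                        for m in modalities])
    z = np.tanh(x @ W_enc)   # encoder over partially available input
    return z @ W_dec         # decoder emits features for both modalities

audio = rng.normal(size=FEAT)
recon = impute([audio, None])  # second modality missing or delayed
print(recon.shape)             # (8,): features for both modalities
```

In a trained model, the decoder's output for the masked slot would serve as the imputed modality, letting inference proceed without blocking on the delayed sensor stream.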
DOI: 10.1145/3610885
ISSN: 2474-9567
Source: Access via ACM Digital Library