Video Scene Segmentation Using Tensor-Train Faster-RCNN for Multimedia IoT Systems

Video surveillance techniques like scene segmentation are playing an increasingly important role in multimedia Internet-of-Things (IoT) systems. However, existing deep learning-based methods face challenges in both accuracy and memory when deployed on edge computing devices with limited computing resources. To address these challenges, a tensor-train video scene segmentation scheme that compares the local background information in regional scene boundary boxes in adjacent frames is proposed. Compared to existing methods, the proposed scheme achieves competitive performance in both segmentation accuracy and parameter compression rate. First, an improved faster region convolutional neural network (faster-RCNN) model is proposed to recognize and generate a large number of region boxes with foreground and background to obtain boundary boxes. Then, the foreground boxes with sparse objects are removed and the rest are treated as candidate background boxes used to measure the similarity between two adjacent frames. Second, to accelerate training and reduce memory size, a general and efficient training method that uses tensor-train decomposition to factor the input-to-hidden weight matrix is proposed. Finally, experiments evaluate the performance of the proposed scheme in terms of accuracy and model compression. The results demonstrate that the proposed model improves training efficiency and saves memory for the deep computation model while maintaining good accuracy. This work opens the potential for the use of artificial intelligence methods in edge computing devices for multimedia IoT systems.
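The boundary test described in the abstract, comparing local background appearance in matched boxes across two adjacent frames, can be illustrated with a toy sketch. Histogram intersection is used here as a stand-in similarity measure; the paper's actual metric, box matching, and decision threshold are not given in this record, and all function names below are hypothetical:

```python
import numpy as np

def box_histogram(frame, box, bins=32):
    """Normalized intensity histogram of one region (x0, y0, x1, y1) of a grayscale frame."""
    x0, y0, x1, y1 = box
    patch = frame[y0:y1, x0:x1]
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)  # guard against an empty patch

def is_scene_cut(frame_a, frame_b, background_boxes, threshold=0.5):
    """Declare a scene boundary when matched background boxes disagree on average.

    Similarity per box is histogram intersection in [0, 1] (1 = identical appearance).
    """
    sims = []
    for box in background_boxes:
        ha = box_histogram(frame_a, box)
        hb = box_histogram(frame_b, box)
        sims.append(np.minimum(ha, hb).sum())
    return float(np.mean(sims)) < threshold
```

In this sketch, an unchanged background across a pair of frames keeps the mean similarity near 1 (no cut), while a full scene change drives the intersection toward 0 and triggers a boundary.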

Detailed Description

Saved in:
Bibliographic Details
Published in: IEEE Internet of Things Journal, 2021-06, Vol. 8 (12), p. 9697-9705
Main Authors: Dai, Cheng, Liu, Xingang, Yang, Laurence T., Ni, Minghao, Ma, Zhenchao, Zhang, Qingchen, Deen, M. Jamal
Format: Article
Language: eng
Keywords:
Online Access: Order full text
container_end_page 9705
container_issue 12
container_start_page 9697
container_title IEEE internet of things journal
container_volume 8
creator Dai, Cheng
Liu, Xingang
Yang, Laurence T.
Ni, Minghao
Ma, Zhenchao
Zhang, Qingchen
Deen, M. Jamal
description Video surveillance techniques like scene segmentation are playing an increasingly important role in multimedia Internet-of-Things (IoT) systems. However, existing deep learning-based methods face challenges in both accuracy and memory when deployed on edge computing devices with limited computing resources. To address these challenges, a tensor-train video scene segmentation scheme that compares the local background information in regional scene boundary boxes in adjacent frames is proposed. Compared to existing methods, the proposed scheme achieves competitive performance in both segmentation accuracy and parameter compression rate. First, an improved faster region convolutional neural network (faster-RCNN) model is proposed to recognize and generate a large number of region boxes with foreground and background to obtain boundary boxes. Then, the foreground boxes with sparse objects are removed and the rest are treated as candidate background boxes used to measure the similarity between two adjacent frames. Second, to accelerate training and reduce memory size, a general and efficient training method that uses tensor-train decomposition to factor the input-to-hidden weight matrix is proposed. Finally, experiments evaluate the performance of the proposed scheme in terms of accuracy and model compression. The results demonstrate that the proposed model improves training efficiency and saves memory for the deep computation model while maintaining good accuracy. This work opens the potential for the use of artificial intelligence methods in edge computing devices for multimedia IoT systems.
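The description's second contribution, factoring a weight matrix with a tensor-train decomposition, can be sketched with the standard TT-SVD algorithm: the matrix is reshaped into a higher-order tensor and factored into a chain of small 3-way cores by sequential truncated SVDs. This is a generic NumPy illustration under assumed shapes, not the paper's exact training procedure:

```python
import numpy as np

def tt_decompose(tensor, max_rank):
    """Factor a tensor into tensor-train (TT) cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))
        # core k has shape (left_rank, mode_size, right_rank)
        cores.append(u[:, :new_rank].reshape(rank, dims[k], new_rank))
        rank = new_rank
        # carry the remainder forward and fold in the next mode
        mat = (np.diag(s[:new_rank]) @ vt[:new_rank]).reshape(rank * dims[k + 1], -1)
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a dense tensor."""
    out = cores[0]
    for core in cores[1:]:
        r = core.shape[0]
        out = out.reshape(-1, r) @ core.reshape(r, -1)
    return out.reshape(tuple(c.shape[1] for c in cores))
```

For example, a 4096 x 4096 input-to-hidden weight matrix reshaped to an 8th-order tensor of shape (8, ..., 8) stores only the small cores, roughly `sum(r_k * 8 * r_{k+1})` parameters instead of 16.7 M, which is the kind of compression the abstract targets for edge devices; the accuracy/compression trade-off is controlled by `max_rank`.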
doi_str_mv 10.1109/JIOT.2020.3022353
format Article
fulltext fulltext_linktorsrc
identifier ISSN: 2327-4662
ispartof IEEE internet of things journal, 2021-06, Vol.8 (12), p.9697-9705
issn 2327-4662
2327-4662
language eng
recordid cdi_proquest_journals_2536870327
source IEEE Electronic Library (IEL)
subjects Accuracy
Artificial intelligence
Artificial neural networks
Boxes
Computational modeling
Computer memory
Deep learning
Edge computing
Feature extraction
Frames (data processing)
Image segmentation
Internet of Things
Machine learning
Model accuracy
Multimedia
multimedia Internet-of-Things (IoT) system
Performance evaluation
Segmentation
Tensile stress
tensor train
Tensors
Training
video scene segmentation
title Video Scene Segmentation Using Tensor-Train Faster-RCNN for Multimedia IoT Systems