An Unified Recurrent Video Object Segmentation Framework for Various Surveillance Environments
Published in: | IEEE transactions on image processing, 2021, Vol.30, p.7889-7902 |
Main authors: | Patil, Prashant W.; Dudhane, Akshay; Kulkarni, Ashutosh; Murala, Subrahmanyam; Gonde, Anil Balaji; Gupta, Sunil |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 7902 |
container_issue | |
container_start_page | 7889 |
container_title | IEEE transactions on image processing |
container_volume | 30 |
creator | Patil, Prashant W.; Dudhane, Akshay; Kulkarni, Ashutosh; Murala, Subrahmanyam; Gonde, Anil Balaji; Gupta, Sunil |
description | Moving object segmentation (MOS) in videos has received considerable attention because of its broad range of security-related applications, such as robotics, outdoor video surveillance, and self-driving cars. Current prevailing algorithms depend heavily on additional modules trained for other applications, require complicated training procedures, or neglect inter-frame spatio-temporal structural dependencies. To address these issues, a simple, robust, and effective unified recurrent edge aggregation approach is proposed for MOS, in which neither additional trained modules nor fine-tuning on test video frame(s) is required. Here, a recurrent edge aggregation module (REAM) is proposed to extract foreground-relevant features that capture spatio-temporal structural dependencies, with encoder and respective decoder features connected recurrently from the previous frame. These REAM features are then connected to a decoder through skip connections for comprehensive learning, named temporal information propagation. Further, a motion refinement block with multi-scale dense residuals is proposed to combine features from the optical-flow encoder stream and the last REAM module for holistic feature learning. Finally, these holistic features and REAM features are given to the decoder block for segmentation. To guide the decoder block, the previous frame's output at the respective scales is utilized. Different training-testing configurations are examined to evaluate the performance of the proposed method. Notably, outdoor videos often suffer from constrained visibility due to varying environmental conditions and small airborne particles that scatter light in the atmosphere. Thus, a comprehensive result analysis is conducted on six benchmark video datasets covering different surveillance environments. We demonstrate that the proposed method outperforms state-of-the-art methods for MOS without any pre-trained module, fine-tuning on the test video frame(s), or complicated training. |
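The abstract describes a concrete data flow: recurrent edge aggregation fed by the previous frame, skip connections to the decoder, motion refinement over an optical-flow stream, and previous-frame output guiding segmentation. The PyTorch sketch below illustrates that flow at a single scale under stated assumptions; the module names (`REAM`, `MotionRefine`, `MOSSketch`), all channel widths and layer counts, and the use of the REAM output as the carried recurrent state are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal single-scale sketch of the pipeline the abstract describes.
# All sizes, names, and the simplified recurrence are assumptions.
import torch
import torch.nn as nn


class REAM(nn.Module):
    """Recurrent edge aggregation (sketch): fuses current encoder
    features with features carried over from the previous frame."""
    def __init__(self, ch):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, enc_feat, prev_feat):
        # Recurrent connection: concatenate previous-frame features.
        return self.fuse(torch.cat([enc_feat, prev_feat], dim=1))


class MotionRefine(nn.Module):
    """Motion refinement (sketch): merges optical-flow-stream features
    with the deepest REAM features; the paper's multi-scale dense
    residual design is simplified here to one residual convolution."""
    def __init__(self, ch):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, flow_feat, ream_feat):
        return ream_feat + self.mix(torch.cat([flow_feat, ream_feat], dim=1))


class MOSSketch(nn.Module):
    """One-scale stand-in for the full encoder/decoder; the real model
    uses multiple scales with skip connections at each one."""
    def __init__(self, ch=32):
        super().__init__()
        self.frame_enc = nn.Conv2d(3, ch, 3, padding=1)
        self.flow_enc = nn.Conv2d(2, ch, 3, padding=1)  # flow = (dx, dy)
        self.ream = REAM(ch)
        self.refine = MotionRefine(ch)
        # Decoder sees holistic features, REAM skip features, and the
        # previous frame's mask as guidance (all concatenated).
        self.decoder = nn.Conv2d(2 * ch + 1, 1, 3, padding=1)

    def forward(self, frame, flow, prev_feat, prev_mask):
        enc = torch.relu(self.frame_enc(frame))
        flo = torch.relu(self.flow_enc(flow))
        ream = self.ream(enc, prev_feat)
        holistic = self.refine(flo, ream)
        mask = torch.sigmoid(
            self.decoder(torch.cat([holistic, ream, prev_mask], dim=1)))
        return mask, ream  # ream feeds the recurrence at the next frame


# Usage: roll the model over a toy clip, carrying state forward.
model = MOSSketch()
frames = torch.rand(4, 3, 64, 64)       # T x C x H x W toy clip
flows = torch.rand(4, 2, 64, 64)        # precomputed optical flow
prev_feat = torch.zeros(1, 32, 64, 64)  # zero state for frame 0
prev_mask = torch.zeros(1, 1, 64, 64)
for t in range(4):
    prev_mask, prev_feat = model(frames[t:t+1], flows[t:t+1],
                                 prev_feat, prev_mask)
```

The per-frame rollout above is the key point: each frame's segmentation reuses features and the mask from the preceding frame, which is how the method propagates temporal information without any extra pre-trained module. |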
doi_str_mv | 10.1109/TIP.2021.3108405 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1057-7149; EISSN: 1941-0042; DOI: 10.1109/TIP.2021.3108405; PMID: 34478367; CODEN: IIPRE4 |
ispartof | IEEE transactions on image processing, 2021, Vol.30, p.7889-7902 |
issn | 1057-7149; 1941-0042 |
language | eng |
recordid | cdi_proquest_journals_2575128682 |
source | IEEE Electronic Library (IEL) |
subjects | adversarial learning; Agglomeration; Algorithms; Autonomous cars; Coders; Decoding; Dynamics; Feature extraction; Machine learning; Modules; Object segmentation; Optical flow (image analysis); Optical imaging; recurrent feature sharing; Robotics; Segmentation; Spatio-temporal dependencies; Surveillance; Task analysis; Training; various surveillance environments; Video; Visibility |
title | An Unified Recurrent Video Object Segmentation Framework for Various Surveillance Environments |