Future Frame Prediction Network for Human Fall Detection in Surveillance Videos


Bibliographic Details
Published in: IEEE sensors journal 2023-07, Vol.23 (13), p.1-1
Main authors: Li, Suyuan; Song, Xin
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 1
container_issue 13
container_start_page 1
container_title IEEE sensors journal
container_volume 23
creator Li, Suyuan
Song, Xin
description Video fall detection is one of the most significant challenges in the computer vision domain, and it usually involves recognizing events that do not conform to expected normal behavior. Recently, unsupervised models have become popular for addressing problems that, under supervised learning, would require substantial manually labeled training data. However, almost all existing unsupervised methods minimize a reconstruction error, which may leave an insufficient gap between the reconstruction errors of fall and non-fall video frames because of the powerful representational ability of neural networks. In this paper, we propose a novel, efficient fall detection method based on future frame prediction. Specifically, an attention U-Net with flexible global aggregation blocks, which achieves better performance, serves as the frame prediction network, so that several past video frames are used to predict the next frame. In the training phase, commonly used appearance constraints on intensity and gradient are combined with a motion constraint to generate higher-quality frames. These constraints improve the prediction network and enlarge the difference between the predicted and real frames during a fall. In the testing phase, a fall score based on the error between the predicted frame and the real frame is computed to distinguish fall events. Exhaustive experiments have been conducted on the UR Fall dataset, the Multiple Cameras Fall dataset, and the High Quality Fall Simulation dataset; the results verify the effectiveness of the proposed method, which outperforms existing state-of-the-art methods.
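The abstract describes a testing-phase fall score computed from the error between the predicted and real frame, and training-phase appearance constraints on intensity and gradient. A minimal NumPy sketch of these two pieces follows; it is not the authors' implementation. The PSNR-based score with min-max normalization follows common practice in frame-prediction anomaly detection, and the loss weights, decision threshold, and optical-flow motion constraint are omitted here as implementation details.

```python
import numpy as np

# --- Testing phase: fall score from prediction error ------------------

def psnr(pred, real, data_range=1.0):
    """Peak signal-to-noise ratio between predicted and observed frame."""
    mse = np.mean((pred.astype(np.float64) - real.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def fall_scores(psnr_per_frame):
    """Min-max normalize PSNR over a video clip.

    A low score means a large prediction error, i.e. a likely fall;
    the decision threshold is dataset-dependent.
    """
    p = np.asarray(psnr_per_frame, dtype=np.float64)
    return (p - p.min()) / (p.max() - p.min() + 1e-8)

# --- Training phase: appearance constraints (illustrative) ------------

def intensity_loss(pred, real):
    # L2 penalty on raw pixel intensities.
    return float(np.mean((pred - real) ** 2))

def gradient_loss(pred, real):
    # L1 penalty on horizontal/vertical image-gradient magnitudes, which
    # sharpens predicted frames compared with an intensity loss alone.
    gx = np.abs(np.diff(pred, axis=1)) - np.abs(np.diff(real, axis=1))
    gy = np.abs(np.diff(pred, axis=0)) - np.abs(np.diff(real, axis=0))
    return float(np.mean(np.abs(gx)) + np.mean(np.abs(gy)))
```

In use, frames with a normalized score below some chosen threshold would be flagged as falls; the threshold and loss weights would have to be tuned per dataset.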
doi_str_mv 10.1109/JSEN.2023.3276891
format Article
fullrecord publisher: New York: IEEE; CODEN: ISJEAZ; ISSN: 1530-437X; EISSN: 1558-1748; IEEE document id: 10130115; ProQuest record id: 2831508046; ORCID: 0000-0001-6700-8670; rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
fulltext fulltext_linktorsrc
identifier ISSN: 1530-437X
ispartof IEEE sensors journal, 2023-07, Vol.23 (13), p.1-1
issn 1530-437X
1558-1748
language eng
recordid cdi_proquest_journals_2831508046
source IEEE Electronic Library (IEL)
subjects attention gate
Cameras
Computer vision
Datasets
Errors
Fall detection
Feature extraction
Frames (data processing)
Machine learning
Neural networks
Optical flow
prediction network
Predictions
Reconstruction
Sensors
Supervised learning
Training
U-Net
Videos
Wearable sensors
title Future Frame Prediction Network for Human Fall Detection in Surveillance Videos
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T14%3A20%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Future%20Frame%20Prediction%20Network%20for%20Human%20Fall%20Detection%20in%20Surveillance%20Videos&rft.jtitle=IEEE%20sensors%20journal&rft.au=Li,%20Suyuan&rft.date=2023-07-01&rft.volume=23&rft.issue=13&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1530-437X&rft.eissn=1558-1748&rft.coden=ISJEAZ&rft_id=info:doi/10.1109/JSEN.2023.3276891&rft_dat=%3Cproquest_RIE%3E2831508046%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2831508046&rft_id=info:pmid/&rft_ieee_id=10130115&rfr_iscdi=true