Future Frame Prediction Network for Human Fall Detection in Surveillance Videos


Bibliographic Details
Published in: IEEE sensors journal 2023-07, Vol.23 (13), p.1-1
Main authors: Li, Suyuan; Song, Xin
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 1
container_issue 13
container_start_page 1
container_title IEEE sensors journal
container_volume 23
creator Li, Suyuan
Song, Xin
description Video fall detection is one of the most significant challenges in the computer vision domain, and it usually involves recognizing events that do not conform to expected normal behavior. Recently, unsupervised models have become popular for addressing problems that, under supervised learning, would require substantial manually labeled training data. However, almost all existing unsupervised methods minimize a reconstruction error, which may leave an insufficient gap between the reconstruction errors of fall and non-fall video frames because of the powerful representational ability of neural networks. In this paper, we propose a novel, efficient fall detection method based on future frame prediction. Specifically, an attention U-Net with flexible global aggregation blocks, which achieves better performance, serves as the frame prediction network, so that several past video frames are used to predict the next frame. In the training phase, commonly used appearance constraints on intensity and gradient are combined with a motion constraint to generate higher-quality frames. These constraints improve the prediction network and enlarge the difference between the predicted and real frames during a fall. In the testing phase, a fall score based on the error between the predicted frame and the real frame is computed to distinguish fall events. Exhaustive experiments have been conducted on the UR Fall dataset, the Multiple Cameras Fall dataset, and the High Quality Fall Simulation dataset; the results verify the effectiveness of the proposed method, which outperforms existing state-of-the-art methods.
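The abstract describes a testing-phase fall score computed from the error between the predicted and real frame, and training-phase appearance constraints on intensity and gradient. A minimal NumPy sketch of these two pieces follows; it is not the authors' implementation. The PSNR-based score with min-max normalization follows common practice in frame-prediction anomaly detection, and the loss weights, decision threshold, and optical-flow motion constraint are omitted here as implementation details.

```python
import numpy as np

# --- Testing phase: fall score from prediction error ------------------

def psnr(pred, real, data_range=1.0):
    """Peak signal-to-noise ratio between predicted and observed frame."""
    mse = np.mean((pred.astype(np.float64) - real.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def fall_scores(psnr_per_frame):
    """Min-max normalize PSNR over a video clip.

    A low score means a large prediction error, i.e. a likely fall;
    the decision threshold is dataset-dependent.
    """
    p = np.asarray(psnr_per_frame, dtype=np.float64)
    return (p - p.min()) / (p.max() - p.min() + 1e-8)

# --- Training phase: appearance constraints (illustrative) ------------

def intensity_loss(pred, real):
    # L2 penalty on raw pixel intensities.
    return float(np.mean((pred - real) ** 2))

def gradient_loss(pred, real):
    # L1 penalty on horizontal/vertical image-gradient magnitudes, which
    # sharpens predicted frames compared with an intensity loss alone.
    gx = np.abs(np.diff(pred, axis=1)) - np.abs(np.diff(real, axis=1))
    gy = np.abs(np.diff(pred, axis=0)) - np.abs(np.diff(real, axis=0))
    return float(np.mean(np.abs(gx)) + np.mean(np.abs(gy)))
```

In use, frames with a normalized score below some chosen threshold would be flagged as falls; the threshold and loss weights would have to be tuned per dataset.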
doi_str_mv 10.1109/JSEN.2023.3276891
format Article
fullrecord publisher: New York: IEEE; CODEN: ISJEAZ; ISSN: 1530-437X; EISSN: 1558-1748; IEEE document id: 10130115; ProQuest record id: 2831508046; ORCID: 0000-0001-6700-8670; rights: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
fulltext fulltext_linktorsrc
identifier ISSN: 1530-437X
ispartof IEEE sensors journal, 2023-07, Vol.23 (13), p.1-1
issn 1530-437X
1558-1748
language eng
recordid cdi_proquest_journals_2831508046
source IEEE Electronic Library (IEL)
subjects attention gate
Cameras
Computer vision
Datasets
Errors
Fall detection
Feature extraction
Frames (data processing)
Machine learning
Neural networks
Optical flow
prediction network
Predictions
Reconstruction
Sensors
Supervised learning
Training
U-Net
Videos
Wearable sensors
title Future Frame Prediction Network for Human Fall Detection in Surveillance Videos
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T14%3A20%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Future%20Frame%20Prediction%20Network%20for%20Human%20Fall%20Detection%20in%20Surveillance%20Videos&rft.jtitle=IEEE%20sensors%20journal&rft.au=Li,%20Suyuan&rft.date=2023-07-01&rft.volume=23&rft.issue=13&rft.spage=1&rft.epage=1&rft.pages=1-1&rft.issn=1530-437X&rft.eissn=1558-1748&rft.coden=ISJEAZ&rft_id=info:doi/10.1109/JSEN.2023.3276891&rft_dat=%3Cproquest_RIE%3E2831508046%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2831508046&rft_id=info:pmid/&rft_ieee_id=10130115&rfr_iscdi=true