Using Direct Acyclic Graphs to Enhance Skeleton-Based Action Recognition with a Linear-Map Convolution Neural Network
Published in: | Sensors (Basel, Switzerland), 2021-04, Vol. 21 (9), p. 3112 |
---|---|
Main authors: | Tan, Tan-Hsu; Hus, Jin-Hao; Liu, Shing-Hong; Huang, Yung-Fa; Gochoo, Munkhjargal |
Format: | Article |
Language: | English |
Subjects: | action recognition; Aged; Algorithms; Artificial intelligence; Color texture; Convolution; Databases, Factual; Datasets; direct acyclic graph; Human Activities; Human activity recognition; Humans; Joints (anatomy); linear-map convolutional neural network; Matrix methods; Monitoring; Motion perception; Moving object recognition; Neural networks; Neural Networks, Computer; Older people; Population; Posture; Skeleton; spatial feature; Strain gauges; temporal feature; Video |
Online access: | Full text |
Abstract: | Research on human activity recognition could be utilized for monitoring elderly people who live alone, reducing the cost of home care. Video sensors can easily be deployed in different zones of a house to achieve such monitoring. The goal of this study is to employ a linear-map convolutional neural network (CNN) to perform action recognition on RGB videos. To reduce the amount of training data, the posture information is represented by skeleton data extracted from 300 frames of one video. The two-stream method was applied to increase recognition accuracy by using the spatial and motion features of the skeleton sequences. The relations of adjacent skeletal joints were employed to build the direct acyclic graph (DAG) matrices: the source matrix and the target matrix. The two features were transferred by the DAG matrices and expanded into color texture images. The linear-map CNN has a two-dimensional linear map at the beginning of each layer to adjust the number of channels, and a two-dimensional CNN was used to recognize the actions. RGB videos from the action recognition datasets of the NTU RGB+D database, established by the Rapid-Rich Object Search Lab, were used for model training and performance evaluation. The experimental results show that the obtained precision, recall, specificity, F1-score, and accuracy were 86.9%, 86.1%, 99.9%, 86.3%, and 99.5%, respectively, in the cross-subject evaluation, and 94.8%, 94.7%, 99.9%, 94.7%, and 99.9%, respectively, in the cross-view evaluation. An important contribution of this work is that, by using the skeleton sequences to produce the spatial and motion features and the DAG matrices to enhance the relations of adjacent skeletal joints, the computation is faster than traditional schemes that convolve single-frame images. This work therefore exhibits practical potential for real-life action recognition. |
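
The abstract describes two building blocks: DAG source/target matrices derived from adjacent skeletal joints, and a per-layer two-dimensional linear map that adjusts channel counts before convolution. The sketch below is a minimal illustration of those ideas, not the authors' implementation; the `NTU_EDGES` bone list, the `dag_matrices` and `LinearMapBlock` names, the way the two feature streams are stacked, and all tensor shapes are assumptions made for demonstration.

```python
# Hedged sketch (not the paper's code): DAG source/target matrices for a 25-joint
# skeleton, plus a linear-map block that adjusts channels with a 1x1 convolution.
import numpy as np
import torch
import torch.nn as nn

# Assumed bone list for a 25-joint NTU RGB+D-style skeleton: (source_joint, target_joint).
# The exact edge list used in the paper is not given here; this is a commonly used layout.
NTU_EDGES = [
    (0, 1), (1, 20), (20, 2), (2, 3),          # spine and head
    (20, 4), (4, 5), (5, 6), (6, 7),           # left arm
    (20, 8), (8, 9), (9, 10), (10, 11),        # right arm
    (0, 12), (12, 13), (13, 14), (14, 15),     # left leg
    (0, 16), (16, 17), (17, 18), (18, 19),     # right leg
    (7, 21), (7, 22), (11, 23), (11, 24),      # hand tips and thumbs
]

def dag_matrices(edges, num_joints=25):
    """Build incidence matrices of the skeleton DAG (assumed construction).

    Each column corresponds to one directed bone; the source matrix marks the
    parent joint of the bone, the target matrix marks the child joint.
    """
    source = np.zeros((num_joints, len(edges)), dtype=np.float32)
    target = np.zeros((num_joints, len(edges)), dtype=np.float32)
    for k, (s, t) in enumerate(edges):
        source[s, k] = 1.0
        target[t, k] = 1.0
    return source, target

class LinearMapBlock(nn.Module):
    """One layer of the assumed 'linear-map CNN': a per-position (1x1) linear map
    adjusts the channel count, then an ordinary 2-D convolution follows."""

    def __init__(self, in_channels, mapped_channels, out_channels):
        super().__init__()
        self.linear_map = nn.Conv2d(in_channels, mapped_channels, kernel_size=1)
        self.conv = nn.Conv2d(mapped_channels, out_channels, kernel_size=3, padding=1)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.conv(self.linear_map(x)))

if __name__ == "__main__":
    S, T = dag_matrices(NTU_EDGES)
    # Dummy joint coordinates for 300 frames: (frames, joints, xyz).
    joints = np.random.rand(300, 25, 3).astype(np.float32)
    # Gather the source-joint and target-joint coordinates of every bone via the DAG matrices.
    bone_src = np.einsum("je,tjc->tec", S, joints)   # (300, 24 bones, 3)
    bone_tgt = np.einsum("je,tjc->tec", T, joints)   # (300, 24 bones, 3)
    # Stack the two streams as a pseudo colour-texture image: (batch, channels, frames, bones).
    image = torch.from_numpy(
        np.concatenate([bone_src, bone_tgt], axis=-1)
    ).permute(2, 0, 1).unsqueeze(0)                  # (1, 6, 300, 24)
    block = LinearMapBlock(in_channels=6, mapped_channels=16, out_channels=32)
    print(block(image).shape)                        # torch.Size([1, 32, 300, 24])
```

Expressing the channel adjustment as a 1x1 convolution keeps it a purely per-position linear map, which is one plausible reading of the abstract's "two-dimensional linear map at the beginning of each layer".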
DOI: | 10.3390/s21093112 |
ISSN: | 1424-8220 |
PMID: | 33946998 |