Temporal Pyramid Network With Spatial-Temporal Attention for Pedestrian Trajectory Prediction

Understanding and predicting human motion behavior with social interactions have become an increasingly crucial problem for a vast number of applications, ranging from visual navigation of autonomous vehicles to activity prediction of intelligent video surveillance. Accurately forecasting crowd moti...

Bibliographic details
Published in: IEEE transactions on network science and engineering 2022-05, Vol.9 (3), p.1006-1019
Main authors: Li, Yuanman; Liang, Rongqin; Wei, Wei; Wang, Wei; Zhou, Jiantao; Li, Xia
Format: Article
Language: eng
container_end_page 1019
container_issue 3
container_start_page 1006
container_title IEEE transactions on network science and engineering
container_volume 9
creator Li, Yuanman
Liang, Rongqin
Wei, Wei
Wang, Wei
Zhou, Jiantao
Li, Xia
description Understanding and predicting human motion behavior with social interactions has become an increasingly crucial problem for a vast number of applications, ranging from visual navigation of autonomous vehicles to activity prediction in intelligent video surveillance. Accurately forecasting crowd motion behavior is challenging due to the multimodal nature of trajectories and the complex social interactions between humans. Recent algorithms model and predict the trajectory at a single resolution, which makes it difficult for them to exploit the long-range and short-range information of the motion behavior simultaneously. In this paper, we propose a temporal pyramid network for pedestrian trajectory prediction through a squeeze modulation and a dilation modulation. The hierarchical design of our framework allows the trajectory to be modeled at multiple resolutions, and can therefore better capture the motion behavior at various tempos. By progressively combining the global context with the local one, we construct a coarse-to-fine hierarchical pedestrian trajectory prediction framework with multi-supervision. Further, we introduce a unified spatial-temporal attention mechanism to adaptively select important information about surrounding persons in both the spatial and temporal domains. We show that our attention strategy is intuitive and effective in encoding the influence of social interactions. Experimental results on two benchmarks demonstrate the superiority of our proposed scheme.
doi_str_mv 10.1109/TNSE.2021.3065019
format Article
publisher Piscataway: IEEE
coden ITNSD5
eissn 2334-329X
date 2022-05-01
tpages 14
ieee_id 9373939
pqid 2669159214
rights Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
toplevel peer_reviewed
online_resources
linktohtml https://ieeexplore.ieee.org/document/9373939
orcid https://orcid.org/0000-0002-5987-738X
https://orcid.org/0000-0002-6015-2618
https://orcid.org/0000-0002-7566-2995
https://orcid.org/0000-0002-7313-6561
https://orcid.org/0000-0002-8043-9966
https://orcid.org/0000-0002-9134-4866
collection IEEE All-Society Periodicals Package (ASPP) 2005-present
IEEE All-Society Periodicals Package (ASPP) 1998-Present
IEEE Xplore
CrossRef
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
fulltext fulltext_linktorsrc
identifier ISSN: 2327-4697
ispartof IEEE transactions on network science and engineering, 2022-05, Vol.9 (3), p.1006-1019
issn 2327-4697
2334-329X
language eng
recordid cdi_proquest_journals_2669159214
source IEEE Xplore
subjects Algorithms
Autonomous navigation
Computational modeling
Deep learning
Feature extraction
Human motion
Modulation
Prediction algorithms
Predictions
Predictive models
social behavior
social computing
Social factors
Social interaction
social interactions
spatial-temporal attention
Task analysis
temporal pyramid network
Trajectories
Trajectory
trajectory prediction
title Temporal Pyramid Network With Spatial-Temporal Attention for Pedestrian Trajectory Prediction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T16%3A46%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Temporal%20Pyramid%20Network%20With%20Spatial-Temporal%20Attention%20for%20Pedestrian%20Trajectory%20Prediction&rft.jtitle=IEEE%20transactions%20on%20network%20science%20and%20engineering&rft.au=Li,%20Yuanman&rft.date=2022-05-01&rft.volume=9&rft.issue=3&rft.spage=1006&rft.epage=1019&rft.pages=1006-1019&rft.issn=2327-4697&rft.eissn=2334-329X&rft.coden=ITNSD5&rft_id=info:doi/10.1109/TNSE.2021.3065019&rft_dat=%3Cproquest_RIE%3E2669159214%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2669159214&rft_id=info:pmid/&rft_ieee_id=9373939&rfr_iscdi=true