PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction

For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on their intrinsic goals but also on the stimulation of surrounding traffic elements. Considering the influence of other traffic elements on pedestrian i...


Bibliographic Details
Published in: IEEE transactions on intelligent transportation systems 2023-12, Vol.24 (12), p.14213-14225
Main Authors: Zhou, Yuchen; Tan, Guang; Zhong, Rui; Li, Yaokun; Gou, Chao
Format: Article
Language: eng
container_end_page 14225
container_issue 12
container_start_page 14213
container_title IEEE transactions on intelligent transportation systems
container_volume 24
creator Zhou, Yuchen
Tan, Guang
Zhong, Rui
Li, Yaokun
Gou, Chao
description For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on the pedestrian's intrinsic goals but also on stimulation from surrounding traffic elements. Considering the influence of other traffic elements on pedestrian intention, recent work has introduced more traffic element information into the model and improved performance. However, it is still difficult to effectively capture and fully exploit the potential dynamic spatio-temporal interactions between the target pedestrian and the surrounding traffic elements for accurate reasoning. In this work, inspired by the neuroscience finding that human drivers tend to make continuous sensory-motor driving decisions through progressive visual stimulation, we propose a model termed Progressive Interaction Transformer (PIT) for pedestrian crossing intention prediction. Local pedestrian, global environment, and ego-vehicle motion are considered simultaneously in the proposed PIT. In particular, a temporal fusion block and a self-attention mechanism are introduced to jointly and progressively model the dynamic spatio-temporal interactions among the three parties, allowing the model to capture richer information and make predictions in a way similar to human drivers. Experimental results demonstrate that PIT achieves higher performance than other state-of-the-art methods while preserving real-time inference.
doi_str_mv 10.1109/TITS.2023.3309309
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TITS_2023_3309309</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10247098</ieee_id><sourcerecordid>2896027970</sourcerecordid><originalsourceid>FETCH-LOGICAL-c294t-fa19b9dce5119d78277b76140e3f674490850787e26eb930609dc2c78d6f24dc3</originalsourceid><addsrcrecordid>eNpNkF9LwzAUxYMoOKcfQPCh4HPnTZrmj28ynBYGFuyeQ9fejg7XzptO8NubbnsQLpxL-J3k5DB2z2HGOdinIis-ZwJEMksSsGEu2ISnqYkBuLocdyFjCylcsxvvt-FUppxP2CrPiucop35D6H37g1HWDUhlNbR9FxVUdr7paYcUBYlyrNEP1JZdNKc-8N3myHdHOies26Pxll015ZfHu7NO2WrxWszf4-XHWzZ_WcaVsHKIm5Lbta0rDFFsrY3Qeq0Vl4BJo7SUFkwK2mgUCtfhUwoCLCptatUIWVfJlD2e7t1T_30I0dy2P1AXnnTCWAVCWw2B4ieqGjMTNm5P7a6kX8fBje25sT03tufO7QXPw8nTIuI_XkgN1iR_IpdrdA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2896027970</pqid></control><display><type>article</type><title>PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction</title><source>IEEE Electronic Library (IEL)</source><creator>Zhou, Yuchen ; Tan, Guang ; Zhong, Rui ; Li, Yaokun ; Gou, Chao</creator><creatorcontrib>Zhou, Yuchen ; Tan, Guang ; Zhong, Rui ; Li, Yaokun ; Gou, Chao</creatorcontrib><description>For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on their intrinsic goals but also on the stimulation of surrounding traffic elements. Considering the influence of other traffic elements on pedestrian intention, recent work introduced more traffic element information into the model to successfully improve performance. However, it is still difficult to effectively capture and fully exploit the potential dynamic spatio-temporal interactions among the target pedestrian and its surrounding traffic elements for accurate reasoning. 
In this work, inspired by neuroscience that human drivers tend to make continuous sensory-motor driving decisions by progressive visual stimulation, we propose a model termed Progressive Interaction Transformer (PIT) for pedestrian crossing intention prediction. Local pedestrian, global environment, and ego-vehicle motion are considered simultaneously in the proposed PIT. In particular, the temporal fusion block and self-attention mechanism are introduced to jointly and progressively model the dynamic spatio-temporal interactions among the three parties, allowing it to capture richer information and make prediction in a similar way to human drivers. Experimental results demonstrate that PIT achieves higher performance compared with other state-of-the-arts and preserves real-time inference.</description><identifier>ISSN: 1524-9050</identifier><identifier>EISSN: 1558-0016</identifier><identifier>DOI: 10.1109/TITS.2023.3309309</identifier><identifier>CODEN: ITISFG</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Autonomous vehicles ; intelligent vehicle ; Modeling ; neuroscience ; Pedestrian crossings ; Pedestrian intention prediction ; Pedestrians ; Performance enhancement ; Predictive models ; Stimulation ; Traffic information ; traffic scene understanding ; transformer ; Transformers ; Vehicle dynamics ; Vehicles</subject><ispartof>IEEE transactions on intelligent transportation systems, 2023-12, Vol.24 (12), p.14213-14225</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c294t-fa19b9dce5119d78277b76140e3f674490850787e26eb930609dc2c78d6f24dc3</citedby><cites>FETCH-LOGICAL-c294t-fa19b9dce5119d78277b76140e3f674490850787e26eb930609dc2c78d6f24dc3</cites><orcidid>0000-0002-2470-0293 ; 0000-0002-0658-8867 ; 0000-0002-4128-886X ; 0000-0003-0833-231X ; 0000-0002-8891-1057</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10247098$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10247098$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Zhou, Yuchen</creatorcontrib><creatorcontrib>Tan, Guang</creatorcontrib><creatorcontrib>Zhong, Rui</creatorcontrib><creatorcontrib>Li, Yaokun</creatorcontrib><creatorcontrib>Gou, Chao</creatorcontrib><title>PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction</title><title>IEEE transactions on intelligent transportation systems</title><addtitle>TITS</addtitle><description>For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on their intrinsic goals but also on the stimulation of surrounding traffic elements. Considering the influence of other traffic elements on pedestrian intention, recent work introduced more traffic element information into the model to successfully improve performance. However, it is still difficult to effectively capture and fully exploit the potential dynamic spatio-temporal interactions among the target pedestrian and its surrounding traffic elements for accurate reasoning. 
In this work, inspired by neuroscience that human drivers tend to make continuous sensory-motor driving decisions by progressive visual stimulation, we propose a model termed Progressive Interaction Transformer (PIT) for pedestrian crossing intention prediction. Local pedestrian, global environment, and ego-vehicle motion are considered simultaneously in the proposed PIT. In particular, the temporal fusion block and self-attention mechanism are introduced to jointly and progressively model the dynamic spatio-temporal interactions among the three parties, allowing it to capture richer information and make prediction in a similar way to human drivers. Experimental results demonstrate that PIT achieves higher performance compared with other state-of-the-arts and preserves real-time inference.</description><subject>Autonomous vehicles</subject><subject>intelligent vehicle</subject><subject>Modeling</subject><subject>neuroscience</subject><subject>Pedestrian crossings</subject><subject>Pedestrian intention prediction</subject><subject>Pedestrians</subject><subject>Performance enhancement</subject><subject>Predictive models</subject><subject>Stimulation</subject><subject>Traffic information</subject><subject>traffic scene understanding</subject><subject>transformer</subject><subject>Transformers</subject><subject>Vehicle 
dynamics</subject><subject>Vehicles</subject><issn>1524-9050</issn><issn>1558-0016</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkF9LwzAUxYMoOKcfQPCh4HPnTZrmj28ynBYGFuyeQ9fejg7XzptO8NubbnsQLpxL-J3k5DB2z2HGOdinIis-ZwJEMksSsGEu2ISnqYkBuLocdyFjCylcsxvvt-FUppxP2CrPiucop35D6H37g1HWDUhlNbR9FxVUdr7paYcUBYlyrNEP1JZdNKc-8N3myHdHOies26Pxll015ZfHu7NO2WrxWszf4-XHWzZ_WcaVsHKIm5Lbta0rDFFsrY3Qeq0Vl4BJo7SUFkwK2mgUCtfhUwoCLCptatUIWVfJlD2e7t1T_30I0dy2P1AXnnTCWAVCWw2B4ieqGjMTNm5P7a6kX8fBje25sT03tufO7QXPw8nTIuI_XkgN1iR_IpdrdA</recordid><startdate>20231201</startdate><enddate>20231201</enddate><creator>Zhou, Yuchen</creator><creator>Tan, Guang</creator><creator>Zhong, Rui</creator><creator>Li, Yaokun</creator><creator>Gou, Chao</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2470-0293</orcidid><orcidid>https://orcid.org/0000-0002-0658-8867</orcidid><orcidid>https://orcid.org/0000-0002-4128-886X</orcidid><orcidid>https://orcid.org/0000-0003-0833-231X</orcidid><orcidid>https://orcid.org/0000-0002-8891-1057</orcidid></search><sort><creationdate>20231201</creationdate><title>PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction</title><author>Zhou, Yuchen ; Tan, Guang ; Zhong, Rui ; Li, Yaokun ; Gou, 
Chao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c294t-fa19b9dce5119d78277b76140e3f674490850787e26eb930609dc2c78d6f24dc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Autonomous vehicles</topic><topic>intelligent vehicle</topic><topic>Modeling</topic><topic>neuroscience</topic><topic>Pedestrian crossings</topic><topic>Pedestrian intention prediction</topic><topic>Pedestrians</topic><topic>Performance enhancement</topic><topic>Predictive models</topic><topic>Stimulation</topic><topic>Traffic information</topic><topic>traffic scene understanding</topic><topic>transformer</topic><topic>Transformers</topic><topic>Vehicle dynamics</topic><topic>Vehicles</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Yuchen</creatorcontrib><creatorcontrib>Tan, Guang</creatorcontrib><creatorcontrib>Zhong, Rui</creatorcontrib><creatorcontrib>Li, Yaokun</creatorcontrib><creatorcontrib>Gou, Chao</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on intelligent transportation 
systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhou, Yuchen</au><au>Tan, Guang</au><au>Zhong, Rui</au><au>Li, Yaokun</au><au>Gou, Chao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction</atitle><jtitle>IEEE transactions on intelligent transportation systems</jtitle><stitle>TITS</stitle><date>2023-12-01</date><risdate>2023</risdate><volume>24</volume><issue>12</issue><spage>14213</spage><epage>14225</epage><pages>14213-14225</pages><issn>1524-9050</issn><eissn>1558-0016</eissn><coden>ITISFG</coden><abstract>For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on their intrinsic goals but also on the stimulation of surrounding traffic elements. Considering the influence of other traffic elements on pedestrian intention, recent work introduced more traffic element information into the model to successfully improve performance. However, it is still difficult to effectively capture and fully exploit the potential dynamic spatio-temporal interactions among the target pedestrian and its surrounding traffic elements for accurate reasoning. In this work, inspired by neuroscience that human drivers tend to make continuous sensory-motor driving decisions by progressive visual stimulation, we propose a model termed Progressive Interaction Transformer (PIT) for pedestrian crossing intention prediction. Local pedestrian, global environment, and ego-vehicle motion are considered simultaneously in the proposed PIT. 
In particular, the temporal fusion block and self-attention mechanism are introduced to jointly and progressively model the dynamic spatio-temporal interactions among the three parties, allowing it to capture richer information and make prediction in a similar way to human drivers. Experimental results demonstrate that PIT achieves higher performance compared with other state-of-the-arts and preserves real-time inference.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TITS.2023.3309309</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-2470-0293</orcidid><orcidid>https://orcid.org/0000-0002-0658-8867</orcidid><orcidid>https://orcid.org/0000-0002-4128-886X</orcidid><orcidid>https://orcid.org/0000-0003-0833-231X</orcidid><orcidid>https://orcid.org/0000-0002-8891-1057</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1524-9050
ispartof IEEE transactions on intelligent transportation systems, 2023-12, Vol.24 (12), p.14213-14225
issn 1524-9050
1558-0016
language eng
recordid cdi_crossref_primary_10_1109_TITS_2023_3309309
source IEEE Electronic Library (IEL)
subjects Autonomous vehicles
intelligent vehicle
Modeling
neuroscience
Pedestrian crossings
Pedestrian intention prediction
Pedestrians
Performance enhancement
Predictive models
Stimulation
Traffic information
traffic scene understanding
transformer
Transformers
Vehicle dynamics
Vehicles
title PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T19%3A01%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PIT:%20Progressive%20Interaction%20Transformer%20for%20Pedestrian%20Crossing%20Intention%20Prediction&rft.jtitle=IEEE%20transactions%20on%20intelligent%20transportation%20systems&rft.au=Zhou,%20Yuchen&rft.date=2023-12-01&rft.volume=24&rft.issue=12&rft.spage=14213&rft.epage=14225&rft.pages=14213-14225&rft.issn=1524-9050&rft.eissn=1558-0016&rft.coden=ITISFG&rft_id=info:doi/10.1109/TITS.2023.3309309&rft_dat=%3Cproquest_RIE%3E2896027970%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2896027970&rft_id=info:pmid/&rft_ieee_id=10247098&rfr_iscdi=true