PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction
For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on their intrinsic goals but also on the stimulation of surrounding traffic elements. Considering the influence of other traffic elements on pedestrian i...
Published in: | IEEE transactions on intelligent transportation systems 2023-12, Vol.24 (12), p.14213-14225 |
---|---|
Main authors: | Zhou, Yuchen; Tan, Guang; Zhong, Rui; Li, Yaokun; Gou, Chao |
Format: | Article |
Language: | eng |
Subjects: | |
container_end_page | 14225 |
---|---|
container_issue | 12 |
container_start_page | 14213 |
container_title | IEEE transactions on intelligent transportation systems |
container_volume | 24 |
creator | Zhou, Yuchen; Tan, Guang; Zhong, Rui; Li, Yaokun; Gou, Chao |
description | For autonomous driving, one of the major challenges is to predict pedestrian crossing intention from the ego view. A pedestrian's intention depends not only on intrinsic goals but also on stimulation from surrounding traffic elements. To account for this influence, recent work has improved performance by feeding more traffic-element information into the model. However, it is still difficult to capture and fully exploit the dynamic spatio-temporal interactions between the target pedestrian and the surrounding traffic elements for accurate reasoning. In this work, inspired by the neuroscience finding that human drivers tend to make continuous sensory-motor driving decisions under progressive visual stimulation, we propose a model termed Progressive Interaction Transformer (PIT) for pedestrian crossing intention prediction. PIT considers the local pedestrian, the global environment, and ego-vehicle motion simultaneously. In particular, a temporal fusion block and a self-attention mechanism are introduced to jointly and progressively model the dynamic spatio-temporal interactions among the three parties, allowing the model to capture richer information and make predictions in a way similar to human drivers. Experimental results demonstrate that PIT achieves higher performance than other state-of-the-art methods while preserving real-time inference. |
doi_str_mv | 10.1109/TITS.2023.3309309 |
format | Article |
fullrecord | <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TITS_2023_3309309</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10247098</ieee_id><sourcerecordid>2896027970</sourcerecordid><originalsourceid>FETCH-LOGICAL-c294t-fa19b9dce5119d78277b76140e3f674490850787e26eb930609dc2c78d6f24dc3</originalsourceid><addsrcrecordid>eNpNkF9LwzAUxYMoOKcfQPCh4HPnTZrmj28ynBYGFuyeQ9fejg7XzptO8NubbnsQLpxL-J3k5DB2z2HGOdinIis-ZwJEMksSsGEu2ISnqYkBuLocdyFjCylcsxvvt-FUppxP2CrPiucop35D6H37g1HWDUhlNbR9FxVUdr7paYcUBYlyrNEP1JZdNKc-8N3myHdHOies26Pxll015ZfHu7NO2WrxWszf4-XHWzZ_WcaVsHKIm5Lbta0rDFFsrY3Qeq0Vl4BJo7SUFkwK2mgUCtfhUwoCLCptatUIWVfJlD2e7t1T_30I0dy2P1AXnnTCWAVCWw2B4ieqGjMTNm5P7a6kX8fBje25sT03tufO7QXPw8nTIuI_XkgN1iR_IpdrdA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2896027970</pqid></control><display><type>article</type><title>PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction</title><source>IEEE Electronic Library (IEL)</source><creator>Zhou, Yuchen ; Tan, Guang ; Zhong, Rui ; Li, Yaokun ; Gou, Chao</creator><creatorcontrib>Zhou, Yuchen ; Tan, Guang ; Zhong, Rui ; Li, Yaokun ; Gou, Chao</creatorcontrib><description>For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on their intrinsic goals but also on the stimulation of surrounding traffic elements. Considering the influence of other traffic elements on pedestrian intention, recent work introduced more traffic element information into the model to successfully improve performance. However, it is still difficult to effectively capture and fully exploit the potential dynamic spatio-temporal interactions among the target pedestrian and its surrounding traffic elements for accurate reasoning. 
In this work, inspired by neuroscience that human drivers tend to make continuous sensory-motor driving decisions by progressive visual stimulation, we propose a model termed Progressive Interaction Transformer (PIT) for pedestrian crossing intention prediction. Local pedestrian, global environment, and ego-vehicle motion are considered simultaneously in the proposed PIT. In particular, the temporal fusion block and self-attention mechanism are introduced to jointly and progressively model the dynamic spatio-temporal interactions among the three parties, allowing it to capture richer information and make prediction in a similar way to human drivers. Experimental results demonstrate that PIT achieves higher performance compared with other state-of-the-arts and preserves real-time inference.</description><identifier>ISSN: 1524-9050</identifier><identifier>EISSN: 1558-0016</identifier><identifier>DOI: 10.1109/TITS.2023.3309309</identifier><identifier>CODEN: ITISFG</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Autonomous vehicles ; intelligent vehicle ; Modeling ; neuroscience ; Pedestrian crossings ; Pedestrian intention prediction ; Pedestrians ; Performance enhancement ; Predictive models ; Stimulation ; Traffic information ; traffic scene understanding ; transformer ; Transformers ; Vehicle dynamics ; Vehicles</subject><ispartof>IEEE transactions on intelligent transportation systems, 2023-12, Vol.24 (12), p.14213-14225</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c294t-fa19b9dce5119d78277b76140e3f674490850787e26eb930609dc2c78d6f24dc3</citedby><cites>FETCH-LOGICAL-c294t-fa19b9dce5119d78277b76140e3f674490850787e26eb930609dc2c78d6f24dc3</cites><orcidid>0000-0002-2470-0293 ; 0000-0002-0658-8867 ; 0000-0002-4128-886X ; 0000-0003-0833-231X ; 0000-0002-8891-1057</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10247098$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10247098$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Zhou, Yuchen</creatorcontrib><creatorcontrib>Tan, Guang</creatorcontrib><creatorcontrib>Zhong, Rui</creatorcontrib><creatorcontrib>Li, Yaokun</creatorcontrib><creatorcontrib>Gou, Chao</creatorcontrib><title>PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction</title><title>IEEE transactions on intelligent transportation systems</title><addtitle>TITS</addtitle><description>For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on their intrinsic goals but also on the stimulation of surrounding traffic elements. Considering the influence of other traffic elements on pedestrian intention, recent work introduced more traffic element information into the model to successfully improve performance. However, it is still difficult to effectively capture and fully exploit the potential dynamic spatio-temporal interactions among the target pedestrian and its surrounding traffic elements for accurate reasoning. 
In this work, inspired by neuroscience that human drivers tend to make continuous sensory-motor driving decisions by progressive visual stimulation, we propose a model termed Progressive Interaction Transformer (PIT) for pedestrian crossing intention prediction. Local pedestrian, global environment, and ego-vehicle motion are considered simultaneously in the proposed PIT. In particular, the temporal fusion block and self-attention mechanism are introduced to jointly and progressively model the dynamic spatio-temporal interactions among the three parties, allowing it to capture richer information and make prediction in a similar way to human drivers. Experimental results demonstrate that PIT achieves higher performance compared with other state-of-the-arts and preserves real-time inference.</description><subject>Autonomous vehicles</subject><subject>intelligent vehicle</subject><subject>Modeling</subject><subject>neuroscience</subject><subject>Pedestrian crossings</subject><subject>Pedestrian intention prediction</subject><subject>Pedestrians</subject><subject>Performance enhancement</subject><subject>Predictive models</subject><subject>Stimulation</subject><subject>Traffic information</subject><subject>traffic scene understanding</subject><subject>transformer</subject><subject>Transformers</subject><subject>Vehicle 
dynamics</subject><subject>Vehicles</subject><issn>1524-9050</issn><issn>1558-0016</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkF9LwzAUxYMoOKcfQPCh4HPnTZrmj28ynBYGFuyeQ9fejg7XzptO8NubbnsQLpxL-J3k5DB2z2HGOdinIis-ZwJEMksSsGEu2ISnqYkBuLocdyFjCylcsxvvt-FUppxP2CrPiucop35D6H37g1HWDUhlNbR9FxVUdr7paYcUBYlyrNEP1JZdNKc-8N3myHdHOies26Pxll015ZfHu7NO2WrxWszf4-XHWzZ_WcaVsHKIm5Lbta0rDFFsrY3Qeq0Vl4BJo7SUFkwK2mgUCtfhUwoCLCptatUIWVfJlD2e7t1T_30I0dy2P1AXnnTCWAVCWw2B4ieqGjMTNm5P7a6kX8fBje25sT03tufO7QXPw8nTIuI_XkgN1iR_IpdrdA</recordid><startdate>20231201</startdate><enddate>20231201</enddate><creator>Zhou, Yuchen</creator><creator>Tan, Guang</creator><creator>Zhong, Rui</creator><creator>Li, Yaokun</creator><creator>Gou, Chao</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-2470-0293</orcidid><orcidid>https://orcid.org/0000-0002-0658-8867</orcidid><orcidid>https://orcid.org/0000-0002-4128-886X</orcidid><orcidid>https://orcid.org/0000-0003-0833-231X</orcidid><orcidid>https://orcid.org/0000-0002-8891-1057</orcidid></search><sort><creationdate>20231201</creationdate><title>PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction</title><author>Zhou, Yuchen ; Tan, Guang ; Zhong, Rui ; Li, Yaokun ; Gou, 
Chao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c294t-fa19b9dce5119d78277b76140e3f674490850787e26eb930609dc2c78d6f24dc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Autonomous vehicles</topic><topic>intelligent vehicle</topic><topic>Modeling</topic><topic>neuroscience</topic><topic>Pedestrian crossings</topic><topic>Pedestrian intention prediction</topic><topic>Pedestrians</topic><topic>Performance enhancement</topic><topic>Predictive models</topic><topic>Stimulation</topic><topic>Traffic information</topic><topic>traffic scene understanding</topic><topic>transformer</topic><topic>Transformers</topic><topic>Vehicle dynamics</topic><topic>Vehicles</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Yuchen</creatorcontrib><creatorcontrib>Tan, Guang</creatorcontrib><creatorcontrib>Zhong, Rui</creatorcontrib><creatorcontrib>Li, Yaokun</creatorcontrib><creatorcontrib>Gou, Chao</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on intelligent transportation 
systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhou, Yuchen</au><au>Tan, Guang</au><au>Zhong, Rui</au><au>Li, Yaokun</au><au>Gou, Chao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction</atitle><jtitle>IEEE transactions on intelligent transportation systems</jtitle><stitle>TITS</stitle><date>2023-12-01</date><risdate>2023</risdate><volume>24</volume><issue>12</issue><spage>14213</spage><epage>14225</epage><pages>14213-14225</pages><issn>1524-9050</issn><eissn>1558-0016</eissn><coden>ITISFG</coden><abstract>For autonomous driving, one of the major challenges is to predict pedestrian crossing intention in ego-view. Pedestrian intention depends not only on their intrinsic goals but also on the stimulation of surrounding traffic elements. Considering the influence of other traffic elements on pedestrian intention, recent work introduced more traffic element information into the model to successfully improve performance. However, it is still difficult to effectively capture and fully exploit the potential dynamic spatio-temporal interactions among the target pedestrian and its surrounding traffic elements for accurate reasoning. In this work, inspired by neuroscience that human drivers tend to make continuous sensory-motor driving decisions by progressive visual stimulation, we propose a model termed Progressive Interaction Transformer (PIT) for pedestrian crossing intention prediction. Local pedestrian, global environment, and ego-vehicle motion are considered simultaneously in the proposed PIT. 
In particular, the temporal fusion block and self-attention mechanism are introduced to jointly and progressively model the dynamic spatio-temporal interactions among the three parties, allowing it to capture richer information and make prediction in a similar way to human drivers. Experimental results demonstrate that PIT achieves higher performance compared with other state-of-the-arts and preserves real-time inference.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TITS.2023.3309309</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-2470-0293</orcidid><orcidid>https://orcid.org/0000-0002-0658-8867</orcidid><orcidid>https://orcid.org/0000-0002-4128-886X</orcidid><orcidid>https://orcid.org/0000-0003-0833-231X</orcidid><orcidid>https://orcid.org/0000-0002-8891-1057</orcidid></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1524-9050 |
ispartof | IEEE transactions on intelligent transportation systems, 2023-12, Vol.24 (12), p.14213-14225 |
issn | 1524-9050; 1558-0016 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TITS_2023_3309309 |
source | IEEE Electronic Library (IEL) |
subjects | Autonomous vehicles; intelligent vehicle; Modeling; neuroscience; Pedestrian crossings; Pedestrian intention prediction; Pedestrians; Performance enhancement; Predictive models; Stimulation; Traffic information; traffic scene understanding; transformer; Transformers; Vehicle dynamics; Vehicles |
title | PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T19%3A01%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PIT:%20Progressive%20Interaction%20Transformer%20for%20Pedestrian%20Crossing%20Intention%20Prediction&rft.jtitle=IEEE%20transactions%20on%20intelligent%20transportation%20systems&rft.au=Zhou,%20Yuchen&rft.date=2023-12-01&rft.volume=24&rft.issue=12&rft.spage=14213&rft.epage=14225&rft.pages=14213-14225&rft.issn=1524-9050&rft.eissn=1558-0016&rft.coden=ITISFG&rft_id=info:doi/10.1109/TITS.2023.3309309&rft_dat=%3Cproquest_RIE%3E2896027970%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2896027970&rft_id=info:pmid/&rft_ieee_id=10247098&rfr_iscdi=true |