Sparse Pedestrian Character Learning for Trajectory Prediction

Pedestrian trajectory prediction in a first-person view has recently attracted much attention due to its importance in autonomous driving. Recent work utilizes pedestrian character information, i.e., action and appearance, to improve the learned trajectory embedding and achieves state-of-the-art performance. However, it neglects the invalid and negative pedestrian character information, which is harmful to trajectory representation and thus leads to performance degradation. To address this issue, we present a two-stream sparse-character-based network (TSNet) for pedestrian trajectory prediction. Specifically, TSNet learns the negative-removed characters in the sparse character representation stream to improve the trajectory embedding obtained in the trajectory representation stream. Moreover, to model the negative-removed characters, we propose a novel sparse character graph, including the sparse category and sparse temporal character graphs, to learn the different effects of various characters in category and temporal dimensions, respectively. Extensive experiments on two first-person view datasets, PIE and JAAD, show that our method outperforms existing state-of-the-art methods. In addition, ablation studies demonstrate different effects of various characters and prove that TSNet outperforms approaches without eliminating negative characters.
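
The abstract describes a two-stream design: a trajectory representation stream fused with a sparse character representation stream from which negative characters have been removed. As a rough illustration of that idea only, here is a minimal PyTorch-style sketch; the module names, feature dimensions, the LSTM encoder, the prediction horizon, and the ReLU gate standing in for the paper's learned sparse character graph are all assumptions of this sketch, not details taken from the published TSNet.

```python
# Hypothetical sketch of the two-stream idea described in the abstract:
# a trajectory stream (LSTM over observed boxes) fused with a character
# stream whose per-character features are sparsely gated so that
# unhelpful ("negative") characters are suppressed. All names and
# dimensions are illustrative, not from the paper.
import torch
import torch.nn as nn


class TwoStreamSketch(nn.Module):
    def __init__(self, char_dim=64, hidden=128, num_chars=8, pred_len=45):
        super().__init__()
        # Trajectory representation stream: encode observed (x, y, w, h) boxes.
        self.traj_encoder = nn.LSTM(4, hidden, batch_first=True)
        # Sparse character representation stream: score each character
        # (e.g., action or appearance cues) and keep only positive ones.
        self.char_scorer = nn.Linear(char_dim, 1)
        self.char_proj = nn.Linear(char_dim, hidden)
        self.decoder = nn.Linear(hidden * 2, pred_len * 4)
        self.pred_len = pred_len

    def forward(self, traj, chars):
        # traj:  (B, T_obs, 4) observed bounding boxes
        # chars: (B, num_chars, char_dim) per-category character features
        _, (h, _) = self.traj_encoder(traj)          # h: (1, B, hidden)
        traj_emb = h.squeeze(0)                      # (B, hidden)
        # Sparse gating: ReLU zeroes out negatively scored characters,
        # a crude stand-in for the paper's learned sparse character graph.
        gate = torch.relu(self.char_scorer(chars))   # (B, num_chars, 1)
        char_emb = (gate * self.char_proj(chars)).mean(dim=1)  # (B, hidden)
        fused = torch.cat([traj_emb, char_emb], dim=-1)
        out = self.decoder(fused)                    # flattened future boxes
        return out.view(-1, self.pred_len, 4)


# Usage: batch of 2 pedestrians, 15 observed frames, 8 character features.
model = TwoStreamSketch()
pred = model(torch.randn(2, 15, 4), torch.randn(2, 8, 64))
print(pred.shape)  # torch.Size([2, 45, 4])
```

The point of the gate is that characters scored non-positive contribute nothing to the fused embedding, mirroring the abstract's claim that removing invalid and negative character information protects the trajectory representation.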

Bibliographic Details
Published in: IEEE Transactions on Multimedia, 2024, Vol. 26, pp. 11070-11082
Main Authors: Dong, Yonghao; Wang, Le; Zhou, Sanping; Hua, Gang; Sun, Changyin
Format: Article
Language: English
Subjects: Accuracy; Cameras; Degradation; Long short-term memory; Pedestrian trajectory prediction; Pedestrians; Predictive models; Sparse pedestrian character learning; Trajectory
DOI: 10.1109/TMM.2024.3443591
ISSN: 1520-9210
EISSN: 1941-0077
Publisher: IEEE
Online Access: Order full text