Sparse Pedestrian Character Learning for Trajectory Prediction

Pedestrian trajectory prediction in a first-person view has recently attracted much attention due to its importance in autonomous driving. Recent work utilizes pedestrian character information, i.e., action and appearance, to improve the learned trajectory embedding and achieves state-of-the-art performance. However, it neglects the invalid and negative pedestrian character information, which is harmful to trajectory representation and thus leads to performance degradation. To address this issue, we present a two-stream sparse-character-based network (TSNet) for pedestrian trajectory prediction. Specifically, TSNet learns the negative-removed characters in the sparse character representation stream to improve the trajectory embedding obtained in the trajectory representation stream. Moreover, to model the negative-removed characters, we propose a novel sparse character graph, including the sparse category and sparse temporal character graphs, to learn the different effects of various characters in category and temporal dimensions, respectively. Extensive experiments on two first-person view datasets, PIE and JAAD, show that our method outperforms existing state-of-the-art methods. In addition, ablation studies demonstrate different effects of various characters and prove that TSNet outperforms approaches without eliminating negative characters.
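
The abstract describes a two-stream design: a trajectory representation stream fused with a sparse character representation stream from which negative characters have been removed. As a rough illustration of that idea only, here is a minimal PyTorch-style sketch; the module names, feature dimensions, the LSTM encoder, the prediction horizon, and the ReLU gate standing in for the paper's learned sparse character graph are all assumptions of this sketch, not details taken from the published TSNet.

```python
# Hypothetical sketch of the two-stream idea described in the abstract:
# a trajectory stream (LSTM over observed boxes) fused with a character
# stream whose per-character features are sparsely gated so that
# unhelpful ("negative") characters are suppressed. All names and
# dimensions are illustrative, not from the paper.
import torch
import torch.nn as nn


class TwoStreamSketch(nn.Module):
    def __init__(self, char_dim=64, hidden=128, num_chars=8, pred_len=45):
        super().__init__()
        # Trajectory representation stream: encode observed (x, y, w, h) boxes.
        self.traj_encoder = nn.LSTM(4, hidden, batch_first=True)
        # Sparse character representation stream: score each character
        # (e.g., action or appearance cues) and keep only positive ones.
        self.char_scorer = nn.Linear(char_dim, 1)
        self.char_proj = nn.Linear(char_dim, hidden)
        self.decoder = nn.Linear(hidden * 2, pred_len * 4)
        self.pred_len = pred_len

    def forward(self, traj, chars):
        # traj:  (B, T_obs, 4) observed bounding boxes
        # chars: (B, num_chars, char_dim) per-category character features
        _, (h, _) = self.traj_encoder(traj)          # h: (1, B, hidden)
        traj_emb = h.squeeze(0)                      # (B, hidden)
        # Sparse gating: ReLU zeroes out negatively scored characters,
        # a crude stand-in for the paper's learned sparse character graph.
        gate = torch.relu(self.char_scorer(chars))   # (B, num_chars, 1)
        char_emb = (gate * self.char_proj(chars)).mean(dim=1)  # (B, hidden)
        fused = torch.cat([traj_emb, char_emb], dim=-1)
        out = self.decoder(fused)                    # flattened future boxes
        return out.view(-1, self.pred_len, 4)


# Usage: batch of 2 pedestrians, 15 observed frames, 8 character features.
model = TwoStreamSketch()
pred = model(torch.randn(2, 15, 4), torch.randn(2, 8, 64))
print(pred.shape)  # torch.Size([2, 45, 4])
```

The point of the gate is that characters scored non-positive contribute nothing to the fused embedding, mirroring the abstract's claim that removing invalid and negative character information protects the trajectory representation.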

Bibliographic Details
Published in: IEEE Transactions on Multimedia, 2024, Vol. 26, pp. 11070-11082
Main Authors: Dong, Yonghao; Wang, Le; Zhou, Sanping; Hua, Gang; Sun, Changyin
Format: Article
Language: English
Subjects: Accuracy; Cameras; Degradation; Long short-term memory; Pedestrian trajectory prediction; Pedestrians; Predictive models; Sparse pedestrian character learning; Trajectory
DOI: 10.1109/TMM.2024.3443591
ISSN: 1520-9210
EISSN: 1941-0077
Publisher: IEEE
Online Access: Order full text