TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior

Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches...

Bibliographic details
Main authors:	Yang, Sen; Jiang, Minyue; Fan, Ziwei; Xie, Xiaolu; Tan, Xiao; Li, Yingying; Ding, Errui; Wang, Liang; Wang, Jingdong
Format:	Article
Language:	eng
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Computer Science - Robotics

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Yang, Sen; Jiang, Minyue; Fan, Ziwei; Xie, Xiaolu; Tan, Xiao; Li, Yingying; Ding, Errui; Wang, Liang; Wang, Jingdong
description	Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.
doi_str_mv	10.48550/arxiv.2411.14751
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2411_14751</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2411_14751</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2411_147513</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DM0MTc15GRwCskvyA92sVIA0Tn56ZW6rnkZiXnJqSkKPol5qQrBqem5qXklCgGpRcmpBSWZ-XkK5ZklGQrBLr6JBQoBRZn5RTwMrGmJOcWpvFCam0HezTXE2UMXbFt8QVFmbmJRZTzI1niwrcaEVQAAEpU2oA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior</title><source>arXiv.org</source><creator>Yang, Sen ; Jiang, Minyue ; Fan, Ziwei ; Xie, Xiaolu ; Tan, Xiao ; Li, Yingying ; Ding, Errui ; Wang, Liang ; Wang, Jingdong</creator><creatorcontrib>Yang, Sen ; Jiang, Minyue ; Fan, Ziwei ; Xie, Xiaolu ; Tan, Xiao ; Li, Yingying ; Ding, Errui ; Wang, Liang ; Wang, Jingdong</creatorcontrib><description>Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. 
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.</description><identifier>DOI: 10.48550/arxiv.2411.14751</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Learning ; Computer Science - Robotics</subject><creationdate>2024-11</creationdate><rights>http://creativecommons.org/licenses/by-nc-nd/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2411.14751$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2411.14751$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Yang, Sen</creatorcontrib><creatorcontrib>Jiang, Minyue</creatorcontrib><creatorcontrib>Fan, Ziwei</creatorcontrib><creatorcontrib>Xie, Xiaolu</creatorcontrib><creatorcontrib>Tan, Xiao</creatorcontrib><creatorcontrib>Li, Yingying</creatorcontrib><creatorcontrib>Ding, Errui</creatorcontrib><creatorcontrib>Wang, 
Liang</creatorcontrib><creatorcontrib>Wang, Jingdong</creatorcontrib><title>TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior</title><description>Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. 
Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Learning</subject><subject>Computer Science - Robotics</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DM0MTc15GRwCskvyA92sVIA0Tn56ZW6rnkZiXnJqSkKPol5qQrBqem5qXklCgGpRcmpBSWZ-XkK5ZklGQrBLr6JBQoBRZn5RTwMrGmJOcWpvFCam0HezTXE2UMXbFt8QVFmbmJRZTzI1niwrcaEVQAAEpU2oA</recordid><startdate>20241122</startdate><enddate>20241122</enddate><creator>Yang, Sen</creator><creator>Jiang, Minyue</creator><creator>Fan, Ziwei</creator><creator>Xie, Xiaolu</creator><creator>Tan, Xiao</creator><creator>Li, Yingying</creator><creator>Ding, Errui</creator><creator>Wang, Liang</creator><creator>Wang, Jingdong</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241122</creationdate><title>TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior</title><author>Yang, Sen ; Jiang, Minyue ; Fan, Ziwei ; Xie, Xiaolu ; Tan, Xiao ; Li, Yingying ; Ding, Errui ; Wang, Liang ; Wang, Jingdong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2411_147513</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Learning</topic><topic>Computer Science - Robotics</topic><toplevel>online_resources</toplevel><creatorcontrib>Yang, Sen</creatorcontrib><creatorcontrib>Jiang, Minyue</creatorcontrib><creatorcontrib>Fan, Ziwei</creatorcontrib><creatorcontrib>Xie, 
Xiaolu</creatorcontrib><creatorcontrib>Tan, Xiao</creatorcontrib><creatorcontrib>Li, Yingying</creatorcontrib><creatorcontrib>Ding, Errui</creatorcontrib><creatorcontrib>Wang, Liang</creatorcontrib><creatorcontrib>Wang, Jingdong</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yang, Sen</au><au>Jiang, Minyue</au><au>Fan, Ziwei</au><au>Xie, Xiaolu</au><au>Tan, Xiao</au><au>Li, Yingying</au><au>Ding, Errui</au><au>Wang, Liang</au><au>Wang, Jingdong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior</atitle><date>2024-11-22</date><risdate>2024</risdate><abstract>Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. 
To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.</abstract><doi>10.48550/arxiv.2411.14751</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2411.14751
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2411_14751
source	arXiv.org
subjects	Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Computer Science - Robotics
title	TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T04%3A32%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TopoSD:%20Topology-Enhanced%20Lane%20Segment%20Perception%20with%20SDMap%20Prior&rft.au=Yang,%20Sen&rft.date=2024-11-22&rft_id=info:doi/10.48550/arxiv.2411.14751&rft_dat=%3Carxiv_GOX%3E2411_14751%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true