TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior

Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches...

Bibliographic details
Main authors: Yang, Sen; Jiang, Minyue; Fan, Ziwei; Xie, Xiaolu; Tan, Xiao; Li, Yingying; Ding, Errui; Wang, Liang; Wang, Jingdong
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Yang, Sen
Jiang, Minyue
Fan, Ziwei
Xie, Xiaolu
Tan, Xiao
Li, Yingying
Ding, Errui
Wang, Liang
Wang, Jingdong
description Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.
doi_str_mv 10.48550/arxiv.2411.14751
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2411_14751</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2411_14751</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2411_147513</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DM0MTc15GRwCskvyA92sVIA0Tn56ZW6rnkZiXnJqSkKPol5qQrBqem5qXklCgGpRcmpBSWZ-XkK5ZklGQrBLr6JBQoBRZn5RTwMrGmJOcWpvFCam0HezTXE2UMXbFt8QVFmbmJRZTzI1niwrcaEVQAAEpU2oA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior</title><source>arXiv.org</source><creator>Yang, Sen ; Jiang, Minyue ; Fan, Ziwei ; Xie, Xiaolu ; Tan, Xiao ; Li, Yingying ; Ding, Errui ; Wang, Liang ; Wang, Jingdong</creator><creatorcontrib>Yang, Sen ; Jiang, Minyue ; Fan, Ziwei ; Xie, Xiaolu ; Tan, Xiao ; Li, Yingying ; Ding, Errui ; Wang, Liang ; Wang, Jingdong</creatorcontrib><description>Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. 
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.</description><identifier>DOI: 10.48550/arxiv.2411.14751</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Learning ; Computer Science - Robotics</subject><creationdate>2024-11</creationdate><rights>http://creativecommons.org/licenses/by-nc-nd/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2411.14751$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2411.14751$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Yang, Sen</creatorcontrib><creatorcontrib>Jiang, Minyue</creatorcontrib><creatorcontrib>Fan, Ziwei</creatorcontrib><creatorcontrib>Xie, Xiaolu</creatorcontrib><creatorcontrib>Tan, Xiao</creatorcontrib><creatorcontrib>Li, Yingying</creatorcontrib><creatorcontrib>Ding, Errui</creatorcontrib><creatorcontrib>Wang, 
Liang</creatorcontrib><creatorcontrib>Wang, Jingdong</creatorcontrib><title>TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior</title><description>Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. 
Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Learning</subject><subject>Computer Science - Robotics</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DM0MTc15GRwCskvyA92sVIA0Tn56ZW6rnkZiXnJqSkKPol5qQrBqem5qXklCgGpRcmpBSWZ-XkK5ZklGQrBLr6JBQoBRZn5RTwMrGmJOcWpvFCam0HezTXE2UMXbFt8QVFmbmJRZTzI1niwrcaEVQAAEpU2oA</recordid><startdate>20241122</startdate><enddate>20241122</enddate><creator>Yang, Sen</creator><creator>Jiang, Minyue</creator><creator>Fan, Ziwei</creator><creator>Xie, Xiaolu</creator><creator>Tan, Xiao</creator><creator>Li, Yingying</creator><creator>Ding, Errui</creator><creator>Wang, Liang</creator><creator>Wang, Jingdong</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241122</creationdate><title>TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior</title><author>Yang, Sen ; Jiang, Minyue ; Fan, Ziwei ; Xie, Xiaolu ; Tan, Xiao ; Li, Yingying ; Ding, Errui ; Wang, Liang ; Wang, Jingdong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2411_147513</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Learning</topic><topic>Computer Science - Robotics</topic><toplevel>online_resources</toplevel><creatorcontrib>Yang, Sen</creatorcontrib><creatorcontrib>Jiang, Minyue</creatorcontrib><creatorcontrib>Fan, Ziwei</creatorcontrib><creatorcontrib>Xie, 
Xiaolu</creatorcontrib><creatorcontrib>Tan, Xiao</creatorcontrib><creatorcontrib>Li, Yingying</creatorcontrib><creatorcontrib>Ding, Errui</creatorcontrib><creatorcontrib>Wang, Liang</creatorcontrib><creatorcontrib>Wang, Jingdong</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yang, Sen</au><au>Jiang, Minyue</au><au>Fan, Ziwei</au><au>Xie, Xiaolu</au><au>Tan, Xiao</au><au>Li, Yingying</au><au>Ding, Errui</au><au>Wang, Liang</au><au>Wang, Jingdong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior</atitle><date>2024-11-22</date><risdate>2024</risdate><abstract>Recent advances in autonomous driving systems have shifted towards reducing reliance on high-definition maps (HDMaps) due to the huge costs of annotation and maintenance. Instead, researchers are focusing on online vectorized HDMap construction using on-board sensors. However, sensor-only approaches still face challenges in long-range perception due to the restricted views imposed by the mounting angles of onboard cameras, just as human drivers also rely on bird's-eye-view navigation maps for a comprehensive understanding of road structures. To address these issues, we propose to train the perception model to "see" standard definition maps (SDMaps). We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information to improve the bird's eye view (BEV) feature for lane geometry and topology decoding. Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology. 
To further enhance the ability of geometry prediction and topology reasoning, we also use a topology-guided decoder to refine the predictions by exploiting the mutual relationships between topological and geometric features. We perform extensive experiments on OpenLane-V2 datasets to validate the proposed method. The results show that our model outperforms state-of-the-art methods by a large margin, with gains of +6.7 and +9.1 on the mAP and topology metrics. Our analysis also reveals that models trained with SDMap noise augmentation exhibit enhanced robustness.</abstract><doi>10.48550/arxiv.2411.14751</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2411.14751
ispartof
issn
language eng
recordid cdi_arxiv_primary_2411_14751
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computer Vision and Pattern Recognition
Computer Science - Learning
Computer Science - Robotics
title TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T04%3A32%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TopoSD:%20Topology-Enhanced%20Lane%20Segment%20Perception%20with%20SDMap%20Prior&rft.au=Yang,%20Sen&rft.date=2024-11-22&rft_id=info:doi/10.48550/arxiv.2411.14751&rft_dat=%3Carxiv_GOX%3E2411_14751%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true