LAPTNet-FPN: Multi-scale LiDAR-aided Projective Transform Network for Real Time Semantic Grid Prediction

Bibliographic details
Main authors: Diaz-Zapata, Manuel Alejandro; González, David Sierra; Erkent, Özgür; Dibangoye, Jilles; Laugier, Christian
Format: Article
Language: English
Subjects: Computer Science - Artificial Intelligence; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Robotics
Online access: Order full text
creator Diaz-Zapata, Manuel Alejandro ; González, David Sierra ; Erkent, Özgür ; Dibangoye, Jilles ; Laugier, Christian
description Semantic grids can be useful representations of the scene around an autonomous system. By having information about the layout of the space around itself, a robot can leverage this type of representation for crucial tasks such as navigation or tracking. By fusing information from multiple sensors, robustness can be increased and the computational load for the task can be lowered, achieving real time performance. Our multi-scale LiDAR-Aided Perspective Transform network uses information available in point clouds to guide the projection of image features to a top-view representation, resulting in a relative improvement in the state of the art for semantic grid generation for human (+8.67%) and movable object (+49.07%) classes in the nuScenes dataset, as well as achieving results close to the state of the art for the vehicle, drivable area and walkway classes, while performing inference at 25 FPS.
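
The abstract describes using LiDAR point clouds to guide the projection of camera features into a top-view representation. Below is a minimal NumPy sketch of that general idea, not the authors' LAPTNet-FPN implementation: a pinhole camera model projects LiDAR points into the image, the feature each point lands on is looked up, and the features are scattered into a bird's-eye-view grid at the points' ground locations. The function name, shapes, grid parameters, and calibration conventions are all assumptions made for illustration.

# Minimal sketch of LiDAR-guided projection of camera features into a
# top-view (bird's-eye-view) grid. Illustrative only; NOT the authors'
# LAPTNet-FPN implementation. All names, shapes, and conventions here
# are assumptions.
import numpy as np

def lidar_guided_bev(points_ego, feat_map, K, T_cam_from_ego,
                     grid_range=50.0, cell_size=0.5):
    """Scatter per-pixel image features into a BEV grid at the ground
    locations given by LiDAR points.

    points_ego:     (N, 3) LiDAR points in the ego frame.
    feat_map:       (C, H, W) image feature map (assumed image-sized).
    K:              (3, 3) camera intrinsics.
    T_cam_from_ego: (4, 4) rigid transform from ego to camera frame.
    """
    C, H, W = feat_map.shape
    n_cells = int(2 * grid_range / cell_size)
    bev = np.zeros((C, n_cells, n_cells), dtype=feat_map.dtype)
    counts = np.zeros((n_cells, n_cells), dtype=np.int64)

    # Transform points into the camera frame; keep those in front of it.
    pts_h = np.hstack([points_ego, np.ones((len(points_ego), 1))])
    pts_cam = (T_cam_from_ego @ pts_h.T)[:3]           # (3, N)
    in_front = pts_cam[2] > 1e-3
    pts_cam = pts_cam[:, in_front]
    pts_ego_kept = points_ego[in_front]

    # Pinhole projection to integer pixel coordinates.
    uv = K @ pts_cam
    uv = (uv[:2] / uv[2]).round().astype(int)          # (2, M)
    in_img = (uv[0] >= 0) & (uv[0] < W) & (uv[1] >= 0) & (uv[1] < H)

    # Look up the image feature each surviving point falls on.
    feats = feat_map[:, uv[1, in_img], uv[0, in_img]]  # (C, M')

    # Scatter features into BEV cells indexed by the points' (x, y),
    # mean-pooling when several points land in the same cell.
    xy = pts_ego_kept[in_img, :2]
    ij = ((xy + grid_range) / cell_size).astype(int)
    in_grid = (ij >= 0).all(1) & (ij < n_cells).all(1)
    for (i, j), f in zip(ij[in_grid], feats[:, in_grid].T):
        bev[:, i, j] += f
        counts[i, j] += 1
    bev[:, counts > 0] /= counts[counts > 0]
    return bev

With toy inputs, e.g. points_ego = np.random.randn(1000, 3) * 20 and feat_map = np.random.rand(64, 256, 256).astype(np.float32), the function returns a (64, 200, 200) grid. A learned approach like the one the abstract describes would replace the hard scatter with trainable layers and fuse several feature scales, as the FPN in the title suggests.
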
doi_str_mv 10.48550/arxiv.2302.06414
format Article
identifier DOI: 10.48550/arxiv.2302.06414
language eng
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computer Vision and Pattern Recognition
Computer Science - Robotics
title LAPTNet-FPN: Multi-scale LiDAR-aided Projective Transform Network for Real Time Semantic Grid Prediction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T17%3A26%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=LAPTNet-FPN:%20Multi-scale%20LiDAR-aided%20Projective%20Transform%20Network%20for%20Real%20Time%20Semantic%20Grid%20Prediction&rft.au=Diaz-Zapata,%20Manuel%20Alejandro&rft.date=2023-02-10&rft_id=info:doi/10.48550/arxiv.2302.06414&rft_dat=%3Carxiv_GOX%3E2302_06414%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true