ImLiDAR: Cross-Sensor Dynamic Message Propagation Network for 3-D Object Detection

LiDAR and cameras are two different sensors that supply geometric (point clouds) and semantic (RGB images) information about 3-D scenes. However, existing methods still struggle to fuse data from the two sensors so that they complement each other for high-quality 3-D object detection (3OD). We propose...

Detailed description

Bibliographic details
Published in: IEEE transactions on geoscience and remote sensing, 2023, Vol. 61, p. 1-13
Main authors: Shen, Yiyang; Yu, Rongwei; Wu, Peng; Xie, Haoran; Gong, Lina; Qin, Jing; Wei, Mingqiang
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 13
container_issue
container_start_page 1
container_title IEEE transactions on geoscience and remote sensing
container_volume 61
creator Shen, Yiyang
Yu, Rongwei
Wu, Peng
Xie, Haoran
Gong, Lina
Qin, Jing
Wei, Mingqiang
description LiDAR and cameras are two different sensors that supply geometric (point clouds) and semantic (RGB images) information about 3-D scenes. However, existing methods still struggle to fuse data from the two sensors so that they complement each other for high-quality 3-D object detection (3OD). We propose ImLiDAR, a new 3OD paradigm that narrows the cross-sensor discrepancies by progressively fusing the multiscale features of camera Images and LiDAR point clouds. ImLiDAR provides the detection head with robustly fused cross-sensor features. To achieve this, ImLiDAR has two core designs. First, we propose a cross-sensor dynamic message propagation (CDMP) module to combine the best of the multiscale image and point features. Second, we formulate a direct set prediction problem that allows designing an effective set-based detector (SD) to tackle the inconsistency between the classification and localization confidences, and the sensitivity to hand-tuned hyperparameters. In addition, the SD is detachable and can be easily integrated into various detection networks. Comparisons on the KITTI, nuScenes, and SUN-RGBD datasets all show clear visual and numerical improvements of ImLiDAR over 45 state-of-the-art 3OD methods.
doi_str_mv 10.1109/TGRS.2023.3321138
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_10268462</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10268462</ieee_id><sourcerecordid>2878485483</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-79f4f3aa45b4dc5ce88035c7f7b6ebae8b2aa7e25a014c115afb73ba3a5e87663</originalsourceid><addsrcrecordid>eNpNkE1PwkAQhjdGExH9ASYeNvFc3M_u1huhiiRVDOB5s12npCgt7pYY_r3bwMHTe3lm5p0HoVtKRpSS7GE1XSxHjDA-4pxRyvUZGlApdUJSIc7RgNAsTZjO2CW6CmFDCBWSqgFazLZFnY8Xj3ji2xCSJTSh9Tg_NHZbO_wKIdg14Hff7uzadnXb4Dfoflv_havI8STH83IDrsM5dDEicI0uKvsd4OaUQ_Tx_LSavCTFfDqbjIvExR5dorJKVNxaIUvx6aQDrQmXTlWqTKG0oEtmrQImbezqKJW2KhUvLbcStEpTPkT3x7073_7sIXRm0-59E08appUWWgrNI0WPlOv_81CZna-31h8MJaZXZ3p1pldnTurizN1xpgaAfzxLtUgZ_wOoR2nx</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2878485483</pqid></control><display><type>article</type><title>ImLiDAR: Cross-Sensor Dynamic Message Propagation Network for 3-D Object Detection</title><source>IEEE Electronic Library (IEL)</source><creator>Shen, Yiyang ; Yu, Rongwei ; Wu, Peng ; Xie, Haoran ; Gong, Lina ; Qin, Jing ; Wei, Mingqiang</creator><creatorcontrib>Shen, Yiyang ; Yu, Rongwei ; Wu, Peng ; Xie, Haoran ; Gong, Lina ; Qin, Jing ; Wei, Mingqiang</creatorcontrib><description>LiDAR and camera, as two different sensors, supply geometric (point clouds) and semantic (RGB images) information of 3-D scenes. However, it is still challenging for existing methods to fuse data from the two cross sensors, making them complementary for quality 3-D object detection (3OD). We propose ImLiDAR, a new 3OD paradigm to narrow the cross-sensor discrepancies by progressively fusing the multiscale features of camera Images and LiDAR point clouds. ImLiDAR enables to provide the detection head with cross-sensor yet robustly fused features. To achieve this, two core designs exist in ImLiDAR. 
First, we propose a cross-sensor dynamic message propagation (CDMP) module to combine the best of the multiscale image and point features. Second, we raise a direct set prediction problem that allows designing an effective set-based detector (SD) to tackle the inconsistency of the classification and localization confidences, and the sensitivity of hand-tuned hyperparameters. Besides, the novel SD can be detachable and easily integrated into various detection networks. Comparisons on the KITTI, nuScenes, and SUN-RGBD datasets all show clear visual and numerical improvements of our ImLiDAR over 45 state-of-the-art 3OD methods.</description><identifier>ISSN: 0196-2892</identifier><identifier>EISSN: 1558-0644</identifier><identifier>DOI: 10.1109/TGRS.2023.3321138</identifier><identifier>CODEN: IGRSD2</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>3-D object detection (3OD) ; Cameras ; Color imagery ; cross sensors ; Detection ; Detectors ; dynamic message propagation ; Feature extraction ; ImLiDAR ; Laser radar ; Lidar ; Localization ; Messages ; Object detection ; Object recognition ; Point cloud compression ; Sensors ; set-based detector (SD) ; Three dimensional models ; Three-dimensional displays</subject><ispartof>IEEE transactions on geoscience and remote sensing, 2023, Vol.61, p.1-13</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c289t-79f4f3aa45b4dc5ce88035c7f7b6ebae8b2aa7e25a014c115afb73ba3a5e87663</cites><orcidid>0000-0002-5272-6706 ; 0000-0002-2961-0860 ; 0000-0003-2311-0950 ; 0000-0003-0965-3617 ; 0000-0003-0429-490X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10268462$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,4010,27900,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10268462$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Shen, Yiyang</creatorcontrib><creatorcontrib>Yu, Rongwei</creatorcontrib><creatorcontrib>Wu, Peng</creatorcontrib><creatorcontrib>Xie, Haoran</creatorcontrib><creatorcontrib>Gong, Lina</creatorcontrib><creatorcontrib>Qin, Jing</creatorcontrib><creatorcontrib>Wei, Mingqiang</creatorcontrib><title>ImLiDAR: Cross-Sensor Dynamic Message Propagation Network for 3-D Object Detection</title><title>IEEE transactions on geoscience and remote sensing</title><addtitle>TGRS</addtitle><description>LiDAR and camera, as two different sensors, supply geometric (point clouds) and semantic (RGB images) information of 3-D scenes. However, it is still challenging for existing methods to fuse data from the two cross sensors, making them complementary for quality 3-D object detection (3OD). We propose ImLiDAR, a new 3OD paradigm to narrow the cross-sensor discrepancies by progressively fusing the multiscale features of camera Images and LiDAR point clouds. ImLiDAR enables to provide the detection head with cross-sensor yet robustly fused features. To achieve this, two core designs exist in ImLiDAR. 
First, we propose a cross-sensor dynamic message propagation (CDMP) module to combine the best of the multiscale image and point features. Second, we raise a direct set prediction problem that allows designing an effective set-based detector (SD) to tackle the inconsistency of the classification and localization confidences, and the sensitivity of hand-tuned hyperparameters. Besides, the novel SD can be detachable and easily integrated into various detection networks. Comparisons on the KITTI, nuScenes, and SUN-RGBD datasets all show clear visual and numerical improvements of our ImLiDAR over 45 state-of-the-art 3OD methods.</description><subject>3-D object detection (3OD)</subject><subject>Cameras</subject><subject>Color imagery</subject><subject>cross sensors</subject><subject>Detection</subject><subject>Detectors</subject><subject>dynamic message propagation</subject><subject>Feature extraction</subject><subject>ImLiDAR</subject><subject>Laser radar</subject><subject>Lidar</subject><subject>Localization</subject><subject>Messages</subject><subject>Object detection</subject><subject>Object recognition</subject><subject>Point cloud compression</subject><subject>Sensors</subject><subject>set-based detector (SD)</subject><subject>Three dimensional models</subject><subject>Three-dimensional displays</subject><issn>0196-2892</issn><issn>1558-0644</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE1PwkAQhjdGExH9ASYeNvFc3M_u1huhiiRVDOB5s12npCgt7pYY_r3bwMHTe3lm5p0HoVtKRpSS7GE1XSxHjDA-4pxRyvUZGlApdUJSIc7RgNAsTZjO2CW6CmFDCBWSqgFazLZFnY8Xj3ji2xCSJTSh9Tg_NHZbO_wKIdg14Hff7uzadnXb4Dfoflv_havI8STH83IDrsM5dDEicI0uKvsd4OaUQ_Tx_LSavCTFfDqbjIvExR5dorJKVNxaIUvx6aQDrQmXTlWqTKG0oEtmrQImbezqKJW2KhUvLbcStEpTPkT3x7073_7sIXRm0-59E08appUWWgrNI0WPlOv_81CZna-31h8MJaZXZ3p1pldnTurizN1xpgaAfzxLtUgZ_wOoR2nx</recordid><startdate>2023</startdate><enddate>2023</enddate><creator>Shen, 
Yiyang</creator><creator>Yu, Rongwei</creator><creator>Wu, Peng</creator><creator>Xie, Haoran</creator><creator>Gong, Lina</creator><creator>Qin, Jing</creator><creator>Wei, Mingqiang</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7UA</scope><scope>8FD</scope><scope>C1K</scope><scope>F1W</scope><scope>FR3</scope><scope>H8D</scope><scope>H96</scope><scope>KR7</scope><scope>L.G</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0002-5272-6706</orcidid><orcidid>https://orcid.org/0000-0002-2961-0860</orcidid><orcidid>https://orcid.org/0000-0003-2311-0950</orcidid><orcidid>https://orcid.org/0000-0003-0965-3617</orcidid><orcidid>https://orcid.org/0000-0003-0429-490X</orcidid></search><sort><creationdate>2023</creationdate><title>ImLiDAR: Cross-Sensor Dynamic Message Propagation Network for 3-D Object Detection</title><author>Shen, Yiyang ; Yu, Rongwei ; Wu, Peng ; Xie, Haoran ; Gong, Lina ; Qin, Jing ; Wei, Mingqiang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-79f4f3aa45b4dc5ce88035c7f7b6ebae8b2aa7e25a014c115afb73ba3a5e87663</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>3-D object detection (3OD)</topic><topic>Cameras</topic><topic>Color imagery</topic><topic>cross sensors</topic><topic>Detection</topic><topic>Detectors</topic><topic>dynamic message propagation</topic><topic>Feature extraction</topic><topic>ImLiDAR</topic><topic>Laser radar</topic><topic>Lidar</topic><topic>Localization</topic><topic>Messages</topic><topic>Object detection</topic><topic>Object recognition</topic><topic>Point cloud compression</topic><topic>Sensors</topic><topic>set-based detector (SD)</topic><topic>Three dimensional models</topic><topic>Three-dimensional 
displays</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Shen, Yiyang</creatorcontrib><creatorcontrib>Yu, Rongwei</creatorcontrib><creatorcontrib>Wu, Peng</creatorcontrib><creatorcontrib>Xie, Haoran</creatorcontrib><creatorcontrib>Gong, Lina</creatorcontrib><creatorcontrib>Qin, Jing</creatorcontrib><creatorcontrib>Wei, Mingqiang</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Water Resources Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) 2: Ocean Technology, Policy &amp; Non-Living Resources</collection><collection>Civil Engineering Abstracts</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) Professional</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on geoscience and remote sensing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Shen, Yiyang</au><au>Yu, Rongwei</au><au>Wu, Peng</au><au>Xie, Haoran</au><au>Gong, Lina</au><au>Qin, Jing</au><au>Wei, Mingqiang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>ImLiDAR: Cross-Sensor Dynamic Message Propagation Network for 3-D Object Detection</atitle><jtitle>IEEE transactions on geoscience and remote 
sensing</jtitle><stitle>TGRS</stitle><date>2023</date><risdate>2023</risdate><volume>61</volume><spage>1</spage><epage>13</epage><pages>1-13</pages><issn>0196-2892</issn><eissn>1558-0644</eissn><coden>IGRSD2</coden><abstract>LiDAR and camera, as two different sensors, supply geometric (point clouds) and semantic (RGB images) information of 3-D scenes. However, it is still challenging for existing methods to fuse data from the two cross sensors, making them complementary for quality 3-D object detection (3OD). We propose ImLiDAR, a new 3OD paradigm to narrow the cross-sensor discrepancies by progressively fusing the multiscale features of camera Images and LiDAR point clouds. ImLiDAR enables to provide the detection head with cross-sensor yet robustly fused features. To achieve this, two core designs exist in ImLiDAR. First, we propose a cross-sensor dynamic message propagation (CDMP) module to combine the best of the multiscale image and point features. Second, we raise a direct set prediction problem that allows designing an effective set-based detector (SD) to tackle the inconsistency of the classification and localization confidences, and the sensitivity of hand-tuned hyperparameters. Besides, the novel SD can be detachable and easily integrated into various detection networks. Comparisons on the KITTI, nuScenes, and SUN-RGBD datasets all show clear visual and numerical improvements of our ImLiDAR over 45 state-of-the-art 3OD methods.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TGRS.2023.3321138</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-5272-6706</orcidid><orcidid>https://orcid.org/0000-0002-2961-0860</orcidid><orcidid>https://orcid.org/0000-0003-2311-0950</orcidid><orcidid>https://orcid.org/0000-0003-0965-3617</orcidid><orcidid>https://orcid.org/0000-0003-0429-490X</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0196-2892
ispartof IEEE transactions on geoscience and remote sensing, 2023, Vol.61, p.1-13
issn 0196-2892
1558-0644
language eng
recordid cdi_ieee_primary_10268462
source IEEE Electronic Library (IEL)
subjects 3-D object detection (3OD)
Cameras
Color imagery
cross sensors
Detection
Detectors
dynamic message propagation
Feature extraction
ImLiDAR
Laser radar
Lidar
Localization
Messages
Object detection
Object recognition
Point cloud compression
Sensors
set-based detector (SD)
Three dimensional models
Three-dimensional displays
title ImLiDAR: Cross-Sensor Dynamic Message Propagation Network for 3-D Object Detection
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T17%3A07%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ImLiDAR:%20Cross-Sensor%20Dynamic%20Message%20Propagation%20Network%20for%203-D%20Object%20Detection&rft.jtitle=IEEE%20transactions%20on%20geoscience%20and%20remote%20sensing&rft.au=Shen,%20Yiyang&rft.date=2023&rft.volume=61&rft.spage=1&rft.epage=13&rft.pages=1-13&rft.issn=0196-2892&rft.eissn=1558-0644&rft.coden=IGRSD2&rft_id=info:doi/10.1109/TGRS.2023.3321138&rft_dat=%3Cproquest_RIE%3E2878485483%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2878485483&rft_id=info:pmid/&rft_ieee_id=10268462&rfr_iscdi=true