An analysis of precision: occlusion and perspective geometry’s role in 6D pose estimation

Achieving precise 6 degrees of freedom (6D) pose estimation of rigid objects from color images is a critical challenge with wide-ranging applications in robotics and close-contact aircraft operations. This study investigates key techniques in the application of the YOLOv5 object detection convolutional neural network (CNN) for 6D pose localization of aircraft using only color imagery. Traditional object detection labeling methods suffer from inaccuracies due to perspective geometry and being limited to visible key points. This research demonstrates that with precise labeling, a CNN can predict object features with near-pixel accuracy, effectively learning the distinct appearance of the object due to perspective distortion with a pinhole camera. Additionally, we highlight the crucial role of knowledge about occluded features. Training the CNN with such knowledge slightly reduces pixel precision, but enables the prediction of 3 times more features, including those that are not initially visible, resulting in an overall better performing 6D system. Notably, we reveal that the data augmentation technique of scale can interfere with pixel precision when used during training. These findings are crucial for the entire system, which leverages the Solve Perspective-N-Point (Solve-PnP) algorithm, achieving 6D pose accuracy within 1° and 7 cm at distances ranging from 7.5 to 35 m from the camera. Moreover, this solution operates in real-time, achieving sub-10 ms processing times on a desktop PC.

Detailed description

Saved in:
Bibliographic details
Published in: Neural computing & applications 2024, Vol.36 (3), p.1261-1281
Main authors: Choate, Jeffrey, Worth, Derek, Nykl, Scott, Taylor, Clark, Borghetti, Brett, Schubert Kabban, Christine
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1281
container_issue 3
container_start_page 1261
container_title Neural computing & applications
container_volume 36
creator Choate, Jeffrey
Worth, Derek
Nykl, Scott
Taylor, Clark
Borghetti, Brett
Schubert Kabban, Christine
description Achieving precise 6 degrees of freedom (6D) pose estimation of rigid objects from color images is a critical challenge with wide-ranging applications in robotics and close-contact aircraft operations. This study investigates key techniques in the application of YOLOv5 object detection convolutional neural network (CNN) for 6D pose localization of aircraft using only color imagery. Traditional object detection labeling methods suffer from inaccuracies due to perspective geometry and being limited to visible key points. This research demonstrates that with precise labeling, a CNN can predict object features with near-pixel accuracy, effectively learning the distinct appearance of the object due to perspective distortion with a pinhole camera. Additionally, we highlight the crucial role of knowledge about occluded features. Training the CNN with such knowledge slightly reduces pixel precision, but enables the prediction of 3 times more features, including those that are not initially visible, resulting in an overall better performing 6D system. Notably, we reveal that the data augmentation technique of scale can interfere with pixel precision when used during training. These findings are crucial for the entire system, which leverages the Solve Perspective-N-Point (Solve-PnP) algorithm, achieving 6D pose accuracy within 1° and 7 cm at distances ranging from 7.5 to 35 m from the camera. Moreover, this solution operates in real-time, achieving sub-10 ms processing times on a desktop PC.
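The pinhole-camera perspective geometry the abstract refers to can be sketched as follows: 3D object features project to pixel coordinates via the intrinsic matrix and the 6D pose (rotation and translation), and Solve-PnP inverts exactly this mapping from 2D–3D correspondences. This is a minimal illustration, not the authors' implementation; all numeric values (intrinsics, pose, model points) are made-up placeholders.

```python
import numpy as np

def project(points_3d, K, R, t):
    """Project Nx3 world points to Nx2 pixel coordinates with a pinhole model."""
    cam = points_3d @ R.T + t          # world frame -> camera frame
    uvw = cam @ K.T                    # apply camera intrinsics
    return uvw[:, :2] / uvw[:, 2:3]    # perspective divide

# Placeholder intrinsics: focal length 800 px, principal point (320, 240)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                          # identity rotation for simplicity
t = np.array([0.0, 0.0, 10.0])         # object 10 m in front of the camera

X = np.array([[0.0, 0.0, 0.0],         # placeholder object feature points (m)
              [1.0, 0.0, 0.0]])
print(project(X, K, R, t))             # pixel locations of the two features
```

Given CNN-predicted pixel locations of such features (visible and occluded alike) and their known 3D positions on the aircraft model, a PnP solver recovers the rotation and translation that best explain the observed projections.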
doi_str_mv 10.1007/s00521-023-09094-8
format Article
fulltext fulltext
identifier ISSN: 0941-0643
ispartof Neural computing & applications, 2024, Vol.36 (3), p.1261-1281
issn 0941-0643
1433-3058
language eng
recordid cdi_proquest_journals_2910703160
source Springer Nature - Complete Springer Journals
subjects Aircraft
Algorithms
Artificial Intelligence
Artificial neural networks
Color imagery
Computational Biology/Bioinformatics
Computational Science and Engineering
Computer Science
Data augmentation
Data Mining and Knowledge Discovery
Image Processing and Computer Vision
Labeling
Object recognition
Occlusion
Original Article
Pinhole cameras
Pixels
Pose estimation
Probability and Statistics in Computer Science
Robotics
Training
title An analysis of precision: occlusion and perspective geometry’s role in 6D pose estimation