An analysis of precision: occlusion and perspective geometry’s role in 6D pose estimation
Achieving precise 6 degrees of freedom (6D) pose estimation of rigid objects from color images is a critical challenge with wide-ranging applications in robotics and close-contact aircraft operations. This study investigates key techniques in the application of YOLOv5 object detection convolutional...
Saved in:
Published in: | Neural computing & applications 2024, Vol.36 (3), p.1261-1281 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 1281 |
---|---|
container_issue | 3 |
container_start_page | 1261 |
container_title | Neural computing & applications |
container_volume | 36 |
creator | Choate, Jeffrey ; Worth, Derek ; Nykl, Scott ; Taylor, Clark ; Borghetti, Brett ; Schubert Kabban, Christine |
description | Achieving precise 6 degrees of freedom (6D) pose estimation of rigid objects from color images is a critical challenge with wide-ranging applications in robotics and close-contact aircraft operations. This study investigates key techniques in the application of the YOLOv5 object detection convolutional neural network (CNN) for 6D pose localization of aircraft using only color imagery. Traditional object detection labeling methods suffer from inaccuracies due to perspective geometry and being limited to visible key points. This research demonstrates that with precise labeling, a CNN can predict object features with near-pixel accuracy, effectively learning the distinct appearance of the object due to perspective distortion with a pinhole camera. Additionally, we highlight the crucial role of knowledge about occluded features. Training the CNN with such knowledge slightly reduces pixel precision, but enables the prediction of 3 times more features, including those that are not initially visible, resulting in an overall better performing 6D system. Notably, we reveal that the data augmentation technique of *scale* can interfere with pixel precision when used during training. These findings are crucial for the entire system, which leverages the Solve Perspective-N-Point (Solve-PnP) algorithm, achieving 6D pose accuracy within 1° and 7 cm at distances ranging from 7.5 to 35 m from the camera. Moreover, this solution operates in real time, achieving sub-10 ms processing times on a desktop PC. |
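The pipeline described in the abstract rests on pinhole perspective geometry: the CNN predicts the 2D pixel locations of known 3D aircraft keypoints, and Solve-PnP inverts the projection to recover the 6D pose. The forward projection that Solve-PnP inverts can be sketched in a few lines of numpy. All values below (intrinsics, keypoints, pose) are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def project_points(pts_3d, R, t, K):
    """Project Nx3 object-frame points to pixel coordinates via a pinhole camera."""
    cam = pts_3d @ R.T + t          # object frame -> camera frame
    uvw = cam @ K.T                 # apply camera intrinsics
    return uvw[:, :2] / uvw[:, 2:]  # perspective divide -> (u, v) pixels

# Hypothetical intrinsics: 800 px focal length, principal point at (320, 240)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Four hypothetical aircraft keypoints (nose, tail, wingtips) in meters
pts = np.array([[ 5.0,  0.0, 0.0],
                [-5.0,  0.0, 0.0],
                [ 0.0,  4.0, 0.0],
                [ 0.0, -4.0, 0.0]])

R = np.eye(3)                    # identity rotation for simplicity
t = np.array([0.0, 0.0, 20.0])   # 20 m in front of the camera

pix = project_points(pts, R, t, K)  # nose -> (520, 240), tail -> (120, 240)
```

Given the CNN's predicted pixel locations and the matching 3D model points, a PnP solver (e.g., OpenCV's `cv2.solvePnP`) searches for the `R` and `t` that best reproduce the observed pixels, which is how the system reaches the reported 1° and 7 cm accuracy.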
doi_str_mv | 10.1007/s00521-023-09094-8 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0941-0643 |
ispartof | Neural computing & applications, 2024, Vol.36 (3), p.1261-1281 |
issn | 0941-0643 1433-3058 |
language | eng |
recordid | cdi_proquest_journals_2910703160 |
source | Springer Nature - Complete Springer Journals |
subjects | Aircraft ; Algorithms ; Artificial Intelligence ; Artificial neural networks ; Color imagery ; Computational Biology/Bioinformatics ; Computational Science and Engineering ; Computer Science ; Data augmentation ; Data Mining and Knowledge Discovery ; Image Processing and Computer Vision ; Labeling ; Object recognition ; Occlusion ; Original Article ; Pinhole cameras ; Pixels ; Pose estimation ; Probability and Statistics in Computer Science ; Robotics ; Training |
title | An analysis of precision: occlusion and perspective geometry’s role in 6D pose estimation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T22%3A01%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20analysis%20of%20precision:%20occlusion%20and%20perspective%20geometry%E2%80%99s%20role%20in%206D%20pose%20estimation&rft.jtitle=Neural%20computing%20&%20applications&rft.au=Choate,%20Jeffrey&rft.date=2024&rft.volume=36&rft.issue=3&rft.spage=1261&rft.epage=1281&rft.pages=1261-1281&rft.issn=0941-0643&rft.eissn=1433-3058&rft_id=info:doi/10.1007/s00521-023-09094-8&rft_dat=%3Cproquest_cross%3E2910703160%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2910703160&rft_id=info:pmid/&rfr_iscdi=true |