Improving the Robustness of Pedestrian Detection in Autonomous Driving With Generative Data Augmentation
Pedestrian detection plays a crucial role in autonomous driving: by identifying the position, size, orientation, and dynamic features of pedestrians in images or videos, it helps autonomous vehicles make better decisions and control actions. The performance of pedestrian detection models depends largely on the quality and diversity of the available training data, and current autonomous-driving datasets are limited in diversity, scale, and quality. Numerous recent studies have proposed data augmentation strategies that expand dataset coverage to maximize the use of existing training data; however, these methods often overlook the diversity of data scenarios. To overcome this challenge, we propose a more comprehensive data augmentation method based on image descriptions and diffusion models, designed to cover a wider range of scene variations, including different weather and lighting conditions. We design a classifier to select the data samples to augment, extract visual features via image captioning, and convert them into high-level semantic information that serves as a textual description of each sample; diffusion models then generate new variants from these descriptions. We further design three modification patterns that increase diversity in weather, lighting, and pedestrian pose. Extensive experiments on the KITTI dataset and in real-world environments demonstrate that the proposed method significantly enhances the performance of pedestrian detection models in complex scenarios, notably improving their applicability and robustness in real autonomous driving.
Published in: | IEEE Network, 2024-05, Vol. 38 (3), p. 63-69 |
---|---|
Main authors: | Wu, Yalun; Xiang, Yingxiao; Tong, Endong; Ye, Yuqi; Cui, Zhibo; Tian, Yunzhe; Zhang, Lejun; Liu, Jiqiang; Han, Zhen; Niu, Wenjia |
Format: | Article |
Language: | English |
Subjects: | Autonomous vehicles; Data augmentation; Data models; Datasets; Descriptions; diffusion model; Feature extraction; generative data augmentation; image caption; Image capture; Lighting; pedestrian detection; Pedestrians; Robustness; Semantics; Weather |
ISSN: | 0890-8044 |
EISSN: | 1558-156X |
DOI: | 10.1109/MNET.2024.3366232 |
Publisher: | IEEE, New York |
Source: | IEEE Electronic Library (IEL) |
Online access: | Order full text |
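The abstract above outlines a three-stage pipeline: a classifier selects which samples to augment, an image-caption model turns each selected frame into a textual scene description, and a diffusion model regenerates the scene under one of three modification patterns (weather, lighting, pedestrian pose). The article does not ship code with this record, so the following is only a minimal sketch of that idea, assuming a BLIP captioner and a Stable Diffusion img2img pipeline from Hugging Face; every model name, prompt template, and parameter below is an illustrative assumption rather than the authors' actual setup, and the sample-selection classifier is omitted.

```python
# Minimal sketch of caption-conditioned diffusion augmentation, assuming
# Hugging Face transformers + diffusers. Model choices, prompt templates,
# and the strength value are illustrative assumptions, not the authors'
# published configuration.
import random

import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from diffusers import StableDiffusionImg2ImgPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Caption model: turns a training image into a textual scene description.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
).to(device)

# Diffusion model: re-synthesizes the scene from the modified description.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to(device)

# Three modification patterns, mirroring the diversity axes named in the
# abstract: weather, lighting, and pedestrian pose.
PATTERNS = {
    "weather": ["in heavy rain", "in dense fog", "under falling snow"],
    "lighting": ["at night under street lamps", "at dusk", "in harsh backlight"],
    "pose": ["with a pedestrian crossing the street", "with a running pedestrian"],
}


def augment(image: Image.Image, pattern: str, strength: float = 0.45) -> Image.Image:
    """Caption the image, append a scene modifier, and synthesize a variant."""
    inputs = processor(image, return_tensors="pt").to(device)
    ids = captioner.generate(**inputs, max_new_tokens=30)
    caption = processor.decode(ids[0], skip_special_tokens=True)

    prompt = f"{caption}, {random.choice(PATTERNS[pattern])}"
    # A moderate img2img strength keeps the scene layout close to the source
    # frame, so existing pedestrian annotations remain approximately valid.
    return pipe(prompt=prompt, image=image, strength=strength).images[0]


# Example: a rainy variant of one KITTI training frame.
# frame = Image.open("kitti/training/image_2/000042.png").convert("RGB")
# rainy = augment(frame, "weather")
```

In a full implementation one would presumably verify that the generated variants still match the original annotations before merging them into the detector's training set; the article itself details the authors' actual selection and generation procedure.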