CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation

Code vulnerability detection is critical for software security prevention. Vulnerability annotation in large-scale software code is quite tedious and challenging, which requires domain experts to spend a lot of time annotating. This work offers CPVD, a cross-domain vulnerability detection approach b...

Bibliographic details
Published in:	IEEE transactions on software engineering 2023-08, Vol.49 (8), p.4152-4168
Main authors:	Zhang, Chunyong; Liu, Bin; Xin, Yang; Yao, Liangwei
Format:	Article
Language:	eng
Subjects:	Adaptation; Annotations; Code property graph; Codes; cross-domain vulnerability detection; domain adaptation representation learning; Feature extraction; graph attention network; Graph neural networks; Labels; Learning; Natural language processing; Neural networks; Security; Software; Software reliability; Task analysis
Online access:	Order full text

container_end_page	4168
container_issue	8
container_start_page	4152
container_title	IEEE transactions on software engineering
container_volume	49
creator	Zhang, Chunyong ; Liu, Bin ; Xin, Yang ; Yao, Liangwei
description	Code vulnerability detection is critical for software security. Annotating vulnerabilities in large-scale software code is tedious and challenging, and requires domain experts to spend a lot of time. This work presents CPVD, a cross-domain vulnerability detection approach that addresses the challenge of "learning to predict the vulnerability labels of another item quickly using one item with rich vulnerability labels." CPVD uses the code property graph to represent the code and uses the Graph Attention Network and Convolution Pooling Network to extract the graph feature vector. In the Domain Adaptation Representation Learning stage, it reduces the distribution difference between the source domain and target domain data for cross-domain vulnerability detection. In this paper, we cross-test on the code of different real-world projects. Compared with methods without domain adaptation and with domain adaptation methods based on natural language processing, CPVD is more general and performs better in cross-domain vulnerability detection tasks. Specifically, on the four datasets chr_deb, qemu, libav, and sard, it achieved best F1-Scores of 70.2%, 81.1%, 59.7%, and 78.1%, respectively, and AUCs of 88.4%, 86.3%, 85.2%, and 88.6%.
doi_str_mv	10.1109/TSE.2023.3285910
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TSE_2023_3285910</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10149539</ieee_id><sourcerecordid>2851355375</sourcerecordid><originalsourceid>FETCH-LOGICAL-c292t-4a45c68a3206d56eef16e94abcbd616caef3a90672cd40e6afad5bf8fc82afe23</originalsourceid><addsrcrecordid>eNpNkM1PwzAMxSMEEmNw58AhEueOfDRpw21sYyBNMImxa-S2jugYbUkzof33tGwHTrb83rPlHyHXnI04Z-Zu9TYbCSbkSIpUGc5OyIAbaSKpBDslA8ZMGimVmnNy0bYbxphKEjUg2WS5nt7Tia_bli59vcE80PVuW6GHrNyWYU-nGLphWVf0AVosaNfMPTQfdBwCVn_CC4af2n9SqAo6rb-grOi4gCZAr16SMwfbFq-OdUjeH2eryVO0eJ0_T8aLKBdGhCiGWOU6BSmYLpRGdFyjiSHLs0JznQM6CYbpRORFzFCDg0JlLnV5KsChkENye9jb-Pp7h22wm3rnq-6k7ZBwqZRMVOdiB1fev-zR2caXX-D3ljPbk7QdSduTtEeSXeTmECkR8Z-dx0ZJI38Bt-BwfQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2851355375</pqid></control><display><type>article</type><title>CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation</title><source>IEEE Electronic Library (IEL)</source><creator>Zhang, Chunyong ; Liu, Bin ; Xin, Yang ; Yao, Liangwei</creator><creatorcontrib>Zhang, Chunyong ; Liu, Bin ; Xin, Yang ; Yao, Liangwei</creatorcontrib><description>Code vulnerability detection is critical for software security prevention. Vulnerability annotation in large-scale software code is quite tedious and challenging, which requires domain experts to spend a lot of time annotating. This work offers CPVD, a cross-domain vulnerability detection approach based on the challenge of "learning to predict the vulnerability labels of another item quickly using one item with rich vulnerability labels." CPVD uses the code property graph to represent the code and uses the Graph Attention Network and Convolution Pooling Network to extract the graph feature vector. 
It reduces the distribution between the source domain and target domain data in the Domain Adaptation Representation Learning stage for cross-domain vulnerability detection. In this paper, we test each other on different real-world project codes. Compared with methods without domain adaptation and domain adaptation methods based on natural language processing, CPVD is more general and performs better in cross-domain vulnerability detection tasks. Specifically, for the four datasets of chr_deb, qemu, libav, and sard, they achieved the best results of 70.2%, 81.1%, 59.7%, and 78.1% respectively on the F1-Score, and 88.4%,86.3%, 85.2%, and 88.6% on the AUC.</description><identifier>ISSN: 0098-5589</identifier><identifier>EISSN: 1939-3520</identifier><identifier>DOI: 10.1109/TSE.2023.3285910</identifier><identifier>CODEN: IESEDJ</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Adaptation ; Annotations ; Code property graph ; Codes ; cross-domain vulnerability detection ; domain adaptation representation learning ; Feature extraction ; graph attention network ; Graph neural networks ; Labels ; Learning ; Natural language processing ; Neural networks ; Security ; Software ; Software reliability ; Task analysis</subject><ispartof>IEEE transactions on software engineering, 2023-08, Vol.49 (8), p.4152-4168</ispartof><rights>Copyright IEEE Computer Society 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c292t-4a45c68a3206d56eef16e94abcbd616caef3a90672cd40e6afad5bf8fc82afe23</citedby><cites>FETCH-LOGICAL-c292t-4a45c68a3206d56eef16e94abcbd616caef3a90672cd40e6afad5bf8fc82afe23</cites><orcidid>0000-0003-1571-1932 ; 0009-0007-2944-4632 ; 0000-0002-7372-1760 ; 
0000-0002-9706-3950</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10149539$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54736</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10149539$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Zhang, Chunyong</creatorcontrib><creatorcontrib>Liu, Bin</creatorcontrib><creatorcontrib>Xin, Yang</creatorcontrib><creatorcontrib>Yao, Liangwei</creatorcontrib><title>CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation</title><title>IEEE transactions on software engineering</title><addtitle>TSE</addtitle><description>Code vulnerability detection is critical for software security prevention. Vulnerability annotation in large-scale software code is quite tedious and challenging, which requires domain experts to spend a lot of time annotating. This work offers CPVD, a cross-domain vulnerability detection approach based on the challenge of "learning to predict the vulnerability labels of another item quickly using one item with rich vulnerability labels." CPVD uses the code property graph to represent the code and uses the Graph Attention Network and Convolution Pooling Network to extract the graph feature vector. It reduces the distribution between the source domain and target domain data in the Domain Adaptation Representation Learning stage for cross-domain vulnerability detection. In this paper, we test each other on different real-world project codes. Compared with methods without domain adaptation and domain adaptation methods based on natural language processing, CPVD is more general and performs better in cross-domain vulnerability detection tasks. 
Specifically, for the four datasets of chr_deb, qemu, libav, and sard, they achieved the best results of 70.2%, 81.1%, 59.7%, and 78.1% respectively on the F1-Score, and 88.4%,86.3%, 85.2%, and 88.6% on the AUC.</description><subject>Adaptation</subject><subject>Annotations</subject><subject>Code property graph</subject><subject>Codes</subject><subject>cross-domain vulnerability detection</subject><subject>domain adaptation representation learning</subject><subject>Feature extraction</subject><subject>graph attention network</subject><subject>Graph neural networks</subject><subject>Labels</subject><subject>Learning</subject><subject>Natural language processing</subject><subject>Neural networks</subject><subject>Security</subject><subject>Software</subject><subject>Software reliability</subject><subject>Task analysis</subject><issn>0098-5589</issn><issn>1939-3520</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkM1PwzAMxSMEEmNw58AhEueOfDRpw21sYyBNMImxa-S2jugYbUkzof33tGwHTrb83rPlHyHXnI04Z-Zu9TYbCSbkSIpUGc5OyIAbaSKpBDslA8ZMGimVmnNy0bYbxphKEjUg2WS5nt7Tia_bli59vcE80PVuW6GHrNyWYU-nGLphWVf0AVosaNfMPTQfdBwCVn_CC4af2n9SqAo6rb-grOi4gCZAr16SMwfbFq-OdUjeH2eryVO0eJ0_T8aLKBdGhCiGWOU6BSmYLpRGdFyjiSHLs0JznQM6CYbpRORFzFCDg0JlLnV5KsChkENye9jb-Pp7h22wm3rnq-6k7ZBwqZRMVOdiB1fev-zR2caXX-D3ljPbk7QdSduTtEeSXeTmECkR8Z-dx0ZJI38Bt-BwfQ</recordid><startdate>20230801</startdate><enddate>20230801</enddate><creator>Zhang, Chunyong</creator><creator>Liu, Bin</creator><creator>Xin, Yang</creator><creator>Yao, Liangwei</creator><general>IEEE</general><general>IEEE Computer 
Society</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><scope>K9.</scope><orcidid>https://orcid.org/0000-0003-1571-1932</orcidid><orcidid>https://orcid.org/0009-0007-2944-4632</orcidid><orcidid>https://orcid.org/0000-0002-7372-1760</orcidid><orcidid>https://orcid.org/0000-0002-9706-3950</orcidid></search><sort><creationdate>20230801</creationdate><title>CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation</title><author>Zhang, Chunyong ; Liu, Bin ; Xin, Yang ; Yao, Liangwei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c292t-4a45c68a3206d56eef16e94abcbd616caef3a90672cd40e6afad5bf8fc82afe23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Adaptation</topic><topic>Annotations</topic><topic>Code property graph</topic><topic>Codes</topic><topic>cross-domain vulnerability detection</topic><topic>domain adaptation representation learning</topic><topic>Feature extraction</topic><topic>graph attention network</topic><topic>Graph neural networks</topic><topic>Labels</topic><topic>Learning</topic><topic>Natural language processing</topic><topic>Neural networks</topic><topic>Security</topic><topic>Software</topic><topic>Software reliability</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Chunyong</creatorcontrib><creatorcontrib>Liu, Bin</creatorcontrib><creatorcontrib>Xin, Yang</creatorcontrib><creatorcontrib>Yao, Liangwei</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>ProQuest Computer Science 
Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><jtitle>IEEE transactions on software engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhang, Chunyong</au><au>Liu, Bin</au><au>Xin, Yang</au><au>Yao, Liangwei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation</atitle><jtitle>IEEE transactions on software engineering</jtitle><stitle>TSE</stitle><date>2023-08-01</date><risdate>2023</risdate><volume>49</volume><issue>8</issue><spage>4152</spage><epage>4168</epage><pages>4152-4168</pages><issn>0098-5589</issn><eissn>1939-3520</eissn><coden>IESEDJ</coden><abstract>Code vulnerability detection is critical for software security prevention. Vulnerability annotation in large-scale software code is quite tedious and challenging, which requires domain experts to spend a lot of time annotating. This work offers CPVD, a cross-domain vulnerability detection approach based on the challenge of "learning to predict the vulnerability labels of another item quickly using one item with rich vulnerability labels." CPVD uses the code property graph to represent the code and uses the Graph Attention Network and Convolution Pooling Network to extract the graph feature vector. It reduces the distribution between the source domain and target domain data in the Domain Adaptation Representation Learning stage for cross-domain vulnerability detection. In this paper, we test each other on different real-world project codes. Compared with methods without domain adaptation and domain adaptation methods based on natural language processing, CPVD is more general and performs better in cross-domain vulnerability detection tasks. 
Specifically, for the four datasets of chr_deb, qemu, libav, and sard, they achieved the best results of 70.2%, 81.1%, 59.7%, and 78.1% respectively on the F1-Score, and 88.4%,86.3%, 85.2%, and 88.6% on the AUC.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TSE.2023.3285910</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0003-1571-1932</orcidid><orcidid>https://orcid.org/0009-0007-2944-4632</orcidid><orcidid>https://orcid.org/0000-0002-7372-1760</orcidid><orcidid>https://orcid.org/0000-0002-9706-3950</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 0098-5589
ispartof	IEEE transactions on software engineering, 2023-08, Vol.49 (8), p.4152-4168
issn	0098-5589 1939-3520
language	eng
recordid	cdi_crossref_primary_10_1109_TSE_2023_3285910
source	IEEE Electronic Library (IEL)
subjects	Adaptation ; Annotations ; Code property graph ; Codes ; cross-domain vulnerability detection ; domain adaptation representation learning ; Feature extraction ; graph attention network ; Graph neural networks ; Labels ; Learning ; Natural language processing ; Neural networks ; Security ; Software ; Software reliability ; Task analysis
title	CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T21%3A44%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CPVD:%20Cross%20Project%20Vulnerability%20Detection%20Based%20on%20Graph%20Attention%20Network%20and%20Domain%20Adaptation&rft.jtitle=IEEE%20transactions%20on%20software%20engineering&rft.au=Zhang,%20Chunyong&rft.date=2023-08-01&rft.volume=49&rft.issue=8&rft.spage=4152&rft.epage=4168&rft.pages=4152-4168&rft.issn=0098-5589&rft.eissn=1939-3520&rft.coden=IESEDJ&rft_id=info:doi/10.1109/TSE.2023.3285910&rft_dat=%3Cproquest_RIE%3E2851355375%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2851355375&rft_id=info:pmid/&rft_ieee_id=10149539&rfr_iscdi=true