CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation

Code vulnerability detection is critical for software security. Annotating vulnerabilities in large-scale software code is tedious and challenging, and requires domain experts to spend a great deal of time. This work presents CPVD, a cross-domain vulnerability detection approach b...

Bibliographic details
Published in: IEEE transactions on software engineering 2023-08, Vol.49 (8), p.4152-4168
Main authors: Zhang, Chunyong, Liu, Bin, Xin, Yang, Yao, Liangwei
Format: Article
Language: eng
container_end_page 4168
container_issue 8
container_start_page 4152
container_title IEEE transactions on software engineering
container_volume 49
creator Zhang, Chunyong
Liu, Bin
Xin, Yang
Yao, Liangwei
description Code vulnerability detection is critical for software security. Annotating vulnerabilities in large-scale software code is tedious and challenging, and requires domain experts to spend a great deal of time. This work presents CPVD, a cross-domain vulnerability detection approach that addresses the challenge of "learning to predict the vulnerability labels of another item quickly using one item with rich vulnerability labels." CPVD uses the code property graph to represent the code and uses a Graph Attention Network and a Convolution Pooling Network to extract the graph feature vector. In the Domain Adaptation Representation Learning stage, it reduces the distribution discrepancy between the source-domain and target-domain data for cross-domain vulnerability detection. In this paper, the code of different real-world projects is tested against each other. Compared with methods without domain adaptation and with domain adaptation methods based on natural language processing, CPVD is more general and performs better in cross-domain vulnerability detection tasks. Specifically, on the four datasets chr_deb, qemu, libav, and sard, it achieved best F1-scores of 70.2%, 81.1%, 59.7%, and 78.1%, respectively, and AUC values of 88.4%, 86.3%, 85.2%, and 88.6%.
doi_str_mv 10.1109/TSE.2023.3285910
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TSE_2023_3285910</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10149539</ieee_id><sourcerecordid>2851355375</sourcerecordid><originalsourceid>FETCH-LOGICAL-c292t-4a45c68a3206d56eef16e94abcbd616caef3a90672cd40e6afad5bf8fc82afe23</originalsourceid><addsrcrecordid>eNpNkM1PwzAMxSMEEmNw58AhEueOfDRpw21sYyBNMImxa-S2jugYbUkzof33tGwHTrb83rPlHyHXnI04Z-Zu9TYbCSbkSIpUGc5OyIAbaSKpBDslA8ZMGimVmnNy0bYbxphKEjUg2WS5nt7Tia_bli59vcE80PVuW6GHrNyWYU-nGLphWVf0AVosaNfMPTQfdBwCVn_CC4af2n9SqAo6rb-grOi4gCZAr16SMwfbFq-OdUjeH2eryVO0eJ0_T8aLKBdGhCiGWOU6BSmYLpRGdFyjiSHLs0JznQM6CYbpRORFzFCDg0JlLnV5KsChkENye9jb-Pp7h22wm3rnq-6k7ZBwqZRMVOdiB1fev-zR2caXX-D3ljPbk7QdSduTtEeSXeTmECkR8Z-dx0ZJI38Bt-BwfQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2851355375</pqid></control><display><type>article</type><title>CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation</title><source>IEEE Electronic Library (IEL)</source><creator>Zhang, Chunyong ; Liu, Bin ; Xin, Yang ; Yao, Liangwei</creator><creatorcontrib>Zhang, Chunyong ; Liu, Bin ; Xin, Yang ; Yao, Liangwei</creatorcontrib><description>Code vulnerability detection is critical for software security prevention. Vulnerability annotation in large-scale software code is quite tedious and challenging, which requires domain experts to spend a lot of time annotating. This work offers CPVD, a cross-domain vulnerability detection approach based on the challenge of "learning to predict the vulnerability labels of another item quickly using one item with rich vulnerability labels." CPVD uses the code property graph to represent the code and uses the Graph Attention Network and Convolution Pooling Network to extract the graph feature vector. 
It reduces the distribution between the source domain and target domain data in the Domain Adaptation Representation Learning stage for cross-domain vulnerability detection. In this paper, we test each other on different real-world project codes. Compared with methods without domain adaptation and domain adaptation methods based on natural language processing, CPVD is more general and performs better in cross-domain vulnerability detection tasks. Specifically, for the four datasets of chr_deb, qemu, libav, and sard, they achieved the best results of 70.2%, 81.1%, 59.7%, and 78.1% respectively on the F1-Score, and 88.4%,86.3%, 85.2%, and 88.6% on the AUC.</description><identifier>ISSN: 0098-5589</identifier><identifier>EISSN: 1939-3520</identifier><identifier>DOI: 10.1109/TSE.2023.3285910</identifier><identifier>CODEN: IESEDJ</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Adaptation ; Annotations ; Code property graph ; Codes ; cross-domain vulnerability detection ; domain adaptation representation learning ; Feature extraction ; graph attention network ; Graph neural networks ; Labels ; Learning ; Natural language processing ; Neural networks ; Security ; Software ; Software reliability ; Task analysis</subject><ispartof>IEEE transactions on software engineering, 2023-08, Vol.49 (8), p.4152-4168</ispartof><rights>Copyright IEEE Computer Society 2023</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c292t-4a45c68a3206d56eef16e94abcbd616caef3a90672cd40e6afad5bf8fc82afe23</citedby><cites>FETCH-LOGICAL-c292t-4a45c68a3206d56eef16e94abcbd616caef3a90672cd40e6afad5bf8fc82afe23</cites><orcidid>0000-0003-1571-1932 ; 0009-0007-2944-4632 ; 0000-0002-7372-1760 ; 
0000-0002-9706-3950</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10149539$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54736</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10149539$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Zhang, Chunyong</creatorcontrib><creatorcontrib>Liu, Bin</creatorcontrib><creatorcontrib>Xin, Yang</creatorcontrib><creatorcontrib>Yao, Liangwei</creatorcontrib><title>CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation</title><title>IEEE transactions on software engineering</title><addtitle>TSE</addtitle><description>Code vulnerability detection is critical for software security prevention. Vulnerability annotation in large-scale software code is quite tedious and challenging, which requires domain experts to spend a lot of time annotating. This work offers CPVD, a cross-domain vulnerability detection approach based on the challenge of "learning to predict the vulnerability labels of another item quickly using one item with rich vulnerability labels." CPVD uses the code property graph to represent the code and uses the Graph Attention Network and Convolution Pooling Network to extract the graph feature vector. It reduces the distribution between the source domain and target domain data in the Domain Adaptation Representation Learning stage for cross-domain vulnerability detection. In this paper, we test each other on different real-world project codes. Compared with methods without domain adaptation and domain adaptation methods based on natural language processing, CPVD is more general and performs better in cross-domain vulnerability detection tasks. 
Specifically, for the four datasets of chr_deb, qemu, libav, and sard, they achieved the best results of 70.2%, 81.1%, 59.7%, and 78.1% respectively on the F1-Score, and 88.4%,86.3%, 85.2%, and 88.6% on the AUC.</description><subject>Adaptation</subject><subject>Annotations</subject><subject>Code property graph</subject><subject>Codes</subject><subject>cross-domain vulnerability detection</subject><subject>domain adaptation representation learning</subject><subject>Feature extraction</subject><subject>graph attention network</subject><subject>Graph neural networks</subject><subject>Labels</subject><subject>Learning</subject><subject>Natural language processing</subject><subject>Neural networks</subject><subject>Security</subject><subject>Software</subject><subject>Software reliability</subject><subject>Task analysis</subject><issn>0098-5589</issn><issn>1939-3520</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkM1PwzAMxSMEEmNw58AhEueOfDRpw21sYyBNMImxa-S2jugYbUkzof33tGwHTrb83rPlHyHXnI04Z-Zu9TYbCSbkSIpUGc5OyIAbaSKpBDslA8ZMGimVmnNy0bYbxphKEjUg2WS5nt7Tia_bli59vcE80PVuW6GHrNyWYU-nGLphWVf0AVosaNfMPTQfdBwCVn_CC4af2n9SqAo6rb-grOi4gCZAr16SMwfbFq-OdUjeH2eryVO0eJ0_T8aLKBdGhCiGWOU6BSmYLpRGdFyjiSHLs0JznQM6CYbpRORFzFCDg0JlLnV5KsChkENye9jb-Pp7h22wm3rnq-6k7ZBwqZRMVOdiB1fev-zR2caXX-D3ljPbk7QdSduTtEeSXeTmECkR8Z-dx0ZJI38Bt-BwfQ</recordid><startdate>20230801</startdate><enddate>20230801</enddate><creator>Zhang, Chunyong</creator><creator>Liu, Bin</creator><creator>Xin, Yang</creator><creator>Yao, Liangwei</creator><general>IEEE</general><general>IEEE Computer 
Society</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><scope>K9.</scope><orcidid>https://orcid.org/0000-0003-1571-1932</orcidid><orcidid>https://orcid.org/0009-0007-2944-4632</orcidid><orcidid>https://orcid.org/0000-0002-7372-1760</orcidid><orcidid>https://orcid.org/0000-0002-9706-3950</orcidid></search><sort><creationdate>20230801</creationdate><title>CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation</title><author>Zhang, Chunyong ; Liu, Bin ; Xin, Yang ; Yao, Liangwei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c292t-4a45c68a3206d56eef16e94abcbd616caef3a90672cd40e6afad5bf8fc82afe23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Adaptation</topic><topic>Annotations</topic><topic>Code property graph</topic><topic>Codes</topic><topic>cross-domain vulnerability detection</topic><topic>domain adaptation representation learning</topic><topic>Feature extraction</topic><topic>graph attention network</topic><topic>Graph neural networks</topic><topic>Labels</topic><topic>Learning</topic><topic>Natural language processing</topic><topic>Neural networks</topic><topic>Security</topic><topic>Software</topic><topic>Software reliability</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Chunyong</creatorcontrib><creatorcontrib>Liu, Bin</creatorcontrib><creatorcontrib>Xin, Yang</creatorcontrib><creatorcontrib>Yao, Liangwei</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>ProQuest Computer Science 
Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><jtitle>IEEE transactions on software engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Zhang, Chunyong</au><au>Liu, Bin</au><au>Xin, Yang</au><au>Yao, Liangwei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation</atitle><jtitle>IEEE transactions on software engineering</jtitle><stitle>TSE</stitle><date>2023-08-01</date><risdate>2023</risdate><volume>49</volume><issue>8</issue><spage>4152</spage><epage>4168</epage><pages>4152-4168</pages><issn>0098-5589</issn><eissn>1939-3520</eissn><coden>IESEDJ</coden><abstract>Code vulnerability detection is critical for software security prevention. Vulnerability annotation in large-scale software code is quite tedious and challenging, which requires domain experts to spend a lot of time annotating. This work offers CPVD, a cross-domain vulnerability detection approach based on the challenge of "learning to predict the vulnerability labels of another item quickly using one item with rich vulnerability labels." CPVD uses the code property graph to represent the code and uses the Graph Attention Network and Convolution Pooling Network to extract the graph feature vector. It reduces the distribution between the source domain and target domain data in the Domain Adaptation Representation Learning stage for cross-domain vulnerability detection. In this paper, we test each other on different real-world project codes. Compared with methods without domain adaptation and domain adaptation methods based on natural language processing, CPVD is more general and performs better in cross-domain vulnerability detection tasks. 
Specifically, for the four datasets of chr_deb, qemu, libav, and sard, they achieved the best results of 70.2%, 81.1%, 59.7%, and 78.1% respectively on the F1-Score, and 88.4%,86.3%, 85.2%, and 88.6% on the AUC.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TSE.2023.3285910</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0003-1571-1932</orcidid><orcidid>https://orcid.org/0009-0007-2944-4632</orcidid><orcidid>https://orcid.org/0000-0002-7372-1760</orcidid><orcidid>https://orcid.org/0000-0002-9706-3950</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0098-5589
ispartof IEEE transactions on software engineering, 2023-08, Vol.49 (8), p.4152-4168
issn 0098-5589
1939-3520
language eng
recordid cdi_crossref_primary_10_1109_TSE_2023_3285910
source IEEE Electronic Library (IEL)
subjects Adaptation
Annotations
Code property graph
Codes
cross-domain vulnerability detection
domain adaptation representation learning
Feature extraction
graph attention network
Graph neural networks
Labels
Learning
Natural language processing
Neural networks
Security
Software
Software reliability
Task analysis
title CPVD: Cross Project Vulnerability Detection Based on Graph Attention Network and Domain Adaptation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T21%3A44%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=CPVD:%20Cross%20Project%20Vulnerability%20Detection%20Based%20on%20Graph%20Attention%20Network%20and%20Domain%20Adaptation&rft.jtitle=IEEE%20transactions%20on%20software%20engineering&rft.au=Zhang,%20Chunyong&rft.date=2023-08-01&rft.volume=49&rft.issue=8&rft.spage=4152&rft.epage=4168&rft.pages=4152-4168&rft.issn=0098-5589&rft.eissn=1939-3520&rft.coden=IESEDJ&rft_id=info:doi/10.1109/TSE.2023.3285910&rft_dat=%3Cproquest_RIE%3E2851355375%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2851355375&rft_id=info:pmid/&rft_ieee_id=10149539&rfr_iscdi=true
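
The description above says that CPVD narrows the gap between source-domain and target-domain feature distributions, but this record does not say which discrepancy measure the paper uses. As a rough illustration only, the sketch below assumes one common choice, the linear-kernel maximum mean discrepancy (the squared distance between the mean feature vectors of the two domains). The function names and the sample embeddings are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def mean_vector(features: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a non-empty batch of equal-length feature vectors."""
    if not features:
        raise ValueError("need at least one feature vector")
    dim = len(features[0])
    return [sum(row[i] for row in features) / len(features) for i in range(dim)]


def linear_mmd(source: Sequence[Sequence[float]],
               target: Sequence[Sequence[float]]) -> float:
    """Squared Euclidean distance between mean source and mean target features.

    Assumed stand-in for a domain discrepancy term; not the paper's loss.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


# Hypothetical 2-D graph embeddings for two projects (illustrative values only).
source_feats = [[1.0, 0.0], [3.0, 2.0]]
target_feats = [[2.0, 1.0], [4.0, 3.0]]
gap = linear_mmd(source_feats, target_feats)
```

In a training loop, a term like `gap` would typically be added to the classification loss on the labeled source project, so that the feature extractor is pushed toward embeddings that look alike across projects. A value of zero means the two domains have identical mean embeddings under this measure.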