Getting pwn'd by AI: Penetration Testing with Large Language Models
The field of software security testing, more specifically penetration testing, requires high levels of expertise and involves many manual testing and analysis steps. This paper explores the potential use of large language models, such as GPT-3.5, to augment penetration testers with AI sparring partners. We explore the feasibility of supplementing penetration testers with AI models for two distinct use cases: high-level task planning for security testing assignments and low-level vulnerability hunting within a vulnerable virtual machine. For the latter, we implemented a closed feedback loop between LLM-generated low-level actions and a vulnerable virtual machine (connected through SSH), allowing the LLM to analyze the machine state for vulnerabilities and suggest concrete attack vectors, which were automatically executed within the virtual machine. We discuss promising initial results, detail avenues for improvement, and close by deliberating on the ethics of providing AI-based sparring partners.
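The closed feedback loop described in the abstract can be pictured concretely. The sketch below is illustrative only and not the authors' implementation: it assumes the paramiko SSH library and the OpenAI chat completions API, and the system prompt, model name, and success check are placeholder assumptions.

```python
# Minimal sketch of a closed feedback loop between an LLM and a vulnerable VM.
# Assumptions (not from the paper): paramiko for SSH, the OpenAI chat API,
# and a low-privilege account on an authorized lab target.
import paramiko
from openai import OpenAI

SYSTEM_PROMPT = (
    "You are assisting a penetration tester on an authorized lab VM. "
    "Reply with exactly one Linux command that helps escalate privileges; "
    "no explanations."
)

def run_loop(host: str, user: str, password: str, max_rounds: int = 10) -> None:
    llm = OpenAI()  # reads OPENAI_API_KEY from the environment

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username=user, password=password)

    history = [{"role": "system", "content": SYSTEM_PROMPT}]
    for _ in range(max_rounds):
        # Ask the LLM for the next low-level action.
        reply = llm.chat.completions.create(model="gpt-3.5-turbo", messages=history)
        cmd = reply.choices[0].message.content.strip()

        # Execute the suggested command on the VM and capture its output.
        _, stdout, stderr = ssh.exec_command(cmd, timeout=30)
        output = stdout.read().decode() + stderr.read().decode()
        print(f"$ {cmd}\n{output}")

        # Feed the machine state back so the LLM can refine its next step.
        history.append({"role": "assistant", "content": cmd})
        history.append({"role": "user", "content": f"Output:\n{output[:2000]}"})

        if "uid=0" in output:  # crude, illustrative success check
            print("Privilege escalation appears successful.")
            break

    ssh.close()
```

Truncating the command output before feeding it back keeps the conversation within the model's context window, which is the main practical constraint on such a loop.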
Saved in:
Published in: | arXiv.org, 2023-08 |
---|---|
Main authors: | Happe, Andreas; Cito, Jürgen |
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Cryptography and Security; Computer Science - Software Engineering; Feedback loops; Large language models; Security; Virtual environments |
DOI: | 10.48550/arxiv.2308.00121 |
EISSN: | 2331-8422 |
Online access: | Full text |