Getting pwn'd by AI: Penetration Testing with Large Language Models
The field of software security testing, more specifically penetration testing, requires high levels of expertise and involves many manual testing and analysis steps. This paper explores the potential use of large language models, such as GPT-3.5, to augment penetration testers with AI sparring partners. We explore the feasibility of supplementing penetration testers with AI models for two distinct use cases: high-level task planning for security testing assignments and low-level vulnerability hunting within a vulnerable virtual machine. For the latter, we implemented a closed feedback loop between LLM-generated low-level actions and a vulnerable virtual machine (connected through SSH), allowing the LLM to analyze the machine state for vulnerabilities and suggest concrete attack vectors, which were automatically executed within the virtual machine. We discuss promising initial results, detail avenues for improvement, and close by deliberating on the ethics of providing AI-based sparring partners.
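The closed feedback loop described in the abstract can be pictured concretely. The sketch below is illustrative only and not the authors' implementation: it assumes the paramiko SSH library and the OpenAI chat completions API, and the system prompt, model name, and success check are placeholder assumptions.

```python
# Minimal sketch of a closed feedback loop between an LLM and a vulnerable VM.
# Assumptions (not from the paper): paramiko for SSH, the OpenAI chat API,
# and a low-privilege account on an authorized lab target.
import paramiko
from openai import OpenAI

SYSTEM_PROMPT = (
    "You are assisting a penetration tester on an authorized lab VM. "
    "Reply with exactly one Linux command that helps escalate privileges; "
    "no explanations."
)

def run_loop(host: str, user: str, password: str, max_rounds: int = 10) -> None:
    llm = OpenAI()  # reads OPENAI_API_KEY from the environment

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username=user, password=password)

    history = [{"role": "system", "content": SYSTEM_PROMPT}]
    for _ in range(max_rounds):
        # Ask the LLM for the next low-level action.
        reply = llm.chat.completions.create(model="gpt-3.5-turbo", messages=history)
        cmd = reply.choices[0].message.content.strip()

        # Execute the suggested command on the VM and capture its output.
        _, stdout, stderr = ssh.exec_command(cmd, timeout=30)
        output = stdout.read().decode() + stderr.read().decode()
        print(f"$ {cmd}\n{output}")

        # Feed the machine state back so the LLM can refine its next step.
        history.append({"role": "assistant", "content": cmd})
        history.append({"role": "user", "content": f"Output:\n{output[:2000]}"})

        if "uid=0" in output:  # crude, illustrative success check
            print("Privilege escalation appears successful.")
            break

    ssh.close()
```

Truncating the command output before feeding it back keeps the conversation within the model's context window, which is the main practical constraint on such a loop.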
Saved in:
Published in: | arXiv.org, 2023-08 |
---|---|
Main authors: | Happe, Andreas; Cito, Jürgen |
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Cryptography and Security; Computer Science - Software Engineering; Feedback loops; Large language models; Security; Virtual environments |
DOI: | 10.48550/arxiv.2308.00121 |
EISSN: | 2331-8422 |
Online access: | Full text |