Can Developers Prompt? A Controlled Experiment for Code Documentation Generation

Large language models (LLMs) hold great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experiment with 20 professionals and 30 computer science students tasked with code documentation generation for two Python functions. The experimental group freely entered ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the control group executed a predefined few-shot prompt. Our results reveal that professionals and students were unaware of, or unable to apply, prompt engineering techniques. Students in particular perceived the documentation produced from ad-hoc prompts as significantly less readable, less concise, and less helpful than documentation from prepared prompts. Some professionals produced higher-quality documentation simply by including the keyword Docstring in their ad-hoc prompts. While students desired more support in formulating prompts, professionals appreciated the flexibility of ad-hoc prompting. Participants in both groups rarely assessed the output as perfect. Instead, they understood the tools as support to iteratively refine the documentation. Further research is needed to understand which prompting skills and preferences developers have and what support they need for certain tasks.
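The record does not include the study's actual prompt materials, so the following is only a minimal Python sketch of the contrast the abstract describes: a prepared few-shot prompt versus a bare ad-hoc prompt for docstring generation. The example functions, variable names, and prompt wording are assumptions for illustration, not the study's artifacts.

# Minimal sketch (assumed, not the study's actual materials): a predefined
# few-shot docstring prompt versus a freely typed ad-hoc prompt.

# Hypothetical function a participant wants documented.
EXAMPLE_FUNCTION = '''
def slugify(text):
    return text.lower().strip().replace(" ", "-")
'''

# Prepared few-shot prompt: shows the model one worked example before the task.
FEW_SHOT_PROMPT = f'''Write a concise Python docstring for the last function.

def add(a, b):
    return a + b

Docstring:
    """Return the sum of a and b."""

{EXAMPLE_FUNCTION}
Docstring:
'''

# Ad-hoc prompt: what a participant might type freely; per the abstract, even
# mentioning the keyword "Docstring" tended to improve the output.
AD_HOC_PROMPT = f"Write a Docstring for this function:\n{EXAMPLE_FUNCTION}"

Printing both strings makes the difference visible: the few-shot variant fixes the format and conciseness through the worked example, while the ad-hoc variant leaves those choices to the model.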

Bibliographic Details
Main Authors: Kruse, Hans-Alexander; Puhlfürß, Tim; Maalej, Walid
Format: Article
Language: eng
Subjects:
Online Access: Order full text
creator Kruse, Hans-Alexander
Puhlfürß, Tim
Maalej, Walid
description Large language models (LLMs) hold great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experiment with 20 professionals and 30 computer science students tasked with code documentation generation for two Python functions. The experimental group freely entered ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the control group executed a predefined few-shot prompt. Our results reveal that professionals and students were unaware of, or unable to apply, prompt engineering techniques. Students in particular perceived the documentation produced from ad-hoc prompts as significantly less readable, less concise, and less helpful than documentation from prepared prompts. Some professionals produced higher-quality documentation simply by including the keyword Docstring in their ad-hoc prompts. While students desired more support in formulating prompts, professionals appreciated the flexibility of ad-hoc prompting. Participants in both groups rarely assessed the output as perfect. Instead, they understood the tools as support to iteratively refine the documentation. Further research is needed to understand which prompting skills and preferences developers have and what support they need for certain tasks.
doi_str_mv 10.48550/arxiv.2408.00686
format Article
identifier DOI: 10.48550/arxiv.2408.00686
language eng
recordid cdi_arxiv_primary_2408_00686
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Human-Computer Interaction
Computer Science - Software Engineering
title Can Developers Prompt? A Controlled Experiment for Code Documentation Generation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T22%3A19%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Can%20Developers%20Prompt?%20A%20Controlled%20Experiment%20for%20Code%20Documentation%20Generation&rft.au=Kruse,%20Hans-Alexander&rft.date=2024-08-01&rft_id=info:doi/10.48550/arxiv.2408.00686&rft_dat=%3Carxiv_GOX%3E2408_00686%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true