Can Developers Prompt? A Controlled Experiment for Code Documentation Generation

Large language models (LLMs) hold great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experiment with 20 professionals and 30 computer science students tasked with code documentation generation for two Python functions. The experimental group freely entered ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the control group executed a predefined few-shot prompt. Our results reveal that professionals and students were unaware of, or unable to apply, prompt engineering techniques. Students in particular perceived the documentation produced from ad-hoc prompts as significantly less readable, less concise, and less helpful than documentation from prepared prompts. Some professionals produced higher-quality documentation simply by including the keyword Docstring in their ad-hoc prompts. While students desired more support in formulating prompts, professionals appreciated the flexibility of ad-hoc prompting. Participants in both groups rarely assessed the output as perfect. Instead, they understood the tools as support to iteratively refine the documentation. Further research is needed to understand which prompting skills and preferences developers have and what support they need for certain tasks.
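The record does not include the study's actual prompt materials, so the following is only a minimal Python sketch of the contrast the abstract describes: a prepared few-shot prompt versus a bare ad-hoc prompt for docstring generation. The example functions, variable names, and prompt wording are assumptions for illustration, not the study's artifacts.

# Minimal sketch (assumed, not the study's actual materials): a predefined
# few-shot docstring prompt versus a freely typed ad-hoc prompt.

# Hypothetical function a participant wants documented.
EXAMPLE_FUNCTION = '''
def slugify(text):
    return text.lower().strip().replace(" ", "-")
'''

# Prepared few-shot prompt: shows the model one worked example before the task.
FEW_SHOT_PROMPT = f'''Write a concise Python docstring for the last function.

def add(a, b):
    return a + b

Docstring:
    """Return the sum of a and b."""

{EXAMPLE_FUNCTION}
Docstring:
'''

# Ad-hoc prompt: what a participant might type freely; per the abstract, even
# mentioning the keyword "Docstring" tended to improve the output.
AD_HOC_PROMPT = f"Write a Docstring for this function:\n{EXAMPLE_FUNCTION}"

Printing both strings makes the difference visible: the few-shot variant fixes the format and conciseness through the worked example, while the ad-hoc variant leaves those choices to the model.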

Bibliographic Details
Main Authors: Kruse, Hans-Alexander; Puhlfürß, Tim; Maalej, Walid
Format: Article
Language: eng
Subjects:
Online Access: Order full text
creator Kruse, Hans-Alexander
Puhlfürß, Tim
Maalej, Walid
description Large language models (LLMs) hold great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experiment with 20 professionals and 30 computer science students tasked with code documentation generation for two Python functions. The experimental group freely entered ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the control group executed a predefined few-shot prompt. Our results reveal that professionals and students were unaware of, or unable to apply, prompt engineering techniques. Students in particular perceived the documentation produced from ad-hoc prompts as significantly less readable, less concise, and less helpful than documentation from prepared prompts. Some professionals produced higher-quality documentation simply by including the keyword Docstring in their ad-hoc prompts. While students desired more support in formulating prompts, professionals appreciated the flexibility of ad-hoc prompting. Participants in both groups rarely assessed the output as perfect. Instead, they understood the tools as support to iteratively refine the documentation. Further research is needed to understand which prompting skills and preferences developers have and what support they need for certain tasks.
doi_str_mv 10.48550/arxiv.2408.00686
format Article
identifier DOI: 10.48550/arxiv.2408.00686
language eng
recordid cdi_arxiv_primary_2408_00686
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Human-Computer Interaction
Computer Science - Software Engineering
title Can Developers Prompt? A Controlled Experiment for Code Documentation Generation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T22%3A19%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Can%20Developers%20Prompt?%20A%20Controlled%20Experiment%20for%20Code%20Documentation%20Generation&rft.au=Kruse,%20Hans-Alexander&rft.date=2024-08-01&rft_id=info:doi/10.48550/arxiv.2408.00686&rft_dat=%3Carxiv_GOX%3E2408_00686%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true