Can Developers Prompt? A Controlled Experiment for Code Documentation Generation
Large language models (LLMs) bear great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experimen...
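Main authors: | Kruse, Hans-Alexander ; Puhlfürß, Tim ; Maalej, Walid |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence ; Computer Science - Human-Computer Interaction ; Computer Science - Software Engineering |
Online access: | Order full text |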
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Kruse, Hans-Alexander ; Puhlfürß, Tim ; Maalej, Walid |
description | Large language models (LLMs) bear great potential for automating tedious
development tasks such as creating and maintaining code documentation. However,
it is unclear to what extent developers can effectively prompt LLMs to create
concise and useful documentation. We report on a controlled experiment with 20
professionals and 30 computer science students tasked with code documentation
generation for two Python functions. The experimental group freely entered
ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the
control group executed a predefined few-shot prompt. Our results reveal that
professionals and students were unaware of or unable to apply prompt
engineering techniques. Especially students perceived the documentation
produced from ad-hoc prompts as significantly less readable, less concise, and
less helpful than documentation from prepared prompts. Some professionals
produced higher quality documentation by just including the keyword Docstring
in their ad-hoc prompts. While students desired more support in formulating
prompts, professionals appreciated the flexibility of ad-hoc prompting.
Participants in both groups rarely assessed the output as perfect. Instead,
they understood the tools as support to iteratively refine the documentation.
Further research is needed to understand which prompting skills and preferences
developers have and which support they need for certain tasks. |
doi_str_mv | 10.48550/arxiv.2408.00686 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2408_00686</recordid><sourceformat>XML</sourceformat><sourcetype>Open Access Repository</sourcetype><recordtype>article</recordtype></control><display><type>article</type><source>arXiv.org</source><creationdate>2024-08</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa></display><links><linktorsrc>$$Uhttps://arxiv.org/abs/2408.00686$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2408.00686$$DView paper in arXiv$$Hfree_for_read</backlink></links><addata><format>journal</format><genre>article</genre><ristype>JOUR</ristype><date>2024-08-01</date><risdate>2024</risdate></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2408.00686 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2408_00686 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence ; Computer Science - Human-Computer Interaction ; Computer Science - Software Engineering |
title | Can Developers Prompt? A Controlled Experiment for Code Documentation Generation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T22%3A19%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Can%20Developers%20Prompt?%20A%20Controlled%20Experiment%20for%20Code%20Documentation%20Generation&rft.au=Kruse,%20Hans-Alexander&rft.date=2024-08-01&rft_id=info:doi/10.48550/arxiv.2408.00686&rft_dat=%3Carxiv_GOX%3E2408_00686%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |