What kinds of contracts do ML APIs need?

Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to A...

Bibliographic details
Published in: Empirical software engineering : an international journal, 2023-11, Vol. 28 (6), p. 142, Article 142
Main authors: Khairunnesa, Samantha Syeda; Ahmed, Shibbir; Imtiaz, Sayem Mohammad; Rajan, Hridesh; Leavens, Gary T.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 6
container_start_page 142
container_title Empirical software engineering : an international journal
container_volume 28
creator Khairunnesa, Samantha Syeda
Ahmed, Shibbir
Imtiaz, Sayem Mohammad
Rajan, Hridesh
Leavens, Gary T.
description Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of posts on Stack Overflow of the four most often-discussed ML libraries: TensorFlow, Scikit-learn, Keras, and PyTorch. For these libraries, our study extracted 413 informal (English) API specifications. We used these specifications to understand the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts require an advanced level of ML software expertise? Could checking contracts at the API level help detect the violations in early ML pipeline stages? Our key findings are that the most commonly needed contracts for ML APIs are either checking constraints on single arguments of an API or on the order of API calls. The software engineering community could employ existing contract mining approaches to mine these contracts to promote an increased understanding of ML APIs. We also noted a need to combine behavioral and temporal contract mining approaches. We report on categories of required ML contracts, which may help designers of contract languages.
doi_str_mv 10.1007/s10664-023-10320-z
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2878151581</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2878151581</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-ec8b1c52a88ef99f2192e64dffe15ab251d03ee81ff73c3b7cb9815ab403a8fb3</originalsourceid><addsrcrecordid>eNp9kMFKAzEQhoMoWKsv4CngxUt0JtlssicpxWqhogfFY8hmE23V3ZpsD_bpTV3Bm6cZmO__Bz5CThEuEEBdJoSyLBhwwRAEB7bdIyOUSjBVYrmfd6E5E1yWh-QopRUAVKqQI3L-_Gp7-rZsm0S7QF3X9tG6PtGmo3cLOnmYJ9p631wdk4Ng35M_-Z1j8jS7fpzessX9zXw6WTDHFfTMO12jk9xq7UNVBY4V92XRhOBR2ppLbEB4rzEEJZyolasrvbsUIKwOtRiTs6F3HbvPjU-9WXWb2OaXhmuVUZQaM8UHysUupeiDWcflh41fBsHsjJjBiMlGzI8Rs80hMYRShtsXH_-q_0l9A6eyYjc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2878151581</pqid></control><display><type>article</type><title>What kinds of contracts do ML APIs need?</title><source>SpringerNature Journals</source><creator>Khairunnesa, Samantha Syeda ; Ahmed, Shibbir ; Imtiaz, Sayem Mohammad ; Rajan, Hridesh ; Leavens, Gary T.</creator><creatorcontrib>Khairunnesa, Samantha Syeda ; Ahmed, Shibbir ; Imtiaz, Sayem Mohammad ; Rajan, Hridesh ; Leavens, Gary T.</creatorcontrib><description>Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of posts on Stack Overflow of the four most often-discussed ML libraries: TensorFlow , Scikit-learn , Keras , and PyTorch . For these libraries, our study extracted 413 informal (English) API specifications. 
We used these specifications to understand the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts require an advanced level of ML software expertise? Could checking contracts at the API level help detect the violations in early ML pipeline stages? Our key findings are that the most commonly needed contracts for ML APIs are either checking constraints on single arguments of an API or on the order of API calls. The software engineering community could employ existing contract mining approaches to mine these contracts to promote an increased understanding of ML APIs. We also noted a need to combine behavioral and temporal contract mining approaches. We report on categories of required ML contracts, which may help designers of contract languages.</description><identifier>ISSN: 1382-3256</identifier><identifier>EISSN: 1573-7616</identifier><identifier>DOI: 10.1007/s10664-023-10320-z</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Application programming interface ; Compilers ; Computer Science ; Contracts ; Interpreters ; Libraries ; Machine learning ; Programming Languages ; Questions ; Software engineering ; Software Engineering/Programming and Operating Systems ; Specifications ; Violations</subject><ispartof>Empirical software engineering : an international journal, 2023-11, Vol.28 (6), p.142, Article 142</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-ec8b1c52a88ef99f2192e64dffe15ab251d03ee81ff73c3b7cb9815ab403a8fb3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10664-023-10320-z$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10664-023-10320-z$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Khairunnesa, Samantha Syeda</creatorcontrib><creatorcontrib>Ahmed, Shibbir</creatorcontrib><creatorcontrib>Imtiaz, Sayem Mohammad</creatorcontrib><creatorcontrib>Rajan, Hridesh</creatorcontrib><creatorcontrib>Leavens, Gary T.</creatorcontrib><title>What kinds of contracts do ML APIs need?</title><title>Empirical software engineering : an international journal</title><addtitle>Empir Software Eng</addtitle><description>Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of posts on Stack Overflow of the four most often-discussed ML libraries: TensorFlow , Scikit-learn , Keras , and PyTorch . 
For these libraries, our study extracted 413 informal (English) API specifications. We used these specifications to understand the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts require an advanced level of ML software expertise? Could checking contracts at the API level help detect the violations in early ML pipeline stages? Our key findings are that the most commonly needed contracts for ML APIs are either checking constraints on single arguments of an API or on the order of API calls. The software engineering community could employ existing contract mining approaches to mine these contracts to promote an increased understanding of ML APIs. We also noted a need to combine behavioral and temporal contract mining approaches. We report on categories of required ML contracts, which may help designers of contract languages.</description><subject>Application programming interface</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Contracts</subject><subject>Interpreters</subject><subject>Libraries</subject><subject>Machine learning</subject><subject>Programming Languages</subject><subject>Questions</subject><subject>Software engineering</subject><subject>Software Engineering/Programming and Operating 
Systems</subject><subject>Specifications</subject><subject>Violations</subject><issn>1382-3256</issn><issn>1573-7616</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>AFKRA</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNp9kMFKAzEQhoMoWKsv4CngxUt0JtlssicpxWqhogfFY8hmE23V3ZpsD_bpTV3Bm6cZmO__Bz5CThEuEEBdJoSyLBhwwRAEB7bdIyOUSjBVYrmfd6E5E1yWh-QopRUAVKqQI3L-_Gp7-rZsm0S7QF3X9tG6PtGmo3cLOnmYJ9p631wdk4Ng35M_-Z1j8jS7fpzessX9zXw6WTDHFfTMO12jk9xq7UNVBY4V92XRhOBR2ppLbEB4rzEEJZyolasrvbsUIKwOtRiTs6F3HbvPjU-9WXWb2OaXhmuVUZQaM8UHysUupeiDWcflh41fBsHsjJjBiMlGzI8Rs80hMYRShtsXH_-q_0l9A6eyYjc</recordid><startdate>20231101</startdate><enddate>20231101</enddate><creator>Khairunnesa, Samantha Syeda</creator><creator>Ahmed, Shibbir</creator><creator>Imtiaz, Sayem Mohammad</creator><creator>Rajan, Hridesh</creator><creator>Leavens, Gary T.</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>L6V</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PTHSS</scope><scope>S0W</scope></search><sort><creationdate>20231101</creationdate><title>What kinds of contracts do ML APIs need?</title><author>Khairunnesa, Samantha Syeda ; Ahmed, Shibbir ; Imtiaz, Sayem Mohammad ; Rajan, Hridesh ; Leavens, Gary 
T.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-ec8b1c52a88ef99f2192e64dffe15ab251d03ee81ff73c3b7cb9815ab403a8fb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Application programming interface</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Contracts</topic><topic>Interpreters</topic><topic>Libraries</topic><topic>Machine learning</topic><topic>Programming Languages</topic><topic>Questions</topic><topic>Software engineering</topic><topic>Software Engineering/Programming and Operating Systems</topic><topic>Specifications</topic><topic>Violations</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Khairunnesa, Samantha Syeda</creatorcontrib><creatorcontrib>Ahmed, Shibbir</creatorcontrib><creatorcontrib>Imtiaz, Sayem Mohammad</creatorcontrib><creatorcontrib>Rajan, Hridesh</creatorcontrib><creatorcontrib>Leavens, Gary T.</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – 
Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Engineering Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Engineering Collection</collection><collection>DELNET Engineering &amp; Technology Collection</collection><jtitle>Empirical software engineering : an international journal</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Khairunnesa, Samantha Syeda</au><au>Ahmed, Shibbir</au><au>Imtiaz, Sayem Mohammad</au><au>Rajan, Hridesh</au><au>Leavens, Gary T.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>What kinds of contracts do ML APIs need?</atitle><jtitle>Empirical software engineering : an international journal</jtitle><stitle>Empir Software Eng</stitle><date>2023-11-01</date><risdate>2023</risdate><volume>28</volume><issue>6</issue><spage>142</spage><pages>142-</pages><artnum>142</artnum><issn>1382-3256</issn><eissn>1573-7616</eissn><abstract>Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of posts on Stack Overflow of the four most often-discussed ML libraries: TensorFlow , Scikit-learn , Keras , and PyTorch . 
For these libraries, our study extracted 413 informal (English) API specifications. We used these specifications to understand the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts require an advanced level of ML software expertise? Could checking contracts at the API level help detect the violations in early ML pipeline stages? Our key findings are that the most commonly needed contracts for ML APIs are either checking constraints on single arguments of an API or on the order of API calls. The software engineering community could employ existing contract mining approaches to mine these contracts to promote an increased understanding of ML APIs. We also noted a need to combine behavioral and temporal contract mining approaches. We report on categories of required ML contracts, which may help designers of contract languages.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10664-023-10320-z</doi></addata></record>
fulltext fulltext
identifier ISSN: 1382-3256
ispartof Empirical software engineering : an international journal, 2023-11, Vol.28 (6), p.142, Article 142
issn 1382-3256
1573-7616
language eng
recordid cdi_proquest_journals_2878151581
source SpringerNature Journals
subjects Application programming interface
Compilers
Computer Science
Contracts
Interpreters
Libraries
Machine learning
Programming Languages
Questions
Software engineering
Software Engineering/Programming and Operating Systems
Specifications
Violations
title What kinds of contracts do ML APIs need?
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T05%3A39%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=What%20kinds%20of%20contracts%20do%20ML%20APIs%20need?&rft.jtitle=Empirical%20software%20engineering%20:%20an%20international%20journal&rft.au=Khairunnesa,%20Samantha%20Syeda&rft.date=2023-11-01&rft.volume=28&rft.issue=6&rft.spage=142&rft.pages=142-&rft.artnum=142&rft.issn=1382-3256&rft.eissn=1573-7616&rft_id=info:doi/10.1007/s10664-023-10320-z&rft_dat=%3Cproquest_cross%3E2878151581%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2878151581&rft_id=info:pmid/&rfr_iscdi=true