COSET: A Benchmark for Evaluating Neural Program Embeddings

Bibliographic details
Main authors: Wang, Ke; Christodorescu, Mihai
Format: Article
Language: English
Subjects: Computer Science - Learning; Computer Science - Programming Languages; Statistics - Machine Learning
creator Wang, Ke
Christodorescu, Mihai
description Neural program embedding can be helpful in analyzing large software, a task that is challenging for traditional logic-based program analyses due to their limited scalability. A key focus of recent machine-learning advances in this area is on modeling program semantics instead of just syntax. Unfortunately, evaluating such advances is not obvious, as program semantics does not lend itself to straightforward metrics. In this paper, we introduce a benchmarking framework called COSET for standardizing the evaluation of neural program embeddings. COSET consists of a diverse dataset of programs in source-code format, labeled by human experts according to a number of program properties of interest. A point of novelty is a suite of program transformations included in COSET. These transformations, when applied to the base dataset, can simulate natural changes to program code due to optimization and refactoring and can serve as a "debugging" tool for classification mistakes. We conducted a pilot study on four prominent models: TreeLSTM, gated graph neural network (GGNN), AST-Path neural network (APNN), and DYPRO. We found that COSET is useful in identifying the strengths and limitations of each model and in pinpointing specific syntactic and semantic characteristics of programs that pose challenges.
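To make the idea of a semantics-preserving transformation concrete, the sketch below shows one plausible example of the refactoring-style changes such a suite could apply: consistent renaming of local variables and parameters. This is an illustrative assumption added to this record, not code taken from COSET; the point is that a model capturing program semantics rather than surface syntax should classify the renamed program exactly as it classifies the original.

```python
# Illustrative sketch (not from COSET): a semantics-preserving transformation
# that consistently renames function parameters and locally assigned variables.
# Requires Python 3.9+ for ast.unparse.
import ast


class RenameLocals(ast.NodeTransformer):
    """Rename parameters and locally assigned variables to v0, v1, ..."""

    def __init__(self):
        self.mapping = {}

    def _fresh(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):
        # Parameters are renamed first; a function's args are visited before its body.
        node.arg = self._fresh(node.arg)
        return node

    def visit_Name(self, node):
        # Rename assignment targets and any name already mapped; leave globals
        # and builtins such as range() untouched.
        if isinstance(node.ctx, ast.Store) or node.id in self.mapping:
            node.id = self._fresh(node.id)
        return node


original = """
def total(n):
    s = 0
    for i in range(n):
        s += i
    return s
"""

tree = ast.fix_missing_locations(RenameLocals().visit(ast.parse(original)))
transformed = ast.unparse(tree)
print(transformed)

# Behavior is unchanged, so a semantics-aware model should assign the same label.
old_env, new_env = {}, {}
exec(original, old_env)
exec(transformed, new_env)
assert old_env["total"](10) == new_env["total"](10) == 45
```

Applying a transformation like this to every labeled program yields pairs on which a model's prediction should not change; a disagreement on such a pair suggests the model is latching onto identifier names or other surface syntax rather than behavior, which is the "debugging" use of the transformation suite described in the abstract.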
doi_str_mv 10.48550/arxiv.1905.11445
format Article
creationdate 2019-05-27
rights http://arxiv.org/licenses/nonexclusive-distrib/1.0
identifier DOI: 10.48550/arxiv.1905.11445
language eng
source arXiv.org
subjects Computer Science - Learning
Computer Science - Programming Languages
Statistics - Machine Learning
title COSET: A Benchmark for Evaluating Neural Program Embeddings
url https://arxiv.org/abs/1905.11445