Precise and Scalable Querying of Syntactical Source Code Patterns Using Sample Code Snippets and a Database

While analyzing a log file of a text-based source code search engine we discovered that developers search for fine-grained syntactical patterns in 36% of queries. Currently, to cope with queries of this kind developers need to use regular expressions, to add redundant terms to the query or to combin...

Bibliographic Details
Main authors: Panchenko, O., Karstens, J., Plattner, H., Zeier, A.
Format: Conference proceeding
Language: eng
container_end_page 50
container_issue
container_start_page 41
container_title
container_volume
creator Panchenko, O.
Karstens, J.
Plattner, H.
Zeier, A.
description While analyzing a log file of a text-based source code search engine we discovered that developers search for fine-grained syntactical patterns in 36% of queries. Currently, to cope with queries of this kind developers need to use regular expressions, to add redundant terms to the query or to combine searching with other tools provided by the development environment. To improve the expressiveness of the queries, these can be formulated as tree patterns of abstract syntax trees. These search patterns can be expressed by using query languages, such as XPath. However, developers usually do not work with either XPath or with AST. To shield developers from the complexity of query formulation we propose using sample code snippets as queries. The novelty of our approach is the combination of a query language that is very close to the surface programming language and a special database technology to store a large amount of abstract syntax trees. The advantage of this approach over existing source code query languages and search engines is the performance of both query formulation and query execution. This paper describes the technical details of the method and illustrates the value of this approach with performance measures and an industrial controlled experiment. All developers were able to complete the tasks of the experiment faster and more accurately by using our tool (ACS) than by using a text-based search engine. The number of false positives in the result lists was significantly decreased.
doi_str_mv 10.1109/ICPC.2011.31
format Conference Proceeding
isbn 9781612843087
1612843085
eisbn 9780769543987
0769543987
publisher IEEE
date 2011-06
tpages 10
fulltext fulltext_linktorsrc
identifier ISSN: 1092-8138
ispartof 2011 IEEE 19th International Conference on Program Comprehension, 2011, p.41-50
issn 1092-8138
language eng
recordid cdi_ieee_primary_5970162
source IEEE Electronic Library (IEL) Conference Proceedings
subjects abstract syntax trees
Data models
Database languages
query-by-example
Search engines
source code query language
source code search
Syntactics
XML
XPath
title Precise and Scalable Querying of Syntactical Source Code Patterns Using Sample Code Snippets and a Database
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T12%3A46%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Precise%20and%20Scalable%20Querying%20of%20Syntactical%20Source%20Code%20Patterns%20Using%20Sample%20Code%20Snippets%20and%20a%20Database&rft.btitle=2011%20IEEE%2019th%20International%20Conference%20on%20Program%20Comprehension&rft.au=Panchenko,%20O.&rft.date=2011-06&rft.spage=41&rft.epage=50&rft.pages=41-50&rft.issn=1092-8138&rft.isbn=9781612843087&rft.isbn_list=1612843085&rft_id=info:doi/10.1109/ICPC.2011.31&rft_dat=%3Cieee_6IE%3E5970162%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9780769543987&rft.eisbn_list=0769543987&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5970162&rfr_iscdi=true