Grammar Mutation for Testing Input Parsers

Grammar-based fuzzing is an effective method for testing programs that consume structured inputs, particularly input parsers. However, if the available grammar does not accurately represent the input format, or if the system under test (SUT) does not conform strictly to the grammar, there may be an...

Bibliographic details
Published in:	ACM transactions on software engineering and methodology 2024-12
Main authors:	Bendrissou, Bachir; Cadar, Cristian; Donaldson, Alastair F.
Format:	Article
Language:	eng
Subjects:	Software and its engineering; Software testing and debugging
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	ACM transactions on software engineering and methodology
container_volume
creator	Bendrissou, Bachir; Cadar, Cristian; Donaldson, Alastair F.
description	Grammar-based fuzzing is an effective method for testing programs that consume structured inputs, particularly input parsers. However, if the available grammar does not accurately represent the input format, or if the system under test (SUT) does not conform strictly to the grammar, there may be an impedance mismatch between inputs generated via grammars and inputs accepted by the SUT. Even if the SUT has been designed to strictly conform to the grammar, the SUT parser may exhibit vulnerabilities that would only be triggered by slightly invalid inputs. Grammar-based generation, by construction, will not yield such edge case inputs. To overcome these limitations, we present two mutational-based approaches: Gmutator and G+M. Both approaches are built upon Grammarinator, a grammar-based generator. Gmutator applies mutations to the grammar input of Grammarinator, while G+M directly applies byte-level mutations to Grammarinator-generated inputs. To evaluate the effectiveness of these techniques (Grammarinator, Gmutator, G+M) in testing programs that parse various input formats, we conducted an experimental evaluation over four different input formats and twelve SUTs (three per input format). Our findings suggest that both Gmutator and G+M excel in generating edge case inputs, facilitating the detection of disparities between input specifications and parser implementations.
doi_str_mv	10.1145/3708517
format	Article
fullrecord	<record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3708517</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3708517</sourcerecordid><originalsourceid>FETCH-LOGICAL-a517-b442bbe40ad2d542731c95e9c8530802372bace5aa2be668e05cdba84625d2b23</originalsourceid><addsrcrecordid>eNo9j8FLwzAUxoMoOKd495SbINS9vCRNepTh5mBDDz14Ky9pKhXbjqQ7-N9b2fT0ffD9-N77GLsV8CiE0gtpwGphzthMaG0yIws8nzyoIpNSvF-yq5Q-AYQEVDP2sI7UdRT57jDS2A49b4bIy5DGtv_gm35_GPkbxRRiumYXDX2lcHPSOStXz-XyJdu-rjfLp21G09nMKYXOBQVUY60VGil8oUPhrZZgAaVBRz5oInQhz20A7WtHVuWoa3Qo5-z-WOvjkFIMTbWP7fThdyWg-l1YnRZO5N2RJN_9Q3_hD1uHSbY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Grammar Mutation for Testing Input Parsers</title><source>ACM Digital Library Complete</source><creator>Bendrissou, Bachir ; Cadar, Cristian ; Donaldson, Alastair F.</creator><creatorcontrib>Bendrissou, Bachir ; Cadar, Cristian ; Donaldson, Alastair F.</creatorcontrib><description>Grammar-based fuzzing is an effective method for testing programs that consume structured inputs, particularly input parsers. However, if the available grammar does not accurately represent the input format, or if the system under test (SUT) does not conform strictly to the grammar, there may be an impedance mismatch between inputs generated via grammars and inputs accepted by the SUT. Even if the SUT has been designed to strictly conform to the grammar, the SUT parser may exhibit vulnerabilities that would only be triggered by slightly invalid inputs. Grammar-based generation, by construction, will not yield such edge case inputs. To overcome these limitations, we present two mutational-based approaches: Gmutator and G+M. Both approaches are built upon Grammarinator, a grammar-based generator. 
Gmutator applies mutations to the grammar input of Grammarinator, while G+M directly applies byte-level mutations to Grammarinator-generated inputs. To evaluate the effectiveness of these techniques (Grammarinator, Gmutator, G+M) in testing programs that parse various input formats, we conducted an experimental evaluation over four different input formats and twelve SUTs (three per input format). Our findings suggest that both Gmutator and G+M excel in generating edge case inputs, facilitating the detection of disparities between input specifications and parser implementations.</description><identifier>ISSN: 1049-331X</identifier><identifier>EISSN: 1557-7392</identifier><identifier>DOI: 10.1145/3708517</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Software and its engineering ; Software testing and debugging</subject><ispartof>ACM transactions on software engineering and methodology, 2024-12</ispartof><rights>Copyright held by the owner/author(s).</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a517-b442bbe40ad2d542731c95e9c8530802372bace5aa2be668e05cdba84625d2b23</cites><orcidid>0000-0002-7448-7961 ; 0000-0002-3599-7264 ; 0000-0002-2864-1892</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27923,27924</link.rule.ids></links><search><creatorcontrib>Bendrissou, Bachir</creatorcontrib><creatorcontrib>Cadar, Cristian</creatorcontrib><creatorcontrib>Donaldson, Alastair F.</creatorcontrib><title>Grammar Mutation for Testing Input Parsers</title><title>ACM transactions on software engineering and methodology</title><addtitle>ACM TOSEM</addtitle><description>Grammar-based fuzzing is an effective method for testing programs that consume structured inputs, particularly input parsers. 
However, if the available grammar does not accurately represent the input format, or if the system under test (SUT) does not conform strictly to the grammar, there may be an impedance mismatch between inputs generated via grammars and inputs accepted by the SUT. Even if the SUT has been designed to strictly conform to the grammar, the SUT parser may exhibit vulnerabilities that would only be triggered by slightly invalid inputs. Grammar-based generation, by construction, will not yield such edge case inputs. To overcome these limitations, we present two mutational-based approaches: Gmutator and G+M. Both approaches are built upon Grammarinator, a grammar-based generator. Gmutator applies mutations to the grammar input of Grammarinator, while G+M directly applies byte-level mutations to Grammarinator-generated inputs. To evaluate the effectiveness of these techniques (Grammarinator, Gmutator, G+M) in testing programs that parse various input formats, we conducted an experimental evaluation over four different input formats and twelve SUTs (three per input format). 
Our findings suggest that both Gmutator and G+M excel in generating edge case inputs, facilitating the detection of disparities between input specifications and parser implementations.</description><subject>Software and its engineering</subject><subject>Software testing and debugging</subject><issn>1049-331X</issn><issn>1557-7392</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo9j8FLwzAUxoMoOKd495SbINS9vCRNepTh5mBDDz14Ky9pKhXbjqQ7-N9b2fT0ffD9-N77GLsV8CiE0gtpwGphzthMaG0yIws8nzyoIpNSvF-yq5Q-AYQEVDP2sI7UdRT57jDS2A49b4bIy5DGtv_gm35_GPkbxRRiumYXDX2lcHPSOStXz-XyJdu-rjfLp21G09nMKYXOBQVUY60VGil8oUPhrZZgAaVBRz5oInQhz20A7WtHVuWoa3Qo5-z-WOvjkFIMTbWP7fThdyWg-l1YnRZO5N2RJN_9Q3_hD1uHSbY</recordid><startdate>20241220</startdate><enddate>20241220</enddate><creator>Bendrissou, Bachir</creator><creator>Cadar, Cristian</creator><creator>Donaldson, Alastair F.</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-7448-7961</orcidid><orcidid>https://orcid.org/0000-0002-3599-7264</orcidid><orcidid>https://orcid.org/0000-0002-2864-1892</orcidid></search><sort><creationdate>20241220</creationdate><title>Grammar Mutation for Testing Input Parsers</title><author>Bendrissou, Bachir ; Cadar, Cristian ; Donaldson, Alastair F.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a517-b442bbe40ad2d542731c95e9c8530802372bace5aa2be668e05cdba84625d2b23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Software and its engineering</topic><topic>Software testing and debugging</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bendrissou, Bachir</creatorcontrib><creatorcontrib>Cadar, Cristian</creatorcontrib><creatorcontrib>Donaldson, Alastair F.</creatorcontrib><collection>CrossRef</collection><jtitle>ACM 
transactions on software engineering and methodology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Bendrissou, Bachir</au><au>Cadar, Cristian</au><au>Donaldson, Alastair F.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Grammar Mutation for Testing Input Parsers</atitle><jtitle>ACM transactions on software engineering and methodology</jtitle><stitle>ACM TOSEM</stitle><date>2024-12-20</date><risdate>2024</risdate><issn>1049-331X</issn><eissn>1557-7392</eissn><abstract>Grammar-based fuzzing is an effective method for testing programs that consume structured inputs, particularly input parsers. However, if the available grammar does not accurately represent the input format, or if the system under test (SUT) does not conform strictly to the grammar, there may be an impedance mismatch between inputs generated via grammars and inputs accepted by the SUT. Even if the SUT has been designed to strictly conform to the grammar, the SUT parser may exhibit vulnerabilities that would only be triggered by slightly invalid inputs. Grammar-based generation, by construction, will not yield such edge case inputs. To overcome these limitations, we present two mutational-based approaches: Gmutator and G+M. Both approaches are built upon Grammarinator, a grammar-based generator. Gmutator applies mutations to the grammar input of Grammarinator, while G+M directly applies byte-level mutations to Grammarinator-generated inputs. To evaluate the effectiveness of these techniques (Grammarinator, Gmutator, G+M) in testing programs that parse various input formats, we conducted an experimental evaluation over four different input formats and twelve SUTs (three per input format). 
Our findings suggest that both Gmutator and G+M excel in generating edge case inputs, facilitating the detection of disparities between input specifications and parser implementations.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3708517</doi><orcidid>https://orcid.org/0000-0002-7448-7961</orcidid><orcidid>https://orcid.org/0000-0002-3599-7264</orcidid><orcidid>https://orcid.org/0000-0002-2864-1892</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1049-331X
ispartof	ACM transactions on software engineering and methodology, 2024-12
issn	1049-331X 1557-7392
language	eng
recordid	cdi_crossref_primary_10_1145_3708517
source	ACM Digital Library Complete
subjects	Software and its engineering; Software testing and debugging
title	Grammar Mutation for Testing Input Parsers
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T21%3A57%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Grammar%20Mutation%20for%20Testing%20Input%20Parsers&rft.jtitle=ACM%20transactions%20on%20software%20engineering%20and%20methodology&rft.au=Bendrissou,%20Bachir&rft.date=2024-12-20&rft.issn=1049-331X&rft.eissn=1557-7392&rft_id=info:doi/10.1145/3708517&rft_dat=%3Cacm_cross%3E3708517%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
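
The abstract in this record describes two mutation strategies built on Grammarinator: Gmutator mutates the grammar before generation, while G+M applies byte-level mutations to already generated inputs. A minimal Python sketch of that distinction follows. It does not use Grammarinator and does not reproduce the authors' mutation operators; the toy grammar and the helper names `generate`, `mutate_grammar`, and `mutate_bytes` are hypothetical, and the single-symbol deletion and single-bit flip are stand-in mutations chosen only for illustration.

```python
import random

# Toy grammar (an assumption, not taken from the paper): each nonterminal maps
# to a list of alternatives, and each alternative is a list of symbols.
# A symbol that is a key of the dict is a nonterminal; anything else is a literal.
GRAMMAR = {
    "value": [["[", "items", "]"], ["digit"]],
    "items": [["value"], ["value", ",", "items"]],
    "digit": [["0"], ["1"], ["7"]],
}


def generate(grammar, symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a string by picking random alternatives."""
    if symbol not in grammar:
        return symbol  # literal
    alternatives = grammar[symbol]
    if depth >= max_depth:
        # Past the depth budget, take the shortest alternative so expansion stops.
        alternatives = [min(alternatives, key=len)]
    chosen = rng.choice(alternatives)
    return "".join(
        generate(grammar, s, rng, depth + 1, max_depth) for s in chosen
    )


def mutate_grammar(grammar, rng):
    """Grammar-level mutation: copy the grammar and delete one symbol
    from one randomly chosen non-empty alternative."""
    mutated = {nt: [list(alt) for alt in alts] for nt, alts in grammar.items()}
    candidates = [
        (nt, i)
        for nt, alts in mutated.items()
        for i, alt in enumerate(alts)
        if alt
    ]
    nt, i = rng.choice(candidates)
    alt = mutated[nt][i]
    del alt[rng.randrange(len(alt))]
    return mutated


def mutate_bytes(data, rng):
    """Byte-level mutation: flip one random bit in one random byte."""
    if not data:
        return data
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


if __name__ == "__main__":
    rng = random.Random(0)
    # Baseline: input that is valid with respect to the grammar.
    valid = generate(GRAMMAR, "value", rng)
    # Grammar-mutation style: generate from a slightly altered grammar.
    from_mutated_grammar = generate(mutate_grammar(GRAMMAR, rng), "value", rng)
    # Byte-mutation style: corrupt a valid input after generation.
    corrupted = mutate_bytes(valid.encode("ascii"), rng)
    print(valid, from_mutated_grammar, corrupted)
```

In this sketch, an input produced from the mutated grammar is still structured like the format but may violate it in one place (for example, a missing bracket or separator), whereas the bit flip can yield arbitrary bytes unrelated to the grammar's structure.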