Is This the Same Code? A Comprehensive Study of Decompilation Techniques for WebAssembly Binaries

WebAssembly (WASM) is a low-level bytecode language designed for client-side execution in web browsers. The need for decompilation techniques that recover high-level source code from WASM binaries has grown as WASM continues to gain widespread adoption and as security concerns about it grow. However, little research has...

Bibliographic details
Main authors: Wu, Wei-Cheng; Yan, Yutian; Egilsson, Hallgrimur David; Park, David; Chan, Steven; Hauser, Christophe; Wang, Weihang
Format: Article
Language: eng
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Wu, Wei-Cheng
Yan, Yutian
Egilsson, Hallgrimur David
Park, David
Chan, Steven
Hauser, Christophe
Wang, Weihang
description WebAssembly (WASM) is a low-level bytecode language designed for client-side execution in web browsers. The need for decompilation techniques that recover high-level source code from WASM binaries has grown as WASM continues to gain widespread adoption and as security concerns about it grow. However, little research has been done to assess the quality of code decompiled from WASM. This paper aims to fill this gap with a comprehensive comparative analysis between C code decompiled from WASM binaries and the output of state-of-the-art native binary decompilers. We present a novel framework for empirically evaluating C-based decompilers along several dimensions, including correctness, readability, and structural similarity. The proposed metrics are validated for practicality in decompiler assessment and provide insightful observations about the characteristics and constraints of existing decompiled code. This, in turn, contributes to bolstering the security and reliability of software systems that rely on WASM and native binaries.
doi_str_mv 10.48550/arxiv.2411.02278
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2411_02278</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2411_02278</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2411_022783</originalsourceid><addsrcrecordid>eNqFjrEOgjAUALs4GPUDnHw_IAJCZDOIGp0lcSQFHulLaIstEPl7kbg73XA3HGNrz3WCKAzdHTdv6h0_8DzH9f1DNGf8biEVZKEVCA8uERJd4hHikbIxKFBZ6kfVduUAuoIzFqOgmrekFaRYCEWvDi1U2sAT89halHk9wIkUN4R2yWYVry2uflywzfWSJrftNJM1hiQ3Q_adyqap_f_iAxVsQhU</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Is This the Same Code? A Comprehensive Study of Decompilation Techniques for WebAssembly Binaries</title><source>arXiv.org</source><creator>Wu, Wei-Cheng ; Yan, Yutian ; Egilsson, Hallgrimur David ; Park, David ; Chan, Steven ; Hauser, Christophe ; Wang, Weihang</creator><creatorcontrib>Wu, Wei-Cheng ; Yan, Yutian ; Egilsson, Hallgrimur David ; Park, David ; Chan, Steven ; Hauser, Christophe ; Wang, Weihang</creatorcontrib><description>WebAssembly is a low-level bytecode language designed for client-side execution in web browsers. The need for decompilation techniques that recover high-level source code from WASM binaries has grown as WASM continues to gain widespread adoption and its security concerns. However little research has been done to assess the quality of decompiled code from WASM. This paper aims to fill this gap by conducting a comprehensive comparative analysis between decompiled C code from WASM binaries and state-of-the-art native binary decompilers. We presented a novel framework for empirically evaluating C-based decompilers from various aspects including correctness/ readability/ and structural similarity. 
The proposed metrics are validated practicality in decompiler assessment and provided insightful observations regarding the characteristics and constraints of existing decompiled code. This in turn contributes to bolstering the security and reliability of software systems that rely on WASM and native binaries.</description><identifier>DOI: 10.48550/arxiv.2411.02278</identifier><language>eng</language><subject>Computer Science - Software Engineering</subject><creationdate>2024-11</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2411.02278$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2411.02278$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Wu, Wei-Cheng</creatorcontrib><creatorcontrib>Yan, Yutian</creatorcontrib><creatorcontrib>Egilsson, Hallgrimur David</creatorcontrib><creatorcontrib>Park, David</creatorcontrib><creatorcontrib>Chan, Steven</creatorcontrib><creatorcontrib>Hauser, Christophe</creatorcontrib><creatorcontrib>Wang, Weihang</creatorcontrib><title>Is This the Same Code? A Comprehensive Study of Decompilation Techniques for WebAssembly Binaries</title><description>WebAssembly is a low-level bytecode language designed for client-side execution in web browsers. The need for decompilation techniques that recover high-level source code from WASM binaries has grown as WASM continues to gain widespread adoption and its security concerns. However little research has been done to assess the quality of decompiled code from WASM. 
This paper aims to fill this gap by conducting a comprehensive comparative analysis between decompiled C code from WASM binaries and state-of-the-art native binary decompilers. We presented a novel framework for empirically evaluating C-based decompilers from various aspects including correctness/ readability/ and structural similarity. The proposed metrics are validated practicality in decompiler assessment and provided insightful observations regarding the characteristics and constraints of existing decompiled code. This in turn contributes to bolstering the security and reliability of software systems that rely on WASM and native binaries.</description><subject>Computer Science - Software Engineering</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNqFjrEOgjAUALs4GPUDnHw_IAJCZDOIGp0lcSQFHulLaIstEPl7kbg73XA3HGNrz3WCKAzdHTdv6h0_8DzH9f1DNGf8biEVZKEVCA8uERJd4hHikbIxKFBZ6kfVduUAuoIzFqOgmrekFaRYCEWvDi1U2sAT89halHk9wIkUN4R2yWYVry2uflywzfWSJrftNJM1hiQ3Q_adyqap_f_iAxVsQhU</recordid><startdate>20241104</startdate><enddate>20241104</enddate><creator>Wu, Wei-Cheng</creator><creator>Yan, Yutian</creator><creator>Egilsson, Hallgrimur David</creator><creator>Park, David</creator><creator>Chan, Steven</creator><creator>Hauser, Christophe</creator><creator>Wang, Weihang</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241104</creationdate><title>Is This the Same Code? 
A Comprehensive Study of Decompilation Techniques for WebAssembly Binaries</title><author>Wu, Wei-Cheng ; Yan, Yutian ; Egilsson, Hallgrimur David ; Park, David ; Chan, Steven ; Hauser, Christophe ; Wang, Weihang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2411_022783</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Software Engineering</topic><toplevel>online_resources</toplevel><creatorcontrib>Wu, Wei-Cheng</creatorcontrib><creatorcontrib>Yan, Yutian</creatorcontrib><creatorcontrib>Egilsson, Hallgrimur David</creatorcontrib><creatorcontrib>Park, David</creatorcontrib><creatorcontrib>Chan, Steven</creatorcontrib><creatorcontrib>Hauser, Christophe</creatorcontrib><creatorcontrib>Wang, Weihang</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wu, Wei-Cheng</au><au>Yan, Yutian</au><au>Egilsson, Hallgrimur David</au><au>Park, David</au><au>Chan, Steven</au><au>Hauser, Christophe</au><au>Wang, Weihang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Is This the Same Code? A Comprehensive Study of Decompilation Techniques for WebAssembly Binaries</atitle><date>2024-11-04</date><risdate>2024</risdate><abstract>WebAssembly is a low-level bytecode language designed for client-side execution in web browsers. The need for decompilation techniques that recover high-level source code from WASM binaries has grown as WASM continues to gain widespread adoption and its security concerns. However little research has been done to assess the quality of decompiled code from WASM. 
This paper aims to fill this gap by conducting a comprehensive comparative analysis between decompiled C code from WASM binaries and state-of-the-art native binary decompilers. We presented a novel framework for empirically evaluating C-based decompilers from various aspects including correctness/ readability/ and structural similarity. The proposed metrics are validated practicality in decompiler assessment and provided insightful observations regarding the characteristics and constraints of existing decompiled code. This in turn contributes to bolstering the security and reliability of software systems that rely on WASM and native binaries.</abstract><doi>10.48550/arxiv.2411.02278</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2411.02278
ispartof
issn
language eng
recordid cdi_arxiv_primary_2411_02278
source arXiv.org
subjects Computer Science - Software Engineering
title Is This the Same Code? A Comprehensive Study of Decompilation Techniques for WebAssembly Binaries
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T10%3A45%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Is%20This%20the%20Same%20Code?%20A%20Comprehensive%20Study%20of%20Decompilation%20Techniques%20for%20WebAssembly%20Binaries&rft.au=Wu,%20Wei-Cheng&rft.date=2024-11-04&rft_id=info:doi/10.48550/arxiv.2411.02278&rft_dat=%3Carxiv_GOX%3E2411_02278%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true