A semantics aware approach to automated reverse engineering unknown protocols
Extracting the protocol message format specifications of unknown applications from network traces is important for a variety of applications such as application protocol parsing, vulnerability discovery, and system integration. In this paper, we propose ProDecoder, a network trace based protocol mes...
Saved in:
Main authors: | Yipeng Wang, Xiaochun Yun, M. Z. Shafiq, Liyan Wang, A. X. Liu, Zhibin Zhang, Danfeng Yao, Yongzheng Zhang, Li Guo |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 10 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | |
container_volume | |
creator | Yipeng Wang ; Xiaochun Yun ; Shafiq, M. Z. ; Liyan Wang ; Liu, A. X. ; Zhibin Zhang ; Danfeng Yao ; Yongzheng Zhang ; Li Guo |
description | Extracting the protocol message format specifications of unknown applications from network traces is important for a variety of applications such as application protocol parsing, vulnerability discovery, and system integration. In this paper, we propose ProDecoder, a network-trace-based protocol message format inference system that exploits the semantics of protocol messages without requiring the executable code of application protocols. ProDecoder is based on the key insight that the n-grams of protocol traces exhibit a highly skewed frequency distribution that can be leveraged for accurate protocol message format inference. In ProDecoder, we discover the latent relationship among n-grams by first grouping protocol messages with the same semantics and then inferring message formats by keyword-based clustering and cluster sequence alignment. We implemented and evaluated ProDecoder to infer message format specifications of SMB (a binary protocol) and SMTP (a textual protocol). Our experimental results show that ProDecoder accurately parses and infers the SMB protocol with 100% precision and recall. For SMTP, ProDecoder achieves approximately 95% precision and recall. |
doi_str_mv | 10.1109/ICNP.2012.6459963 |
format | Conference Proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1092-1648 |
ispartof | 2012 20th IEEE International Conference on Network Protocols (ICNP), 2012, p.1-10 |
issn | 1092-1648 ; 2643-3303 |
language | eng |
recordid | cdi_ieee_primary_6459963 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Electronic mail ; Natural language processing ; Postal services ; Protocols ; Reverse engineering ; Semantics ; Vectors |
title | A semantics aware approach to automated reverse engineering unknown protocols |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T19%3A27%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20semantics%20aware%20approach%20to%20automated%20reverse%20engineering%20unknown%20protocols&rft.btitle=2012%2020th%20IEEE%20International%20Conference%20on%20Network%20Protocols%20(ICNP)&rft.au=Yipeng%20Wang&rft.date=2012-10&rft.spage=1&rft.epage=10&rft.pages=1-10&rft.issn=1092-1648&rft.eissn=2643-3303&rft.isbn=1467324450&rft.isbn_list=9781467324458&rft_id=info:doi/10.1109/ICNP.2012.6459963&rft_dat=%3Cieee_6IE%3E6459963%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1467324477&rft.eisbn_list=9781467324472&rft.eisbn_list=9781467324465&rft.eisbn_list=1467324469&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6459963&rfr_iscdi=true |
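
The abstract in this record rests on the observation that n-grams in protocol traces have a highly skewed frequency distribution. The sketch below illustrates only that observation, by counting byte n-grams over a small trace. It is not the authors' implementation and it does not cover message grouping, keyword-based clustering, or sequence alignment. The function names `byte_ngrams` and `ngram_frequencies`, the choice of n = 4, and the SMTP-like sample messages are all assumptions made for illustration.

```python
from collections import Counter


def byte_ngrams(message: bytes, n: int):
    """Yield every contiguous n-byte window of a message."""
    for i in range(len(message) - n + 1):
        yield message[i:i + n]


def ngram_frequencies(messages, n=4):
    """Count how often each byte n-gram occurs across all messages."""
    counts = Counter()
    for msg in messages:
        counts.update(byte_ngrams(msg, n))
    return counts


# Hypothetical SMTP-like trace, invented for this example (not real capture data).
trace = [
    b"HELO alpha.example\r\n",
    b"MAIL FROM:<a@alpha.example>\r\n",
    b"RCPT TO:<b@beta.example>\r\n",
    b"HELO beta.example\r\n",
    b"MAIL FROM:<c@beta.example>\r\n",
    b"RCPT TO:<d@alpha.example>\r\n",
]

freqs = ngram_frequencies(trace, n=4)

# A few n-grams (command keywords, shared suffixes) recur across messages,
# while most n-grams are rare; print the most frequent ones for inspection.
for gram, count in freqs.most_common(5):
    print(gram, count)
```

On a real trace, the frequent n-grams would be candidates for protocol keywords, which is the kind of signal the abstract says ProDecoder leverages. How the paper turns that signal into message formats is not shown here.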