Service Identification of TLS Flows Based on Handshake Analysis

Identifying the services that make up the traffic in given IP network flows is important for many purposes, such as managing quality of service, preventing security problems, and offering discounts to customers only for access to specified services, as in zero-rating. The s...

Bibliographic details
Published in: Journal of Information Processing, 2023, Vol. 31, pp. 131-142
Main authors: Asaoka, Ryo; Soma, Yuto; Yamauchi, Hiroaki; Nakao, Akihiro; Oguchi, Masato; Yamaguchi, Saneyasu; Kobayashi, Aki
Format: Article
Language: English
Subjects: HTTPS; service identification; SNI; TLS
Online access: Full text
container_end_page 142
container_issue
container_start_page 131
container_title Journal of Information Processing
container_volume 31
creator Asaoka, Ryo
Soma, Yuto
Yamauchi, Hiroaki
Nakao, Akihiro
Oguchi, Masato
Yamaguchi, Saneyasu
Kobayashi, Aki
description Identifying the services that make up the traffic in given IP network flows is important for many purposes, such as managing quality of service, preventing security problems, and offering discounts to customers only for access to specified services, as in zero-rating. The simplest identification methods rely on IP addresses and port numbers. However, such methods are not sufficiently accurate and require improvement. Deep packet inspection (DPI) is a more advanced method that improves identification accuracy. Many current IP flows are encrypted with the Transport Layer Security (TLS) protocol, so an identification method cannot analyze most of the data, which TLS encrypts. In TLS 1.2 and earlier, some fields in the protocol header used for TLS session establishment, e.g., server name indication (SNI), are not encrypted and can therefore be analyzed. Thus, we can expect that the service can be identified from IP flows composed of TLS sessions by analyzing these fields. Achieving this involves two main challenges. One is grouping, by access, the many TLS sessions that pass through a network element. The other is identifying the service from the TLS sessions grouped in the first challenge. In our work, we mainly focus on the second, i.e., service identification from given TLS sessions. In our previous work, we proposed an identification method that analyzes these non-encrypted data based on DPI and n-grams. However, there is room for improvement in its identification accuracy because it analyzed all the non-encrypted data, including random values, without protocol analysis. In this paper, we propose a new method for identifying the service from given TLS sessions based on SNI with protocol data unit (PDU) analysis. The proposed method clusters TLS sessions according to their SNI values and identifies services from the occurrences of all groups. We evaluated the proposed method by identifying services on Google, Yahoo, and MSN sites, and the results showed that it identified services more accurately than the existing method. The average ratios of inaccurate identifications decreased by 65%, 72%, and 41% in our experiments with Google, Yahoo, and MSN services, respectively.
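The clustering step described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the abstract does not say how group occurrences are scored, so the Jaccard similarity used here is an assumption, and the `SERVICE_PROFILES` table, the host names, and the session dictionaries are all hypothetical. The sketch also assumes the SNI of each session has already been extracted from the unencrypted handshake.

```python
from collections import Counter

# Hypothetical profiles: the SNI host names expected when a service is accessed.
# A real system would have to learn these from labeled traffic.
SERVICE_PROFILES = {
    "maps": {"maps.example.com", "tiles.example.com", "static.example.com"},
    "mail": {"mail.example.com", "static.example.com"},
}


def cluster_by_sni(sessions):
    """Group TLS sessions by SNI value and count the sessions in each group.

    Sessions without an SNI value are skipped.
    """
    return Counter(s["sni"] for s in sessions if s.get("sni"))


def identify_service(sessions, profiles=SERVICE_PROFILES):
    """Return (best_service, scores) for one group of TLS sessions.

    Scoring is an assumption of this sketch: the Jaccard similarity between
    the set of observed SNI groups and each service profile.
    """
    observed = set(cluster_by_sni(sessions))

    def jaccard(profile):
        union = observed | profile
        return len(observed & profile) / len(union) if union else 0.0

    scores = {name: jaccard(profile) for name, profile in profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    sessions = [
        {"sni": "maps.example.com"},
        {"sni": "tiles.example.com"},
        {"sni": "tiles.example.com"},
        {"sni": "static.example.com"},
        {"sni": None},  # e.g. a session whose ClientHello carried no SNI
    ]
    print(cluster_by_sni(sessions))
    print(identify_service(sessions))
```

Keeping the per-group counts in a `Counter` rather than a plain set leaves room for a scoring rule that weights how often each SNI group occurs, which may be closer to what "occurrences of all groups" means in the abstract.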
doi_str_mv 10.2197/ipsjjip.31.131
format Article
fullrecord <record><control><sourceid>jstage_cross</sourceid><recordid>TN_cdi_crossref_primary_10_2197_ipsjjip_31_131</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>article_ipsjjip_31_0_31_131_article_char_en</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3301-9523cd8361a17ede7bbf8dd261d54d1e2988999a28b30342cc81f1445f6df5c83</originalsourceid><addsrcrecordid>eNpNj8FKAzEQhoMoWKtXz3mBXTPJ7jY5SS3WFhY8tJ5Dmkxs1nW3JIvSt7fSUnqaYf7_G_gIeQSWc1CTp7BLTRN2uYAcBFyREUjJs6oq-fXFfkvuUmoYqxQr2Yg8rzD-BIt06bAbgg_WDKHvaO_pul7Redv_JvpiEjp6uC5M59LWfCGddqbdp5DuyY03bcKH0xyTj_nrerbI6ve35WxaZ1YIBpkqubBOigoMTNDhZLPx0jlegSsLB8iVlEopw-VGMFFwayV4KIrSV86XVooxyY9_bexTiuj1LoZvE_camP7X1yd9LUAf9A_A9Ag0aTCfeK6bOATb4mWdnZhzZrcmauzEH0uQZsA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Service Identification of TLS Flows Based on Handshake Analysis</title><source>J-STAGE Free</source><creator>Asaoka, Ryo ; Soma, Yuto ; Yamauchi, Hiroaki ; Nakao, Akihiro ; Oguchi, Masato ; Yamaguchi, Saneyasu ; Kobayashi, Aki</creator><creatorcontrib>Asaoka, Ryo ; Soma, Yuto ; Yamauchi, Hiroaki ; Nakao, Akihiro ; Oguchi, Masato ; Yamaguchi, Saneyasu ; Kobayashi, Aki</creatorcontrib><description>Identification of services constituting traffic from given IP network flows is important for many purposes such as management of quality of service, prevention of security problems, and providing a discounting service for customers only in accessing specified services like zero-rating service. The simplest methods for identifying these services are identifications based on IP addresses and port numbers. However, such methods are not sufficiently accurate and thus require improvement. Deep packet inspection (DPI) is an advanced method for improving the accuracy of identification. Many current IP flows are encrypted with the transport layer security (TLS) protocol. 
Therefore, an identification method cannot analyze almost all the data encrypted by TLS. In the cases of TLS 1.2 or less, some fields, e.g. server name indication (SNI), in the protocol header for the TLS session establishment are not encrypted and then can be analyzed. Thus, we can expect that the service can be identified from IP flows, which are composed of TLS sessions, by analyzing these fields. For achieving this, two challenges are mainly required. One is grouping TLS sessions by accesses from many TLS sessions that pass through a network element. The other is the identification of service from TLS sessions grouped in the first challenge. In our work, we mainly focus on the second theme, i.e., service identification from given TLS sessions. In our previous work, we proposed a method for identification by analyzing these non-encrypted data based on DPI and n-gram. However, there is room for improvement in identification accuracy because this method analyzed all the non-encrypted data including random values without protocol analysis. In this paper, we propose a new method for identifying the service from given TLS sessions based on SNI with protocol data unit (PDU) analysis. The proposed method clusters TLS sessions according to the value of SNI and identifies services from the occurrences of all groups. We evaluated the proposed method by identifying services on Google, Yahoo, and MSN sites, and the results showed that the proposed method could identify services more accurately than the existing method. 
The average ratios of inaccurate identifications were decreased by 65%, 72%, and 41% in our experiments of Google, Yahoo, and MSN services, respectively.</description><identifier>ISSN: 1882-6652</identifier><identifier>EISSN: 1882-6652</identifier><identifier>DOI: 10.2197/ipsjjip.31.131</identifier><language>eng</language><publisher>Information Processing Society of Japan</publisher><subject>HTTPS ; service identification ; SNI ; TLS</subject><ispartof>Journal of Information Processing, 2023, Vol.31, pp.131-142</ispartof><rights>2023 by the Information Processing Society of Japan</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c3301-9523cd8361a17ede7bbf8dd261d54d1e2988999a28b30342cc81f1445f6df5c83</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1883,4024,27923,27924,27925</link.rule.ids></links><search><creatorcontrib>Asaoka, Ryo</creatorcontrib><creatorcontrib>Soma, Yuto</creatorcontrib><creatorcontrib>Yamauchi, Hiroaki</creatorcontrib><creatorcontrib>Nakao, Akihiro</creatorcontrib><creatorcontrib>Oguchi, Masato</creatorcontrib><creatorcontrib>Yamaguchi, Saneyasu</creatorcontrib><creatorcontrib>Kobayashi, Aki</creatorcontrib><title>Service Identification of TLS Flows Based on Handshake Analysis</title><title>Journal of Information Processing</title><addtitle>Journal of Information Processing</addtitle><description>Identification of services constituting traffic from given IP network flows is important for many purposes such as management of quality of service, prevention of security problems, and providing a discounting service for customers only in accessing specified services like zero-rating service. The simplest methods for identifying these services are identifications based on IP addresses and port numbers. 
However, such methods are not sufficiently accurate and thus require improvement. Deep packet inspection (DPI) is an advanced method for improving the accuracy of identification. Many current IP flows are encrypted with the transport layer security (TLS) protocol. Therefore, an identification method cannot analyze almost all the data encrypted by TLS. In the cases of TLS 1.2 or less, some fields, e.g. server name indication (SNI), in the protocol header for the TLS session establishment are not encrypted and then can be analyzed. Thus, we can expect that the service can be identified from IP flows, which are composed of TLS sessions, by analyzing these fields. For achieving this, two challenges are mainly required. One is grouping TLS sessions by accesses from many TLS sessions that pass through a network element. The other is the identification of service from TLS sessions grouped in the first challenge. In our work, we mainly focus on the second theme, i.e., service identification from given TLS sessions. In our previous work, we proposed a method for identification by analyzing these non-encrypted data based on DPI and n-gram. However, there is room for improvement in identification accuracy because this method analyzed all the non-encrypted data including random values without protocol analysis. In this paper, we propose a new method for identifying the service from given TLS sessions based on SNI with protocol data unit (PDU) analysis. The proposed method clusters TLS sessions according to the value of SNI and identifies services from the occurrences of all groups. We evaluated the proposed method by identifying services on Google, Yahoo, and MSN sites, and the results showed that the proposed method could identify services more accurately than the existing method. 
The average ratios of inaccurate identifications were decreased by 65%, 72%, and 41% in our experiments of Google, Yahoo, and MSN services, respectively.</description><subject>HTTPS</subject><subject>service identification</subject><subject>SNI</subject><subject>TLS</subject><issn>1882-6652</issn><issn>1882-6652</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNpNj8FKAzEQhoMoWKtXz3mBXTPJ7jY5SS3WFhY8tJ5Dmkxs1nW3JIvSt7fSUnqaYf7_G_gIeQSWc1CTp7BLTRN2uYAcBFyREUjJs6oq-fXFfkvuUmoYqxQr2Yg8rzD-BIt06bAbgg_WDKHvaO_pul7Redv_JvpiEjp6uC5M59LWfCGddqbdp5DuyY03bcKH0xyTj_nrerbI6ve35WxaZ1YIBpkqubBOigoMTNDhZLPx0jlegSsLB8iVlEopw-VGMFFwayV4KIrSV86XVooxyY9_bexTiuj1LoZvE_camP7X1yd9LUAf9A_A9Ag0aTCfeK6bOATb4mWdnZhzZrcmauzEH0uQZsA</recordid><startdate>2023</startdate><enddate>2023</enddate><creator>Asaoka, Ryo</creator><creator>Soma, Yuto</creator><creator>Yamauchi, Hiroaki</creator><creator>Nakao, Akihiro</creator><creator>Oguchi, Masato</creator><creator>Yamaguchi, Saneyasu</creator><creator>Kobayashi, Aki</creator><general>Information Processing Society of Japan</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>2023</creationdate><title>Service Identification of TLS Flows Based on Handshake Analysis</title><author>Asaoka, Ryo ; Soma, Yuto ; Yamauchi, Hiroaki ; Nakao, Akihiro ; Oguchi, Masato ; Yamaguchi, Saneyasu ; Kobayashi, Aki</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3301-9523cd8361a17ede7bbf8dd261d54d1e2988999a28b30342cc81f1445f6df5c83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>HTTPS</topic><topic>service identification</topic><topic>SNI</topic><topic>TLS</topic><toplevel>online_resources</toplevel><creatorcontrib>Asaoka, Ryo</creatorcontrib><creatorcontrib>Soma, Yuto</creatorcontrib><creatorcontrib>Yamauchi, 
Hiroaki</creatorcontrib><creatorcontrib>Nakao, Akihiro</creatorcontrib><creatorcontrib>Oguchi, Masato</creatorcontrib><creatorcontrib>Yamaguchi, Saneyasu</creatorcontrib><creatorcontrib>Kobayashi, Aki</creatorcontrib><collection>CrossRef</collection><jtitle>Journal of Information Processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Asaoka, Ryo</au><au>Soma, Yuto</au><au>Yamauchi, Hiroaki</au><au>Nakao, Akihiro</au><au>Oguchi, Masato</au><au>Yamaguchi, Saneyasu</au><au>Kobayashi, Aki</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Service Identification of TLS Flows Based on Handshake Analysis</atitle><jtitle>Journal of Information Processing</jtitle><addtitle>Journal of Information Processing</addtitle><date>2023</date><risdate>2023</risdate><volume>31</volume><spage>131</spage><epage>142</epage><pages>131-142</pages><issn>1882-6652</issn><eissn>1882-6652</eissn><abstract>Identification of services constituting traffic from given IP network flows is important for many purposes such as management of quality of service, prevention of security problems, and providing a discounting service for customers only in accessing specified services like zero-rating service. The simplest methods for identifying these services are identifications based on IP addresses and port numbers. However, such methods are not sufficiently accurate and thus require improvement. Deep packet inspection (DPI) is an advanced method for improving the accuracy of identification. Many current IP flows are encrypted with the transport layer security (TLS) protocol. Therefore, an identification method cannot analyze almost all the data encrypted by TLS. In the cases of TLS 1.2 or less, some fields, e.g. server name indication (SNI), in the protocol header for the TLS session establishment are not encrypted and then can be analyzed. 
Thus, we can expect that the service can be identified from IP flows, which are composed of TLS sessions, by analyzing these fields. For achieving this, two challenges are mainly required. One is grouping TLS sessions by accesses from many TLS sessions that pass through a network element. The other is the identification of service from TLS sessions grouped in the first challenge. In our work, we mainly focus on the second theme, i.e., service identification from given TLS sessions. In our previous work, we proposed a method for identification by analyzing these non-encrypted data based on DPI and n-gram. However, there is room for improvement in identification accuracy because this method analyzed all the non-encrypted data including random values without protocol analysis. In this paper, we propose a new method for identifying the service from given TLS sessions based on SNI with protocol data unit (PDU) analysis. The proposed method clusters TLS sessions according to the value of SNI and identifies services from the occurrences of all groups. We evaluated the proposed method by identifying services on Google, Yahoo, and MSN sites, and the results showed that the proposed method could identify services more accurately than the existing method. The average ratios of inaccurate identifications were decreased by 65%, 72%, and 41% in our experiments of Google, Yahoo, and MSN services, respectively.</abstract><pub>Information Processing Society of Japan</pub><doi>10.2197/ipsjjip.31.131</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1882-6652
ispartof Journal of Information Processing, 2023, Vol.31, pp.131-142
issn 1882-6652
1882-6652
language eng
recordid cdi_crossref_primary_10_2197_ipsjjip_31_131
source J-STAGE Free
subjects HTTPS
service identification
SNI
TLS
title Service Identification of TLS Flows Based on Handshake Analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T07%3A17%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstage_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Service%20Identification%20of%20TLS%20Flows%20Based%20on%20Handshake%20Analysis&rft.jtitle=Journal%20of%20Information%20Processing&rft.au=Asaoka,%20Ryo&rft.date=2023&rft.volume=31&rft.spage=131&rft.epage=142&rft.pages=131-142&rft.issn=1882-6652&rft.eissn=1882-6652&rft_id=info:doi/10.2197/ipsjjip.31.131&rft_dat=%3Cjstage_cross%3Earticle_ipsjjip_31_0_31_131_article_char_en%3C/jstage_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true