Complex log file synthesis for rapid sandbox-benchmarking of security- and computer network analysis tools

Today Information and Communications Technology (ICT) networks are a dominating component of our daily life. Centralized logging allows keeping track of events occurring in ICT networks. Therefore a central log store is essential for timely detection of problems such as service quality degradations,...

Detailed description

Bibliographic details
Published in: Information systems (Oxford) 2016-08, Vol.60, p.13-33
Main authors: Wurzenberger, Markus; Skopik, Florian; Settanni, Giuseppe; Scherrer, Wolfgang
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 33
container_issue
container_start_page 13
container_title Information systems (Oxford)
container_volume 60
creator Wurzenberger, Markus
Skopik, Florian
Settanni, Giuseppe
Scherrer, Wolfgang
description Today Information and Communications Technology (ICT) networks are a dominating component of our daily life. Centralized logging allows keeping track of events occurring in ICT networks. Therefore a central log store is essential for timely detection of problems such as service quality degradations, performance issues or especially security-relevant cyber attacks. There exist various software tools such as security information and event management (SIEM) systems, log analysis tools and anomaly detection systems, which exploit log data to achieve this. While there are many products on the market, based on different approaches, the identification of the most efficient solution for a specific infrastructure, and the optimal configuration is still an unsolved problem. Today's general test environments do not sufficiently account for the specific properties of individual infrastructure setups. Thus, tests in these environments are usually not representative. However, testing on the real running productive systems exposes the network infrastructure to dangerous or unstable situations. The solution to this dilemma is the design and implementation of a highly realistic test environment, i.e. sandbox solution, that follows a different – novel – approach. The idea is to generate realistic network event sequence (NES) data that reflects the actual system behavior and which is then used to challenge network analysis software tools with varying configurations safely and realistically offline. In this paper we define a model, based on log line clustering and Markov chain simulation to create this synthetic log data. The presented model requires only a small set of real network data as an input to understand the complex real system behavior. Based on the input's characteristics highly realistic customer specified NES data is generated. To prove the applicability of the concept developed in this work, we conclude the paper with an illustrative example of evaluation and test of an existing anomaly detection system by using generated NES data.
• Generating log data that reflects realistic network behavior.
• Log data modeling, based on log line clustering and Markov chain simulation.
• Rate, analyze and improve software tools, which exploit log data.
• Detailed evaluation of the model and presentation of an illustrative application.
• Cornerstones to improve the selection, deployment and operation of IDSs.
doi_str_mv 10.1016/j.is.2016.02.006
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1825461054</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S030643791530212X</els_id><sourcerecordid>1825461054</sourcerecordid><originalsourceid>FETCH-LOGICAL-c327t-38013d3dfeae504f8c2449ac2911b3c5d250fb7375717673d2b135617adc67c93</originalsourceid><addsrcrecordid>eNp1kM1P4zAQxa0VK22Bve_RRy7Jju3Ebrihii8JiQucLceegEsaF08K9L9fV90rp3kavfek92Psj4BagNB_13WkWhZVg6wB9A-2EEujKg1Gn7AFKNBVo0z3i50SrQFAtl23YOtV2mxH_OJjeuFDHJHTfppfkSLxIWWe3TYGTm4Kffqqepz868bltzi98DRwQr_Lcd5XvBi4L1W7GTOfcP5M-a083bg_NM0pjXTOfg5uJPz9_56x55vrp9Vd9fB4e7-6eqi8kmau1BKECioM6LCFZlh62TSd87ITole-DbKFoTfKtEYYbVSQvVCtFsYFr43v1Bm7OPZuc3rfIc12E8njOLoJ046sWMq20QLapljhaPU5EWUc7DbHsm9vBdgDVru2kewBqwVpC9YSuTxGsEz4iJgt-ViwYIgZ_WxDit-H_wFLC4Cc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1825461054</pqid></control><display><type>article</type><title>Complex log file synthesis for rapid sandbox-benchmarking of security- and computer network analysis tools</title><source>Elsevier ScienceDirect Journals</source><creator>Wurzenberger, Markus ; Skopik, Florian ; Settanni, Giuseppe ; Scherrer, Wolfgang</creator><creatorcontrib>Wurzenberger, Markus ; Skopik, Florian ; Settanni, Giuseppe ; Scherrer, Wolfgang</creatorcontrib><description>Today Information and Communications Technology (ICT) networks are a dominating component of our daily life. Centralized logging allows keeping track of events occurring in ICT networks. Therefore a central log store is essential for timely detection of problems such as service quality degradations, performance issues or especially security-relevant cyber attacks. There exist various software tools such as security information and event management (SIEM) systems, log analysis tools and anomaly detection systems, which exploit log data to achieve this. 
While there are many products on the market, based on different approaches, the identification of the most efficient solution for a specific infrastructure, and the optimal configuration is still an unsolved problem. Today's general test environments do not sufficiently account for the specific properties of individual infrastructure setups. Thus, tests in these environments are usually not representative. However, testing on the real running productive systems exposes the network infrastructure to dangerous or unstable situations. The solution to this dilemma is the design and implementation of a highly realistic test environment, i.e. sandbox solution, that follows a different – novel – approach. The idea is to generate realistic network event sequence (NES) data that reflects the actual system behavior and which is then used to challenge network analysis software tools with varying configurations safely and realistically offline. In this paper we define a model, based on log line clustering and Markov chain simulation to create this synthetic log data. The presented model requires only a small set of real network data as an input to understand the complex real system behavior. Based on the input's characteristics highly realistic customer specified NES data is generated. To prove the applicability of the concept developed in this work, we conclude the paper with an illustrative example of evaluation and test of an existing anomaly detection system by using generated NES data. 
•Generating log data that reflects realistic network behavior.•Log data modeling, based on log line clustering and Markov chain simulation.•Rate, analyze and improve software tools, which exploit log data.•Detailed evaluation of the model and presentation of an illustrative application.•Cornerstones to improve the selection, deployment and operation of IDSs.</description><identifier>ISSN: 0306-4379</identifier><identifier>EISSN: 1873-6076</identifier><identifier>DOI: 10.1016/j.is.2016.02.006</identifier><language>eng</language><publisher>Elsevier Ltd</publisher><subject>Anomalies ; Computer information security ; Computer programs ; IDS deployment optimization ; Infrastructure ; Log data modeling ; Log file analysis ; Log line clustering ; Markets ; Markov chains ; Mathematical models ; Networks ; Software</subject><ispartof>Information systems (Oxford), 2016-08, Vol.60, p.13-33</ispartof><rights>2016 Elsevier Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c327t-38013d3dfeae504f8c2449ac2911b3c5d250fb7375717673d2b135617adc67c93</citedby><cites>FETCH-LOGICAL-c327t-38013d3dfeae504f8c2449ac2911b3c5d250fb7375717673d2b135617adc67c93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S030643791530212X$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3536,27903,27904,65309</link.rule.ids></links><search><creatorcontrib>Wurzenberger, Markus</creatorcontrib><creatorcontrib>Skopik, Florian</creatorcontrib><creatorcontrib>Settanni, Giuseppe</creatorcontrib><creatorcontrib>Scherrer, Wolfgang</creatorcontrib><title>Complex log file synthesis for rapid sandbox-benchmarking of security- and computer network analysis tools</title><title>Information systems (Oxford)</title><description>Today Information and 
Communications Technology (ICT) networks are a dominating component of our daily life. Centralized logging allows keeping track of events occurring in ICT networks. Therefore a central log store is essential for timely detection of problems such as service quality degradations, performance issues or especially security-relevant cyber attacks. There exist various software tools such as security information and event management (SIEM) systems, log analysis tools and anomaly detection systems, which exploit log data to achieve this. While there are many products on the market, based on different approaches, the identification of the most efficient solution for a specific infrastructure, and the optimal configuration is still an unsolved problem. Today's general test environments do not sufficiently account for the specific properties of individual infrastructure setups. Thus, tests in these environments are usually not representative. However, testing on the real running productive systems exposes the network infrastructure to dangerous or unstable situations. The solution to this dilemma is the design and implementation of a highly realistic test environment, i.e. sandbox solution, that follows a different – novel – approach. The idea is to generate realistic network event sequence (NES) data that reflects the actual system behavior and which is then used to challenge network analysis software tools with varying configurations safely and realistically offline. In this paper we define a model, based on log line clustering and Markov chain simulation to create this synthetic log data. The presented model requires only a small set of real network data as an input to understand the complex real system behavior. Based on the input's characteristics highly realistic customer specified NES data is generated. 
To prove the applicability of the concept developed in this work, we conclude the paper with an illustrative example of evaluation and test of an existing anomaly detection system by using generated NES data. •Generating log data that reflects realistic network behavior.•Log data modeling, based on log line clustering and Markov chain simulation.•Rate, analyze and improve software tools, which exploit log data.•Detailed evaluation of the model and presentation of an illustrative application.•Cornerstones to improve the selection, deployment and operation of IDSs.</description><subject>Anomalies</subject><subject>Computer information security</subject><subject>Computer programs</subject><subject>IDS deployment optimization</subject><subject>Infrastructure</subject><subject>Log data modeling</subject><subject>Log file analysis</subject><subject>Log line clustering</subject><subject>Markets</subject><subject>Markov chains</subject><subject>Mathematical models</subject><subject>Networks</subject><subject>Software</subject><issn>0306-4379</issn><issn>1873-6076</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNp1kM1P4zAQxa0VK22Bve_RRy7Jju3Ebrihii8JiQucLceegEsaF08K9L9fV90rp3kavfek92Psj4BagNB_13WkWhZVg6wB9A-2EEujKg1Gn7AFKNBVo0z3i50SrQFAtl23YOtV2mxH_OJjeuFDHJHTfppfkSLxIWWe3TYGTm4Kffqqepz868bltzi98DRwQr_Lcd5XvBi4L1W7GTOfcP5M-a083bg_NM0pjXTOfg5uJPz9_56x55vrp9Vd9fB4e7-6eqi8kmau1BKECioM6LCFZlh62TSd87ITole-DbKFoTfKtEYYbVSQvVCtFsYFr43v1Bm7OPZuc3rfIc12E8njOLoJ046sWMq20QLapljhaPU5EWUc7DbHsm9vBdgDVru2kewBqwVpC9YSuTxGsEz4iJgt-ViwYIgZ_WxDit-H_wFLC4Cc</recordid><startdate>201608</startdate><enddate>201608</enddate><creator>Wurzenberger, Markus</creator><creator>Skopik, Florian</creator><creator>Settanni, Giuseppe</creator><creator>Scherrer, Wolfgang</creator><general>Elsevier 
Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>201608</creationdate><title>Complex log file synthesis for rapid sandbox-benchmarking of security- and computer network analysis tools</title><author>Wurzenberger, Markus ; Skopik, Florian ; Settanni, Giuseppe ; Scherrer, Wolfgang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c327t-38013d3dfeae504f8c2449ac2911b3c5d250fb7375717673d2b135617adc67c93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Anomalies</topic><topic>Computer information security</topic><topic>Computer programs</topic><topic>IDS deployment optimization</topic><topic>Infrastructure</topic><topic>Log data modeling</topic><topic>Log file analysis</topic><topic>Log line clustering</topic><topic>Markets</topic><topic>Markov chains</topic><topic>Mathematical models</topic><topic>Networks</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wurzenberger, Markus</creatorcontrib><creatorcontrib>Skopik, Florian</creatorcontrib><creatorcontrib>Settanni, Giuseppe</creatorcontrib><creatorcontrib>Scherrer, Wolfgang</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Information systems (Oxford)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wurzenberger, 
Markus</au><au>Skopik, Florian</au><au>Settanni, Giuseppe</au><au>Scherrer, Wolfgang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Complex log file synthesis for rapid sandbox-benchmarking of security- and computer network analysis tools</atitle><jtitle>Information systems (Oxford)</jtitle><date>2016-08</date><risdate>2016</risdate><volume>60</volume><spage>13</spage><epage>33</epage><pages>13-33</pages><issn>0306-4379</issn><eissn>1873-6076</eissn><abstract>Today Information and Communications Technology (ICT) networks are a dominating component of our daily life. Centralized logging allows keeping track of events occurring in ICT networks. Therefore a central log store is essential for timely detection of problems such as service quality degradations, performance issues or especially security-relevant cyber attacks. There exist various software tools such as security information and event management (SIEM) systems, log analysis tools and anomaly detection systems, which exploit log data to achieve this. While there are many products on the market, based on different approaches, the identification of the most efficient solution for a specific infrastructure, and the optimal configuration is still an unsolved problem. Today's general test environments do not sufficiently account for the specific properties of individual infrastructure setups. Thus, tests in these environments are usually not representative. However, testing on the real running productive systems exposes the network infrastructure to dangerous or unstable situations. The solution to this dilemma is the design and implementation of a highly realistic test environment, i.e. sandbox solution, that follows a different – novel – approach. 
The idea is to generate realistic network event sequence (NES) data that reflects the actual system behavior and which is then used to challenge network analysis software tools with varying configurations safely and realistically offline. In this paper we define a model, based on log line clustering and Markov chain simulation to create this synthetic log data. The presented model requires only a small set of real network data as an input to understand the complex real system behavior. Based on the input's characteristics highly realistic customer specified NES data is generated. To prove the applicability of the concept developed in this work, we conclude the paper with an illustrative example of evaluation and test of an existing anomaly detection system by using generated NES data. •Generating log data that reflects realistic network behavior.•Log data modeling, based on log line clustering and Markov chain simulation.•Rate, analyze and improve software tools, which exploit log data.•Detailed evaluation of the model and presentation of an illustrative application.•Cornerstones to improve the selection, deployment and operation of IDSs.</abstract><pub>Elsevier Ltd</pub><doi>10.1016/j.is.2016.02.006</doi><tpages>21</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0306-4379
ispartof Information systems (Oxford), 2016-08, Vol.60, p.13-33
issn 0306-4379
1873-6076
language eng
recordid cdi_proquest_miscellaneous_1825461054
source Elsevier ScienceDirect Journals
subjects Anomalies
Computer information security
Computer programs
IDS deployment optimization
Infrastructure
Log data modeling
Log file analysis
Log line clustering
Markets
Markov chains
Mathematical models
Networks
Software
title Complex log file synthesis for rapid sandbox-benchmarking of security- and computer network analysis tools
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T01%3A19%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Complex%20log%20file%20synthesis%20for%20rapid%20sandbox-benchmarking%20of%20security-%20and%20computer%20network%20analysis%20tools&rft.jtitle=Information%20systems%20(Oxford)&rft.au=Wurzenberger,%20Markus&rft.date=2016-08&rft.volume=60&rft.spage=13&rft.epage=33&rft.pages=13-33&rft.issn=0306-4379&rft.eissn=1873-6076&rft_id=info:doi/10.1016/j.is.2016.02.006&rft_dat=%3Cproquest_cross%3E1825461054%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1825461054&rft_id=info:pmid/&rft_els_id=S030643791530212X&rfr_iscdi=true