Mapping of Internet "Coastlines" via Large Scale Anonymized Network Source Correlations
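Main authors: | Jananthan, Hayden; Kepner, Jeremy; Jones, Michael; Arcand, William; Bestor, David; Bergeron, William; Byun, Chansup; Davis, Timothy; Gadepally, Vijay; Grant, Daniel; Houle, Michael; Hubbell, Matthew; Klein, Anna; Milechin, Lauren; Morales, Guillermo; Morris, Andrew; Mullen, Julie; Patel, Ritesh; Pentland, Alex; Pisharody, Sandeep; Prout, Andrew; Reuther, Albert; Rosa, Antonio; Samsi, Siddharth; Trigg, Tyler; Wachman, Gabriel; Yee, Charles; Michaleas, Peter |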
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Jananthan, Hayden; Kepner, Jeremy; Jones, Michael; Arcand, William; Bestor, David; Bergeron, William; Byun, Chansup; Davis, Timothy; Gadepally, Vijay; Grant, Daniel; Houle, Michael; Hubbell, Matthew; Klein, Anna; Milechin, Lauren; Morales, Guillermo; Morris, Andrew; Mullen, Julie; Patel, Ritesh; Pentland, Alex; Pisharody, Sandeep; Prout, Andrew; Reuther, Albert; Rosa, Antonio; Samsi, Siddharth; Trigg, Tyler; Wachman, Gabriel; Yee, Charles; Michaleas, Peter |
description | Expanding the scientific tools available to protect computer networks can be
aided by a deeper understanding of the underlying statistical distributions of
network traffic and their potential geometric interpretations. Analyses of
large scale network observations provide a unique window into studying those
underlying statistics. Newly developed GraphBLAS hypersparse matrices and D4M
associative array technologies enable the efficient anonymized analysis of
network traffic on the scale of trillions of events. This work analyzes over
100,000,000,000 anonymized packets from the largest Internet telescope (CAIDA)
and over 10,000,000 anonymized sources from the largest commercial honeyfarm
(GreyNoise). Neither CAIDA nor GreyNoise actively emits Internet traffic; each
provides a distinct set of observations of unsolicited Internet traffic (primarily
botnets and scanners). Analysis of these observations confirms the previously
observed Cauchy-like distributions describing temporal correlations between
Internet sources. The Gull lighthouse problem is a well-known geometric
characterization of the standard Cauchy distribution and motivates a potential
geometric interpretation for Internet observations. This work generalizes the
Gull lighthouse problem to accommodate larger classes of coastlines, deriving a
closed-form solution for the resulting probability distributions, stating and
examining the inverse problem of identifying an appropriate coastline given a
continuous probability distribution, identifying a geometric heuristic for
solving this problem computationally, and applying that heuristic to examine
the temporal geometry of different subsets of network observations. Applying
this method to the CAIDA and GreyNoise data reveals a difference of several
orders of magnitude between known benign traffic and other traffic, which could
lead to novel ways to protect networks. |
doi_str_mv | 10.48550/arxiv.2310.00522 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2310.00522 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2310_00522 |
source | arXiv.org |
subjects | Computer Science - Social and Information Networks |
title | Mapping of Internet "Coastlines" via Large Scale Anonymized Network Source Correlations |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T21%3A35%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mapping%20of%20Internet%20%22Coastlines%22%20via%20Large%20Scale%20Anonymized%20Network%20Source%20Correlations&rft.au=Jananthan,%20Hayden&rft.date=2023-09-30&rft_id=info:doi/10.48550/arxiv.2310.00522&rft_dat=%3Carxiv_GOX%3E2310_00522%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
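
The description above cites the Gull lighthouse problem as a geometric characterization of the Cauchy distribution. As a rough illustration of the classical straight-coastline case only (not the generalized coastlines or the heuristic derived in the paper), the sketch below assumes a lighthouse at horizontal position `x0`, a distance `h` from a straight shore, flashing at angles drawn uniformly from (-π/2, π/2). Each flash then lands at `x0 + h * tan(theta)`, which follows a Cauchy distribution with location `x0` and scale `h`. All function names and parameters here are hypothetical and chosen for illustration.

```python
import math
import random


def lighthouse_hits(n, x0=0.0, h=1.0, seed=0):
    """Simulate n flashes from a lighthouse at (x0, h) landing on the coast y = 0.

    Illustrative assumption: the classical straight-coastline setup.
    """
    rng = random.Random(seed)
    hits = []
    for _ in range(n):
        # Beam angle measured from the perpendicular to the coast,
        # uniform on (-pi/2, pi/2) so every beam reaches the shore.
        theta = rng.uniform(-math.pi / 2, math.pi / 2)
        hits.append(x0 + h * math.tan(theta))
    return hits


def cauchy_cdf(x, x0=0.0, h=1.0):
    """CDF of the Cauchy distribution with location x0 and scale h."""
    return 0.5 + math.atan((x - x0) / h) / math.pi


def empirical_cdf(samples, x):
    """Fraction of samples that are less than or equal to x."""
    return sum(s <= x for s in samples) / len(samples)


if __name__ == "__main__":
    samples = lighthouse_hits(200_000)
    # Compare the simulated landing positions with the analytic Cauchy CDF.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"x={x:+.1f}  empirical={empirical_cdf(samples, x):.4f}  "
              f"cauchy={cauchy_cdf(x):.4f}")
```

Comparing cumulative distributions rather than sample means is deliberate: the Cauchy distribution has no finite mean, so averages of the simulated landing positions do not settle down as the sample grows.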