Mapping of Internet "Coastlines" via Large Scale Anonymized Network Source Correlations

Bibliographic details
Main authors: Jananthan, Hayden; Kepner, Jeremy; Jones, Michael; Arcand, William; Bestor, David; Bergeron, William; Byun, Chansup; Davis, Timothy; Gadepally, Vijay; Grant, Daniel; Houle, Michael; Hubbell, Matthew; Klein, Anna; Milechin, Lauren; Morales, Guillermo; Morris, Andrew; Mullen, Julie; Patel, Ritesh; Pentland, Alex; Pisharody, Sandeep; Prout, Andrew; Reuther, Albert; Rosa, Antonio; Samsi, Siddharth; Trigg, Tyler; Wachman, Gabriel; Yee, Charles; Michaleas, Peter
Format: Article
Language: eng
creator Jananthan, Hayden
Kepner, Jeremy
Jones, Michael
Arcand, William
Bestor, David
Bergeron, William
Byun, Chansup
Davis, Timothy
Gadepally, Vijay
Grant, Daniel
Houle, Michael
Hubbell, Matthew
Klein, Anna
Milechin, Lauren
Morales, Guillermo
Morris, Andrew
Mullen, Julie
Patel, Ritesh
Pentland, Alex
Pisharody, Sandeep
Prout, Andrew
Reuther, Albert
Rosa, Antonio
Samsi, Siddharth
Trigg, Tyler
Wachman, Gabriel
Yee, Charles
Michaleas, Peter
description Expanding the scientific tools available to protect computer networks can be aided by a deeper understanding of the underlying statistical distributions of network traffic and their potential geometric interpretations. Analyses of large scale network observations provide a unique window into studying those underlying statistics. Newly developed GraphBLAS hypersparse matrices and D4M associative array technologies enable the efficient anonymized analysis of network traffic on the scale of trillions of events. This work analyzes over 100,000,000,000 anonymized packets from the largest Internet telescope (CAIDA) and over 10,000,000 anonymized sources from the largest commercial honeyfarm (GreyNoise). Neither CAIDA nor GreyNoise actively emit Internet traffic and provide distinct observations of unsolicited Internet traffic (primarily botnets and scanners). Analysis of these observations confirms the previously observed Cauchy-like distributions describing temporal correlations between Internet sources. The Gull lighthouse problem is a well-known geometric characterization of the standard Cauchy distribution and motivates a potential geometric interpretation for Internet observations. This work generalizes the Gull lighthouse problem to accommodate larger classes of coastlines, deriving a closed-form solution for the resulting probability distributions, stating and examining the inverse problem of identifying an appropriate coastline given a continuous probability distribution, identifying a geometric heuristic for solving this problem computationally, and applying that heuristic to examine the temporal geometry of different subsets of network observations. Application of this method to the CAIDA and GreyNoise data reveals a several orders of magnitude difference between known benign and other traffic which can lead to potentially novel ways to protect networks.
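The description refers to the Gull lighthouse problem as a geometric characterization of the Cauchy distribution. The following is a minimal sketch of the classical, straight-coastline version only, not the generalized coastlines derived in the paper. It assumes a lighthouse at position `x0` along the shore and distance `h` out to sea that flashes at uniformly random angles, so each flash lands at `x0 + h*tan(theta)`. The names `lighthouse_hits`, `cauchy_cdf`, `x0`, and `h` are illustrative and do not come from the record.

```python
import math
import random
import statistics


def lighthouse_hits(x0, h, n, seed=0):
    """Simulate n flashes from a lighthouse at (x0, h) onto the line y = 0.

    Each flash leaves at an angle drawn uniformly from (-pi/2, pi/2),
    measured from the perpendicular to the shore.
    """
    rng = random.Random(seed)
    return [
        x0 + h * math.tan(rng.uniform(-math.pi / 2, math.pi / 2))
        for _ in range(n)
    ]


def cauchy_cdf(x, x0, h):
    """CDF of a Cauchy distribution with location x0 and scale h."""
    return 0.5 + math.atan((x - x0) / h) / math.pi


# Hypothetical parameters chosen only for illustration.
hits = lighthouse_hits(x0=1.0, h=2.0, n=100_000)

# A Cauchy distribution has no finite mean or variance, so the sample is
# summarized with quantiles: the median estimates the location x0 and
# half the interquartile range estimates the scale h.
q1, median, q3 = statistics.quantiles(hits, n=4)
half_iqr = (q3 - q1) / 2

print(f"median ~ {median:.3f}, half IQR ~ {half_iqr:.3f}")
```

Quantiles are used rather than the sample mean because the mean of Cauchy samples does not converge as the sample grows.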
doi_str_mv 10.48550/arxiv.2310.00522
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2310_00522</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2310_00522</sourcerecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><source>arXiv.org</source><identifier>DOI: 10.48550/arxiv.2310.00522</identifier><language>eng</language><subject>Computer Science - Social and Information Networks</subject><creationdate>2023-09</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa></display><links><linktorsrc>$$Uhttps://arxiv.org/abs/2310.00522$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2310.00522$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><startdate>20230930</startdate><enddate>20230930</enddate><scope>AKY</scope><scope>GOX</scope></search><facets><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><format>journal</format><genre>article</genre><ristype>JOUR</ristype><date>2023-09-30</date><risdate>2023</risdate><doi>10.48550/arxiv.2310.00522</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2310.00522
language eng
recordid cdi_arxiv_primary_2310_00522
source arXiv.org
subjects Computer Science - Social and Information Networks
title Mapping of Internet "Coastlines" via Large Scale Anonymized Network Source Correlations
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T21%3A35%3A23IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Mapping%20of%20Internet%20%22Coastlines%22%20via%20Large%20Scale%20Anonymized%20Network%20Source%20Correlations&rft.au=Jananthan,%20Hayden&rft.date=2023-09-30&rft_id=info:doi/10.48550/arxiv.2310.00522&rft_dat=%3Carxiv_GOX%3E2310_00522%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true