Investigating Distribution of Data of HTTP Traffic: An Empirical Study
Internet traffic today is dominated by that of the hypertext transfer protocol (HTTP). Understanding the statistical characteristics of the data transferred via HTTP helps better model traffic patterns. In this work, we conduct an empirical study by employing an experiment that accesses roughly 34,0...
Main authors: | Chehadeh, Y.C. ; Hatahet, A.Z. ; Agamy, A.E. ; Bamakhrama, M.A. ; Banawan, S.A. |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 5 |
---|---|
container_issue | |
container_start_page | 1 |
container_title | |
container_volume | |
creator | Chehadeh, Y.C. ; Hatahet, A.Z. ; Agamy, A.E. ; Bamakhrama, M.A. ; Banawan, S.A. |
description | Internet traffic today is dominated by that of the hypertext transfer protocol (HTTP). Understanding the statistical characteristics of the data transferred via HTTP helps better model traffic patterns. In this work, we conduct an empirical study by employing an experiment that accesses roughly 34,000 of the most popular Web sites on the Internet today and crawls their Web pages. We collect metadata information on the retrieved roughly two million objects. We determine statistics and distributions based on object sizes, occurrence of specific types, and sizes of specific types. The data of the distributions produced can be used as a template model for Web-traffic modeling in future research. We further note an intriguing result that 5.7% of HTTP traffic from Web servers to clients is due to sending spacer objects (image files representing a 1×1 white-space pixel) or to stale links referencing non-existing files. Such squander in bandwidth is not due to overhead and can be minimized by simple additions to the HTML standard and by automating the process of removing stale links. |
doi_str_mv | 10.1109/INNOVATIONS.2006.301928 |
format | Conference Proceeding |
fulltext | fulltext_linktorsrc |
identifier | ISBN: 9781424406739 |
ispartof | 2006 Innovations in Information Technology, 2006, p.1-5 |
issn | |
language | eng |
recordid | cdi_ieee_primary_4085443 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Access protocols ; Bandwidth ; Information retrieval ; Internet ; Pixel ; Statistical distributions ; Traffic control ; Web pages ; Web server ; White spaces |
title | Investigating Distribution of Data of HTTP Traffic: An Empirical Study |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T17%3A14%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Investigating%20Distribution%20of%20Data%20of%20HTTP%20Traffic:%20An%20Empirical%20Study&rft.btitle=2006%20Innovations%20in%20Information%20Technology&rft.au=Chehadeh,%20Y.C.&rft.date=2006-11&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.isbn=9781424406739&rft.isbn_list=1424406730&rft_id=info:doi/10.1109/INNOVATIONS.2006.301928&rft_dat=%3Cieee_6IE%3E4085443%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781424406746&rft.eisbn_list=1424406749&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=4085443&rfr_iscdi=true |
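
The description above mentions per-type size statistics and a share of traffic spent on spacer objects and stale links. The following is a minimal Python sketch of how such figures could be computed from crawl metadata. It is not the authors' method: the `WebObject` record, its fields (including the `is_spacer` flag), and the sample values are all hypothetical, and treating a 404 status as the marker of a stale link is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, median


@dataclass(frozen=True)
class WebObject:
    """Hypothetical metadata record for one retrieved object."""
    url: str
    content_type: str        # e.g. "image/gif"
    size_bytes: int          # bytes sent by the server for this object
    status: int              # HTTP status code of the response
    is_spacer: bool = False  # assumed flag: object is a 1x1 spacer image


def size_stats_by_type(objects):
    """Return {content_type: (count, mean size, median size)}."""
    sizes = defaultdict(list)
    for obj in objects:
        sizes[obj.content_type].append(obj.size_bytes)
    return {t: (len(s), mean(s), median(s)) for t, s in sizes.items()}


def wasted_byte_share(objects):
    """Fraction of all bytes spent on spacer images or 404 responses."""
    total = sum(o.size_bytes for o in objects)
    if total == 0:
        return 0.0
    wasted = sum(
        o.size_bytes for o in objects if o.is_spacer or o.status == 404
    )
    return wasted / total


# Made-up sample records, for illustration only.
sample = [
    WebObject("http://example.com/", "text/html", 9000, 200),
    WebObject("http://example.com/logo.gif", "image/gif", 900, 200),
    WebObject("http://example.com/spacer.gif", "image/gif", 50, 200, is_spacer=True),
    WebObject("http://example.com/old.css", "text/html", 50, 404),
]

if __name__ == "__main__":
    for ctype, (count, avg, med) in sorted(size_stats_by_type(sample).items()):
        print(f"{ctype}: n={count}, mean={avg}, median={med}")
    print(f"wasted share: {wasted_byte_share(sample):.2%}")
```

Grouping sizes by content type first keeps the per-type statistics and the overall byte share independent, so either can be recomputed on a different subset of the crawl.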