Investigating Distribution of Data of HTTP Traffic: An Empirical Study

Bibliographic details
Main authors:	Chehadeh, Y.C., Hatahet, A.Z., Agamy, A.E., Bamakhrama, M.A., Banawan, S.A.
Format:	Conference proceedings
Language:	English
Subjects:	Access protocols; Bandwidth; Information retrieval; Internet; Pixel; Statistical distributions; Traffic control; Web pages; Web server; White spaces

Description
Abstract:	Internet traffic today is dominated by the Hypertext Transfer Protocol (HTTP). Understanding the statistical characteristics of the data transferred via HTTP helps to model traffic patterns more accurately. In this work, we conduct an empirical study in which an experiment accesses roughly 34,000 of the most popular Web sites on the Internet and crawls their Web pages. We collect metadata on the roughly two million objects retrieved. We determine statistics and distributions based on object sizes, the occurrence of specific object types, and the sizes of specific types. The resulting distribution data can be used as a template model for Web-traffic modeling in future research. We further note an intriguing result: 5.7% of HTTP traffic from Web servers to clients is due to sending spacer objects (image files representing a 1×1 white-space pixel) or to stale links referencing non-existent files. This wasted bandwidth is not due to protocol overhead and can be minimized by simple additions to the HTML standard and by automating the removal of stale links.
DOI:	10.1109/INNOVATIONS.2006.301928