Obfuscatory obscanturism: Making workload traces of commercially-sensitive systems safe to release

Cloud providers such as Google are interested in fostering research on the daunting technical challenges they face in supporting planetary-scale distributed systems, but no academic organizations have similar scale systems on which to experiment. Fortunately, good research can still be done using tr...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Reiss, C., Wilkes, J., Hellerstein, J. L.
Format: Tagungsbericht
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Cloud providers such as Google are interested in fostering research on the daunting technical challenges they face in supporting planetary-scale distributed systems, but no academic organizations have similar scale systems on which to experiment. Fortunately, good research can still be done using traces of real-life production workloads, but there are risks in releasing such data, including inadvertently disclosing confidential or proprietary information, as happened with the Netflix Prize data. This paper discusses these risks, and our approach to them, which we call systematic obfuscation. It protects proprietary and personal data while leaving it possible to answer interesting research questions. We explain and motivate some of the risks and concerns and propose how they can best be mitigated, using as an example our recent publication of a month-long trace of a production system workload on a 11k-machine cluster.
ISSN:1542-1201
2374-9709
DOI:10.1109/NOMS.2012.6212064