Fault tolerant internet computing: Benchmarking and modelling trade-offs between availability, latency and consistency

Bibliographic details
Published in: Journal of network and computer applications 2019-11, Vol.146, p.102412, Article 102412
Main authors: Gorbenko, Anatoliy; Romanovsky, Alexander; Tarasyuk, Olga
Format: Article
Language: English
Description
Abstract: The paper discusses our practical experience and theoretical results from investigating the impact of consistency on latency in distributed fault-tolerant systems built over the Internet and clouds. We introduce a time-probabilistic failure model of distributed systems that employ the service-oriented paradigm for defining cooperation with clients over the Internet and clouds. The trade-offs between consistency, availability and latency are examined, as well as the role of the application timeout as the main determinant in the interplay between system availability and responsiveness. The model relies heavily on collecting and analysing a large amount of data representing the probabilistic behaviour of such systems. The paper presents experimental results of measuring the response time in a distributed service-oriented system whose replicas are deployed at different Amazon EC2 location domains. These results clearly show that improvements in system consistency increase system latency, which is in line with the qualitative implication of the well-known CAP theorem. The paper proposes a set of novel mathematical models that are based on statistical analysis of the collected data and enable quantitative prediction of response time as a function of the timeout setting and the level of consistency provided by the replicated system.
• The improved understanding of response time uncertainty contributes to the foundations of Internet computing.
• The results of experiments studying the impact of consistency on response time will help developers of replicated systems.
• Quantifying the trade-offs between consistency, availability and latency contributes to knowledge about Brewer's CAP theorem.
• Novel analytical models help developers to meet timing/consistency constraints and availability requirements.
• The analogy between Heisenberg's uncertainty principle and the uncertainty of replicated data sheds light on the nature of large-scale Internet systems.
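The interplay described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes three replicas whose reply delays follow a hypothetical log-normal distribution, treats the consistency level as the number of replica replies a client waits for (the names `ONE`, `QUORUM` and `ALL` are illustrative labels, not taken from the paper), and counts a request as available only if it is answered before an assumed application timeout of 0.5 s.

```python
import random
import statistics


def simulate_response_times(n_replicas, k_required, n_requests, seed=0):
    """Per-request latency: time until k_required of n_replicas have replied."""
    rng = random.Random(seed)
    latencies = []
    for _ in range(n_requests):
        # Hypothetical replica delays in seconds (log-normal is an assumption).
        replies = sorted(rng.lognormvariate(-1.0, 0.5) for _ in range(n_replicas))
        # The request completes when the k-th fastest replica has answered.
        latencies.append(replies[k_required - 1])
    return latencies


def availability_within_timeout(latencies, timeout):
    """Fraction of requests answered before the application timeout expires."""
    return sum(1 for t in latencies if t <= timeout) / len(latencies)


def summarize(n_replicas=3, n_requests=10_000, timeout=0.5):
    """Mean latency and timeout-bounded availability per consistency level."""
    levels = (("ONE", 1), ("QUORUM", n_replicas // 2 + 1), ("ALL", n_replicas))
    rows = {}
    for name, k in levels:
        # The same seed is reused so every level sees identical replica delays.
        lat = simulate_response_times(n_replicas, k, n_requests)
        rows[name] = (statistics.mean(lat), availability_within_timeout(lat, timeout))
    return rows


if __name__ == "__main__":
    for name, (mean_latency, avail) in summarize().items():
        print(f"{name:6s} mean latency = {mean_latency:.3f} s, "
              f"answered within timeout = {avail:.3f}")
```

Because every level is evaluated on the same simulated delays, waiting for more replicas can only lengthen each request, so mean latency rises and the share of requests answered within the timeout falls as the required consistency grows. Real measurements, such as those the paper reports for Amazon EC2, would replace the synthetic delay distribution.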
ISSN: 1084-8045
EISSN: 1095-8592
DOI: 10.1016/j.jnca.2019.102412