Slow Replica and Shared Protection: Energy-Efficient and Reliable Task Assignment in Cloud Data Centers

Bibliographic details
Published in: IEEE Transactions on Reliability, 2021-09, Vol. 70 (3), p. 931-943
Main authors: Fan, Yuqi; Wang, Chen; Wu, Weili; Znati, Taieb; Du, Dingzhu
Format: Article
Language: English
Description
Summary: With the explosive growth in the scale of cloud computing infrastructures, reliability and energy efficiency have become important concerns, given the great complexity of cloud data centers. There is an urgent need for efficient task assignment that can dispatch tasks to appropriate cloud data center servers, which is critical to achieving reliability and energy efficiency in current cloud data centers. Most research on task assignment focuses on only one of the two objectives, reliability or energy efficiency, even though the two intrinsically conflict with each other. In this paper, we deal with the problem of task assignment in data centers, with the objective of minimizing energy consumption while providing tolerance to task execution failures. We propose a reliability-aware and energy-efficient task replica assignment algorithm based on running task replicas at a low speed and enabling multiple task replicas to share the same server resources. Each task in a job processed by the cloud computing platform has two instances: a main task and a task replica (shadow). Each main task runs on an individual server, and the task replica associated with the main task is assigned to a different server. The main tasks run at the full server speed, while the task replicas run at a lower rate than the main tasks. The task replicas can be mapped onto dedicated backup servers or be assigned to the servers on which the main tasks are running. Multiple task replicas can share the same server resources to reduce the number of servers required. We conduct experiments through simulations. Experimental results demonstrate that the proposed algorithm can effectively reduce energy consumption while achieving a good balance between the number of servers used and job completion time.
ISSN: 0018-9529, 1558-1721
DOI: 10.1109/TR.2019.2923770
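
The placement rules stated in the summary (one main task per server, each replica on a different server than its main task, several replicas sharing one server, replicas running slower than main tasks) can be illustrated with a minimal sketch. This is not the algorithm from the paper: the names `Server`, `assign_tasks`, `replica_runtime`, and the parameters `replicas_per_server` and `replica_speed` are hypothetical, only the dedicated-backup-server option is modeled, and the sharing limit is assumed to be a fixed count per server.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Server:
    sid: int
    main_task: Optional[int] = None                    # at most one main task per server
    replicas: List[int] = field(default_factory=list)  # replicas sharing this server


def assign_tasks(num_tasks: int, replicas_per_server: int) -> List[Server]:
    """Place each main task on its own server and pack replicas onto
    dedicated backup servers (assumed fixed sharing limit per server)."""
    if num_tasks < 0 or replicas_per_server < 1:
        raise ValueError("need num_tasks >= 0 and replicas_per_server >= 1")
    # One server per main task; server id equals task id for simplicity.
    servers = [Server(sid=t, main_task=t) for t in range(num_tasks)]
    backup: Optional[Server] = None
    for t in range(num_tasks):
        # Open a new backup server when none exists or the current one is full.
        if backup is None or len(backup.replicas) >= replicas_per_server:
            backup = Server(sid=len(servers))
            servers.append(backup)
        backup.replicas.append(t)
    return servers


def replica_runtime(full_speed_runtime: float, replica_speed: float) -> float:
    """Runtime of a replica at a relative speed in (0, 1], where 1.0 is full speed."""
    if not 0.0 < replica_speed <= 1.0:
        raise ValueError("replica_speed must be in (0, 1]")
    return full_speed_runtime / replica_speed


if __name__ == "__main__":
    placement = assign_tasks(num_tasks=5, replicas_per_server=2)
    print(len(placement))  # 5 main servers + 3 backup servers = 8
```

Raising `replicas_per_server` lowers the server count, which corresponds to the trade-off between the number of servers used and job completion time that the summary mentions. The paper's actual energy model and assignment policy are not described in this record and are not reproduced here.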