Dynamic Resource Provisioning for Sustainable Cloud Computing Systems in the Presence of Correlated Failures
2021 (English). In: IEEE Transactions on Sustainable Computing, ISSN 2377-3782, Vol. 6, no 4, p. 641-654. Article in journal (Refereed), Published
Abstract [en]
The dependence of computing resources on each other in cloud computing systems (CCS) makes them prone to fail in a correlated manner, which significantly impacts their service reliability and energy efficiency. Addressing these two metrics of CCS together while considering correlated failures has remained an open question, and it is the focus of this work. This paper proposes mechanisms for jointly improving reliability and energy efficiency under correlated failures in CCS. To model failure correlation, statistical cluster analysis techniques are applied to real failure traces. Mathematical models are then built to calculate the reliability and energy consumption of failure-prone CCS. These models are used to design fault-tolerant and energy-aware resource provisioning mechanisms and policies. To further reduce energy consumption, a correlated failure-aware VM consolidation policy is also proposed. A simulation-based study of the proposed resource management policies and fault tolerance mechanisms is conducted using real failure traces and a Bag-of-Tasks workload. The results demonstrate that, by exploiting failure correlation with the proposed resource management policies, we reduce the occurrence of failures on tasks by approximately 34% and increase the energy efficiency of the system by approximately 20%, in comparison to environments where failures are handled independently.
Place, publisher, year, edition, pages
IEEE, 2021. Vol. 6, no 4, p. 641-654
Keywords [en]
Bag of Tasks, Checkpointing, Cloud Computing, Cluster Analysis, Correlated Failures, Energy Efficiency, Reliability, VM Consolidation, VM Migration, Energy policy, Energy utilization, Failure (mechanical), Fault tolerance, Fault tolerant computer systems, Natural resources management, Power management, Resource allocation, Cloud computing system (CCS), Cluster analysis technique, Dynamic resource provisioning, Failure correlation, Fault tolerance mechanisms, Resource management policy, Service reliability, Green computing
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kau:diva-83013
DOI: 10.1109/TSUSC.2020.3025180
Scopus ID: 2-s2.0-85091287275
OAI: oai:DiVA.org:kau-83013
DiVA, id: diva2:1529874
2021-02-19 2021-02-19 2022-05-23 Bibliographically approved