[1] National Institute of Standards and Technology, “NIST Cloud Computing Standards Roadmap,” NIST Special Publication 500 - 291, 2013. [2] M. Toeroe and F. Tam, Service availability principles and practice. John Wiley and Sons Ltd publication, 2012. [3] M. Nabi, M. Toeroe, and F. Khendek, “Availability in the cloud: State of the art,” J. Netw. Comput. Appl., vol. 60, pp. 54–67, 2016. [4] A. Undheim, A. Chilwan, and P. Heegaard, “Differentiated availability in cloud computing SLAs,” in 2011 IEEE/ACM 12th International Conference on Grid Computing, 2011, pp. 129–136. [5] M. Nabi, F. Khendek, and M. Toeroe, “Upgrade of the IaaS cloud: Issues and potential solutions in the context of high-Availability,” in 26th IEEE International Symposium on Software Reliability Engineering, Industry track, 2015, pp. 21–24. [6] N. Roy, A. Dubey, and A. Gokhale, “Efficient autoscaling in the cloud using predictive models for workload forecasting,” in 2011 IEEE 4th International Conference on Cloud Computing (CLOUD), 2011, pp. 500–507. [7] Amazon Web Services, “Amazon EC2 Auto Scaling User Guide,” 2018. [Online]. Available: https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-dg.pdf. [Accessed: 05-Jul-2018]. [8] F. Paraiso, P. Merle, and L. Seinturier, “Managing elasticity across multiple cloud providers,” in 2013 International workshop on Multi-cloud applications and federated clouds - MultiCloud ’13, 2013, pp. 53–60. [9] Amazon Web Services, “UpdatePolicy Attribute,” 2019. [Online]. Available: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-attribute-updatepolicy.html. [Accessed: 05-Aug-2019]. [10] Amazon Web Services, “AWS::AutoScaling::ScheduledAction,” 2019. [Online]. Available: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-as-scheduledaction.html. [Accessed: 23-Aug-2019]. [11] I. Foster, Y. Zhao, I. Raicu, and S. Lu, “Cloud Computing and Grid Computing 360-Degree Compared,” in 2008 Grid Computing Environments Workshop, 2008, pp. 1–10. [12] Q. Zhang, L. Cheng, and R. Boutaba, “Cloud computing: state-of-the-art and research challenges,” pp. 7–18, 2010. [13] Amazon, “Amazon EC2,” 2018. [Online]. Available: http://aws.amazon.com/ec2/. [Accessed: 30-Jul-2018]. [14] “Google App Engine,” 2018. [Online]. Available: https://cloud.google.com/appengine/. [Accessed: 30-Jul-2018]. [15] “Salesforce,” 2018. [Online]. Available: https://www.salesforce.com/. [Accessed: 30-Jul-2018]. [16] H. Alipour, Y. Liu, and A. Hamou-Lhadj, “Analyzing Auto-scaling Issues in Cloud Environments,” Proc. 24th Annu. Int. Conf. Comput. Sci. Softw. Eng. IBM Corp., pp. 75–89, 2014. [17] F. L. Ferraris et al., “Evaluating the auto scaling performance of flexiscale and amazon EC2 clouds,” Proc. - 14th Int. Symp. Symb. Numer. Algorithms Sci. Comput. SYNASC 2012, pp. 423–429, 2012. [18] “OpenStack.” [Online]. Available: http://www.openstack.org/. [Accessed: 05-Aug-2019]. [19] OpenStack, “Heat documentation.” [Online]. Available: http://docs.openstack.org/developer/heat/. [Accessed: 01-May-2019]. [20] H. Khazaei, M. Jelena, V. B.Misic, and N. Beigi Mohammadi, “Availability Analysis of Cloud Computing Centers,” in Communication Software, Service and Multimeda Symposium, 2012, pp. 1981–1986. [21] F. Longo, R. Ghosh, V. K. Naik, and K. S. Trivedi, “A Scalable Availability Model for Infrastructure-as-a-Service Cloud,” in IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN), 2011, p. pp.335,346. [22] M. Mihailescu, A. Rodriguez, and C. Amza, “Enhancing application robustness in infrastructure-as-a-service clouds,” Proc. Int. Conf. Dependable Syst. Networks, pp. 146–151, 2011. [23] Q. Zhang, M. F. Zhani, M. Jabri, and R. Boutaba, “Venice: Reliable virtual data center embedding in clouds,” IEEE INFOCOM 2014 - IEEE Conf. Comput. Commun., pp. 289–297, 2014. [24] D. Jayasinghe, C. Pu, T. Eilam, M. Steinder, I. Whally, and E. Snible, “Improving Performance and Availability of Services Hosted on IaaS Clouds with Structural Constraint-Aware Virtual Machine Placement,” in IEEE International Conference on Services Computing, 2011, pp. 72–79. [25] A. Jahanbanifar, F. Khendek, and M. Toeroe, “Providing Hardware Redundancy for Highly Available Services in Virtualized Environments,” 8th IEEE Int. Conf. Softw. Secur. Reliab., no. Vmm, pp. 40–47, 2014. [26] Distributed-Management-Task-Force (DMTF), “Open Virtualization Format Specification,” 2013. [Online]. Available: https://www.dmtf.org/sites/default/files/standards/documents/DSP0243_2.1.0.pdf. [Accessed: 10-Dec-2018]. [27] E. A. Brewer, “Lessons from giant-scale services,” IEEE Internet Comput., vol. 5, no. 4, pp. 46–55, 2001. [28] T. Dumitras, P. Narasimhan, and E. Tilevich, “To Upgrade or Not to Upgrade Impact of Online Upgrades across Multiple Administrative Domains,” ACM Int. Conf. Object oriented Program. Syst. Lang. Appl. (OOPSLA ’10), pp. 865--876, 2010. [29] T. Dumitraş and P. Narasimhan, “Why do upgrades fail and what can we do about It? Toward dependable, online upgrades in enterprise system,” in 10th ACM/IFIP/USENIX International Conference on Middleware (Middleware ’09), 2009, vol. 5896 LNCS, pp. 349–372. [30] T. Dumitras, “Cloud Software Upgrades : Challenges and Opportunities,” in 2011 IEEE International Workshop on the Maintenance and Evolution of Service-Oriented and Cloud-Based Systems (MESOCA ’11), 2011, pp. 1–10. [31] T. Das, E. T. Roush, and P. Nandana, “Quantum Leap Cluster Upgrade,” in Proceedings of the 2nd Bangalore Annual Compute Conference (COMPUTE ’09), 2009, pp. 2–5. [32] X. Ouyang, B. Ding, and H. Wang, “Delayed switch: Cloud service upgrade with low availability and capacity loss,” in 2014 IEEE 5th International Conference on Software Engineering and Service Science (ICSESS), 2014, pp. 1158–1161. [33] T. Dumitras, “Dependable, Online Upgrades in Enterprise Systems,” 24th ACM SIGPLAN Conf. Companion Object Oriented Program. Syst. Lang. Appl. (OOPSLA ’09), pp. 835–836, 2009. [34] T. Dumitra and P. Narasimhan, “Toward Upgrades-as-a-Service in Distributed Systems,” in 10th ACM/IFIP/USENIX International Conference on Middleware (Middleware ’09), 2009. [35] B. Calder et al., “Windows Azure Storage : A Highly Available Cloud Storage Service with Strong Consistency,” in 23rd ACM Symposium on Operating Systems Principles (SOSP), 2011, vol. 20, pp. 143–157. [36] Amazon Web Services, “AWS Elastic Beanstalk Developer Guide API Version 2010-12-01,” 2010. [Online]. Available: http://awsdocs.s3.amazonaws.com/ElasticBeanstalk/latest/awseb-dg.pdf. [Accessed: 05-Aug-2019]. [37] D. Sun, D. Guimarans, A. Fekete, V. Gramoli, and L. Zhu, “Multi-objective Optimisation for Rolling Upgrade Allowing for Failures in Clouds,” in 2015 IEEE 34th Symposium on Reliable Distributed Systems (SRDS), 2015, pp. 68–73. [38] V. Gramoli, L. Bass, A. Fekete, and D. W. Sun, “Rollup: Non-Disruptive Rolling Upgrade with Fast Consensus-Based Dynamic Reconfigurations,” IEEE Trans. Parallel Distrib. Syst., vol. 27, pp. 2711–2724, 2016. [39] D. Sun et al., “Quantifying failure risk of version switch for rolling upgrade on clouds,” 2014 IEEE Fourth Int. Conf. Big Data Cloud Comput., pp. 175–182, 2014. [40] D. Sun, A. Fekete, V. Gramoli, G. Li, X. Xu, and L. Zhu, “R2C: Robust Rolling-Upgrade in Clouds,” IEEE Trans. Dependable Secur. Comput., pp. 1–1, 2016. [41] K. Liu, D. Zou, and H. Jin, “UaaS: Software Update as a Service for the IaaS Cloud,” Proc. - 2015 IEEE Int. Conf. Serv. Comput. SCC 2015, pp. 483–490, 2015. [42] Distributed-Management-Task-Force(DMTF), “Cloud Infrastructure Management Interface (CIMI) Model and RESTful HTTP-based Protocol: An Interface for Managing Cloud Infrastructure.” . [43] Open-Grid-Forum, “Open Cloud Computing Interface - OCCI.” [Online]. Available: http://occi-wg.org/. [Accessed: 01-May-2015]. [44] “Cloud Application Management for Platforms Version 1.1.” [Online]. Available: http://docs.oasis-open.org/camp/camp-spec/v1.1/camp-spec-v1.1.html. [45] OASIS, “Topology and Orchestration Specification for Cloud Applications (TOSCA).” [Online]. Available: http://docs.oasis-open.org/tosca/TOSCA/v1.0/TOSCA-v1.0.pdf. [46] R. Jain and S. Paul, “Network Virtualization and Software Defined Networking for Cloud Computing:A Survey,” IEEE Commun. Mag., no. November, pp. 24–31, 2013. [47] Intel, “PCI-SIG Single Root I / O Virtualization ( SR-IOV ) Support in Intel ® Virtualization Technology for Connectivity,” 2008. [48] H. M. Tseng, H. L. Lee, J. W. Hu, T. L. Liu, J. G. Chang, and W. C. Huang, “Network virtualization with cloud virtual switch,” Proc. Int. Conf. Parallel Distrib. Syst. - ICPADS, pp. 998–1003, 2011. [49] “ESXi: Bare Metal Hypervisor,” 2018. [Online]. Available: https://www.vmware.com/ca/products/esxi-and-esx.html. [Accessed: 01-Oct-2018]. [50] “Ceph.” [Online]. Available: https://ceph.com/. [Accessed: 05-Jan-2017]. [51] “Ansible.” [Online]. Available: http://www.ansible.com/home. [Accessed: 20-Aug-2019]. [52] “OpenSAF - The Open Service Availability Framework.” [Online]. Available: http://opensaf.sourceforge.net/documentation.html. [Accessed: 20-Aug-2019]. [53] P. Heidari, M. Hormati, M. Toeroe, Y. Al Ahmad, and F. Khendek, “Integrating OpenSAF High Availability Solution with OpenStack,” in Services (SERVICES), 2015 IEEE World Congress on, 2015, pp. 229–236. [54] “Puppet labs.” [Online]. Available: https://puppetlabs.com/?_ga=1.122891208.2105885589.1429055377. [Accessed: 01-May-2018]. [55] “Ruby.” [Online]. Available: https://www.ruby-lang.org/en/. [Accessed: 01-May-2019]. [56] “Chef.” [Online]. Available: https://www.chef.io/chef/. [Accessed: 01-May-2018]. [57] “Salt.” [Online]. Available: http://docs.saltstack.com/en/latest/. [Accessed: 01-May-2018]. [58] “Python.” [Online]. Available: https://www.python.org/. [Accessed: 01-May-2019]. [59] “Mistral.” [Online]. Available: https://docs.openstack.org/mistral/latest/. [Accessed: 01-Jun-2019]. [60] “TaskFlow.” [Online]. Available: https://wiki.openstack.org/wiki/TaskFlow. [Accessed: 01-May-2018]. [61] “The Official YAML Web Site.” [Online]. Available: http://yaml.org/. [Accessed: 01-May-2019]. [62] M. Nabi, M. Toeroe, and F. Khendek, “Rolling upgrade with dynamic batch size for Iaas cloud,” in 2016 IEEE 9th International Conference on Cloud Computing (CLOUD), 2016, pp. 497–504. [63] “VMware vSAN.” [Online]. Available: https://docs.vmware.com/en/VMware-vSAN/index.html. [Accessed: 05-Jan-2018]. [64] H. Pham, “System Reliability Concepts,” Syst. Softw. Reliab., pp. 9–75, 2006. [65] L. Tomás and J. Tordsson, “Improving cloud infrastructure utilization through overbooking,” Proc. 2013 ACM Cloud Auton. Comput. Conf. - CAC ’13, p. 1, 2013. [66] L. Tomas and J. Tordsson, “An autonomic approach to risk-aware data center overbooking,” IEEE Trans. Cloud Comput., vol. 2, no. 3, pp. 292–305, 2014. [67] “Vagrant.” [Online]. Available: https://www.vagrantup.com/. [Accessed: 01-Oct-2018]. [68] “vagrant-ansible-openstack.” [Online]. Available: https://github.com/dguerri/vagrant-ansible-openstack. [69] “The Go Programming Language,” 2018. [Online]. Available: https://golang.org/. [Accessed: 10-Oct-2018]. [70] “gophercloud: The OpenStack SDK for Go,” 2018. [Online]. Available: http://gophercloud.io/. [Accessed: 10-Oct-2018]. [71] “QEMU.” [Online]. Available: https://www.qemu.org/. [Accessed: 20-Aug-2019]. [72] A. T. Foundjem, “Towards Improving the Reliability of Live Migration Operations in OpenStack Clouds.,” Ecole Polytechnique de Montreal, 2017. [73] S. K. Garg, A. N. Toosi, S. K. Gopalaiyengar, and R. Buyya, “SLA-based virtual machine management for heterogeneous workloads in a cloud datacenter,” J. Netw. Comput. Appl., vol. 45, pp. 108–120, 2014. [74] “JGraphT a Java library of graph theory data structures and algorithms.” [Online]. Available: https://jgrapht.org/. [Accessed: 05-Dec-2018].