DOCTORAL SEMINAR

Cost Performance of Scaling Applications on the Cloud

Speaker
Mr Rathnayake Mudiyanselage Sunimal Rathnayake
Advisor
Dr Teo Yong Meng, Associate Professor, School of Computing


06 Feb 2020 Thursday, 10:00 AM to 11:30 AM

Executive Classroom, COM2-04-02

Abstract:

The inherent scaling capability and pay-per-use charging of cloud computing have led to its growth as a compute infrastructure of choice for a wide range of users, from large corporations to individuals. While resource scaling on the cloud has been well explored, the cost performance of scaling applications on the cloud has received much less attention. Leveraging application scalability on inherently scalable cloud resources opens up new opportunities for cloud consumers to extract the maximum advantage from the cloud. This thesis investigates the premise of scaling applications on the cloud and its implications for cost performance. Motivated by Amdahl's fixed-workload scaling and Gustafson's fixed-time scaling for high-performance computing, we propose fixed-cost scaling for cloud computing and investigate the implications of the proposed fixed-cost law under fixed workload and fixed time. Under fixed-cost application scaling, we address three key issues: the large cloud resource configuration space, scaling application problem size, and scaling accuracy.
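For context, the two classical scaling laws referenced above can be sketched as follows (the proposed fixed-cost law is the thesis's own contribution and is not reproduced here):

```python
# Classical scaling laws for a program with parallel fraction p on n processors.

def amdahl_speedup(p, n):
    """Amdahl's law: fixed-workload speedup, limited by the serial fraction."""
    return 1.0 / ((1.0 - p) + p / n)

def gustafson_speedup(p, n):
    """Gustafson's law: fixed-time (scaled-workload) speedup."""
    return (1.0 - p) + p * n

# With 90% parallel work on 8 processors, Amdahl caps speedup well below 8,
# while Gustafson's scaled speedup approaches it.
print(amdahl_speedup(0.9, 8), gustafson_speedup(0.9, 8))
```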

To address the challenge presented by the large cloud resource configuration space, we propose a measurement-driven analytical modeling approach for determining cost-time Pareto-optimal cloud resource configurations. Our approach exposes the existence of multiple Pareto-optimal configurations that meet application cost-budget and time-deadline constraints. We investigate the impact of fixed-workload scaling on the cloud and discuss the effect of resource configuration on cost and time performance. Our results show that up to 30% cost savings can be achieved with a Pareto-optimal resource configuration for an n-body simulation application.
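A cost-time Pareto-optimal configuration is simply one that no other configuration beats on both cost and execution time. A minimal sketch, using hypothetical measured (cost, time) pairs rather than the thesis's data:

```python
# Filter measured configurations down to the cost-time Pareto frontier.
# Configuration names and figures below are illustrative, not from the thesis.

def pareto_optimal(configs):
    """Return configurations not dominated in both cost and time."""
    frontier = []
    for name, cost, time in configs:
        dominated = any(
            c2 <= cost and t2 <= time and (c2 < cost or t2 < time)
            for _, c2, t2 in configs
        )
        if not dominated:
            frontier.append((name, cost, time))
    return frontier

# Hypothetical measurements: (configuration, cost in $, time in hours)
configs = [
    ("2xlarge x4", 8.0, 1.0),
    ("xlarge x4",  5.0, 1.8),
    ("xlarge x8",  9.0, 1.1),  # dominated: costlier AND slower than 2xlarge x4
    ("large x4",   3.0, 3.5),
]

frontier = pareto_optimal(configs)
```

The frontier retains several configurations at once, which is why a budget and a deadline together, rather than either constraint alone, pick the final configuration.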

Given a fixed cost budget and a time deadline, we investigate the effect of scaling the problem size of an application. Through a measurement-driven analytical modeling approach, we show that a cost-time-problem-size Pareto frontier exists with multiple sweet spots meeting both cost-budget and time-deadline constraints. Among the Pareto-optimal problem sizes, we show that there are opportunities to tighten the cost budget and time deadline for a comparatively small reduction in problem size. To characterize the cost performance of cloud resources, we introduce a new metric, the Performance Cost Ratio (PCR), and demonstrate its use to efficiently derive near cost-time-optimal cloud resource configurations for executing applications on the cloud.
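The abstract does not spell out the PCR formula; a plausible reading is performance delivered per unit cost, which already suffices to rank instance types. A hedged sketch under that assumption, with hypothetical instance names and figures:

```python
# Assumed PCR form: performance per unit price (an illustration, not
# necessarily the thesis's exact definition). All figures are hypothetical.

def pcr(perf_gflops, price_per_hour):
    """Performance Cost Ratio: delivered performance per dollar-hour."""
    return perf_gflops / price_per_hour

# Hypothetical instances: name -> (sustained GFLOPS, $ per hour)
instances = {
    "type-A": (150.0, 0.170),
    "type-B": (120.0, 0.192),
}

# Rank instance types by PCR, best (most performance per dollar) first.
ranked = sorted(instances, key=lambda k: pcr(*instances[k]), reverse=True)
```

Ranking by a single scalar like this is what makes a near-optimal configuration derivable without exhaustively searching the full configuration space.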

In contrast to traditional applications that produce exact results, there exists a class of applications that produce approximate results with an associated notion of accuracy. For such applications, accuracy can be traded off for reduced execution cost and time. With a measurement-driven approach, we investigate the cost-time-accuracy performance of executing applications on the cloud, using Convolutional Neural Networks (CNNs) as an example. In contrast to studies that focus on the CNN training phase, which is largely a one-off cost, we focus on improving the cost performance of CNN inference. Our approach determines multiple sweet spots where cost and time can be reduced without a significant loss of inference accuracy. We show that selecting the right application configuration halves inference cost and time with only a one-tenth reduction in accuracy. To address the challenge of having multiple resource and application configurations that give the same accuracy but different cost and time, we introduce two new metrics, the Cost Accuracy Ratio (CAR) and the Time Accuracy Ratio (TAR), for quantifying the performance of cloud resources with respect to accuracy. Using CAR to guide our heuristic, cost-accuracy-efficient cloud resources can be determined in polynomial time.
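The exact CAR and TAR formulas are not given in the abstract; a natural reading is cost (respectively, time) paid per unit of accuracy achieved, so that lower values are better. A minimal sketch under that assumption, with hypothetical CNN inference configurations:

```python
# Assumed metric forms (illustrative, not necessarily the thesis's
# definitions): cost or time normalized by achieved accuracy.

def car(cost, accuracy):
    """Cost Accuracy Ratio: dollars spent per unit accuracy (lower is better)."""
    return cost / accuracy

def tar(time, accuracy):
    """Time Accuracy Ratio: seconds spent per unit accuracy (lower is better)."""
    return time / accuracy

# Hypothetical inference configurations: name -> (cost $, time s, accuracy)
options = {
    "full-precision": (1.0, 10.0, 0.95),
    "quantized":      (0.5,  5.0, 0.86),
}

# A CAR-guided heuristic picks the configuration with the lowest cost
# per unit accuracy, trading a little accuracy for large savings.
best = min(options, key=lambda k: car(options[k][0], options[k][2]))
```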