- In the first workload model a task has a deadline and a certain amound of VM hours needed to execute the task, these hours can be divided over time such that it best fits the scheduler (more freedom to determine the number of VMs used in one hour).
- The second model has tasks specified with a deadline and a certain length, but at every hour is also specified how many VMs are needed to execute the task. When a tsk is generated using the task generator of the application, a file specifying the workload of the model is also generated. The ratio of EC2 compute instances needed for a certain instance type is used to determine the number of VMs needed. The workload generator now takes an upper and lower boundary and lets the workload fluctuate between these two according to the EC2 compute instances associated with the corresponding instance type. A snippet from a CSV file containing such a load description is given here.

**Workload Model 1 (total VM hours specified)**

To determine the optimal division between reserved and on-demand instances it is important to schedule the tasks/loads in such a way that the VMs are in use as much of the time as possible. Scheduling the tasks was rather easy for the second type of workload, so I'll first have a look at this workload model, where a task consist of a certain amount of VM hours that need to be executed before a deadline. When no scheduling is performed this would be an example of a possible result with a corresponding total price of 619.92 US dollars.

When scheduling is performed we obtain a result with a total price of 516 US dollars, here all 4 instances that are used are reserved instances, while when there was no intelligent scheduler used, there were 3 on-demand instances as well. The scheduling algorithm resulted for this example in a decrease of 16.76 percents of the total cost.

To perform the scheduling for this workload model, the tasks are first sorted in an earliest deadline first manner. Then for every task we start filling the VM hours for every instance from the earliest available VM hour till the current tasks deadline hour. If existing instances are filled till your deadline and you still have VM hours to distribute the needed extra instances are added. This actually yields an optimal scheduling solution such that the right amount of reserved instances will be provisioned. How to determine the boundary (from what amount of VM hour utilization it is cheaper to rent a reserved instance) between a reserved and an on-demand instance was explained earlier on this blog.

**Workload Model 2 (every hour #VMs specified)**

Getting the scheduling in order for the first workload model was rather difficult. I'll first present the results for this model as well and then describe the scheduling algorithm itself. Using the basic scheduler (which means that all tasks simply start at their starting date) the following result is obtained, with a total cost price of 37509.08 US dollars.

When scheduling is performed we obtain a result with a total price of 34531.56 US dollars, also we only need 27 instances instead of the 39 instances that we needed before. And now 13 instances are reserved ones, before only 11 were reserved instances. The scheduling algorithm resulted for this example in a decrease of 7.94 percents of the total cost.

I first developed a way to optimally schedule these tasks as well, but afterwards I came to the conclusion that the time complexity was too high, so a suboptimal algorithm had to be put in place. I'll first describe the optimal solution that I first tried out. It consisted of first calculating all different divisions of the workload hours over the time available for that task, this is equal to a combination of x out of y with y being the total number of time slots available and x the number of time slots used by the task. Because not all these combination can be stored in memory a combination was only generated at the moment it was used. This was made possible by a combination generator class that was initialized with x and y and could be used to iterate over all the possible combinations. By now determining the optimal combination of task schedules for the different tasks, we could determine the optimal solution. Optimal in this interpreted as the solution that minimizes the number of instances needed. This already makes the solution suboptimal, since this requirement will yield an even distribution in most cases, but it is not necessarily the optimal case. I also had a look at optimizing on how many reserved instances would be needed, this would yield the real optimal solution, I'll explain later why this wasn't a solution as well. To make the algorithm feasible again I decided to use the above algorithm in smaller intervals ... which means dividing the time in intervals and dividing the workload evenly between the intervals. The size of such interval is determined by having a look at how many combinations would be possible within the interval and comparing this with a threshold value that has to be set. Also the deadlines of the tasks will be necessary interval borders. In each interval the above described algorithm was used to determine a suboptimal solution. Now we see that it is impossible to determine how many reserved instances would be needed in total, I tried a scheduler which interpolated the number of hours the instance was needed in the interval to the bigger time frame, but this didn't seem to give better results than the scheduler that minimized the number of instances needed in each interval. Next I tried to minimize the standard deviation in number of instances needed each hour in the interval, this seemed to yield a better result for the example.

There are a lot of things that could still be tried to make this scheduler better, maybe some updates will be posted in the coming weeks/months. By the way, a new version of the EC2 Broker can be found on the SVN repository.

## No comments:

## Post a Comment