Optimize Your Cloud Capacity [Webinar Recap]

    May 25, 2018

    By Per Bauer

    Key Principles:

    • Trim workloads before migrating
    • Tag charges by responsible department
    • Continue improving cost and performance prediction

    In our recent webinar, we went deeper into managing your cloud migration. If you haven’t checked out our webinars on Getting Started with Your Cloud Migration and How to Control Your Cloud Deployment, you can watch those on-demand to get some background on why and how to move workloads to the cloud. Here’s a recap of our webinar on how to optimize those workloads to make sure you are making the most of your cloud capacity and staying on budget.

    Conversation about cloud used to be about private versus public, with the mindset that you could use both for different workloads or even move workloads between them. This approach is called hybrid cloud. More recently, though, public cloud has advanced much faster than private cloud, so managing hybrid cloud has become increasingly difficult. Multi-cloud, using multiple public cloud providers, creates competition, so you can always go with the vendor that provides the best software as a service (SaaS) or platform as a service (PaaS). For infrastructure as a service (IaaS), however, managing multiple vendors at the same time is difficult. You could partition your environment, but most organizations go with a single vendor, which is what we assume going forward.

    99% of the market is running workloads both on-premises and in the cloud, which is to say hybrid IT. Most organizations, in other words, are facing the challenge of managing hybrid IT. With that in mind, we have to ask the question:

    How does capacity management change in the cloud?

    Most of the driving factors for capacity management still exist when we compare cloud with on-premises. The main difference is that on-premises capacity management is about optimizing limited data center resources to make sure you have headroom and growth margins. Public cloud, by contrast, has (at least in theory) infinite capacity, so a limited budget, rather than limited resources, is the motivation for optimizing.

    When optimizing your on-premises resources, efficiency gains are realized over time, not immediately. It takes longer to make changes, and you won’t see the impact in cost or performance right away. In the cloud, however, efficiency improvements have immediate impact; your next bill will be lower.

    You can grow your workloads as much as you want in the cloud, but it will cost more without optimization, and you won’t be able to predict what comes next. So how does one start to optimize workloads in the cloud?

    Identify and Address Inefficiencies

    When you first lift a workload over to the cloud, your newly migrated workload will cost more than expected if you don’t make any changes before moving. If you’ve followed our other webinars, you know that we advocate the Lift, Trim, and Shift model of cloud migration.

    Trim the Fat

    Lift, Trim, and Shift cloud migration means looking at the resource utilization, business activity cycles, instances, and growth headroom of each of your VMs before moving. The benefits of this approach include:

    • Better resource utilization
    • More room to grow
    • 35% to 85% cost savings for deployment to cloud, depending on how well you’ve trimmed
    • A golden opportunity to increase the efficiency of your instances

    New workloads in the cloud mean you have to let go of some control and be more reactive than proactive, which can be hard to accept. Usually capacity managers focus on proactive activities, but in the cloud you depend on reactive clean-up and right-sizing procedures.

    So, what are these right-sizing procedures? We have four specific ones for you, and it’s very important to do them correctly so you don’t hurt your workloads.

    1. Categorize workloads. Are they batch workloads? Online transactional? What is the level of sustained activity: is it static, or active only at certain times of day?

    2. Understand business activity cycles and seasonality. What is the peak that we need to provision for?

    3. Focus on performance impact rather than just looking at utilization metrics, because some workloads just grab whatever is available. High utilization levels don’t necessarily mean the application can’t perform. We suggest building queueing models for the system to make sure you understand the performance impact of the behavior and not just the utilization metrics.

    4. Compare apples to apples. Even though applications may—at face value—have the same amount of resources from the cloud instance, they may have completely different performance numbers. Use tools to help you benchmark different cloud instance performance characteristics. Your usage doesn’t necessarily tell you how well something is performing.
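    Steps 3 and 4 above can be illustrated with a minimal M/M/1 queueing sketch (a textbook model, not the webinar's own tooling; the rates below are made up). It shows why the same utilization level can hide very different performance: two systems both at 70% busy have a 10x difference in mean response time when one serves requests 10x faster.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time of an M/M/1 queue.

    Utilization rho = arrival_rate / service_rate must be below 1;
    mean response time is R = 1 / (service_rate - arrival_rate).
    """
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("system is unstable: utilization >= 100%")
    return 1.0 / (service_rate - arrival_rate)

# Both systems run at 70% utilization, yet respond 10x apart:
fast = mm1_response_time(arrival_rate=70.0, service_rate=100.0)  # ~0.033 s
slow = mm1_response_time(arrival_rate=7.0, service_rate=10.0)    # ~0.333 s
```

    This is why benchmarking instance performance characteristics matters: identical utilization numbers on paper can mean very different experiences for the application.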

    On-Demand vs. Reserved Instances

    On-demand, reserved, and spot instances are the three types of instances you can use in the cloud. Spot is a discounted unit that someone else has reserved but is not using. Reserved means that you commit to a certain instance for a time period at a set rate. You’ll get a discount for committing, depending on your payment schedule and whether it’s a standard or convertible reservation. On-demand simply means using what you need when you need it. It’ll be slightly more expensive per instance, but it can balance out if your workloads are less predictable.

    When you’re evaluating whether or not reserved instances are worth it, we recommend calculating a payback period: the number of months of 100% usage it takes before you see a price benefit. You’ll see only moderate gains for instances used less than 65% of the time. Rather than reserving per instance, group your instances by application, function, service, or department and reserve based on those groups. It’ll make them easier to manage.

    Your rule of thumb for reserved versus on-demand instances can be to use reserved instances to cover your sustained activity, or your consistent usage level. Then supplement with on-demand instances to cover peaks and bursts in activity.
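    The break-even point behind that rule of thumb is easy to compute. This is a simplified sketch with hypothetical hourly rates (real AWS/Azure pricing varies by instance family, term, and payment option): a reservation is billed whether the instance runs or not, so it pays off once actual usage exceeds the ratio of the reserved rate to the on-demand rate.

```python
def breakeven_utilization(on_demand_hourly, reserved_effective_hourly):
    """Fraction of hours an instance must run for a reservation to pay off.

    A reservation is billed for every hour, used or not, so it breaks even
    when: utilization * on_demand_hourly == reserved_effective_hourly.
    """
    return reserved_effective_hourly / on_demand_hourly

# Hypothetical rates: $0.10/h on demand vs. $0.065/h effective reserved rate.
threshold = breakeven_utilization(0.10, 0.065)  # 0.65 -> reserve above ~65% usage
```

    Below the threshold, on-demand is cheaper despite its higher hourly rate, which matches the guidance above: reserve for sustained activity, go on-demand for peaks and bursts.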

    Practical ways to predict and manage the cost of optimization

    Cost monitoring

    With AWS and Azure, you can get ongoing monitoring of cost. You can also get an estimated charge during the month and a predicted bill at the end of the month. It uses linear extrapolation and doesn’t take new demand into account, but it gives you at least a baseline to compare against your budget.
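    The linear extrapolation those dashboards use amounts to projecting the current daily run rate over the whole month. A minimal sketch (our own illustration, not the providers' actual billing code):

```python
import calendar
import datetime

def predict_month_end_spend(spend_to_date, as_of):
    """Linearly extrapolate month-to-date spend to a full-month estimate.

    Assumes spend continues at the current daily run rate; like the
    provider dashboards, it ignores new demand added later in the month.
    """
    days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
    daily_rate = spend_to_date / as_of.day
    return daily_rate * days_in_month

# $4,500 spent by the 15th of a 30-day month -> $9,000 projected bill
estimate = predict_month_end_spend(4500.0, datetime.date(2018, 6, 15))
```

    If a large migration lands mid-month, the projection will lag reality, which is exactly why the text calls it a baseline rather than a forecast.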

    Use tags for your instances to bring some order. You’ll need a global structure and naming convention within your organization; otherwise it becomes messy. Using tags will allow you to track and report consumption and cost by department to enable charge out.

    When departments pay for what they use, they’ll be much more thoughtful about how they use cloud instances. You can do chargeback, where departments actually pay for what they use, or showback, where departments are simply made aware of what they use. Either way, this view into usage is essential to motivate people to use resources wisely.
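    A showback report is essentially a roll-up of billing line items by tag. Here is a minimal sketch (the field names are our assumption, not a real provider export format); note that untagged spend is kept visible rather than dropped, so gaps in the tagging convention surface in the report:

```python
from collections import defaultdict

def showback_by_department(line_items):
    """Roll up billing line items by a 'department' tag for showback.

    line_items: iterable of dicts with a 'cost' and a 'tags' dict.
    Untagged spend lands in an 'untagged' bucket so it stays visible.
    """
    totals = defaultdict(float)
    for item in line_items:
        dept = item.get("tags", {}).get("department", "untagged")
        totals[dept] += item["cost"]
    return dict(totals)

bill = [
    {"cost": 120.0, "tags": {"department": "marketing"}},
    {"cost": 340.0, "tags": {"department": "engineering"}},
    {"cost": 55.0, "tags": {}},
]
report = showback_by_department(bill)
# {'marketing': 120.0, 'engineering': 340.0, 'untagged': 55.0}
```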

    Once you know how things are performing and what it costs, then you can look at forecasting capacity requirements and cost.

    Predict Spend

    This is fairly simple. You create a trend line with what you’ve spent in previous periods and where you expect to grow. Then add on migration activities and new initiatives to your predicted spend.
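    That trend line can be as simple as a least-squares fit over past monthly totals, with planned migrations and initiatives added on top. A minimal sketch with made-up numbers:

```python
def predict_spend(history, periods_ahead, planned_additions=0.0):
    """Fit a least-squares line to past monthly spend and extend it forward.

    history: past monthly totals, oldest first.
    planned_additions: estimated extra spend from migrations/new initiatives.
    """
    n = len(history)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(history) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history)) / \
            sum((x - x_mean) ** 2 for x in xs)
    intercept = y_mean - slope * x_mean
    trend = intercept + slope * (n - 1 + periods_ahead)
    return trend + planned_additions

# Spend growing ~$200/month; look 3 months out and add a $1,000 migration.
forecast = predict_spend([8000, 8200, 8400, 8600], 3, planned_additions=1000.0)
# -> 10200.0
```

    Treating organic growth and one-off initiatives as separate terms, as here, is what later lets you analyze the growth factors individually.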

    How will you predict growth? Correlate business activity with provisioning. You can use standard monitoring solutions to see utilization and behavior metrics in your cloud provider, but you’ll also need a tool that can store that data, since the cloud provider does not keep it for longer-term analysis.

    Most organizations are gradually moving workloads to the cloud, so they need to determine how they will right-size and when, looking not only at CPU but also at storage. Plan about 3-6 months ahead by looking at historical trends, replatforming plans, new projects (PMO sources), and business initiatives. Focus on the things that drive the most variance while automating the most repetitive aspects. Analyze your input and make sure you establish a dialogue with the business stakeholders to validate and improve your understanding of their needs. Then review KPIs with your stakeholders on a regular basis.

    Keep Sharpening Your Prediction Skills

    Record your forecasts and compare them with actual outcomes to continuously improve the forecast. How accurate were you? Once you know, analyze the growth factors individually to see if there are systematic errors you can correct to improve your accuracy.
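    One simple way to quantify "how accurate were you" is mean absolute percentage error (MAPE), a standard forecasting metric (our suggestion; the webinar does not prescribe a specific measure). The figures below are made up:

```python
def mean_absolute_percentage_error(forecasts, actuals):
    """Average percentage miss between recorded forecasts and actual outcomes."""
    errors = [abs(f - a) / a for f, a in zip(forecasts, actuals)]
    return 100.0 * sum(errors) / len(errors)

# One month missed by 10%, one hit exactly -> 5.0% average error
mape = mean_absolute_percentage_error([9000, 9500], [10000, 9500])
```

    Tracking this per growth factor, rather than only for the combined forecast, is what exposes the systematic errors worth correcting.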

    Once you understand the trends and seasonality patterns, know your rates and charges, and have forecast as much as possible, you can combine them into a solid forecast of cloud costs to add to your reporting.

    In summary, to optimize your cloud capacity:

    • Run as lean as possible.
    • Right-size your migrations and continuously analyze to make sure you’re running in the best possible way.
    • Tag and charge out the costs to make consumers aware of their resource utilization.
    • To predict, model your growth factors individually and establish a demand calendar.
    • And keep improving your predictions.

    Optimization takes effort, but when your workloads are right-sized and your bill fits your budget, it’s all worth it.

    Want to know more about monitoring business activity and predicting growth in your environments? Talk to one of our capacity management experts about Vityl Capacity Management.

