February 4, 2009

    Anybody with a dartboard can claim to have a server capacity planning tool. Unfortunately, companies selling dartboards for capacity planning aren’t likely to be very honest about the sophistication of their tools. It’s caveat emptor, let the buyer beware.


    So how do you avoid buying a dartboard when accurate predictions are what you really need? What should you look for in a good capacity planning tool?

    First off, monitoring performance is not capacity planning. Getting an alarm event 15 minutes before users complain does not constitute capacity planning. Monitoring and alarming are essential components of capacity management, but they are not capacity planning.

    Real-time performance data isn't enough either. Capacity planning always requires a historical record of some sort. But just keeping some historical data available for graphing and charting is not enough. You need more than that for accurate capacity planning.

    Beware of companies touting trending as capacity planning. Computer system performance is not linear, and a capacity planning tool needs to know more than just past system performance in order to make accurate predictions about the future. And trending is of no use at all for many projects that involve capacity planning. You can't plan a server consolidation project with trending, for example.
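    To see why a straight trend line can mislead, here is a minimal sketch. It assumes a single-server queue (the textbook M/M/1 model, where mean response time is R = S / (1 - U)); the service time and utilization figures are made up for illustration, and real systems are more complicated than this.

```python
# Illustrative sketch only: assumes an M/M/1 single-server queue and
# hypothetical numbers. Not taken from any particular tool.

def mm1_response_time(service_time, utilization):
    """Mean response time for an M/M/1 queue: R = S / (1 - U)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


def linear_extrapolation(x1, y1, x2, y2, x):
    """Extend the straight line through (x1, y1) and (x2, y2) out to x."""
    slope = (y2 - y1) / (x2 - x1)
    return y2 + slope * (x - x2)


service_time = 0.1  # seconds per request (hypothetical)

# Two "historical" observations at 30% and 50% utilization.
r30 = mm1_response_time(service_time, 0.30)
r50 = mm1_response_time(service_time, 0.50)

# What a trend line predicts at 90% versus what the queuing formula predicts.
trend_at_90 = linear_extrapolation(0.30, r30, 0.50, r50, 0.90)
model_at_90 = mm1_response_time(service_time, 0.90)

print(f"trend line at 90% utilization:    {trend_at_90:.3f} s")
print(f"queuing formula at 90% utilization: {model_at_90:.3f} s")
```

    Under these assumptions the trend line badly underestimates response time at high utilization, because queuing delay grows much faster than linearly as the server approaches saturation.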

    You should also be wary of tools that do "capacity planning" for server consolidation by adding together the resource utilization of each of the workloads that are being considered for possible consolidation. After normalizing CPU utilization to account for differences in computing capability, the utilization for each workload is added together to determine how much of the target CPU will be utilized after consolidation. A similar calculation is performed for other resources such as memory, I/O, and the network.
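    The additive procedure just described amounts to something like the following sketch. The capacity ratings are hypothetical benchmark-style numbers used only to normalize between machines, and the function name is mine, not any vendor's.

```python
# Illustrative sketch of the additive consolidation estimate described above.
# Capacity ratings and utilizations are hypothetical.

def consolidated_cpu_utilization(workloads, target_capacity):
    """Estimate target CPU utilization by summing normalized workload demand.

    workloads: list of (utilization_fraction, source_capacity_rating) pairs.
    target_capacity: capacity rating of the consolidation target, same units.
    """
    # Convert each workload's utilization into absolute "capacity units",
    # then express the total as a fraction of the target machine.
    demand = sum(util * capacity for util, capacity in workloads)
    return demand / target_capacity


workloads = [
    (0.40, 100.0),  # 40% busy on a server rated at 100 units
    (0.25, 200.0),  # 25% busy on a server rated at 200 units
    (0.60, 50.0),   # 60% busy on a server rated at 50 units
]

estimate = consolidated_cpu_utilization(workloads, target_capacity=400.0)
print(f"estimated target CPU utilization: {estimate:.0%}")
```

    Note what the calculation does not contain: nothing about queuing, contention between workloads, or response times. It produces a utilization number and nothing else.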

    This kind of simplistic procedure can be good enough to find potential consolidation candidates, but it leaves way too much out of the equation for making the final decision when consolidating important workloads. You need a tool that understands something of the details regarding your server architecture, more about your applications' use of that architecture, and how workloads will interact when they are consolidated.

    OK, so I've said a lot about what isn't a good capacity planning tool. You're probably wondering what I would actually recommend. Not to say that less sophisticated methods don't have their uses, but where critical apps are involved, you want a tool that performs capacity planning using some sort of modeling.

    Sometimes when people talk about a "model" they mean a description or diagram. That's not the kind of model I am talking about in this case. For sure, you need a description of the systems involved, but that description is really just a step in a good capacity planning process. What you want is a tool that can look at that description along with information regarding the incoming workloads, and predict how the systems will perform.

    There are at least two types of methods used by capacity planning tools that use modeling to predict performance: simulation modeling and analytic modeling. A good simulation modeling tool will create a queuing network based on the system being modeled and pretend to run the incoming workloads on that network. Simulations like these can be very accurate, but a lot of work is necessary to adequately describe the systems with enough detail for the results to be dependable.

    Figure: Queuing network
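    To give a feel for what "pretending to run the workload" means, here is a toy simulation of a single queue feeding a single server. It is a sketch under strong assumptions (random exponential arrivals and service times, one server, first come first served) and is nowhere near the detail a real simulation modeling tool needs; the function and parameter names are hypothetical.

```python
import random

# Toy simulation of one queue and one server. Assumes exponential
# interarrival and service times and first-come-first-served order;
# all numbers are hypothetical.

def simulate_single_server(arrival_rate, service_time, n_jobs, seed=1):
    """Return the mean response time (wait + service) over n_jobs jobs."""
    rng = random.Random(seed)
    arrival = 0.0         # arrival time of the current job
    server_free_at = 0.0  # time the server finishes the previous job
    total_response = 0.0

    for _ in range(n_jobs):
        arrival += rng.expovariate(arrival_rate)
        # A job starts when it arrives or when the server frees up,
        # whichever is later.
        start = max(arrival, server_free_at)
        server_free_at = start + rng.expovariate(1.0 / service_time)
        total_response += server_free_at - arrival

    return total_response / n_jobs


# 5 jobs/second arriving at a server that needs 0.1 s per job (50% busy).
mean_response = simulate_single_server(
    arrival_rate=5.0, service_time=0.1, n_jobs=100_000
)
print(f"simulated mean response time: {mean_response:.3f} s")
```

    Even this toy version has to generate and process every job one at a time, which hints at why detailed simulations of whole systems take real effort to build and run.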

    Analytic modeling also takes queuing into account, without pretending to run the incoming workloads on the model. Instead, in a good analytic modeling tool, formulas based on queuing theory are used to mathematically calculate processing times and delays. This type of modeling is much faster and not nearly so tedious to set up. And the results can be just as accurate as with simulation modeling.
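    As a rough illustration of the analytic approach, the sketch below applies standard textbook formulas for an open queuing network: utilization at each resource is U = X * D (arrival rate times service demand), and time spent at each resource is D / (1 - U). The resource names and service demands are invented for the example, and a commercial tool will use considerably more refined models than this.

```python
# Illustrative analytic model of an open queuing network using textbook
# formulas. Resource names and service demands are hypothetical.

def open_network_response_time(arrival_rate, service_demands):
    """Mean response time across all resources.

    arrival_rate: transactions per second entering the system.
    service_demands: dict of resource name -> seconds of service
                     each transaction needs at that resource.
    """
    total = 0.0
    for name, demand in service_demands.items():
        utilization = arrival_rate * demand  # utilization law: U = X * D
        if utilization >= 1.0:
            raise ValueError(f"{name} is saturated at this arrival rate")
        # Residence time at this resource, queuing delay included.
        total += demand / (1.0 - utilization)
    return total


demands = {"cpu": 0.05, "disk": 0.08, "network": 0.01}  # seconds per transaction

response = open_network_response_time(5.0, demands)
print(f"predicted response time at 5 tx/s: {response:.3f} s")
```

    There is no event loop and no random number generation here, just arithmetic, which is why analytic models are so much quicker to evaluate for different what-if scenarios.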

    Analytic models are not as generalized as what's possible with simulation modeling, so when a crucial situation arises where a suitable analytic model is not available, it makes sense to put together a simulation model instead. The rest of the time you will want to stick with a much easier and faster analytic modeling process.

    So analytic modeling is usually what you want in a capacity planning tool. If you really want to cover your bases, get a tool that can do both analytic and simulation modeling. And check that your capacity planning tool vendor isn't misusing the term "analytic" when making claims for their tool. You want to be sure that the tool you pick uses sound methods based on queuing theory to make its calculations, not something more closely resembling the less accurate capacity planning techniques I described earlier in this article.

    For more information, see teamquest.com/capacityplanning.