TeamQuest Corporation

CMG 2008: Spreadsheets and Clouds

Modern spreadsheet applications such as Microsoft Excel and OpenOffice.org Calc are marvelous tools. In addition to their natural utility as business reporting tools, they can also be a great extension to performance and capacity management tools.

In his presentation, “Pivot Tables/Charts - Magic Beans without Living in a Fairy Tale,” John S. Van Wagenen of Caterpillar Corporation gave a useful demonstration of how the PivotTables feature in Excel (OpenOffice.org Calc calls it DataPilot) can be used to slice and dice time series performance data, such as the data collected by TeamQuest Manager, and present it just the way management wants it.
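To make the idea concrete outside a spreadsheet, here is a minimal Python sketch of what a pivot table does with time series performance data: group rows by two identifiers and aggregate a metric. The sample rows, the field layout (hour, server, CPU busy %), and the `pivot_mean` helper are made up for illustration; they are not TeamQuest Manager output or an Excel API.

```python
from collections import defaultdict

# Hypothetical samples: (hour, server, CPU busy %).
samples = [
    ("09:00", "web01", 40.0),
    ("09:00", "web01", 60.0),
    ("09:00", "db01", 70.0),
    ("10:00", "web01", 30.0),
    ("10:00", "db01", 80.0),
    ("10:00", "db01", 90.0),
]


def pivot_mean(rows):
    """Return {row_key: {col_key: mean}}, like a pivot table with an Average field."""
    cells = defaultdict(list)
    for row_key, col_key, value in rows:
        cells[(row_key, col_key)].append(value)
    table = defaultdict(dict)
    for (row_key, col_key), values in cells.items():
        table[row_key][col_key] = sum(values) / len(values)
    return {key: dict(cols) for key, cols in table.items()}


table = pivot_mean(samples)
for hour in sorted(table):
    print(hour, table[hour])
```

Swapping the first two fields of each row would "pivot" the view so that servers become rows and hours become columns, which is the kind of rearrangement a spreadsheet does with a drag and drop.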

We provide a similar and modest version of this capability in the “Chart Paging” feature of TeamQuest IT Service Analyzer, where lots of data can be “paged” by any identifier in the data such as server, virtual machine, LPAR, zone, workload, etc.

I find it interesting, but not surprising, that several session speakers at CMG this year mentioned spreadsheet applications as their favorite management communication tool for performance and capacity reports. Spreadsheets are firmly entrenched in the business world, so what better place to plug in technical data in the quest to connect IT with the business?

BTW, TeamQuest Model and TeamQuest IT Service Analyzer and Reporter let you take the data you’re looking at over to Excel with the click of a button.

Paul Strong of eBay Research Labs gave a very interesting talk on “The Shape of Infrastructure to Come” where he presented the infrastructure powering the eBay website that we all know.

He also gave his views on cloud computing, which seems to be the buzzword du jour, a companion to “virtualization.” Behind the website is actually a trading cloud developed by eBay for eBay. The three major services used by the website are the auction service, the payment service (now PayPal), and the search service. The eBay programmers develop these major services using APIs of the eBay trading cloud. The cloud is an abstraction of the physical IT resources that collectively form the cloud.

Cumulus clouds

But it was not always so. Around the year 2000 (after some high profile outages of the website) eBay realized that to cope with their exponential growth they had to make a radical change to the IT infrastructure, particularly the core component: the auction items database.

They decided to break down their large, vertically scaled vendor hardware, “virtualize” the database, and spread it across horizontally scaled commodity hardware. The three big services drawing power from the cloud were updated to also scale horizontally and now use metadata to make calls for the “real” data.

The concept of a “virtual database” is very powerful. When we introduced TeamQuest IT Service Analyzer and Reporter, we decided to do the same thing as eBay, although for different reasons. I call our database a federated performance and capacity database.

This new database is an aggregation of metadata representing the actual data stored in hundreds, maybe thousands of TeamQuest Manager databases in your environment. As with the eBay web services, Analyzer and Reporter use metadata to request data from the “real” Manager databases.
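The pattern can be sketched in a few lines of Python. Everything here is hypothetical: the `ManagerStore` and `FederatedCatalog` classes, their methods, and the sample values are invented to illustrate a metadata layer that routes requests to the stores holding the real data. This is not the actual TeamQuest or eBay implementation.

```python
class ManagerStore:
    """Stands in for one 'real' database holding measurements for some servers."""

    def __init__(self, name, data):
        self.name = name
        self._data = data  # {(server, metric): [values]}

    def metrics(self):
        return list(self._data)

    def read(self, server, metric):
        return self._data[(server, metric)]


class FederatedCatalog:
    """Holds only metadata: which store owns which (server, metric) series."""

    def __init__(self):
        self._owner = {}

    def register(self, store):
        for key in store.metrics():
            self._owner[key] = store

    def query(self, server, metric):
        # Look up the owning store in the metadata, then fetch the real data.
        store = self._owner[(server, metric)]
        return store.name, store.read(server, metric)


east = ManagerStore("east", {("web01", "cpu_pct"): [40.0, 60.0]})
west = ManagerStore("west", {("db01", "cpu_pct"): [70.0, 80.0]})

catalog = FederatedCatalog()
catalog.register(east)
catalog.register(west)

print(catalog.query("db01", "cpu_pct"))
```

The catalog never copies measurements; it only records where they live, so adding another store means registering its metadata rather than moving its data.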

Strong pointed out that although the scaling problem at eBay was now addressed, new challenges surfaced. With horizontally scaled, modular systems with lots of sharing of IT resources and millions of relationships and interdependencies, finding the point of failure or congestion is a much more complex and time-consuming task than before. Assuring good performance has also become more challenging.

This confirms our position at TeamQuest that no matter how much you virtualize your data center and introduce layers of abstraction, there will always be collections of physical IT resources out on the floor that will require proper instrumentation and tools for performance and capacity management.

Finally, and perhaps most interesting, Strong made the prediction that cloud computing will lower the barriers to entrepreneurship. With easy access at any time to just the right amount of computing resources, and no need for your own data center, we will see more creative and innovative ideas come to life.

Pascal

PS. Fun trivia: in the early stages of development of TeamQuest IT Service Analyzer and Reporter, their code names were Cirrus and Stratus :)


CMG 2008: Presenting the Conclusion

Statistical methods (mean, standard deviation, etc.) are the bread and butter of performance analysts and capacity planners. Not a day goes by without having to study a graph or a set of measurement data. According to Ray Wicks of IBM, we continuously process such data and draw conclusions about its meaning. The processing is often numeric, and there is both a conceptual and a sensual component at work here.

The conceptual component is what we have been taught about how numbers work, such as the fact that the average equals the sum of the observations divided by the number of observations.

The sensual component is the result of the evolution of our visual cortex. For example, we perceive circular-shaped objects with the shadow on the underside as raised bumps, not dimples, because our visual system is used to light coming from the sky and projecting the shadow on the underside of objects. 

Not surprisingly, the conceptual and visual components influence each other so that what we think and see are not independent. In other words, we can influence the conclusion of the data by how we present the data. Consider these two separate graphs:

[Image: two graphs]

Our visual system tells us that the two are different. We reflexively draw the conclusion that there is not much variability in the data in the first graph compared to the second. But conceptually, and upon closer inspection, we realize that we are looking at the same data at different scales. (BTW, both TeamQuest IT Service Analyzer and Reporter let you control the scaling of your graphs.) 
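A rough way to quantify the effect, as a minimal sketch: compute what fraction of the y-axis the data actually spans. The series and the two axis ranges below are invented, and `visual_spread` is a hypothetical helper, not a feature of any TeamQuest product.

```python
def visual_spread(values, axis_min, axis_max):
    """Fraction of the y-axis height covered by the data's min-to-max range."""
    return (max(values) - min(values)) / (axis_max - axis_min)


cpu_pct = [42.0, 45.0, 41.0, 47.0, 44.0]  # hypothetical utilization samples

wide = visual_spread(cpu_pct, 0.0, 100.0)   # axis 0..100: the line looks flat
tight = visual_spread(cpu_pct, 40.0, 48.0)  # axis 40..48: the line looks jagged

print(f"0-100 axis: data fills {wide:.0%} of the height")
print(f"40-48 axis: data fills {tight:.0%} of the height")
```

The numbers are identical in both cases; only the axis range, and therefore the visual impression of variability, changes.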

We can be tricked by statistics without the help of our visual system. Consider the claim that orchestra conductors live to an average age of 73, compared to 68.5 for the rest of us. Are those guys healthier? Perhaps living a major part of your life waving your hands in the air is good for your health. The pitfall here is that the first average is based on the population of orchestra conductors, and that population consists mostly of white, healthy males above the age of 65. The other population contains people of all ages, from all walks of life, men and women.

Thus if you are an orchestra conductor and make it beyond 65 years of age, chances are that you’ll make it to 73, but not thanks to all those hours you’ve spent waving your hands in the air. The two averages are based on two very different populations, and thus not comparable. 

At TeamQuest we have long understood the treachery of simple averages. That’s why we use weighted averages, for the appropriate data, when aggregating the data in our performance database. 
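A small worked example shows why the weighting matters when interval averages are rolled up. The intervals, sample counts, and values are made up, and this sketch does not claim to reproduce how TeamQuest's performance database aggregates data; it only illustrates the general technique of weighting each average by the number of observations behind it.

```python
# Each tuple: (number of observations in the interval, interval average in ms).
intervals = [
    (1000, 2.0),  # busy interval: many transactions, fast
    (10, 50.0),   # quiet interval: few transactions, slow
]

# Simple average of averages: every interval counts the same.
simple = sum(avg for _, avg in intervals) / len(intervals)

# Weighted average: every observation counts the same.
total_obs = sum(n for n, _ in intervals)
weighted = sum(n * avg for n, avg in intervals) / total_obs

print(f"simple average of averages: {simple:.2f} ms")
print(f"weighted average:           {weighted:.2f} ms")
```

The weighted figure matches what you would get by averaging the raw observations directly, while the simple average of averages gives the quiet interval the same say as the busy one.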

Pascal


CMG 2008: High Flying Models

In his CMG 2008 Sunday workshop, “How High Will It Fly,” Dr. Neil Gunther showed how relatively simple mathematical models fed with appropriate measurement data can be used to predict the scalability of a computing system. Unlike physical systems such as airplanes and bridges, computing systems don’t lose their wings or break apart when the load on the system exceeds the material strength.

Instead, a computing system’s performance starts to degrade; workload throughput levels off (less work is completed) and workload response times increase to infinity (work takes longer to complete, or may never complete!). 
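One textbook way to see this behavior is the single-server M/M/1 queue, where the mean response time is R = S / (1 - ρ) for service time S and utilization ρ = λS at arrival rate λ. This is a sketch under that model's assumptions (random arrivals, exponentially distributed service times, one server), chosen here for illustration; it is not necessarily one of the models presented in the workshop. The service time and arrival rates are invented.

```python
def mm1_response_time(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue; infinite at or beyond saturation."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return float("inf")
    return service_time / (1.0 - utilization)


SERVICE_TIME = 0.010  # hypothetical: 10 ms of service per request

for rate in (10, 50, 90, 99, 100):  # requests per second
    r = mm1_response_time(rate, SERVICE_TIME)
    print(f"{rate:>3} req/s -> response time {r * 1000:.1f} ms")
```

With a 10 ms service time, the server cannot complete more than 100 requests per second, so throughput levels off there while response time climbs without bound as the load approaches that limit.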

One of the important and basic tasks for a performance analyst and capacity planner is to determine these critical limits – to understand the capability of the system so that it can be fully exploited without adverse effects. This is not a trivial task. And to make matters worse, today’s popular computing systems such as UNIX and Windows servers are multiprocessing systems.

In a single processing system the capability of that single processing unit directly affects the throughput and response time of the workload running on the system. A faster or more capable processing unit will yield improved results.

In contrast, in a multiprocessing system, to utilize the full capability of the system, the workload must be divided and coordinated between the multiple processing units.

For instance, several users may be updating the same table in a database. Although the update appears immediate to each user, the system coordinates the work so that only one user at a time is allowed to update the data. The other users have to wait their turn. This coordination work is plain and simple overhead; it’s time spent arranging work instead of completing work. This overhead is a major reason why it is difficult to determine the critical limits of multiprocessing systems. Adding more or faster processing units will not necessarily yield better results.

Gene Amdahl, one of the pioneers in this area, actually advocated for the use of single processing unit systems even though his famous law is most often quoted in papers on multiprocessing systems, according to Dr. Gunther. Perhaps he saw how much work lay ahead!
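Amdahl's law puts a number on this. If a fraction p of the work can run in parallel and the rest is serial, the speedup on N processing units is 1 / ((1 - p) + p / N), which can never exceed 1 / (1 - p). The sketch below evaluates it for an invented parallel fraction of 95%; the function name and the values are illustrative only.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Speedup over one unit when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


P = 0.95  # hypothetical: 95% of the work parallelizes perfectly

for n in (1, 2, 8, 64, 1024):
    print(f"{n:>4} units -> speedup {amdahl_speedup(P, n):.2f}x")

print(f"upper bound: {1.0 / (1.0 - P):.0f}x")
```

Even with 1,024 units the speedup stays below the bound, because the 5% of the work that must run serially comes to dominate the total time.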

It is interesting that IBM seems to be producing systems with faster and faster processing units, whereas Sun Microsystems is producing systems with massively multithreaded processing units. Apparently two different strategies at work out in the market.

So determining how high it will fly is not trivial.

But there is hope! Software vendors such as TeamQuest offer products to help performance analysts and capacity planners explore the limits of their increasingly complex and powerful computing systems. TeamQuest Model was recently updated to fully understand the behavior of multiprocessing systems (CPUs, cores per CPU, and threads per core). And where simple models only give you the boundaries of the system's limits, TeamQuest Model also provides the components of response time that contribute to the overall response time of a workload. :-)

Pascal
