December 8, 2008

    In his CMG 2008 Sunday workshop, "How High Will It Fly", Dr. Neil Gunther showed how relatively simple mathematical models, fed with appropriate measurement data, can be used to predict the scalability of a computing system. Unlike physical systems such as airplanes and bridges, computing systems don't lose their wings or break apart when the load on the system exceeds the material strength.

    Instead, a computing system's performance starts to degrade; workload throughput levels off (no additional work is completed) and workload response times grow without bound (work takes longer to complete, or may never complete!).
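
    To make the response time behavior concrete, here is a minimal sketch. It assumes the simplest textbook case, an open single-server (M/M/1) queue, where mean response time is R = S / (1 - U) for service time S and utilization U. The function name and the sample values are illustrative assumptions, not figures from the workshop.

```python
def response_time(service_time: float, utilization: float) -> float:
    """Mean response time of an open single-server (M/M/1) queue: R = S / (1 - U)."""
    if not 0.0 <= utilization < 1.0:
        # At U = 1.0 the queue never drains, so no finite answer exists.
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


service_time = 0.1  # seconds per request; an illustrative value, not a measurement
for utilization in (0.5, 0.9, 0.99):
    r = response_time(service_time, utilization)
    print(f"U = {utilization:.2f}  ->  R = {r:.2f} s")
```

    The closer utilization gets to 100%, the faster response time climbs, which is why the limit matters long before the system is completely busy.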

    One of the important and basic tasks for a performance analyst and capacity planner is to determine these critical limits - to understand the capability of the system so that it can be fully exploited without adverse effects. This is not a trivial task. And to make matters worse, today's popular computing systems such as UNIX and Windows servers are multiprocessing systems.

    In a single processing system, the capability of that single processing unit directly affects the throughput and response time of the workload running on the system. A faster or more capable processing unit will yield improved results.

    In contrast, in a multiprocessing system, to utilize the full capability of the system, the workload must be divided and coordinated between the multiple processing units.

    For instance, several users may be updating the same table in a database. Although each update appears to the user to occur immediately, the system coordinates the work so that only one user at a time is allowed to update the data. The other users have to wait their turn. This coordination work is plain and simple overhead; it's time spent arranging work instead of completing work. This overhead is a major reason why it is difficult to determine the critical limits of multiprocessing systems. Adding more or faster processing units will not necessarily yield better results.
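
    One simple model of this kind is the Universal Scalability Law published by Dr. Gunther; whether the workshop used exactly this form is an assumption here. It expresses the capacity of N processing units relative to one as C(N) = N / (1 + sigma (N - 1) + kappa N (N - 1)), where sigma stands for contention (waiting your turn) and kappa for the cost of keeping the units consistent with each other. The sketch below uses made-up parameter values; in practice they would be fitted to measurement data.

```python
import math


def relative_capacity(n: int, sigma: float, kappa: float) -> float:
    """Universal Scalability Law: throughput of n units relative to one unit."""
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def peak_units(sigma: float, kappa: float) -> float:
    """Number of units at which capacity peaks (requires kappa > 0)."""
    return math.sqrt((1.0 - sigma) / kappa)


sigma, kappa = 0.05, 0.001  # illustrative values, not measurements
for n in (1, 8, 32, 64):
    print(n, round(relative_capacity(n, sigma, kappa), 2))
print("capacity peaks near", round(peak_units(sigma, kappa), 1), "units")
```

    With these values, capacity at 64 units is lower than at 32: past the peak, each added unit costs more in coordination than it contributes. Setting kappa to zero reduces the formula to Amdahl's law with serial fraction sigma.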

    According to Dr. Gunther, Gene Amdahl, one of the pioneers in this area, actually advocated the use of single processing unit systems, even though his famous law is most often quoted in papers on multiprocessing systems. Perhaps he saw how much work lay ahead!

    It is interesting that IBM seems to be producing systems with faster and faster processing units, whereas Sun Microsystems is producing systems with massively multithreaded processing units. Apparently, two different strategies are at work in the market.

    So determining how high it will fly is not trivial.

    But there is hope! Software vendors such as TeamQuest offer products to help performance analysts and capacity planners explore the limits of their increasingly complex and powerful computing systems. TeamQuest Model was recently updated to fully understand the behavior of multiprocessing systems (CPUs, cores per CPU, and threads per core). And where simple models only give you the limits of the system, TeamQuest Model also provides the components of response that contribute to the overall response time of a workload. :-)