March 4, 2013
    This post is part of a series that dives deep into the mathematics involved in capacity planning. We have incredibly smart people figuring out the hardest problems so you don't have to. This is a closer look at what goes on behind the scenes in a capacity planning tool.


    Estimating the mean of the constant coefficients of a nonlinear function using the least square error (LSE) principle is often called nonlinear regression. In most cases the mathematical rules are the same as those of LSE estimation of the constant coefficients of a linear function. So let's first elaborate on what linear regression is.

    We need to estimate the average of the constant parameter $a$ of the function of $x$ in Eq.(1) from a set of collected measurements of pairs $(x_i, \hat{y}_i)$ for $i = 1, 2, \ldots, N$ for a given function $y = f(x)$:

    $$y = a x \tag{1}$$

    Mathematically, the collected measurements of pairs $(x_i, \hat{y}_i)$ are written as

    $$\{(x_1, \hat{y}_1), (x_2, \hat{y}_2), \ldots, (x_N, \hat{y}_N)\} \tag{2}$$

    Note that $\hat{y}_i$ is symbolically the measurement of $y_i = a x_i$. The constant coefficients of expressions used for modeling a practical phenomenon or process are most often estimated using LSE. This estimation approach is adopted when the model is intuitive: the actual dynamics of the phenomenon are unknown, but certain factors are assumed to influence it. This means that the model of Eq.(1) is an intuitive approximation based on measured data.
    A common-sense idea is to estimate the constant parameter $a$ such that some expression of the errors between the modeled $y_i$ and the measurements $\hat{y}_i$ is minimum. So it is assumed that, for the measured $x_i$, the modeled $y_i$ will deviate from the measurement $\hat{y}_i$. The errors are the differences between $y_i$ and the measurement $\hat{y}_i$ for $i = 1, 2, \ldots, N$.


    The most commonly known difference expression would be

    $$e_i = y_i - \hat{y}_i = a x_i - \hat{y}_i, \quad i = 1, 2, \ldots, N \tag{3}$$

    The total error from Eq.(3) would be

    $$E = \sum_{i=1}^{N} e_i = \sum_{i=1}^{N} (a x_i - \hat{y}_i) \tag{4}$$

    Instead of estimating $a$ such that the expression in Eq.(4) is smallest, a slightly different approach is used, because positive and negative errors in a plain sum cancel each other and can hide a poor fit.
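
    The cancellation of signed errors can be sketched numerically. The snippet below is a minimal illustration, assuming the one-parameter model y = a*x; the data values and the function names `total_error` and `sum_squared_error` are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical, deliberately noisy measurements (x_i, y_meas_i).
x = [1.0, 2.0, 3.0, 4.0]
y_meas = [4.0, 1.0, 5.0, 0.0]

def total_error(a, x, y_meas):
    """Plain sum of signed errors e_i = a*x_i - y_meas_i."""
    return sum(a * xi - yi for xi, yi in zip(x, y_meas))

def sum_squared_error(a, x, y_meas):
    """Sum of squared errors for the assumed model y = a*x."""
    return sum((a * xi - yi) ** 2 for xi, yi in zip(x, y_meas))

a = 1.0  # individual errors are -3, 1, -2, 4
print(total_error(a, x, y_meas))        # 0.0: the signed errors cancel
print(sum_squared_error(a, x, y_meas))  # 30.0: the squares expose the misfit
```

    A total error of zero here says nothing about how close the line is to the points, while the squared errors cannot cancel.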

    The sum of square of error (SSE) function is minimized in the process of estimating $a$. The SSE function, $S(a)$, for the function of Eq.(1) and the measured data of Eq.(2) is

    $$S(a) = \sum_{i=1}^{N} (a x_i - \hat{y}_i)^2 \tag{5}$$

    The expansion of the right hand side (RHS) of Eq.(5) is

    $$S(a) = a^2 \sum_{i=1}^{N} x_i^2 - 2a \sum_{i=1}^{N} x_i \hat{y}_i + \sum_{i=1}^{N} \hat{y}_i^2 \tag{6}$$

    The RHS above implies that $S(a)$ is a 2nd degree polynomial function of $a$. This seems to contradict the definition of $a$ as a constant. In fact, it is no contradiction at all: $a$ is a constant in the model of Eq.(1), but in Eq.(5) $a$ is the only unknown. All the quantities involving sums of $x_i$, $\hat{y}_i$ and their combinations in Eq.(5) are known, because $x_i$ and $\hat{y}_i$ are known from Eq.(2). This explains why the left hand side (LHS) of Eq.(5) is written as a function of $a$. A 2nd degree polynomial function is also called a quadratic polynomial. Some terminology and characteristics of polynomial functions are:
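
    As a quick numerical check that the SSE really is a quadratic in the unknown parameter, here is a minimal sketch assuming the one-parameter model y = a*x. The data values and the names `sse_direct` and `sse_coefficients` are hypothetical, not taken from any tool.

```python
import math

# Hypothetical measurements (x_i, y_meas_i).
x = [1.0, 2.0, 3.0]
y_meas = [2.1, 3.9, 6.2]

def sse_direct(a, x, y_meas):
    """SSE computed term by term for the assumed model y = a*x."""
    return sum((a * xi - yi) ** 2 for xi, yi in zip(x, y_meas))

def sse_coefficients(x, y_meas):
    """Coefficients (A, B, C) such that SSE(a) = A*a**2 + B*a + C."""
    A = sum(xi * xi for xi in x)                           # leading coefficient
    B = -2.0 * sum(xi * yi for xi, yi in zip(x, y_meas))
    C = sum(yi * yi for yi in y_meas)
    return A, B, C

A, B, C = sse_coefficients(x, y_meas)
for a in (-1.0, 0.0, 0.5, 2.0, 3.5):
    # Both forms agree for every trial value of a.
    assert math.isclose(sse_direct(a, x, y_meas), A * a * a + B * a + C,
                        rel_tol=1e-9, abs_tol=1e-9)
print(A > 0)  # True: a sum of squares, so the parabola opens upward
```

    The three coefficients depend only on the measured data, which is why the parameter can be treated as the single unknown.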

    1. The degree of a polynomial function is determined by the highest exponent of the variable among its terms. The highest exponent is 2 in Eq.(6), so it is a 2nd degree polynomial.

    2. The term with highest exponent is called the leading term. The first term is the leading term in Eq.(6).

    3. The coefficient of the leading term is called the leading coefficient. If the leading coefficient is positive then the graph of the polynomial is of a bowl shape as in Figure 1, else if the leading coefficient is negative then the graph of the polynomial is an inverted bowl shape as in Figure 2.
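
    The role of the leading coefficient's sign can be sketched in a few lines. This is only an illustration for a generic quadratic p(t) = A*t**2 + B*t + C; the function name `quadratic_extremum` is hypothetical, and the turning point is located with the standard vertex formula t = -B/(2A).

```python
def quadratic_extremum(A, B, C):
    """Classify and locate the single turning point of A*t**2 + B*t + C."""
    if A == 0:
        raise ValueError("not a quadratic: leading coefficient is zero")
    t_star = -B / (2 * A)                        # vertex of the parabola
    value = A * t_star ** 2 + B * t_star + C
    kind = "minimum" if A > 0 else "maximum"     # bowl vs. inverted bowl
    return kind, t_star, value

print(quadratic_extremum(1.0, -4.0, 7.0))   # ('minimum', 2.0, 3.0)
print(quadratic_extremum(-1.0, 4.0, 1.0))   # ('maximum', 2.0, 5.0)
```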


    Notice that the function in Figure 1 has a single minimum and no maximum is possible. This allows using a simple first order derivative operator, $\frac{d}{dx}$ or $(\cdot)'$ where $x$ is the independent variable, to find the minimum point. For the SSE the independent variable is $a$. According to the theory of calculus, the solution of

    $$\frac{dS(a)}{da} = 0 \tag{7}$$

    will provide the value of $a$ at a point where $S(a)$ of Eq.(5) is either minimum or maximum. From Figure 1 and the positive sign of the leading coefficient of the RHS of Eq.(6), we know that $S(a)$ in Eq.(5) is going to have a minimum only. Thus we can get the $a$ for which the SSE is minimum. From Eq.(7) and Eq.(5)

    $$2 \sum_{i=1}^{N} x_i (a x_i - \hat{y}_i) = 0$$

    Figure: 1. Quadratic polynomial with positive leading coefficient


    From the above expression, the solution is

    $$a = \frac{\sum_{i=1}^{N} x_i \hat{y}_i}{\sum_{i=1}^{N} x_i^2} \tag{8}$$

    Figure: 2. Quadratic polynomial with negative leading coefficient


    So, $S(a)$ of Eq.(5) will have its least value if the estimate of $a$ is computed by Eq.(8) from the measured data set of Eq.(2) for the model of Eq.(1).
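
    The estimate can be sketched in code as follows. This is a minimal illustration, assuming the one-parameter model y = a*x, for which the least-squares estimate is the sum of x_i*y_meas_i divided by the sum of x_i squared. The data and the names `lse_slope` and `sse` are hypothetical.

```python
# Hypothetical measurements (x_i, y_meas_i).
x = [1.0, 2.0, 3.0]
y_meas = [2.1, 3.9, 6.2]

def lse_slope(x, y_meas):
    """Least-squares estimate of a for the assumed model y = a*x."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("all x values are zero; the estimate is undefined")
    sxy = sum(xi * yi for xi, yi in zip(x, y_meas))
    return sxy / sxx

def sse(a, x, y_meas):
    """Sum of squared errors for a given value of a."""
    return sum((a * xi - yi) ** 2 for xi, yi in zip(x, y_meas))

a_hat = lse_slope(x, y_meas)
print(f"{a_hat:.4f}")  # 2.0357

# Nudging the estimate in either direction increases the SSE.
assert sse(a_hat, x, y_meas) < sse(a_hat + 0.01, x, y_meas)
assert sse(a_hat, x, y_meas) < sse(a_hat - 0.01, x, y_meas)
```

    The guard against a zero denominator covers the degenerate case where every measured x is zero, in which the data carry no information about the parameter.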

    We will see how the estimated $a$ in Eq.(8) is an average in the next post. In the post after that, the LSE estimation of $a$ in Eq.(8) will be extended to functions of multiple independent variables and then used to solve the problem for polynomial functions of degree 2 or higher.
