March 4, 2013
This post is part of a series that dives deep into the mathematics involved in capacity planning. We have incredibly smart people figuring out the hardest problems so you don't have to. This is a closer look at what goes on behind the scenes in a capacity planning tool.
Estimating the constant coefficients of a nonlinear function using the mathematical rules of the least square error (LSE) principle is often called nonlinear regression. In most cases the mathematical rules are the same as those for LSE estimation of the constant coefficients of a linear function. Let's first elaborate on what linear regression is.
We need to estimate the constant parameter $a$ of the function of $x$ in Eq.(1) from a set of collected measurements, for the given function

$$y(x) = a\,x \quad (1)$$

Mathematically, the collected measurements of pairs of $(x, y)$ are written as

$$(x_i, y_i), \quad i = 1, 2, \ldots, n \quad (2)$$

where $y_i$ is symbolically the measurement of $y(x_i)$. The constant coefficients of expressions used for modeling some practical phenomenon or process are almost always estimated using LSE. This estimation approach is adopted when the model is intuitive and the actual dynamics of the phenomenon are unknown, but certain factors are assumed to be influencing it. This means that the model of Eq.(1) is an intuitive approximation based on measured data.
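To make the setup concrete, here is a minimal sketch of what a measured data set in the form of Eq.(2) could look like. The proportional model with true coefficient 2 and the noise level are illustrative assumptions, not values from the post.

```python
import random

# Hypothetical data set in the form of Eq.(2): pairs (x_i, y_i), i = 1..n.
# Assumes, for illustration only, an underlying model y = a*x with
# true a = 2 plus additive measurement noise.
random.seed(0)
a_true = 2.0
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [a_true * x + random.gauss(0.0, 0.1) for x in xs]

for x, y in zip(xs, ys):
    print((x, y))  # each y_i deviates slightly from the modeled a*x
```

Each pair prints with a small random deviation from the exact model value, which is exactly the situation the estimation procedure below has to cope with.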
A common-sense idea is to estimate the constant parameter $a$ such that some expression of the errors between the modeled $y(x_i)$ and the measurements $y_i$ is minimized. It is assumed that for a measured $x_i$, the modeled $y(x_i)$ will deviate from the measurement $y_i$. The errors are the differences between $y(x_i)$ and the measurements $y_i$. The most commonly known difference expression would be

$$e_i = y_i - a\,x_i \quad (3)$$

The total error from Eq.(3) would be

$$E = \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} \left( y_i - a\,x_i \right) \quad (4)$$
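A quick numerical sketch shows why the plain error sum of Eq.(4) is not a useful objective: positive and negative deviations cancel, so a poor measure of fit can come out near zero while the squared errors cannot cancel. The data values here are made up for illustration, under the assumed proportional model $y = a\,x$.

```python
# Why the plain error sum of Eq.(4) is a poor objective: signed errors cancel.
# Illustration assumes the simple proportional model y = a*x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]   # roughly y = 2*x with noise

def errors(a):
    return [y - a * x for x, y in zip(xs, ys)]

total_error = sum(errors(2.0))           # Eq.(4): signed errors cancel out
sse = sum(e * e for e in errors(2.0))    # squares cannot cancel

print(total_error)  # near 0 even though individual errors are not
print(sse)          # strictly positive, reflects the actual misfit
```

This cancellation is the reason the derivation switches from Eq.(4) to a sum of squares.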
Instead of estimating $a$ such that the expression in Eq.(4) is smallest, a slightly different approach is used: the sum of square of error (SSE) function is minimized in the process of estimating $a$. The SSE function, $f(a)$, for the function of Eq.(1) and the measured data of Eq.(2) is

$$f(a) = \sum_{i=1}^{n} \left( y_i - a\,x_i \right)^2 \quad (5)$$

The expansion of the right hand side (RHS) of the equation of Eq.(5) is

$$f(a) = a^2 \sum_{i=1}^{n} x_i^2 \;-\; 2a \sum_{i=1}^{n} x_i y_i \;+\; \sum_{i=1}^{n} y_i^2 \quad (6)$$

The RHS of Eq.(6) implies that $f(a)$ is a 2nd-degree polynomial function of $a$. This seems to contradict the definition of $a$ as a constant, but in fact it is no contradiction at all: $a$ is a constant in the model of Eq.(1), while in Eq.(5) $a$ is the only unknown. All the quantities involving sums of $x_i$, $y_i$ and their combinations in Eq.(5) are known, since the $(x_i, y_i)$ are known from Eq.(2). This explains why the left hand side (LHS) of the equation in Eq.(5) is written as a function of $a$. A 2nd-degree polynomial function is also called a quadratic polynomial. Some terminology and characteristics of polynomial functions are:
- The degree of a polynomial function is the highest exponent of the variable among its terms. The highest exponent in Eq.(6) is 2, so it is a 2nd-degree polynomial.
- The term with the highest exponent is called the leading term. The first term is the leading term in Eq.(6).
- The coefficient of the leading term is called the leading coefficient. If the leading coefficient is positive, the graph of the polynomial is bowl-shaped as in Figure 1; if the leading coefficient is negative, the graph is an inverted bowl as in Figure 2.
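The claim that the SSE is an upward-opening quadratic in the unknown coefficient can be checked numerically. This sketch (again under the assumed model $y = a\,x$, with made-up data) evaluates the SSE both directly and via its expanded quadratic form; the two agree, and the leading coefficient $\sum x_i^2$ is positive, giving the bowl shape of Figure 1.

```python
# Check that the direct SSE of Eq.(5) matches the expanded quadratic of Eq.(6),
# whose positive leading coefficient (sum of x_i^2) gives the bowl of Figure 1.
xs = [1.0, 2.0, 3.0]
ys = [1.1, 2.3, 2.9]

sxx = sum(x * x for x in xs)              # leading coefficient: always positive
sxy = sum(x * y for x, y in zip(xs, ys))
syy = sum(y * y for y in ys)

def sse_direct(a):                         # direct sum of squared errors
    return sum((y - a * x) ** 2 for x, y in zip(xs, ys))

def sse_expanded(a):                       # expanded quadratic form
    return a * a * sxx - 2.0 * a * sxy + syy

for a in (-1.0, 0.0, 0.5, 2.0):
    assert abs(sse_direct(a) - sse_expanded(a)) < 1e-9

print(sxx > 0)  # True: positive leading coefficient, a single minimum exists
```

Because the leading coefficient is a sum of squares, it can never be negative, so the inverted-bowl case of Figure 2 cannot occur for an SSE function.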
Notice that the function in Figure 1 has a minimum at its vertex and no maximum is possible. This allows using the simple first-order derivative operator, taken with respect to the independent variable (here the unknown $a$), to find the minimum point. According to the theory of calculus, the solution of

$$\frac{d f(a)}{d a} = 0 \quad (7)$$

will provide the value of $a$ at a point where $f(a)$ of Eq.(5) is either minimum or maximum. From Figure 1 and the sign of the leading coefficient of the RHS of Eq.(6), we know that $f(a)$ in Eq.(5) is going to have a minimum only. Thus we can get the $a$ for which the SSE is minimum. From Eq.(7) and Eq.(5),

$$\frac{d f(a)}{d a} = -2 \sum_{i=1}^{n} x_i \left( y_i - a\,x_i \right) = 0$$

Figure 1. Quadratic polynomial with positive leading coefficient

From the above expression, the solution is

$$\hat{a} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} \quad (8)$$

Figure 2. Quadratic polynomial with negative leading coefficient

$f(a)$ of Eq.(5) will have its least value if the estimate of $a$ is computed by Eq.(8) from the measured data set of Eq.(2) for the model of Eq.(1).
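As a sanity check on the closed-form estimate, a short sketch (still under the assumed model $y = a\,x$, with illustrative data) compares the coefficient from Eq.(8) with a brute-force scan of the SSE over a grid of candidate values: the scan's winner lands on the closed-form minimizer to within the grid resolution.

```python
# Closed-form LSE estimate of Eq.(8) for the assumed model y = a*x,
# cross-checked against a brute-force scan of the SSE.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.2, 3.8, 6.1, 8.3]

def sse(a):
    return sum((y - a * x) ** 2 for x, y in zip(xs, ys))

# Eq.(8): a_hat = sum(x_i * y_i) / sum(x_i^2)
a_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Brute-force scan over a 0.001-spaced grid of candidate coefficients.
candidates = [i / 1000.0 for i in range(-5000, 5001)]
best = min(candidates, key=sse)

print(round(a_hat, 3), round(best, 3))  # agree to the scan resolution
```

The scan can never do better than Eq.(8): since $f(a)$ is an upward-opening quadratic, the closed-form $\hat{a}$ is the exact global minimizer.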
We will see how the estimated $\hat{a}$ in Eq.(8) is an average in the next post. In the post after that, the LSE estimation of Eq.(8) will be extended to functions of multiple independent variables, and the problem will then be solved for polynomial functions of degree 2 or higher.