UCES
Course: Methods and Analysis in UCES
Applied Area: business forcasting, nonlinear heat diffusion
Assessment: ill-conditioned normal equation
Prerequisites:
Objectives:
General Information: This is an introduction to curve fitting and to nonlinear aspects of heat transfer. The normal equations are introduced to solve the least squares problem, and the possibility of ill-conditioned normal equations is introduced.
Contact Person: R. E. White, NCSU, white@math.ncsu.edu
Revision Date: 7-28-94
Copyright ©1994, R. E. White. All rights reserved.
In this section we investigate algebraic systems that have more equations than unknowns. In general, these will not have a solution in the sense that one gets exact equality for each equation. Here we hope to be able to choose the unknowns so that the equations are as close to being satisfied as possible. For example, consider the following where there are two variables and three equations
Or, in matrix form, with the residual vector, r, inserted is
We will want to choose the solution x so that the "size" of the residual vector is a minimum. A very appropriate measure of "size" will be the sum of the squares of residual's components.
Such x are called the least squares solution to the overdetermined system.
We will apply the above concepts to two areas. Both are related to fitting curves to a number of data points. There are many other problems that can be formulated as minimization problem which are in many cases related to algebraic systems.
Application to Business Forecasting. Consider
a computer company which has recorded sales of 78, 85, 90, 96, 104 and 113
computers over the last six months. They wish to make a prediction of the
sales over the next six months. This will help them plan their production
needs during the next six months. This data is increasing more or less in a
linear manor. Consequently, we are looking for a straight line that is
"closest" to the data. An analytical way of saying this is that we are
looking for
![]() |
| Figure: Sales Data and a "Close" Straight Line |
In order to form the corresponding overdetermined system, note the components of the residual vector are just the vertical distances between the data points and the desired straight line. So, for each data point there is an equation which has the two unknowns m and c. For the above data the six equations are
Or, the matrix form with the residual vector is
Application to Nonlinear Heat Diffusion. In the past models of heat diffusion via the Fourier heat law, the thermal conductivity was the constant of proportionality for the diffusion of heat. However, if the temperature varies over a large range, the thermal conductivity will not be a constant. Consider the following data for thermal conductivity as a function of temperature.
| Temperature | Thermal Conductivity |
|---|---|
| 000 | .0010 |
| 200 | .0015 |
| 400 | .0021 |
| 600 | .0051 |
| 800 | .0094 |
Inspection of this data indicates that it is generally shaped like a
parabola whose equation is a second order polynomial
Or, the matrix form with the residual vector is
The model has the form of Ax = d where A is an m by n matrix with m larger
than n, that is, there are more rows or equations than the unknowns in the
x vector. Since there are no exact solutions, we try to find x so that the
residual vector in
Definition. The vector x is called a least squares solution of the overdetermined system if and only if x is such that
Fortunately, this solution in most cases has a very nice answer. If x is the least squares solution, then for all y
(d - Ax)T(d - Ax)
(d - Ay)T(d - Ay). |
(1) |
Let y = x + (y - x) and consider
| (d - Ay)T(d - Ay) | = ((d - Ax) - A(y - x))T ((d - Ax) - A(y - x)) | |
| = ((d - Ax)T - (A(y - x))T) ((d - Ax) - A(y - x)) | ||
| = (d - Ax)T(d - Ax) - (A(y - x))T(d - Ax)
- (d - Ax)TA(y - x) + (A(y - x))T(A(y - x)) |
||
(d - Ax)T(d - Ax) -
2(y - x)T AT(d - Ax). |
(2) |
If the inequality in (1) is to hold for all y, then the last term on the right side of (2) must be zero. That is, AT(d - Ax) = 0.
This prompts the following definition and theorem which we have just established.
Definition. ATAx = ATd is called the normal equation associated with the least squares problem.
Normal Equation Theorem. If ATA has an inverse, then the solution of the normal equation is also a solution of the least squares problem.
Example. Consider the problem in the introduction with three equations and two unknowns
The solution of the normal equation is
We will use the Maple procedures multiply and linsolve to find the solution of the normal equations. We do this for both the above applications.
Maple Code for the Business Forcast Model.
> with(linalg):> A:=matrix(6,2,[1,1, 2,1, 3,1, 4,1, 5,1, 6,1]);
> ATA:= multiply(transpose(A),A);
> d:=vector([78,85,90,96,104,113]);
> ATd:= multiply(transpose(A),d);
> lssol:= linsolve(ATA, ATd);
![]()
This means the slope m = 34/5 and the y intercept
Maple Code for the Thermal Conductivity Data.
> A:=matrix(5,3,[1,0,0,1,200,40000,1,400,160000,1,600,360000,1,800,640000]);
> ATA:= multiply(transpose(A),A);
> cond(ATA);
> d:= vector([.0010,.0015,.0021,.0051,.0094]);
> ATd := multiply(transpose(A),d);
> lssol:= linsolve(ATA,ATd);
> Digits:=12;> lssol:= linsolve(ATA,ATd);
> with(plots):> para:=plot(lssol[1] + lssol[2]*x + lssol[3]*x*x,x=0..800):
> data:=plot({[0,.001],[200,.0015],[400,.0021],[600,.0051],[800,.0094]});
> display({para,data});
![]() |
| Figure: Thermal Conductivity versus Temperature |
Matlab Code for Least Squares Solution.
>>format long (16 digits)>>A = [1 0 0; 1 200 40000; 1 400 160000; 1 600 360000; 1 800 640000]
A =
1 0 0
1 200 40000
1 400 160000
1 600 360000
1 800 640000
>>d = [.0010 .0015 .0021 .0051 .0094]'
d =
0.00100000000000
0.00150000000000
0.00210000000000
0.00510000000000
0.00940000000000
>>lssol = (A'*A)\(A'*d)
lssol =
0.00116857142857
-0.00000408571429
0.00000001785714
>>cond(A'*A)
ans =
5.016790862197858e+11
The least squares linear approximation model requires that the data be in the form of a straight line. Other factors governing the effectiveness of such models include other unpredictable costs which may make the attractiveness of items to increase or decrease rapidly.
The thermal conductivity model may not be valid if there are changes of physical state within the desired range. In this case, the thermal conductivity will have jump discontinuities, and therefore, polynomial approximations will not be a good model.
Not all models are in the form of polynomials. In many cases the models have the form of transcendental functions such as log or exponential functions. Here the least squares model must be altered to include these more complicated nonlinear effects.
In the thermal conductivity model the condition number of the ATA was huge. This is a common problem with the condition number of such matrix products. Generally, the condition number of ATA will be the square of the condition number of A.
In the previous section we considered another polynomial approximation, Lagrange polynomials, given by requiring the approximation to be exactly the value of a complicated function at a number of points. If there are a large number of points, then the polynomial will have a high order, and there will be many oscillations. By using least squares with the appropriate degree polynomial, which reflects the data, one can use many data points.