UNDERGRADUATE COMPUTATIONAL ENGINEERING AND SCIENCE

UCES

"Ill-conditioned Algebraic Systems and Function Approximation"

Course:   Methods and Analysis in UCES

Prerequisites:

  1. Math:   Algebra 2 and some matrices
  2. Computing:   Maple
  3. Science:   Physics

Objectives:

  1. Math:   Matrix norm, condition number
  2. Computing:   Use of different precision
  3. Science:   Trajectories

General Information: The main goal is to make the student aware that some problems will have solutions which can vary a great deal with small changes in the data of the problem. This is quantified by the use of a condition number with respect to the max norm.

Contact Person:   R. E. White, NCSU, white@math.ncsu.edu

Revision Date:   7-29-94

Copyright ©1994, R. E. White. All rights reserved.

Return to Table of Contents


1.7   Ill-conditioned Algebraic Systems and Function Approximation

1.7.1.   Introduction.

The Gaussian elimination algorithm can have a large number of operations. If one goes to the trouble of counting operations for the forward sweep, you will discover that there are order n3/3 operations. So, if the number of unknowns doubles the number of operations increases by a factor of 8. Or, if n = 200, then there are approximately 8/3 million operations! Certainly, one might begin to worry about the accumulation of roundoff error.

Some problems are very sensitive to small changes in data. That is, for small changes in data one will observe large change in the solution. The changes in the data could be either due to approximate empirical data or due to the use of floating point numbers approximation of real numbers. A simple illustration is depicted in the figure below where two lines are nearly parallel. If the intercepts with the vertical axis vary just a little, then the intersection will vary a great deal. Hence, the solution of the corresponding 2D system will vary a great deal.

Figure: Ill-conditioned System


1.7.2.   Applied Area.

We will present two applications. The first involves the unknown trajectory of a mass (artillery shell). The second application is somewhat related and requires a polynomial approximation to a function that is difficult to compute. Later we will illustrate some of the numerical difficulties.

Application to Trajectories. A mass is observed several times. Based upon these observations and basic laws of motion, we want to determine the path of the mass. This will allow us compute where the mass came from and where it will land.

Figure: Trajectory

The simplest model of motion is in 2D with the only force being gravitation in the vertical direction. In this case the path can be described in terms of the initial position (x(0),y(0)) = (x0 ,y0 ) and initial velocity (u(0),v(0)) = (u0 ,v0 ) as

x(t) = u0 t + x0 and

y(t) = v0 t + y0 - 16t2 .

These four parameters are the unknowns, and they will require four equations which can be gotten from two observations of the position. Let (x1 , y1 ) and (x2 , y2 ) be the observed positions at times t1 and t2 so that

u0 t1 + x0 = x1,

v0 t1 + y0 - 16t12 = y1 ,

u0 t2 + x0 = x2 and

v0 t2 + y0 - 16t22 = y2 .

The first and third equations decouple and have the matrix form

If the mass is moving fast, the observations will have to be close together which means the t1 and t2 will be close. Thus the lines in the (x0 ,u0 ) space will be almost parallel. Another way of viewing this is to do the first row operation

If t2 - t1 is small, then significant roundoff errors can occur in the calculation of u0 .

Application to Function Approximation. Many functions are very expensive to evaluate, but often segments of the graphs look like graphs of polynomials. In order to reduce the function evaluation times, these functions can be approximated by a variety of polynomials. One type is the Lagrange interpolation polynomials. The coefficients in an nth order polynomial are determined by setting the polynomial at distinct points equal to the function evaluated at these points. For example, if the functions looks like a quadratic polynomial near the points x1, x2 and x3 , then we can find the three coefficients of

P(x) = a0 + a1 x + a2 x2

by solving the system associated with P(xi ) = f(xi ) for i = 1,2,3. The matrix form is

If two of the interpolation points are close, then the computed solution may have significant error.


1.7.3.   Model.

The model in both these applications is an algebraic system. Not all algebraic systems have this problem, and there is a way of identifying many of the matrices that generate significant computing errors. In Maple the procedure called cond, for condition number of a matrix can help identify these matrices. The procedure cond returns a positive real number that attempts the measure ill-conditioned matrices.


1.7.4.   Method.

The method is Gaussian elimination. If the problem is ill-conditioned, then there are two relatively easy things one can do to reduce the error. Either, one can increase the precision of the floating point numbers, and this is easy to do in Maple. Or, one can avoid division by small numbers, and this is known as row pivoting in the forward sweep of the Gaussian elimination algorithm. The following remarkable example demonstrates both these methods.

Example. We will consider three different versions of the Gaussian elimination algorithm applied to the following system

Use real numbers:

Add (-104)(row 1) to (row 2) to get

Solve for x2 and x1:

x2 = 3.9997/1.9999
x1
= (4 - 2(3.9997/1.9999))104

= 2(2(1.9999) - 3.9997))/1.9999 104

= 2/1.9999

Use floating point numbers with 3 digits:

The above row operation gives a different upper triangular matrix

Thus, the floating point solution is

X2 = .2 101 and
X1 = .0   .

Interchange rows of the system to avoid division by a small number:

The 3 digit floating point solution is X1 = .100 101 and X2 = .200 101 !


1.7.5.   Implementation.

We will use Maple and its ability to control the number of digits in the floating point calculations to illustrate the above ill-conditioned problems.

Consider the 2 by 2 matrix from the trajectory problem.

	> Digits:=10;
	> a:=matrix(2,2,[1,1,1,1.001]);

        

	> cond(a);

        

	> linsolve(a,[1,3]);

        

	> Digits:=3;

        NO SOLUTION....Divide by zero

	> Digits:=4;

        

	> Digits:=5;

        

Consider the 3 by 3 matrix from the function approximation problem. Assume the function to be approximated satisfies f(1) = 1, f(1.1) = 2 and f(1.11) = 3.

	> Digits:=10;
	> a:=matrix(3,3,[1,1,1,1,1.1,1.21,1,1.11,1.2321]);

        
	> cond(a);

        
	> sol:=linsolve(a,[1,2,3]);
        

	> Digits:=3:
	> sol3:=linsolve(a,[1,2,3]);
        

	> Digits:=5:
	> sol5:=linsolve(a,[1,2,3]);
        

	> Digits:=8;
	> sol8:=linsolve(a,[1,2,3]);
        


1.7.6.   Assessment.

In the above problems the ill-conditioning was easily resolved by using more digits in the floating point numbers. A class of matrices called Hilbert matrices are notoriously ill-conditioned. The condition number of the following indicate this.

	>a:=hilbert(3);

        

	> cond(a);1

        

	a:=hilbert(4);

        

	> cond(a);

        

Next we will describe one of the many versions of condition number, and try to indicate why it is a measure of an ill-conditioned problem. An ill-conditioned problem generates large errors in the computed solution. Let the exact solution be x where Ax = d and let X be the computed solution where AX - d = -r is the nonzero residual. The relative error is given by the ratio of the "size" of x - X and the "size" of x. In ill-conditioned problem this ratio is large.

In order to be more precise about the "size" we define the max norm of a vector, and it is analogous to the absolute value of a single real number.

Definition. The max norm of the vector x is a real number

Basic Properties.

  1. 0 and = 0 if and only if x = 0.

Example.

The proofs of the first three are a consequence of the analogous properties of the absolute value. In order to prove the fourth property, apply the definition of max norm to the matrix-vector product Ax

.

Let us use this notion of "size" in the analysis of the relative error. Note Ax - AX = d - d + r, and so x - X = A-1r and x = A-1d. Now apply property four to the last two equations. This gives the following important estimate of the relative error.

.

This inequality prompts the following definition of condition number of a matrix, as well as the formal statement this inequality as a condition number theorem.

Definition. The condition number of the matrix A with respect to the max norm is

Condition Number Theorem. If A is an n by n matrix, A has an inverse matrix, Ax = d and AX - d = -r , then

Example.

There are other choices for norms, and hence, condition numbers. Note the condition number theorem implies the relative error should be small provided the condition number and the r are small. It does not imply, but does suggest, that the relative error will be large when the condition number is large!


1.7.7.   Homework.

  1. Verify the 3 digit floating point calculations for the 2 by 2 example with row interchange in the method section.

  2. Use the Maple procedures Digits and cond to study the polynomial function approximation where f(1) = 1, f(1.1) = 2, f(1.11) = 3 and f(1.13) = 4 . Use a third order polynomial, and vary the digits. Graph the polynomial whose coefficients are the accurate solution of the associated algebraic system.

  3. Find the norm of x, norm of A and the cond(A) where

    Use Maple to find the inverse of A and compare cond(A) with your hand calculated condition number of A with respect to the max norm.

  4. Prove the first three basic properties of the max norm.