UNDERGRADUATE COMPUTATIONAL ENGINEERING AND SCIENCE

UCES

"Computers and Matrix Models"

Course:   Methods and Analysis in UCES

Prerequisites:

  1. Math:   Algebra 2, probability and some matrices
  2. Computing:   Maple and arrays
  3. Science:   none

Objectives:

  1. Math:   Generalize first order finite difference model
  2. Computing:   Matrix computations and computer architecture
  3. Science:   Markov chains

General Information: This lecture contains the generalization of the first order finite difference model to its matrix form. This requires greater computing power and some discussion of computer architectures is included.

Contact Person:   R. E. White, NCSU, white@math.ncsu.edu

Revision Date:   7-28-94

Copyright ©1994, R. E. White. All rights reserved.

Return to Table of Contents


1.2   Computers and Matrix Models

1.2.1.   Introduction.

In this lecture we will consider matrix generalizations of the first order finite difference model, and consequently, we will eventually need more powerful computing hardware and software. We begin by a very brief discussion of what a computer is.

The von Neumann definition of a computer contains three parts: main memory, input-output device and central processing unit (CPU). The CPU has three sections: the arithmetic logic unit, the control unit and the local memory. The arithmetic logic unit does the floating point calculations while the control unit governs the instructions and data. The local memory is small compared to the main memory, but it is usually very fast. Hence, it is important to move data from the main memory to the local memory and do as much computation with this data as possible before moving it back to the main memory. The movement to data in and out of the local memory is relatively slow. Algorithms that have been optimized for a particular computer must take these facts into careful consideration.

Another way of describing a computer is the hierarchical classification of its components. There are three levels: the processor level with wide band communication paths, the register level with several bytes (8 bits per byte) pathways and the gate or logic level which has several bits in its pathways.

The figure below illustrates two processor level descriptions of computers. The top is a von Neumann computer with the three basic components. The lower part depicts a multiprocessing computer with four CPUs. The CPUs communicate with each other via the shared memory. The switch controls access to the shared memory, and here there is a potential for a bottleneck. The purpose of multiprocessors is to do more computation in less time. This is critical in many applications such as weather prediction.

Figure: Processor Level Computers

Within the CPU is the arithmetic logic unit and here there are many floating point adders. These can be described as a register level device. A floating point add can be described in four distinct steps each requiring a distinct hardware device. For example, use four digits to do a floating point add of 100.1 + (-3.6).

CE:   compare expressions       .1001 103 and -.36 101
AE:   mantissa alignment .1001 103 and -.0036 103
AD:   mantissa add 1001 - 0036 = 0965
NR:   normalization .9650 102.

This can be depicted by the following figure where the lines indicate communication pathways with several bytes of data. The data moves from left to right in time intervals called the clock cycle time of the particular computer. If each step takes one clock cycle and the clock cycle time was 6 nanoseconds, then a floating point add would take 24 nanoseconds (10-9 sec.).

Figure: Register Level Floating Point Add

Within a floating point adder there are many devices which add integers. These devices typically deal with just a few bits of information and are examples of gate or logic level devices. Here the integer adds can be done by base two numbers. These devices are combinations of a small number transistors designed to simulate truth tables that reflect basic binary operations.


1.2.2.   Applied Area.

In this section we introduce some applications which will lead to the matrix products of row vector times a column vector, matrix times a column vector and the general matrix times a matrix. These operations are fundamental to many numerical models, and these computations have been carefully implemented on most computers.

Application to Store Inventory. A typical department store will have about 10,000 different items each with a different price per item. The problem is to determine the net worth of the store's inventory. Clearly, we must sum 10,000 numbers where each number is a product of the number of each item and the price of this item. With 10,000 products in the store all this data must be organized in a systematic way. For example, suppose the store has just four types of TVs:

19" TV for $100
27" TV for $400
31" TV for $900
50" TV for $2100.

This information maybe recorded in an ordered list of four numbers called a price column vector

P =

Associated with this order is and inventory column vector

I =

where 7 is the number of 27" TVs, 12 is the number of 27" TVs, 3 is the number of 31" TVs and 2 is the number of 50" TVs. The net worth is given in the following form:

IT P = 7*100 + 12*400 + 3*900 + 2*2100.

The symbol IT is a row vector associated with the column vector I and is called the transpose of I:

IT = [7   12   3   2] .

This product is called a row vector times a column vector or dotproduct of two column vectors. Here the net worth is the dotproduct of the inventory vector and the price vector.

The general scheme for n different types of items with item k having price p(k) and i(k) number of item k is as follows:

ITP = i(1)*p(1) + i(2)*p(2) +... + i(n)*p(n) where

I = [i(1)   i(2)  ...   i(n)]T and P = [p(1)   p(2)   ...   p(n)]T are n dimensional column vectors. This product has n multiplications and n-1 additions, and we say it has order n operations. Another order n operation is multiplication of a column vector by a constant. For example, if the above TV store were to triple its inventory, then the new inventory vector would be

3IT = [21   36   9   6] .

Matrix operations which have order n operations are classified as BLAS1 which signifies basic linear algebra subroutine of order n to the first power operations. Most computers have BLAS1 subroutines which have been optimized for the best possible performance on that computer.

Application to Investment. An investor has some portions in stocks, bonds, growth stocks and money market. Each year she decides how to reallocate her investments. From past year's reallocations we have

.8 probability of investing in stock if already invested in stock,
.4 probability of investing in stock if already invested in bond,
.4 probability of investing in stock if already invested in growth,
.5 probability of investing in stock if already invested in money.

For the remainder of this application we shall keep the order stock, bond, growth and money. The above probabilities can be associated with a row vector to invest in stock

P1T = [.8   .4   .4   .5].

If the current investments are

50% stock
40% bond
  7% growth
  3% money,

then by writing this last information in vector form C = [.50   .40   .07   .03]T, we can expect

P1TC = .8*.5 + .4*.4 + .4*.07 + .5*.03

to be the fraction to be invested in stock the next year.

Associated with each investment category is a probability vector

PiT = [p(i,1)   p(i,2)   p(i,3)   p(i,4)]
where p(i,j) is the probability of investing in category i if already invested in category j. Hence, according to the above ordering p(2,4) is the probability of investing in bonds if already invested in growth.

PiTC = the fraction to be invested in category i given the current investments C. We can record this in vector format as

newC = [P1TC   P2TC   P3TC   P4TC]T.

Another way to say the is that the component i of the new investment vector is equal to the dotproduct of the probability vector i and the old investment vector, C.

The probability vectors can be formed into a 4 by 4 matrix or array

P = .

The newC can be viewed as a product of the matrix P and the column vector C where component i of the newC is defined as the product of row i of P times column vector C.

One can repeat this from one year to the next to try to predict future patterns of investment. For example, after two years the prediction will have the form P(PC), after three years P(P(PC) and so forth. We will later examine the behavior of this sequence of investment vectors. This is an example of a Markov chain where P is called a transition matrix from one state to the next state. Each column of P has non negative component whose sum equals one.


1.2.3.   Model.

Often the size of these matrices is very large and require extensive computational power. In fact, for each n by n matrix times a column vector there are n dotproducts. Hence, such matrix vector products require order n2 operations. These matrix operations belong to the BLAS2 family of subroutines which are also optimized for many computers.

In general one can multiply two matrices provided the rows in the left matrix have the same dimension as the columns in the right matrix.

Definition. Let A = [a(i,k)] have m rows (i = 1..m) and n columns (k = 1..n). Let B = [b(k,j)] have n columns and l rows (j = 1..l). The product AB is an matrix with m rows and l columns. The ij component of AB in row i and column j is the product of row i of A and column j of B

ij component of AB = .

Example. Let m = 3, n = 2 and l = 4.

.

If A and B are both n by n matrices, then there are n2 dotproducts, so there are order n3 operations. Here we say that such operations are a member of the BLAS3 subroutines.

Basic Properties.

  1. AB and BA are not, in general, equal (not commutative).
  2. A(BC) = (AB)C provided the products are defined (associative).
  3. A(B + C) = AB + AC provided the products are defined (distributive).

Even if A and B are square, the product may not commute. Consider

.

The proof of the distributive law is straight forward. The associative law requires the sigma notation. The il component of A(BC) has the form

Example. Consider a Markov chain with transition matrix P which is n by n. After k iterations the state vector is P(P(...(PC)...)) where P appears k times. Use the above associative law to write this state as (Pk)C where Pk denotes P times itself k times.


1.2.4.   Method.

The fundamental operation is the dotproduct or product of a row vector with a column vector. The matrix-vector and matrix-matrix products can be defined using the dotproducts. They also can be defined in alternate ways by making use of the basic rules of arithmetic. The advantage of the alternate mode of computation is that it often allows one to make use of particular properties of a computer such as communication speeds and local versus main memory size and speeds. We shall use the following notation:

a = [a(j)] is a column vector where j = 1,...,n and

A = [a(i,j)] is a matrix where i = 1,...,m are the row numbers
and j = 1,...,n are the column numbers.

Dotproduct Algorithm aTx = d:

        d = 0
        for j = 1,n
            d = d + a(j)*x(j)
        endloop.

Matrix-vector Product Algorithm (ij version) Ax = d:

        for i = 1,m
            d(i) = 0
            for j = 1,n
                d(i) = d(i) + a(i,j)*x(j)
            endloop
        endloop.

An alternate way to do matrix-vector products is based on following reordering of the arithmetic operations. Consider the case where n = m = 3 and

a(1,1)*x(1) + a(1,2)*x(2) + a(1,3)*x(3) = d(1)
a(2,1)*x(1) + a(2,2)*x(2) + a(2,3)*x(3) = d(2)
a(3,1)*x(1) + a(3,2)*x(2) + a(3,3)*x(3) = d(3).

This can be written in vector form as

In other words, the product is a linear combination of the columns of the matrix. So, the product can be formed by adding these multiples of the column vectors. This amounts to reversing the order of the nested loop in the above algorithm.

Matrix-vector Product Algorithm (ji version) Ax = d:

        d(1:m) = 0	  (implied loop that sets the vector d = 0)
        for j = 1,n
            for i = 1,m
                d(i) = d(i) + a(i,j)*x(j)
            endloop
        endloop.

The matrix vector products are similar, but they will have three nested loops. Here there are 6 different orderings of the loops, and so, the analysis is a little more complicated.


1.2.5.   Implementation.

This section contains Maple code for the generalization of the first order finite difference algorithm to matrices. The first one makes use of a Maple intrinsic for matrix-vector products. We have used a procedure called FOFDM for this algorithm. We have applied it to Markov chains where the column vector b is the zero vector. State vector y has an order as above of stocks, bonds, growth and money. The input below indicates an initial distribution of 50% stocks, 40% bonds, 5% growth and 5% money. The graph in the figure indicates that the states quickly converge to investments of 68% stock, 15% bonds, 8% growth and 9% money. Of course, these results depend upon the values in the transition matrix which is stored in matrix a.

Maple Code for newy = A*oldy + b.

Procedure is Defined

        > FOFDM:= proc(a,b,n,x0,y0,h,d)
        > local k;
        > with(plots):
        > with(linalg):
        > x(0):=x0:
        > y(0):=y0:
        > yy:=y0:
        > for k from 0 to n do
        >     y(k+1) := evalm(multiply(a,y(k)) + b);
        >     x(k+1) := x(k) + h;
        >     yy:=augment(yy,y(k+1));
        > od:
        > matrixplot(yy,style=wireframe,axes=normal);
        > end:

   Input Data:
        > a:=matrix(4,4,[.8,.4,.4,.5,  .1,.3,.4,.1,  .05,.2,.1,.05,
        			.05,.1,.1,.35]);
        > b:=vector([.0,.0,.0,.0]);
        > y0:=vector([.5,.4,.05,.05]);

   Call Procedure:
        > FOFDM(a,b,10,0,y0,1,10);

   Output Computed Data:
        > y(10);
        [.6813 .1539 .0769 .0879]

Figure: Computed States

The intrinsic multiply(A,y) could be replaced by either the ij or ji versions of the matrix-vector product algoirthms. The ij version in Maple is as follows.

Maple Code for ij Matrix-vector Product.

        > for i from 1 to n do
        >     d[i]:= 0:
        >     for j from 1 to n do
        >         d[i]:= d[i] + A[i,j]*y[j];
        >     od:
        > od:

In the following we present the Matlab code, markov.m, for the ji version of a matrix-vector product. However, Matlab has its matrix-vector products optimized, and so, one should in most cases use the Matlab intrinsic A*x. The reader should experiment with the three versions and determine which version is fastest.

Matlab Code for Markov Chain.

        >>A = [.8 .4 .4 .5;.1 .3 .4 .1; .05 .2 .1 .05; .05 .1 .1 .35]

        A =
                 0.8000    0.4000    0.4000    0.5000
                 0.1000    0.3000    0.4000    0.1000
                 0.0500    0.2000    0.1000    0.0500
                 0.0500    0.1000    0.1000    0.3500

        >>x = [.5 .4 .05 .05]

        x =
                 0.5000    0.4000    0.0500    0.0500

        >>x = x'

x = 0.5000 0.4000 0.0500 0.0500 >>x1 = A*x x1 = 0.6050 0.1950 0.1125 0.0875 >>x2 = A*x1 x2 = 0.6508 0.1728 0.0849 0.0916 >>x3 = A*x2 x3 = 0.6695 0.1600 0.0802 0.0904 >>x4 = A*x3 x4 = 0.6768 0.1560 0.0780 0.0891

Matlab Code for ji Matrix-vector Product.


        A = [.8 .4 .4 .5; .1 .3 .4 .1; .05 .2 .1 .05; .05 .1 .1 .35]
	x = [.5 .4 .05 .05]'
        d = zeros(4,1)
        for j = 1:4
            for i = 1:4
                d(i) = d(i) + A(i,j)*x(j);
            end
        end
        d

   Input:
	A =
            0.8000    0.4000    0.4000    0.5000
            0.1000    0.3000    0.4000    0.1000
            0.0500    0.2000    0.1000    0.0500
            0.0500    0.1000    0.1000    0.3500

        x =
            0.5000
            0.4000
            0.0500
            0.0500

        d =
            0
            0
            0
            0

   Output:
       d =
            0.6050
            0.1950
            0.1125
            0.0875


1.2.6.   Assessment.

The constants in the transition matrix for any Markov chain model may be estimates from past actions and are also subject to change as time evolves. Many Markov chains have large numbers of unknowns, and therefore, the matrix-vector products can be very expensive in both storage of the arrays and computation of the products.

Another concern is the potential for significant roundoff error. Because the transition matrix has column sums equal to one, roundoff errors may force column sums to be greater than one. This is similar to the first order finite difference algorithm where |a| > 1, and here the states tend to blowup!


1.2.7.   Homework.

  1. Consider the Markov chain for the investment application. Suppose the initial investments are different from above C column vector. What happens to the steady state solution for different C?

  2. What happens to Pk as k increases? How does this relate to the steady state solution?

  3. Show that the distributed law holds for matrices.

  4. Compare the execution times for the following three versions of matrix-vector products: ij version, ji version and any software for your computer.