UCES
Course: Methods and Analysis in UCES
Applied Area: store inventory, investment, Markov chain
Model: matrix products, properties
Method: ij and ji versions of matrix products
Assessment: time dependent quantities, stability
Prerequisites:
Objectives:
General Information: This lecture contains the generalization of the first order finite difference model to its matrix form. This requires greater computing power and some discussion of computer architectures is included.
Contact Person: R. E. White, NCSU, white@math.ncsu.edu
Revision Date: 7-28-94
Copyright ©1994, R. E. White. All rights reserved.
In this lecture we will consider matrix generalizations of the first order finite difference model, and consequently, we will eventually need more powerful computing hardware and software. We begin by a very brief discussion of what a computer is.
The von Neumann definition of a computer contains three parts: main memory, input-output device and central processing unit (CPU). The CPU has three sections: the arithmetic logic unit, the control unit and the local memory. The arithmetic logic unit does the floating point calculations while the control unit governs the instructions and data. The local memory is small compared to the main memory, but it is usually very fast. Hence, it is important to move data from the main memory to the local memory and do as much computation with this data as possible before moving it back to the main memory. The movement to data in and out of the local memory is relatively slow. Algorithms that have been optimized for a particular computer must take these facts into careful consideration.
Another way of describing a computer is the hierarchical classification of its components. There are three levels: the processor level with wide band communication paths, the register level with several bytes (8 bits per byte) pathways and the gate or logic level which has several bits in its pathways.
The figure below illustrates two processor level descriptions of computers. The top is a von Neumann computer with the three basic components. The lower part depicts a multiprocessing computer with four CPUs. The CPUs communicate with each other via the shared memory. The switch controls access to the shared memory, and here there is a potential for a bottleneck. The purpose of multiprocessors is to do more computation in less time. This is critical in many applications such as weather prediction.
![]() |
| Figure: Processor Level Computers |
Within the CPU is the arithmetic logic unit and here there are many floating point adders. These can be described as a register level device. A floating point add can be described in four distinct steps each requiring a distinct hardware device. For example, use four digits to do a floating point add of 100.1 + (-3.6).
| CE: compare expressions | .1001 103 and -.36 101 |
| AE: mantissa alignment | .1001 103 and -.0036 103 |
| AD: mantissa add | 1001 - 0036 = 0965 |
| NR: normalization | .9650 102. |
This can be depicted by the following figure where the lines indicate
communication pathways with several bytes of data. The data moves from left to
right in time intervals called the clock cycle time of the particular computer.
If each step takes one clock cycle and the clock cycle time was 6 nanoseconds,
then a floating point add would take 24 nanoseconds
![]() |
| Figure: Register Level Floating Point Add |
Within a floating point adder there are many devices which add integers. These devices typically deal with just a few bits of information and are examples of gate or logic level devices. Here the integer adds can be done by base two numbers. These devices are combinations of a small number transistors designed to simulate truth tables that reflect basic binary operations.
In this section we introduce some applications which will lead to the matrix products of row vector times a column vector, matrix times a column vector and the general matrix times a matrix. These operations are fundamental to many numerical models, and these computations have been carefully implemented on most computers.
Application to Store Inventory. A typical department store will have about 10,000 different items each with a different price per item. The problem is to determine the net worth of the store's inventory. Clearly, we must sum 10,000 numbers where each number is a product of the number of each item and the price of this item. With 10,000 products in the store all this data must be organized in a systematic way. For example, suppose the store has just four types of TVs:
This information maybe recorded in an ordered list of four numbers called a price column vector
| P = | ![]() |
Associated with this order is and inventory column vector
| I = | ![]() |
where 7 is the number of 27" TVs, 12 is the number of 27" TVs, 3 is the number of 31" TVs and 2 is the number of 50" TVs. The net worth is given in the following form:
The symbol IT is a row vector associated with the column vector I and is called the transpose of I:
This product is called a row vector times a column vector or dotproduct of two column vectors. Here the net worth is the dotproduct of the inventory vector and the price vector.
The general scheme for n different types of items with item k having price p(k) and i(k) number of item k is as follows:
I = [i(1) i(2) ... i(n)]T and
Matrix operations which have order n operations are classified as BLAS1 which signifies basic linear algebra subroutine of order n to the first power operations. Most computers have BLAS1 subroutines which have been optimized for the best possible performance on that computer.
Application to Investment. An investor has some portions in stocks, bonds, growth stocks and money market. Each year she decides how to reallocate her investments. From past year's reallocations we have
For the remainder of this application we shall keep the order stock, bond, growth and money. The above probabilities can be associated with a row vector to invest in stock
If the current investments are
then by writing this last information in vector form
to be the fraction to be invested in stock the next year.
Associated with each investment category is a probability vector
PiTC = the fraction to be invested in category i given the current investments C. We can record this in vector format as
Another way to say the is that the component i of the new investment vector is equal to the dotproduct of the probability vector i and the old investment vector, C.
The probability vectors can be formed into a 4 by 4 matrix or array
.
The newC can be viewed as a product of the matrix P and the column vector C where component i of the newC is defined as the product of row i of P times column vector C.
One can repeat this from one year to the next to try to predict future patterns of investment. For example, after two years the prediction will have the form P(PC), after three years P(P(PC) and so forth. We will later examine the behavior of this sequence of investment vectors. This is an example of a Markov chain where P is called a transition matrix from one state to the next state. Each column of P has non negative component whose sum equals one.
Often the size of these matrices is very large and require extensive computational power. In fact, for each n by n matrix times a column vector there are n dotproducts. Hence, such matrix vector products require order n2 operations. These matrix operations belong to the BLAS2 family of subroutines which are also optimized for many computers.
In general one can multiply two matrices provided the rows in the left matrix have the same dimension as the columns in the right matrix.
Definition. Let A = [a(i,k)] have m rows (i = 1..m) and n columns (k = 1..n). Let B = [b(k,j)] have n columns and l rows (j = 1..l). The product AB is an matrix with m rows and l columns. The ij component of AB in row i and column j is the product of row i of A and column j of B
.
Example. Let m = 3, n = 2 and l = 4.
.
If A and B are both n by n matrices, then there are n2 dotproducts, so there are order n3 operations. Here we say that such operations are a member of the BLAS3 subroutines.
Even if A and B are square, the product may not commute. Consider
.
The proof of the distributive law is straight forward. The associative law requires the sigma notation. The il component of A(BC) has the form
Example. Consider a Markov chain with transition matrix P which is n by n. After k iterations the state vector is P(P(...(PC)...)) where P appears k times. Use the above associative law to write this state as (Pk)C where Pk denotes P times itself k times.
The fundamental operation is the dotproduct or product of a row vector with a column vector. The matrix-vector and matrix-matrix products can be defined using the dotproducts. They also can be defined in alternate ways by making use of the basic rules of arithmetic. The advantage of the alternate mode of computation is that it often allows one to make use of particular properties of a computer such as communication speeds and local versus main memory size and speeds. We shall use the following notation:
Dotproduct Algorithm aTx = d:
d = 0
for j = 1,n
d = d + a(j)*x(j)
endloop.
Matrix-vector Product Algorithm (ij version)
Ax = d:
for i = 1,m
d(i) = 0
for j = 1,n
d(i) = d(i) + a(i,j)*x(j)
endloop
endloop.
An alternate way to do matrix-vector products is based on following reordering of the arithmetic operations. Consider the case where n = m = 3 and
This can be written in vector form as
In other words, the product is a linear combination of the columns of the matrix. So, the product can be formed by adding these multiples of the column vectors. This amounts to reversing the order of the nested loop in the above algorithm.
Matrix-vector Product Algorithm (ji version) Ax = d:
d(1:m) = 0 (implied loop that sets the vector d = 0)
for j = 1,n
for i = 1,m
d(i) = d(i) + a(i,j)*x(j)
endloop
endloop.
The matrix vector products are similar, but they will have three nested loops. Here there are 6 different orderings of the loops, and so, the analysis is a little more complicated.
This section contains Maple code for the generalization of the first order
finite difference algorithm to matrices. The first one makes use of a Maple
intrinsic for matrix-vector products. We have used a procedure called FOFDM
for this algorithm. We have applied it to Markov chains where the column
vector b is the zero vector. State vector y has an order as above of stocks,
bonds, growth and money. The input below indicates an initial distribution of
50% stocks, 40% bonds, 5% growth and 5% money. The graph in the figure
indicates that the states quickly converge to investments of 68% stock, 15%
bonds, 8% growth and 9% money. Of course, these results depend upon the
values in the transition matrix which is stored in matrix a.
Maple Code for newy = A*oldy + b.
Procedure is Defined
1.2.5. Implementation.
> FOFDM:= proc(a,b,n,x0,y0,h,d)
> local k;
> with(plots):
> with(linalg):
> x(0):=x0:
> y(0):=y0:
> yy:=y0:
> for k from 0 to n do
> y(k+1) := evalm(multiply(a,y(k)) + b);
> x(k+1) := x(k) + h;
> yy:=augment(yy,y(k+1));
> od:
> matrixplot(yy,style=wireframe,axes=normal);
> end:
Input Data:
> a:=matrix(4,4,[.8,.4,.4,.5, .1,.3,.4,.1, .05,.2,.1,.05,
.05,.1,.1,.35]);
> b:=vector([.0,.0,.0,.0]);
> y0:=vector([.5,.4,.05,.05]);
Call Procedure:
> FOFDM(a,b,10,0,y0,1,10);
Output Computed Data:
> y(10);
[.6813 .1539 .0769 .0879]
![]() |
| Figure: Computed States |
The intrinsic multiply(A,y) could be replaced by either the ij or ji versions of the matrix-vector product algoirthms. The ij version in Maple is as follows.
Maple Code for ij Matrix-vector Product.
> for i from 1 to n do
> d[i]:= 0:
> for j from 1 to n do
> d[i]:= d[i] + A[i,j]*y[j];
> od:
> od:
In the following we present the Matlab code, markov.m, for the ji version of a matrix-vector product. However, Matlab has its matrix-vector products optimized, and so, one should in most cases use the Matlab intrinsic A*x. The reader should experiment with the three versions and determine which version is fastest.
>>A = [.8 .4 .4 .5;.1 .3 .4 .1; .05 .2 .1 .05; .05 .1 .1 .35]
A =
0.8000 0.4000 0.4000 0.5000
0.1000 0.3000 0.4000 0.1000
0.0500 0.2000 0.1000 0.0500
0.0500 0.1000 0.1000 0.3500
>>x = [.5 .4 .05 .05]
x =
0.5000 0.4000 0.0500 0.0500
>>x = x'
x =
0.5000
0.4000
0.0500
0.0500
>>x1 = A*x
x1 =
0.6050
0.1950
0.1125
0.0875
>>x2 = A*x1
x2 =
0.6508
0.1728
0.0849
0.0916
>>x3 = A*x2
x3 =
0.6695
0.1600
0.0802
0.0904
>>x4 = A*x3
x4 =
0.6768
0.1560
0.0780
0.0891
Matlab Code for ji Matrix-vector Product.
A = [.8 .4 .4 .5; .1 .3 .4 .1; .05 .2 .1 .05; .05 .1 .1 .35]
x = [.5 .4 .05 .05]'
d = zeros(4,1)
for j = 1:4
for i = 1:4
d(i) = d(i) + A(i,j)*x(j);
end
end
d
Input:
A =
0.8000 0.4000 0.4000 0.5000
0.1000 0.3000 0.4000 0.1000
0.0500 0.2000 0.1000 0.0500
0.0500 0.1000 0.1000 0.3500
x =
0.5000
0.4000
0.0500
0.0500
d =
0
0
0
0
Output:
d =
0.6050
0.1950
0.1125
0.0875
The constants in the transition matrix for any Markov chain model may be estimates from past actions and are also subject to change as time evolves. Many Markov chains have large numbers of unknowns, and therefore, the matrix-vector products can be very expensive in both storage of the arrays and computation of the products.
Another concern is the potential for significant roundoff error. Because the transition matrix has column sums equal to one, roundoff errors may force column sums to be greater than one. This is similar to the first order finite difference algorithm where |a| > 1, and here the states tend to blowup!