UNDERGRADUATE COMPUTATIONAL ENGINEERING AND SCIENCE

UCES

"Vector Computers and Heat Diffusion"

Course:   Methods and Analysis in UCES

Prerequisites:

  1. Math:   Algebra 2
  2. Computing:   Nested loops, arrays
  3. Science:   Physics

Objectives:

  1. Math:   Time and 1D space model
  2. Computing:   Vector pipeline, timing model
  3. Science:   Heat transfer model for a thin wire

General Information: This is a continuation of the discrete Newton's law of cooling model. We include diffusion in 1D space. and this is implemented on a vector computer.

Contact Person:   R. E. White, white@math.ncsu.edu

Revision Date:   7-28-94

Copyright ©1994, R. E. White. All rights reserved.

Return to Table of Contents


1.3   Vector Computers and Heat Transfer

1.3.1.   Introduction.

In this lecture we will consider heat conduction in a thin electrical wire which is thermally insulated on its surface. The model of the temperature will have the form newy = A oldy + b. In general, the matrix A can be extremely large, but it will also have a special structure with many more zeros than nonzero components. Because of these properties, it is important to use some computing architectures which can perform more than one operation or task at the same time. We will in this lecture examine a vector pipeline, and in the next lecture we examine multiprocessors.

A vector pipeline is a register level device, which is usually in either the control unit or the arithmetic logic unit. It has a collection of distinct hardware modules or segments that (i) execute distinct steps of an operation and (ii) each segment is busy once the pipeline is full.

The operation of floating point addition is an example. Floating point addition has four distinct steps: CE (compare expression), AE (align mantissa), AD (add mantissa) and NR (normalize).

Figure: Serial Floating Point Add

In the serial execution each step is done by a particular piece of hardware, and then that hardware is idle until the next floating point add is done. So, if each step requires one clock cycle, then the floating point add will require four clock cycles. The idea behind a vector pipeline is to keep all pieces of hardware busy during all clock cycles. The following figure depicts how this is done for a sequence of floating point additions.

Figure: Vector Pipeline for Floating Point Additions

The first pair of floating point numbers is denoted by D1, and this pair enters the pipeline in the upper left of the above figure. Segment CE on D1 is done during the first clock cycle. During the second clock cycle D1 moves to segment AE, and the second pair of floating point numbers D2 enters segment CE. After three clock cycles the pipeline is full, and there after a floating point add is produced every clock cycle. So, for large number of floating point adds with four segments the ideal speedup is four.

A discrete model for the speedup of a particular pipeline is as follows. Such models are often used in the design phase of computers. They also are used to determine how to best utilize vector pipelines.

Vector Pipeline Timing Model.

Let K = time for one clock cycle,

N = number of items of data,

L = number of segments,

Sv = startup time for the vector pipeline operation and

Ss = startup time for a serial operation.

Tserial = serial time = Ss + (LK)N.

Tvector = vector time = [Sv + (L - 1)K] + KN.

Speedup = Tserial/Tvector.

The vector startup time is usually much larger than the serial startup time. So, for small amounts of data (small N), the serial time may be smaller than the vector time! Later we shall give some computations that illustrates this point. The vector pipeline does not start to become effective until it is full, and this takes [Sv + (L - 1)K] units of time. Note that the speedup approaches the number of segments L as N increases.

Another important consideration in the use of vector pipelines is the orderly and timely input of data. An example is matrix-vector multiplication and the Fortran programming language. Fortran stores arrays by listing the columns from left to right . So, if one wants to input data from the rows of a matrix, then this will be in stride equal to the length of the columns. On the other hand, if one inputs data from columns, then this will be in stride equal to one. For this reason, when vector pipelines are used to do matrix-vector products, the ji version performs much better than the ij version. This can have a very significant impact on the effective use of vector computers such as the Cray Y-MP which can have vector pipeline speedups of up to ten.


1.3.2.   Applied Area.

In this section we present a second model of heat transfer. In our first model we considered heat transfer via a discrete version of Newton's law of cooling which involves temperature as only a discrete function of time. That is, we assumed the mass was uniformly heated with respect to space. In this section we allow the temperature to be a function of both discrete time and discrete space.

The model for the diffusion of heat in space is based on empirical observations.

Discrete Fourier's Heat Law:

	(a).  heat flows from hot to cold,

	(b).  the change in heat is proportional to the
		cross sectional area,
		change in time, and
		change in temperature/change in space.

The last term is a good approximation provided the change in space is small. The proportionality constant, K, is called the thermal conductivity. The K varies with the particular material and with the temperature. Here we will assume the temperature varies over a smaller range so that K is approximately a constant.

Consider a thin wire so that there is diffusion in only one direction, x. The wire will have a current going through it so that there is a source of heat, f, which is from the electrical resistance of the wire. The f has units of (heat)/(volume time). We will assume the ends of the wire are kept a zero temperature, and the initial temperature is also zero. The goal is to be able to predict the temperature inside the wire for any future time and space location. Problems similar to this have important applications to devices used to measure wind speeds where the wind cools the wire enough so that the electrical resistance changes.


1.3.3.   Model.  

In order to develop a model to do this prediction, we will discretize both space and time and let u(ih, k delta t) be approximated by uik where delta t = T/maxit, h = 1/n and 1 is the length of the wire. The model will have the general form

change in heat content ~ (heat from the source)
+ (heat from diffusion through the right end)
+ (heat from diffusion through the left end).

This is depicted in the figure below where the time step has not been indicated. We can choose either k delta t or (k+1) delta t. Here we will choose k delta t which will eventually result in the matrix version of the first order finite difference method.

Figure: Heat Diffusion in a Thin Wire

The heat diffusing in the right face (when (ui+1k - uik )/h > 0) is

A delta t K (ui+1k - uik )/h.

The heat diffusing out the left face (when (uik - ui-1k )/h > 0) is

A delta t K (uik - ui-1k )/h.

Therefore, the heat from diffusion is

A delta t K (ui+1k - uik )/h - A delta t K (uik - ui-1k )/h.

The heat from the source is

A h delta t f.

The heat content of the the volume Ah at time k delta t is

rho c uik Ah

where rho is the density and c is the specific heat.

By combining these we have the following approximation of the change in the heat content for the small volume Ah:

rho c uik+1 Ah - rho c uik Ah = A h delta t f + A delta t K (ui+1k - uik )/h - A delta t K (uik - ui-1k )/h.

Now, divide by rho cAh, define alpha = (K/(rho c) (delta t/h2 ) and explicitly solve for uik+1.

Explicit Finite Difference Model of Heat Transfer.

uik+1 = delta t/(rho c) f + alpha (ui+1k + ui-1k) + (1 - 2 alpha )uik (1)
where
i = 1,...,n-1,
k = 0,...,maxit-1,

ui0 = 0 for i = 1,...,n-1 and (2)
u0k = unk = 0 for k= 1,...,maxit. (3)

Equation (2) is the initial temperature set equal to zero and (3) is the temperature at the left and right ends set equal to zero.

Equation (1) may be put into the matrix version of the first order finite difference method. For example, if the wire is divided into four equal parts, then n = 4 and (1) may be written as 3D vector equation

uk+1 = A uk + b
where

.

An extremely important restriction on the time step delta t is required to make sure the algorithm is stable. For example, consider the case n = 2 where the above is a scalar equation and we have the simplest first order finite difference model. Here a = 1 - 2 alpha and we must require |a| < 1. If a = 1 - 2 alpha 0 and alpha > 0, then this condition will hold. If n is larger than 2, this simple condition will imply that the matrix products Ak will converge to the zero matrix. This will make sure there are no blowups provided the source term f is bounded. The illustration of the stability condition, and an analysis will be presented later.

Stability Condition for (1).

1 - 2 alpha 0 and alpha > 0


1.3.4.   Method.

In order to compute all uik+1, which we will henceforth denote by u(i,k+1) with both i and k shifted up by one, we must use a nested loop where the i loop (space) is inside and the k loop (time) is the outer loop. This is illustrated in the following figure by the dependency of u(i,k+1) on the three previously computed u(i-1,k), u(i,k) and u(i+1,k) .

Figure: Time-Space Grid

The following algorithm takes advantage of all the zeros in each row of the matrix-vector product version of this model.

Explicit Algorithm for Problem (1), (2) and (3).

        define f, n, maxit, rho, c, cond, dt, dx
        kappa = cond/(rho*c)
        alpha = kappa*dt/(dx*dx)  (choose dt so than 1 - 2*alpha  0)
        for i = 1,n+1
            u(i,1) = 0
        endloop
        for k = 1,maxit+1
            u(1,k) = 0
            u(n+1,k) = 0
        endloop

        for k = 1,maxit
            for i = 2,n
                u(i,k+1) = dt/(rho*c)*f + alpha*(u(i-1,k) + u(i+1,k)) +
                                          (1 - 2*alpha)*u(i,k)
            endloop
       endloop.


1.3.5.   Implementation.

We give three implementations for the explicit heat transfer algorithm for heat transfer in a thin wire. The first is the Maple code which illustrates the stability condition on the discretization. The second is in Fortran and illustrates the speedup due to the proper use of the vector pipelines. The third uses Matlab.

The Maple code makes use of the matrixplot procedure to visualize the temperature on the vertical axis with the time on the column axis and space on row axis. In the code the outer k loop is the time loop, and the inner i loop is with respect to the space nodes.

Maple Code for Heat Transfer in a Thin Wire.


  Procedure is Defined:
	> HEAT1D:=proc(f,Kappa,dt,dx,n,maxit)
        > with(linalg):
	> with(plots):
	> u:=matrix(n+1,maxit+1,0):
	> alpha := Kappa*dt/(dx*dx):
	> for k from 1 to maxit do
	>     for i from 2 to n do
	>         u[i,k+1]:= dt*f + alpha*(u[i-1,k] + u[i+1,k]) +
	>                           (1-2*alpha)*u[i,k];
	>     od;
	> od;
	> matrixplot(u,style=wireframe,axes=normal);
	> end;

  Input Data:
	> f:=1:
        > Kappa:=.001:
	> n:=10:
	> dt := 5:
	> dx := .1:
	> n:=10:
	> maxit:= 30:

Call Procedure and Graph Outputs:

The first calculation uses the above data. Note the stability condition holds. The solution remains bounded over the time interval from 0 to 150.

	> HEAT1D(f,Kappa,dt,dx,n,maxit);

Figure: Stable Computation of Temperature

The second calculation increases the time step to 6 so that the stability condition does not hold. Note that significant oscillations develop.

	> HEAT1D(f,Kappa,6.0,dx,n,30);

Figure: Unstable Computation of Temperature

The third calculation involves the old shorter time step of 5 and more time iterations. Note that a steady state solution has been reached.

	> HEAT1D(f,Kappa,5.0,dx,n,100);

Figure: Steady State Temperature

Fortran Code with Vectorization for Heat Transfer in a Thin Wire.


      Program heat
      dimension u(70,100)
      real tt1(2),tt2(2)
      n = 30
      maxit = 30
      f = 1.0
      kappa = .001
      dt = 5
      dx = .1
      alpha = kappa*dt/(dx*dx)
      do 10 k = 1,maxit
            do 12 i = 1,n+1
                  u(i,k) = 0.0
   12       continue
   10 continue
      tbegin = etime(tt1)
      do 20 k = 1,maxit
cvd$l vector
            do 30 i = 2,n
                  u(i,k+1) = dt*f + alpha*(u(i-1,k) + u(i+1,k)) +
     $                (1 - 2*alpha)*u(i,k)
   30       continue
   20 continue
      tend = etime(tt2)
      print*,  'time =', tend - tbegin
      stop
      end

The reason for using vectorization is to achieve speedup. The above code was run on the Alliant FX/40 computer. The in code directive "cvd$l vector" before the beginning of loop 30 instructs the compiler to use the vector pipeline on loop 30. Similar in code directives are used on other computers. Note the loop index for loop 30 is i, and this is a row number in the array u. This ensures that the vector pipe can sequentially access the data in u. The following table indicates increased speedup as the length of loop 30 increases. This is because the startup time relative to the execution time is decreasing. The Alliant FX/40 has limit of four for the speedup, and the Cray Y-MP has more segments in its vector pipeline with speedups up to ten.

Table: Times (10-3 sec.) and Speedups
  Loop Length     Serial Time     Vector Time     Speedup  
30 04.07 2.60 1.57
62 06.21 2.88 2.16
126 11.0 4.04 2.72
254 20.7 6.21 3.33

The next implementation uses Matlab. A nonzero initial temperature and zero temperature at the boundaries are used. Again, the temperature is stored in a 2D array for the purpose of generating a graph of the temperature by using Matlab's intrinsic mesh. Note, the temperature converges to zero, the steady state temperature. The reader should experiment with the mesh size and observe the stability condition for the explicit method.

Matlab Code for the Heat Equation.

        l=1.0;
        t=.1;
        K=20;
        dt=t/K;
        n=10;
        dx=l/n;
        b=dt/(dx*dx);
        cond=1.0;
        spheat=1.0;
        rho=1.0;
        a=cond/(spheat*rho);
        d=a*b;
        %set initial conditions
        for i=1:n+1
            u(i,1)=10*sin(pi*((i-1)*dx));
        end
        %set boundary conditions
        for k=1:K+1
            u(1,k)=0;
            u((n+1),k)=0;
        end
        %calculate heat
        for k=1:K
            for i=2:n;
                u(i,k+1)=(1-2*d)*u(i,k)+d*(u(i-1,k)+u(i+1,k));
            end
        end
        mesh(u)

Figure: Temperature as Function of Space and Time


1.3.6.  Assessment.

The heat conduction in a thin wire has a number of approximations. Different mesh sizes in either the time or space variable will give different numerical results. However, if the stability conditions holds and the mesh size decreases, then the numerical computations will differ by smaller amounts.

The numerical model assumed that the surface of the wire was thermally insulated. This may not be the case, and one may use the discrete version of Newton's law of cooling by inserting a negative source term of

C(usur - uik ) h pi 2r dt
where r is the radius of the wire.

The constant C is a measure of insulation, and C = 0 corresponds to perfect insulation. The h pi 2r is the lateral surface area of the volume hA where A = pi r2. The modified version of (1) is

uik+1 = delta t/(rho c) f + alpha (ui+1k + ui-1k) + (1 - 2 alpha)uik
    + C(usur - uik) h pi 2r delta t/(rho c h pi r2)

= delta t/(rho c) (f +C(usur 2/r ))+ alpha (ui+1k + ui-1k)
    + (1 + (delta t/(rho c)C 2/r - 2 alpha)uik .

Other variations on the model include more complicated boundary conditions, variable thermal properties and diffusion in more than one direction.

In the scalar version of first order finite difference models the scheme was stable when |a| < 1. In this case, yk+1 converged to the steady state solution y = a y + b. This is also true of the matrix version of (1) provided the stability condition is satisfied. In this case the real number a will be replaced by the matrix A and Ak will converge to the zero matrix. The following is a more general statement of this.

Steady State Theorem. Consider the matrix version of the first order finite difference equation

yk+1 = A yk + b where A is a square matrix.

If Ak converges to the zero matrix and y = Ay + b, then yk converges to y.

Proof. Subtract yk+1 = A yk + b and y = Ay + b and use the properties of matrix products to get

yk+1 - y = A(yk - y)

= A(A(yk-1 - y))

= A2(yk-1 - y)

.
.
.

= Ak+1(y0 - y).

Since Ak converges to the zero matrix, the column vectors yk+1 - y must converge to zero.


1.3.7.   Homework.

  1. Consider the timing formula for a four segment pipeline whose start up time is 30 nsec. and whose clock cycle time is 6 nsec. If the serial startup time is 1 nsec., find N so that the speedup is 1. Why is this important?

  2. Write a computer program and observe the consequences of not satisfying the stability condition.

  3. Experiment with the mesh size and observe convergence as the mesh size decreases.

  4. Consider the above variation on the thin wire where a variation on Newton's law of cooling models the heat loss through the surface of the wire. Vary the C term. Vary the r term. Explain your computed results.

  5. Consider the 3 by 3 A matrix. Observe Ak for different values of alpha so that the stability condition either does or does not hold.