## Computer Science Modules

Selected Application Areas of Matrix Computations in Computer Science and Engineering

Module Author: Ronald F. Taylor & Rohithia Muppavarapu

Author Contact: ronald.taylor@wright.edu

Funded by: National Science Foundation

Many students taking a second course in linear algebra with a computational orientation find it very helpful to see an overview of concrete areas of application. This module contains two such applications. The first one is a general introduction to the Google PageRank algorithm showing why numerical eigenvalue computation is relevant. It demonstrates the need to work with large matrices that have special properties and sparseness features in a way that makes efficient use of computer resources. Many matrix problems need to be updated frequently so that useful results can be quickly obtained. The importance of topics such as stochastic matrices and the power method to obtain the dominant eigenvalue and eigenvector is shown. A second application is that of a spring-mass system with design updates. The focus is on solving eigenvalue and linear systems of equations using MATLAB. The matrices are real symmetric and positive definite. The two applications are tied by the common theme of the necessity of adjustments or frequent updates to the matrix equations. The new systems must be resolved making effective use the current solution, existing data, and special properties of the matrices.

Parallel Algorithms in Bioinformatics -- A First Introduction

Module Author: Eric Stahlberg

Author Contact: eas@osc.edu

Funded By: The National Science Foundation

This module provides an introduction to parallel programming with applications in bioinformatics. The module will review both the basic concepts of bioinformatics, algorithms and parallel programming. Examples are presented in an algorithmic fashion for use in multiple parallel programming environments. Implementations provided concentrate on multicore architectures with examples in OpenMP and Pthreads. Mathematical concepts such as graph theory will be introduced as needed to develop the relationship between the experimental data and the algorithmic solution to be implemented. Basic overviews of datamining concepts will also be included in the module.

Parallel Programming with MPI: Student

Parallel Programming with MPI: Instructor

Module Author: David G. Robertson

Author Contact: drobertson@otterbein.edu

Funded By: The National Science Foundation

This module introduces some basic concepts and techniques of parallel computing in the context of simple physical systems. It focuses on the distributed memory model of parallel computation and uses MPI (Message Passing Interface) as the programming environment. The physical systems studied illustrate naturally the idea of domain decomposition. Besides learning the basics of MPI programming, students can explore a range of issues related to performance in parallel computation. This module is designed for students with fairly sophisticated programming skills in C, C++ or Fortran; facility with the Unix operating system will most likely also be required. The students should ideally have taken (or be taking) an upper-level course in classical mechanics, as some familiarity with systems of coupled oscillators will be helpful. Some moderately exotic computing hardware is required, namely access to a distributed-memory parallel computing system. Virtually all such systems will use the Unix operating system and some sort of batch processing; these tools will need to be mastered. Finally, a tool for visualization will be very useful. Gnuplot is a freely available program that allows the necessary visualization, and its use is described below. Other tools such as Vtk, Matlab and Mathematica could also be used if desired.

Optimization and Mathematical Programming: Modeling and Problem-Solving: Student

Optimization and Mathematical Programming: Modeling and Problem-Solving: Instructor

Module Author: James L. Noyes

Author Contact: jnoyes@wittenberg.edu

Funded By: The National Science Foundation

This interactive computational science module addresses the area of applied optimization and discusses the importance of modeling in solving science, engineering, and management problems. When constraints are involved, optimization is also known as mathematical programming (not to be interpreted as computer programming) or constrained optimization. Here one needs to find the maximum or minimum of a given multivariable function possibly subject to a set of multivariable inequality and equality constraints. The focus of this module will be to describe the main types of unconstrained and constrained optimization models, analyze a given problem statement, identify the type of optimization involved, use a problem-solving approach to develop the appropriate optimization model, solve this model, confirm and interpret the results, and in some cases to indicate any sensitivities of interest.

Numerical Computation with Mathematica: Basic Analysis and Visualization: Student

Numerical Computation with Mathematica: Basic Analysis and Visualization: Instructor

Module Author: James L. Noyes

Author Contact: jnoyes@wittenberg.edu

Funded By: The National Science Foundation

This interactive computational science module addresses the area of effective numerical computation. Each topic covered in this module is as independent as possible from the other topics. When some dependence exists, the prerequisite topics have always been presented first. Hence, if all topics are to be covered (e.g., as part of a curriculum), they should be covered in the order presented. On the other hand, the appendix material can be covered whenever it is needed.**Steady State Heat Flow Problem (contact the module author for access to the module materials)**

Module Author: Robert Marcus

Author Contact: rmarcus@centralstate.edu

Funded By: The National Science Foundation

The goal of this module is to design and write a scalable parallel algorithm which determines the steady state heat flow through a rectangular metal sheet with fixed boundary conditions. As heat flows across the metal sheet there is no loss of heat vertically from the surface of the sheet. The temperatures on the boundary will remained fixed: 100o C on three edges and 0o C on the fourth edge. A graphical image of the steady state heat will be displayed using MATLAB. The parallel algorithm demonstrates a useful technique which uses what is called "ghost cells" that may be useful for implementing other algorithms that use difference equations that require the exchange of data between adjacent tasks in a parallel partition.

Computable Performance Metrics: Module 0: Floating Point Precision

Computable Performance Metrics: Module 0: Supporting Documents I

Computable Performance Metrics: Module 0: Supporting Documents II

Module Author: Kris Stewart

Author Contact: stewart@sdsu.edu

Funded By: W.M. Keck Foundation

In the 21st century, most computers are based on using chips of either 32 bit (single precision) or 64 bit (double precision) datapaths. In the early days of computing, the different manufacturers, for example the Univac 1108, Honeywell 6000, PDP-11, Control Data 6600, Cray-1, Illiac-IV, Burroughs B5500, Hewlett Packard HP-45, Texas Instruments SR-5x, IBM 370 or Telefunken TR440, used base-2 (binary), base-8 (octal), base-10 (decimal), base-16 (hexadecimal) arithmetic systems, as summarized in the table in Chapter 2 [4] of an excellent 1977 numerical computing text. A thorough overview of floating-point is presented in “What Every Computer Scientist Should Know about Floating-Point” [5] for details on the current status. Before we examine accuracy in performance calculations, we explore the actual arithmetic capability of the computing system used to gather the performance data in later modules. A computational way was established definitively by Michael Malcolm [6] in 1972, and further refined by W.J. Cody [7] in 1988 to compute machine hardware properties.

Computable Performance Metrics: Module 1: Floating Point Precision: MACHAR – Test for IEEE 754

Computable Performance Metrics: Module 1: Supporting Documents

Module Author: Kris Stewart

Author Contact: stewart@sdsu.edu

Funded By: W. M. Keck Foundation

Module 0: Floating Point Precision covered the task to find a “computable” way to detect the actual manner in which a computer performed arithmetic, developed by W.J. Cody [1] in 1988. In that module, a minimal test was examined to compute a machine’s unit round-off, the value eps so that 1.0 + eps = 1.0. Considering the extent to which the FORTRAN subroutine MACHAR can compute the properties of a processor, some may wish to examine the full range of values computed. This can be used to test if the user’s computer platform actually follows the IEEE 754 Floating Point Standard [2].

Computable Performance Metrics: Module 2: Computing Error and Work Estimators

Computable Performance Metrics: Module 2: Supporting Documents

Module Author: Kris Stewart

Author Contact: stewart@sdsu.edu

Funded By: W. M. Keck Foundation

Many of the calculations performed these days benefit from the methods of numerical analysis with their proven error analysis. An excellent text is Numerical Methods and Software [1] providing a careful balance between algorithms and their usefulness on modern computers, the development of the error properties of these algorithms and the impact of the implementation of the algorithm in a computer language, FORTRAN, in a portable manner so that the text, and its codes, can be applied to the wide variety of computing systems available now. The current module will augment this treatment by pursuing computationally measurable performance metrics, such as accuracy and cost that can be computed during the solution process.

Traveling Salesperson Problem: A Parallel Approach

Module Author: Ignatios Vakalis

Author Contact: ivakalis@csc.calpoly.edu

Funded By: National Science Foundation (9952806)

The Traveling Salesperson (or Salesman) Problem (TSP) is defined as the problem of finding the shortest “closed tour” through a given set of cities by visiting each city exactly once. It has commanded much attention of mathematicians and computer scientists, because it is so easy to describe but so difficult to solve. Originally formulated as the problem of finding the shortest route for a traveling salesman to visit all of his customers, the problem has found many important applications such as automating the routing robots through warehouses, sending couriers to automatic teller machines, and drilling holes through printed circuit boards. Optimal (non-exact) solutions can be computed by employing methods such as: Branch & Bound, Branch & Cut, or by using Genetic Algorithms. The objective of this module is to develop a parallel approach, suitable for distributed memory architectures, for the exact solution to the TSP.

The Edit Distance Problem: A Parallal Approach for Distributed Memory Machines

Module Author: Ignatios Vakalis

Author Contact: ivakalis@csc.calpoly.edu

Funded By: National Science Foundation (9952806)

Explosive growth in biological sequence data from sources like the Human Genome Project (HGP) has increased the need for efficient, scaleable algorithms in the area of string manipulation. Various algorithms addressing global and local alignment of string information aid life scientists to make decisions on relatedness of organisms. Extensive research has been executed in this field since the 1970's. This project examines the “edit distance” problem that has applications in global alignment of strings. This problem is useful in gauging the level of "relatedness" between varying sequences, of DNA, RNA, or protein. The module examines the design of an algorithm (using a dynamic programming approach) and investigates the potential for increased efficiency by applying parallelism on distributed memory machines.

N-Body Problem: A Parallel Programming Approach

N-Body Problem: A Parallel Programming Approach: Appendix

N-Body Problem: A Parallel Programming Approach: Appendix

Module Author: Ignatios Vakalis

Author Contact: ivakalis@csc.calpoly.edu

Funded By: National Science Foundation (9952806)

The n-body problem is a generalization of the classical three-body problem. The latter describes the motion of three point masses in the plane under their mutual Newtonian gravitation. In general terms, the n-body can be stated as follows: given the initial state of a system of individual particles interacting through long-range forces, predict the state of the system at some arbitrary time in the future. The problem usually appears in the context of gravitational (astrophysics) or electrostatic (molecular dynamics and plasma physics) forces. The module specifically addresses the problem from an astrophysical perspective, although the concepts and algorithms are relatively general and could be easily adapted.

Mandelbrot Set: A Parallel Programming Approach

Module Author: Ignatios Vakalis

Author Contact: ivakalis@csc.calpoly.edu

Funded By: National Science Foundation (9952806)

Fractals are defined as mathematical objects with several characteristics, such as: i) self-similarity ;and ii) formation by iteration. The Mandelbrot set is a fractal formed by iterations of a complex-valued polynomial function. The module presents the mathematical formulation and the computer implementation for generating the Mandelbrot set. A parallel code, suitable for implementation over a MIMD distributed memory machine (e.g., collection of networked workstations) is presented. Three load-balancing strategies (static, dynamic, and recursive boundary) are being employed and compared. Speedup analysis is being performed for each load-balancing strategy