# HPC - Algorithms and Applications - Winter 12

**Term**- Winter 12/13
**Lecturer**- Prof. Dr. Michael Bader
**Time and Place**- Lecture: Monday, 14.15-15.45, MI 02.07.023 (starts Oct 22)
- Tutorial: Wednesday, 10-12, MI 02.07.023 (starts Nov 7)
**Audience**- Elective topic in Informatik Bachelor/Master/Diplom subject area
*Algorithms and Scientific Computing* - students in mathematics or in any science or engineering discipline are welcome!
**Tutorials**- Oliver Meister
**Exam**- Monday, Feb 4, 2013; 14-16 in room MI 02.07.023 (time and room of the lecture) - Review: Wednesday, March 27th/March 6th; 10am to 12am in LRZ Room E.2.040
**Semesterwochenstunden / ECTS Credits**- 3 SWS (2V + 1Ü) / 4 ECTS
**TUMonline**- https://campus.tum.de/tumonline/lv.detail?clvnr=950078529

## Contents

# News

- This lecture will be held for the first time in winter term 2012/13, however, will be similar in content to the lecture Algorithms of Scientific Computing II - Winter 11.
- Together with the lecture High Performance Computing - Programming Paradigms and Scalability (formerly known as "Parallel Programming and HPC") this lecture will form a 2-semester track of HPC lectures
- The
**exam grades**are now available in TUM Online. An exam review will take place on**Wednesday, Feb 27th**and**Wdnesday, Mar 6th**from 10am to 12am in LRZ Room E.2.040.

# Content

The lecture will have a focus on parallel algorithms and implementation techniques in the field of numerical simulation and high performance computing, such as:

- linear algebra problems on dense and sparse matrices
- simulation on structured and unstructured meshes
- particle-based simulations (with long-range and short-range interactions)
- spectral methods (parallel FFT and related algorithms)
- Monte Carlo and statistical methods

(a.k.a. *the seven dwarfs of HPC*).

The accompanying tutorials will include practical assignments, and will concentrate on the programming of GPU and accelerator platforms.

# Lecture Material

Lecture slides will be published here after the lessons:

- Oct 22: Intro
- Oct 22, Oct 29: Fundamentals - Parallel Architectures, Models, and Languages
- Oct 29, Nov 5, Nov 12: Dwarf No. 1 - Dense Linear Algebra;
- additional material: article by Elmroth et al. in SIAM Review
- chapter In-core dense matrices of the ScaLAPACK User's Guide

- Nov 19, Nov 26: Dwarf No. 5 - Structured Grids
- articles by M. Frigo and V. Strumpen:

Cache oblivious stencil operations (preprint);

The memory behavior of cache oblivious stencil operations (preprint can be found via Google) - article by K. Datta et al. in SIAM Review (preprint)

- articles by M. Frigo and V. Strumpen:
- Nov 26, Dec 3: Structured Grids and Space-filling Curves, Solving the Shallow Water Equations
- Maple worksheets: hilbert_adap.mw (also as PDF);

- Dec 10: Dwarf No. 6 - Unstructured Grids and Partitioning
- Dec 17: Partitioning and Space-filling Curves revisited
- Maple worksheets: hilbert-arith-part.mw (also as PDF); reftree_hilbert_vertexlab.mw (also as PDF);
- related article by Mitchell: journal article, preprint, webpage by W. Mitchell on reftree method

- Jan 7: Dwarf no. 2 - Sparse Linear Algebra: Application example (page rank) and data structures
- Jan 14: Parallel Sparse Matrix-Vector Multiplication
- lecture material accompanying the book by R. Bisseling (see esp. the slides psc4_3.pdf, psc4_4.pdf, psc4_5.pdf, and psc4_6.pdf)

- Jan 21: Dwarf no. 4: N-body methods and implementation
- Maple worksheet: twobody.mw (also as PDF)
- article on Fast Multipole methods by C.R. Anderson
- article by Barnes & Hut in Nature (both articles can be accessed via TUM ebib-access)

# Tutorials

Roughly every second week a two hour tutorial will take place (details at page top; days and time will be announced here and in the lectures). The assignments and their solutions will be gradually posted here.

- Nov 7th: Introduction to CUDA

Source code for Exercise 1 Solution

Source code for Exercise 2 Solution

- Nov 28th: The Shallow Water Equations and CUDA

Source code for Exercise 3 Solution

- Dec 13th: Further Topics on SWE and CUDA

Source code for Exercise 4 Solution

- Jan 9th: Sparse Linear Algebra in CUDA

Source code for Exercise 5 (my.mtx and resp. solution included) Solution

- Jan 23rd: Solving the heat equation with CUDA

Source code for the heat equation solver

## Repeat Exam

- the repeat exam will be a written exam, if more than 12 students register for the exam; otherwise it will be an oral exam
- a written exam would take place on April, 16, from 16.00 (room to be announced); oral exam will be on this day and further days in the same week
**no helping material of any kind**will be allowed for the exam- please make sure that you register for the exam in TUMonline
- the exam will extend over all topics discussed in the lectures and tutorials:
- approx. 30% of the questions will deal with questions related to the tutorials; basic knowledge about GPU programming with CUDA is thus necessary
- the part on tree methods (Barnes-Hut and Fast Multipole algorithms) for N-body problems are excluded from the exam

# Literature and Online Material

- R.H. Bisseling: Parallel Scientific Computing - A structured approach using BSP and MPI, Oxford University Press, 2004.
- Course notes on Rob Bisseling's lecture on Parallel Algorithms (based on the text book)

- T.G. Mattson, B.A. Sanders, B.L. Massingill: Patterns for Parallel Programming, Addison-Wesley, 2005
- D.B. Kirk, W.W. Hwu: Programming Massively Parallel Processors - A Hands-on Approach, Morgan-Kaufman, 2010
- J. Sanders, E. Kandrot: CUDA by Example, Addison-Wesley, 2011

# Prerequisites

Lecture *IN0019 Numerical Programming* or similar basic knowledge in numerical methods. Basic knowledge in parallel programming (lecture *Parallel Programming*, *Parallele Algorithmen und Höchstleistungsrechnen*, or similar) is helpful (as is a certain interest in problems from scientific computing and numerical simulation), but not strictly required.