HPC - Algorithms and Applications - Winter 13
- Term: Winter 13/14
- Lecturer: Prof. Dr. Michael Bader
- Time and Place:
- Lecture: Monday, 14.00-15.30, MI 02.07.023 (starts Oct 21)
- Tutorial: Wednesday, 10-12, MI 02.07.023 (starts Oct 23, roughly bi-weekly)
- Audience: elective topic in Informatics Bachelor/Master; students in mathematics or in any science or engineering discipline are welcome!
- Tutorials: Oliver Meister
- Exam: written exam, Wednesday, Feb 5, 2014, 10-12 in room MI 02.07.023 (time and room of the tutorial)
- Weekly semester hours (SWS) / ECTS Credits: 3 SWS (2 lecture + 1 tutorial) / 4 ECTS
- TUMonline: https://campus.tum.de/tumonline/lv.detail?clvnr=950111465 (lecture),
https://campus.tum.de/tumonline/wbStpModHB.detailPage?&pKnotenNr=705979 (module description)
- From Nov 18, the lecture on Monday will start at 14.00 (instead of 14.15)
The lecture focuses on parallel algorithms and implementation techniques in numerical simulation and high performance computing, such as:
- linear algebra problems on dense and sparse matrices
- simulation on structured and unstructured meshes
- particle-based simulations (with long-range and short-range interactions)
- spectral methods (parallel FFT and related algorithms)
- Monte Carlo and statistical methods
(a.k.a. the seven dwarfs of HPC).
The accompanying tutorials will include practical assignments, and will concentrate on the programming of GPU and accelerator platforms.
Slides and exercise sheets/solutions will be made available during the lecture.
Lecture slides will be published here after the lessons (see also the lecture from winter term 2012/13):
- Oct 21: Intro
- Oct 21, Oct 28, Nov 4: Fundamentals - Parallel Architectures, Models, and Languages
- additional material: Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures (technical report by Williams et al.)
- MPI examples: Cannon's Algorithm mpi_cannon.c (unsafe send/receive), mpi_cannon_sr.c (using MPI_Sendrecv), mpi_cannon_nbl.c (non-blocking communication)
- Oct 28, Nov 4, Nov 11: Dwarf No. 1 - Dense Linear Algebra
- Nov 18: Dwarf No. 2 - Sparse Linear Algebra: Application example (page rank) and data structures
- Nov 25, Dec 2: Parallel Sparse Matrix-Vector Multiplication
- Dec 9, Dec 16: Dwarf No. 5 - Structured Grids
- articles by M. Frigo and V. Strumpen:
Cache oblivious stencil operations (preprint);
The memory behavior of cache oblivious stencil operations (preprint can be found via Google)
- article by K. Datta et al. in SIAM Review (preprint)
- Dec 16, Dec 18: Structured Grids and Space-filling Curves (will not be part of the exam)
- Jan 13: Dwarf No. 6 - Unstructured Grids and Partitioning
- Jan 20, Jan 27: Dwarf No. 4 - N-body Methods and Implementation
- Maple worksheet: twobody.mw (also as PDF)
- article on Fast Multipole methods by C.R. Anderson
- article by Barnes & Hut in Nature (both articles can be accessed via TUM ebib-access)
- Feb 3: "all questions answered" (exam preparation)
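The sparse linear algebra lectures (Nov 18 to Dec 2) deal with data structures for sparse matrices and with sparse matrix-vector multiplication. As a rough illustration only, and not taken from the lecture slides, the following Python sketch shows a sequential multiplication in compressed row storage (CSR), one common sparse format. The function name `csr_matvec`, the array names, and the example matrix are assumptions chosen for this sketch.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A * x for a matrix A stored in CSR format.

    values  -- nonzero entries of A, stored row by row
    col_idx -- column index of each entry in `values`
    row_ptr -- row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):  # rows are independent of each other
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical example matrix:
#     [4 0 1]
# A = [0 3 0]
#     [2 0 5]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Because each row of the result depends only on its own slice of `values` and `col_idx`, the outer loop is the natural place to distribute work among threads or processes; a parallel version would mainly have to decide how rows are assigned and how the entries of `x` are accessed.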
Roughly every second week, a two-hour tutorial will take place (details at the top of the page; days and times will be announced in TUMonline and in the lectures). The assignments and their solutions will be posted here gradually.
| Date | Topic | Worksheet | Exercise | Solution |
|---|---|---|---|---|
| Oct 23rd | Organizational remarks | - | - | - |
| Nov 6th | Introduction to CUDA | Worksheet 1 | Exercise 1 | Solution 1 |
| Nov 13th | Further details on Dense LA in CUDA | Worksheet 2 | Exercise 2 | Solution 2 |
| Nov 27th | Sparse LA in CUDA | Worksheet 3 | Exercise 3 | Solution 3 |
| Dec 11th | Solving the heat equation with CUDA | Worksheet 4 | Exercise 4 | Solution 4 |
| Jan 8th | The Shallow Water Equations and CUDA | Worksheet 5 | Exercise 5 | Solution 5 |
| Jan 22nd | Further topics on SWE and CUDA | Worksheet 6 | - | - |
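To give an idea of the kind of computation behind the heat equation tutorial, here is a minimal Python sketch of one explicit time step. It is not the worksheet's code: the one-dimensional setting, the explicit Euler scheme, the fixed boundary values, and the names `heat_step` and `r` are all assumptions made for this illustration.

```python
def heat_step(u, r):
    """One explicit Euler step for the 1D heat equation u_t = alpha * u_xx.

    r = alpha * dt / dx**2 (the scheme is stable for r <= 0.5).
    The two boundary values are kept fixed.
    """
    new = list(u)
    for i in range(1, len(u) - 1):  # every interior point can be updated independently
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


# Hypothetical initial state: a single heat spike in the middle of the rod.
u = [0.0, 0.0, 1.0, 0.0, 0.0]
u = heat_step(u, 0.25)
```

Each interior point reads only its two neighbours from the previous time step, so all updates within one step are independent. This is the property that makes such stencil updates a good fit for one-thread-per-grid-point parallelization on a GPU.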
- Exam review: you may view your corrected and graded exam on Thursday, Feb 13, 15.30 (office of Prof. Bader, Leibniz Supercomputing Centre, room E.2.044)
- written exam on Feb 5, 2014, from 10.15 (room MI 02.07.023)
- please be in (or in front of) the lecture room on time (at 10.00); the exam will start at 10.15 at the latest, and there will be announcements before the start!
- no aids of any kind will be allowed in the exam
- please make sure that you register for the exam in TUMonline
- the exam will cover all topics discussed in the lectures and tutorials:
- approx. 30% of the questions will relate to the tutorials; basic knowledge of GPU programming with CUDA is thus necessary
- The following topics will be excluded as topics of the exam:
Literature and Online Material
- R.H. Bisseling: Parallel Scientific Computing - A structured approach using BSP and MPI, Oxford University Press, 2004.
- Course notes on Rob Bisseling's lecture on Parallel Algorithms (based on the text book)
- V. Eijkhout: Introduction to High-Performance Scientific Computing (textbook, available as PDF on the website)
- T.G. Mattson, B.A. Sanders, B.L. Massingill: Patterns for Parallel Programming, Addison-Wesley, 2005
- G. Hager, G. Wellein: Introduction to High Performance Computing for Scientists and Engineers, Chapman & Hall/CRC Computational Science, 2010
Books on CUDA
- D.B. Kirk, W.W. Hwu: Programming Massively Parallel Processors - A Hands-on Approach, Morgan Kaufmann, 2010
- J. Sanders, E. Kandrot: CUDA by Example, Addison-Wesley, 2011
Helpful, but not strictly required, is knowledge of:
- basics of numerical methods (e.g.: lecture IN0019 Numerical Programming or similar)
- basics of parallel programming (lecture Parallel Programming, HPC - Programming Paradigms and Scalability, or similar)
Most important is a certain interest in problems from scientific computing and numerical simulation!