HPC - Algorithms and Applications - Winter 14
- Term
- Winter 14/15
- Lecturer
- Prof. Dr. Michael Bader
- Time and Place
- Lecture: Monday, 14.00-15.30, MI 02.07.023 (intro lecture on Oct 8; regular lectures start from Oct 13);
Tutorial: Wednesday, 12.00-14.00, MI 02.07.023 (starts Oct 8, roughly bi-weekly) - Audience
- Elective topic in Informatics Bachelor/Master: students in mathematics or in any science or engineering discipline are welcome!
- Tutorials
- Oliver Meister
- Exam
- written exam, repeat exam on Monday, Apr 13, 2015; 17.15-19.00 in room MI 02.07.023;
- Semesterwochenstunden / ECTS Credits
- 3 SWS (2V + 1Ü) / 4 ECTS
- TUMonline
- https://campus.tum.de/tumonline/lv.detail?clvnr=950160193 (lecture)
https://campus.tum.de/tumonline/wbStpModHB.detailPage?&pKnotenNr=705979 (module description)
Contents
Annonuncements
- The course will start on Wednesday, Oct 8 with an overview/introduction and some organizational issues (accounts for tutorials, e.g.)
Content
The lecture will have a focus on parallel algorithms and implementation techniques in the field of numerical simulation and high performance computing, such as:
- linear algebra problems on dense and sparse matrices
- simulation on structured and unstructured meshes
- particle-based simulations (with long-range and short-range interactions)
- spectral methods (parallel FFT and related algorithms)
- Monte Carlo and statistical methods
(a.k.a. the seven dwarfs of HPC).
The accompanying tutorials will include practical assignments, and will concentrate on the programming of GPU and accelerator platforms.
Lecture Material
Lecture slides will be published here after the lessons: See also the lecture from winter term 2013/14.
- Oct 8: Intro
- Oct 13,20,27: Fundamentals - Parallel Architectures, Models, and Languages
- read the related paper: Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures (technical report by Williams et al.; published in Communications of the ACM 52 (4), 2009, p.65-76)
- MPI examples: Cannon's Algorithm mpi_cannon.c (unsafe send/receive), mpi_cannon_sr.c (using MPI_Sendrecv), mpi_cannon_nbl.c (non-blocking communication)
- Oct 13,20,27: Dwarf No. 1 - Dense Linear Algebra
- read the related paper: article by Elmroth et al. in SIAM Review (access via TUM eAccess, if necessary)
- chapter In-core dense matrices of the ScaLAPACK User's Guide
- Nov 3: Dwarf no. 2 - Sparse Linear Algebra: Application example (page rank) and data structures
- Nov 10, 12, 24: Parallel Sparse Matrix-Vector Multiplication
- lecture material accompanying the book by R. Bisseling (see esp. the slides psc4_3.pdf, psc4_4.pdf, psc4_5.pdf, and psc4_6.pdf)
- Nov 24, Dec 1: Dwarf No. 5 - Structured Grids
- articles by M. Frigo and V. Strumpen:
Cache oblivious stencil operations (preprint);
The memory behavior of cache oblivious stencil operations (preprint can be found via Google) - article by K. Datta et al. in SIAM Review (preprint)
- articles by M. Frigo and V. Strumpen:
- Dec 8, 15: Dwarf No. 6 - Unstructured Grids and Partitioning
- additional material: article by Karypis and Kumar: A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs
- additional material: article by Hendrickson and Leland: A Multilevel Algorithm for Partitioning Graphs (this article was awarded the "Test of Time Award" at the 2014 Supercomputing Conference, SC14)
- Dec 15, 22: Dwarf No. 5 revisited Structured Grids and Space-filling Curves
- IPython Notebook worksheets: Hilbert_Plotter.ipynb, sfc_hilbert_plotter_adp.ipynb
- Maple worksheets: hilbert_adap.mw (also as PDF);
- Jan 7, 19: Dwarf no. 4: N-body methods and Barnes-Hut and Fast Multipole algorithm
- article by Barnes & Hut in Nature (article can be accessed via TUM ebib-access)
- article by C.R. Anderson (Fast Multipole without multipoles (article can be accessed via TUM ebib-access)
Tutorials
Roughly every second week a two hour tutorial will take place (details at page top; days and time will be announced in TUMonline and in the lectures). The assignments and their solutions will be gradually posted here.
Date | Slides | Worksheet | Source | Source (solution) |
Oct 8th | Organizational remarks | - | - | - |
Oct 22nd | Introduction to CUDA | Worksheet 1 | Exercise 1 | included in Exercise 2 |
Nov 5th | Further details on Dense LA in CUDA | Worksheet 2 | Exercise 2 | Solution 2 |
Nov 19th | Sparse LA in CUDA | Worksheet 3 | Exercise 3 | included in Exercise 4 |
Dec 3rd | Solving the heat equation with CUDA | Worksheet 4 | Exercise 4 | Solution 4 |
Dec 17th | The Shallow Water Equations and CUDA | Worksheet 5 | Exercise 5 | included in Exercise 6 |
Jan 14th | Further topics on SWE and CUDA | Worksheet 6 | Exercise 6 | - |
Exam
- Repeat Exam:
- the same rules as for the written exam will apply (esp. no helping material, questions on the tutorials)
- if less than 10 participants register for the exam, it will be executed as a series of oral exams (announcement with details will follow by email)
- written exam on Mon, Apr 13, 2015, from 17.15 (room MI 02.07.023)
- please be in (front of) the lecture room in time (at 17.00); the exam will start on 17.15, at the latest, and there will be announcements before the start!
- no helping material of any kind will be allowed for the exam
- please make sure that you register for the exam in TUMonline
- the exam will extend over all topics discussed in the lectures and tutorials:
- approx. 30% of the questions will deal with questions related to the tutorials; basic knowledge about GPU programming with CUDA is thus necessary
Literature and Online Material
- R.H. Bisseling: Parallel Scientific Computing - A structured approach using BSP and MPI, Oxford University Press, 2004.
- Course notes on Rob Bisseling's lecture on Parallel Algorithms (based on the text book)
- V. Eijkhout: Introduction to High-Performance Scientific Computing (textbook, available as PDF on the website)
- T.G. Mattson, B.A. Sanders, B.L. Massingill: Patterns for Parallel Programming, Addison-Wesley, 2005
- G. Hager, G. Wellein: Introduction to High Performance Computing for Scientists and Engineers, Chapman & Hall/CRC Computational Science, 2010
Books on CUDA
- D.B. Kirk, W.W. Hwu: Programming Massively Parallel Processors - A Hands-on Approach, Morgan-Kaufman, 2010
- J. Sanders, E. Kandrot: CUDA by Example, Addison-Wesley, 2011
Prerequisites
Helpful, but not strictly required is knowledge in:
- basics of numerical methods (e.g.: lecture IN0019 Numerical Programming or similar)
- basics of parallel programming (lecture Parallel Programming, HPC - Programming Paradigms and Scalability, or similar)
Most important is a certain interest in problems from scientific computing and numerical simulation!