**Prerequisites**

Introductory course in linear algebra. Some knowledge of algorithms.

Knowledge of a modern programming language such as C, C++, C#, Python, or Java. Basic knowledge of C is helpful, as we will use this language in class, but the course is organised so that those coming from another language have time to adapt. For those with little or no experience with C, this course is an opportunity to learn the language, which typically runs much faster than Python for compute-intensive code and is therefore commonly used in high-performance computing. Some knowledge of the UNIX/Linux operating system is helpful.

**Aim of the course**

The course aims to provide a thorough knowledge of the design and implementation of parallel algorithms for problems in the areas of scientific computing and big data.

**Learning goals**

After completion of the course, the student is able to:

- design a parallel algorithm for a problem from the area of scientific computing or big data;
- analyse the cost of the algorithm in terms of computing time, communication time and synchronisation time;
- write a parallel program based on an algorithm that solves a problem;
- write a report on the algorithm, its analysis, and numerical experiments performed, in a concise and readable form;
- present the results in an oral presentation, in a clear manner, highlighting the most important ideas.
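
As an illustration of the cost analysis named in the second goal, the following is a sketch of how such a cost is commonly written in the BSP literature for a single superstep (a computation phase followed by communication and a global synchronisation). The symbols are assumptions for illustration; this course description does not itself fix a notation.

$$
T_{\text{superstep}} = w + h \cdot g + l
$$

Here $w$ is the maximum number of floating-point operations performed by any processor (computing time), $h$ is the maximum number of data words sent or received by any processor, $g$ is the communication cost per data word, and $l$ is the cost of the global synchronisation, with $g$ and $l$ expressed in units of the time of one floating-point operation. The cost of a complete algorithm is then the sum of the costs of its supersteps.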

Today, parallel computers are appearing on our desktops. The advent of dual-core, quad-core, and octa-core computers, together with the expected increase in the number of cores in the coming years, is causing a major change in our approach to software, such as the mathematical software we use in scientific computations and in the emerging areas of big data computations and machine learning.

Parallel computers drastically increase our computational capabilities and thus enable more realistic modelling in many application areas. To make efficient use of parallel computers, it is necessary to reorganise the structure of our computational methods. In particular, attention must be paid to the division of work among the different processors solving a problem in parallel and to the communication between them. Suitable parallel algorithms and systems software are needed to realise the capabilities of parallel computers.

We will discuss extensively the most recent developments in the area of parallel computers, ranging from multicore desktop PCs to clusters of PCs connected by switching devices, to massively parallel computers with distributed memory such as our national supercomputer Snellius at SURFsara in Amsterdam.

The following subjects are treated:

- Types of existing parallel computers;
- Principles of parallel computation: distributing the work evenly and avoiding communication;
- The Bulk Synchronous Parallel (BSP) model as an idealised model of the parallel computer;
- Use of BSPlib software as the basis for architecture-independent programs;
- Parallel algorithms for the following problems: inner product computation, sorting, prime number sieving, LU decomposition, sparse matrix-vector multiplication, iterative solution of linear systems, graph matching;
- Analysis of the computation, communication and synchronisation time requirements of these algorithms using the BSP model;
- Hands-on experience in the laboratory class, using the Snellius supercomputer.
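
To give a flavour of the first algorithm in the list, here is a minimal sketch in C of how an inner product can be split over `p` processors. It is a sequential simulation, not BSPlib code: each loop iteration over `s` stands in for the local work of one processor, and the final accumulation stands in for the step in which partial sums are communicated and added. The block distribution and the function names `block_size`, `block_start`, and `simulated_parallel_inner_product` are assumptions made for this illustration; the course may use a different distribution.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements owned by processor s (0 <= s < p) when n elements
   are divided into p contiguous blocks of nearly equal size. */
size_t block_size(size_t n, size_t p, size_t s) {
    return n / p + (s < n % p ? 1 : 0);
}

/* Global index of the first element owned by processor s.
   The first (n % p) processors each own one extra element. */
size_t block_start(size_t n, size_t p, size_t s) {
    size_t r = n % p;
    return s * (n / p) + (s < r ? s : r);
}

/* Sequential simulation of a parallel inner product of x and y
   (both of length n) on p hypothetical processors. */
double simulated_parallel_inner_product(const double *x, const double *y,
                                        size_t n, size_t p) {
    assert(p > 0);
    double total = 0.0;
    for (size_t s = 0; s < p; s++) {
        /* Local computation: processor s handles only its own block. */
        size_t lo = block_start(n, p, s);
        size_t len = block_size(n, p, s);
        double partial = 0.0;
        for (size_t i = lo; i < lo + len; i++) {
            partial += x[i] * y[i];
        }
        /* Stand-in for communicating the partial sum and adding it. */
        total += partial;
    }
    return total;
}
```

With `n = 5` and `p = 2`, processor 0 owns elements 0 to 2 and processor 1 owns elements 3 and 4, so the work differs by at most one element per processor. This is the "distributing the work evenly" principle from the list above in its simplest form.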

**Lecturer**

R. Bisseling (UU)

- Lecturer: Constantijn Dekker