1) Prerequisites:
A solid command of linear algebra is essential, including column reduction, quotient vector spaces, dual spaces, and orthogonal projections onto subspaces. Familiarity with vector spaces over finite fields, in particular the field with two elements, is also helpful. The course includes coding assignments, so some prior experience with Python is expected.
An undergraduate course in topology (covering, for example, Chapters 1–3 of J. R. Munkres) is not strictly required, but it will make certain parts of the material easier to follow. The course is largely self-contained, although familiarity with the basics of homotopy will make the learning curve less steep.
2) Aim of the course:
Topological Data Analysis (TDA) leverages tools from (algebraic) topology to gain a qualitative understanding of data, or, in other words, to infer properties of the shape of the data. The “simplest” topological invariant is the number of connected components of a data set. In the language of traditional data analysis, this corresponds to clustering the data, i.e., grouping a set of objects so that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups. TDA can also detect many other types of non-linearity in data, e.g., tendrils, voids, and periodicity in time-dependent data.
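To make the link between connected components and clustering concrete, here is a minimal Python sketch. It rests on assumptions chosen purely for illustration and not taken from the course material: the data is a small list of points in the plane, two points count as connected when their Euclidean distance is at most a threshold `eps`, and the helper name `count_components` is hypothetical.

```python
from itertools import combinations
from math import dist


def count_components(points, eps):
    """Count connected components of the graph that joins points at distance <= eps."""
    # Union-find: parent[i] points towards the representative of i's component.
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair of points that are close enough.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    # Each distinct root corresponds to one component (one cluster).
    return len({find(i) for i in range(len(points))})


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0), (5.5, 5.0)]
n_small = count_components(points, eps=1.0)
n_large = count_components(points, eps=10.0)
```

On this toy data the count is 2 at `eps=1.0` and 1 at `eps=10.0`. The dependence of the answer on the chosen scale is what motivates tracking such invariants across all scales at once.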
The aim of this course is for the students to learn about the algorithms and mathematics underlying the core tools of TDA, as well as the applicability of TDA to the sciences through guest lectures and assignments.
Simplicial theory: fundamentals of simplicial complexes, simplicial (co-)homology (incl. algorithms), simplicial complexes from data (Čech, Rips), Nerve Lemma, sensor networks.
Persistent (co-)homology: algorithms, algebraic foundations, stability, homological inference, interleavings, circular coordinates, kernel methods, applications, and a proof of the isometry theorem.
Generalized persistence: zigzag and multiparameter persistence.
Clustering: ToMATo, Kleinberg’s theorem, clustering in multiple parameters.
Low-dimensional embeddings: Mapper, Reeb graphs, ISOMAP, UMAP.
Spectral theory: (persistent) Laplacians for graphs and simplicial complexes.
Algebraic topology and category theory elements, such as homotopy equivalence and functoriality, will be introduced when needed.
3) Rules about Homework/Exam:
Assignment 1 (15% of final grade): The students will explore one or several datasets using tools from topological data analysis related to persistent homology. This is individual work.
Assignment 2 (25% of final grade): The students will be presented with a range of papers covering different aspects of TDA not discussed in class. The assignment will be graded based on a short report (5 pages) and a short presentation (10 minutes). In the interest of time, the second assignment may be done in pairs. The presentations will be graded individually.
The final exam will last 3 hours and will count for 60% of the final grade. To pass the course, the exam grade must be at least 5.0. The homework will also count for 40% when the retake exam is taken.
4) Lecture notes/Literature:
The course will follow my lecture notes (soon to be a book).
- Lecturer: Magnus Bakke Botnan