1) Prerequisites
- Programming experience in one of Matlab, Python or Julia; needed for the project.
- Linear algebra: matrix and vector operations and norms, eigenvalues and eigenvectors, rank, systems of linear equations, and the least squares problem.
- Numerical analysis: conditioning of a mathematical problem and backward stability of an algorithm that solves it.
These prerequisites and the assumed prior knowledge can be obtained, for example, from the combination of:
- G.W. Strang (2023). Introduction to Linear Algebra (sixth edition). Wellesley-Cambridge Press, USA.
- L.N. Trefethen and D. Bau (1997). Numerical Linear Algebra. SIAM, Society for Industrial and Applied Mathematics.
Prior completion of a numerical linear algebra course will be useful but not necessary. This course is distinct from and complementary to the MasterMath course in Numerical Linear Algebra.
2) Aim of the course
The goal of this course is to equip students with a deep understanding of the mathematical foundations and practical algorithms required to tackle large-scale problems in data science, namely data assimilation (DA) and machine learning (ML). The course will emphasize techniques that are not only theoretically sound but also highly relevant to modern applications in these rapidly evolving fields. We will begin with a solid review of numerical linear algebra (NLA), focusing on both direct and iterative solvers for sparse linear systems, which are crucial in many large-scale computations.
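For orientation, here is a minimal Python sketch (illustrative only, not course material; it assumes NumPy and SciPy are installed) contrasting a sparse direct solve with a basic iterative solve on a small symmetric positive definite system:

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 100
    # 1D Poisson matrix: sparse, tridiagonal, symmetric positive definite
    A = sp.diags([-1, 2, -1], offsets=[-1, 0, 1], shape=(n, n), format="csc")
    b = np.ones(n)

    x_direct = spla.spsolve(A, b)    # sparse direct solve (LU factorization)
    x_iter, info = spla.cg(A, b)     # conjugate gradient, an iterative method

    # info == 0 signals convergence; the two solutions should agree closely
    print(info, np.linalg.norm(x_direct - x_iter))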
The course will make clear connections between these advanced NLA techniques and their applications in DA and ML from the very beginning. In data assimilation, for instance, we will explore how large-scale state estimation problems and error covariance updates are solved efficiently using advanced matrix operations. Similarly, in machine learning, students will learn how NLA underpins critical methods such as principal component analysis (PCA), efficient solution of regularized linear models, random feature methods, and kernel methods.
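As a taste of the NLA-ML connection, the sketch below (again illustrative only; it assumes NumPy) shows how PCA reduces to a singular value decomposition of the centered data matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))      # 200 samples, 5 features
    Xc = X - X.mean(axis=0)            # center each feature

    # Thin SVD of the centered data: rows of Vt are the principal directions
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:2].T             # projection onto the first 2 components
    variances = s**2 / (len(X) - 1)    # variance explained by each component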
The course will consist of four lecture parts, interleaved with feedback, presentation and revision weeks:
- Weeks 1 -- 3 (JMT): Numerical linear algebra fundamentals: solving linear systems, linear algebra on a computer, and basic iterative methods
- Weeks 4 -- 6 (VDM): Krylov methods: conjugate gradient, GMRES, and preconditioning
- Week 7: Mid-term feedback: presentation of project part 1
- Weeks 8 -- 10: NLA for machine learning: PCA, random feature methods, kernel methods
- Weeks 11 -- 13: NLA for data assimilation: variational methods, ensemble methods, ML in DA
- Week 14: Final project presentations
- Week 15: Revision session
3) Rules about Homework/Exam
There will be weekly problems classes based on formative (ungraded) theoretical and numerical problem sheets. Solutions to these will be provided weekly.
Assessment will consist of a project (40%) and a final written exam (60%). Due to MasterMath rules, a minimum grade of 5.0 for the exam is required to pass.
The project will be conducted in small groups and consist of two parts:
- Part 1: Apply iterative solver methods to a matrix arising from a relevant application. Students will need to select a suitable problem, describe its mathematical and computational structure, and apply a justified methodology to solve a linear system and/or eigenvalue problem.
- Part 2: Reproduce an example from the ML or DA literature. Students should explain the theory, implement a numerical example or demonstration (with emphasis on reproducible code), and present their results to the group.
Some examples for both parts from ML and DA will be provided. Students may select their own alternative examples, but these must be approved by a member of the teaching team.
Mandatory mid-term presentations will provide a formative feedback moment. A detailed weekly breakdown will be provided for the first part, and a clear rubric will be provided for the more open-ended second part. Assessment will be made via the final presentations in Week 14 and a short (around 10-page) write-up.
The final exam will be a written exam, with questions in a similar style to those of the exercise classes. Depending on student numbers, the resit exam will be either a written exam in the same style as the first exam or an oral exam. The resit will likewise count for 60% of the final grade.
4) Lecture notes/Literature:
Lecture notes and slides from lectures will be made available to students via the MasterMath virtual learning environment.
Core literature:
- Hastie, Tibshirani, and Friedman, The Elements of Statistical Learning
- Bach, Baptista, Sanz-Alonso, and Stuart, Machine Learning for Inverse Problems and Data Assimilation
- Evensen, Data Assimilation: The Ensemble Kalman Filter
- Golub & Van Loan, Matrix Computations (optional, for depth)
Additional literature references will be provided throughout the course.
- Lecturer: Victorita Dolean-Maini
- Lecturer: Jemima Tabeart