Description

High-dimensional probability theory studies random objects in high-dimensional spaces, such as random vectors, matrices, and processes. High-dimensional probabilistic problems arise in various fields, such as statistical analysis of high-dimensional data, theory of (large) random graphs, machine learning, numerical linear algebra, randomized algorithms, convex geometry, and compressed sensing. Although these application areas differ considerably, there is an important circle of probabilistic principles and techniques, including concentration inequalities and chaining methods, that is commonly used in all of them. The aim of this course is to give an introduction to these ideas, with a view towards applications in data science.

Topics

We aim to discuss the following topics:

  • Elementary concentration inequalities (Hoeffding, Chernoff, Bernstein); see the illustrative example at the end of this section

  • Concentration of high-dimensional random vectors

  • Gaussian concentration for Lipschitz functions

  • Nets, covering and packing numbers

  • Matrix concentration inequalities

  • High-dimensional covariance estimation and principal component analysis

  • Decoupling and the Hanson-Wright inequality

  • Chaining and generic chaining

During the theoretical development of these topics, we will discuss several applications in mathematical data science, such as data dimension reduction, community detection in networks, matrix completion, clustering, semidefinite programming, and sparse linear regression.
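
To give a flavour of the first topic, here is one standard form of Hoeffding's inequality (stated for orientation only; the course follows the formulations in the primary literature below): if X_1, ..., X_n are independent random variables with each X_i taking values in [a_i, b_i], and S_n = X_1 + ... + X_n, then for every t ≥ 0,

\[
\mathbb{P}\bigl( |S_n - \mathbb{E}[S_n]| \ge t \bigr) \le 2 \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^n (b_i - a_i)^2} \right).
\]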


Prerequisites

A firm grasp of the following undergraduate material is required:

  • analysis (norms, inner products, metrics, basics of convex sets and functions, Taylor approximations and series, gamma function, Stirling’s approximation, Riemann integration, differentiation in multiple variables); 

  • probability theory (random variables and vectors, expected value, covariance matrix, standard univariate probability distributions (Bernoulli, binomial, Poisson, normal, exponential, etc.), multivariate normal distribution, Markov, Chebyshev, and Jensen inequalities, law of large numbers, central limit theorem, Poisson limit theorem);

  • linear algebra (eigenvalues, eigenvectors, orthogonal projections, orthogonal matrices, spectral decomposition, positive (semi-)definiteness; prior exposure to singular values and the singular value decomposition is useful but not required).

Knowledge of measure-theoretic probability theory is not required.

No prior knowledge from other Mastermath courses is required.

Primary literature

Vershynin, Roman. High-Dimensional Probability: An Introduction with Applications in Data Science. Vol. 47. Cambridge University Press, 2018. We will use a pre-publication version of the second edition of this book (freely available online: https://www.math.uci.edu/~rvershyn/papers/HDP-book/HDP-2.pdf).

Van Handel, Ramon. Probability in High Dimension. Lecture notes, Princeton University, 2016. (available online: http://web.math.princeton.edu/~rvan/APC550.pdf)

Sources for further reading

Blum, Avrim, John Hopcroft, and Ravindran Kannan. Foundations of Data Science. Cambridge University Press, 2020. (available online: https://www.cs.cornell.edu/jeh/book.pdf)

Boucheron, Stéphane, Gábor Lugosi, and Pascal Massart. Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013.

Ledoux, Michel. The Concentration of Measure Phenomenon. American Mathematical Society, 2001.

Talagrand, Michel. Upper and Lower Bounds for Stochastic Processes. Springer, 2014.


Examination

During the course there will be four homework assignments, to be completed in pairs. At the end of the course there will be a final exam and a re-take exam; these will be either written or oral, depending on the number of participants. The final grade is determined by the average homework grade (20%) and the exam/re-take exam grade (80%), as in the example below. A sufficient grade (5.5 or higher) on the exam/re-take exam is required to pass the course.
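
For instance, with these weights a (hypothetical) homework average of 7.0 and an exam grade of 8.0 would give a final grade of 0.2 × 7.0 + 0.8 × 8.0 = 7.8.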