Course information | Multivariate Statistics - M1 - 8EC

Prerequisites:

o Basic knowledge of probability theory

expectation, variance, covariance
transformation formula for densities
conditional probability distribution, independence
convergence in probability, convergence in distribution
law of large numbers, central limit theorem
Markov inequality
characteristic function
normal distribution, chi-squared distribution, t-distribution

o Basic knowledge of statistics

ML estimation (also for p-dimensional parameters), moment estimators
confidence intervals
hypothesis tests

o Basic knowledge of linear algebra

matrix inverse
determinant, trace
eigendecomposition
square root of a matrix

o Basic knowledge of calculus

limits
differentiation (of functions of more than one variable)
integration (of functions of more than one variable)
Lagrange multiplier

o Basic knowledge of R (students may also use other programming languages such as Python or MATLAB, but solutions to problems will always be given in R)

reading data
creating plots
using loops
using logical operators ("&", "|", "!")
defining functions
matrix operations

Note that we will briefly review less familiar concepts such as Lagrange multipliers, so some gaps in knowledge are not a problem.

Aim of the course:

In multivariate statistics, we observe multiple measurements for each individual observation. Examples are a patient's vital signs, such as heart rate, blood pressure and respiratory rate, or household expenditures on housing, food, education and entertainment. The focus lies on finding and modelling dependencies between these individual variables in order to gain insight into the underlying mechanisms.

A particular challenge arises when the dimension of the observations is large. Collecting data is much cheaper nowadays than in the past, so working with huge data sets is no longer unusual. We will tackle this problem by means of dimension reduction, among other methods. Graphical tools will help us to understand and visualize the structure of big and complex data sets.

Often, we cannot assume that all observations are homogeneous and follow the same probability model. In this case, we want to discover groups within the data set and classify observations into them.

At the end of the course, students will be able to analyze complex data sets. They will be able to handle large data sets, investigate the underlying dependence structure, identify subpopulations and test for the equality of their means and covariances. Students will understand the mathematical foundations of multivariate statistical procedures and thus their limitations. They will be able to derive theoretical properties, such as consistency and asymptotic normality, of new estimators. Students will also be able to adapt existing hypothesis tests to their needs or construct new tests based on general principles.

Rules about homework / exam:

Homework is voluntary but recommended. The final grade is based on the written exam only. To pass the course, the grade for the exam (or the retake exam) must be at least 5.5.

Lecture notes / literature:

Lecture notes are published online. They are based on the following books:

Wolfgang Härdle and Leopold Simar (2024): Applied Multivariate Statistical Analysis
Theodore Anderson (2003): An Introduction to Multivariate Statistical Analysis

Lecturer: Alexander Dürre