Abstract
This course provides a thorough treatment of stochastic optimization via Stochastic Approximation (SA). SA is the umbrella term for methods that solve optimization problems iteratively, where the algorithms are driven by either simulation experiments or observed data. The stochastic gradient descent method, which belongs to the family of SA algorithms, has become a standard technique in machine learning and will be discussed in depth in this course. The course summarizes the key results of SA; proofs of the main results will be given, and many illustrative examples are provided. The statistical output analysis of SA algorithms is covered as well. The main method for simulation-based optimization and learning we will work with in this course is Simultaneous Perturbation Stochastic Approximation (SPSA), a numerically highly efficient pseudo-gradient method.
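
To make the last point concrete, here is a minimal Python sketch of the SPSA pseudo-gradient recursion, which needs only two noisy function evaluations per iteration regardless of dimension. The gain constants, the Rademacher perturbations, and the noisy quadratic test function are illustrative assumptions, not material taken from the course.

    import numpy as np

    def spsa_minimize(f, theta0, iters=2000, a=0.1, c=0.1, alpha=0.602, gamma=0.101, seed=0):
        """Minimize a noisy objective f with the SPSA pseudo-gradient.

        Only two noisy evaluations of f are needed per iteration,
        regardless of the dimension of theta (illustrative gain constants)."""
        rng = np.random.default_rng(seed)
        theta = np.asarray(theta0, dtype=float).copy()
        for k in range(1, iters + 1):
            a_k = a / k ** alpha                                # step-size gain sequence
            c_k = c / k ** gamma                                # perturbation-size gain sequence
            delta = rng.choice([-1.0, 1.0], size=theta.shape)   # Rademacher perturbation
            g_hat = (f(theta + c_k * delta) - f(theta - c_k * delta)) / (2.0 * c_k * delta)
            theta -= a_k * g_hat                                # SA update with the pseudo-gradient
        return theta

    # Usage on a noisy quadratic whose minimum is at theta = 3 in every coordinate
    rng = np.random.default_rng(1)
    noisy_loss = lambda th: np.sum((th - 3.0) ** 2) + rng.normal(scale=0.1)
    print(spsa_minimize(noisy_loss, theta0=np.zeros(5)))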

Full Description
Adaptive algorithms are now used across a wide range of applications in simulation- and data-driven learning and optimization. These algorithms aim to estimate recursively an unknown time-invariant (or slowly varying) parameter vector, traditionally denoted by θ, from measurements whose statistics depend on it. A well-known example is the gradient descent method, which is one of the main techniques in AI and machine learning for supervised and unsupervised learning. Such algorithms are subsumed under the name stochastic approximation (SA).
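
As a hedged illustration of such a recursive estimation scheme, the sketch below runs a plain stochastic gradient descent (Robbins-Monro) recursion on a least-squares problem in Python; the synthetic data, the 1/k step sizes, and the function name sgd_least_squares are assumptions made for this example only.

    import numpy as np

    def sgd_least_squares(X, y, steps=5000, a=1.0, seed=0):
        """SA recursion theta_{k+1} = theta_k - a_k * g_k, where g_k is the
        gradient of a single randomly drawn squared-error term."""
        rng = np.random.default_rng(seed)
        theta = np.zeros(X.shape[1])
        for k in range(1, steps + 1):
            i = rng.integers(X.shape[0])            # draw one observation at random
            a_k = a / k                             # diminishing (Robbins-Monro) step size
            g_k = (X[i] @ theta - y[i]) * X[i]      # noisy gradient of 0.5*(x_i'theta - y_i)^2
            theta -= a_k * g_k
        return theta

    # Synthetic data: y = X @ [2, -1] + noise (illustrative)
    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 2))
    y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=200)
    print(sgd_least_squares(X, y))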

The focus of this course is on SA algorithms and their application to stochastic simulation- and data-based optimization and learning. The ODE approach to iterative learning and optimization algorithms will be introduced. We will also discuss the projection method and algorithms with biased gradient estimators. Besides covering various SA algorithms, the course provides the theoretical insights needed for carrying out a proper statistical output analysis of SA-type algorithms. Optimal computer-budget allocation in SA will be taught as well (batching updates is often not advisable). Applications will stem from a wide range of stochastic models.
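
As a rough sketch of the projection method mentioned above, the Python snippet below clips each SA iterate back onto a box constraint after every update and drives the recursion with a one-sided finite-difference pseudo-gradient, which carries an O(c_k) bias. The box bounds, gain sequences, and test function are illustrative choices rather than course prescriptions.

    import numpy as np

    def projected_sa(f, theta0, lower, upper, iters=2000, a=0.5, c=0.1, seed=0):
        """Projected SA: theta_{k+1} = Pi_Theta( theta_k - a_k * g_hat_k ),
        where Pi_Theta projects onto the box [lower, upper] and g_hat_k is a
        (biased) one-sided finite-difference estimator of the gradient."""
        rng = np.random.default_rng(seed)
        theta = np.asarray(theta0, dtype=float).copy()
        d = theta.size
        for k in range(1, iters + 1):
            a_k = a / k                       # diminishing step size
            c_k = c / k ** 0.25               # finite-difference width
            f_theta = f(theta)
            g_hat = np.empty(d)
            for i in range(d):                # one extra noisy evaluation per coordinate
                e_i = np.zeros(d)
                e_i[i] = 1.0
                g_hat[i] = (f(theta + c_k * e_i) - f_theta) / c_k   # biased: error O(c_k)
            theta = np.clip(theta - a_k * g_hat, lower, upper)      # projection onto the box
        return theta

    # Example: the unconstrained minimum (theta = 2) lies outside the feasible box
    rng = np.random.default_rng(3)
    noisy = lambda th: np.sum((th - 2.0) ** 2) + rng.normal(scale=0.05)
    print(projected_sa(noisy, theta0=[0.0, 0.0], lower=-1.0, upper=1.5))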

The course is of interest to students in the areas of Simulation Analytics, Computer Science, Operations Research, and Machine Learning. It offers a self-contained introduction to simulation-based techniques in optimization and (machine) learning. Participating students typically major in applied mathematics, computer science, data science, operations research, physics, electrical engineering, or economics.

Prerequisites
Mathematical maturity. Basic knowledge of non-linear optimization theory, probability theory, stochastic processes, and Monte Carlo simulation.

Aim of the course
In this course the student will learn how to apply gradient estimators in stochastic simulation- and data-based optimization and learning algorithms. After successful participation in this course, the student will be able to develop a simulation-based stochastic optimization solution to real-life problems.

Lecturers
Bernd Heidergott (VU) and Ad Ridder (VU)