Graduate seminar course
Teaching ML research skills with the Chambers — a graduate course at the University of Potsdam

In this graduate seminar, students implement core ML methods from scratch and evaluate them against experimental data they collect from a physical system — highlighting common misconceptions, and providing firsthand experience of how theory can both fail and succeed under real-world conditions.
The course, originally called "From ML Theory to Practice", was run at the University of Potsdam in the fall semester of 2025. It was designed by Juan L. Gamella and Simon Bing, with Simon as the instructor for the Fall 2025 edition.
The course gives graduate (or advanced undergraduate) students in CS, Statistics, or Data Science a taste of what research in machine learning looks like in practice. Over the semester, students independently read research literature, implement a selection of core methods from scratch — classifier two-sample tests, VAEs, Gaussian processes, Bayesian optimization — and apply them to real experimental data they collect from the Causal Chambers, a set of physical devices designed for ML and causal-inference research.
See the course repository for the complete details, schedule, and materials.
How the course is run
The course follows a flipped classroom approach. Each week, students read the assigned literature at home, work through the current project notebook, and meet for a 2-hour session with the instructor. In each session:
Students present their solution to the previous project.
Open questions about the literature or the project are discussed with the instructor.
The next topic and project are introduced.
Between sessions, take-home work alternates between reading (in preparation for a new topic) and implementing (working through the corresponding project notebook).
The instructor solutions and additional support material are hosted in a separate, private repository. You can request access through this form.
Course outline
The course is split into 8 projects of varying length. Some are done in-session with the instructor, while others are meant for students to complete at home.
Where possible, projects use the existing open-source datasets and need no live access to a Causal Chamber® (see Needs API below). The remaining projects have students collect their own data through the Remote Lab.
Projects
Understanding linear models on synthetic data
The goal is to familiarize students with the linear model and expose them to common misconceptions about p-values, confidence and prediction intervals, and related concepts. The project serves as a warm-up for the course and the submission system.
Intermezzo: collecting data from a Causal Chamber®
Together with the instructor, the students set up their credentials and learn to collect data from the Chambers using the Remote Lab.
Linear models and real-world data
The students apply the linear model to experimental data collected from the Chambers and experience how the model breaks down under assumption violations. They witness the effect of multicollinearity, and collect data to observe the principle of causal invariance and the minimax formulation of causality.
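As a rough illustration of the multicollinearity effect mentioned above, the following sketch compares how much an ordinary least-squares coefficient varies across repeated samples when the two predictors are uncorrelated versus almost collinear. It uses synthetic Gaussian data only, not Chamber data, and the function name and parameters are hypothetical choices made for this example.

```python
import numpy as np

def ols_coef_std(rho, n=200, reps=500, seed=0):
    """Empirical std of the first OLS coefficient when the two
    predictors have correlation rho (synthetic data, assumed setup)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    estimates = []
    for _ in range(reps):
        X = rng.multivariate_normal(np.zeros(2), cov, size=n)
        # True model: y = 1 * x1 + 1 * x2 + unit-variance noise
        y = X @ np.array([1.0, 1.0]) + rng.normal(size=n)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        estimates.append(beta[0])
    return float(np.std(estimates))

std_uncorrelated = ols_coef_std(rho=0.0)
std_collinear = ols_coef_std(rho=0.99)
print(f"std of coefficient, rho=0.00: {std_uncorrelated:.3f}")
print(f"std of coefficient, rho=0.99: {std_collinear:.3f}")
```

The estimate stays unbiased in both settings, but its spread grows sharply as the predictors become collinear, so a single fitted coefficient is much less reliable.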
Causality, RCTs, and two-sample testing
The students learn the basics of experiment design, randomized controlled trials, and statistical hypothesis testing. They apply what they've learned to test a real causal hypothesis in the physical system of the Chambers, and repeat the experiment under different conditions to witness the problems with p-value peeking.
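The problem with p-value peeking can be sketched with a small simulation. Both groups are drawn from the same distribution, so every rejection is a false positive. The example below is an assumption-laden toy (synthetic unit-variance Gaussian data, a simple z-test, hypothetical function names), not the course's actual experiment.

```python
import math
import numpy as np

def z_test_pvalue(a, b):
    """Two-sided z-test for equal means, assuming unit variance in both groups."""
    z = (a.mean() - b.mean()) / math.sqrt(1.0 / len(a) + 1.0 / len(b))
    return math.erfc(abs(z) / math.sqrt(2.0))

def false_positive_rate(peek, n_max=200, step=10, alpha=0.05, reps=2000, seed=0):
    """Fraction of null experiments that reject at level alpha.

    peek=False: test once, after all n_max samples per group are collected.
    peek=True:  test after every `step` samples and stop at the first p < alpha.
    """
    rng = np.random.default_rng(seed)
    sizes = range(step, n_max + 1, step) if peek else [n_max]
    rejections = 0
    for _ in range(reps):
        a = rng.normal(size=n_max)
        b = rng.normal(size=n_max)  # same distribution: the null is true
        if any(z_test_pvalue(a[:n], b[:n]) < alpha for n in sizes):
            rejections += 1
    return rejections / reps

fpr_fixed = false_positive_rate(peek=False)
fpr_peeking = false_positive_rate(peek=True)
print(f"false-positive rate, single test: {fpr_fixed:.3f}")
print(f"false-positive rate, peeking:     {fpr_peeking:.3f}")
```

With a single test the false-positive rate stays close to the nominal level, while repeatedly checking the same growing sample inflates it well beyond that level.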
Classifier two-sample tests
As a follow-up to the previous project, the students learn about classifier two-sample tests as a tool for hypothesis testing on high-dimensional data. They build a complete classifier and test from scratch, and apply it to an image dataset from the Chambers.
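A minimal sketch of a classifier two-sample test is shown below, assuming a logistic-regression classifier trained by plain gradient descent and a normal approximation to the null distribution of the held-out accuracy. It runs on synthetic Gaussian vectors rather than Chamber images, and all names and hyperparameters are illustrative assumptions.

```python
import math
import numpy as np

def add_bias(X):
    """Append a constant column so the model can learn an intercept."""
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logistic(X, y, lr=0.1, steps=500):
    """Logistic regression fitted by gradient descent on the mean log-loss."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def c2st(sample_p, sample_q, seed=0):
    """Return held-out accuracy and a one-sided p-value for H0: P = Q."""
    rng = np.random.default_rng(seed)
    X = np.vstack([sample_p, sample_q])
    y = np.concatenate([np.zeros(len(sample_p)), np.ones(len(sample_q))])
    idx = rng.permutation(len(X))
    X, y = X[idx], y[idx]
    half = len(X) // 2
    w = fit_logistic(X[:half], y[:half])          # train on the first half
    pred = add_bias(X[half:]) @ w > 0             # predict on the second half
    acc = float(np.mean(pred == (y[half:] == 1)))
    n_test = len(X) - half
    # Under H0 the accuracy is approximately N(0.5, 1 / (4 * n_test))
    z = (acc - 0.5) * math.sqrt(4 * n_test)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return acc, p_value

rng = np.random.default_rng(1)
base = rng.normal(size=(500, 5))
same = rng.normal(size=(500, 5))
shifted = rng.normal(loc=0.5, size=(500, 5))
acc_same, p_same = c2st(base, same)
acc_shift, p_shift = c2st(base, shifted)
print(f"same distribution:    acc={acc_same:.3f}")
print(f"shifted distribution: acc={acc_shift:.3f}")
```

If the classifier cannot beat chance on held-out data, there is no evidence that the two samples differ. Accuracy clearly above 0.5 yields a small p-value.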
Generative models: VAEs
The students implement an autoencoder and a variational autoencoder (VAE) from scratch, and apply them to a representation learning problem using experimental data with a ground truth from the Chambers.
Gaussian Processes (GPs)
The students build kernels and the machinery for Gaussian process regression and sampling from scratch. They apply the machinery to synthetic data and learn to incorporate background knowledge by combining kernels. The goal is to familiarize students with GPs as preparation for the final project on Bayesian optimization.
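For orientation, GP regression and sampling from scratch might look roughly like the sketch below. It assumes a squared-exponential (RBF) kernel, a zero prior mean, and fixed hyperparameters, and it fits noise-free samples of a sine function. These choices and all names are illustrative and not taken from the course notebooks.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and covariance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    L = np.linalg.cholesky(K)                      # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, cov

x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train)
x_test = np.linspace(0.0, 5.0, 50)
mean, cov = gp_posterior(x_train, y_train, x_test)

# Draw three functions from the posterior; a small jitter keeps the
# covariance numerically well behaved.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(
    mean, cov + 1e-8 * np.eye(len(x_test)), size=3, check_valid="ignore"
)
print("max abs error of posterior mean:", np.max(np.abs(mean - np.sin(x_test))))
```

Combining kernels fits into the same code: since sums and products of valid kernels are again valid kernels, a second kernel function can be added to or multiplied with `rbf_kernel` before computing the posterior.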
Bayesian Optimization
The students apply what they've learned about GPs to build a complete Bayesian optimization pipeline and solve an optimization problem in the real, physical system of the Chambers.
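A Bayesian optimization loop of this kind can be sketched as follows. This is a toy under stated assumptions: the acquisition function is the upper confidence bound (one common choice, not necessarily the one used in the course), the search runs over a fixed 1-D grid, and `objective` is a hypothetical stand-in for a measurement that would otherwise come from a physical system.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel with unit variance."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)

def gp_predict(x_obs, y_obs, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    # Prior variance is 1 for this kernel; subtract the explained part.
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def objective(x):
    """Hypothetical black-box function with its maximum at x = 2."""
    return np.exp(-(x - 2.0) ** 2)

def bayes_opt(n_init=3, n_iter=15, beta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 5.0, 200)
    # Start from a few random evaluations.
    x_obs = rng.choice(grid, size=n_init, replace=False)
    y_obs = objective(x_obs)
    for _ in range(n_iter):
        mean, std = gp_predict(x_obs, y_obs, grid)
        # Upper confidence bound: favor high mean or high uncertainty.
        x_next = grid[np.argmax(mean + beta * std)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best]

best_x, best_y = bayes_opt()
print(f"best input found: {best_x:.2f}, objective value: {best_y:.3f}")
```

The parameter `beta` controls the trade-off: larger values push the loop to explore uncertain regions, smaller values make it exploit the current best estimate sooner.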
For instructors
Interested in running this course at your institution? The course materials are publicly available in the course repository under a CC-BY-4.0 license. For instructor solutions, support material, or questions about adapting the course, reach out through this form.