Effect estimation with proxies

How our subscribers at the University of Copenhagen used the Chambers for their research in causal effect estimation.

You probably need some knowledge of causal inference to understand all the details in this post. If you're interested, The Book of Why (Judea Pearl) and Causal Inference: What If (Hernán and Robins) are good places to start.

Estimating the causal effect of a treatment on an outcome is relatively straightforward when all relevant variables are observed. Of these, the most important are the confounders: third variables that affect both treatment and outcome and that, if left unaccounted for, distort the estimated relationship between them.

In practice, we often cannot observe confounders directly and only have access to a noisy measurement of them, called a proxy.
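To make the problem concrete, here is a minimal simulation sketch of a linear model with one confounder and one noisy proxy. Everything in it is an assumption made for illustration: the coefficients, the Gaussian noise, and the helper names ols_slope and simulate are hypothetical, and plain regression adjustment is used as a simple baseline, not as the method from the paper discussed below.

```python
import numpy as np


def ols_slope(y, *regressors):
    """OLS coefficient of the first regressor, with an intercept included."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


def simulate(n=200_000, effect=0.5, proxy_noise=1.0, seed=0):
    """Toy linear model (all coefficients are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                          # confounder, unobserved in practice
    w = u + proxy_noise * rng.normal(size=n)        # proxy: noisy measurement of u
    t = u + rng.normal(size=n)                      # treatment, affected by u
    y = effect * t + 2.0 * u + rng.normal(size=n)   # outcome, affected by t and u
    return u, w, t, y


u, w, t, y = simulate()

naive = ols_slope(y, t)          # ignores the confounder
oracle = ols_slope(y, t, u)      # adjusts for the confounder (not available in practice)
proxy_adj = ols_slope(y, t, w)   # adjusts for the proxy instead

print(f"true effect:            0.500")
print(f"naive regression:       {naive:.3f}")
print(f"adjusting for u:        {oracle:.3f}")
print(f"adjusting for proxy w:  {proxy_adj:.3f}")
```

In this toy model the naive estimate lands near 1.5 and the oracle near the true 0.5, while adjusting for the proxy gives roughly 1.17: treating the proxy as if it were the confounder removes only part of the bias.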

This is the problem that our colleagues at the University of Copenhagen studied in their 2026 paper "Identifying Causal Effects Using a Single Proxy Variable" by Silvan Vollmer, Niklas Pfister, and Sebastian Weichwald. With a novel result, they extend the settings in which the causal effect of the treatment on the outcome is identifiable, and develop an algorithm to estimate it.

Silvan Vollmer, the first author of the paper, presenting his work at EuroCIM.

Validation on a real physical experiment

The authors encountered the other fundamental problem in causal inference: finding real-world datasets suitable to validate your algorithms 🤓. This is where the Chambers come in.

The authors used our Light Tunnel Mk2 to create a real, physical experiment that matched their problem formulation. By running the tunnel in its linked_leds configuration, a part of the causal graph of the chamber (C) resembled the single-proxy scenario:

Overview of the setup used in the paper and the resulting causal graph. You can find a detailed description of all variables in the documentation of the hardware configuration.

Let's break this down. In this particular hardware configuration, the brightness of the UV LED atop the second light-intensity sensor (ir_2) is set by the chamber as a linear function of the measurement of the first sensor (ir_1).

In this setup, ir_1 serves as the treatment and ir_2 as the outcome, with the green brightness of the main light source acting as the confounder of the two sensor measurements. As the proxy, we take current_ls_raw: a noisy measurement of the electrical current drawn by the light source, which depends on its brightness.

There are two additional variables: the sensor parameters sps_ and offset_current_ls, which control the oversampling rate and reference voltage of the current sensor. By changing their values, the authors were able to test their method under different proxies. The values for all the variables are given in Appendix K of the paper.
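One simple way to picture what a "different proxy" means is a change in how noisily the proxy measures the confounder. The sketch below sweeps that noise level in a toy linear model whose variables are named after the setup above. This is an assumption for illustration only: the coefficients and noise model are made up, the function proxy_adjusted_slope is hypothetical, no chamber data is used, and the code does not claim to describe what the sensor parameters do in hardware. The estimator is plain regression adjustment on the proxy, a naive baseline rather than the authors' algorithm.

```python
import numpy as np


def proxy_adjusted_slope(proxy_noise, n=200_000, effect=0.5, seed=0):
    """Regress ir_2 on ir_1 and the proxy in a toy model; return the ir_1 slope.

    All coefficients below are illustrative assumptions, not chamber values.
    """
    rng = np.random.default_rng(seed)
    green = rng.normal(size=n)                                # confounder (standardized)
    ir_1 = green + rng.normal(size=n)                         # treatment
    ir_2 = effect * ir_1 + 2.0 * green + rng.normal(size=n)   # outcome
    current = green + proxy_noise * rng.normal(size=n)        # proxy of the confounder
    X = np.column_stack([np.ones(n), ir_1, current])
    coef, *_ = np.linalg.lstsq(X, ir_2, rcond=None)
    return coef[1]


noise_levels = [0.0, 0.5, 1.0, 2.0]
slopes = [proxy_adjusted_slope(s) for s in noise_levels]

for s, b in zip(noise_levels, slopes):
    print(f"proxy noise sd = {s:.1f} -> estimated effect = {b:.3f} (true 0.500)")
```

With a noiseless proxy the estimate recovers the true effect, and it drifts further away as the proxy gets noisier. Varying the quality of the proxy is therefore a natural way to stress-test a method that is meant to handle noisy proxies.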

Additional resources

You can find the datasets collected by the authors, as well as the code to collect them using the Remote Lab, in our open-source dataset repository.
