Utrecht University – Department of Information and Computing Sciences – Intelligent Systems group
I am currently a PhD candidate at the Department of Information and Computing Sciences of Utrecht University working on causal inference. My supervisors are Thijs van Ommen and Mehdi Dastani. My PhD project is part of a collaboration between ProRail and Utrecht University.
My current research focuses on the construction of useful abstractions from data, making use of the interventionist framework of causal inference. This has led me to study how properties of causal relationships can be modeled using information-theoretic quantities. The goal is to develop algorithms that learn representations possessing the desired properties by making use of appropriate information-theoretic metrics.
In addition to exploring causal representation learning and information-based modeling of causal properties, I am also investigating causal discovery in the presence of background knowledge and methods for selecting interventions when a structural causal model (SCM) is available. These are topics that my Master's students have been working on.
On the applied side, an objective of my PhD is to apply these methods to train delay data provided by ProRail. The goal is to understand how causal inference can be used to improve train traffic control.
PhD in Artificial Intelligence, Utrecht University
2021-Now
Causal Discovery, Intervention Selection, Representation Learning, Information Theory
ML engineer & Data Scientist, Orbisk (Utrecht)
2020-2021
Computer Vision, API development
Research internship at UMC's Brain Center
2020
Dimensionality reduction methods in a Genome-Wide Association Study (GWAS) of ALS.
MSc in Theoretical Physics, Utrecht University
2017-2019
Mathematical emphasis. Lie Algebras, Differential Geometry, Representation Theory, ...
A la carte Mathematics units, University of Porto
2016-2017
Topology, Group Theory, Logic, Functional Analysis, Manifolds, ...
BSc in Physics, University of Porto
2013-2016
The Minimal Search Space for Conditional Causal Bandits
Submitted to the 42nd International Conference on Machine Learning (ICML 2025).
We establish a graphical characterization of the minimal search space for causal bandits with conditional interventions and propose an efficient algorithm to identify it.
The Causal Information Bottleneck and Optimal Causal Variable Abstractions
Submitted to the 41st Conference on Uncertainty in Artificial Intelligence (UAI 2025).
We propose the Causal Information Bottleneck (CIB), a method for learning variable abstractions suitable for causal tasks.
Fundamental Properties of Causal Entropy and Information Gain
In Proceedings of the 3rd Conference on Causal Learning and Reasoning (CLeaR) 2024. Accepted for both poster and oral presentation.
This research contributes to the formal understanding of the notions of causal entropy and causal information gain by establishing and analyzing fundamental properties of these concepts, including bounds and chain rules. Furthermore, we elucidate the relationship between causal entropy and stochastic interventions. We also propose definitions for causal conditional entropy and causal conditional information gain.
Causal Entropy and Information Gain for Measuring Causal Control
In Proceedings of the European Conference on Artificial Intelligence (ECAI) 2023. Work presented at the third XI-ML workshop of ECAI 2023.
We introduce causal versions of entropy and mutual information, termed causal entropy and causal information gain, which are designed to assess how much control a feature provides over the outcome variable. These newly defined quantities capture changes in the entropy of a variable resulting from interventions on other variables.
Jaccard Kernel PCA in genotype and gene-burden data for ALS
Report for my internship at the Utrecht Brain Center, part of the Utrecht Medical Center (2020).
The project studied whether population stratification control in large-scale genetic studies of amyotrophic lateral sclerosis (ALS) could be improved by using Jaccard principal component analysis (jPCA) as an alternative to standard methods like PCA, which are ineffective for rare genetic variants. To that end, I developed a set of R scripts capable of running jPCA on very large datasets (through parallelization), and another set that allows gene-burden values to be arbitrary positive integers.
The Monoidal Category of D-branes in a Kazama-Suzuki Model
My Master's thesis. Written for completion of the MSc program in Theoretical Physics (2019).
This thesis consists of an application of category theory to string theory. Concretely, we show that the D-branes of the most prolific Kazama-Suzuki model (which is an N = 2 superconformal field theory with central charge c = 9) form a category finite on objects. We furthermore prove that this category admits a notion of tensor product and thus a monoidal structure. I believe this thesis can be useful to someone trying to understand Kac-Moody algebras in the context of string theory, and representation theory of the Virasoro algebra in general.
Title: Going Beyond Causal Strength - An information theoretical metric of specificity in causal relationships
Title: Aspects of the causal structure of train delays - Abstracting train delay complexity & Learning delay relations from data
Intervention Selection with Railway Simulators - a causal approach to delay management
2024
This Master's project was incorporated into the student's internship at ProRail.
The student implemented various multi-armed bandit (MAB) algorithms to select the best delay-reducing intervention in a railway system, which was modeled by a simulator that the learning algorithm could interact with.
TopICS: a topological intervention candidate selection algorithm
2023-2024
This Master's thesis was incorporated into the student's internship at ProRail.
The goal of this project was to implement a search space reduction algorithm which limits the number of interventions that need to be tested in order to find the optimal intervention for a given target.
Causal discovery from train network data with background knowledge
2022-2023
This Master's thesis was incorporated into the student's internship at ProRail.
The goal of this project was to learn a structural causal model (SCM) describing train delay data provided by ProRail. The student applied a modified version of the FCI algorithm to this data. The algorithm took into account background knowledge about causal relationships between train delays, making it feasible to learn the causal graph despite the high number of variables. Each structural equation was then estimated by fine-tuning a neural network that had previously been trained on the entire dataset and that also incorporated exogenous variables external to the delay data.
Advanced Machine Learning - MSc Computer Science
2025
Tutorial coordinator; Guest lecturer; Teaching Assistant
Machine Learning - BSc Computer Science
2024
Tutorial coordinator; Teaching Assistant
Advanced Machine Learning - MSc Computer Science
2024
Tutorial coordinator; Guest lecturer; Teaching Assistant
Machine Learning - BSc Computer Science
2023
Tutorial coordinator; Teaching Assistant
Advanced Machine Learning - MSc Computer Science
2023
Tutorial coordinator; Guest lecturer; Teaching Assistant
Machine Learning - BSc Computer Science
2022
Tutorial coordinator; Teaching Assistant
Calculus and Linear Algebra - University College Utrecht
2019
Teaching Assistant