Francisco N. F. Q. Simoes  

Utrecht University – Department of Information and Computing Sciences – Intelligent Systems group

I am currently a PhD candidate at the Department of Information and Computing Sciences of Utrecht University working on causal inference. My supervisors are Thijs van Ommen and Mehdi Dastani. My PhD project is part of a collaboration between ProRail and Utrecht University.


My current research focuses on constructing useful abstractions from data using the interventionist framework of causal inference. This has led me to study how properties of causal relationships can be modeled with information-theoretic quantities. The goal is to develop algorithms that learn representations with the desired properties by making use of appropriate information-theoretic metrics.

In addition to exploring causal representation learning and information-based modeling of causal properties, I am investigating causal discovery in the presence of background knowledge and methods for selecting interventions when an SCM is available. These are topics that my Master's students have been working on.

On the applied side, an objective of my PhD is to apply the methods developed to train delay data provided by ProRail. The goal is to understand how causal inference can be used to improve train traffic control.

  • PhD in Artificial Intelligence, Utrecht University

    2021-Now

    Causal Discovery, Intervention Selection, Representation Learning, Information Theory

  • ML engineer & Data Scientist, Orbisk (Utrecht)

    2020-2021

    Computer Vision, API development

  • Research internship at UMC's Brain Center

    2020

    Dimensionality reduction methods in a Genome-Wide Association Study (GWAS) of ALS.

  • MSc in Theoretical Physics, Utrecht University.

    2017-2019

    Mathematical emphasis. Lie Algebras, Differential Geometry, Representation Theory, ...

  • A la carte Mathematics units, University of Porto.

    2016-2017

    Topology, Group Theory, Logic, Functional Analysis, Manifolds, ...

  • BSc in Physics, University of Porto.

    2013-2016

Research & Publications

The Minimal Search Space for Conditional Causal Bandits

Francisco N. F. Q. Simoes, Mehdi Dastani, Thijs van Ommen

Submitted to the 42nd International Conference on Machine Learning (ICML 2025).

We establish a graphical characterization of the minimal search space for causal bandits with conditional interventions and propose an efficient algorithm to identify it.

The Causal Information Bottleneck and Optimal Causal Variable Abstractions

Francisco N. F. Q. Simoes, Mehdi Dastani, Thijs van Ommen

Submitted to the 41st conference on Uncertainty in Artificial Intelligence (UAI 2025).

We propose the Causal Information Bottleneck (CIB), a method for learning variable abstractions suitable for causal tasks.

Fundamental Properties of Causal Entropy and Information Gain

Francisco N. F. Q. Simoes, Mehdi Dastani, Thijs van Ommen

In Proceedings of the 3rd Conference on Causal Learning and Reasoning (CLeaR) 2024. Accepted for both poster and oral presentation.

This research contributes to the formal understanding of the notions of causal entropy and causal information gain by establishing and analyzing fundamental properties of these concepts, including bounds and chain rules. Furthermore, we elucidate the relationship between causal entropy and stochastic interventions. We also propose definitions for causal conditional entropy and causal conditional information gain.

Causal Entropy and Information Gain for Measuring Causal Control

Francisco N. F. Q. Simoes, Mehdi Dastani, Thijs van Ommen

In Proceedings of the European Conference on Artificial Intelligence (ECAI) 2023. Work presented at the third XI-ML workshop of ECAI 2023.

We introduce causal versions of entropy and mutual information, termed causal entropy and causal information gain, which are designed to assess how much control a feature provides over the outcome variable. These newly defined quantities capture changes in the entropy of a variable resulting from interventions on other variables.

Jaccard Kernel PCA in genotype and gene-burden data for ALS

Francisco N. F. Q. Simoes (supervisor: Kevin Kenna)

Report for my internship at the Utrecht Brain Center, part of the Utrecht Medical Center (2020).

The project aimed to study whether one could enhance population stratification control in large-scale genetic studies of amyotrophic lateral sclerosis (ALS) by using Jaccard principal component analysis (jPCA) as an alternative to standard methods like PCA, which are ineffective for rare genetic variants. To that end, I developed a set of scripts (written in R) capable of running jPCA on very large datasets (through parallelization), and another set that allows for arbitrary positive-integer gene-burden values.

The Monoidal Category of D-branes in a Kazama-Suzuki Model

Francisco N. F. Q. Simoes (supervisors: Stefan Vandoren, Ana Ros Camacho)

My Master's thesis. Written for completion of the MSc program in Theoretical Physics (2019).

This thesis consists of an application of category theory to string theory. Concretely, we show that the D-branes of the most prolific Kazama-Suzuki model (which is an N = 2 superconformal field theory with central charge c = 9) form a category finite on objects. We furthermore prove that this category admits a notion of tensor product and thus a monoidal structure. I believe this thesis can be useful to someone trying to understand Kac-Moody algebras in the context of string theory, and representation theory of the Virasoro algebra in general.

Posters and Talks

Supervision & Teaching

Intervention Selection with Railway Simulators - a causal approach to delay management

2024

Supervisee: Tomás Iken. Co-supervisor: Thijs van Ommen.

This Master's project was incorporated into the student's internship at ProRail.

The student implemented various multi-armed bandit (MAB) algorithms to select the best delay-reducing intervention in a railway system, which was modeled by a simulator that the learning algorithm could interact with.
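For readers unfamiliar with the setting, the following is a minimal sketch of one standard MAB algorithm, UCB1, interacting with a simulator. It is an illustration only: the project description does not say which algorithms were used, and the function names `ucb1` and `make_toy_simulator`, the Bernoulli reward model, and the success probabilities are all assumptions made here, not part of the project or of any ProRail simulator.

```python
import math
import random


def ucb1(pull, n_arms, n_rounds):
    """Run UCB1 and return (pull counts, empirical mean rewards) per arm.

    `pull(arm)` is a hypothetical simulator call returning a reward in [0, 1].
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            # Play every arm once so that each has an estimate.
            arm = t - 1
        else:
            # Pick the arm with the highest upper confidence bound.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the empirical mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


def make_toy_simulator(success_probs, seed=0):
    """Toy stand-in for a simulator (assumption): each arm is an intervention
    that reduces delay with a fixed, unknown probability."""
    rng = random.Random(seed)

    def pull(arm):
        return 1.0 if rng.random() < success_probs[arm] else 0.0

    return pull


counts, means = ucb1(make_toy_simulator([0.2, 0.5, 0.8]), n_arms=3, n_rounds=2000)
best_arm = max(range(3), key=lambda a: counts[a])
print("pulls per intervention:", counts)
print("most-pulled intervention:", best_arm)
```

In this toy setup each arm stands for one candidate intervention, and the algorithm gradually concentrates its pulls on the intervention with the highest estimated reward while still occasionally testing the others.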

TopICS: a topological intervention candidate selection algorithm

2023-2024

Supervisee: Sem Yedema. Co-supervisor: Thijs van Ommen.

This Master's thesis was incorporated into the student's internship at ProRail.

The goal of this project was to implement a search space reduction algorithm which limits the number of interventions that need to be tested in order to find the optimal intervention for a given target.

Causal discovery from train network data with background knowledge

2022-2023

Supervisee: Vera Schoonderwoerd. Co-supervisor: Thijs van Ommen.

This Master's thesis was incorporated into the student's internship at ProRail.

The goal of this project was to learn an SCM describing train delay data provided by ProRail. The student applied a modified version of the FCI algorithm to the data. This algorithm took into account the background knowledge one has about causal relationships between train delays, making it feasible to learn the causal graph despite the high number of variables. Each structural equation was then estimated by fine-tuning a neural network that had previously been trained on the entire dataset and that also incorporated exogenous variables external to the delay data.

Teaching

Advanced Machine Learning - MSc Computer Science

2025

Tutorial coordinator; Guest lecturer; Teaching Assistant

Machine Learning - BSc Computer Science

2024

Tutorial coordinator; Teaching Assistant

Advanced Machine Learning - MSc Computer Science

2024

Tutorial coordinator; Guest lecturer; Teaching Assistant

Machine Learning - BSc Computer Science

2023

Tutorial coordinator; Teaching Assistant

Advanced Machine Learning - MSc Computer Science

2023

Tutorial coordinator; Guest lecturer; Teaching Assistant

Machine Learning - BSc Computer Science

2022

Tutorial coordinator; Teaching Assistant

Calculus and Linear Algebra - University College Utrecht

2019

Teaching Assistant