Copula-Guided Causal Simulators

Master's Thesis

Supervised causal learning (SCL) trains neural networks to recover causal graphs from data. Training relies on large numbers of synthetic pairs (G, D_G), where G is a graph (typically a DAG) and D_G is a dataset sampled from a structural causal model (SCM) that is Markov with respect to G.

Most existing SCL work uses hand-crafted SCM generators: graphs are sampled from simple random graph models, and mechanisms are linear or drawn from small families of toy nonlinear functions. This often yields unrealistic dependencies and poor out-of-distribution performance when the trained models are applied to real data.
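To make the baseline concrete, the following is a minimal sketch of such a hand-crafted generator. It assumes an Erdős–Rényi-style DAG over a fixed node ordering and linear mechanisms with standard Gaussian noise; the function names, the weight range, and the edge probability are illustrative choices, not taken from any particular SCL paper.

```python
import random


def sample_dag(n_nodes, edge_prob, rng):
    """Sample a DAG as {node: list of parents}.

    Each edge i -> j with i < j is kept with probability edge_prob, so the
    node index order is a valid topological order by construction.
    """
    return {j: [i for i in range(j) if rng.random() < edge_prob]
            for j in range(n_nodes)}


def sample_linear_scm_data(parents, n_samples, rng):
    """Draw samples from a linear-Gaussian SCM that is Markov to the DAG."""
    # Hypothetical weight distribution: magnitude in [0.5, 2.0], random sign.
    weights = {j: {i: rng.uniform(0.5, 2.0) * rng.choice([-1, 1]) for i in pa}
               for j, pa in parents.items()}
    nodes = sorted(parents)
    data = []
    for _ in range(n_samples):
        x = {}
        for j in nodes:  # parents always have smaller indices
            noise = rng.gauss(0.0, 1.0)
            x[j] = sum(weights[j][i] * x[i] for i in parents[j]) + noise
        data.append([x[j] for j in nodes])
    return data


rng = random.Random(0)
graph = sample_dag(n_nodes=5, edge_prob=0.4, rng=rng)
dataset = sample_linear_scm_data(graph, n_samples=200, rng=rng)
```

Each call yields one training pair (G, D_G). Every dependence in such data is linear with Gaussian noise, which is the kind of simplification that can limit transfer to real data.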

A promising alternative is to learn local dependence structure from real datasets, in a graph-agnostic way, and then plug this realistic “coupling library” into arbitrary graphs. Copulas and vine copulas are a natural tool here: they separate marginals from dependence and allow us to model flexible bivariate and conditional bivariate relationships. If we restrict the maximum node degree (e.g. to 3–4 parents), we can hope to build fast, stable, and realistic generators based on pairwise and conditional pair copulas.
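The separation of marginals from dependence can be illustrated with a small sketch. It assumes a Gaussian pair copula fitted from rank-based pseudo-observations, which is only one of many possible copula families; the helper names and the toy data are hypothetical and serve illustration only.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pseudo_observations(values):
    """Map a sample to (0, 1) via scaled ranks, discarding its marginal."""
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    u = [0.0] * n
    for rank, k in enumerate(order, start=1):
        u[k] = rank / (n + 1)
    return u


def fit_gaussian_copula(x, y):
    """Estimate the Gaussian copula correlation from normal scores."""
    zx = [STD_NORMAL.inv_cdf(u) for u in pseudo_observations(x)]
    zy = [STD_NORMAL.inv_cdf(u) for u in pseudo_observations(y)]
    # Normal scores of ranks are symmetric around zero, so no centering.
    sxy = sum(a * b for a, b in zip(zx, zy))
    return sxy / math.sqrt(sum(a * a for a in zx) * sum(b * b for b in zy))


def sample_gaussian_copula(rho, n_samples, rng):
    """Draw (u, v) pairs with uniform marginals and Gaussian dependence."""
    pairs = []
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)))
    return pairs


def empirical_quantile(sorted_values, u):
    """Map u in [0, 1] back to the data scale via the empirical marginal."""
    n = len(sorted_values)
    return sorted_values[min(int(u * n), n - 1)]


# Toy "real" data with skewed marginals and a nonlinear relationship.
rng = random.Random(1)
x = [rng.expovariate(1.0) for _ in range(500)]
y = [math.exp(xi) + 0.1 * rng.gauss(0.0, 1.0) for xi in x]

rho_hat = fit_gaussian_copula(x, y)                # dependence only
synthetic = sample_gaussian_copula(rho_hat, 500, rng)
sx, sy = sorted(x), sorted(y)                      # marginals only
synthetic_xy = [(empirical_quantile(sx, u), empirical_quantile(sy, v))
                for u, v in synthetic]
```

The fitted correlation depends only on the ranks of the data, so the same copula could be paired with entirely different marginals. A realistic coupling library would likely need richer families than the Gaussian copula, for example to capture tail dependence.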

The core idea of this thesis is therefore: Estimate a collection of pairwise and conditional pair copulas from real data, and use them as local building blocks to generate synthetic datasets for arbitrary graphs (directed or undirected) with bounded degree.
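As a rough illustration of this idea, the sketch below propagates samples along a directed tree, attaching one pair copula to each edge. It makes strong simplifying assumptions that are not part of the proposal: every node has at most one parent, all pair copulas are Gaussian, and the "coupling library" is a hypothetical list of correlation parameters. Nodes with several parents would additionally require conditional pair copulas, which this sketch does not cover.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_tree_from_pair_copulas(parent, rho, n_samples, rng):
    """Sample copula-scale data on a directed tree, one pair copula per edge.

    parent[j] is the single parent of node j (None for a root); nodes must be
    listed in topological order. rho[(i, j)] is the Gaussian copula
    correlation attached to edge i -> j.
    """
    nodes = list(parent)
    rows = []
    for _ in range(n_samples):
        z = {}
        for j in nodes:
            eps = rng.gauss(0.0, 1.0)
            i = parent[j]
            if i is None:
                z[j] = eps
            else:
                r = rho[(i, j)]
                # Conditional draw of the child given its parent's value.
                z[j] = r * z[i] + math.sqrt(1.0 - r * r) * eps
        rows.append([STD_NORMAL.cdf(z[j]) for j in nodes])
    return rows


# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 3.
parent = {0: None, 1: 0, 2: 0, 3: 1}
# Hypothetical library of copula parameters, as if estimated from real data.
coupling_library = [0.8, -0.5, 0.3]

rng = random.Random(0)
rho = {(i, j): rng.choice(coupling_library)
       for j, i in parent.items() if i is not None}
u_data = sample_tree_from_pair_copulas(parent, rho, n_samples=300, rng=rng)
```

The output lives on the uniform scale; applying quantile functions of realistic marginals column by column would turn it into a dataset on the original data scale.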