Simulation data sets for testing causal discovery in the geosciences

Submitted: 2016-02-08


When causal discovery is applied in the geosciences, the results are hard to evaluate because ground truth is generally unavailable. To fill this gap, we simulate two processes that often dominate geophysical systems, advection and diffusion, on a 2D numerical grid. This yields several data sets (pure advection, pure diffusion, and a combination of the two) for which the ground truth is known. The accompanying paper shows results obtained by using constraint-based structure learning to identify the underlying dynamic processes from spatio-temporal data. We hope these data sets will serve as a benchmark on which others can test their causal discovery algorithms and compare their results with ours.
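
To illustrate the kind of simulation described above, here is a minimal sketch of advection and diffusion on a 2D grid. It is not the code used to produce these data sets. The explicit finite-difference scheme, the periodic boundaries, the grid size, the random initial field, and all names and parameter values (step, simulate, u, v, kappa, dt, dx) are assumptions chosen only for illustration.

```python
import numpy as np

def step(field, u, v, kappa, dt=1.0, dx=1.0):
    """Advance a 2D field by one explicit time step (periodic boundaries assumed)."""
    # First-order upwind differences for advection; valid for u >= 0 and v >= 0.
    dfdx = (field - np.roll(field, 1, axis=1)) / dx
    dfdy = (field - np.roll(field, 1, axis=0)) / dx
    # Five-point Laplacian for diffusion.
    lap = (np.roll(field, 1, axis=0) + np.roll(field, -1, axis=0)
           + np.roll(field, 1, axis=1) + np.roll(field, -1, axis=1)
           - 4.0 * field) / dx**2
    return field + dt * (-u * dfdx - v * dfdy + kappa * lap)

def simulate(n_steps, shape=(20, 20), u=0.5, v=0.0, kappa=0.1, seed=0):
    """Return an array of shape (n_steps + 1, ny, nx) holding the field over time."""
    rng = np.random.default_rng(seed)
    field = rng.standard_normal(shape)  # hypothetical random initial condition
    frames = [field]
    for _ in range(n_steps):
        field = step(field, u, v, kappa)
        frames.append(field)
    return np.stack(frames)

# Three illustrative variants mirroring the data sets named above.
pure_advection = simulate(50, u=0.5, kappa=0.0)
pure_diffusion = simulate(50, u=0.0, kappa=0.1)
combined = simulate(50, u=0.5, kappa=0.1)
```

In this sketch the ground truth is known by construction: with u > 0 and v = 0, each grid point is driven by its own past value and that of its upstream neighbour along the x axis, while diffusion adds a dependence on all four nearest neighbours. The default values keep u*dt/dx + 4*kappa*dt/dx**2 below 1, which this explicit scheme needs to remain stable.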

