
Causality Challenge #3: Cause-effect pairs

Join our Google group causalitychallenge to stay informed!

The challenge is over, but we are running a follow-up challenge: the CHALEARN Fast Causation Coefficient Challenge (until June 15, 2014).

View a video

A causes B


The problem of attributing causes to effects is pervasive in science, medicine, economics, and almost every aspect of everyday life that involves human reasoning and decision making. What affects your health? The economy? Climate change? The gold standard for establishing causal relationships is the randomized controlled experiment. However, experiments are costly, while non-experimental "observational" data collected routinely around the world are readily available. Unraveling potential cause-effect relationships from such observational data could save a lot of time and effort.
Consider, for instance, a target variable B, such as the occurrence of "lung cancer" in patients. The goal would be to find whether a factor A, such as "smoking", might cause B. The objective of the challenge is to rank pairs of variables {A, B} to prioritize experimental verification of the conjecture that A causes B.
As is well known, "correlation does not mean causation". More generally, observing a statistical dependency between A and B does not imply that A causes B or that B causes A; A and B could be consequences of a common cause. But is it possible to determine from the joint observation of samples of two variables A and B that A should be a cause of B? New algorithms that tackle this problem have appeared in the literature in the past few years. This challenge is an opportunity to evaluate them and to propose new techniques that improve on them.
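The common-cause case can be illustrated with a small simulation. This is only a sketch on synthetic data, not challenge data: the hidden variable C, the noise level 0.3, and the sample size are all assumptions chosen for illustration. A and B are both driven by C, so they are strongly correlated in observational data, yet setting A by randomization (an experiment) leaves B unaffected.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000

# Hypothetical hidden common cause C (an assumption of this example).
c = [rng.gauss(0, 1) for _ in range(n)]

# Observational regime: A and B each depend on C, not on each other.
a_obs = [ci + 0.3 * rng.gauss(0, 1) for ci in c]
b_obs = [ci + 0.3 * rng.gauss(0, 1) for ci in c]
r_observational = pearson(a_obs, b_obs)

# Experimental regime: A is assigned at random, independently of C.
# B is generated exactly as before, since A was never one of its causes.
a_do = [rng.gauss(0, 1) for _ in range(n)]
b_do = [ci + 0.3 * rng.gauss(0, 1) for ci in c]
r_interventional = pearson(a_do, b_do)

print("correlation in observational data:", round(r_observational, 3))
print("correlation under randomized A:   ", round(r_interventional, 3))
```

The dependency visible in the observational data disappears once A is randomized, which is why a dependency alone cannot establish that A causes B.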
We provide hundreds of pairs of real variables with known causal relationships from domains as diverse as chemistry, climatology, ecology, economy, engineering, epidemiology, genomics, medicine, physics, and sociology. These are intermixed with controls (pairs of independent variables and pairs of variables that are dependent but not causally related) and semi-artificial cause-effect pairs (real variables mixed in various ways to produce a given outcome).
This challenge is limited to pairs of variables deprived of their context. Thus, constraint-based methods relying on conditional independence tests and/or graphical models are not applicable. The goal is to push the state of the art in complementary methods, which can eventually disambiguate Markov equivalence classes. If you are skeptical that this is possible, try this quiz: examine the plot below of values of variable B plotted as a function of values of variable A. Can you guess which one is a cause of the other? Hint: some non-linear functions are non-invertible.

A->B or A<-B ?
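To make the hint concrete, here is a minimal sketch on synthetic data, in the spirit of the additive-noise approaches found in the literature. It is not the challenge's scoring method and it does not answer the quiz above. The assumptions, all made for illustration, are: B = A^2 + noise, a piecewise-constant regression over equal-count bins, and a crude dependence score (how much the residual spread varies across bins) standing in for a proper independence test. The helper names quantile_bins and residual_dependence are hypothetical.

```python
import random
import statistics

def quantile_bins(x, y, k):
    """Sort the pairs by x and split the y values into k equal-count bins."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    size = len(x) // k
    return [[y[i] for i in order[j * size:(j + 1) * size]] for j in range(k)]

def residual_dependence(cause, effect, k=20):
    """Crude score: variation of the residual spread across bins of `cause`.

    Predicting `effect` by its bin mean leaves residuals whose standard
    deviation within a bin equals the bin's own standard deviation. If the
    residuals are independent of `cause`, that spread is about the same in
    every bin and the score is close to zero.
    """
    spreads = [statistics.pstdev(b) for b in quantile_bins(cause, effect, k)]
    return statistics.pstdev(spreads) / statistics.mean(spreads)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 4000

# Synthetic pair with a known direction: A causes B through a
# non-invertible function plus independent noise.
a = [rng.uniform(-1, 1) for _ in range(n)]
b = [ai ** 2 + rng.gauss(0, 0.1) for ai in a]

score_a_to_b = residual_dependence(a, b)  # model B = f(A) + noise
score_b_to_a = residual_dependence(b, a)  # model A = g(B) + noise

# Prefer the direction whose residuals look less dependent on the input.
guess = "A->B" if score_a_to_b < score_b_to_a else "A<-B"
print("score A->B:", round(score_a_to_b, 3))
print("score A<-B:", round(score_b_to_a, 3))
print("guessed direction:", guess)
```

Because the square function cannot be inverted, no single-valued function of B recovers A: the residuals of the backward fit spread widely where B is large and narrowly where B is near zero. That asymmetry is what the score picks up. A real method would need a more flexible regressor and a genuine independence test.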


Challenge Rules

Schedule (updated June 28, 2013)

Sponsored by Pascal2 and part of the IJCNN 2013 contests