MARTIP studies the probe method on MARTI
MARTIP uses the artificially generated dataset MARTI to study the probe method. We assume that the MARTI
data came from a real, but unknown, generative process. We add 4096 "probes" to the 1024
variables of MARTI. These probes are artificially generated variables,
including randomly generated variables that are completely independent of the target
and consequences of subsets of the original variables (including the target)
and of other probes. Importantly, no probe is a cause of the target.
Ideally, the probes should be generated from the (unknown) distribution of
non-causes of the target. Instead, we use a method for generating probes that
uses permutations of the values of some of the real variables, while enforcing
some causal dependencies.
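The permutation step can be illustrated with a minimal sketch. It is not the generator actually used for MARTIP: it covers only the shuffling of one real variable and leaves out the enforcement of causal dependencies. The function name `make_permutation_probe`, the toy column, and the seed are assumptions made for illustration.

```python
import random


def make_permutation_probe(column, rng):
    """Return a shuffled copy of one variable's values.

    Shuffling keeps the marginal distribution of the variable but
    reassigns its values to rows at random, so the resulting probe
    carries no systematic relationship to the target.
    """
    probe = list(column)  # copy, so the original column is left untouched
    rng.shuffle(probe)
    return probe


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
real_variable = [0, 1, 1, 0, 1, 0, 0, 1]  # toy stand-in for one real column
probe = make_permutation_probe(real_variable, rng)
```

A probe built this way has the same set of values as the real variable it was derived from, which is what makes it resemble a plausible non-cause.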
Assume that we want to uncover causes of the target variable (lung cancer)
and that we use a causal discovery algorithm for that purpose. The fraction of
probes selected as candidate causes is an indication of the fraction of false
positives. Because in this case we know the true data-generating model, we
can analyze how useful the probe method is, despite the ad hoc way in which
the probes are generated.
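As a rough sketch of this check, the fraction of probes among the selected candidate causes can be computed from column indices alone, assuming 0-based indices with the 1024 original variables first and the 4096 probes after them. The function name `probe_fraction` and the example selection are hypothetical; the selection stands in for the output of whatever causal discovery algorithm is used.

```python
N_ORIGINAL = 1024  # original MARTI variables (stated in the text)
N_PROBES = 4096    # probes appended after them (stated in the text)


def probe_fraction(selected_columns):
    """Fraction of the selected columns (0-based indices) that are probes."""
    selected = list(selected_columns)
    if not selected:
        return 0.0  # nothing selected, so no probes were selected either
    n_probes_selected = sum(1 for j in selected if j >= N_ORIGINAL)
    return n_probes_selected / len(selected)


# Hypothetical selection: three original variables and two probes.
selected = [3, 17, 250, 1024, 3000]
fraction = probe_fraction(selected)
```

Since no probe is a cause of the target, every selected probe is a known false positive, and a high value of `fraction` suggests that the selection also contains many false positives among the original variables.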
The data include the same 500 training examples as MARTI (in the same order).
All original variables come first and the probes are appended as extra columns.
No test data are provided.
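Given this column layout, one example can be split back into its original variables and its probes by position. The sketch below assumes that each row holds exactly 1024 + 4096 feature values and that the target is stored separately; neither the file layout nor the helper name `split_row` comes from the data documentation.

```python
N_ORIGINAL = 1024  # original variables come first
N_PROBES = 4096    # probes are appended as extra columns


def split_row(row):
    """Split one example into (original variables, probes)."""
    if len(row) != N_ORIGINAL + N_PROBES:
        raise ValueError("unexpected number of columns: %d" % len(row))
    return row[:N_ORIGINAL], row[N_ORIGINAL:]


# Placeholder row of zeros standing in for one training example.
row = [0] * (N_ORIGINAL + N_PROBES)
original, probes = split_row(row)
```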
Download the data in text format [7.8 Mb].
Download the data in Matlab format [8 Mb].