This is one of the datasets of the first causality challenge: causation and prediction. The goal of the challenge was to make predictions under manipulations.
REGED (REsimulated Gene Expression Dataset) records the expression of genes that could be responsible for lung cancer. The data are "re-simulated", i.e., generated by a model derived from real human lung-cancer microarray gene expression data. From the causal discovery point of view, the goal is to separate genes whose activity causes lung cancer from those whose activity is a consequence of the disease.
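To see why separating causes from consequences matters when predicting under manipulations, consider the toy sketch below. It is not the REGED generative model: the three-variable chain (cause -> target -> effect), the Gaussian noise, the continuous target, and the names `simulate` and `pearson` are all assumptions made for illustration. In the natural regime both the cause and the effect correlate with the target. When the effect variable is set by an external agent, it stops carrying information about the target, while the cause remains predictive.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n, manipulate_effect, seed):
    """Draw n samples from a hypothetical chain: cause -> target -> effect.

    If manipulate_effect is True, the effect variable is set at random by
    an external agent, which cuts its dependence on the target.
    """
    rng = random.Random(seed)
    cause, target, effect = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        t = c + rng.gauss(0, 1)          # target depends on its cause
        if manipulate_effect:
            e = rng.gauss(0, 1)          # manipulated: ignores the target
        else:
            e = t + rng.gauss(0, 1)      # natural: consequence of the target
        cause.append(c)
        target.append(t)
        effect.append(e)
    return cause, target, effect


nat_cause, nat_target, nat_effect = simulate(5000, False, seed=0)
man_cause, man_target, man_effect = simulate(5000, True, seed=1)

r_cause_natural = pearson(nat_cause, nat_target)
r_effect_natural = pearson(nat_effect, nat_target)
r_cause_manipulated = pearson(man_cause, man_target)
r_effect_manipulated = pearson(man_effect, man_target)

print("natural:     cause %.2f, effect %.2f" % (r_cause_natural, r_effect_natural))
print("manipulated: cause %.2f, effect %.2f" % (r_cause_manipulated, r_effect_manipulated))
```

A predictor that relies on the effect variable does well on natural data but degrades once that variable is manipulated, which is why identifying the causal direction matters here.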
For the pot-luck challenge, the proposed task is to discover the causal network in the neighborhood of the target.