Causality Workbench (Challenges in Machine Learning)

SIDO: A pharmacology dataset

Contact: Isabelle Guyon - Submitted: 2008-09-12 02:53 - Views : 15356

This is one of the datasets of the first causality challenge: causation and prediction. The goal of the challenge was to make predictions under manipulations. SIDO (SImple Drug Operation mechanisms) contains descriptors of molecules, which have...

PROMO: Simple causal effects in time series

Contact: Jean-Philippe Pellet - Submitted: 2011-01-26 17:59 - Views : 18703

The PROMO dataset poses the task of identifying which promotions affect sales. Artificial data on 1000 promotion variables and 100 product sales series is provided. The goal is to predict a 1000x100 boolean influence matrix, indicating for each (i,j)...
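As a rough illustration of the output format only, the sketch below builds a boolean influence matrix with one row per promotion and one column per product by thresholding absolute Pearson correlation on small synthetic series. The function names, the threshold, the toy data generator, and the row/column orientation (assumed from the 1000x100 shape) are all assumptions for illustration; correlation is a naive stand-in for a causal test and is not the method proposed by the dataset's authors.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def influence_matrix(promotions, sales, threshold=0.5):
    """Return a len(promotions) x len(sales) boolean matrix.

    Entry [i][j] is True when |corr(promotion i, product j)| >= threshold.
    The threshold is a hypothetical choice, not taken from the dataset.
    """
    return [[abs(pearson(p, s)) >= threshold for s in sales] for p in promotions]


def make_toy_data(n_promos=4, n_products=3, n_steps=200, seed=0):
    """Synthetic data: promotion i raises sales of product i % n_products only."""
    rng = random.Random(seed)
    promotions = [[rng.randint(0, 1) for _ in range(n_steps)] for _ in range(n_promos)]
    sales = []
    for j in range(n_products):
        series = [rng.gauss(0, 0.3) for _ in range(n_steps)]  # baseline noise
        for i in range(n_promos):
            if i % n_products == j:
                series = [v + 2.0 * p for v, p in zip(series, promotions[i])]
        sales.append(series)
    return promotions, sales


promotions, sales = make_toy_data()
matrix = influence_matrix(promotions, sales)
```

For the real data the same function would be called with 1000 promotion series and 100 sales series, yielding the 1000x100 shape the task asks for.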

CYTO: Causal Protein-Signaling Networks in human T cells

Contact: Karen Sachs - Submitted: 2018-04-23 20:54 - Views : 21427

This dataset consists of roughly 700 to 900 single cell recordings of the abundance of 11 phosphoproteins and phospholipids (PKC, PKA, P38, Jnk (pjnk), Raf (praf), Mek (pmek), Erk (p44/42), Akt (pakts473), PLC-gamma (plcg), PIP2, PIP3) under various...

TIED: Target Information Equivalent Dataset

Contact: Alexander Statnikov - Submitted: 2008-09-12 20:24 - Views : 16725

TIED dataset, 2008, Alexander Statnikov and Constantin Aliferis. Introduction: TIED stands for Target Information Equivalent Dataset. It is an artificial simulated dataset constructed to illustrate that there may be many minimal sets of...

SIGNET: Abscisic Acid Signaling Network

Contact: Jerry Jenkins - Submitted: 2008-11-25 20:56 - Views : 21673

The objective is to determine the set of boolean rules that describe the interactions of the nodes within this plant signaling network. The dataset includes 300 separate boolean pseudodynamic simulations of the true rules, using an asynchronous...

CINA: A marketing dataset

Contact: Isabelle Guyon - Submitted: 2008-09-12 02:37 - Views : 15634

CINA (Census Is Not Adult) is derived from census data (the UCI machine-learning repository Adult database). The data consists of census records for a number of individuals. The causal discovery task is to uncover the socio-economic factors...

  • Authors: Causality workbench team
  • Key facts: Number of variables: 132 (demographic data) + one binary target variable. Number of examples: training 16033 + 3 test sets of 10000 examples corresponding...
  • Keywords: probe.method, marketing

REGED: A genomics dataset

Contact: Isabelle Guyon - Submitted: 2008-09-12 02:55 - Views : 5695

This is one of the datasets of the first causality challenge: causation and prediction. The goal of the challenge was to make predictions under manipulations. REGED (REsimulated Gene Expression Dataset) monitors the expression of genes, which...

  • Authors: Causality workbench team
  • Key facts: Number of variables: 999 (gene expression coefficients) + one binary target variable (health status). Number of examples: training 500 + 3 test sets of 20000...
  • Keywords: genomics

MARTI: Measurement Artifacts

Contact: Isabelle Guyon - Submitted: 2008-09-12 06:02 - Views : 14445

This is one of the datasets of the first causality challenge: causation and prediction. The goal of the challenge was to make predictions under manipulations. MARTI (Measurement ARTIfact) is obtained from the same data generative process as...

WebLogs: Causal discovery in web logs

Contact: Cristian Grozea - Submitted: 2008-12-07 02:36 - Views : 4657

The task is to determine the causal structure from real data, the anonymized logs of a web server: which pages link or lead to visits of other pages. The ground truth is beyond doubt, from the referrer information, but this information will be kept for an...

  • Authors: Cristian Grozea
  • Key facts: Number of variables: 20 (daily hits for web pages); Number of instances: 512 (training set). ASCII format for input; MATLAB format allowed for output.
  • Keywords: web_logs, probabilistic
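The key facts above suggest a table of 512 instances by 20 variables in ASCII form. A minimal loading sketch follows, assuming a whitespace-separated layout with one instance per line and one column per page; the listing does not state the delimiter or orientation, so both are assumptions, as are the function names and the toy sample.

```python
def parse_ascii_matrix(text):
    """Parse whitespace-separated numeric rows, skipping blank lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            rows.append([float(f) for f in fields])
    return rows


def check_shape(rows, n_instances, n_variables):
    """True when there are n_instances rows of exactly n_variables values each."""
    return len(rows) == n_instances and all(len(r) == n_variables for r in rows)


# Toy stand-in for the real file: 3 days of hits for 4 pages.
sample = "12 0 5 7\n9 1 4 8\n\n15 2 6 3\n"
rows = parse_ascii_matrix(sample)
```

On the real training file, `check_shape(rows, 512, 20)` would be a quick sanity check before any causal analysis, provided the assumed layout holds.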

CauseEffectPairs: Distinguishing between cause and effect

Contact: Dominik Janzing - Submitted: 2010-05-04 13:53 - Views : 13833

The data set consists of 8 N x 2 matrices, each representing a cause-effect pair, and the task is to identify which variable is the cause and which one the effect. The origin of the data is hidden from the participants but known to the organizers....

STEMMATOLOGY: Computer-assisted stemmatology

Contact: Teemu Roos - Submitted: 2008-10-29 09:18 - Views : 13826

Stemmatology (a.k.a. stemmatics) studies relations among different variants of a document that have been gradually built from an original text by copying and modifying earlier versions. The aim of such study is to reconstruct the family tree (causal...

MIDS: MIxed Dynamic Systems

Contact: Denver Dash - Submitted: 2008-11-23 06:12 - Views : 15257

Summary: This data represents a 9-variable (labeled X1...X9) dynamic system with several dynamic processes acting on qualitatively different time scales. The goal is to learn a causal model of the system with the training data, and...

NOISE: Causal Directions in Noisy Environment

Contact: Guido Nolte - Submitted: 2009-10-05 21:17 - Views : 5222

This challenge has two parts, a simulation and real data. Simulation: Data are simulated as a superposition of bivariate unidirectional interaction plus additive mixed and non-white noise. The simulations were done with AR-models with...

  • Authors: G. Nolte
  • Key facts: Simulated Data: 1000 examples of bivariate time series with 6000 time points each. Real Data: EEG data of 10 subjects measured at rest with eyes closed....
  • Keywords: Time series, mixed noise, bivariate, EEG

SECOM: SEmi COnductor Manufacturing process control data

Contact: Michael McCann - Submitted: 2008-11-19 18:55 - Views : 19480

Abstract: A complex modern semi-conductor manufacturing process is normally under consistent surveillance via the monitoring of signals/variables collected from sensors and/or process measurement points. However, not all of these signals are equally...

SETFI: Manufacturing data: Semiconductor Tool Fault Isolation

Contact: Eugene Tuv - Submitted: 2008-11-24 23:46 - Views : 27361

During the semiconductor fabrication process, each wafer goes through a product-specific sequence of hundreds of operations in batches called lots. Every lot goes through each operation in the sequence. At each operation a lot could go through only one of...

WearableAccelerometersDataset: Wearable Computing: Classification of Body Postures and Movements (PUC-Rio) Data Set

Contact: Ugulino - Submitted: 2013-07-30 03:58 - Views : 4694

During the last 5 years, research on Human Activity Recognition (HAR) has reported on systems showing good overall recognition performance. As a consequence, HAR has been considered as a potential technology for e-health systems. Here, we propose a...

GeoSimSets: Simulation data sets for testing causal discovery in the geosciences

Contact: Imme Ebert-Uphoff - Submitted: 2016-02-08 02:16 - Views : 1708

When using causal discovery in the geosciences, it is hard to evaluate the results, because there is generally no ground truth available. To fill this gap we simulate two important processes that are often dominant in geophysical processes,...

ECOLI: E. coli gene expression

Contact: Sisi Ma - Submitted: 2016-02-16 19:54 - Views : 902

Data simulated with Gene Network Weaver.

Yeast: Yeast gene expression

Contact: Sisi Ma - Submitted: 2016-02-16 19:58 - Views : 1057

Data generated by Gene Network Weaver.

reged01: REGED

Contact: Sisi Ma - Submitted: 2016-02-16 20:06 - Views : 1040

The REGED network was induced from 1,000 randomly selected genes in a lung cancer gene expression dataset.

GearBoxData: Gear Box Fault Diagnosis Data

Contact: Yogesh Pandya - Submitted: 2018-03-18 19:18 - Views : 2343

Faulty and healthy gearbox data sets need to be analyzed in detail. We created this dataset for researchers working on wind turbine gearbox fault diagnosis.