SYLVA is an ecology dataset

The task of SYLVA is to classify forest cover types. The forest cover type for 30 x 30 meter cells is obtained from US Forest Service (USFS) Region 2 Resource Information System (RIS) data. We reduced it to a two-class classification problem (classifying Ponderosa pine vs. everything else). The data consists of 216 input variables. Each pattern is composed of 4 records: 2 true records matching the target and 2 records picked at random. Thus half of the features are distractors. The SYLVA dataset was used previously in the Performance Prediction challenge, the Model Selection game, and the Agnostic Learning vs. Prior Knowledge (ALvsPK) challenge.
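The pattern layout described above can be illustrated with a minimal sketch on synthetic data. It is not the organizers' preprocessing code. It assumes that the four records have equal length (216 / 4 = 54 variables each) and that the true records come first; neither the record order nor any feature shuffling is stated in the description. The names `make_pattern`, `synthetic_record`, and `pool` are hypothetical.

```python
import random

N_INPUTS = 216            # number of input variables, from the description
RECORDS_PER_PATTERN = 4   # 2 true records + 2 random records
# Assumption: all four records have the same length.
RECORD_LEN = N_INPUTS // RECORDS_PER_PATTERN  # 54


def make_pattern(true_records, pool, rng):
    """Concatenate 2 true records with 2 records drawn at random from pool.

    The ordering (true records first) is an assumption made for illustration.
    """
    if len(true_records) != 2:
        raise ValueError("expected exactly 2 true records")
    distractors = rng.sample(pool, 2)  # 2 distinct records picked at random
    pattern = []
    for record in list(true_records) + distractors:
        pattern.extend(record)
    return pattern


rng = random.Random(0)  # fixed seed so the sketch is reproducible


def synthetic_record():
    """Return a made-up record of RECORD_LEN values (not real SYLVA data)."""
    return [rng.random() for _ in range(RECORD_LEN)]


true_records = [synthetic_record(), synthetic_record()]
pool = [synthetic_record() for _ in range(10)]

pattern = make_pattern(true_records, pool, rng)
distractor_fraction = 2 * RECORD_LEN / len(pattern)

print(len(pattern), distractor_fraction)  # prints: 216 0.5
```

Under these assumptions, 108 of the 216 variables come from the randomly picked records, which matches the statement that half of the features are distractors.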
This dataset is used in the Active Learning Challenge by the Causality Workbench.