Stemmatology (also known as stemmatics) studies the relations among variants of a document that arose from an original text through successive copying and modification of earlier versions. Its aim is to reconstruct the family tree (causal graph) of the variants.
We provide a dataset for evaluating methods for computer-assisted stemmatology. The ground truth is provided, along with evaluation criteria that allow the results of different methods to be ranked. We hope this will facilitate the development of novel approaches, including but not restricted to hierarchical clustering, graphical modeling, link analysis, phylogenetics, and string matching.
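To make two of the approach families named above concrete, the following is a minimal sketch of a naive baseline that combines string matching (Levenshtein edit distance) with single-linkage hierarchical clustering. The four toy variants, the function names, and the choice of single linkage are all assumptions made for illustration; they are not taken from the dataset or from its evaluation criteria. The merge order only groups similar variants and does not by itself give the direction of copying.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cluster_distance(c1, c2, dist):
    """Single linkage: smallest pairwise distance between two clusters."""
    return min(dist[frozenset((x, y))] for x in c1 for y in c2)


def single_linkage(variants):
    """Return the sequence of merges as (cluster, cluster, distance)."""
    dist = {frozenset((a, b)): levenshtein(variants[a], variants[b])
            for a, b in combinations(variants, 2)}
    clusters = [frozenset([name]) for name in variants]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(pair[0], pair[1], dist))
        merges.append((sorted(c1), sorted(c2), cluster_distance(c1, c2, dist)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


# Hypothetical toy variants, not drawn from the provided dataset.
variants = {
    "A": "the quick brown fox",
    "B": "the quick brown fax",
    "C": "the quick brown fax jumps",
    "D": "a quick brown fox",
}

for left, right, d in single_linkage(variants):
    print(left, "+", right, "at distance", d)
```

A real method would have to go further, for example by inferring which variant in a group is the ancestor, before its output could be compared against the ground-truth tree.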