Multiple kernel learning dataset on protein fold prediction

unknown (from UCI repository)
multi-class multi-kernel protein-fold-prediction
unknown (793.6 KB)

    (Zipped) TAR archive ./._DingShenDam, DingShenDam, DingShenDam/._Composition_Test.csv, DingShenDam/Composition_Test.csv, DingShenDam/._Composition_Train.csv, DingShenDam/Composition_Train.csv, DingShenDam/._Hydrophobicity_Test.csv, DingShenDam/Hydrophobicity_Test.csv, DingShenDam/._Hydrophobicity_Train.csv, DingShenDam/Hydrophobicity_Train.csv, DingShenDam/._L14_Test.csv, DingShenDam/L14_Test.csv, DingShenDam/._L14_Train.csv, DingShenDam/L14_Train.csv, DingShenDam/._L1_Test.csv, DingShenDam/L1_Test.csv, DingShenDam/._L1_Train.csv, DingShenDam/L1_Train.csv, DingShenDam/._L30_Test.csv, DingShenDam/L30_Test.csv, DingShenDam/._L30_Train.csv, DingShenDam/L30_Train.csv, DingShenDam/._L4_Test.csv, DingShenDam/L4_Test.csv, DingShenDam/._L4_Train.csv, DingShenDam/L4_Train.csv, DingShenDam/._Polarity_Test.csv, DingShenDam/Polarity_Test.csv, DingShenDam/._Polarity_Train.csv, DingShenDam/Polarity_Train.csv, DingShenDam/._Polarizability_Test.csv, DingShenDam/Polarizability_Test.csv, DingShenDam/._Polarizability_Train.csv, DingShenDam/Polarizability_Train.csv, DingShenDam/._Secondary_Test.csv, DingShenDam/Secondary_Test.csv, DingShenDam/._Secondary_Train.csv, DingShenDam/Secondary_Train.csv, DingShenDam/._SWblosum62_Test.csv, DingShenDam/SWblosum62_Test.csv, DingShenDam/._SWblosum62_Train.csv, DingShenDam/SWblosum62_Train.csv, DingShenDam/._SWpam50_Test.csv, DingShenDam/SWpam50_Test.csv, DingShenDam/._SWpam50_Train.csv, DingShenDam/SWpam50_Train.csv, DingShenDam/._t_Test.csv, DingShenDam/t_Test.csv, DingShenDam/._t_Train.csv, DingShenDam/t_Train.csv, DingShenDam/._Volume_Test.csv, DingShenDam/Volume_Test.csv, DingShenDam/._Volume_Train.csv, DingShenDam/Volume_Train.csv

This dataset addresses protein fold prediction (multiclass classification with 27 classes) and is based on a subset of the PDB-40D SCOP collection. It extends the original dataset by Ding with the pseudo-amino acid compositions proposed by Shen and Chou and the Smith-Waterman string kernels employed by Damoulas and Girolami.

    The archive contains *_Train.csv and *_Test.csv files describing the 12 feature spaces that should be used to construct the individual base kernels for MKL. The data are split into independent training and test sets, with 311 samples for training and 383 samples for testing. The labels are provided in the t_Train.csv and t_Test.csv files.
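
As a rough illustration of building one base kernel per feature space, the sketch below loads each *_Train.csv / *_Test.csv pair and computes a Gaussian kernel on it. The feature-space names come from the archive listing. Everything else is an assumption made for illustration: that each CSV is a plain comma-separated numeric matrix with no header and one sample per row, the helper names `load_split`, `rbf_kernel` and `base_kernels`, and the choice of a Gaussian kernel with bandwidth 1/d. The SWblosum62 and SWpam50 files may already hold similarity scores, in which case they could be used directly rather than passed through another kernel.

```python
from pathlib import Path

import numpy as np

# The 12 feature spaces named in the archive listing ("t" holds the labels).
FEATURE_SPACES = [
    "Composition", "Secondary", "Hydrophobicity", "Volume", "Polarity",
    "Polarizability", "L1", "L4", "L14", "L30", "SWblosum62", "SWpam50",
]


def load_split(root, name, split):
    """Load <root>/<name>_<split>.csv as a 2-D float array.

    Assumes a comma-separated numeric file without a header row.
    """
    path = Path(root) / f"{name}_{split}.csv"
    return np.loadtxt(path, delimiter=",", ndmin=2)


def rbf_kernel(A, B, gamma=None):
    """Gaussian kernel between the rows of A and the rows of B."""
    if gamma is None:
        gamma = 1.0 / A.shape[1]  # assumed default bandwidth, not from the source
    # Squared Euclidean distances via the expansion |a|^2 + |b|^2 - 2 a.b
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clamp tiny negative round-off


def base_kernels(root):
    """Return {name: (K_train, K_test)}, one kernel pair per feature space."""
    kernels = {}
    for name in FEATURE_SPACES:
        X_train = load_split(root, name, "Train")
        X_test = load_split(root, name, "Test")
        # K_train is (n_train, n_train); K_test is (n_test, n_train).
        kernels[name] = (rbf_kernel(X_train, X_train), rbf_kernel(X_test, X_train))
    return kernels
```

Calling `base_kernels("DingShenDam")` on the extracted archive would, under these assumptions, give 12 kernel pairs that an MKL method can then weight and combine. The labels in t_Train.csv and t_Test.csv can be read separately with `load_split(root, "t", "Train")`.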
