View uci-20070111 mfeat-karhunen (public)

2010-11-06 09:58 by mldata | Version 1
Summary

(No information yet)

License
unknown (from Weka repository)
Dependencies
Tags
arff slurped Weka
Attribute Types
Integer,Floating Point
Download
# Instances: 2000 / # Attributes: 65
HDF5 (1023.8 KB) XML CSV ARFF LibSVM Matlab Octave


Original Data Format
arff
Name
mfeat
Version mldata
0
Comment

The multi-feature digit dataset

Owned and donated by:

Robert P.W. Duin, Department of Applied Physics, Delft University of Technology, P.O. Box 5046, 2600 GA Delft, The Netherlands

email: duin@ph.tn.tudelft.nl, http://www.ph.tn.tudelft.nl/~duin, tel. +31 15 2786143

Usage

A slightly different version of the database is used in

M. van Breukelen, R.P.W. Duin, D.M.J. Tax, and J.E. den Hartog, Handwritten digit recognition by combined classifiers, Kybernetika, vol. 34, no. 4, 1998, 381-386.

M. van Breukelen and R.P.W. Duin, Neural Network Initialization by Combined Classifiers, in: A.K. Jain, S. Venkatesh, B.C. Lovell (eds.), ICPR'98, Proc. 14th Int. Conference on Pattern Recognition (Brisbane, Aug. 16-20),

The database in its present form is used in:

A.K. Jain, R.P.W. Duin, J. Mao, Statistical Pattern Recognition: A Review, in preparation

Description

This dataset consists of features of handwritten numerals (0 to 9) extracted from a collection of Dutch utility maps. 200 patterns per class (for a total of 2,000 patterns) were digitized as binary images. These digits are represented in terms of the following six feature sets (files):

  1. mfeat-fou: 76 Fourier coefficients of the character shapes;
  2. mfeat-fac: 216 profile correlations;
  3. mfeat-kar: 64 Karhunen-Loève coefficients;
  4. mfeat-pix: 240 pixel averages in 2 x 3 windows;
  5. mfeat-zer: 47 Zernike moments;
  6. mfeat-mor: 6 morphological features.

In each file the 2000 patterns are stored in ASCII on 2000 lines. The first 200 patterns are of class 0, followed by sets of 200 patterns for each of the classes 1 to 9. Patterns on the same line in different feature sets (files) correspond to the same original character.
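Because the class of a pattern is implied only by its line position, labels have to be rebuilt from the line index. The following is a minimal sketch of that rule, assuming zero-based line indices; the function name labels_from_line_order is hypothetical and not part of the dataset.

```python
def labels_from_line_order(n_classes=10, per_class=200):
    """Return one integer class label per line of a feature file.

    Assumes the layout described above: lines are grouped by class,
    200 lines for class 0, then 200 for class 1, and so on.
    """
    total = n_classes * per_class
    # Integer division maps line indices 0..199 to 0, 200..399 to 1, etc.
    return [line_index // per_class for line_index in range(total)]


labels = labels_from_line_order()
```

Since all six files share the same line order, the same label list applies to each of them.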

The source image dataset is lost. Using the pixel dataset (mfeat-pix), sampled versions of the original images (15 x 16 pixels) may be obtained.
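As a rough sketch of how one 240-value line of mfeat-pix could be turned back into a small image grid: the text above gives only the size 15 x 16, so the choice of 16 rows by 15 columns in row-major order is an assumption to check visually, and the helper name to_image is hypothetical.

```python
def to_image(pixel_row, n_rows=16, n_cols=15):
    """Split a flat list of pixel averages into a list of image rows.

    The 16 x 15 row-major layout is an assumption; swap n_rows and
    n_cols if the digits come out transposed or sheared.
    """
    if len(pixel_row) != n_rows * n_cols:
        raise ValueError("expected %d values, got %d"
                         % (n_rows * n_cols, len(pixel_row)))
    return [list(pixel_row[r * n_cols:(r + 1) * n_cols])
            for r in range(n_rows)]


# Placeholder values standing in for one line of mfeat-pix.
image = to_image(list(range(240)))
```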

Total number of instances:

2000 (200 instances per class)

Total number of attributes:

649 (distributed over 6 datasets, see above)

no missing attributes
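The total of 649 is the sum of the six feature-set sizes listed above (76 + 216 + 64 + 240 + 47 + 6). A small sketch of that arithmetic follows, together with the column ranges each set would occupy if the six files were joined side by side in the listed order; the joined layout and the helper name column_ranges are illustrative assumptions, not something the dataset provides.

```python
# Feature counts as listed in the description, in the listed order.
FEATURE_SIZES = {
    "mfeat-fou": 76,
    "mfeat-fac": 216,
    "mfeat-kar": 64,
    "mfeat-pix": 240,
    "mfeat-zer": 47,
    "mfeat-mor": 6,
}


def column_ranges(sizes):
    """Map each feature set to its (start, stop) columns in a joined table."""
    ranges = {}
    start = 0
    for name, width in sizes.items():
        ranges[name] = (start, start + width)
        start += width
    return ranges


total_attributes = sum(FEATURE_SIZES.values())
ranges = column_ranges(FEATURE_SIZES)
```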

Total number of classes:

10

Format:

6 files, see above. Each file contains 2000 lines, one for each instance. Attributes are space-separated and can be loaded in Matlab with "load filename". There are no missing attributes. Some are integer, others are real.
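Outside Matlab, the same space-separated layout can be read with a few lines of code. The sketch below parses text in that format into rows of floats; the function name parse_feature_text is hypothetical and the sample values are made up for illustration, not taken from the dataset.

```python
def parse_feature_text(text):
    """Parse whitespace-separated numeric lines into a list of float rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines, e.g. a trailing newline
        # split() with no argument handles runs of spaces and leading padding
        rows.append([float(token) for token in line.split()])
    return rows


# Illustrative input only; integers and reals are both read as floats.
sample = "-10.5 3.25 7\n  0.5 -1.0 2\n"
rows = parse_feature_text(sample)
```

For a real file, the text could be obtained with open(path).read(), where path points to one of the six feature files.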

Information about the dataset
CLASSTYPE: nominal
CLASSINDEX: last

Names
att1,att2,att3,att4,att5,att6,att7,att8,att9,att10,
Types
  1. numeric
  2. numeric
  3. numeric
  4. numeric
  5. numeric
  6. numeric
  7. numeric
  8. numeric
  9. numeric
  10. numeric
Data (first 10 data points)
    att1 att2 att3 att4 att5 att6 att7 att8 att9 att10 ...
    -10.2... -11.6... 11.5... -2.08... 4.04... 4.08... -2.55... -8.47... 2.13... 3.50... ...
    -5.03... -12.8... 0.16... 0.59... 3.12... 4.22... -6.41... -6.33... -0.24... 1.34... ...
    -9.63... -6.65... 0.38... -1.71... 0.30... 3.40... -7.24... -1.65... -0.87... 4.15... ...
    -6.65... -7.04... 4.10... -2.34... 3.49... 3.92... -9.87... -6.55... -1.36... 1.15... ...
    -10.6... -10.9... 0.19... 0.45... 2.19... -3.30... -8.37... -4.24... 2.96... -0.94... ...
    3.43... -3.91... -1.12... 4.03... -2.51... 1.73... -8.81... -4.18... 8.66... -2.78... ...
    -13.9... -9.61... 9.99... -3.32... 3.60... -1.11... -5.24... -3.25... 2.39... -1.80... ...
    -10.9... -11.3... 4.45... -2.69... 1.14... -0.06... -9.37... -6.92... 0.11... 1.78... ...
    -13.1... -13.0... 12.3... -3.00... 2.88... -0.15... 0.35... -1.10... 3.59... 0.55... ...
    -10.3... -5.19... 8.16... -0.39... 2.70... 3.76... -4.16... -0.97... 0.54... 0.95... ...
    ... ... ... ... ... ... ... ... ... ... ...
Description

A gzip'ed tar containing UCI and UCI KDD datasets (uci-20070111.tar.gz, 17,952,832 Bytes)

URLs
(No information yet)
Publications
    Data Source
    http://www.ics.uci.edu/~mlearn/MLRepository.html http://kdd.ics.uci.edu/
    Measurement Details
    Usage Scenario
