View statlib-20050214 chscase_geyser2 (public)

2011-09-14 16:11 by mldata | Version 1


unknown (from Weka repository)
arff slurped Weka
Attribute Types
# Instances: 279 / # Attributes: 2
HDF5 (20.9 KB) XML CSV ARFF LibSVM Matlab Octave
Original Data Format
Version mldata
File README

chscase: A collection of the data sets used in the book "A Casebook for a First Course in Statistics and Data Analysis," by Samprit Chatterjee, Mark S. Handcock and Jeffrey S. Simonoff, John Wiley and Sons, New York, 1995. Submitted by Samprit Chatterjee, Mark Handcock and Jeff Simonoff.

This submission consists of 38 files, plus this README file. Each file represents a data set analyzed in the book. The names of the files correspond to the names used in the book. The data files are written in plain ASCII (character) text. Missing values are represented by "M" in all data files.
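As a minimal sketch of reading one of these plain ASCII files, the Python snippet below splits each line into fields and maps the "M" marker to None. The function name `parse_rows` and the sample text are hypothetical, and whitespace-delimited columns are an assumption; the README only states that the files are plain ASCII and that missing values are written as "M".

```python
def parse_rows(text, missing="M"):
    """Split whitespace-delimited lines into fields; map the missing marker to None."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        rows.append([None if f == missing else f for f in fields])
    return rows


# Hypothetical sample text, not taken from any of the 38 files.
sample = "4.0 71\nM 57\n2.2 M\n"
rows = parse_rows(sample)
```

Fields are kept as strings here so that any numeric conversion can be chosen per column afterwards.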

More information about the data sets and the book can be obtained via gopher at the address

The information is filed under ---> Academic Departments & Research Centers ---> Statistics and Operations Research ---> Publications ---> A Casebook for a First Course in Statistics and Data Analysis ---> Welcome!

It can also be accessed from the World Wide Web (WWW) using a WWW browser (e.g., netscape) starting from the URL address

NOTICE: These datasets may be used freely for scientific, educational and/or non-commercial purposes, provided suitable acknowledgment is given (by citing the Chatterjee, Handcock and Simonoff reference above).

File: geyser2.dat

Note: attribute names were generated automatically since there was no information in the data itself.

Information about the dataset
CLASSTYPE: numeric
CLASSINDEX: none specific

  1. nominal:0.8,1.6,1.7,1.8,1.9,2.0,2.1,2.2,2.3,2.4,2.5,2.6,2.8,2.9,3.3,3.4,3.5,3.7,3.8,3.9,4.0,4.1,4.2,4.3,4.4,4.5,4.6,4.7,4.8,4.9,5.0,5.1,5.3,5.5,Long,Medium,Short
  2. numeric
Data (first 10 data points)
    col_1 col_2
    4.0 71
    2.2 57
    Long 80
    Long 75
    Long 55
    4.4 86
    4.3 77
    2.0 56
    4.8 81
    1.8 50
    ... ...
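Because col_1 mixes numeric strings with the labels Long, Medium and Short, it is declared nominal. One way to work with it, sketched below in Python, is to separate the entries that parse as numbers from those that are labels. The helper name `split_value` is hypothetical, and treating the numeric strings as floats is an assumption, since the page gives no description of what either column measures.

```python
def split_value(value):
    """Return (number, label): a float for numeric entries, the raw label otherwise."""
    try:
        return float(value), None
    except ValueError:
        return None, value


# The first ten data points listed above (col_1, col_2).
records = [("4.0", 71), ("2.2", 57), ("Long", 80), ("Long", 75), ("Long", 55),
           ("4.4", 86), ("4.3", 77), ("2.0", 56), ("4.8", 81), ("1.8", 50)]

parsed = [split_value(c1) + (c2,) for c1, c2 in records]

# Rows whose col_1 is a number, and rows whose col_1 is a label.
numeric = [(num, c2) for num, label, c2 in parsed if num is not None]
labelled = [(label, c2) for num, label, c2 in parsed if num is None]
```

Keeping the two groups apart avoids silently coercing the labels to numbers or discarding them.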

A gzip'ed tar containing StatLib datasets (statlib-20050214.tar.gz, 12,785,582 Bytes)

