View uci-20070111 lung-cancer (public)
- Summary
(No information yet)
- License
- unknown (from Weka repository)
- Dependencies
- Tags
- arff slurped Weka
- Attribute Types
- Integer,Floating Point
- Download
- # Instances: 32 / # Attributes: 57
- Formats: HDF5 (21.1 KB), XML, CSV, ARFF, LibSVM, Matlab, Octave
- Original Data Format
- arff
- Name
- lung-cancer
- Version mldata
- 0
- Comment
Title: Lung Cancer Data
Source Information:
- Data was published in: Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small Number of Samples and Design Method of Classifier on the Plane", Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991.
- Donor: Stefan Aeberhard, stefan@coral.cs.jcu.edu.au
- Date: May, 1992
Past Usage:
- Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small Number of Samples and Design Method of Classifier on the Plane", Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991.
- Aeberhard, S., Coomans, D., De Vel, O. "Comparisons of Classification Methods in High Dimensional Settings", submitted to Technometrics.
- Aeberhard, S., Coomans, D., De Vel, O. "The Dangers of Bias in High Dimensional Settings", submitted to Pattern Recognition.
Relevant Information:
- This data was used by Hong and Yang to illustrate the power of the optimal discriminant plane even in ill-posed settings. Applying the KNN method in the resulting plane gave 77% accuracy. However, these results are strongly biased (see Aeberhard's second reference above, or email stefan@coral.cs.jcu.edu.au). Results obtained by Aeberhard et al. are: RDA 62.5%, KNN 53.1%, Opt. Disc. Plane 59.4%.
- The data describes 3 types of pathological lung cancers. The authors give no information on the individual variables nor on where the data was originally used.
- In the original data 4 values for the fifth attribute were -1. These values have been changed to ? (unknown). (*)
- In the original data 1 value for the 39th attribute was 4. This value has been changed to ? (unknown). (*)
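With only 32 instances, each reported accuracy corresponds to a whole number of correct predictions. The following is a minimal sketch of that arithmetic, assuming the percentages from Aeberhard et al. were computed over all 32 instances (the evaluation protocol is not stated on this page; `implied_correct` is an illustrative name):

```python
N_INSTANCES = 32  # from "Number of Instances: 32"

# Accuracies reported by Aeberhard et al., in percent.
reported = {"RDA": 62.5, "KNN": 53.1, "Opt. Disc. Plane": 59.4}

def implied_correct(accuracy_percent, n=N_INSTANCES):
    """Return the whole number of correct predictions closest to the reported accuracy."""
    return round(accuracy_percent / 100 * n)

counts = {name: implied_correct(acc) for name, acc in reported.items()}
for name, k in counts.items():
    print(f"{name}: {k}/{N_INSTANCES} = {100 * k / N_INSTANCES:.1f}%")
```

Under that assumption the three figures correspond to 20, 17, and 19 correct predictions out of 32, so a single instance shifts the accuracy by about 3.1 percentage points.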
Number of Instances: 32
Number of Attributes: 57 (1 class attribute, 56 predictive)
Attribute Information:
- Attribute 1 is the class label.
- All predictive attributes are nominal, taking on integer values 0-3
Missing Attribute Values: Attributes 5 and 39 (*)
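The recoding marked (*) can be sketched as follows. This is an illustration only: it assumes each instance is a list of 57 integers with the class label first and attributes numbered from 1 as in this description. The function `recode_missing` and the sample row are hypothetical, and `None` stands in for `?`.

```python
# Sentinel values that the description says were replaced by "?" (unknown),
# keyed by 1-based attribute number (attribute 1 is the class label).
SENTINELS = {5: -1, 39: 4}

def recode_missing(row):
    """Return a copy of `row` with the documented sentinel values set to None."""
    out = list(row)
    for attr, sentinel in SENTINELS.items():
        if out[attr - 1] == sentinel:
            out[attr - 1] = None
    return out

# Hypothetical 57-value instance, not taken from the dataset.
example = [1] + [0] * 56
example[4] = -1   # attribute 5 holds the sentinel -1
example[38] = 4   # attribute 39 holds the sentinel 4
cleaned = recode_missing(example)
```

The original row is left unchanged, so the raw and recoded versions can be compared side by side.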
Class Distribution:
- 3 classes: class 1, 9 observations; class 2, 13 observations; class 3, 10 observations
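For context, the class counts imply a majority-class baseline against which the reported accuracies can be read. A small sketch using only the counts listed here (variable names are illustrative):

```python
# Class label -> number of observations, as listed under Class Distribution.
class_counts = {1: 9, 2: 13, 3: 10}

total = sum(class_counts.values())
majority_class = max(class_counts, key=class_counts.get)
baseline = class_counts[majority_class] / total

print(f"{total} instances, majority class {majority_class}, "
      f"baseline accuracy {baseline:.1%}")
```

Always predicting class 2 would be correct on 13 of 32 instances, about 40.6%.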
Information about the dataset:
- CLASSTYPE: nominal
- CLASSINDEX: first
- Names
- class,attribute2,attribute3,attribute4,attribute5,attribute6,attribute7,attribute8,attribute9,attribute10,
- Types
- nominal:1,2,3
- nominal:0,1
- nominal:1,2,3
- nominal:0,1,2,3
- nominal:0,1,2
- nominal:0,1
- nominal:1,2,3
- nominal:1,2,3
- nominal:1,2,3
- nominal:1,2,3
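Since the original data format is ARFF, the names and types above map onto `@attribute` declarations. The sketch below covers only the ten attributes shown on this page; the relation name is taken from the Name field, `arff_header` is a hypothetical helper, and the header of the distributed file may differ.

```python
# Names and nominal domains exactly as listed on this page (first 10 of 57).
names = ["class"] + [f"attribute{i}" for i in range(2, 11)]
domains = [
    "1,2,3", "0,1", "1,2,3", "0,1,2,3", "0,1,2",
    "0,1", "1,2,3", "1,2,3", "1,2,3", "1,2,3",
]

def arff_header(relation, names, domains):
    """Build an ARFF header with one nominal @attribute line per column."""
    lines = [f"@relation {relation}", ""]
    for name, domain in zip(names, domains):
        lines.append(f"@attribute {name} {{{domain}}}")
    lines += ["", "@data"]
    return "\n".join(lines)

header = arff_header("lung-cancer", names, domains)
print(header)
```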
- Data (first 10 data points)
class  attr...  attr...  attr...  attr...  attr...  attr...  attr...  attr...  attr...  ...
1      0        3        0        nan      0        2        2        2        1        ...
1      0        3        3        1.0      0        3        1        3        1        ...
1      0        3        3        2.0      0        3        3        3        1        ...
1      0        2        3        2.0      1        3        3        3        1        ...
1      0        3        2        1.0      1        3        3        3        2        ...
1      0        3        3        2.0      0        3        3        3        1        ...
1      0        3        2        1.0      0        3        3        3        1        ...
1      0        2        2        1.0      0        3        1        3        3        ...
1      0        3        1        1.0      0        3        1        3        1        ...
2      0        2        3        2.0      0        2        2        2        1        ...
...    ...      ...      ...      ...      ...      ...      ...      ...      ...      ...
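The preview rows can be checked against the nominal domains listed under Types. A minimal sketch, with the ten rows and ten domains copied from this page and `nan` treated as a missing value (`violations` is an illustrative helper, not part of any library):

```python
import math

nan = float("nan")

# Allowed values for the first 10 columns, from the Types list.
domains = [
    {1, 2, 3}, {0, 1}, {1, 2, 3}, {0, 1, 2, 3}, {0, 1, 2},
    {0, 1}, {1, 2, 3}, {1, 2, 3}, {1, 2, 3}, {1, 2, 3},
]

# First 10 data points, first 10 columns, from the preview.
rows = [
    [1, 0, 3, 0, nan, 0, 2, 2, 2, 1],
    [1, 0, 3, 3, 1.0, 0, 3, 1, 3, 1],
    [1, 0, 3, 3, 2.0, 0, 3, 3, 3, 1],
    [1, 0, 2, 3, 2.0, 1, 3, 3, 3, 1],
    [1, 0, 3, 2, 1.0, 1, 3, 3, 3, 2],
    [1, 0, 3, 3, 2.0, 0, 3, 3, 3, 1],
    [1, 0, 3, 2, 1.0, 0, 3, 3, 3, 1],
    [1, 0, 2, 2, 1.0, 0, 3, 1, 3, 3],
    [1, 0, 3, 1, 1.0, 0, 3, 1, 3, 1],
    [2, 0, 2, 3, 2.0, 0, 2, 2, 2, 1],
]

def is_missing(value):
    """True for the float NaN used to mark unknown values."""
    return isinstance(value, float) and math.isnan(value)

def violations(rows, domains):
    """Return (row, column, value) for every non-missing value outside its domain."""
    bad = []
    for r, row in enumerate(rows):
        for c, (value, domain) in enumerate(zip(row, domains)):
            if not is_missing(value) and value not in domain:
                bad.append((r, c, value))
    return bad

missing = sum(is_missing(v) for row in rows for v in row)
print("violations:", violations(rows, domains), "missing:", missing)
```

The fifth column appears as floating point in the preview because it contains a missing value, which is consistent with attribute 5 being one of the two attributes with unknowns.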
- Description
A gzipped tar archive containing UCI and UCI KDD datasets (uci-20070111.tar.gz, 17,952,832 bytes)
- URLs
- (No information yet)
- Publications
- Data Source
- http://www.ics.uci.edu/~mlearn/MLRepository.html
- http://kdd.ics.uci.edu/
- Measurement Details
- Usage Scenario
- revision 1
- by mldata on 2010-11-06 09:58
This item was downloaded 3776 times and viewed 2374 times.
Disclaimer
We are acting in good faith to make datasets submitted for the use of the scientific community available to everybody, but if you are a copyright holder and would like us to remove a dataset please inform us and we will do it as soon as possible.
Acknowledgements
This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)
http://www.pascal-network.org/.