View uci-20070111 soybean (public)
- Summary
(No information yet)
- License
- unknown (from Weka repository)
- Dependencies
- Tags
- arff slurped Weka
- Attribute Types
- String
- Download
-
# Instances: 683 / # Attributes: 36
Formats: HDF5 (1.0 MB), XML, CSV, ARFF, LibSVM, Matlab, Octave. Files are converted on demand, and the conversion can take up to a minute.
- Original Data Format
- arff
- Name
- soybean
- Version mldata
- 0
- Comment
Notes: The large soybean database (soybean-large-data.arff) and its corresponding test database (soybean-large-test.arff) were combined into a single file (soybean-large.arff).
Title: Large Soybean Database
Sources:
(a) R.S. Michalski and R.L. Chilausky, "Learning by Being Told and Learning from Examples: An Experimental Comparison of the Two Methods of Knowledge Acquisition in the Context of Developing an Expert System for Soybean Disease Diagnosis", International Journal of Policy Analysis and Information Systems, Vol. 4, No. 2, 1980.
(b) Donor: Ming Tan & Jeff Schlimmer (Jeff.Schlimmer%cs.cmu.edu)
(c) Date: 11 July 1988
Past Usage:
- See above.
- Tan, M., & Eshelman, L. (1988). Using weighted networks to represent classification knowledge in noisy domains. Proceedings of the Fifth International Conference on Machine Learning (pp. 121-134). Ann Arbor, Michigan: Morgan Kaufmann. -- IWN recorded a 97.1% classification accuracy -- 290 training and 340 test instances
- Fisher, D.H. & Schlimmer, J.C. (1988). Concept Simplification and Predictive Accuracy. Proceedings of the Fifth International Conference on Machine Learning (pp. 22-28). Ann Arbor, Michigan: Morgan Kaufmann. -- Notes why this database is highly predictable
Relevant Information Paragraph: There are 19 classes, only the first 15 of which have been used in prior work. The folklore seems to be that the last four classes are unjustified by the data since they have so few examples. There are 35 categorical attributes, some nominal and some ordered. The value "dna" means "does not apply". The values for attributes are encoded numerically, with the first value encoded as "0", the second as "1", and so forth. An unknown value is encoded as "?".
Number of Instances: 683
Number of Attributes: 35 (all have been nominalized), plus the class attribute, which gives the 36 attributes counted in the download summary
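The encoding rule described above (first listed value is 0, the next is 1, and "?" marks an unknown) can be illustrated with a minimal Python sketch. The helper names `encode` and `decode` are hypothetical and are not part of the dataset or of any tool mentioned on this page; the `precip` domain is taken from the attribute list on this page, and representing an unknown as `None` is an assumption made for the example.

```python
from typing import List, Optional

# Domain of the "precip" attribute, in the order listed on this page.
PRECIP_VALUES = ["lt-norm", "norm", "gt-norm"]


def encode(value: str, domain: List[str]) -> Optional[int]:
    """Return the zero-based position of value in domain, or None for '?'."""
    if value == "?":
        return None  # assumption: unknown values are represented as None
    return domain.index(value)  # raises ValueError for values outside the domain


def decode(code: Optional[int], domain: List[str]) -> str:
    """Inverse of encode: map a position back to its label, None back to '?'."""
    if code is None:
        return "?"
    return domain[code]


encoded = [encode(v, PRECIP_VALUES) for v in ["gt-norm", "norm", "?", "lt-norm"]]
print(encoded)  # [2, 1, None, 0]
```

Keeping unknowns as a separate marker rather than as an extra integer code avoids treating "?" as if it were an ordinary category.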
Attribute Information:
-- 19 Classes: diaporthe-stem-canker, charcoal-rot, rhizoctonia-root-rot, phytophthora-rot, brown-stem-rot, powdery-mildew, downy-mildew, brown-spot, bacterial-blight, bacterial-pustule, purple-seed-stain, anthracnose, phyllosticta-leaf-spot, alternarialeaf-spot, frog-eye-leaf-spot, diaporthe-pod-&-stem-blight, cyst-nematode, 2-4-d-injury, herbicide-injury.
date: april,may,june,july,august,september,october,?.
plant-stand: normal,lt-normal,?.
precip: lt-norm,norm,gt-norm,?.
temp: lt-norm,norm,gt-norm,?.
hail: yes,no,?.
crop-hist: diff-lst-year,same-lst-yr,same-lst-two-yrs, same-lst-sev-yrs,?.
area-damaged: scattered,low-areas,upper-areas,whole-field,?.
severity: minor,pot-severe,severe,?.
seed-tmt: none,fungicide,other,?.
germination: '90-100%','80-89%','lt-80%',?.
plant-growth: norm,abnorm,?.
leaves: norm,abnorm.
leafspots-halo: absent,yellow-halos,no-yellow-halos,?.
leafspots-marg: w-s-marg,no-w-s-marg,dna,?.
leafspot-size: lt-1/8,gt-1/8,dna,?.
leaf-shread: absent,present,?.
leaf-malf: absent,present,?.
leaf-mild: absent,upper-surf,lower-surf,?.
stem: norm,abnorm,?.
lodging: yes,no,?.
stem-cankers: absent,below-soil,above-soil,above-sec-nde,?.
canker-lesion: dna,brown,dk-brown-blk,tan,?.
fruiting-bodies: absent,present,?.
external-decay: absent,firm-and-dry,watery,?.
mycelium: absent,present,?.
int-discolor: none,brown,black,?.
sclerotia: absent,present,?.
fruit-pods: norm,diseased,few-present,dna,?.
fruit-spots: absent,colored,brown-w/blk-specks,distort,dna,?.
seed: norm,abnorm,?.
mold-growth: absent,present,?.
seed-discolor: absent,present,?.
seed-size: norm,lt-norm,?.
shriveling: absent,present,?.
roots: norm,rotted,galls-cysts,?.
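The attribute lines above follow a regular `name: value,value,...,?.` pattern, so they can be turned into a lookup table with a few lines of Python. This is only a sketch: `parse_attribute_line` is a hypothetical helper, the sample lines are copied from the list above, and treating a trailing "?" as "missing values allowed" is an interpretation of the listing rather than something the source states.

```python
from typing import Dict, List, Tuple


def parse_attribute_line(line: str) -> Tuple[str, List[str], bool]:
    """Split 'name: v1,v2,?.' into (name, domain, allows_missing)."""
    name, _, rest = line.partition(":")
    rest = rest.strip()
    if rest.endswith("."):
        rest = rest[:-1]  # drop the terminating period
    # Strip stray spaces and the single quotes used around percentage ranges.
    values = [v.strip().strip("'") for v in rest.split(",")]
    allows_missing = "?" in values
    domain = [v for v in values if v != "?"]
    return name.strip(), domain, allows_missing


SAMPLE = """\
date: april,may,june,july,august,september,october,?.
crop-hist: diff-lst-year,same-lst-yr,same-lst-two-yrs, same-lst-sev-yrs,?.
germination: '90-100%','80-89%','lt-80%',?.
leaves: norm,abnorm.
"""

domains: Dict[str, Tuple[List[str], bool]] = {}
for sample_line in SAMPLE.splitlines():
    attr_name, attr_domain, attr_missing = parse_attribute_line(sample_line)
    domains[attr_name] = (attr_domain, attr_missing)

for attr_name, (attr_domain, attr_missing) in domains.items():
    print(attr_name, len(attr_domain), "missing allowed" if attr_missing else "no missing")
```

In this listing `leaves` is the only attribute without a "?" entry, which the sketch reports as not allowing missing values.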
- Names
- date,plant-stand,precip,temp,hail,crop-hist,area-damaged,severity,seed-tmt,germination,
- Types
- nominal:april,may,june,july,august,september,october
- nominal:normal,lt-normal
- nominal:lt-norm,norm,gt-norm
- nominal:lt-norm,norm,gt-norm
- nominal:yes,no
- nominal:diff-lst-year,same-lst-yr,same-lst-two-yrs,same-lst-sev-yrs
- nominal:scattered,low-areas,upper-areas,whole-field
- nominal:minor,pot-severe,severe
- nominal:none,fungicide,other
- nominal:90-100,80-89,lt-80
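As an illustration of how the Names and Types fields relate to the original ARFF format, the sketch below pairs the ten names shown on this page with their `nominal:` type strings and prints ARFF-style `@attribute` declarations. The function `arff_attribute` is hypothetical, only the attributes visible above are covered, and the output is not claimed to match the header of the original file byte for byte (a real ARFF writer may quote values differently).

```python
# Names and type strings copied from the fields above (first ten attributes only).
NAMES = (
    "date,plant-stand,precip,temp,hail,crop-hist,"
    "area-damaged,severity,seed-tmt,germination"
).split(",")

TYPES = [
    "nominal:april,may,june,july,august,september,october",
    "nominal:normal,lt-normal",
    "nominal:lt-norm,norm,gt-norm",
    "nominal:lt-norm,norm,gt-norm",
    "nominal:yes,no",
    "nominal:diff-lst-year,same-lst-yr,same-lst-two-yrs,same-lst-sev-yrs",
    "nominal:scattered,low-areas,upper-areas,whole-field",
    "nominal:minor,pot-severe,severe",
    "nominal:none,fungicide,other",
    "nominal:90-100,80-89,lt-80",
]


def arff_attribute(name: str, type_spec: str) -> str:
    """Render one 'nominal:v1,v2' type string as an ARFF @attribute line."""
    kind, _, values = type_spec.partition(":")
    if kind != "nominal":
        raise ValueError(f"unsupported type: {kind}")
    return f"@attribute {name} {{{values}}}"


header = ["@relation soybean"] + [
    arff_attribute(n, t) for n, t in zip(NAMES, TYPES)
]
print("\n".join(header))
```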
- Data (first 10 data points)
date     plan...  precip   temp  hail  crop...  area...  seve...  seed...  germ...  ...
octo...  normal   gt-n...  norm  yes   same...  low-...  pot-...  none     90-100   ...
august   normal   gt-n...  norm  yes   same...  scat...  severe   fung...  80-89    ...
july     normal   gt-n...  norm  yes   same...  scat...  severe   fung...  lt-80    ...
july     normal   gt-n...  norm  yes   same...  scat...  severe   none     80-89    ...
octo...  normal   gt-n...  norm  yes   same...  scat...  pot-...  none     lt-80    ...
sept...  normal   gt-n...  norm  yes   same...  scat...  pot-...  none     80-89    ...
sept...  normal   gt-n...  norm  yes   same...  scat...  pot-...  fung...  90-100   ...
august   normal   gt-n...  norm  no    same...  scat...  pot-...  none     lt-80    ...
octo...  normal   gt-n...  norm  yes   same...  scat...  pot-...  fung...  80-89    ...
august   normal   gt-n...  norm  yes   same...  scat...  severe   none     lt-80    ...
...      ...      ...      ...   ...   ...      ...      ...      ...      ...      ...
- Description
A gzip'ed tar containing UCI and UCI KDD datasets (uci-20070111.tar.gz, 17,952,832 Bytes)
- URLs
- (No information yet)
- Publications
- Data Source
- http://www.ics.uci.edu/~mlearn/MLRepository.html
- http://kdd.ics.uci.edu/
- Measurement Details
- Usage Scenario
- revision 1
- by mldata on 2011-09-14 15:57
This item was downloaded 4534 times and viewed 3245 times.
Disclaimer
We are acting in good faith to make datasets submitted for the use of the scientific community available to everybody, but if you are a copyright holder and would like us to remove a dataset please inform us and we will do it as soon as possible.
Acknowledgements
This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)
http://www.pascal-network.org/.