View sector_scale (public)
- Summary
(No information yet)
- License
- unknown (from LibSVMTools repository)
- Dependencies
- Tags
- libsvm LibSVMTools slurped
- Attribute Types
- Download
# Instances: 19238 / # Attributes: 55198
Formats: HDF5 (18.1 MB), XML, CSV, ARFF, LibSVM, Matlab, Octave. Files are converted on demand, and the process can take up to a minute.
- Original Data Format
- libsvm
- Name
- sector_scale
- Version mldata
- 0
- Comment
LibSVM
- Names
- Data (first 10 data points)
2 0.00... 0.00... 3e-05 0.01... 0.00... 0.00... 0.00... 0.00... 0.00... ...
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
5 0.00... 0.00... 4e-05 0.0 0.00... 0.004 0.00... 0.01... 0.01... ...
6 0.00... 0.00... 4e-05 0.00... 0.00... 0.00... 0.00... 0.0025 0.00... ...
8 0.00... 0.00... 3e-05 0.0 0.00... 0.00... 0.00... 0.00... 0.00... ...
9 0.00... 0.00... 0.00... 0.03... 0.00... 0.01... 0.01... 0.01... 0.00... ...
11 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ...
12 0.0005 0.00... 3e-05 0.01... 0.00... 0.00... 0.00... 0.00... 0.0 ...
14 0.00... 0.0009 8e-05 0.0 0.00... 0.00... 0.00... 0.0 0.0 ...
15 0.00... 0.00... 5e-05 0.01... 0.00... 0.00... 0.0 0.00... 0.00... ...
... ... ... ... ... ... ... ... ... ... ...
- Description
Preprocessing:
The scaled data was used in our KDD 08 paper. For unknown reasons, we can now only generate something close to it. The sources are from this page. We select train-0.tc and test-0.tc from ecoc-svm-data.tar.gz. A 2/1 training/testing split gives the training and testing sets below. They are in the original format rather than the libsvm format: in each row, the 2nd value gives the class label, and the subsequent numbers give pairs of feature IDs and values. We then apply a kind of tf-idf transformation, ln(1+tf)*log_2(#docs/#coll_freq_of_term), and normalize each instance to unit length.
[JR01b,SSK08a]
# of classes: 105
# of data: 6,412 / 3,207 (testing)
# of features: 55,197 / 55,197 (testing)
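The preprocessing described above can be sketched roughly as follows. This is only an illustration, not the code used to produce the dataset: the function names `parse_row` and `tfidf_unit_length` are hypothetical, rows are assumed to be whitespace-separated, and #coll_freq_of_term is read here as the total count of a term over all documents, which the description does not spell out.

```python
import math
from collections import defaultdict


def parse_row(line):
    """Parse one row of the original (non-libsvm) format.

    Per the description, the 2nd value is the class label and the remaining
    values are (feature ID, value) pairs. Whitespace separation is an
    assumption; the 1st value is not described and is ignored here.
    """
    tokens = line.split()
    label = tokens[1]
    rest = tokens[2:]
    features = {int(fid): float(val) for fid, val in zip(rest[0::2], rest[1::2])}
    return label, features


def tfidf_unit_length(docs):
    """Apply ln(1+tf) * log_2(#docs / #coll_freq_of_term), then scale each
    instance to unit Euclidean length.

    docs is a list of {feature_id: tf} dicts. Collection frequency is taken
    as the sum of tf over all documents (an assumption).
    """
    n_docs = len(docs)
    coll_freq = defaultdict(float)
    for doc in docs:
        for fid, tf in doc.items():
            coll_freq[fid] += tf

    scaled = []
    for doc in docs:
        weighted = {
            fid: math.log(1.0 + tf) * math.log2(n_docs / coll_freq[fid])
            for fid, tf in doc.items()
        }
        norm = math.sqrt(sum(w * w for w in weighted.values()))
        if norm > 0.0:
            weighted = {fid: w / norm for fid, w in weighted.items()}
        scaled.append(weighted)
    return scaled


# Toy rows in the assumed layout: <ignored> <label> <id> <value> ...
rows = ["r0 7 1 1", "r1 7 2 1", "r2 9 1 1 3 1", "r3 9 4 1"]
parsed = [parse_row(r) for r in rows]
labels = [label for label, _ in parsed]
vectors = tfidf_unit_length([features for _, features in parsed])
```

Each resulting vector is a sparse dict of feature ID to weight, which maps directly onto one line of the libsvm format once the label is prepended.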
- URLs
- (No information yet)
- Publications
- Data Source
- [AM98a]
- Measurement Details
- Usage Scenario
- revision 1
- by mldata on 2010-11-01 11:46
This item was downloaded 9297 times and viewed 166 times.
Disclaimer
We are acting in good faith to make datasets submitted for the use of the scientific community available to everybody, but if you are a copyright holder and would like us to remove a dataset please inform us and we will do it as soon as possible.
Acknowledgements
This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)
http://www.pascal-network.org/.