Data
-
DMOZ Web Directory Topics - submitted by jeanbaptiste 6381 views, 11385 downloads, 0 comments
last edited by jeanbaptiste - Mar 29, 2012, 16:47 CET
- Summary: Contains parsed webpages along with their topics, extracted from the DMOZ web directory
- Data Shape: 10630 attributes, 2658 instances
- License: unknown
- Tags: bag-of-words Classification DMOZ libsvm multi-class text web-pages
- Tasks / Methods / Challenges: 0 tasks, 0 methods, 0 challenges
- Download: HDF5 (4.1 MB) XML CSV ARFF LibSVM Matlab Octave
- Files are converted on demand and the process can take up to a minute. Please wait until download begins.
-
Yahoo! Web Directory Topics - submitted by jeanbaptiste 2299 views, 13160 downloads, 0 comments
last edited by jeanbaptiste - Mar 13, 2012, 15:16 CET
- Summary: Contains parsed webpages along with their topics, extracted from the Yahoo! web directory
- Data Shape: 10630 attributes, 2212 instances
- License: unknown
- Tags: bag-of-words Classification multi-class text web-pages Yahoo!
- Tasks / Methods / Challenges: 1 task, 0 methods, 1 challenge
- Download: HDF5 (3.6 MB) XML CSV ARFF LibSVM Matlab Octave
- Files are converted on demand and the process can take up to a minute. Please wait until download begins.
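Both datasets are tagged bag-of-words and offered as LibSVM downloads, so each instance is presumably stored as a sparse row of `index:value` pairs. The following is a minimal sketch of reading that layout, assuming the standard LibSVM text format (`label index:value ...`, one instance per line). The sample text is made up for illustration, and the label is kept as a string because the listing does not say how topics are encoded.

```python
from typing import Dict, List, Tuple


def parse_libsvm_line(line: str) -> Tuple[str, Dict[int, float]]:
    """Parse one 'label index:value ...' line into (label, sparse features)."""
    # Ignore anything after a '#', which some LibSVM-style files use for comments.
    body = line.split("#", 1)[0].strip()
    label, *pairs = body.split()
    features: Dict[int, float] = {}
    for pair in pairs:
        index, value = pair.split(":", 1)
        features[int(index)] = float(value)
    return label, features


def parse_libsvm(text: str) -> Tuple[List[str], List[Dict[int, float]]]:
    """Parse LibSVM-format text into parallel lists of labels and sparse rows."""
    labels: List[str] = []
    rows: List[Dict[int, float]] = []
    for line in text.splitlines():
        # Skip blank lines and comment-only lines.
        if not line.split("#", 1)[0].strip():
            continue
        label, features = parse_libsvm_line(line)
        labels.append(label)
        rows.append(features)
    return labels, rows


# Hypothetical two-instance sample; real files would have up to 10630 attributes.
sample = "3 1:2 7:1 10630:4\n1 2:1 7:3\n"
labels, rows = parse_libsvm(sample)
```

Only non-zero term counts are stored per row, which suits bag-of-words data where most of the 10630 attributes are zero for any given page.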
Disclaimer
We are acting in good faith to make datasets submitted for the use of the scientific community available to everybody. If you are a copyright holder and would like us to remove a dataset, please inform us and we will do so as soon as possible.
Acknowledgements
This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning), http://www.pascal-network.org/.