View statlib-20050214 pbc (public)

2010-11-06 10:00 by mldata | Version 1
Summary

(No information yet)

License
unknown (from Weka repository)
Dependencies
Tags
arff slurped Weka
Attribute Types
Integer,Floating Point,String
Download
# Instances: 418 / # Attributes: 20
HDF5 (170.7 KB) XML CSV ARFF LibSVM Matlab Octave

Original Data Format
arff
Name
pbc
Version mldata
0
Comment


Primary Biliary Cirrhosis

The data set found in appendix D of Fleming and Harrington, Counting Processes and Survival Analysis, Wiley, 1991. The only differences are:

    age is in days
    status is coded as 0=censored, 1=censored due to liver tx, 2=death
    the sex and stage variables are not missing for obs 313-418
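
The status coding and the age unit described above can be made easier to read with a few lines of Python. This is a minimal sketch and not part of the original dataset: the names `STATUS_LABELS`, `decode_status`, and `age_in_years` are hypothetical, and the 365.25-day year is an assumption.

```python
# Status codes as given in the dataset comment.
STATUS_LABELS = {
    0: "censored",
    1: "censored due to liver tx",
    2: "death",
}

DAYS_PER_YEAR = 365.25  # assumption: average length of a year in days


def decode_status(code):
    """Return the label for a numeric status code (0, 1, or 2)."""
    return STATUS_LABELS[int(code)]


def age_in_years(age_days):
    """Convert an age recorded in days to fractional years."""
    return age_days / DAYS_PER_YEAR


# First previewed data point: status 2, age 21464 days.
print(decode_status(2), round(age_in_years(21464), 1))
```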

Quoting from F&H.

"The following pages contain the data from the Mayo Clinic trial in primary biliary cirrhosis (PBC) of the liver conducted between 1974 and 1984. A description of the clinical background for the trial and the covariates recorded here is in Chapter 0, especially Section 0.2. A more extended discussion can be found in Dickson, et al., Hepatology 10:1-7 (1989) and in Markus, et al., N Eng J of Med 320:1709-13 (1989).

"A total of 424 PBC patients, referred to Mayo Clinic during that ten-year interval, met eligibility criteria for the randomized placebo controlled trial of the drug D-penicillamine. The first 312 cases in the data set participated in the randomized trial and contain largely complete data. The additional 112 cases did not participate in the clinical trial, but consented to have basic measurements recorded and to be followed for survival. Six of those cases were lost to follow-up shortly after diagnosis, so the data here are on an additional 106 cases as well as the 312 randomized participants. Missing data items are denoted by `.'."
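
The quote gives both a missing-value marker and the cohort arithmetic behind the 418 instances. The sketch below checks that arithmetic and shows one way to read fields with missing items. It assumes a whitespace-delimited, all-numeric record layout, which this page does not confirm; `parse_field` and `parse_record` are hypothetical helper names, and the sample record is made up for illustration. The `?` marker is included because ARFF files use it for missing values.

```python
# "." is the marker named in the quote; "?" is the ARFF missing-value marker.
MISSING_MARKERS = {".", "?"}


def parse_field(token):
    """Return None for a missing marker, otherwise the value as a float."""
    token = token.strip()
    if token in MISSING_MARKERS:
        return None
    return float(token)


def parse_record(line):
    """Split a whitespace-delimited record and parse every field."""
    return [parse_field(tok) for tok in line.split()]


# Cohort arithmetic from the quote: 424 eligible patients, 6 lost to follow-up.
randomized = 312
non_randomized = 112 - 6
assert randomized + non_randomized == 424 - 6 == 418

# Hypothetical record with one missing item.
print(parse_record("1 400 2 . 21464"))
```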

Variables:

  1. case number
  2. number of days between registration and the earlier of death, transplantation, or study analysis time in July, 1986
  3. status
  4. drug: 1=D-penicillamine, 2=placebo
  5. age in days
  6. sex: 0=male, 1=female
  7. presence of ascites: 0=no, 1=yes
  8. presence of hepatomegaly: 0=no, 1=yes
  9. presence of spiders: 0=no, 1=yes
  10. presence of edema: 0=no edema and no diuretic therapy for edema; .5=edema present without diuretics, or edema resolved by diuretics; 1=edema despite diuretic therapy
  11. serum bilirubin in mg/dl
  12. serum cholesterol in mg/dl
  13. albumin in gm/dl
  14. urine copper in ug/day
  15. alkaline phosphatase in U/liter
  16. SGOT in U/ml
  17. triglycerides in mg/dl
  18. platelets per cubic ml / 1000
  19. prothrombin time in seconds
  20. histologic stage of disease
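
The numeric codes for drug, sex, and edema given in the variable description correspond to the nominal labels shown under Types further down this page. A minimal sketch of that mapping follows; the table names and the `decode` helper are hypothetical, and passing missing values through as `None` is an assumption.

```python
# Code tables taken from the variable description; the edema labels are
# spelled as in the nominal type listed on this page.
DRUG = {1: "D-penicillamine", 2: "placebo"}
SEX = {0: "male", 1: "female"}
EDEMA = {
    0.0: "no_edema_and_no_diuretic_therapy_for_edema",
    0.5: "edema_present_without_diuretics_or_edema_resolved_by_diuretics",
    1.0: "edema_despite_diuretic_therapy",
}


def decode(table, value):
    """Map a numeric code to its label; None (missing) passes through."""
    if value is None:
        return None
    # float() accepts ints, floats, and numeric strings; int and float keys
    # that compare equal (1 == 1.0) hit the same dictionary entry.
    return table[float(value)]


print(decode(DRUG, 1), decode(SEX, 1), decode(EDEMA, 0.5))
```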

Information about the dataset
CLASSTYPE: numeric
CLASSINDEX: 3
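
The page does not say whether CLASSINDEX counts from 0 or from 1. The sketch below assumes 1-based counting, which selects `status`, a numeric attribute that agrees with CLASSTYPE: numeric. Counting from 0 would select `drug`, which is nominal. `class_attribute` is a hypothetical helper, and only the ten attribute names listed on this page are used.

```python
# The ten attribute names shown on this page (the listing is cut off after these).
NAMES = [
    "case_number", "number_of_days", "status", "drug", "age", "sex",
    "presence_of_asictes", "presence_of_hepatomegaly",
    "presence_of_spiders", "presence_of_edema",
]

CLASS_INDEX = 3  # as given in the dataset information


def class_attribute(names, class_index, one_based=True):
    """Return the attribute name selected by class_index."""
    return names[class_index - 1] if one_based else names[class_index]


print(class_attribute(NAMES, CLASS_INDEX))
```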

Names
case_number,number_of_days,status,drug,age,sex,presence_of_asictes,presence_of_hepatomegaly,presence_of_spiders,presence_of_edema,
Types
  1. numeric
  2. numeric
  3. numeric
  4. nominal:D-penicillamine,placebo
  5. numeric
  6. nominal:female,male
  7. nominal:no,yes
  8. nominal:no,yes
  9. nominal:no,yes
  10. nominal:edema_despite_diuretic_therapy,edema_present_without_diuretics_or_edema_resolved_by_diuretics,no_edema_and_no_diuretic_therapy_for_edema
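
Each type entry above is either a bare `numeric` or `nominal:` followed by a comma-separated list of categories. A minimal sketch of splitting such an entry is shown here; `parse_type` is a hypothetical helper that handles only the two forms appearing on this page.

```python
def parse_type(spec):
    """Split a type string into (kind, categories)."""
    kind, _, values = spec.partition(":")
    categories = values.split(",") if values else []
    return kind.strip(), categories


print(parse_type("nominal:D-penicillamine,placebo"))
print(parse_type("numeric"))
```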
Data (first 10 data points)
    case... numb... status drug age sex pres... pres... pres... pres... ...
    1 400 2 D-pe... 21464 female yes yes yes edem... ...
    2 4500 0 D-pe... 20617 female no yes yes no_e... ...
    3 1012 2 D-pe... 25594 male no no no edem... ...
    4 1925 2 D-pe... 19994 female no yes yes edem... ...
    5 1504 1 plac... 13918 female no yes yes no_e... ...
    6 2503 2 plac... 24201 female no yes no no_e... ...
    7 1832 0 plac... 20284 female no yes no no_e... ...
    8 2466 2 plac... 19379 female no no no no_e... ...
    9 2400 2 D-pe... 15526 female no no yes no_e... ...
    10 51 2 plac... 25772 female yes no yes edem... ...
    ... ... ... ... ... ... ... ... ... ... ...
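
For survival analysis, each data point is usually reduced to a follow-up time and an event flag. The sketch below does this for the ten previewed rows, using only the `number_of_days` and `status` columns, which are not truncated in the preview. Counting only status 2 (death) as the event follows the status coding in the comment, where codes 0 and 1 are both described as censored; `to_survival_pairs` is a hypothetical helper.

```python
# (number_of_days, status) for the ten previewed data points.
ROWS = [
    (400, 2), (4500, 0), (1012, 2), (1925, 2), (1504, 1),
    (2503, 2), (1832, 0), (2466, 2), (2400, 2), (51, 2),
]


def to_survival_pairs(rows, death_code=2):
    """Return (time_in_days, event) pairs, with event True only for death."""
    return [(days, status == death_code) for days, status in rows]


pairs = to_survival_pairs(ROWS)
deaths = sum(event for _, event in pairs)
print(deaths, len(pairs) - deaths)  # number of deaths, number censored
```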
Description

A gzip'ed tar containing StatLib datasets (statlib-20050214.tar.gz, 12,785,582 Bytes)

URLs
(No information yet)
Publications
    Data Source
    http://lib.stat.cmu.edu/datasets/
    Measurement Details
    Usage Scenario


    Acknowledgements

    This project is supported by PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)
    http://www.pascal-network.org/.