View datasets-numeric lowbwt (public)

2010-11-06 09:57 by mldata | Version 1
License
unknown (from Weka repository)
Dependencies
Tags
arff slurped Weka
Attribute Types
Integer
Download
# Instances: 189 / # Attributes: 10
HDF5 (21.5 KB) XML CSV ARFF LibSVM Matlab Octave
Original Data Format
arff
Name
'lowbwt'
Version mldata
0
Comment

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Identification code deleted.

As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems. Singapore: Springer-Verlag.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

NAME: LOW BIRTH WEIGHT DATA
KEYWORDS: Logistic Regression
SIZE: 189 observations, 11 variables

NOTE: These data come from Appendix 1 of Hosmer and Lemeshow (1989). These data are copyrighted and must be acknowledged and used accordingly.

DESCRIPTIVE ABSTRACT: The goal of this study was to identify risk factors associated with giving birth to a low birth weight baby (weighing less than 2500 grams). Data were collected on 189 women, 59 of whom had low birth weight babies and 130 of whom had normal birth weight babies. Four variables that were thought to be of importance were age, weight of the subject at her last menstrual period, race, and the number of physician visits during the first trimester of pregnancy.

SOURCE: Data were collected at Baystate Medical Center, Springfield, Massachusetts, during 1986.

NOTE: This data set consists of the complete data. A paired data set created from this low birth weight data may be found in plowbwt.dat and a 3 to 1 matched data set created from the low birth weight data may be found in mlowbwt.dat.

Table: Code Sheet for the Variables in the Low Birth Weight Data Set.

Columns  Abbreviation  Variable

2-4      ID            Identification Code
10       LOW           Low Birth Weight (0 = Birth Weight >= 2500g, 1 = Birth Weight < 2500g)
17-18    AGE           Age of the Mother in Years
23-25    LWT           Weight in Pounds at the Last Menstrual Period
32       RACE          Race (1 = White, 2 = Black, 3 = Other)
40       SMOKE         Smoking Status During Pregnancy (1 = Yes, 0 = No)
48       PTL           History of Premature Labor (0 = None, 1 = One, etc.)
55       HT            History of Hypertension (1 = Yes, 0 = No)
61       UI            Presence of Uterine Irritability (1 = Yes, 0 = No)
67       FTV           Number of Physician Visits During the First Trimester (0 = None, 1 = One, 2 = Two, etc.)
73-76    BWT           Birth Weight in Grams
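
The column positions in the code sheet describe a fixed-width text layout. The following is a minimal Python sketch of how such a record could be sliced into fields, assuming the positions are 1-indexed and inclusive. The helper names (`parse_record`, `format_record`) are hypothetical, and the sample line is built by the code itself from the first row of the data preview, with a placeholder ID of 0 because the real identification codes were deleted from this version of the data.

```python
# Column ranges copied from the code sheet (assumed 1-indexed, inclusive).
COLUMNS = {
    "ID": (2, 4),
    "LOW": (10, 10),
    "AGE": (17, 18),
    "LWT": (23, 25),
    "RACE": (32, 32),
    "SMOKE": (40, 40),
    "PTL": (48, 48),
    "HT": (55, 55),
    "UI": (61, 61),
    "FTV": (67, 67),
    "BWT": (73, 76),
}


def parse_record(line):
    """Slice one fixed-width line into a dict of integer fields."""
    record = {}
    for name, (start, end) in COLUMNS.items():
        text = line[start - 1:end].strip()
        record[name] = int(text) if text else None
    return record


def format_record(record):
    """Inverse of parse_record; used here only to build a sample line."""
    chars = [" "] * 76
    for name, (start, end) in COLUMNS.items():
        width = end - start + 1
        chars[start - 1:end] = str(record[name]).rjust(width)
    return "".join(chars)


# First row of the data preview; ID = 0 is a placeholder, not a real code.
sample = {"ID": 0, "LOW": 0, "AGE": 19, "LWT": 182, "RACE": 2, "SMOKE": 0,
          "PTL": 0, "HT": 0, "UI": 1, "FTV": 0, "BWT": 2523}
line = format_record(sample)
parsed = parse_record(line)
print(parsed == sample)
```

The round trip only demonstrates the slicing arithmetic; whether the original .dat file pads fields on the left or right is not stated in the code sheet.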

PEDAGOGICAL NOTES: These data have been used as an example of fitting a multiple logistic regression model.

STORY BEHIND THE DATA: Low birth weight is an outcome that has been of concern to physicians for years, because infant mortality rates and birth defect rates are very high for low birth weight babies. A woman's behavior during pregnancy (including diet, smoking habits, and receiving prenatal care) can greatly alter the chances of carrying the baby to term and, consequently, of delivering a baby of normal birth weight. The variables identified in the code sheet given in the table have been shown to be associated with low birth weight in the obstetrical literature. The goal of the current study was to ascertain whether these variables were important in the population being served by the medical center where the data were collected.

References:

  1. Hosmer, D. W. and Lemeshow, S. (1989). Applied Logistic Regression. New York: Wiley.
Names
LOW,AGE,LWT,RACE,SMOKE,PTL,HT,UI,FTV,class
Types
  1. nominal:0,1
  2. numeric
  3. numeric
  4. nominal:2,3,1
  5. nominal:0,1
  6. nominal:0,1,2,3
  7. nominal:0,1
  8. nominal:1,0
  9. nominal:0,3,1,2,4,6
  10. numeric
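
Since the original data format is ARFF, the Names and Types listings above can be combined into an ARFF-style header. The sketch below is a reconstruction under assumptions, not a copy of the actual file: the function names are hypothetical, and the real header may differ in spacing, quoting, or comments.

```python
# Attribute names and type specs copied from the Names and Types listings.
NAMES = ["LOW", "AGE", "LWT", "RACE", "SMOKE", "PTL", "HT", "UI", "FTV", "class"]
TYPES = [
    "nominal:0,1",
    "numeric",
    "numeric",
    "nominal:2,3,1",
    "nominal:0,1",
    "nominal:0,1,2,3",
    "nominal:0,1",
    "nominal:1,0",
    "nominal:0,3,1,2,4,6",
    "numeric",
]


def arff_type(type_spec):
    """Turn 'nominal:a,b,c' into the ARFF form '{a,b,c}'; pass others through."""
    prefix = "nominal:"
    if type_spec.startswith(prefix):
        return "{" + type_spec[len(prefix):] + "}"
    return type_spec


def arff_header(relation, names, types):
    """Build the header lines of an ARFF file, ending with the @data marker."""
    lines = ["@relation " + relation, ""]
    for name, type_spec in zip(names, types):
        lines.append("@attribute {} {}".format(name, arff_type(type_spec)))
    lines += ["", "@data"]
    return lines


header = arff_header("'lowbwt'", NAMES, TYPES)
print("\n".join(header))
```

The nominal value order (for example 2,3,1 for RACE) is kept exactly as listed, since ARFF readers may treat the declared order as significant.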
Data (first 10 data points)
    LOW AGE LWT RACE SMOKE PTL HT UI FTV class
    0 19 182 2 0 0 0 1 0 2523
    0 33 155 3 0 0 0 0 3 2551
    0 20 105 1 1 0 0 0 1 2557
    0 21 108 1 1 0 0 1 2 2594
    0 18 107 1 1 0 0 1 0 2600
    0 21 124 3 0 0 0 0 0 2622
    0 22 118 1 0 0 0 0 1 2637
    0 17 103 3 0 0 0 0 1 2637
    0 29 123 1 1 0 0 0 1 2663
    0 26 113 1 1 0 0 0 0 2665
    ... ... ... ... ... ... ... ... ... ...
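
The code sheet defines LOW as 1 when birth weight is below 2500 g and 0 otherwise. As a quick sanity check, the Python sketch below parses the ten preview rows and tests that definition, assuming the `class` column holds the birth weight in grams (BWT in the code sheet). The function names are hypothetical.

```python
# The ten preview rows, copied verbatim from the listing above.
PREVIEW = """\
LOW AGE LWT RACE SMOKE PTL HT UI FTV class
0 19 182 2 0 0 0 1 0 2523
0 33 155 3 0 0 0 0 3 2551
0 20 105 1 1 0 0 0 1 2557
0 21 108 1 1 0 0 1 2 2594
0 18 107 1 1 0 0 1 0 2600
0 21 124 3 0 0 0 0 0 2622
0 22 118 1 0 0 0 0 1 2637
0 17 103 3 0 0 0 0 1 2637
0 29 123 1 1 0 0 0 1 2663
0 26 113 1 1 0 0 0 0 2665
"""

LOW_THRESHOLD_GRAMS = 2500  # from the code sheet: LOW = 1 if weight < 2500g


def load_rows(text):
    """Parse whitespace-separated rows into dicts keyed by the header names."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, map(int, line.split()))) for line in lines[1:]]


def inconsistent_rows(rows):
    """Return rows whose LOW flag disagrees with the birth weight threshold."""
    return [r for r in rows
            if r["LOW"] != int(r["class"] < LOW_THRESHOLD_GRAMS)]


rows = load_rows(PREVIEW)
mismatches = inconsistent_rows(rows)
print(len(rows), "rows,", len(mismatches), "inconsistent")
```

All ten preview rows have LOW = 0 and a weight of at least 2500 g, so the check says nothing about the LOW = 1 cases; it would need to be run on the full 189 instances.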
Description

A jarfile containing 37 regression problems, obtained from various sources (datasets-numeric.jar, 169,344 Bytes).

