statlib-20050214 baseball-team (public)

2010-11-06 10:00 by mldata | Version 1
License
unknown (from Weka repository)
Tags
arff slurped Weka
Attribute Types
Integer,String
Download
# Instances: 26 / # Attributes: 9
HDF5 (21.1 KB) XML CSV ARFF LibSVM Matlab Octave
Original Data Format
arff
Name
baseball-team
Version mldata
0
Comment
        1988 Graphics Section Poster Session
      Displaying Analysis of Baseball Salaries

The Statistical Graphics Section of the American Statistical Association is sponsoring a special poster session titled "Why They Make What They Make - An Analysis of Major League Baseball Salaries." The session will be held in August 1988 at the meetings in New Orleans and is intended to allow members to compare techniques for analyzing and displaying data. The session provides a forum for both old and new graphical techniques to describe and summarize the data.

The results of each analysis will be displayed during the poster session. Each participant will be given a space to display his/her results, and other conference attendees will be encouraged to discuss them with the participant. There will not be any formal discussion or comparison of results. Rather, the opportunity is provided for informal comparison among ourselves.

Since this session is treated as an organized contributed-paper session, you must submit a contributed-paper abstract to the ASA office. The abstract must be postmarked not later than February 15, 1988 AND identified as a contribution to the special Statistical Graphics Exposition. The forms can be found in future issues of AMSTAT News. The abstract can be very general. A sample which will suffice is provided below. Variations might include the software, hardware and techniques you plan to use. You have until August to work on the analysis. After the meetings you will be given an opportunity to include a short paper describing your analysis in the 1988 Proceedings of the Statistical Graphics Section.

                  SAMPLE ABSTRACT

         Analysis of Baseball Salary Data *

 This analysis describes and summarizes the relationships
 between 1987 salaries of major league baseball players
 and the player's performance. We use graphical methods
 to show relationships between a player's salary to his
 1986 and career performance statistics and the player's
 team.

 *Contribution to special Statistical Graphics Exposition







                 September 2, 1987

 Description of the 1988 Graphics Section Data Set

The data consist of three files containing data on the regular and leading substitute hitters in 1986, the regular pitchers in 1986, and the team statistics. The salary data were taken from Sports Illustrated, April 20, 1987. The salary of any player not included in that article is listed as an NA. The 1986 and career statistics were taken from The 1987 Baseball Encyclopedia Update published by Collier Books, Macmillan Publishing Company, New York. The team attendance figures were obtained from the Elias Sports Bureau, personal conversation.

The goal is to use graphical methods to attempt to explain differences in the salaries of major league baseball players and to answer the question "Are players paid according to their performance?"


                  The Hitter File

There is one line per hitter. Each data item is separated by a tab character. The variables are hitter's name, number of times at bat in 1986, number of hits in 1986, number of home runs in 1986, number of runs in 1986, number of runs batted in in 1986, number of walks in 1986, number of years in the major leagues, number of times at bat during his career, number of hits during his career, number of home runs during his career, number of runs during his career, number of runs batted in during his career, number of walks during his career, player's league at the end of 1986, player's division at the end of 1986, player's team at the end of 1986, player's position(s) in 1986, number of put outs in 1986, number of assists in 1986, number of errors in 1986, 1987 annual salary on opening day in thousands of dollars, player's league at the beginning of 1987, player's team at the beginning of 1987.

                  The Pitcher File

There is one line per pitcher. Each data item is separated by a tab character. The variables are pitcher's name, player's team at the end of 1986, player's league at the end of 1986, number of wins in 1986, number of losses in 1986, earned run average in 1986, number of games in 1986, number of innings pitched in 1986, number of saves in 1986, number of years in the major leagues, number of wins during his career, number of losses during his career, earned run average during his career, number of games during his career, number of innings pitched during his career, number of saves during his career, 1987 annual salary on opening day in thousands of dollars, player's league at the beginning of 1987, player's team at the beginning of 1987.


                   The Team File

There is one line per team. Each data item is separated by a tab character. The variables are league, division, position in final league standings in 1986, team, number of wins in 1986, number of losses in 1986, attendance for home games in 1986, attendance for away games in 1986, 1987 average salary.
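The team-file layout described above can be read with a few lines of Python. This is only a minimal sketch, not the loader used by StatLib or mldata: the `TeamRecord` class, its field names, and `parse_team_line` are hypothetical names chosen for illustration, integer-valued numeric fields are an assumption, and the attendance figures in the sample line are placeholders rather than real values.

```python
from dataclasses import dataclass


@dataclass
class TeamRecord:
    """One line of the team file. Field names are illustrative, not from the source."""
    league: str
    division: str
    standing: int          # position in final league standings in 1986
    team: str
    wins: int
    losses: int
    home_attendance: int
    away_attendance: int
    average_salary: int    # 1987 average salary


def parse_team_line(line: str) -> TeamRecord:
    """Split one tab-separated line into the nine variables listed above."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    league, division, standing, team, wins, losses, home, away, salary = fields
    return TeamRecord(league, division, int(standing), team,
                      int(wins), int(losses), int(home), int(away), int(salary))


# Placeholder line: the two attendance values (1000000, 2000000) are made up.
sample = "N\tE\t1\tN.Y.\t108\t54\t1000000\t2000000\t526899"
record = parse_team_line(sample)
print(record.team, record.wins, record.losses)
```

Raising on a wrong field count is a deliberate choice here, so that a line split on the wrong delimiter fails loudly instead of producing shifted columns.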

          Coding for some of the Variables

Team Names:
  N.Y.   New York
  Phi.   Philadelphia
  St.L.  St. Louis
  Mon.   Montreal
  Chi.   Chicago
  Pit.   Pittsburgh
  Hou.   Houston
  Cin.   Cincinnati
  S.F.   San Francisco
  S.D.   San Diego
  L.A.   Los Angeles
  Atl.   Atlanta
  Bos.   Boston
  Det.   Detroit
  Tor.   Toronto
  Cle.   Cleveland
  Mil.   Milwaukee
  Bal.   Baltimore
  Cal.   California
  Tex.   Texas
  K.C.   Kansas City
  Oak.   Oakland
  Min.   Minnesota
  Sea.   Seattle

Leagues:
  N  National
  A  American

Division:
  W  West
  E  East
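The codings above translate directly into lookup tables. The following is a small sketch; the dictionary names and the `describe_team` helper are hypothetical and only illustrate how the abbreviations might be expanded.

```python
# Abbreviation tables copied from the coding section above.
TEAM_NAMES = {
    "N.Y.": "New York", "Phi.": "Philadelphia", "St.L.": "St. Louis",
    "Mon.": "Montreal", "Chi.": "Chicago", "Pit.": "Pittsburgh",
    "Hou.": "Houston", "Cin.": "Cincinnati", "S.F.": "San Francisco",
    "S.D.": "San Diego", "L.A.": "Los Angeles", "Atl.": "Atlanta",
    "Bos.": "Boston", "Det.": "Detroit", "Tor.": "Toronto",
    "Cle.": "Cleveland", "Mil.": "Milwaukee", "Bal.": "Baltimore",
    "Cal.": "California", "Tex.": "Texas", "K.C.": "Kansas City",
    "Oak.": "Oakland", "Min.": "Minnesota", "Sea.": "Seattle",
}
LEAGUES = {"N": "National", "A": "American"}
DIVISIONS = {"W": "West", "E": "East"}


def describe_team(league: str, division: str, team: str) -> str:
    """Expand the three codes into a readable label (hypothetical helper)."""
    return f"{TEAM_NAMES[team]} ({LEAGUES[league]} League {DIVISIONS[division]})"


print(describe_team("N", "E", "N.Y."))
```

Note that the table holds 24 abbreviations while the team file has 26 instances, so a team code appears to be unique only in combination with the league code. For that reason the helper takes the league as an argument rather than inferring it from the team.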


Player's position(s):

If a substitute played 70% of his games at one position, that is the only position listed for him. If he did not play 70% of his games at one position, but played 90% of his games at two positions, he is listed with a combination position, such as "S2" for shortstop and second base, or "CO" for catcher and outfield. If a player failed to meet either the 70% or 90% requirement listed above, he is listed as a utility player ("UT").

  1B  First Base
  2B  Second Base
  SS  Short Stop
  3B  Third Base
  RF  Right Field
  CF  Center Field
  LF  Left Field
  C   Catcher
  DH  Designated Hitter
  OF  Outfield
  UT  Utility
  OS  Outfield and Short Stop
  3S  Third Base and Short Stop
  13  First and Third Base
  3O  Third Base and Outfield
  O1  Outfield and First Base
  S3  Short Stop and Third Base
  32  Third and Second Base
  DO  Designated Hitter and Outfield
  OD  Outfield and Designated Hitter
  CD  Catcher and Designated Hitter
  CS  Catcher and Short Stop
  23  Second and Third Base
  1O  First Base and Outfield
  2S  Second Base and Short Stop
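The 70%/90% rule above can be sketched as a small function. Several details are assumptions rather than statements from the source: the thresholds are read as "at least" 70% and 90%, the position with more games is written first in a combination code, and the one-character symbols in `COMBO_SYMBOL` are inferred from the code table. The function name and its input format (games played per position) are hypothetical.

```python
# One-character symbols for combination codes, inferred from the code table.
COMBO_SYMBOL = {"1B": "1", "2B": "2", "3B": "3", "SS": "S",
                "OF": "O", "C": "C", "DH": "D"}


def position_code(games_by_position: dict) -> str:
    """Apply the 70%/90% rule to a mapping of position -> games played."""
    total = sum(games_by_position.values())
    if total <= 0:
        raise ValueError("player has no recorded games")
    # Rank positions by games played, most-played first (ordering is an assumption).
    ranked = sorted(games_by_position.items(), key=lambda kv: kv[1], reverse=True)
    top_pos, top_games = ranked[0]
    # Integer arithmetic avoids floating-point edge cases at exactly 70% or 90%.
    if top_games * 100 >= 70 * total:
        return top_pos
    if len(ranked) >= 2:
        second_pos, second_games = ranked[1]
        if (top_games + second_games) * 100 >= 90 * total:
            return COMBO_SYMBOL[top_pos] + COMBO_SYMBOL[second_pos]
    return "UT"


print(position_code({"SS": 60, "2B": 35, "3B": 5}))
print(position_code({"C": 80, "OF": 20}))
print(position_code({"1B": 40, "3B": 30, "OF": 30}))
```

The presence of both "DO" and "OD" (and both "S3" and "3S") in the code table is what suggests that symbol order carries meaning, but the source does not say which position comes first.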


Note: the uncorrected data is being used!

Information about the dataset
CLASSTYPE: numeric
CLASSINDEX: none specific

Names
league,division,position_in_final_league_standings_in_1986,team,number_of_wins_in_1986,number_of_losses_in_1986,attendance_for_home_games_in_1986,attendance_for_away_games_in_1986,_average_salary,
Types
  1. nominal:A,N
  2. nominal:E,W
  3. nominal:1,2,3,4,5,6,7
  4. nominal:Atl.,Bal.,Bos.,Cal.,Chi.,Cin.,Cle.,Det.,Hou.,K.C.,L.A.,Mil.,Min.,Mon.,N.Y.,Oak.,Phi.,Pit.,S.D.,S.F.,Sea.,St.L.,Tex.,Tor.
  5. numeric
  6. numeric
  7. numeric
  8. numeric
  9. numeric
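The attribute types listed above can be used to sanity-check a row before analysis. This is a sketch under stated assumptions: `check_row` and the two constants are hypothetical names, rows are assumed to arrive as lists of nine strings, and column indices are 0-based, whereas the list above is numbered from 1.

```python
# Allowed values for the four nominal attributes (0-based column index).
NOMINAL_VALUES = {
    0: {"A", "N"},
    1: {"E", "W"},
    2: {"1", "2", "3", "4", "5", "6", "7"},
    3: set("Atl.,Bal.,Bos.,Cal.,Chi.,Cin.,Cle.,Det.,Hou.,K.C.,L.A.,Mil.,"
           "Min.,Mon.,N.Y.,Oak.,Phi.,Pit.,S.D.,S.F.,Sea.,St.L.,Tex.,Tor.".split(",")),
}
NUMERIC_COLUMNS = range(4, 9)  # attributes 5 through 9 in the list above


def check_row(row: list) -> list:
    """Return a list of problems; an empty list means the row fits the types."""
    if len(row) != 9:
        return [f"expected 9 values, got {len(row)}"]
    problems = []
    for idx, allowed in NOMINAL_VALUES.items():
        if row[idx] not in allowed:
            problems.append(f"column {idx + 1}: unexpected value {row[idx]!r}")
    for idx in NUMERIC_COLUMNS:
        try:
            float(row[idx])
        except ValueError:
            problems.append(f"column {idx + 1}: not numeric: {row[idx]!r}")
    return problems


# Placeholder row: the two attendance values are made up for illustration.
good = ["N", "E", "1", "N.Y.", "108", "54", "1000000", "2000000", "526899"]
bad = ["X", "E", "1", "N.Y.", "108", "many", "1000000", "2000000", "526899"]
print(check_row(good))
print(check_row(bad))
```

Collecting every problem instead of stopping at the first makes it easier to see whether a file is misaligned as a whole or has only isolated bad values.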
Data (first 10 data points)
    league  divi...  posi...  team   numb...  numb...  atte...  atte...  _ave...
    N       E        1        N.Y.   108      54       2767...  2176...  526899
    N       E        2        Phi.   86       75       1933...  1721...  540930
    N       E        3        St.L.  79       82       2471...  1695...  445049
    N       E        4        Mon.   78       83       1128...  1926...  312959
    N       E        5        Chi.   70       90       1859...  1879...  556256
    N       E        6        Pit.   64       98       1000...  1651...  223730
    N       W        1        Hou.   96       66       1734...  1816...  430363
    N       W        2        Cin.   86       76       1692...  1849...  313194
    N       W        3        S.F.   83       79       1528...  1924...  326583
    N       W        4        S.D.   74       88       1805...  1779...  405340
    ...     ...      ...      ...    ...      ...      ...      ...      ...
Description

A gzip'ed tar containing StatLib datasets (statlib-20050214.tar.gz, 12,785,582 Bytes)

Data Source
http://lib.stat.cmu.edu/datasets/

