Saturday, June 1, 2013

Evaluating the Moneyball draft eleven years later:
Part 2: Value of the major leaguers, college vs. high school draftees


If you read the book or saw the movie Moneyball, you could easily come away with the impression that Billy Beane made the Oakland A's a better team than their payroll predicted by using modern approaches to scouting talent. The book pays a lot of attention to the 2002 baseball draft, in which Beane chose only college players and passed over the high school stars. The emotional reason behind this decision is that Beane himself had been drafted out of high school and did not turn out to be productive at the major league level.

Obviously, a sample size of n = 1 is not a basis for science. Let's compare one statistic for the high school and college players among the first 50 position players drafted and the first 50 pitchers drafted.

I admit that a single statistic measures only a single dimension and baseball is not a one-dimensional game. For position players, I'm using a broad version of total bases: hits plus walks plus stolen bases, with one extra base for each double, two extra bases for each triple and three extra for each home run, minus the number of times the player was caught stealing.
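As a rough illustration of that arithmetic, here is a minimal sketch in Python. The function name `modified_total_bases` and the sample stat line are made up for this example and do not belong to any real player.

```python
def modified_total_bases(hits, doubles, triples, home_runs,
                         walks, stolen_bases, caught_stealing):
    """Total-bases figure as described in this post (not the official stat)."""
    # Every hit counts as one base; doubles, triples and home runs
    # add one, two and three extra bases respectively.
    extra_bases = doubles + 2 * triples + 3 * home_runs
    # Walks and stolen bases add a base each; caught stealing costs one.
    return hits + walks + stolen_bases + extra_bases - caught_stealing


# Hypothetical stat line, for illustration only.
example = modified_total_bases(
    hits=100, doubles=20, triples=2, home_runs=10,
    walks=40, stolen_bases=5, caught_stealing=3,
)
print(example)  # 196
```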

For the pitchers, I use innings pitched as the measure of their value over their career. It's not perfect, as it tends to favor starting pitchers over relievers, but in general it does show over a career how much value a pitcher had for his club.

Total base numbers for the players from the 2002 draft that made the majors
High school draftees 
===============
2902, 2114, 2111, 2062, 1629, 1243, 907, 527, 476, 179, 98, 54, 19
Average: 1101.6
Standard deviation: 974.4
n = 13

College draftees
===========
2670, 2649, 1928, 1292, 1270, 1100, 829, 541, 424, 124, 6, 3, 2
Average: 987.5
Standard deviation: 951.0
n = 13

By these numbers, the high schoolers were more productive on average than the college players, due in large part to the most productive hitter on the list, Prince Fielder, and his 2,902 career total bases so far. The top two college draftees were Nick Swisher and Curtis Granderson, who are also both still active.

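The reason I included the standard deviation and the sample size is to find out whether the difference we see is statistically significant, and the answer is no. The simplest formula for statistical significance is the z-score method: the difference between the two averages goes in the numerator, and a standard error built from the two standard deviations and sample sizes goes in the denominator. With standard deviations this large and samples this small, the denominator is roughly three times the numerator, so the z-score falls well short of the usual threshold of about 2.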
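For readers who want to check the arithmetic, here is a minimal sketch of a two-sample z-score on the total base numbers listed above. It assumes the sample standard deviation (dividing by n - 1) and the conventional 1.96 cutoff for a two-sided test at the 5% level. The function name `two_sample_z` is just a label chosen for this example.

```python
import math
import statistics

# Total base numbers copied from the lists above.
high_school = [2902, 2114, 2111, 2062, 1629, 1243, 907, 527, 476, 179, 98, 54, 19]
college = [2670, 2649, 1928, 1292, 1270, 1100, 829, 541, 424, 124, 6, 3, 2]


def two_sample_z(a, b):
    """Difference of means divided by its standard error."""
    mean_diff = statistics.mean(a) - statistics.mean(b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return mean_diff / standard_error


z = two_sample_z(high_school, college)
print(f"z = {z:.2f}")
print("significant at the 5% level" if abs(z) > 1.96 else "not significant")
```

The z-score comes out far below 1.96, which is the numerical version of saying the standard deviations swamp the difference in averages.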

Innings pitched numbers for the players from the 2002 draft that made the majors
High school draftees 
===============
1537, 1492, 1377, 1163, 1022, 473, 450, 233, 180, 164, 120, 15, 10
Average = 633.5
Standard deviation = 592.9
n = 13

College draftees
===========
1438, 1202, 1179, 1161, 1141, 495, 166, 157, 114, 82, 68, 20
Average = 602
Standard deviation = 566.2
n = 12

Once again, the big standard deviations mean the difference isn't significant, but the high school draftees did slightly outshine the college draftees.

Generally, Beane's decision to ignore high school talent appears to be counter-productive. On average, high school draftees are no more likely to wash out than college players, and the most talented can reach the major leagues at a younger age, with a chance at a longer career if they can stay healthy and productive.

One of the reasons the book focused on the draft was that the Athletics had negotiated a large number of first round picks in 2002, the highest being the 16th overall and the lowest being the 39th. Tomorrow, we will look at those picks and how the A's did against their competitors.
