Baseball Stats – dollars-per-win

A recent post on the Beyond the Boxscore blog pointed me to another blog entry on Statspeak called Building a sabermetrician’s workbench.  And as the web will do, I was in turn pointed to Baseball-Databank.org.

I’m not new to MySQL databases or SQL queries, but the existence of a free baseball database was news to me, so I downloaded and imported the beast.  It’s quite an interesting blob of data, and you can use it to analyze and discover tidbits about your favorite teams and players.

I wanted to find out how payroll and team wins are related, specifically for this past year.  I was messing around with the database and came up with a query for data that is probably easy to find and has already been analyzed on lots of other websites.  But where is the fun in that?  So I created a query that takes the sum of all players’ salaries on each team and divides it by the number of wins that team had.  A simple calculation, but a little more wordy when written out as a query:

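-- Payroll and dollars-per-win (dpw) for each team in 2008
SELECT t.teamID, t.W, t.L, (t.W/t.G) AS wpct,
 SUM(s.salary) AS payroll,        -- total of all player salaries on the team
 SUM(s.salary)/t.W AS dpw         -- payroll divided by wins
FROM Teams t
JOIN Salaries s
  ON t.teamID = s.teamID
 AND t.yearID = s.yearID
WHERE s.yearID = 2008
GROUP BY t.teamID, t.W, t.L, t.G  -- list every non-aggregated column so strict SQL modes accept it
ORDER BY dpw DESC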
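
If you don’t have a MySQL server handy, the same idea can be tried in a few lines of Python with the built-in sqlite3 module.  This is only a rough sketch under some assumptions: the two tables are cut down to the columns the query touches, the teams (AAA, BBB), players and salaries are made-up toy values rather than anything from the real database, and the helper names build_toy_db and dollars_per_win are mine.  SQLite divides integers as integers, so the sketch multiplies by 1.0 where MySQL’s / would already return a decimal.

```python
import sqlite3

# Same shape as the query above, with the year passed in as a parameter.
QUERY = """
SELECT t.teamID, t.W, t.L,
       1.0 * t.W / t.G           AS wpct,     -- 1.0 * avoids integer division in SQLite
       SUM(s.salary)             AS payroll,
       1.0 * SUM(s.salary) / t.W AS dpw
FROM Teams t
JOIN Salaries s ON t.teamID = s.teamID AND t.yearID = s.yearID
WHERE s.yearID = ?
GROUP BY t.teamID, t.W, t.L, t.G
ORDER BY dpw DESC
"""


def build_toy_db():
    """Create an in-memory database with simplified, made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Teams (yearID INTEGER, teamID TEXT, G INTEGER, W INTEGER, L INTEGER);
        CREATE TABLE Salaries (yearID INTEGER, teamID TEXT, playerID TEXT, salary INTEGER);
    """)
    # Toy values only: two hypothetical teams playing a 10-game season.
    conn.executemany("INSERT INTO Teams VALUES (?, ?, ?, ?, ?)", [
        (2008, "AAA", 10, 6, 4),
        (2008, "BBB", 10, 4, 6),
        (2007, "AAA", 10, 5, 5),
    ])
    conn.executemany("INSERT INTO Salaries VALUES (?, ?, ?, ?)", [
        (2008, "AAA", "p1", 300),
        (2008, "AAA", "p2", 300),
        (2008, "BBB", "p3", 800),
        (2007, "AAA", "p1", 100),
    ])
    return conn


def dollars_per_win(conn, year):
    """Return (teamID, W, L, wpct, payroll, dpw) rows for one season."""
    return conn.execute(QUERY, (year,)).fetchall()


if __name__ == "__main__":
    for row in dollars_per_win(build_toy_db(), 2008):
        print(row)
```

With the toy rows, BBB comes out first because its 800 in salary bought only 4 wins (200 per win), against 600 for 6 wins (100 per win) for AAA.  Passing a different year to dollars_per_win runs the same calculation for another season.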

I won’t go into detail on what each line of the query does, because you can find that out elsewhere if you are interested.  The query outputs the wins, losses, winning percentage, payroll and dollars per win for each team in 2008.  Here are the results, with dollars in millions:

teamID    W    L    wpct  payroll ($M)  dpw ($M)
NYA      89   73  0.5494         207.9     2.336
SEA      61  101  0.3765         117.7     1.929
DET      74   88  0.4568         137.7     1.861
NYN      89   73  0.5494         137.8     1.548
ATL      72   90  0.4444         102.4     1.422
LAN      84   78  0.5185         118.6     1.412
BOS      95   67  0.5864         133.4     1.404
CHA      89   74  0.5460         121.2     1.362
CHN      97   64  0.6025         118.3     1.220
LAA     100   62  0.6173         119.2     1.192
SDN      63   99  0.3889          73.7     1.169
SLN      86   76  0.5309          99.6     1.158
TOR      86   76  0.5309          97.8     1.137
SFN      72   90  0.4444          76.6     1.064
PHI      92   70  0.5679          97.9     1.064
HOU      86   75  0.5342          88.9     1.034
CIN      74   88  0.4568          74.1     1.002
BAL      68   93  0.4224          67.2     0.988
CLE      81   81  0.5000          79.0     0.975
WAS      59  102  0.3665          55.0     0.932
COL      74   88  0.4568          68.7     0.928
MIL      90   72  0.5556          80.9     0.899
TEX      79   83  0.4877          67.7     0.857
ARI      82   80  0.5062          66.2     0.807
KCA      75   87  0.4630          58.2     0.777
PIT      67   95  0.4136          48.7     0.727
MIN      88   75  0.5399          56.9     0.647
OAK      75   86  0.4658          48.0     0.640
TBA      97   65  0.5988          43.8     0.452
FLO      84   77  0.5217          21.8     0.260

What does this tell us?  The teams that made it into the postseason were LAN, BOS, CHA, CHN, LAA, PHI, MIL and TBA.  Most of them were in the top half, at around 1.2-1.4 million per win.
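
The “top half” observation is easy to check by hand, but here is a small Python sketch of the count.  It assumes the team order from the results above (highest dpw first) and a playoff list taken from the 2008 postseason field (Rays, Red Sox, White Sox, Angels, Phillies, Cubs, Brewers, Dodgers), which is an assumption here since it is not part of the query output.  The name teams_in_top_half is just for illustration.

```python
# Team IDs in the order the results list them (highest dollars-per-win first).
TABLE_ORDER = (
    "NYA SEA DET NYN ATL LAN BOS CHA CHN LAA "
    "SDN SLN TOR SFN PHI HOU CIN BAL CLE WAS "
    "COL MIL TEX ARI KCA PIT MIN OAK TBA FLO"
).split()

# Assumed 2008 playoff teams (not produced by the query itself).
PLAYOFF_TEAMS = ["LAN", "BOS", "CHA", "CHN", "LAA", "PHI", "MIL", "TBA"]


def teams_in_top_half(order, teams):
    """Return the teams whose position in `order` falls in the first half."""
    cutoff = len(order) // 2
    return [team for team in teams if order.index(team) < cutoff]


if __name__ == "__main__":
    top = teams_in_top_half(TABLE_ORDER, PLAYOFF_TEAMS)
    print(len(top), "of", len(PLAYOFF_TEAMS), "playoff teams in the top half:", top)
```

With these inputs, six of the eight playoff teams land in the top 15, with the Brewers and Rays as the exceptions.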

There are two reasons you could have a high dpw:

1) High payroll.  The more you pay, the more each win costs you.
2) Low wins.  The fewer games you win, the more each of those wins costs you.
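
Both effects show up in the 2008 numbers.  The sketch below is a minimal illustration using payroll and win figures from the results above (payroll in millions); the function name dollars_per_win is my own label for the payroll-divided-by-wins calculation.

```python
def dollars_per_win(payroll_millions, wins):
    """Payroll divided by wins, in millions of dollars per win."""
    return payroll_millions / wins


# Reason 1: same number of wins (89), very different payrolls.
nya = dollars_per_win(207.9, 89)  # Yankees
nyn = dollars_per_win(137.8, 89)  # Mets

# Reason 2: nearly the same payroll, very different win totals.
det = dollars_per_win(137.7, 74)  # Tigers

if __name__ == "__main__":
    print(f"NYA {nya:.3f}  NYN {nyn:.3f}  DET {det:.3f}")
```

The Yankees and Mets won the same number of games, so the gap between them is all payroll.  The Tigers and Mets paid almost the same amount, so the gap between them is all wins.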

Let’s look at a few teams.

The Yankees, with the highest per-win value (2.336m), had a tough year and came out six wins short of tying Boston for the wild card (assuming those wins didn’t come from Boston).  That number of wins would have lowered their dpw to about 2.2m.  Either way, their strategy, for the most part, is to pay for veterans who have proved themselves to be among the best in the game.  This year, it didn’t work out for them.
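
The what-if arithmetic for the Yankees can be spelled out as below.  This is just a sketch using the payroll and win totals from the results (payroll in millions); the variable and function names are illustrative, and it holds payroll fixed while changing only the win total.

```python
NYA_PAYROLL = 207.9  # millions, from the results
NYA_WINS = 89
BOS_WINS = 95


def dpw_at_win_total(payroll_millions, wins):
    """Dollars per win (in millions) if the team had finished with `wins` wins."""
    return payroll_millions / wins


wins_short = BOS_WINS - NYA_WINS                      # games behind Boston
actual = dpw_at_win_total(NYA_PAYROLL, NYA_WINS)      # what they really paid per win
what_if = dpw_at_win_total(NYA_PAYROLL, BOS_WINS)     # same payroll, Boston's win total

if __name__ == "__main__":
    print(f"{wins_short} wins short: {actual:.3f} -> {what_if:.3f}")
```

Even with the extra wins, the Yankees would still have had the highest dollars-per-win figure in the league, since the next team on the list (Seattle) sits at 1.929.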

The Tampa Bay Rays, on the other hand, made the most out of a low payroll by taking their 452k per win to a World Series berth.  I imagine that their dollars-per-win will go up over the next few years as their young stars start to get into the mature part of their contracts, though I haven’t researched that.  Also, perhaps the higher revenue from an excited fanbase translates into more money for free agents.

Seattle has the second-highest dpw value, with a 118m payroll and only 61 wins.  Even though the Yankees have the highest dpw, I think the Mariners are the most offensive (as in offending, not in run production) team this year.  They are in a smaller market and have less revenue coming in than the Yankees do, and the losing record and current roster make me think that they are a lot of big changes away from being competitive.

The Oakland Athletics are pretty low on the dollars-per-win chart, with a sub-.500 year that threatened for a while to be a winning year.  They are adding payroll this year and should pick up more wins, though I still expect them to be near the lower quarter of this table next year.

So what does dollars-per-win tell us?  Is payroll related to team wins?  Yes, but not always.  A team like the Yankees can be successful year after year by throwing money at the best players, but that will not guarantee them a 100-win season each time.  They can do that, though, because they are in a large market and can count on the revenue coming in to make up for the higher cost.  Smaller-market teams need to be smarter and more efficient with their payroll in order to compete for a playoff spot on a yearly basis, since they don’t have the same kind of revenues.

I don’t think there is anything wrong with having a high dollars-per-win ratio.  I think it depends on how you’re getting it.  If the team isn’t making it to the postseason (or getting close to it), then the players they are paying for are probably not worth what they are earning.  I think dollars-per-win could be considered an efficiency measure, i.e., what is the team doing with the money they are paying out?  In the end (competitively), nothing matters more than a World Series, but you’ll never get there without the wins, so wins become important.  And in the end (business-wise), you just want to make money, and wins will keep people coming to the park.

Where to go from here?  First, I should take a look at previous years and see what the patterns are.  Do most playoff teams come from the top of the dollars-per-win list, or is there always a team or two that comes from the depths?

It would be interesting to look at the similar calculation of net revenue-per-win, but I don’t know where to get those numbers.  The results would take into account each team’s market and attendance and put them up against wins.  Teams that have a higher attendance despite a poor team would probably rate well.  Maybe net revenue-per-win would mean more to the bean counters than to the stat heads.

Am I assuming too much with dollars-per-win?  Have I made any mistakes?  Any other thoughts?