How accurate is WAR, really?

So, this whole metrics vs. old-standby debate is about to boil over with the AL MVP voting and the senseless Mike Trout vs. Miguel Cabrera bickering. One side talks about playoffs, the “eye test,” and the Triple Crown; the other throws out WAR and WPA.

The consensus among the new-age metrics crowd is that stats like RBI are inherently flawed and batting average isn’t nearly as important as on-base percentage. This makes a good deal of sense, since why should it matter how a person gets on base? However, the metrics folks are sometimes given a free pass for the complexity of the math they invoke. They throw around WAR (wins above replacement) like it’s a truism, and if we are judging by WAR alone there is no comparison. According to fangraphs.com, Cabrera has a WAR of 7.2 and Trout a WAR of 10.4.

But is WAR actually a true representation of what we see on the field?

On some level that question is unanswerable. You can’t reduce everything in a sporting event to numerical values. It just doesn’t work, because players, no matter their tendencies, are not fully rational creatures. They will sometimes do things far outside what is statistically likely (for good and ill), and most importantly, they remain human. In a strange way, the metrics debate is not that dissimilar from the artificial intelligence debates… but that’s for another time. The real question is whether we can look at WAR alone to determine a player’s worth.

I did some calculating. I took Fangraphs’ WAR totals for every team in MLB, added the pitching and the hitting, then added in what they consider replacement level (43 wins). If WAR is universal and correct, there should be a very high correlation between projected wins with WAR and actual wins for these teams. But here’s the thing: the correlation wasn’t that great. In fact, not a single team in MLB has the same number of wins in WAR (rounded to the nearest whole number) as they do in reality.
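For anyone who wants to repeat the exercise, here is a minimal sketch of the calculation described above. The function names are made up for illustration, the 43-win baseline is the replacement level quoted above, and the use of Pearson's r is an assumption, since any reasonable correlation measure could stand in for it.

```python
from math import sqrt

# Replacement-level baseline quoted in the post (wins for a team of replacement players).
REPLACEMENT_WINS = 43


def war_wins(batting_war, pitching_war, baseline=REPLACEMENT_WINS):
    """Projected wins: hitting WAR plus pitching WAR plus the replacement baseline."""
    return batting_war + pitching_war + baseline


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Feed `pearson_r` one list of WAR wins and one list of actual wins, in the same team order, and it returns a value between -1 and 1. The closer to 1, the better WAR wins track the real standings.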

Since we’re talking about the AL MVP award, let’s talk primarily about the American League. As I said before, if WAR is such a great stat it should be able to predict actual records, and so it should stand a reasonable chance of predicting all the teams that make the playoffs in a given year. So, let’s imagine a made-up metrics world where WAR determined everything, including which teams get to the playoffs. According to Fangraphs’ WAR totals, the AL division winners for 2012 would be the AL East champion Yankees (94.3 wins), AL West champion Rangers (93.9 wins) and AL Central champion Tigers (89 wins). That’s pretty good. In fact, it missed these teams’ actual win totals (95, 93, and 88 respectively) by no more than one win each.

But you’ll notice one problem. The Rangers didn’t win the West, and this is where things get awfully sketchy. Who would be the two Wild Card teams in the WAR world? The Angels (90.9 wins) and the Rays (88.6 wins). Uh oh. That’s right, the Athletics, who in the real world won their division with 94 wins, would miss the WAR playoffs; in fact, they would finish 4.1 games behind the Rays for the last Wild Card spot, and that’s not the worst of it.

The Baltimore Orioles have been the surprise team of 2012, and they made the playoffs as the final Wild Card in that 5th seeded spot. In the WAR world, though, they are nowhere to be found, because their WAR win total is a measly 74.9. That’s right, WAR predicts the Orioles to finish more than 6 wins short of a .500 record (81 wins), or 18.1 wins below their actual total of 93. In fact, in the WAR world of the AL East, the Orioles would have finished behind not just the Rays but also the Boston Red Sox (77 WAR wins), who were 24 (!) games behind the Orioles in our real world.
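To put the misses side by side, here is a small sketch that uses only the AL figures quoted in this post. Two entries are inferred rather than quoted, and are marked as such in the comments: the Athletics' WAR wins (the Rays' 88.6 minus the 4.1-game gap) and the Red Sox's actual wins (24 fewer than the Orioles' 93). The helper names are made up for illustration.

```python
# (WAR wins, actual wins) for the AL teams discussed in this post.
AL_TEAMS = {
    "Yankees": (94.3, 95),
    "Rangers": (93.9, 93),
    "Tigers": (89.0, 88),
    "Athletics": (84.5, 94),  # WAR wins inferred: Rays' 88.6 minus 4.1 games back
    "Orioles": (74.9, 93),
    "Red Sox": (77.0, 69),    # actual wins inferred: 24 games behind the Orioles' 93
}


def win_errors(teams):
    """Actual wins minus WAR wins per team; positive means WAR undersold the team."""
    return {name: round(actual - war, 1) for name, (war, actual) in teams.items()}


def mean_absolute_error(teams):
    """Average size of the miss, ignoring direction."""
    return sum(abs(actual - war) for war, actual in teams.values()) / len(teams)


if __name__ == "__main__":
    for name, err in win_errors(AL_TEAMS).items():
        print(f"{name:10s} {err:+.1f}")
    print(f"Mean absolute error: {mean_absolute_error(AL_TEAMS):.1f} wins")
```

Six teams picked because they make the point are not a fair sample of the league, so treat the average here as an illustration of the spread, not as a verdict on WAR across all 30 teams.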

Ladies and gentlemen, this is a problem. How can we trust a stat designed to tell us the “truth” about players’ values when some teams are so badly misjudged by their WAR totals that a make-believe season looks nothing like reality? You could say that there is bound to be some error projecting from the micro to the macro, which is essentially what WAR does, but good math should become more accurate, on a percentage basis, the more information you feed it. With WAR, the macro view of an entire season exposes its problems. The math doesn’t add up. It seems like a good indicator, but the metrics cannot explain the reality we see on the baseball field.

By the way, in case you are wondering, the NL WAR world would have ended the 2012 season with the NL East champion Nationals (94.4 WAR wins), NL Central champion Cardinals (96.9), NL West champion Diamondbacks (91.8), and wild card Brewers (94.7) and Braves (91.6). So, yes, in fact the division champion Reds (97 real wins) and Giants (94 real wins) would not have made the playoffs in the WAR world.

Is this to say that Mike Trout shouldn’t win the AL MVP? No. We do have to look beyond the Triple Crown to fielding and base running, but we also have to acknowledge that WAR is not perfect. If I had an AL MVP vote I would give it to Trout, but not without some hard thought. Baseball is not a numbers game alone, and for that I am very thankful.

For your enjoyment, the WAR totals compared with reality are below.

[Image: WAR win totals compared with actual win totals]


One Response to How accurate is WAR, really?

  1. filihok says:

    Your analysis here is lacking:
    You said: “But here’s the thing: the correlation wasn’t that great. In fact, not a single team in MLB has the same number of wins in WAR (rounded to the nearest whole number) as they do in reality.”

    http://www.hardballtimes.com/main/article/what-is-war-good-for/
    There is a very high correlation between WAR-wins and actual wins.
