Tuesday, January 22, 2013

2012 Baseball Review: Part 1, American League

So, the elephant in the room. It's January 19th, 2013. Almost three weeks into the new year and I'm only now beginning to write about baseball? I believe in digesting information before looking at narratives and broader trends, but this seems a little extreme. While we are in the thick of the NFL playoffs and the college basketball season, I thought I should at least get some of my baseball findings on paper.

I am one of those baseball fans who only really pay attention to MLB after the All-Star break. It's a long season (too long), and it's too hard to follow so many games. But after the All-Star break, the chaff has usually been purged, leaving a few teams in each division with a shot at the post-season. To me it is infinitely more fascinating to analyze these teams than to focus on meaningless games in early July involving clubs like the Blue Jays, Twins, Mariners, etc. Only problem: watching after the Olympics made me wonder which teams I should be paying attention to, so I put together some research to inform my viewing/thinking. Some of this research proved to be quite fascinating:


The first thing I did was sort between playoff teams and pretenders. To do this, I ranked teams by run differential and odds to make the playoffs (P%) in both leagues, starting with the American League. My thought was that run differential (and its derivative, Pythagorean win probability) is one of the most telling stats to show how good a team really is. When looking at the American League, I didn't even include the Baltimore Orioles, who at the time were competing with the Yankees for first place in the AL East but had a negative run differential. This is where watching the team over a season would have made a difference. The Orioles were gangbusters in close games, winning more 1-run games than any other team in the league. No doubt their strong bullpen and ability to not blow leads helped. At one point, they had won more than 70 consecutive games with a lead in the 7th. Instead, what I saw was a team that was lucky to have as many wins as they did. I thought their run differential (which ended at +7; the next worst playoff team was Detroit at +56) was more indicative of an 80-win team.
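For anyone who wants to check the Pythagorean math, here is a minimal sketch. It assumes the classic exponent of 2 (other exponents are in use) and a 162-game season. The Orioles' 712 runs scored and +7 differential are the figures quoted in this post; the runs-allowed number is simply derived from those two, and the function names are my own.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    The exponent of 2 is the classic choice and is an assumption here.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the expected winning percentage to a full season."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# 2012 Orioles: 712 runs scored and a +7 run differential (both from this post),
# so runs allowed is derived as 712 - 7.
orioles_scored = 712
orioles_allowed = orioles_scored - 7

print(round(expected_wins(orioles_scored, orioles_allowed), 1))
```

With these inputs the estimate lands in the low 80s, which is in the same neighborhood as my 80-win read of the team.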

What's the lesson? I think that baseball, more than other sports, can be a game where the final score can be misleading and runs are lumpy. For example, in football, based on average margin of victory (found here), 49.03% of games end with a margin less than 8 points, or a one-possession game. In baseball (I know, there are no possessions, so the comparison is flawed, but data are found here), the percentage of one-run games was historically 30.24%. Almost half of NFL games end in one-score margins, far more than baseball games that are close at the end. You can take it further: 71.59% of NFL games ended in a 16 point or less margin (2 possessions), whereas only 48.47% of baseball games were 2 runs or less. What does this tell us? There are more big wins/blowouts in baseball, which skews the value of run differential. In other words, a few blowout losses can lead to a lower run differential for a given baseball team, but in football, fewer blowouts means that point differential is a better measure of talent.
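The close-game percentages above are just counts of games decided by a given margin or less, divided by total games. A rough sketch of that calculation follows; the list of margins is made up purely for illustration and is not the historical data, and the function name is my own.

```python
def share_within_margin(margins, max_margin):
    """Fraction of games whose final margin is max_margin or less."""
    if not margins:
        raise ValueError("need at least one game")
    close_games = sum(1 for margin in margins if abs(margin) <= max_margin)
    return close_games / len(margins)


# Hypothetical final margins for ten baseball games (illustrative only).
sample_margins = [1, 3, 7, 1, 2, 10, 1, 4, 2, 6]

print(share_within_margin(sample_margins, 1))  # share of one-run games
print(share_within_margin(sample_margins, 2))  # share decided by two runs or fewer
```

Run the same function over a season of NFL margins with cutoffs of 8 and 16 and you get the football side of the comparison.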

Comparing baseball with basketball illustrates the lumpiness of baseball scoring. The median scoring MLB team in 2012 happened to be the Baltimore Orioles, with 712 runs scored over the season, or roughly 4.40 runs scored/game. Compare that with the median NBA team in 2012/2013, the Brooklyn Nets, who score 96.4 points/game. The NBA team scores more frequently, which minimizes tail events and random scoring nights, both high and low. The lumpy nature of baseball scoring means that over, say, a five-game series, a team could get blown out in one game but win three close ones, and come away with a negative run differential. But in a basketball series where scoring happens more often, I suspect (no data for this one) that the better team generally outscores the other team over the series, in addition to winning more games.
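To make the series scenario concrete, here is a tiny worked example. The scores are invented for illustration, not taken from any real series: three one-run wins and one blowout loss.

```python
def series_summary(games):
    """Return (wins, losses, run_differential) for a list of (our_runs, their_runs)."""
    wins = sum(1 for ours, theirs in games if ours > theirs)
    losses = sum(1 for ours, theirs in games if ours < theirs)
    run_differential = sum(ours - theirs for ours, theirs in games)
    return wins, losses, run_differential


# Hypothetical series: three one-run wins and one blowout loss.
series = [(4, 3), (2, 1), (1, 12), (5, 4)]

wins, losses, run_differential = series_summary(series)
print(wins, losses, run_differential)
```

The team takes the series three games to one while being outscored by eight runs, which is exactly the kind of result that makes run differential look worse than the record.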

I don't know for sure how Buck Showalter managed his team so well, but I hypothesize that he understood the lumpy nature of baseball scoring and that some nights, tail events would drive things. Big wins in baseball seem to happen when starting pitchers are chased early, causing a ripple effect on the rest of the bullpen that you don't see in the other sports. Smart teams can tell when their starters don't have their best stuff on a given day, and are able to pull those guys early, electing instead to fight another day. This may be chasing a false narrative, but I think Showalter did a masterful job managing his team in terms of chasing wins where the odds were favorable, but then getting guys out whenever a game got away. This can lead to some big losses, negatively affecting run differential, but also keep guys fresh for "must-win" close games.

Anyway, after sorting the teams out and mistakenly excluding the Orioles, I sought to look at specific aspects of the teams that would make them more likely playoff contenders. I started with starting pitching.

I charted stats from Baseball-Reference for each team's top three starters, with the hypothesis that in the playoffs, due to the limited number of games and each game's increased importance, teams would choose to start their best pitchers on shorter rest, and that the top three pitchers may have an outsized impact on the outcome of the series. The raw data shows that Detroit's pitchers threw a ton of innings and racked up far more strikeouts than walks or home runs allowed. It also looks like the White Sox, Yankees, and Rays had above-average pitching staffs, with the Sox having a very interesting staff with few weaknesses (no categories in red).

But raw data can be misleading. The next section, Pitching Rate, attempts to measure much of the same information by % of pitches thrown, which adjusts for workload (pitchers who throw more will accumulate more stats). I also included some "advanced" stats: Fielding Independent Pitching (FIP, lower is better) and Wins Above Replacement (WAR). Fielding Independent Pitching is derived via a formula that weights the aforementioned common stats by importance, giving you a single number that ignores the effect of defense (which I elected not to measure). WAR is a stat I got from Baseball-Reference that measures a pitcher's performance, as you guessed, above replacement. It's basically a tell-all stat on how many wins a specific player generated.
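For reference, the commonly published form of FIP weights home runs by 13, walks plus hit batters by 3, and strikeouts by -2, divides by innings pitched, and adds a league constant to put the result on an ERA-like scale. The sketch below uses that form. The 3.10 constant is an assumed round number (the real constant varies by season), the stat line is hypothetical, and none of this is necessarily the exact calculation behind the figures I charted.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings_pitched, constant=3.10):
    """Fielding Independent Pitching in its commonly published form.

    innings_pitched should be a true decimal (e.g. 220 1/3 innings as 220.333),
    not box-score notation like 220.1. The constant is an assumed league value.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    weighted = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted / innings_pitched + constant


# Hypothetical stat line for illustration: 20 HR, 50 BB, 5 HBP, 200 K in 220 IP.
print(round(fip(20, 50, 5, 200, 220.0), 2))
```

Because strikeouts carry a negative weight, a high-strikeout pitcher can post a low FIP even with a few extra home runs on his line.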

The advanced pitching section shows that while Detroit's pitchers were indeed very good, a lot of that was likely the work of one Justin Verlander, and overall, the Chicago White Sox staff was the most consistent across all categories and produced the highest WAR. In the playoffs, I like defense due to the lumpy/lucky nature of offense. But what about the offense?

I charted stats for each team's top nine batters by number of plate appearances (including the DH). As no one is surprised to see, the Yankees had the best regular season OPS+ (advanced on-base + slugging %), but strangely, the LA Angels produced the highest WAR by virtue of a huge year from Mike Trout. In fact, the huge WAR produced by the Angels actually skewed the average for the eight teams (at the bottom), increasing it to 1.30, a figure only the Angels, Yankees, and Red Sox were better than. Put another way: if the Angels didn't make the playoffs, the difference between the best bats remaining in the league playoffs and the worst would probably not be that dramatic, placing an even greater emphasis on pitching being the differentiating factor. The last column measures power and speed, another Baseball-Reference stat that looks at home runs and steals, but I elected not to look too closely at this athleticism-related stat.

In the Total WAR column, we see that the White Sox's strong pitching helped them to the League's highest WAR. With pitching setting them apart, I believed that WAR figure to portend playoff success for that Chicago team. I was also very high on Tampa Bay; after a slow start and an injury to possibly their best player in Evan Longoria, the team was coming on strong. Their pitching was decent, though it had a tendency to give up walks, and I figured having Longoria back full-time would help their batting stats. Teams I wanted no part of were (including the Orioles, I know) the Rangers, who seemed very mediocre and had a dark cloud hanging over the clubhouse in the form of Josh Hamilton, and the Tigers, who I thought were clearly second-best to the White Sox in the Central and unlikely to get a wild card spot. Boy was I wrong about that...

Check back for the National League analysis, a league comparison, and a fun team picking exercise that would have ended better if all the above predictions panned out.
