9.03.2014

Luyman's Terms #15: Learning WPA and WE with Goofy

In my post on Monday, I mentioned that I would expand upon the How to Play Baseball video with Goofy. I thought it might be fun to take the video and learn a couple of new stats and how they work together. Those two stats are Win Probability Added (WPA) and Win Expectancy (WE). These stats are hardly predictive, but they are fun to play around with. They do a good job of telling the story of a particular game, and they highlight "clutch" plays quite well.

To understand how these stats work, we must first understand the concept of "context neutral" statistics. Context neutral stats are stats that do not care about the situation they are trying to quantify. Take my pieces about the value of hitters in the batter's box. That stat didn't care whether the batter came to the plate with runners on 1st and 3rd and nobody out, with only a runner on 2nd and 1 out, or with nobody on and 2 outs. All it cared about was the outcome of the at bat.

Most stats are context neutral. Batting average only cares if a batter got a hit, OBP only cares if a batter got on base, and fielding percentage only cares if a defender didn't make an error on a play. Context neutral stats are great for large sample sizes. Unfortunately, context does matter during an individual game of baseball. Not every single is created equal. Some singles score runs if runners are on base, others don't.

Enter WPA. This is a statistic that depends entirely upon context, and it quantifies the difference between singles that score runs and singles that don't (as well as any other batting event). WPA uses what are called Win Expectancy Matrices, tables of how often teams in a given game state went on to win, to determine the change in a team's probability of winning. Now that we have a rough idea of what WPA is, let's learn how to calculate it piece by piece. I used this tool to determine win expectancy.
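To make the idea of a Win Expectancy Matrix a little more concrete, here is a minimal Python sketch. It is not how the tool I used stores its data; the key layout (inning, half, outs, runners, home lead) and the function names are my own assumptions for illustration. The only real numbers in it are the home-team percentages quoted in this post for the bottom of the ninth with two outs and the home team down by 3.

```python
# Hypothetical layout: (inning, half, outs, runners, home_lead) -> home win expectancy.
# "runners" is a 3-character string for 1st/2nd/3rd base: a digit if occupied, "-" if empty.
# The probabilities are the home-team percentages quoted in this post.
WIN_EXPECTANCY = {
    (9, "bottom", 2, "---", -3): 0.00557,  # bases empty
    (9, "bottom", 2, "1--", -3): 0.01146,  # runner on 1st
    (9, "bottom", 2, "-2-", -3): 0.01076,  # runner on 2nd
    (9, "bottom", 2, "12-", -3): 0.03766,  # runners on 1st and 2nd
    (9, "bottom", 2, "123", -3): 0.08860,  # bases loaded
}


def home_win_expectancy(inning, half, outs, runners, home_lead):
    """Look up the home team's chance of winning from a given game state."""
    return WIN_EXPECTANCY[(inning, half, outs, runners, home_lead)]


def away_win_expectancy(inning, half, outs, runners, home_lead):
    """The away team's chance is whatever the home team does not have."""
    return 1.0 - home_win_expectancy(inning, half, outs, runners, home_lead)


print(f"{away_win_expectancy(9, 'bottom', 2, '---', -3):.3%}")
```

A real matrix has an entry for every inning, score, out, and baserunner combination, which is why some cells end up with far fewer games behind them than others.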


We join our game already in progress with the Blue Sox up by 3 in the bottom of the ninth with two outs. At this point, the Blue Sox are a 99.443% favorite to win (and I imagine a little extra since the batter has two strikes). This means that the home team only has a 0.557% chance of winning. Our pitcher is pretty pleased with himself, and gives the crowd a bow.


He unleashes his next pitch... and the batter ends up Bill Bucknering the third baseman and reaches first with a single.


As a result of this play, our team now has a 98.854% chance of winning, and the home team now has a whopping 1.146% chance! Since RedGoofy1 was the batter who hit the single, he is credited with .00589 WPA for that plate appearance (1.146% - 0.557% = 0.589 percentage points). Immediately after, RedGoofy1 gets into a rundown and ends up stealing second!


As a result of this play, our team now has a 98.924% chance to win, and the home team has a 1.076% chance. After the steal of second, RedGoofy1 is credited with -.0007 WPA (1.076% - 1.146% = -0.070 percentage points), bringing his total WPA for the inning to .00589 - .0007 = .00519 WPA. So why did RedGoofy1's WPA drop after a steal of second? He's in a much better position to score! Shouldn't it go up?
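The arithmetic behind those two plays is just "win expectancy after minus win expectancy before." Here is a small Python sketch that recomputes it from the home-team percentages quoted above. The function name `wpa` is my own, not something from the tool.

```python
def wpa(we_before, we_after):
    """Win Probability Added for one play, from the batting team's point of view."""
    return we_after - we_before


# Home-team win expectancy (as probabilities) before and after each play.
single = wpa(0.00557, 0.01146)  # bases empty -> runner on 1st
steal = wpa(0.01146, 0.01076)   # runner on 1st -> runner on 2nd

print(f"single:      {single:+.5f}")
print(f"stolen base: {steal:+.5f}")
print(f"RedGoofy1:   {single + steal:+.5f}")
```

Note that the sign comes out of the subtraction on its own: a play that lowers the batting team's win expectancy produces a negative WPA, even when it looks like a good play.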

This is one of the problems of dealing with small samples. If we take a closer look at the number of times this particular situation has come up, we see that a total of 1208 games had this exact game state. That may sound like a lot, but that is 1208 games out of a 112620 game sample. That translates to 1.0726% of total games played. That isn't very many.
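For a rough sense of how noisy a 1208 game sample is, here is a back-of-the-envelope Python sketch. It leans on an assumption that is mine and not the tool's: each game is treated as an independent win-or-lose trial, so the uncertainty on the win expectancy can be approximated with the usual binomial standard error.

```python
import math

games_in_state = 1208   # games that reached this exact game state
total_games = 112620    # games in the whole sample
home_we = 0.01076       # home win expectancy with the runner on 2nd

share_of_sample = games_in_state / total_games

# Assumption: independent win/loss trials, so use the binomial standard error.
standard_error = math.sqrt(home_we * (1 - home_we) / games_in_state)

# Size of the drop in home win expectancy on the stolen base.
steal_swing = 0.01146 - 0.01076

print(f"share of sample:    {share_of_sample:.4%}")
print(f"standard error:     {standard_error:.4%}")
print(f"swing on the steal: {steal_swing:.4%}")
```

Under that assumption the standard error is several times larger than the swing on the steal, so the tiny drop is well within the noise of the table.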

Let's go back to our game. After RedGoofy1 steals 2nd base, our hero pitcher gets so salty that he cracks the next batter right in the face! 


That leaves runners on first and second, with our team at a 96.234% chance of winning! This leaves the home team at a 3.766% chance to win, and RedGoofy2 is credited with .0269 WPA (3.766% - 1.076% = 2.690 percentage points). With the tying run coming up to bat, I'm still feeling pretty good about our team's chances of a victory! The feeling doesn't last very long as RedGoofy3 drops a bunt, wacky cartoon hijinks ensue, and the bases are suddenly loaded.


This leaves our team with a 91.14% chance of winning and the home team with an 8.86% chance of winning. RedGoofy3 is credited with .05094 WPA. It should be noted that while the batters are gaining WPA, our pitcher is losing it as a result. WPA is a zero sum stat, meaning that for every positive value credited, there is an equal negative value somewhere else. Working from the percentages above, our pitcher has a WPA for the third of an inning of -.00589 (single) + .0007 (stolen base) - .0269 (HBP) - .05094 (bunt) = -.08303 total WPA, meaning he is charged with costing his team 8.303% of a win since recording the 2nd out of the inning. That is exactly the home team's climb from 0.557% to 8.86%.

So even with the bases loaded and the winning run at the plate, I'm still feeling pretty good about our team's chances of winning. The next batter hits a long fly ball to center, more wacky cartoon hijinks ensue, and our team finally records the third out! We win! A fight breaks out, because baseball, and we are left to celebrate our victory.


After the dust settles, RedGoofy4 is credited with -.0886 WPA since he dropped his team's chances of winning down to 0%. Our pitcher is credited with .0886 WPA since he raised his team's chances of winning up to 100%. Working from the percentages above, that brings his total WPA since the second out was recorded to -.00589 (single) + .0007 (stolen base) - .0269 (HBP) - .05094 (bunt) + .0886 (third out) = .00557 WPA, which is exactly the 0.557% our team was still missing when we joined the game.
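If you want to check the whole half inning at once, here is a Python sketch of a WPA ledger built from the home-team win expectancies quoted in this post. The structure and names are mine. It follows the post in assuming that every swing, including the stolen base, is credited to the Red Goofy involved and charged to the pitcher.

```python
# (event, player credited, home win expectancy after the event)
events = [
    ("start of our story", None, 0.00557),
    ("single", "RedGoofy1", 0.01146),
    ("stolen base", "RedGoofy1", 0.01076),
    ("hit by pitch", "RedGoofy2", 0.03766),
    ("bunt", "RedGoofy3", 0.08860),
    ("fly out", "RedGoofy4", 0.0),
]


def build_ledger(events):
    """Credit each swing in win expectancy to the offense and charge the pitcher."""
    offense = {}
    pitcher_total = 0.0
    previous = events[0][2]
    for _, player, we_after in events[1:]:
        delta = we_after - previous
        offense[player] = offense.get(player, 0.0) + delta
        pitcher_total -= delta  # zero sum: the pitcher gets the opposite
        previous = we_after
    return offense, pitcher_total


offense, pitcher = build_ledger(events)
for player, total in offense.items():
    print(f"{player}: {total:+.5f}")
print(f"Pitcher:   {pitcher:+.5f}")
```

Because the stat is zero sum, the offense's totals and the pitcher's total cancel out, and the pitcher's total is simply the win expectancy his team gained between the first event and the last.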

It should be noted, again, that this is hardly a predictive statistic. WPA has a weak year-to-year correlation of .414, which means that only about 17% of the variation in a player's year X+1 WPA is explained by his year X WPA (.414 squared is roughly .17). That means the other 83% comes from somewhere else. Most likely it is the number of chances a player has to come to bat in high leverage situations and how often they succeed in those situations. If a player has a lot of chances, they are more likely to have a high WPA. If a player does not get those opportunities, their WPA will suffer.
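The 17% figure comes from squaring the correlation. As a quick Python sketch (the variable names are mine):

```python
r = 0.414                 # year-to-year correlation of WPA quoted above
r_squared = r ** 2        # share of variation explained by the previous year
unexplained = 1 - r_squared

print(f"explained:   {r_squared:.1%}")
print(f"unexplained: {unexplained:.1%}")
```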
