Thursday, May 5, 2016

The Problem with Understanding Hockey Analytics

During last night's telecast of Game 4 of Capitals @ Penguins, everyone's favorite hockey analyst, Pierre McGuire, couldn't help but make a snarky comment about analytics following Matt Cullen's goal in the 2nd period. For posterity, or until the NHL makes another terrible website design change, the video is embedded below:

"It's the little things that analytics won't tell you, Doc. Winning races, winning battles, getting to loose pucks, all of that stuff happened during that last sequence. What a job by Matt Cullen."
Now, I'm going to tell you something that you didn't expect: Pierre McGuire is right. No metric we have, or are likely developing (at least until player tracking data becomes available), can measure these things. However, both Pierre and, I suspect, the average hockey fan tend to miss the forest for the trees when it comes to understanding what the statistics we do have actually measure, and how they are meant to be applied.

Time and again, we've been over how the best indicator of future success in the standings is some form of Corsi percentage. First it was score-tied, and thanks to a further application of something we knew five years ago, now it's score-adjusted. Neither Corsi nor anything else we have measures Matt Cullen's will to win a race, Tom Wilson's ability to fire up his teammates, or the fact that Ryan Callahan's grandmother is in the stands. What Corsi does measure, and until somebody comes up with something better it is the best measure we have, is what Pierre's intangibles hope to achieve.

Put another way, what Pierre and countless others are hoping to do is measure a player's ability to help his team win games. That's the same thing that Corsi does, only Corsi has the advantage of actual, testable events leading to results. At the end of the day, accruing wins and standings points should be the goal of any successful coach, general manager, owner, and franchise. The way that anyone in these positions goes about achieving them often makes or breaks a team's success.

This isn't to say that what Pierre lists as intangibles aren't important, or that a good character guy in the room isn't necessary. The trick is to identify players that possess these traits, and at the same time, drive shot attempts towards the opposing team's net. You can harp on intangibles all you want, but if the result is utilizing a player like Tanner Glass or Zac Rinaldo for an extended period of time, your process is broken, plain and simple.

This brings me to my next point. Many analysts, like Pierre, are often asked to break down matchups on a play-by-play, game-by-game, or night-by-night basis. Similarly, the average fan often watches or attends one game at a time, and as a result, neither stops to see the bigger picture. To me, the fact that we're still dealing with comments like Pierre's on national broadcasts isn't all that surprising. In fact, it's kind of expected, given how he, many of the analysts in his industry, and many fans across the world are viewing hockey under a microscope.

Adding to the problem, hockey may be the hardest sport to predict on a game-by-game basis. Whether you look at basic models or betting markets, one trend is blatantly obvious: in hockey, even controlling for everything we know, the vast majority of matchups have odds between 55/45 and 65/35 either way. A 3:1 favorite in a single game doesn't happen often, which is why we see extreme results in small samples.

When you start to ask why, it becomes less and less surprising: hockey is a game of razor-thin margins anywhere you look. For example, an above-average team is going to control merely 2-4% more shot attempts than a below-average one over the course of a season. An above-average goaltender stops one (1) percent more shots than an average one, or one (1) more shot over a sample of 100. If the margins are this thin over 82 games, it's no surprise that they're basically non-existent over one game or one playoff series. Coincidentally, one game or one playoff series is exactly what analysts like Pierre are paid to comment on.
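To put a rough number on how thin that goaltending margin is, here is a minimal sketch. It assumes every shot is an independent event with a fixed save probability, which is a simplification, and the 0.915 baseline save rate is an illustrative placeholder rather than a figure from this post. The helper names (`binom_pmf`, `prob_better_outsaves`) are made up for the example. It computes how often a goalie who is one percentage point better actually stops more shots than the average goalie when both face the same number of shots.

```python
import math


def binom_pmf(n, p):
    """Binomial pmf for k = 0..n, computed in log space to avoid overflow."""
    log_p, log_q = math.log(p), math.log(1.0 - p)
    return [
        math.exp(
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q
        )
        for k in range(n + 1)
    ]


def prob_better_outsaves(n_shots, avg_rate=0.915, edge=0.01):
    """P(better goalie makes strictly more saves than the average goalie).

    avg_rate is an assumed, illustrative save rate; edge is the
    one-percentage-point gap described in the text.
    """
    better = binom_pmf(n_shots, avg_rate + edge)
    average = binom_pmf(n_shots, avg_rate)
    total = 0.0
    cdf_average = 0.0  # running P(average goalie saves < k)
    for k in range(n_shots + 1):
        total += better[k] * cdf_average
        cdf_average += average[k]
    return total


for shots in (100, 2000):
    print(shots, round(prob_better_outsaves(shots), 3))
```

Under these assumptions, the better goalie comes out ahead only a little more than half the time over 100 shots, and the gap only becomes reliable once the sample runs into the thousands.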

Despite the studies identifying score-adjusted metrics as the best predictor of future team success, fans, analysts, and even teams (see: Ducks, Anaheim) still can't rationalize the inherent randomness of losing as a 60% favorite in one game or one series.
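For a sense of how often that happens, here is a back-of-the-envelope sketch. It assumes a best-of-seven series in which the favorite has the same, independent chance of winning every game, so home ice, injuries, and momentum are all ignored. The function name `series_win_prob` is made up for the example.

```python
from math import comb


def series_win_prob(p_game, wins_needed=4):
    """Chance of winning a first-to-`wins_needed` series.

    Assumes each game is independent with the same win probability p_game.
    The favorite wins the series by taking the final game after dropping
    k games along the way, for k = 0 .. wins_needed - 1.
    """
    q = 1.0 - p_game
    return sum(
        comb(wins_needed - 1 + k, k) * p_game ** wins_needed * q ** k
        for k in range(wins_needed)
    )


for p in (0.55, 0.60, 0.65):
    print(f"{p:.2f} per game -> {series_win_prob(p):.3f} per series")
```

Under those assumptions, a team that is a 60% favorite in every game wins the series only about 71% of the time, which means it goes home early in roughly three series out of ten.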

To put this in different terms, if you've ever played poker, specifically No Limit Texas Hold 'em, you'll understand this concept. In poker, you might go one day, or even one week, having your opponent's King-Queen offsuit beat your Ace-Jack suited whilst all-in during a high-leverage tournament hand. However, we know that in the long run you're making the right play, given that you're going to win about 62% of the time. In hockey terms, the 2015-16 Anaheim Ducks were the team holding Ace-Jack. Their season came down to one shuffle of a weighted deck, and they lost. Instead of accepting their fate as a simple mark of bad luck and striving to continue "getting their money in with an edge," they overreacted and fired Bruce Boudreau. As a result, they're almost certain to end up with a less capable bench boss next season.

At the end of the day, the fact that mainstream hockey circles can't comprehend that 35-45% underdogs can win games and series, but don't over the long run, doesn't prove the metrics we have are broken. Instead, it proves that their understanding, and by association the overall understanding of the game, is broken.