Friday, July 15, 2011

Yeah, But: QualComp

"Going by these stats, X is better than Y." "Yeah, but…"

In the Yeah, But series, I will be taking a closer look at the stats lurking in the background. We don't care about them per se, but they provide context and help us compare and rate players. My initial plan was to start the series with a discussion of shooting percentage, but after reading this article at arcticicehockey by Dirk Hoag, this seems like a good time to take a deeper look at QualComp.

If you've ever argued with someone about whether one player is better than another, you've probably had a conversation that went something like:

Pens fan: Letang had a better year than Lidström. Look at how much better his +/- and Corsi stats are!
Wings fan: Yeah, but Lidström faced much tougher opponents!

Here are their Corsi/60 and Corsi QualComp stats, courtesy of Behindthenet:

Player      Corsi    Corsi QualComp
Lidström     2.65    1.205
Letang      12.01    0.281

Roughly speaking, Lidström put up decent numbers against the toughest competition in the league, while Letang did very well against opponents that were a little above average. How much should we boost Lidström's numbers to compensate?

The obvious way to figure this out would be to list the Corsi QoC and Corsi rate for each player (maybe above some time-played threshold) and run a regression. Unfortunately, that won't work very well and will drastically underestimate the importance of QualComp.

Say you are watching your favorite team play on home ice. It's tied in the second period and, with both teams at full strength, you see the opposition put their best players on the ice. Let's stop there. What should you expect in the next minute or so?

The first thing is that the opponents will be good, so you shouldn't expect much. That's what I'll call the competition effect - the better the opposition, the worse your expected results will be. This is what we would like to measure. On the other hand, your coach will see who is on the ice and will probably put out one of the top lines and a good defense pairing. This means you should expect good results. I'll call this the matchup effect - the better the opposition, the better your players on the ice, which will raise the expected results.

Which one will win out depends on how important the competition effect is and, importantly, how much the coaches focus on the matchups. In the regular season, they put some importance on them, but they have to take the long haul into account. In the playoffs, matchups get a lot of attention from everyone: casual fans, bloggers, on-air analysts and the coaches themselves. Barring serious injuries like Crosby's concussion, coaches do close to everything they can to win the game they're in. A top line facing a bottom line is rare, especially on faceoffs that don't follow icing. In the playoffs we would expect the matchup effect to dominate the competition effect. More on that in a bit.

To see how this all works, let's oversimplify things and say that Corsi rates are simply additive (subtractive?) - if your line is +5 and you face my line, which is +2, then your Corsi rate will be 5 - 2 = 3. Here's a graphical representation of a possible matchup between two teams:


You can see the competition effect within each line - the slope of the line connecting the points is -1, which comes from the (over)simplifying assumption that Corsi rates are additive. You can see the matchup effect by noting that as you move from left to right from the opponent's fourth line to their first, you will tend to also move up toward where your first line is because of matchups.
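For readers who prefer numbers to pictures, here is a minimal sketch of the same idea under the additive assumption. Every rating and every minute in it is made up for illustration, and the function names are my own; nothing here comes from real data. It computes each line's QualComp and Corsi rate as ice-time-weighted averages, then compares the slope across lines with the slope within a single line.

```python
# Hypothetical Corsi ratings for each team's four lines (made-up numbers).
MY_LINES = [5.0, 2.0, -1.0, -4.0]
OPP_LINES = [5.0, 2.0, -1.0, -4.0]

# Hypothetical minutes: MINUTES[i][j] is the time my line i spends against
# opposing line j. The heavy diagonal stands in for coaches matching lines.
MINUTES = [
    [6.0, 2.0, 1.0, 1.0],
    [2.0, 6.0, 1.0, 1.0],
    [1.0, 1.0, 6.0, 2.0],
    [1.0, 1.0, 2.0, 6.0],
]


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (with an intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def line_summary(i):
    """Ice-time-weighted QualComp and Corsi rate for my line i."""
    total = sum(MINUTES[i])
    qualcomp = sum(m * o for m, o in zip(MINUTES[i], OPP_LINES)) / total
    # Additive assumption: result against a line = my rating - their rating.
    corsi = sum(m * (MY_LINES[i] - o) for m, o in zip(MINUTES[i], OPP_LINES)) / total
    return qualcomp, corsi


def pooled_slope():
    """Slope across lines, one aggregated point per line."""
    summaries = [line_summary(i) for i in range(len(MY_LINES))]
    return ols_slope([q for q, _ in summaries], [c for _, c in summaries])


def within_line_slope(i):
    """Slope for a single line as its opponent changes."""
    return ols_slope(OPP_LINES, [MY_LINES[i] - o for o in OPP_LINES])


if __name__ == "__main__":
    print("slope across lines:", pooled_slope())
    print("slope within line 1:", within_line_slope(0))
```

With these invented numbers the slope within any one line is -1 by construction, while the slope across the four aggregated points comes out positive (about +0.75): the matchup effect has flipped the sign of the relationship.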

Let's look at a real-world example - the first-round series between the Vancouver Canucks and the Chicago Blackhawks. There are several things about this series that make it a good one to demonstrate the matchup effect. It's in the playoffs, when coaches go out of their way to avoid randomizing matchups. The series went seven games, with a couple overtimes for good measure, which provides more data. The sample size is in a sweet spot where it's small enough for me to be able to break it down but big enough for there to be evidence of the effects I'm trying to demonstrate. The teams also fit the bill nicely - both feature two very strong lines with a significant dropoff. As usual, I will restrict attention to 5-on-5 play where both goalies were in net.

Overall, Vancouver dominated this series 5-on-5. Chicago had no answer, apart from Crawford in goal, for either of the Canucks' top lines. Meanwhile, the Blackhawks' best struggled. Looking at the matchups, the only place where Chicago had an edge was when the bottom lines faced each other. The Blackhawks show the competition effect quite well - those players, including their best, who faced the best on the Canucks had awful stats while those playing mainly against the bottom lines did better. On the other hand, the Canucks were an example of the matchup effect swamping the competition effect. The players who faced the toughest opposition were their top players, and they did extremely well. The less skilled players struggled, even though they played far worse opposition.

Here are three plots giving the QualComp and Corsi rates for all players with at least 15 minutes of ice time against both the top and bottom two lines (combined) of the other team. For QualComp, I took the weighted average of the regular-season Corsi from BTN.
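As a rough sketch of that calculation: the weights below are assumed to be head-to-head ice time, which the text does not spell out, and the function names and sample numbers are hypothetical.

```python
def weighted_qualcomp(matchups):
    """Weighted average of opponents' regular-season Corsi.

    matchups: list of (minutes_against, opponent_corsi) pairs.
    Weighting by minutes against each opponent is an assumption.
    """
    total_minutes = sum(minutes for minutes, _ in matchups)
    if total_minutes <= 0:
        raise ValueError("need positive total ice time")
    return sum(minutes * corsi for minutes, corsi in matchups) / total_minutes


def meets_threshold(top_minutes, bottom_minutes, minimum=15.0):
    """True if a player has enough time against both groups of lines."""
    return top_minutes >= minimum and bottom_minutes >= minimum


if __name__ == "__main__":
    # Made-up example: 30 minutes against a +10 opponent, 10 against a -2.
    print(weighted_qualcomp([(30.0, 10.0), (10.0, -2.0)]))
    print(meets_threshold(42.0, 12.5))
```

In the made-up example the result is (30 x 10 + 10 x -2) / 40 = 7.0, and the second call returns False because 12.5 minutes against the bottom lines falls short of the 15-minute cutoff.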


It doesn't look like much of a relationship and this is confirmed in the regression, with similar results to what Dirk Hoag discussed:

                  Coeff.    t        R2
Corsi QualComp    -0.54     -0.23    0.0016

Instead of averaging everything out, let's look at within-player results. That is, after all, the information we want to know - e.g. if Lidström played against competition similar to Letang's, what would we expect his Corsi to be? To do this, we need to create some kind of split in the data. Vic Ferrari compared the first half of the season and the second. I will instead look at performance when facing the opponent's top two lines compared to the bottom two. Keep in mind that I'm only trying to demonstrate the matchup effect. Trying to determine the importance for the league based on a 7-game sample between two teams would obviously not work all that well.

Here is a similar scatterplot to the above, but this time with the data split:


Now that we're looking at differences for each player, we can see a much stronger relationship. It is stronger for the Chicago players but the most impressive change is on the Vancouver side. With all the minutes aggregated, there was a positive relationship between opponent strength and Corsi rate due to the matchup effect. Splitting it up, we see that the Canucks players tended to be less productive against better opposition, although the effect was quite weak - it would almost surely be bigger over a more reasonable sample.

Running the same regression on the pooled data gives us:

                  Coeff.    t        R2
Corsi QualComp    -1.92     -2.87    0.1054

Splitting the data up so we capture the change in Corsi for each player gives us a regression in which the effect is over three times as large, is statistically significant (before it wasn't even close) and has an R^2 over 65 times larger.
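One common way to capture within-player changes is to subtract each player's own averages before fitting the line. The sketch below does that on invented data; it is an illustration of the general idea and not necessarily the exact specification behind the numbers in this post. The players, values and function names are all hypothetical.

```python
# Hypothetical data: player -> [(qualcomp, corsi) vs the top two lines,
#                               (qualcomp, corsi) vs the bottom two lines].
# Built so that better players face tougher competition.
SPLITS = {
    "A": [(3.0, 4.0), (1.0, 8.0)],
    "B": [(1.0, -2.0), (-1.0, 2.0)],
    "C": [(-1.0, -6.0), (-3.0, -2.0)],
}


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (with an intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pooled_slope(splits):
    """Ignore who each point belongs to and fit one line through all of them."""
    points = [p for obs in splits.values() for p in obs]
    return ols_slope([x for x, _ in points], [y for _, y in points])


def within_player_slope(splits):
    """Subtract each player's own means first, then fit one line."""
    xs, ys = [], []
    for obs in splits.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        xs.extend(x - mean_x for x, _ in obs)
        ys.extend(y - mean_y for _, y in obs)
    return ols_slope(xs, ys)


if __name__ == "__main__":
    print("pooled slope:", pooled_slope(SPLITS))
    print("within-player slope:", within_player_slope(SPLITS))
```

On this invented data the pooled slope is positive (about +1.27) even though every player does worse against tougher opposition; removing each player's own averages recovers the within-player slope of -2 that the data were built with.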

When looking at correlations, you have to be careful to think about how coaching decisions affect everything. Good players tend to play with other good players and against other good players. This influences basically all the "yeah, but…" stats including quality of competition and teammates, zone starts and special teams. I'll explore those more in future articles. To measure effects like QualComp, it is important to use a method that captures changes for a player or group, like what I've done here, WOWY or the method Vic Ferrari used in his article on QualComp.

In a future article I will try using a similar methodology on a larger scale to figure out how large the effect actually is.
