Friday, January 6, 2012

Anomalies: Is Shooting Percentage Predictive?

There is a mountain of evidence that shooting percentage is overwhelmingly driven by luck. We've written a few articles on it and basically every top blogger has either written directly on the great role of randomness in shooting percentage or makes frequent use of that fact in analyzing hockey stats.

To summarize all of that work very briefly: over a season or less of data, team shooting percentage is mostly driven by luck. It is not very sustainable; in other words, there is very little correlation between shooting percentage in one period of time and another, whether that's even- versus odd-numbered games, the first half of the season versus the second, or one season versus the next. Stats such as shooting percentage, save percentage and the sum of these, referred to as PDO, show very strong regression to the mean. So a team shooting for a low percentage in the first half of the season is essentially as likely as one shooting well to make a high percentage of its shots in the second half of the year. I am fully on board with shooting results being mostly luck and have done a bunch of work on this myself.

A related but separate question is whether shooting percentage is predictive of future scoring. For example, is a team that shot at a high percentage in the regular season going to score more in the playoffs, on average, than a team that shot at a low percentage? All signs would seem to point to no. In addition to the work summarized above, if you just look at the correlation between shooting percentage in the regular season and scoring rate in the playoffs, it is very low. But as you may have guessed from the existence of this column, and from "anomalies" being in the title, shooting percentage does turn out to have both statistical and, I would argue, practical significance in predicting future scoring.

Results

To study how well regular-season shooting percentage predicts playoff scoring rate, I took 5-on-5 data from BTN for both the regular season and playoffs for the four seasons from 2007-2008 through 2010-2011. This gives us a sample of 64 team seasons. The variable we are trying to predict is goal-scoring rate (5-on-5 GF/60) in the playoffs. Here is the regression equation for playoff scoring rate on regular-season 5-on-5 shooting percentage (expressed out of 100) and, importantly, regular-season shooting rate (5-on-5 SF/60):

Playoff scoring rate = 0.213 * RS Sh% + 0.132 * SF/60 - 3.482

The coefficient on shot rate is nearly 4 standard errors greater than zero and very strongly significant. That should surprise nobody; shot rate is a solid predictor of future scoring. More surprising is that the p-value for regular-season shooting percentage is 0.014, which is easily significant at the standard 5% level and almost significant at the stricter 1% level. Playoff scoring rates carry a lot of variance: you have to factor in matchups, and some teams only play 4 games. Despite all that, once you include shooting rate, shooting percentage is a statistically significant predictor of playoff scoring!
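For readers who want to see what "standard errors above zero" means mechanically, here is a minimal sketch of a two-predictor least-squares fit. The post does not say what software was used, so this is only an illustration: the function name `fit_ols` is my own, and the data are synthetic numbers generated from made-up coefficients, not the BTN sample.

```python
import math
import random


def fit_ols(rows, y):
    """Ordinary least squares with an intercept (intercept is returned last).

    rows: list of predictor tuples; y: list of responses.
    Returns (coefficients, standard_errors).
    """
    X = [list(r) + [1.0] for r in rows]  # append the intercept column
    n, k = len(X), len(X[0])
    # Build the normal equations (X'X) b = X'y.
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[a] + [1.0 if a == b else 0.0 for b in range(k)]
           for a in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    coef = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    # Residual variance uses n - k degrees of freedom.
    resid = [y[i] - sum(c * x for c, x in zip(coef, X[i])) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(s2 * inv[a][a]) for a in range(k)]
    return coef, se


# SYNTHETIC data only: 64 fake team seasons drawn from made-up parameters.
rng = random.Random(0)
rows, y = [], []
for _ in range(64):
    sf60 = rng.gauss(30.0, 2.0)    # assumed shot-rate distribution
    sh_pct = rng.gauss(8.5, 0.8)   # assumed shooting-percentage distribution
    rows.append((sh_pct, sf60))
    y.append(0.2 * sh_pct + 0.13 * sf60 - 3.5 + rng.gauss(0.0, 0.5))

coef, se = fit_ols(rows, y)
for name, c, s in zip(("Sh%", "SF/60", "intercept"), coef, se):
    # t is the coefficient measured in standard errors from zero.
    print(f"{name:>9}: coef={c:+.3f}  se={s:.3f}  t={c / s:+.2f}")
```

The t column is the quantity being described when a coefficient is said to sit "nearly 4 standard errors" from zero; the p-value follows from comparing it to a t distribution with n - 3 degrees of freedom.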

That's great for the stats nerds, but is it significant enough for anyone else to care? Let's look at this with a hypothetical example. Let's take two teams that had the playoff-team average shooting rate 5-on-5, give one the playoff-team average shooting percentage in the regular season and the other a shooting percentage one standard deviation higher. Here is a table with their regular-season shooting rates, RS shooting percentages and expected scoring rate in the playoffs:

RS Sh%     RS SF/60    Exp. Playoff GF/60
9.264%     29.927      2.438
8.44%      29.927      2.262

Going by average 5-on-5 ice time and series length, you could say that if two teams have the same shooting rate, the team with the good shooting percentage will score a little under a goal per series (0.81) more than the one that is average. If you compared a team with a high percentage against one that is below average, the gap would be larger. If you'll forgive me for making simplifying assumptions such as independence, over 20% of first-round matchups should feature teams with shooting percentages different enough for their expected goals for in the series to differ by more than a goal. I think that's large enough for this to be significant in practice, not just statistically.
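The arithmetic behind the per-series figure can be sketched as follows. The coefficients are the rounded ones from the equation above, so the predictions land within about 0.005 of the table values. The post does not state the ice-time or series-length numbers it used; `ASSUMED_5V5_MINUTES_PER_SERIES` is a placeholder I picked so that the result lands near the quoted 0.81, not a figure from the source.

```python
def predicted_playoff_gf60(rs_sh_pct, rs_sf60):
    """Fitted equation from the post, using its rounded coefficients."""
    return 0.213 * rs_sh_pct + 0.132 * rs_sf60 - 3.482


# Inputs taken from the two-team table in the post.
avg_team = predicted_playoff_gf60(8.44, 29.927)
high_sh_team = predicted_playoff_gf60(9.264, 29.927)
gap_per_60 = high_sh_team - avg_team

# ASSUMPTION: total 5-on-5 minutes a team plays in one series.
# This is a placeholder, not a number given in the post.
ASSUMED_5V5_MINUTES_PER_SERIES = 277.0
gap_per_series = gap_per_60 * ASSUMED_5V5_MINUTES_PER_SERIES / 60.0

print(f"average team:   {avg_team:.3f} GF/60")
print(f"high-Sh% team:  {high_sh_team:.3f} GF/60")
print(f"gap per 60:     {gap_per_60:.3f}")
print(f"gap per series: {gap_per_series:.2f} goals")
```

The per-series gap scales linearly with the minutes assumption, so a shorter series or less 5-on-5 time shrinks it proportionally.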

Better to be good than lucky.

I want to emphasize that while shooting percentage appears to be a significant predictor, shooting rate is stronger. A good way to look at this is to consider predicted scoring rates for teams one standard deviation above the mean in one or both of these.

+1 SD        Shot Rate    Shooting%    Predicted GF/60
Both         32.014       9.26%        2.713
Shot Rate    32.014       8.44%        2.538
Shooting%    29.927       9.26%        2.438
Both Avg     29.927       8.44%        2.262

You can see from the Shot Rate and Shooting% rows that a team with a higher rate but average percentage is predicted to score more goals than one with an average rate and higher percentage. Using a similar method to the one I used above, the team with the better rate will score just under half a goal more per series.
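To make the comparison concrete, here is a small sketch that isolates what a one-standard-deviation bump in each input is worth under the fitted equation. The means and +1 SD values are read off the table above; the variable names are mine, and the rounded coefficients mean the totals differ from the table in the third decimal place.

```python
# Rounded coefficients from the post's regression equation.
COEF_SH_PCT = 0.213
COEF_SF60 = 0.132
INTERCEPT = -3.482

# Values read from the table: playoff-team average and +1 SD for each input.
SF60_AVG, SF60_PLUS_1SD = 29.927, 32.014
SH_AVG, SH_PLUS_1SD = 8.44, 9.26


def predict(sh_pct, sf60):
    """Predicted playoff 5-on-5 GF/60 for a given regular-season profile."""
    return COEF_SH_PCT * sh_pct + COEF_SF60 * sf60 + INTERCEPT


scenarios = {
    "Both": (SH_PLUS_1SD, SF60_PLUS_1SD),
    "Shot Rate": (SH_AVG, SF60_PLUS_1SD),
    "Shooting%": (SH_PLUS_1SD, SF60_AVG),
    "Both Avg": (SH_AVG, SF60_AVG),
}
predictions = {name: predict(sh, sf) for name, (sh, sf) in scenarios.items()}

# Effect of +1 SD in one input while the other stays at its average.
effect_rate = predictions["Shot Rate"] - predictions["Both Avg"]
effect_pct = predictions["Shooting%"] - predictions["Both Avg"]

for name, value in predictions.items():
    print(f"{name:<10} {value:.3f}")
print(f"+1 SD shot rate adds {effect_rate:.3f} GF/60")
print(f"+1 SD shooting% adds {effect_pct:.3f} GF/60")
```

Because the model is linear with no interaction term, the "Both" row is simply the average prediction plus the two separate effects.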

Spurious Explanations

So we have established that if you take two teams with equal shooting rates but different shooting percentages in the regular season, the one that made a higher percentage of its shots will have a significantly higher average scoring rate in the playoffs. I have come up with two theories on how this could happen even under the extreme view that shooting is all luck.

The first is that the higher shooting percentage means the team probably scored more goals, which means they probably had a better record and perhaps this means they faced weaker competition in the playoffs. So luck-based results in the regular season led to weaker playoff competition. This seems unlikely because changes would be so marginal, but in theory it could be the case.

A more complicated idea is based on score effects. It's pretty well established that teams tend to shoot at a higher rate when they are behind and a lower rate when they are ahead. Extending this, the team that got the bounces to go its way was in the lead more often. Since that team was in the lead more often but still put up the same shooting rate, it is actually the better team when it comes to shot rate, and that will shine through in the playoffs.

Neither of these theories can be disproven 100%, but I think we can safely conclude that such effects would be quite small. I got at this by including a different variable: regular-season goal-scoring rate for all time that isn't 5-on-5. So this includes all special teams, 4-on-4 and 3-on-3. Since this is goal-scoring rate directly, and about 1 in 3 goals are scored in such situations, it seems like this would have a stronger connection to playoff seeding or being ahead than luck-based 5-on-5 shooting percentage. When I ran a similar regression of playoff scoring on 5-on-5 shooting rate and non-5-on-5 scoring rate the latter wasn't close to significant, with the standard error greater than the size of the coefficient and the R^2 barely budged from using shot rate alone. There are also other reasons I'll get to in the near future that make me think score effects are out as an explanation.

I think we can reasonably conclude that while there may be some negligible score or even playoff-matchup effects, that's not what's driving most of the predictive power of regular-season shooting percentage.

Shot Quality? Aim?

Having racked my brain for the week or so since discovering this, the only explanation I can find is our good friend shot quality. Getting high-quality shots isn't easy; the other team is doing its best to keep you out of the dangerous areas and stop rebounds from going there. If you have two teams that shoot at the same rate but one gets more high-quality shots, then that team is probably better at possessing the puck and generating offense. Perhaps an entire season of 5-on-5 shots is enough for this to shine through.

There is some evidence to back this theory up. Running a regression we find that if you take into account regular-season SF/60, regular-season shooting percentage is a significant predictor of playoff shot rates! In other words, if you have two teams that shot at the same rate during the regular season then the one that shot for a higher percentage will, on average, have a higher playoff SF/60. The coefficient is 0.918, so increasing regular-season shooting% by 1.1 percentage points would increase the expected shot rate in the playoffs by 1.

To me, shooting percentage indicating something about future shooting rates is pretty strong evidence that, if it's either, it's about shot quality and not sniping ability. If you had a team very good at hitting the corners during the regular season, it's not clear why they would be more likely to take more shots in the playoffs. In contrast, if you have a team very good at creating shots from just in front of the crease, then it seems reasonable that they'd shoot more often in the future, because that takes skill that more readily translates.

What do you think?

We'd love to hear your thoughts on this. In particular, I'd like any alternative theories beyond shot quality. Are you surprised at all by this? Does it make sense? Let us know by leaving a comment or sending a tweet to @drivingplay.

4 comments:

  1. I have found that on an individual level shooting percentage is predictive of future goal scoring rates so I am not surprised to see the same at the team level. I have also found that playing style has a significant influence on shooting percentages. Players that are asked to play an offensive game generally are associated with above average shooting percentages while they are on the ice. Conversely players that are more defensive in nature tend to have significantly lower shooting percentages. The same probably happens, though probably to a lesser extent, at the team level.

    So, a theory I could propose is kind of like score effects on steroids. The team that has the higher regular season shooting percentage is the better offensive team or at least likely play a more offensive oriented game. This may make the team that has the lower regular season shooting percentage naturally feel they can't match offense vs offense so instead play more of a defensive game (they may be a more defensive oriented team to start with, hence the lower shooting percentage) even when the game is tied. Playing in a defensive shell typically results in more shots against (as we know from score effects) thus resulting in the team with the higher regular season shooting percentage having more shots.

  2. I don't think these results are surprising at all. At times I think the rhetoric gets a little heated, and over-minimizes the significance of shot quality.

    It certainly exists, both at an individual and team level, but the spread of true talent there is much less significant than with shot quantity. The two concepts can peacefully coexist, but just need to be set in proper proportion.

  3. I have to agree with Forecheker, although I am a bit skeptical. Did you run a regression with only Sh%? What % of variance is explained by that model? In all the models I've seen, adding Sh% usually reduces R^2. Instead of jumping to shot quality, I think perhaps there are a few differences in the playoffs that may be significant. 1) The playoffs are seeded. We have more data on games in which one team is clearly (through their RS P%) better, playing an inferior team (i.e. 1v8, 2v7). In all likelihood it isn't until the conference finals that teams are evenly matched. This is 12 series of mismatches vs. 3 series of equal teams. 2) Teams are selected to appear in the playoffs. These teams are much more likely to have more skilled players.

  4. Patrick,

    When I ran it with just shooting percentage it came up completely insignificant, with an R^2 of something like 0.01 or something very small. I think the key to that is that you have to include shot rate because shot rate and shooting% are negatively correlated and shot rate is far more predictive. So if you have two teams with different shooting percentages, the higher one probably shot less and this effect swamps any shooting-percentage effect. Only when you take the shot rate into account can you see the effect.
