Driving Play: shooting percentage

Friday, January 6, 2012

Anomalies: Is Shooting Percentage Predictive?

There is a mountain of evidence that shooting percentage is overwhelmingly driven by luck. We've written a few articles on it and basically every top blogger has either written directly on the great role of randomness in shooting percentage or makes frequent use of that fact in analyzing hockey stats.

To summarize all of that work very briefly: over a season or less of data, team shooting percentage is mostly driven by luck. It is not very sustainable. In other words, there is very little correlation between shooting percentages in one period of time and another, whether that's even- and odd-numbered games, the first and second halves of a season, or one season and the next. Stats such as shooting percentage, save percentage and the sum of the two, referred to as PDO, show very strong regression to the mean. So a team shooting at a low percentage in the first half of the season is essentially as likely as one shooting well to make a high percentage of its shots in the second half of the year. I am fully on board with shooting results being mostly luck and have done a bunch of work on this myself.

A related but separate question is whether shooting percentage is predictive of future scoring. For example, is a team that shot at a high percentage in the regular season going to score more in the playoffs on average than a team that shot at a low percentage? All signs would seem to point to no. In addition to the work summarized above, if you just look at the correlation between shooting percentage in the regular season and scoring rate in the playoffs, it is very low. But as you may have guessed from the existence of this column, and from "anomalies" being in the title, shooting percentage does turn out to have both statistical and, I would argue, practical significance in predicting future scoring.

Results

To study how well regular-season shooting percentage predicts playoff scoring rate, I took 5-on-5 data from BTN (Behind the Net) for both the regular season and playoffs for the four seasons from 2007-2008 through 2010-2011. With 16 playoff teams a year, this gives us a sample of 64 team seasons. The variable we are trying to predict is goal-scoring rate (5-on-5 GF/60) in the playoffs. Here is the regression equation for playoff scoring rate on regular-season 5-on-5 shooting percentage (in percentage points, so 8.44% enters as 8.44) and, importantly, regular-season shooting rate (5-on-5 SF/60):

Playoff scoring rate = 0.213 * RS Sh% + 0.132 * SF/60 - 3.482
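
For readers who want to plug in their own numbers, here is a minimal Python sketch of the fitted equation above. The function name is mine, and the inputs are assumed to be in the same units as the regression: shooting percentage in percentage points and shot rate in shots per 60 minutes of 5-on-5 play.

```python
def predicted_playoff_gf60(rs_sh_pct, rs_sf60):
    """Evaluate the fitted equation from the post (5-on-5 only).

    rs_sh_pct: regular-season shooting percentage in points (8.44 means 8.44%)
    rs_sf60:   regular-season shots for per 60 minutes
    """
    return 0.213 * rs_sh_pct + 0.132 * rs_sf60 - 3.482


# Example: a team with an 8.44% shooting percentage and 29.927 shots per 60
print(round(predicted_playoff_gf60(8.44, 29.927), 2))
```

Because the coefficients are quoted to three decimals, results from this sketch can differ in the third decimal from figures computed with the unrounded regression.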

The coefficient on shot rate is nearly 4 standard errors greater than zero and very strongly significant. That should surprise nobody; shot rate is a solid predictor of future scoring. More surprising is that the p-value for regular-season shooting percentage is 0.014, which is easily significant at the standard 5% level and almost significant at the stricter 1% level. Playoff scoring rates are very noisy (matchups matter and some teams play only 4 games), yet once you include shooting rate, shooting percentage is a statistically significant predictor of playoff scoring!

That's great for the stats nerds, but is it significant enough for anyone else to care? Let's look at this with a hypothetical example. Let's take two teams that had the playoff-team average shooting rate 5-on-5, give one the playoff-team average shooting percentage in the regular season and the other a shooting percentage one standard deviation higher. Here is a table with their regular-season shooting rates, RS shooting percentages and expected scoring rate in the playoffs:

RS Sh%	RS SF/60	Exp. Playoff GF/60
9.264%	29.927	2.438
8.44%	29.927	2.262

Going by average 5-on-5 ice time and series length, you could say that if two teams have the same shooting rate, the team with the good shooting percentage will score a little under a goal per series (0.81) more than the one that is average. If you compared one team with a high percentage and another below average, the gap would be larger. If you'll forgive me for making simplifying assumptions such as independence, over 20% of first-round matchups should feature teams with shooting percentages different enough for their predicted goals for in the series to differ by more than a goal. I think that's large enough to be significant in practice, not just statistically so.
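
The per-series figure is just the gap in predicted scoring rate multiplied by 5-on-5 ice time. The post does not give the ice time or series length it used, so the sketch below plugs in illustrative values (5.8 games per series and 48 minutes of 5-on-5 play per game) that are my assumptions, picked to be plausible rather than taken from the data.

```python
SH_PCT_COEF = 0.213  # GF/60 per percentage point of RS Sh%, from the regression


def series_goal_gap(sh_pct_gap, games_per_series=5.8, min_5v5_per_game=48.0):
    """Extra 5-on-5 goals per series implied by a shooting-percentage gap.

    games_per_series and min_5v5_per_game are illustrative assumptions,
    not figures taken from the post.
    """
    rate_gap = SH_PCT_COEF * sh_pct_gap  # extra goals per 60 minutes
    hours_5v5 = games_per_series * min_5v5_per_game / 60.0
    return rate_gap * hours_5v5


# One standard deviation in regular-season shooting percentage
print(round(series_goal_gap(9.264 - 8.44), 2))
```

With these assumed inputs the result lands near the 0.81 goals quoted above; different ice-time assumptions scale it proportionally.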

Better to be good than lucky.

I want to emphasize that while shooting percentage appears to be a significant predictor, shooting rate is stronger. A good way to look at this is to consider predicted scoring rates for teams one standard deviation above the mean in one or both of these.

+1 SD	Shot Rate	Shooting%	Predicted GF/60
Both	32.014	9.26%	2.713
Shot Rate	32.014	8.44%	2.538
Shooting%	29.927	9.26%	2.438
Both Avg	29.927	8.44%	2.262

Comparing the two middle rows, you can see that a team with a higher rate but average percentage is predicted to score more goals than one with an average rate and higher percentage. Using a method similar to the one above, the team with the better rate will score just under half a goal more per series.

Spurious Explanations

So we have established that if you take two teams with equal shooting rates but different shooting percentages in the regular season, the one that made a higher percentage of its shots will have a significantly higher average scoring rate in the playoffs. I have come up with two theories on why this would happen even under the extreme idea that shooting is all luck.

The first is that the higher shooting percentage means the team probably scored more goals, which means they probably had a better record and perhaps this means they faced weaker competition in the playoffs. So luck-based results in the regular season led to weaker playoff competition. This seems unlikely because changes would be so marginal, but in theory it could be the case.

A more complicated idea is based on score effects. It's pretty well established that teams tend to shoot at a higher rate when they are behind and a lower rate when they are ahead. Extending this, the team that got the bounces to go its way was in the lead more often. Since it was in the lead more often but put up the same shooting rate, it is actually the better team when it comes to shot rate, and that will shine through in the playoffs.

Neither of these theories can be disproven 100%, but I think we can safely conclude that such effects would be quite small. I got at this by including a different variable: regular-season goal-scoring rate for all time that isn't 5-on-5. So this includes all special teams, 4-on-4 and 3-on-3. Since this is goal-scoring rate directly, and about 1 in 3 goals are scored in such situations, it seems like this would have a stronger connection to playoff seeding or being ahead than luck-based 5-on-5 shooting percentage. When I ran a similar regression of playoff scoring on 5-on-5 shooting rate and non-5-on-5 scoring rate, the latter wasn't close to significant: its standard error was greater than the size of the coefficient, and the R^2 barely budged from using shot rate alone. There are also other reasons, which I'll get to in the near future, that make me think score effects are out as an explanation.

I think we can reasonably conclude that while there may be some negligible score or even playoff-matchup effects, that's not what's driving most of the predictive power of regular-season shooting percentage.

Shot Quality? Aim?

Having racked my brain for the week or so since discovering this, the only explanation I can find is our good friend shot quality. Getting high-quality shots isn't easy; the other team is doing their best to keep you out of the dangerous areas and stop rebounds from going there. If you have two teams that shoot at the same rate but one gets more high-quality shots, then that team is probably better at possessing the puck and generating offense. Perhaps an entire season of 5-on-5 shots is enough for this to shine through.

There is some evidence to back this theory up. Running a regression we find that if you take into account regular-season SF/60, regular-season shooting percentage is a significant predictor of playoff shot rates! In other words, if you have two teams that shot at the same rate during the regular season then the one that shot for a higher percentage will, on average, have a higher playoff SF/60. The coefficient is 0.918, so increasing regular-season shooting% by 1.1 percentage points would increase the expected shot rate in the playoffs by 1.

To me, shooting percentage indicating something about future shooting rates is pretty strong evidence that, if it's either, it's about shot quality and not sniping ability. If you had a team very good at hitting the corners during the regular season, it's not clear why they would be more likely to take more shots in the playoffs. In contrast, if you have a team very good at creating shots from just in front of the crease then it seems reasonable that they'd shoot more often in the future because that takes skill that more readily translates.

What do you think?

We'd love to hear your thoughts on this. In particular, I'd like any alternative theories beyond shot quality. Are you surprised at all by this? Does it make sense? Let us know by leaving a comment, or a tweet @drivingplay.

Saturday, October 29, 2011

How Much of Shooting Percentage Is Skill?

The correct answer, as it is for most questions, is "it depends." In this case, on sample size.

In a recent post at Arctic Ice Hockey, the indispensable Gabe Desjardins argued that we should move away from working on metrics for shot quality because there isn't much payoff. This has motivated me to write a follow-up to an article I wrote a couple months ago on how luck vs skill influence shooting percentage based on sample size. In that article, I took two teams, one above average and one below average at shooting, and examined how likely the good team is to shoot at a higher percentage for a given number of shots. Here I will look at how much variation in shooting percentage is explained by skill and luck for different numbers of shots.

Methodology

My methodology is a sort of mirror image of what JLikens did in his article on the same subject and Vic Ferrari's imaginary dice rolling. JLikens assumed each team had the same real shooting percentage and ran simulations to see how much variation in results there would be after a season worth of even-strength shots. In my simulations I will create a distribution of shooting talent and see how much that skill explains variation in results for a given number of shots.

Here are the steps:
- Going by this more recent JLikens article, in each of 10,000 simulations I created 30 teams, each with a shooting talent drawn from Beta(263,2977), a distribution pretty close to that of actual even-strength team shooting skill in the NHL.
- All 30 teams take the same number of shots, which score a goal or not based on the probability given by the team's shooting skill.
- For each simulation, I calculate the R^2 between shooting percentage on those shots and the shooting talent of the teams. The average of these tells us how much variation in shooting percentage results is explained by shooting ability.
- The rest is luck.
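
The steps above can be sketched in a few lines of Python. This is my own reconstruction, not the original script: the function names are hypothetical, it uses only the standard library, and the default number of simulations is far below 10,000 so that it runs quickly in pure Python.

```python
import random


def pearson_r2(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def average_skill_share(shots, n_sims=200, n_teams=30, a=263, b=2977, seed=0):
    """Average R^2 between true shooting talent and observed shooting percentage."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        # Step 1: draw each team's true shooting talent from Beta(a, b)
        talent = [rng.betavariate(a, b) for _ in range(n_teams)]
        # Step 2: every team takes the same number of shots
        observed = [
            sum(rng.random() < p for _ in range(shots)) / shots for p in talent
        ]
        # Step 3: R^2 between results and talent for this simulated league
        total += pearson_r2(talent, observed)
    return total / n_sims


print(round(average_skill_share(500, n_sims=50), 3))
```

With so few simulations the estimate is noisy; raising `n_sims` toward 10,000 tightens it at the cost of run time.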

Results

Here is a table with the results. The first column is the time period in question. The second is the number of shots each team took in the simulations. The third column is the percentage of variation in even-strength shooting-percentage results that is explained by the skill component - the average R^2 of all simulations. The last column is simply 100% minus that and represents the percentage of variation that is due to random chance, luck if you will.

Time Period	Shots	Skill	Luck
Season to today	250	9.9%	90.1%
1/4 season	500	15.7%	84.3%
half season	1,000	25.1%	74.9%
1 season	2,000	38.9%	61.1%
2 seasons	4,000	55.4%	44.6%
3 seasons	6,000	64.9%	35.1%
4 seasons	8,000	70.9%	29.1%
5 seasons	10,000	75.3%	24.7%
6 seasons	12,000	78.4%	21.6%
7 seasons	14,000	80.9%	19.1%
8 seasons	16,000	82.9%	17.1%
9 seasons	18,000	84.5%	15.5%
10 seasons	20,000	86%	14%

Here's a graph:

Put in words, at this point in the season shooting results are 90% luck and 10% skill. That likely understates luck's share, as I'll discuss below. Over a whole season luck's share falls to a little over 60%. It takes about 140 games worth of shots for results to be 50/50.
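
As a rough cross-check on the simulated numbers (not the method used above), this beta-binomial setup has a closed form: with talent drawn from Beta(a, b) and n shots per team, the share of variance in observed shooting percentage that comes from talent is n / (n + a + b). The sketch below evaluates it; the function name is mine.

```python
def expected_skill_share(shots, a=263, b=2977):
    """Population share of shooting-percentage variance explained by talent.

    Follows from Var(talent) / (Var(talent) + E[p(1 - p)] / shots) when
    talent is Beta(a, b) and goals are binomial given talent.
    """
    return shots / (shots + a + b)


for shots in (250, 2000, 20000):
    print(shots, round(expected_skill_share(shots), 3))
```

This gives about 38% at 2,000 shots and 86% at 20,000, close to the simulated 38.9% and 86%. It comes out lower than the table for small samples (about 7% at 250 shots versus 9.9%), most likely because an R^2 computed from only 30 teams is biased upward when the true correlation is weak. The 50/50 point of the formula is 3,240 shots, or about 1.6 seasons at 2,000 shots per season, which is in the same ballpark as the figure quoted above.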

Some thoughts

I'm making several assumptions that are not valid. The biggest and most obviously dubious is that shooting-percentage skill will be the same for every shot. In reality, if a team shoots at an 8.5% clip their top line will shoot higher, their fourth line lower, they'll do better against weaker opponents and worse against good goaltenders and so on. Injuries, trades, free agency and coaching changes are obviously a big issue as well. On a related note, I assume that each team takes the same number of shots. In practice, teams obviously take more or fewer shots than average over a given stretch. To make things more problematic, it is probably the case that teams that shoot more often tend to be better at making their shots. One thing all of these factors have in common is that they will increase the randomness factor. Think of the above estimates as lower bounds on how much luck explains variation in shooting percentage or an upper bound for how much skill matters.

Wednesday, September 7, 2011

Luck, Skill and Sample Size in Playoff Shooting Percentage

In a discussion on a message board about my previous article on luck and skill in shooting percentage, someone asked about the numbers for the playoffs. The average series runs a little under 6 games. That makes the entire playoffs pretty close to a quarter of a season for the teams that make the Cup finals. Here are the numbers for how often the team that is good at shooting (team A, about 7th best in the league) outscores the team that is bad at shooting (team B, ~7th worst in the league) when both take the same number of shots:

Time period	Number of Shots	A scores more	B scores more	Goals scored equal	A > B Significant	B > A significant
Four Games	96	51%	38.1%	10.9%	7.4%	3.9%
Five Games	120	51.9%	38.7%	9.4%	7.6%	3.8%
Six Games	144	54.1%	37.5%	8.4%	7.8%	3.4%
Seven Games	168	55.4%	36.6%	8%	7.9%	3.1%
Second Round	274	57.4%	36.6%	6%	9%	3.1%
Conference Finals	411	60.5%	34.6%	4.9%	9.4%	2.7%
Stanley Cup Finals	548	62.9%	33.1%	4%	10.7%	1.9%

Something worth noting is that on average this gap in shooting from good to bad is worth a little less than a goal (0.88 goals) per series. Despite that, the team with less shooting talent still has between a 35% and 40% chance to get more goals at even strength in a series if they get the same number of shots, and another 8-10% shot at breaking even. The bounces don't always even out. That leaves a lot of room for luck, creating more shots and special teams.
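
The expected gap is simple arithmetic: the difference in true shooting percentage times the number of shots. A small sketch, using the 8.42% and 7.78% talent levels from the earlier article (the function name is mine):

```python
P_GOOD = 0.0842  # true shooting percentage of team A
P_BAD = 0.0778   # true shooting percentage of team B


def expected_goal_gap(shots):
    """Expected extra goals for team A when both teams take `shots` shots."""
    return (P_GOOD - P_BAD) * shots


for label, shots in [("Four Games", 96), ("Seven Games", 168)]:
    print(label, round(expected_goal_gap(shots), 2))
```

By this arithmetic a gap of 0.88 goals corresponds to about 137.5 shots, which is a little under six games at the 24 shots per game used in the table.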

Friday, August 19, 2011

On Luck, Skill and Sample Size in Shooting Percentage

The roles of skill and luck in shooting are an important and often misunderstood part of hockey analysis. This is particularly true when analytical and traditional fans get together. A typical discussion might go something like:

A: Steven Stamkos got 91 points with 45 goals at the age of 21. He's definitely getting 100 points next year, and could easily score 50 goals a season as he improves.
B: Yeah, but he was lucky to make 16.5% of his shots. That's not sustainable and his numbers will probably go down next year, even if he does actually improve.
A: So shooting is all luck? You clearly know nothing about hockey and should actually watch some games instead of just sitting at your computer all day coming up with fancy stats that don't mean anything.

Turns out both guys have a point. Stamkos should probably expect his stats to drop next season because, in addition to goal numbers trending downward for the league as a whole, he's probably not burying 16.5% of his shots again. On the other hand, person B probably should watch more hockey; it is a great game.

A lot of the confusion comes from incorrect either/or thinking. Scoring on a high percentage of your shots is a result of both luck and skill. I'm not just putting down the traditional fans here. Those of us in the analytical community, myself included, are prone to bad thinking as well. While most people ignore or underrate the importance of luck in hockey in general and shooting in particular, we tend to go too far the other way, chalking everything up to luck and ignoring the skill aspect.

Shooting for a high percentage is skill based. Two of our favorite writers, JLikens of objectivenhl fame and Gabe Desjardins from BTN and arcticicehockey, have written several articles on the subject.

In case their articles are not convincing enough, let's consider two of the best players in the game: Henrik Sedin and Sidney Crosby. While not necessarily known as snipers, these players have all the skills that one might think lead to their team putting a high percentage of shots in the net. Both have elite vision, passing ability, hands, positioning and, in one case, telepathy. We would all expect their teams to have better shooting percentages when they are on the ice than when they are sitting on the bench or worse. The numbers bear this out. Here is a chart with their teams' performances at even strength with both goalies in net from the last four seasons combined. The stats are courtesy of noted Driving Play reader Vic Ferrari's timeonice scripts, which you can find information on how to use here.

Team	Goals	Shots On Goal	Shooting %
Penguins, Crosby on Ice	236	2160	10.9%
Penguins, Crosby off Ice	414	5347	7.7%
Canucks, Henrik on Ice	273	2624	10.4%
Canucks, Henrik off Ice	359	4709	7.6%

As you can see, the Pens with Crosby shot 3.2 percentage points higher than they did without him. While some of it may be variance, with the number of shots they took with him on, that's a difference of 69 goals or more than 17 goals per season due to better shooting. The Canucks shot 2.8 points higher with Henrik on the ice, a difference of about 73 goals, more than 18 per season, when you consider how many shots they took with him on. For the statistically minded, these shooting-percentage differences are very very very significant. To give you an idea, it varies field to field but the most common benchmark is for there to be less than a 5% chance of results this extreme, or more so, due to variance alone. That's a 1-in-20 chance. For Crosby, there is a 0.00045% chance, or less than 1 in 222,000. For Hank there is a 0.00235% chance of results that extreme due to randomness alone, less likely than 1 in 42,000. Again, 1 in 20 is the usual mark. The data confirm what anyone would guess from watching a few games: Henrik Sedin and Sidney Crosby help their teams shoot better. (Note: if you are a hater and/or think that it's the likes of Alex Burrows and Pascal Dupuis who are driving these results, feel free to be wrong. The point of this is to provide evidence of shooting skill and clearly someone has it when these two are on the ice.)
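
The post does not say which test produced these probabilities. One plausible candidate is a one-sided two-proportion z-test with a pooled standard error, which gives values close to the ones quoted. The sketch below is written under that assumption; the function names are mine.

```python
import math


def one_sided_p_value(goals_on, shots_on, goals_off, shots_off):
    """Upper-tail p-value of a pooled two-proportion z-test (an assumed test)."""
    p_on = goals_on / shots_on
    p_off = goals_off / shots_off
    pooled = (goals_on + goals_off) / (shots_on + shots_off)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shots_on + 1 / shots_off))
    z = (p_on - p_off) / se
    return 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for a standard normal


def extra_goals(goals_on, shots_on, goals_off, shots_off):
    """Goals gained on the on-ice shots versus shooting at the off-ice rate."""
    return shots_on * (goals_on / shots_on - goals_off / shots_off)


crosby = (236, 2160, 414, 5347)
henrik = (273, 2624, 359, 4709)
for name, row in [("Crosby", crosby), ("Henrik", henrik)]:
    print(name, round(extra_goals(*row), 1), one_sided_p_value(*row))
```

The p-values are returned as proportions, so multiply by 100 to compare them with the percentages in the text.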

Let's now look at the role of luck on shooting percentage. To do this, I will run simulations comparing the results of a typical team that shoots well and one that does poorly. In this article on objectivenhl, which is worthy of being linked again, JLikens finds that the average team shoots at an 8.1% clip 5-on-5, with a standard deviation of 0.48%. Going by this, a team that is good at shooting, let's say 7th or 8th best in the league, would have a true 5-on-5 shooting percentage of something like 8.42%. On the other hand, a team that is bad at shooting, say 7th or 8th worst in the league, would be expected to score on about 7.78% of their shots.
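
One way to arrive at figures like these, assuming team shooting talent is roughly normal and treating "7th or 8th of 30" as the 75th and 25th percentiles (both are my assumptions, not something stated in the post):

```python
from statistics import NormalDist

# True-talent 5-on-5 shooting percentage, in points: mean 8.1, SD 0.48
league = NormalDist(mu=8.1, sigma=0.48)

good = league.inv_cdf(0.75)  # assumed percentile for 7th or 8th best of 30
bad = league.inv_cdf(0.25)   # assumed percentile for 7th or 8th worst of 30
print(round(good, 2), round(bad, 2))
```

Under those assumptions the two values round to 8.42 and 7.78, matching the figures above.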

Let's see how things shake out. Below is a chart giving the results of 10,000 simulations for various numbers of shots where team A has a true shooting percentage of 8.42% and team B shoots at 7.78%. The first two columns tell you the given time period and number of shots for each team. The next three columns tell you how often the team good at shooting outscored the bad (column 3), the bad team outscored the good (4) and how often they had an equal shooting percentage (5). The last two columns give what percent of the time someone looking at the data, and not knowing the underlying percentages, would get statistical significance at the 5% level. Notice that in the last column, the statistical test would reveal that B is significantly better at shooting than A despite their shooting skill actually being over half a percentage point worse.

Time period	Number of Shots	A scores more	B scores more	Goals scored equal	A > B Significant	B > A significant
One Period	8	31.9%	28.7%	39.4%	1.3%	1.1%
One Game	24	42.6%	36%	21.4%	6%	4.5%
1/4 Season	500	62.3%	33.2%	4.5%	10.5%	2.2%
1/2 Season	1,000	68.6%	28.5%	2.8%	13.4%	1.4%
1 Season	2,000	76.4%	21.7%	1.9%	17.9%	0.8%
2 Seasons	4,000	85%	14%	1%	27.8%	0.2%
3 Seasons	6,000	90.1%	9.3%	0.6%	36.1%	0.1%
4 Seasons	8,000	93.6%	6%	0.4%	43.7%	0.1%
5 Seasons	10,000	95.6%	4.1%	0.3%	50.3%	0%
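
A simulation along these lines can be sketched as follows. The post does not name the significance test, so the one here (a one-sided two-proportion z-test at the 5% level, applied in each direction) is my assumption, as are the function name and the reduced default number of simulations.

```python
import math
import random


def head_to_head(shots, p_a=0.0842, p_b=0.0778, n_sims=2000, seed=0):
    """Share of simulations in which A or B scores more, ties, or tests significant."""
    rng = random.Random(seed)
    z_crit = 1.645  # one-sided 5% critical value (assumed test)
    counts = {"a_more": 0, "b_more": 0, "tied": 0, "a_sig": 0, "b_sig": 0}
    for _ in range(n_sims):
        goals_a = sum(rng.random() < p_a for _ in range(shots))
        goals_b = sum(rng.random() < p_b for _ in range(shots))
        if goals_a > goals_b:
            counts["a_more"] += 1
        elif goals_b > goals_a:
            counts["b_more"] += 1
        else:
            counts["tied"] += 1
        pooled = (goals_a + goals_b) / (2 * shots)
        se = math.sqrt(2 * pooled * (1 - pooled) / shots)
        if se > 0:  # skip the test when neither team scored
            z = (goals_a - goals_b) / shots / se
            if z > z_crit:
                counts["a_sig"] += 1
            elif z < -z_crit:
                counts["b_sig"] += 1
    return {key: value / n_sims for key, value in counts.items()}


print(head_to_head(24))  # one game's worth of shots
```

For very small samples, such as a single period, the normal approximation behind this test is poor, so the significance columns from this sketch should not be expected to match the table closely there.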

You probably didn't find the results surprising for that first row, representing a period of play. The most common outcome, happening about 40% of the time, is that the two teams remain tied, most often at 0. The team that shoots better due to getting higher-quality shots, hitting the corners better and so on is only slightly more likely to be the one that is ahead if you know that one of them is. Less than 32% of the time will the better team find themselves ahead after a period in which both get the league average 8 shots, whereas they'll be behind almost 29% of the time.

Lower on the chart it gets more troubling, especially for us bloggers. The most common sample point for analysis is half a season. Generally the best way to study the persistence of something is to split the season in half, typically first half vs second half or even- and odd-numbered games, and compare the two samples. This works well because teams should be the same or very similar. If you study something over multiple seasons you aren't getting the same teams every year due to player and coaching changes. In half a season, the team near the top in shooting skill has only about a 2 in 3 chance of outscoring the team near the bottom with the same number of shots. There is also little chance, roughly 13%, of finding that the better team is significantly better at shooting if you were looking at the data. Even over a whole season of shooting data, there is roughly a 1 in 5 chance (21.7%) that the worse team will get better results. It isn't until we get several years worth of shooting results that it tilts heavily in favor of the better shooting team, and that's not realistic because teams change so much each offseason and the simulations assumed the same percentage each season.

As you can see, luck plays a huge role for all reasonable sample sizes. This is the fundamental reason why shot stats are better than goals. Luck is less of a factor for number of shots taken than number of shots made, so shot counts are more reliable indicators of skill over samples of a season or less. If over a season there is roughly a 1 in 5 chance that a good-shooting team is outscored by a bad-shooting team on the same number of shots, then it's tough to say that a team's results are due to skill and not just random luck.

In a future installment I will look at how persistence is affected by sample size.