Showing posts with label Luck. Show all posts

Friday, January 6, 2012

Anomalies: Is Shooting Percentage Predictive?

There is a mountain of evidence that shooting percentage is overwhelmingly driven by luck. We've written a few articles on it and basically every top blogger has either written directly on the great role of randomness in shooting percentage or makes frequent use of that fact in analyzing hockey stats.

To summarize all of that work very briefly, over a season or less worth of data team shooting percentage is mostly driven by luck. It is not very sustainable; in other words, there is very little correlation between shooting percentages in one period of time and another, whether that's even- and odd-numbered games, the first and second halves of the season, or one season and the next. Stats such as shooting percentage, save percentage and the sum of these, referred to as PDO, show very high regression to the mean. So a team shooting for a low percentage in the first half of the season is essentially as likely as one shooting well to make a high percentage of shots in the second half of the year. I am fully on board with shooting results being mostly luck and have done a bunch of work on this myself.

A related but separate question is whether shooting percentage is predictive of future scoring. For example, is a team that shot at a high percentage in the regular season going to score more in the playoffs on average than a team that shot at a low percentage? All signs would seem to point to no. In addition to the work summarized above, if you just look at the correlation between shooting percentage in the regular season and scoring rate in the playoffs, it is very low. As you may have guessed by the existence of this column, and it having "anomalies" in the title, shooting percentage does turn out to have both statistical and, I would argue, actual significance in predicting future scoring.

Results

To study the predictability of regular-season shooting percentage on playoff scoring rate, I took 5-on-5 data from BTN for both the regular season and playoffs for the four seasons from 2007-2008 through 2010-2011. This gives us a sample of 64 team seasons. The variable we are trying to predict is goal-scoring rate (5-on-5 GF/60) in the playoffs. Here is the regression equation for playoff scoring rate on regular-season 5-on-5 shooting percentage (expressed out of 100) and, importantly, regular-season shooting rate (5-on-5 SF/60):

Playoff scoring rate = 0.213 * RS Sh% + 0.132 * SF/60 - 3.482

The coefficient on shot rate is nearly 4 standard errors greater than zero and very strongly significant. That should surprise nobody; shot rate is a solid predictor of future scoring. More surprising is that the p-value for regular-season shooting percentage is 0.014, which is easily significant at the standard 5% level and almost significant at the stricter 1% level. Playoff scoring rates are very noisy: matchups matter and some teams play only 4 games. Despite all that variance, once you include shooting rate, shooting percentage is a statistically significant predictor of playoff scoring!

That's great for the stats nerds, but is it significant enough for anyone else to care? Let's look at this with a hypothetical example. Let's take two teams that had the playoff-team average shooting rate 5-on-5, give one the playoff-team average shooting percentage in the regular season and the other a shooting percentage one standard deviation higher. Here is a table with their regular-season shooting rates, RS shooting percentages and expected scoring rate in the playoffs:

RS Sh%    RS SF/60    Exp. Playoff GF/60
9.264%    29.927      2.438
8.44%     29.927      2.262

Going by average 5-on-5 ice time and series length, you could say that if two teams have the same shooting rate, the team with the good shooting percentage will score a little under a goal per series (0.81) better than the one that is average. If you compared one team with a high percentage and another below average it would be higher. If you'll forgive me for making simplifying assumptions such as independence, over 20% of first-round matchups should feature teams with shooting percentages different enough for their predicted expected goals for in the series to be off by more than a goal. I think it's large enough that this is significant in practice, not just statistically so.
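As a quick arithmetic check, the regression equation above can be evaluated directly. This is a minimal sketch that uses the rounded coefficients as printed, so its output can differ from the table in the third decimal place; the function name is mine and purely illustrative.

```python
def predicted_playoff_gf60(rs_sh_pct, rs_sf60):
    """Playoff 5-on-5 GF/60 from the regression quoted in the article.

    rs_sh_pct is expressed out of 100 (e.g. 8.44); rs_sf60 is shots for per 60.
    The coefficients are the rounded values printed in the equation.
    """
    return 0.213 * rs_sh_pct + 0.132 * rs_sf60 - 3.482


# Two teams with the same shot rate; one shoots a standard deviation better.
avg_team = predicted_playoff_gf60(8.44, 29.927)
hot_team = predicted_playoff_gf60(9.264, 29.927)

print(round(avg_team, 3), round(hot_team, 3))
print("gap per 60 minutes:", round(hot_team - avg_team, 3))
```

The gap is per 60 minutes of 5-on-5 play; turning it into a per-series number requires an estimate of 5-on-5 ice time per series, which the article applies but does not spell out.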

Better to be good than lucky.

I want to emphasize that while shooting percentage appears to be a significant predictor, shooting rate is stronger. A good way to look at this is to consider predicted scoring rates for teams one standard deviation above the mean in one or both of these.

+1 SD        Shot Rate    Shooting%    Predicted GF/60
Both         32.014       9.26%        2.713
Shot Rate    32.014       8.44%        2.538
Shooting%    29.927       9.26%        2.438
Both Avg     29.927       8.44%        2.262

You can see from the Shot Rate and Shooting% rows that a team with a higher rate but average percentage is predicted to score more goals than one with an average rate and higher percentage. Using the same method as above, the team with the better rate will score just under half a goal more per series.
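The two per-series figures can be checked against each other. The sketch below backs out the hours of 5-on-5 play per series implied by the 0.81-goal figure quoted earlier, then applies the same number to the shot-rate comparison. The implied ice time is my inference from the article's numbers, not a figure the author states.

```python
# Predicted playoff GF/60 values taken from the two tables in the article.
both_avg = 2.262
plus_sd_shooting_pct = 2.438
plus_sd_shot_rate = 2.538

# The article says +1 SD of shooting percentage is worth 0.81 goals per series.
# Hours of 5-on-5 play per series implied by that figure (an inference).
implied_hours = 0.81 / (plus_sd_shooting_pct - both_avg)

# Extra goals per series for the +1 SD shot-rate team over the +1 SD shooting% team.
rate_edge_per_series = (plus_sd_shot_rate - plus_sd_shooting_pct) * implied_hours

print(round(implied_hours, 2), round(rate_edge_per_series, 2))
```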

Spurious Explanations

So we have established that if you take two teams with equal shooting rates but different shooting percentages in the regular season, the one that made a higher percentage of its shots will have a significantly higher average scoring rate in the playoffs. I have come up with two theories for why this could happen even under the extreme assumption that shooting is all luck.

The first is that the higher shooting percentage means the team probably scored more goals, which means they probably had a better record and perhaps this means they faced weaker competition in the playoffs. So luck-based results in the regular season led to weaker playoff competition. This seems unlikely because changes would be so marginal, but in theory it could be the case.

A more complicated idea is based on score effects. It's pretty well established that teams tend to shoot at a higher rate when they are behind and a lower rate when they are ahead. Extending this, the team that got the bounces to go their way was in the lead more often. Since they were in the lead more often but still put up the same shooting rate, they are actually better at generating shots, and that will shine through in the playoffs.

Neither of these theories can be disproven 100%, but I think we can safely conclude that such effects would be quite small. I got at this by including a different variable: regular-season goal-scoring rate for all time that isn't 5-on-5. So this includes all special teams, 4-on-4 and 3-on-3. Since this is goal-scoring rate directly, and about 1 in 3 goals are scored in such situations, it seems like this would have a stronger connection to playoff seeding or being ahead than luck-based 5-on-5 shooting percentage. When I ran a similar regression of playoff scoring on 5-on-5 shooting rate and non-5-on-5 scoring rate the latter wasn't close to significant, with the standard error greater than the size of the coefficient and the R^2 barely budged from using shot rate alone. There are also other reasons I'll get to in the near future that make me think score effects are out as an explanation.

I think we can reasonably conclude that while there may be some negligible score or even playoff-matchup effects, that's not what's driving most of the predictive power of regular-season shooting percentage.

Shot Quality? Aim?

Having racked my brain for the week or so since discovering this, the only explanation I can find is our good friend shot quality. Getting high-quality shots isn't easy; the other team is doing their best to keep you out of the dangerous areas and stop rebounds from going there. If you have two teams that shoot at the same rate but one gets more high-quality shots, then that team is probably better at possessing the puck and generating offense. Perhaps an entire season of 5-on-5 shots is enough for this to shine through.

There is some evidence to back this theory up. Running a regression we find that if you take into account regular-season SF/60, regular-season shooting percentage is a significant predictor of playoff shot rates! In other words, if you have two teams that shot at the same rate during the regular season then the one that shot for a higher percentage will, on average, have a higher playoff SF/60. The coefficient is 0.918, so increasing regular-season shooting% by 1.1 percentage points would increase the expected shot rate in the playoffs by 1.

To me, shooting percentage indicating something about future shooting rates is pretty strong evidence that, if it's either of the two, it's about shot quality and not sniping ability. If you had a team very good at hitting the corners during the regular season, it's not clear why they would be more likely to take more shots in the playoffs. In contrast, if you have a team very good at creating shots from just in front of the crease then it seems reasonable that they'd shoot more often in the future because that takes skill that more readily translates.

What do you think?

We'd love to hear your thoughts on this. In particular, I'd like any alternative theories beyond shot quality. Are you surprised at all by this? Does it make sense? Let us know by leaving a comment, or a tweet @drivingplay.

Saturday, October 29, 2011

How Much of Shooting Percentage Is Skill?

The correct answer, as it is for most questions, is "it depends." In this case, on sample size.

In a recent post at Arctic Ice Hockey, the indispensable Gabe Desjardins argued that we should move away from working on metrics for shot quality because there isn't much payoff. This has motivated me to write a follow-up to an article I wrote a couple of months ago on how luck and skill influence shooting percentage depending on sample size. In that article, I took two teams, one above average and one below average at shooting, and examined how likely the good team is to shoot at a higher percentage for a given number of shots. Here I will look at how much variation in shooting percentage is explained by skill and luck for different numbers of shots.

Methodology

My methodology is a sort of mirror image of what JLikens did in his article on the same subject and Vic Ferrari's imaginary dice rolling. JLikens assumed each team had the same real shooting percentage and ran simulations to see how much variation in results there would be after a season worth of even-strength shots. In my simulations I will create a distribution of shooting talent and see how much that skill explains variation in results for a given number of shots.

Here are the steps:
- Going by this more recent JLikens article, in each of 10,000 simulations I created 30 teams by drawing a shooting percentage for each from Beta(263,2977), a distribution pretty close to that of actual even-strength team shooting skill in the NHL.
- All 30 teams take the same number of shots, which score a goal or not based on the probability given by the team's shooting skill.
- For each simulation, I calculate the R^2 between shooting percentage on those shots and the shooting talent of the teams. The average of these tells us how much variation in shooting percentage results is explained by shooting ability.
- The rest is luck.
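The steps above can be sketched as follows. This is a minimal reconstruction from the description, not the author's actual code: it uses only Python's standard library, runs 100 simulations rather than 10,000 to keep it quick, and the function names, seed and defaults are my own choices.

```python
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def average_r2(shots, sims=100, teams=30, seed=1):
    """Average R^2 between true shooting talent and observed shooting percentage."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(sims):
        # Talent for each team is drawn from Beta(263, 2977), as in the article.
        talent = [rng.betavariate(263, 2977) for _ in range(teams)]
        # Every shot is an independent Bernoulli trial at the team's talent level.
        observed = [
            sum(rng.random() < p for _ in range(shots)) / shots for p in talent
        ]
        total += pearson_r(talent, observed) ** 2
    return total / sims


skill_share = average_r2(shots=1000)  # roughly half a season of shots
print(f"skill: {skill_share:.1%}, luck: {1 - skill_share:.1%}")
```

With so few simulations the result will bounce around by a percentage point or two from seed to seed; raising `sims` tightens it at the cost of run time.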

Results

Here is a table with the results. The first column is the time period in question. The second is the number of shots each team took in the simulations. The third column is the percentage of variation in even-strength shooting-percentage results that is explained by the skill component - the average R^2 of all simulations. The last column is simply 100% minus that and represents the percentage of variation that is due to random chance, luck if you will.

Time Period        Shots     Skill    Luck
Season to today    250       9.9%     90.1%
1/4 season         500       15.7%    84.3%
half season        1,000     25.1%    74.9%
1 season           2,000     38.9%    61.1%
2 seasons          4,000     55.4%    44.6%
3 seasons          6,000     64.9%    35.1%
4 seasons          8,000     70.9%    29.1%
5 seasons          10,000    75.3%    24.7%
6 seasons          12,000    78.4%    21.6%
7 seasons          14,000    80.9%    19.1%
8 seasons          16,000    82.9%    17.1%
9 seasons          18,000    84.5%    15.5%
10 seasons         20,000    86%      14%

Here's a graph:


Put in words, at this point in the season shooting results are 90% luck and 10% skill. This is likely an underestimate, as I'll discuss below. Over a whole season it goes to a little over 60% random chance. It takes about 140 games worth of shots for results to be 50/50.
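For a rough cross-check that is not part of the original analysis: when talent follows Beta(a, b) and each shot is an independent trial, the share of variance in observed shooting percentage that comes from talent works out to n / (n + a + b) for n shots, in the limit of many teams. The sketch below assumes that closed form. Simulated averages over only 30 teams tend to sit a little above it at small sample sizes, because sample R^2 is biased upward.

```python
def expected_skill_share(shots, a=263, b=2977):
    """Population R^2 between Beta(a, b) talent and observed shooting percentage."""
    return shots / (shots + a + b)


for label, shots in [("1/4 season", 500), ("1 season", 2000), ("10 seasons", 20000)]:
    print(f"{label}: {expected_skill_share(shots):.1%} skill")

# Shots needed for a 50/50 split, converted to games at 2,000 shots per 82 games.
break_even_shots = 263 + 2977
print(break_even_shots, round(break_even_shots / 2000 * 82))
```

The break-even point lands in the same ballpark as the roughly 140 games quoted above.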

Some thoughts

I'm making several assumptions that are not valid. The biggest and most obviously dubious is that shooting-percentage skill will be the same for every shot. In reality, if a team shoots at an 8.5% clip their top line will shoot higher, their fourth line lower, they'll do better against weaker opponents and worse against good goaltenders and so on. Injuries, trades, free agency and coaching changes are obviously a big issue as well. On a related note, I assume that each team takes the same number of shots. In practice, teams obviously take more or fewer shots than average over a given stretch. To make things more problematic, it is probably the case that teams that shoot more often tend to be better at making their shots. One thing all of these factors have in common is that they will increase the randomness factor. Think of the above estimates as lower bounds on how much luck explains variation in shooting percentage or an upper bound for how much skill matters.

Wednesday, September 7, 2011

Luck, Skill and Sample Size in Playoff Shooting Percentage

In a discussion on a message board about my previous article on luck and skill in shooting percentage, someone asked about the numbers for the playoffs. The average series runs a little under 6 games. That makes the entire playoffs pretty close to a quarter of a season for the teams that make the Cup finals. Here are the numbers for how often the team that shoots at a higher percentage (about 7th best in the league) outshoots the team that is bad at shooting (~7th worst in the league):

Time period           Number of Shots    A scores more    B scores more    Goals scored equal    A > B Significant    B > A Significant
Four Games            96                 51%              38.1%            10.9%                 7.4%                 3.9%
Five Games            120                51.9%            38.7%            9.4%                  7.6%                 3.8%
Six Games             144                54.1%            37.5%            8.4%                  7.8%                 3.4%
Seven Games           168                55.4%            36.6%            8%                    7.9%                 3.1%
Second Round          274                57.4%            36.6%            6%                    9%                   3.1%
Conference Finals     411                60.5%            34.6%            4.9%                  9.4%                 2.7%
Stanley Cup Finals    548                62.9%            33.1%            4%                    10.7%                1.9%

Something worth noting is that on average this gap in shooting from good to bad is worth a little less than a goal (0.88 goals) per series. Despite that, the team with less shooting talent still has between a 35% and 40% chance to get more goals at even strength in a series if they get the same number of shots, and another 8-10% shot at breaking even. The bounces don't always even out. That leaves a lot of room for luck, creating more shots and special teams.
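The 0.88-goal figure is easy to sanity-check. This sketch assumes the same 8.42% and 7.78% true shooting percentages used in the previous article, and 24 shots per team per game, which is what the table's 96 shots over four games implies.

```python
GOOD, BAD = 0.0842, 0.0778   # true shooting percentages from the previous article
SHOTS_PER_GAME = 24          # implied by 96 shots over four games in the table


def expected_goal_gap(shots):
    """Average extra goals for the better-shooting team over a number of shots."""
    return (GOOD - BAD) * shots


for games in (4, 5, 6, 7):
    print(games, round(expected_goal_gap(games * SHOTS_PER_GAME), 2))

# Series length (in games) at which the gap equals the quoted 0.88 goals.
shots_for_quoted_gap = 0.88 / (GOOD - BAD)
print(round(shots_for_quoted_gap / SHOTS_PER_GAME, 1))
```

The last line comes out a little under 6 games, which matches the average series length mentioned above.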

Friday, August 19, 2011

On Luck, Skill and Sample Size in Shooting Percentage

The roles of skill and luck in shooting are an important and often misunderstood part of hockey analysis. This is particularly true when analytical and traditional fans get together. A typical discussion might go something like:

A: Steven Stamkos got 91 points with 45 goals at the age of 21. He's definitely getting 100 points next year, and could easily score 50 goals a season as he improves.
B: Yeah, but he was lucky to make 16.5% of his shots. That's not sustainable and his numbers will probably go down next year, even if he does actually improve.
A: So shooting is all luck? You clearly know nothing about hockey and should actually watch some games instead of just sitting at your computer all day coming up with fancy stats that don't mean anything.

Turns out both guys have a point. Stamkos should probably expect his stats to drop next season because, in addition to goal numbers trending downward for the league as a whole, he's probably not burying 16.5% of his shots. On the other hand, person B probably should watch more hockey; it is a great game.

A lot of the confusion comes from incorrect either/or thinking. Scoring on a high percentage of your shots is a result of both luck and skill. I'm not just putting down the traditional fans here. Those of us in the analytical community, myself included, are prone to bad thinking as well. While most people ignore or underrate the importance of luck in hockey in general and shooting in particular, we tend to go too far the other way, chalking everything up to luck and ignoring the skill aspect.

Shooting for a high percentage is skill based. Two of our favorite writers, JLikens of objectivenhl fame and Gabe Desjardins from BTN and arcticicehockey have written several articles on the subject.

In case their articles are not convincing enough, let's consider two of the best players in the game: Henrik Sedin and Sidney Crosby. While not necessarily known as snipers, these players have all the skills that one might think lead to their team putting a high percentage of shots in the net. Both have elite vision, passing ability, hands, positioning and, in one case, telepathy. We would all expect their teams to have better shooting percentages when they are on the ice than when they are sitting on the bench or worse. The numbers bear this out. Here is a chart with their teams' performances at even strength with both goalies in net from the last four seasons combined. The stats are courtesy of noted Driving Play reader Vic Ferrari's timeonice scripts, which you can find information on how to use here.

Team                       Goals    Shots On Goal    Shooting %
Penguins, Crosby on Ice    236      2160             10.9%
Penguins, Crosby off Ice   414      5347             7.7%
Canucks, Henrik on Ice     273      2624             10.4%
Canucks, Henrik off Ice    359      4709             7.6%

As you can see, the Pens with Crosby shot 3.2 percentage points higher than they did without him. While some of it may be variance, with the number of shots they took with him on, that's a difference of 69 goals or more than 17 goals per season due to better shooting. The Canucks shot 2.8 points higher with Henrik on the ice, a difference of over 73 goals, more than 18 per season, when you consider how many shots they took with him on. For the statistically minded, these shooting-percentage differences are very very very significant. To give you an idea, it varies field to field but the most common benchmark is for there to be less than a 5% chance of results this extreme, or more so, due to variance alone. That's a 1-in-20 chance. For Crosby, there is a 0.00045% chance, or less than 1 in 222,000. For Hank there is a 0.00235% chance of results that extreme due to randomness alone - less likely than 1 in 42,000. Again, 1 in 20 is the usual mark. The data confirm what anyone would guess from watching a few games - Henrik Sedin and Sidney Crosby help their teams shoot better. (Note: if you are a hater and/or think that it's the likes of Alex Burrows and Pascal Dupuis who are driving these results, feel free to be wrong. The point of this is to provide evidence of shooting skill and clearly someone has it when these two are on the ice.)
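The article does not say which test produced those probabilities. As a hedged illustration, a one-sided pooled two-proportion z-test on the table's goals and shots lands close to the quoted figures. The choice of test and the function name are my assumptions.

```python
from math import erfc, sqrt


def one_sided_p(goals_on, shots_on, goals_off, shots_off):
    """One-sided pooled two-proportion z-test.

    Returns the chance of an on-ice minus off-ice shooting gap at least this
    large if both samples really shared one underlying shooting percentage.
    """
    p_on, p_off = goals_on / shots_on, goals_off / shots_off
    pooled = (goals_on + goals_off) / (shots_on + shots_off)
    se = sqrt(pooled * (1 - pooled) * (1 / shots_on + 1 / shots_off))
    z = (p_on - p_off) / se
    return 0.5 * erfc(z / sqrt(2))  # upper tail of the standard normal


crosby = one_sided_p(236, 2160, 414, 5347)
henrik = one_sided_p(273, 2624, 359, 4709)
print(f"Crosby: {crosby:.5%}  Henrik: {henrik:.5%}")
```

A two-sided version of the same test would simply double these probabilities, which would still leave them far below the 5% benchmark.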

Let's now look at the role of luck on shooting percentage. To do this, I will run simulations comparing the results of a typical team that shoots well and one that does poorly. In this article on objectivenhl, which is worthy of being linked again, JLikens finds that the average team shoots at an 8.1% clip 5-on-5, with a standard deviation of 0.48%. Going by this, a team that is good at shooting, let's say 7th or 8th best in the league, would have a true 5-on-5 shooting percentage of something like 8.42%. On the other hand, a team that is bad at shooting, say 7th or 8th worst in the league, would be expected to score on about 7.78% of their shots.
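One reading that reproduces the 8.42% and 7.78% figures is to treat team shooting talent as normally distributed and take the 75th and 25th percentiles. That interpretation is my assumption; the article only says "7th or 8th best" and "7th or 8th worst".

```python
from statistics import NormalDist

# League-wide 5-on-5 shooting talent as quoted from JLikens: mean 8.1%, SD 0.48%.
talent = NormalDist(mu=8.1, sigma=0.48)

# Assumption: "7th or 8th best of 30" is read as the 75th percentile,
# and "7th or 8th worst" as the 25th percentile.
good_team = talent.inv_cdf(0.75)
bad_team = talent.inv_cdf(0.25)

print(round(good_team, 2), round(bad_team, 2))
```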

Let's see how things shake out. Below is a chart giving the results of 10,000 simulations for various numbers of shots where team A has a true shooting percentage of 8.42% and team B shoots at 7.78%. The first two columns tell you the given time period and number of shots for each team. The next three columns tell you how often the team good at shooting outshot the bad (column 3), the bad team outshot the good (4) and how often they had an equal shooting percentage (5). The last two columns give what percent of the time someone looking at the data, and not knowing the underlying percentages, would get statistical significance at the 5% level. Notice that in the last column, the statistical test would reveal that B is significantly better at shooting than A despite their shooting skill actually being over half a percentage point worse.

Time period    Number of Shots    A scores more    B scores more    Goals scored equal    A > B Significant    B > A Significant
One Period     8                  31.9%            28.7%            39.4%                 1.3%                 1.1%
One Game       24                 42.6%            36%              21.4%                 6%                   4.5%
1/4 Season     500                62.3%            33.2%            4.5%                  10.5%                2.2%
1/2 Season     1,000              68.6%            28.5%            2.8%                  13.4%                1.4%
1 Season       2,000              76.4%            21.7%            1.9%                  17.9%                0.8%
2 Seasons      4,000              85%              14%              1%                    27.8%                0.2%
3 Seasons      6,000              90.1%            9.3%             0.6%                  36.1%                0.1%
4 Seasons      8,000              93.6%            6%               0.4%                  43.7%                0.1%
5 Seasons      10,000             95.6%            4.1%             0.3%                  50.3%                0%
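The first three result columns of the chart can be approximated with a short simulation. This is a sketch under the stated setup (both teams take the same number of shots, each shot an independent trial), not the author's code. The two significance columns are left out because the article does not say which statistical test was used, and the simulation count and seed are arbitrary.

```python
import random


def head_to_head(shots, p_a=0.0842, p_b=0.0778, sims=20000, seed=7):
    """Share of simulations where A scores more, B scores more, or they tie."""
    rng = random.Random(seed)
    a_more = b_more = ties = 0
    for _ in range(sims):
        # Each team takes the same number of shots; each shot is a Bernoulli trial.
        goals_a = sum(rng.random() < p_a for _ in range(shots))
        goals_b = sum(rng.random() < p_b for _ in range(shots))
        if goals_a > goals_b:
            a_more += 1
        elif goals_b > goals_a:
            b_more += 1
        else:
            ties += 1
    return a_more / sims, b_more / sims, ties / sims


a_share, b_share, tie_share = head_to_head(shots=24)  # one game
print(f"A more: {a_share:.1%}, B more: {b_share:.1%}, tied: {tie_share:.1%}")
```

Larger shot counts work the same way but take proportionally longer in pure Python, so it may be worth lowering `sims` for the multi-season rows.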

You probably didn't find the results surprising for that first row, representing a period of play. The most common outcome, happening about 40% of the time, is that the two teams remain tied, most often at 0. The team that shoots better due to getting higher-quality shots, hitting the corners better and so on is only slightly more likely to be the one that is ahead if you know that one of them is. Less than 32% of the time will the better team find themselves ahead after a period in which both get the league average 8 shots, whereas they'll be behind almost 29% of the time.

Lower on the chart it gets more troubling, especially for us bloggers. The most common sample point for analysis is half a season. Generally the best way to study the persistence of something is to split the season in half, typically first half vs second half or even-and-odd numbered games, and compare the two samples. This works well because teams should be the same or very similar. If you study something over multiple seasons you aren't getting the same teams every year due to player and coaching changes. In half a season, the team near the top in shooting skill has only about a 2 in 3 chance of outscoring the team near the bottom with the same number of shots. There is also little chance, roughly 13%, of finding that the better team is significantly better at shooting if you were looking at the data. Even over a whole season of shooting data, there is a 1 in 4 chance that the worse team will get better results. It isn't until we get several years worth of shooting results that it tilts heavily in favor of the better shooting team and that's not realistic because teams change so much each offseason and the simulations assumed the same percentage each season.

As you can see, luck plays a huge role for all reasonable sample sizes. This is the fundamental reason why shooting stats are better than goals. Luck is less of a factor for number of shots taken than number of shots made, so they are more reliable indicators of skill over samples of a season or less. If over a season there is a 1 in 4 chance that a good-shooting team is outshot by a bad-shooting team then it's tough to say that a team's results are due to skill and not just random luck.

In a future installment I will look at how persistence is affected by sample size.

On The Strange Results Of The Winnipeg Thrashers

Some people may have forgotten this, but in December of 2010, the Thrashers were primed for a playoff berth. Mainstream journalists sat up and took notice. After their game against the Maple Leafs on December 20th, the Thrashers were 19-11-5, with a point percentage of .614. This was finally the year for them - they'd gotten rid of Ilya Kovalchuk, acquired Dustin Byfuglien, and the team was better off. We know what happened next; they went 13-25-7 over their remaining 45 games and finished with the 6th worst overall record in hockey. Then they moved to Winnipeg.

But something strange happened along the way - by 'advanced metrics', the team got better, even as it did worse. Here's a look at Atlanta/Winnipeg's first half and second half even-strength Fenwick with the score tied by player, with a minimum of 10 total games. Fenwick % is shots on goal + missed shots on goal by Team X (here, Atlanta) divided by the total number of shots and missed shots taken. A player's Fenwick % is shots on goal + missed shots FOR while he is on the ice divided by total shots on goal + missed shots by both teams. We're only looking at the results while the score is tied because teams change their strategies when ahead or behind, which fouls up the numbers. (All numbers here courtesy of timeonice.com)

Player              GP    1st Half Fenwick    GP    2nd Half Fenwick    Difference
Andrew Ladd         39    0.513               40    0.515               0.002
Dustin Byfuglien    40    0.5                 39    0.558               0.058
Johnny Oduya        37    0.425               40    0.532               0.107
Chris Thorburn      36    0.452               40    0.532               0.08
Anthony Stewart     39    0.479               36    0.456               -0.023
Ron Hainsey         35    0.461               40    0.521               0.06
Bryan Little        33    0.496               41    0.531               0.035
Tobias Enstrom      40    0.477               31    0.526               0.049
Nik Antropov        31    0.46                40    0.514               0.054
Evander Kane        34    0.467               34    0.545               0.078
Zach Bogosian       30    0.456               38    0.514               0.058
Alex Burmistrov     36    0.398               32    0.592               0.194
Eric Boulton        28    0.496               31    0.503               0.007
Rich Peverley       39    0.474               18    0.488               0.014
Brent Sopel         33    0.48                19    0.444               -0.036
Niclas Bergfors     29    0.488               22    0.513               0.025
Tim Stapleton       8     0.469               34    0.543               0.074
Fredrik Modin       23    0.379               10    0.535               0.156
Ben Eager           31    0.375               1     0.333               -0.042
Patrice Cormier     2     0.238               18    0.465               0.227
Freddy Meyer        7     0.344               7     0.5                 0.156

Players with figures for only one half (GP, Fenwick):
Jim Slater          32    0.465
Blake Wheeler       23    0.598
Mark Stuart         22    0.561
Rob Schremp         15    0.592
Radek Dvorak        12    0.608
Ben Maxwell         12    0.516


We see that in the first half, Atlanta was well into the negative - only two players managed to hit 50%. Their goal differential, however, was +3 despite a 46.3% Fenwick percentage. In the second half, the story was reversed - few players were in the red. Yet their goal differential with the score tied was -2 in spite of a .529 Fenwick %. We know that Fenwick % with the score tied is a better predictor of future results than Goal %, so by these measures, Atlanta/Winnipeg could be looking at a resurgence next year.

A nice chart contributed by JaredL shows the relationship between Fenwick % and Goal % as the season progressed:




We see the Fenwick % rising as the Goal % drops. What could cause the Fenwick to jump? I can think of three things that would cause the improvement:

A: Personnel Changes - The Thrashers made a few moves towards the end of the year: they brought in Radek Dvorak, Mark Stuart, and Blake Wheeler while they shipped out Brent Sopel, Niclas Bergfors, and Rich Peverley. Wheeler and Dvorak's 2nd half Fenwick while tied definitely beats Bergfors's and Peverley's.

B: Coaching Adjustments - It was Craig Ramsay's first year coaching the Thrashers, and perhaps the players had not figured out his system until the second half.

C: Player Improvement - Dustin Byfuglien played some defense for the Blackhawks last year, but this was his first year playing defense full-time. Promising youngsters Zach Bogosian, Evander Kane, and Alex Burmistrov had not played very much in the NHL. Burmistrov's jump was especially impressive.

But what of the drop in goals? I can think of two reasons for that:

A: Blind Luck - The Thrashers simply didn't get the bounces. Over such a small sample, chance will always be a factor. No one said that hockey was fair.

B: Changing Strategy - What if the Thrashers were responding to their difficulty in scoring goals by simply firing more pucks at the net? It's possible, but I doubt very much that it would result in such a wild change in Fenwick.

Still, this change in goal differential involving score tied Fenwick is one thing, but you don't get to a 14-19-6 second half record without other things going wrong, and it seems like just about everything else did. Here's a look at their Special Teams split into first and second halves:

Special Teams    Power Play    Penalty Kill
First Half       20.9%         80.9%
Second Half      14.0%         74.3%


And here's a graph showing Fenwick shooting percentage, both for and against, for the season:


We can see, again, that the opponent's shooting percentage improves while Atlanta's gets worse.

So who are the Winnipeg Jets going to be next season? It's difficult to say. They moved to a different city and switched coaches, but the personnel are going to remain pretty much intact. The team is still in the Eastern Conference despite moving to Winnipeg, which will lead to increased travel. They've yet to sign Zach Bogosian. Frankly, I don't know. For our upcoming series on Driving Play predicting the 2011-12 season, I inexplicably ranked them as #15 in the Conference - last overall. I doubt they'll make it there, but in spite of their second half Fenwick, I still think it will be a long winter in Winterpeg.

Wednesday, June 29, 2011

Why Shooting Stats Are Better Than Goals

Let's say you are asked to rank the NHL teams halfway through the season. Which stats should you use to do this?

Before getting to that, we need to think carefully about what a ranking means. The best team in the NHL is the one that is the best at winning games, and so on down the line. This comes down to two things - scoring goals and preventing the opposition from doing the same. If someone says X "is the best team in the league", what they mean is that X is the best at outscoring their opponents. Similarly, if Y is the best player in the league that means that he is the best at the combination of generating goals for his team and preventing them for the other.

Success at scoring and preventing goals in hockey, like every activity, is a combination of skill and luck. For some things, e.g. roulette, luck is the dominant factor. In others, like sprinting 100m, skill overwhelmingly wins the day. Hockey falls somewhere in the middle, perhaps closer to roulette than anyone would care to admit. Getting back to ranking the teams, that means figuring out which are the strongest at the skill part. Note that I'm using skill loosely here to refer to any skills that help a team score goals and prevent them, including those like grit and mental toughness that pundits love to talk about.

There are a few ways to tease out this skill component, all of which I will use in various articles in the future. Here I will compare stats from each team in two different groups of games - each half of the season, numbered even vs. odd, etc. The idea behind this is that luck in the first half of the season and luck in the second half of the season should be completely unrelated. Sometimes your team will get lucky in the first half and unlucky in the second half of the season, but the opposite is just as likely. Think of it like two coin tosses. If you win the first coin toss then you are no more likely to win the second than if you'd lost it. In contrast with the luck factor, your team should usually be about as skilled in the second half of the season as the first. If there is no relationship, known as correlation, between luck in the first half of the season and the second any link will be due to skill.
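The logic of the split-half comparison can be illustrated with a toy simulation. Everything here is made up for illustration: each simulated team gets a fixed skill value plus independent luck in each half, and the number of teams is deliberately far larger than the NHL so the correlations are stable. No real data is involved.

```python
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def split_half_correlation(skill_sd, luck_sd, teams=5000, seed=3):
    """Correlation between first-half and second-half results across toy teams."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(teams):
        skill = rng.gauss(0.0, skill_sd)                # same in both halves
        first.append(skill + rng.gauss(0.0, luck_sd))   # independent luck, half 1
        second.append(skill + rng.gauss(0.0, luck_sd))  # independent luck, half 2
    return pearson_r(first, second)


all_luck = split_half_correlation(skill_sd=0.0, luck_sd=1.0)
mixed = split_half_correlation(skill_sd=1.0, luck_sd=1.0)
all_skill = split_half_correlation(skill_sd=1.0, luck_sd=0.0)
print(round(all_luck, 2), round(mixed, 2), round(all_skill, 2))
```

When results are all luck the two halves are essentially uncorrelated, and the correlation climbs toward 1 as skill accounts for more of the spread, which is why a weak split-half correlation points to a stat dominated by luck.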

Mostly due to the availability of data, I restrict attention to 5-on-5 situations where both goalies are on the ice. For each of the past four years, I split the season in half and look at how goals and shooting stats in the first half relate to goals in the second half. Because we care about both scoring and allowing goals, I expressed this as a percentage: goals for divided by the sum of goals for and against (GF/(GF+GA)). The same goes for shooting stats.

Here is a graph of the relationship between goal percentage in the first half of the season and the second. All data are from timeonice. See links on the right.


It looks rather weak. The numbers back that up - the correlation is just 0.13. This is not statistically significant. Even ignoring that, it's pretty clear that putting up good scoring numbers 5-on-5 with the goalies in net in the first half of the season doesn't mean much in the way of predicting performance in the second half.

The relationship between Corsi percentage in the first half of the season and goal percentage in the second half is far stronger. Corsi percentage is like goal percentage, but for all types of shots, including missed shots and blocked shots. Here is the scatterplot:


You can see a distinguishable up-and-right pattern, which indicates a stronger relationship between the two. The correlation is 0.36, which is statistically significant. Keep in mind that we're looking at how shooting ratios in the first half relate to goals in the second half.
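
For readers who want to check the significance claims, here is a rough sketch of the usual test for a correlation coefficient. It assumes 120 team-seasons (30 teams over four seasons), which is my guess at the sample size rather than a figure stated above, and it uses a normal approximation instead of the exact t distribution.

```python
import math

def correlation_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r from n pairs.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2), then a normal approximation
    to the t distribution, which is reasonable for n above 100.
    """
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return math.erfc(abs(t) / math.sqrt(2))

N_TEAM_SEASONS = 120  # assumption: 30 teams x 4 seasons

p_goals = correlation_p_value(0.13, N_TEAM_SEASONS)
p_corsi = correlation_p_value(0.36, N_TEAM_SEASONS)
print(f"first-half goal%  vs. second-half goal%: p = {p_goals:.3f}")
print(f"first-half Corsi% vs. second-half goal%: p = {p_corsi:.5f}")
```

Under those assumptions, the 0.13 correlation lands well above the usual 0.05 cutoff and the 0.36 correlation lands far below it.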

Let's look at the best and worst teams in the first half of this last season. The New Jersey Devils were an impressively bad 10-29-2 on January 8th, with an overall goal differential of -58 (72-130). At 5-on-5 with goalies in, their goal differential was -48 (45-93) and their goal percentage 32.6%. That is the worst goal percentage in either half for any team in any of the four seasons of data available at timeonice. In contrast, the Flyers looked like world beaters halfway through. Their record was 26-10-5, their goal differential +30 (137-107) and their 5-on-5 goal% a cool 60%. What happened in the second half? The Devils put up one of the best turnarounds in NHL history, nearly making the playoffs, and the Flyers' record was mediocre. The Devils went 28-10-3, the Flyers 21-13-7. The Devils had an overall goal differential of +23 (102-79), the Flyers +6 (122-116). At 5-on-5 with goalies in, New Jersey had a goal differential of +23 (76-53), or 58.9%, and Philly 0 (81-81), or 50%.
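
The goal percentages quoted above follow directly from GF/(GF+GA). A small sketch, using only the 5-on-5 goal totals given in the paragraph (the function name is mine):

```python
def goal_pct(goals_for, goals_against):
    """Share of all goals in a team's games that the team scored, in percent."""
    return 100 * goals_for / (goals_for + goals_against)

# 5-on-5 goals for and against, taken from the paragraph above.
halves = {
    "Devils, first half": (45, 93),
    "Devils, second half": (76, 53),
    "Flyers, second half": (81, 81),
}

for label, (gf, ga) in halves.items():
    print(f"{label}: {goal_pct(gf, ga):.1f}%")
```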

How could the worst team in the league in the first half have a better second half than the best team by such a large margin? The answer comes down to the luck factor I discussed above. In the first half, New Jersey took 52.6% of all the 5-on-5 Corsi shots in their games. Philadelphia was actually worse, just better than even at 50.6%. Despite that, the Devils got hugely outscored and the Flyers got far more goals than their opponents. While skill may be a factor in shots going in and being saved by your own goalie, the topic of my next article, luck plays a massive role in scoring over just a half season. The Devils were clearly not getting the bounces and the Flyers were. In the second half of the season, Philadelphia's luck was about average and New Jersey actually caught the breaks.

You can see how much better Corsi stats handle luck by looking at the two teams in the graphs above. New Jersey is the red point and Philadelphia the orange one. The Devils are a huge outlier when you compare goals in the first and second halves, but much less so when you compare Corsi in the first half with goals in the second, though you can still see that they were fortunate. The goals graph is so scattered that the Flyers don't stand out much, but you can see how much they dropped off by how far they sit from the top of the graph. On the Corsi graph they are right in the middle, so from that perspective their second-half performance should have been expected rather than surprising.

Other articles might stop there, but things get more interesting if you run a regression. Regression analysis is a tool I will use pretty frequently. It allows you to separate out different effects. In our case, we want to know how important goals in the first half are once you take Corsi into account, and vice-versa. The regression makes it very clear that Corsi% is a far, far better predictor of goal% in the second half than first-half goal%. Not only that, it appears that virtually all of the tiny amount of explanatory power you get from goal% comes from the fact that goals are a type of shot.
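
To show what "separating out different effects" means mechanically, here is a sketch of a two-variable regression on simulated data. None of this is the real data set: the skill spread and noise levels are invented, and they are deliberately chosen so that Corsi% is a cleaner read on skill than goal%. It illustrates how the method behaves, not what the NHL numbers are.

```python
import random

def fit_two_predictors(x1, x2, y):
    """Ordinary least squares for y = a + b1*x1 + b2*x2; returns (b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the 2x2 normal equations directly.
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(1)
n_teams = 5000  # hypothetical sample, much larger than the real one

# Invented model: one underlying skill per team, in percentage points.
skill = [rng.gauss(0, 2) for _ in range(n_teams)]
corsi_first = [50 + s + rng.gauss(0, 1) for s in skill]   # little luck
goal_first = [50 + s + rng.gauss(0, 6) for s in skill]    # lots of luck
goal_second = [50 + s + rng.gauss(0, 6) for s in skill]   # lots of luck

b_goal, b_corsi = fit_two_predictors(goal_first, corsi_first, goal_second)
print(f"goal% coefficient:  {b_goal:.3f}")
print(f"Corsi% coefficient: {b_corsi:.3f}")
```

In this toy setup, first-half goal% has some predictive value on its own, but once Corsi% is in the regression its coefficient shrinks toward zero, because almost everything it knows about skill is already carried by Corsi%.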

When the regression spits out a formula, the size of each coefficient tells you how big that variable's effect is. When both first-half goal% and Corsi% are included, the goal% coefficient is a minuscule 0.007. For the stats nerds, the standard error is 0.087, so the p-value is an astonishing 0.936. This is about as statistically insignificant as it gets. For comparison, the coefficient for Corsi% is 0.550 (SE of 0.142, p < 0.001), which is very strongly significant. If you have a team that breaks even on goals in the first half of the season but outshoots its opponents 60-40 by Corsi, it will average about 83.3 goals scored and 66.7 allowed in the second half of the season (assuming 150 total 5-on-5 goals, which is close to the league average). If instead you have a team that was even on shots but won the goal battle 60-40, it will average 75.2 goals scored in the second half and concede 74.8.
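
Here is roughly how those second-half projections can be worked out from the coefficients. This is a sketch under two assumptions of mine: that a team at 50% in both first-half stats is predicted to finish at 50%, and that the rounded coefficients quoted above are close enough, so the output can differ from the quoted figures in the last decimal.

```python
GOAL_COEF = 0.007   # coefficient on first-half goal%, from the text
CORSI_COEF = 0.550  # coefficient on first-half Corsi%, from the text
TOTAL_GOALS = 150   # total 5-on-5 goals in a half season, from the text

def predict_second_half(goal_pct_first, corsi_pct_first):
    """Predicted (goals for, goals against) in the second half.

    Assumption: the fitted line passes through 50% when both inputs are 50%.
    """
    pct = (50
           + GOAL_COEF * (goal_pct_first - 50)
           + CORSI_COEF * (corsi_pct_first - 50))
    goals_for = TOTAL_GOALS * pct / 100
    return goals_for, TOTAL_GOALS - goals_for

# Even on goals, 60-40 on Corsi.
corsi_team = predict_second_half(50, 60)
# Even on Corsi, 60-40 on goals.
goals_team = predict_second_half(60, 50)

print(f"Corsi-dominant team: {corsi_team[0]:.2f} for, {corsi_team[1]:.2f} against")
print(f"Goal-dominant team:  {goals_team[0]:.2f} for, {goals_team[1]:.2f} against")
```

The ten-point Corsi edge is worth about 5.5 points of second-half goal%, while the same ten-point edge in goals is worth less than a tenth of a point.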

Once Corsi is taken into account, goals do not at all predict future success.


Topics left for future articles:
- What about score effects?
- What about Fenwick?
- What about special teams?
- Is shooting all luck, then?