## Wednesday, 25 February 2015

### New powerplay metrics applied to 2014-15 NHL teams

In this previous post I made a case for a new powerplay metric called GDA-PK (Goal-Difference Average -  Penalty Kill) that used the amount of time in a powerplay state rather than the number of powerplay instances as its basis. My argument was essentially that not every powerplay is the same: some are majors, some are interrupted by goals or additional penalties, and others are confounded by pulling the goalie and by shorthanded goals.

My aim was to present a new metric that would measure the effectiveness of special lineups in a way that removed a lot of the noise and was easier to interpret in the context of the rest of the game.

GDA-PK represents the average goals gained or lost per hour when killing a penalty. In Table 1 below, for example, the Vancouver Canucks have a GDA-PK this season of -3.76, so they fall behind by 3.76 goals for every 60 minutes spent in a 4-on-5 situation. Put another way, the 'nucks lose a goal for every 60/3.76 ≈ 16.0 minutes of penalty killing.

Likewise, GDA-PP represents the average goals gained or lost per hour of powerplay. Table 1 shows that the Detroit Red Wings manage to gain 8.00 goals for every 60 minutes of powerplay, or get ahead by 1 goal for every 7.5 minutes spent on the powerplay. This isn't the same as gaining 1 goal for every 7.5 minutes of penalty in favour of the Red Wings, because a lot of those penalties will be cut short by conversions.

Also note that it's 'Goal Difference', not goals scored/allowed. This way, both measures account for shorthanded goals by treating them as negative powerplay goals.
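As a minimal sketch of the arithmetic described above (written in Python for illustration, not taken from my actual script), GDA is just net goals scaled to 60 minutes of time spent in a state. The function names and the goal and minute totals in the first example are hypothetical; only Detroit's 8.00 comes from Table 1.

```python
def gda(goals_for: int, goals_against: int, minutes_in_state: float) -> float:
    """Goal-difference average: net goals per 60 minutes spent in a state."""
    if minutes_in_state <= 0:
        raise ValueError("minutes_in_state must be positive")
    return 60.0 * (goals_for - goals_against) / minutes_in_state


def minutes_per_net_goal(gda_value: float) -> float:
    """Minutes of play in the state needed to gain (or lose) one net goal."""
    if gda_value == 0:
        return float("inf")
    return 60.0 / abs(gda_value)


# Hypothetical totals: 44 powerplay goals for, 4 shorthanded goals against,
# over 300 minutes of powerplay time.
print(gda(44, 4, 300))             # 8.0

# Detroit's GDA-PP of 8.00 from Table 1.
print(minutes_per_net_goal(8.00))  # 7.5
```

Shorthanded goals enter through `goals_against`, which is how they end up counting as negative powerplay goals.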

I wanted to see if there would be any major divergences between the GDA measures and the traditional PP% and PK% (powerplay percentage and penalty kill percentage) measures in terms of ranking the teams. Ideally, both sets of measures would agree in their rankings because they are intended to measure the same thing.

### Results

Table 1: Powerplay and penalty kill goal differential per hour (goals for minus goals against) and related statistics for each team, covering the 901 of 1230 regular season games played up to 2015-02-24, inclusive.

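| Team | GDA-PK | Penalty Kill Pct | GDA-PP | Powerplay Pct |
|------|--------|------------------|--------|---------------|
| ANA | -3.44 | 0.814 | 4.35 | 0.177 |
| ARI | -7.77 | 0.779 | 5.97 | 0.212 |
| BOS | -5.62 | 0.824 | 5.00 | 0.174 |
| BUF | -6.88 | 0.754 | 1.59 | 0.118 |
| CAR | -3.26 | 0.880 | 5.66 | 0.185 |
| CBJ | -5.44 | 0.808 | 6.32 | 0.207 |
| CGY | -4.64 | 0.803 | 5.96 | 0.175 |
| CHI | -3.96 | 0.862 | 6.14 | 0.187 |
| COL | -4.49 | 0.832 | 3.56 | 0.133 |
| DAL | -6.18 | 0.793 | 5.43 | 0.180 |
| DET | -5.33 | 0.832 | 8.00 | 0.252 |
| EDM | -5.67 | 0.780 | 4.41 | 0.152 |
| FLA | -6.66 | 0.790 | 4.04 | 0.148 |
| L.A | -5.35 | 0.798 | 5.17 | 0.186 |
| MIN | -3.63 | 0.860 | 3.44 | 0.159 |
| MTL | -3.88 | 0.856 | 4.90 | 0.170 |
| N.J | -6.35 | 0.798 | 5.20 | 0.195 |
| NSH | -5.24 | 0.820 | 5.80 | 0.173 |
| NYI | -6.69 | 0.746 | 5.88 | 0.185 |
| NYR | -3.33 | 0.830 | 4.89 | 0.182 |
| OTT | -4.31 | 0.825 | 4.65 | 0.172 |
| PHI | -8.49 | 0.760 | 6.76 | 0.233 |
| PIT | -3.56 | 0.856 | 4.81 | 0.205 |
| S.J | -5.26 | 0.804 | 5.90 | 0.205 |
| STL | -6.67 | 0.804 | 6.67 | 0.233 |
| T.B | -4.83 | 0.833 | 4.11 | 0.171 |
| TOR | -4.52 | 0.821 | 4.52 | 0.186 |
| VAN | -3.76 | 0.859 | 4.83 | 0.184 |
| WPG | -4.75 | 0.807 | 5.05 | 0.188 |
| WSH | -6.17 | 0.808 | 7.76 | 0.237 |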

League-wide GDA-PK: -5.163
League-wide GDA-PP: +5.163

Correlations between proposed and classic measures:
PP Pearson r = 0.892, Spearman's rho = 0.810
PK Pearson r = 0.833, Spearman's rho = 0.841

The league-wide GDA-PP and the GDA-PK balance out by definition.

Notable performances are highlighted in Table 1. The Carolina Hurricanes seem to get a lot more mileage out of their powerplays than their opponents (5.66 goals/hr gained vs. 3.26 goals/hr lost). Philadelphia and Washington are very exciting to watch during penalties in both directions.

After roughly 60 games, each team has had 5-6 hours of powerplay time and 5-6 hours of penalty kill time. As such, there's still a lot of uncertainty in the GDA-PK and GDA-PP of individual teams. Each team value should be read as having a 'plus-or-minus 2 goals/hr' next to it. For example, Boston's GDA-PP of 5.00 could really mean they've been gaining 3 goals/hr on the powerplay and getting some lucky bounces, or that they should be gaining 7 goals/hr but can't catch a break this season.
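One rough way to see where a margin of about 2 goals/hr could come from is to treat goals for and against as independent Poisson counts. That is an assumption made here for illustration, not necessarily how the figure above was derived, and the goal counts in the example are hypothetical values chosen to resemble a typical team.

```python
from math import sqrt


def gda_margin(goals_for: int, goals_against: int, hours: float,
               z: float = 1.96) -> float:
    """Approximate half-width of an interval around a GDA value, in goals/hr.

    Assumes goals for and goals against are independent Poisson counts,
    so the variance of their difference is the sum of the two counts.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    return z * sqrt(goals_for + goals_against) / hours


# Hypothetical team: 30 powerplay goals, 3 shorthanded goals against,
# over 5.5 hours of powerplay time.
print(round(gda_margin(30, 3, 5.5), 1))  # 2.0
```

Because the margin shrinks with the square root of the goal count, the much larger even-strength sample gives a far tighter interval than the special-teams samples.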

Further inference from these values is possible. We could make reasonable GDA-PK and GDA-PP estimates for specific matchups. For example, Pittsburgh's excellent penalty killing and Washington's excellent powerplay skills should cancel each other out. We would expect the Washington Capitals to gain (WSH GDA-PP + (-PIT GDA-PK) - league-wide GDA-PP) = (7.76 + 3.56 - 5.163) = 6.16 goals/hr on the Penguins with a 1-man advantage.
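The matchup arithmetic above can be written as a small helper. This is only a sketch of the additive adjustment described in the text; the function name is made up for illustration, and the inputs are the Table 1 values.

```python
LEAGUE_GDA_PP = 5.163  # league-wide GDA-PP from Table 1


def matchup_gda_pp(attacker_gda_pp: float, defender_gda_pk: float,
                   league_gda_pp: float = LEAGUE_GDA_PP) -> float:
    """Expected goals/hr gained by the attacker on the powerplay.

    defender_gda_pk is negative in Table 1 (goals lost per hour), so it is
    negated before being combined with the attacker's rate.
    """
    return attacker_gda_pp + (-defender_gda_pk) - league_gda_pp


# Washington's powerplay (7.76) against Pittsburgh's penalty kill (-3.56).
print(round(matchup_gda_pp(7.76, -3.56), 2))  # 6.16
```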

We could also find the average length of a minor or major penalty, so we can use this to estimate how many goals a given penalty is worth, either league wide or for a given team. We can also find the success rate of penalty shots, assuming we can use the shootout to increase our sample size, so we could also find out how many penalty minutes a penalty shot is worth.
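A sketch of how those conversions might look, assuming goals accrue at a constant rate while the powerplay lasts. The average penalty length and the penalty shot success rate below are hypothetical placeholders, not measured values; only the league-wide GDA-PP comes from Table 1.

```python
def goals_per_penalty(gda_pp: float, avg_powerplay_minutes: float) -> float:
    """Expected net goals from one penalty, given the powerplay time it yields."""
    return gda_pp * avg_powerplay_minutes / 60.0


def penalty_shot_in_minutes(success_rate: float, gda_pp: float) -> float:
    """Minutes of powerplay worth the same expected goals as one penalty shot."""
    if gda_pp <= 0:
        raise ValueError("gda_pp must be positive")
    return 60.0 * success_rate / gda_pp


league_gda_pp = 5.163          # from Table 1
avg_minor_minutes = 1.7        # hypothetical average time actually served
penalty_shot_success = 0.33    # hypothetical success rate

print(goals_per_penalty(league_gda_pp, avg_minor_minutes))
print(penalty_shot_in_minutes(penalty_shot_success, league_gda_pp))
```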

The correlation measures Pearson's r and Spearman's rho are positive and far from zero (they are on a scale from -1 to 1). These correlations indicate a strong agreement between the GDA measures and the traditional measures; a team that puts up good PP% / PK% numbers will put up comparably good GDA-PP/PK numbers.
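For readers who want to reproduce this kind of comparison, here is a small self-contained sketch of both correlation measures. It is written in plain Python for illustration and is not my actual script; the example uses only the first five rows of Table 1, so its output will not match the full-league figures.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r: linear correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# GDA-PP and powerplay percentage for ANA, ARI, BOS, BUF, CAR (Table 1).
gda_pp = [4.35, 5.97, 5.00, 1.59, 5.66]
pp_pct = [0.177, 0.212, 0.174, 0.118, 0.185]
print(pearson(gda_pp, pp_pct), spearman(gda_pp, pp_pct))
```

Spearman's rho only looks at the ordering of the teams, which is why it is the more direct answer to the question of whether the two measures rank teams the same way.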

Furthermore, we can use a similar metric to isolate even-strength situations and see how well a team fares when there are no penalties involved. In Table 2, GDA-EV (Goal Difference Average - EVen Strength) refers to the number of goals a team gains or loses on average per 60 minutes of even strength play. A positive number represents a team that outscores their opponents when there are no active penalties (or only offsetting ones), and a negative number represents a team that falls behind at even strength.

Table 2: Even-strength goal differential per hour (goals for minus goals against) and related statistics for each team, covering the 901 of 1230 regular season games played up to 2015-02-24, inclusive.

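| Team | GP | Goal Diff | GD/G | GDA-EV |
|------|----|-----------|------|--------|
| ANA | 61 | 8 | 0.13 | 0.06 |
| ARI | 61 | -71 | -1.16 | -1.19 |
| BOS | 60 | 4 | 0.07 | 0.08 |
| BUF | 61 | -94 | -1.54 | -1.09 |
| CAR | 59 | -24 | -0.41 | -0.67 |
| CBJ | 59 | -31 | -0.53 | -0.66 |
| CGY | 60 | 12 | 0.20 | -0.08 |
| CHI | 61 | 29 | 0.48 | 0.28 |
| COL | 61 | -17 | -0.28 | -0.08 |
| DAL | 59 | -11 | -0.19 | 0.00 |
| DET | 61 | 25 | 0.41 | 0.17 |
| EDM | 59 | -65 | -1.10 | -1.05 |
| FLA | 62 | -21 | -0.34 | 0.00 |
| L.A | 60 | 16 | 0.27 | 0.48 |
| MIN | 59 | 11 | 0.19 | 0.16 |
| MTL | 60 | 25 | 0.42 | 0.47 |
| N.J | 60 | -20 | -0.33 | -0.04 |
| NSH | 60 | 42 | 0.70 | 0.86 |
| NYI | 61 | 22 | 0.36 | 0.37 |
| NYR | 62 | 43 | 0.69 | 0.58 |
| OTT | 59 | 4 | 0.07 | -0.11 |
| PHI | 61 | -12 | -0.20 | -0.10 |
| PIT | 60 | 25 | 0.42 | 0.37 |
| S.J | 61 | 0 | 0.00 | -0.22 |
| STL | 60 | 32 | 0.53 | 0.54 |
| T.B | 62 | 38 | 0.61 | 0.61 |
| TOR | 60 | -16 | -0.27 | -0.39 |
| VAN | 60 | 13 | 0.22 | 0.10 |
| WPG | 62 | 3 | 0.05 | 0.23 |
| WSH | 61 | 30 | 0.49 | 0.34 |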

League-wide GDA-EV 0.000

Correlation between goal difference per game, and GDA-EV:
Pearson's r = 0.941 , Spearman's rho = 0.916

The league-wide GDA-EV is exactly 0, as expected. There is a very strong linear relationship between GDA-EV and simple goal differential, which may indicate that powerplay performance isn't as important as it's made out to be. Looking at all goals, the Buffalo Sabres seem to be uniquely awful, but when you remove the effect of penalties, they're merely as bad as Edmonton and Arizona.

Another note is that even though Montreal is currently at the top of the Eastern Conference standings, they're only putting up about half a goal per hour more than their opponents at 5-on-5 or 4-on-4. They're on par with the L.A. Kings, who are currently on the playoff bubble.

Since teams spend the large majority of their time at even strength, GDA-EV has a lot of data to draw from. You should consider each team's GDA-EV value to carry a margin of plus-or-minus 0.5 goals/hr.

This is still very much a work in progress, and these measurements should be taken as preliminary only. There are likely still some bugs to work out that have gone unseen.

In this previous post, I gave a short demo of the nhlscrapr package, which allows anybody to download a play-by-play table of every hit, shot, faceoff, line change, penalty, and goal recorded. The data used to make Tables 1 and 2 come from nhlscrapr. Select details are found in the methodology below.

### Methodology

I've written an R script to count the time spent in each powerplay state and the number of events (in this case, goals) that occur in each state, for and against each team. A cautionary note to those trying to do similar things: the numbers in the home.skaters and away.skaters variables are unreliable, and you're better off tracking manpower yourself if it's critical. The other variables appeared to be pretty reliable, such as the home and away goalie IDs, which made identifying empty-net times possible.
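To give a flavour of the counting step without sharing the script itself, here is a heavily simplified Python sketch. It is not the R code and does not use nhlscrapr's actual column layout: the segment and goal tuples are a hypothetical format that assumes the manpower on the ice has already been tracked by hand.

```python
from collections import defaultdict


def tally(segments, goals):
    """Tally seconds and net goals per state, from the home team's view.

    segments: (start_s, end_s, home_skaters, away_skaters), non-overlapping.
    goals:    (time_s, side) where side is "home" or "away".
    Returns {state: [seconds, home_goal_difference]}.
    """
    totals = defaultdict(lambda: [0, 0])
    for start, end, home, away in segments:
        state = "PP" if home > away else "PK" if home < away else "EV"
        totals[state][0] += end - start
        for t, side in goals:
            # A goal at the instant a segment ends belongs to that segment:
            # a powerplay goal is scored before the penalty is wiped out.
            if start < t <= end:
                totals[state][1] += 1 if side == "home" else -1
    return dict(totals)


# Hypothetical game: a home powerplay from 10:00 that ends on a goal at 11:30.
segments = [(0, 600, 5, 5), (600, 690, 5, 4), (690, 3600, 5, 5)]
goals = [(690, "home"), (2000, "away")]
print(tally(segments, goals))
# {'EV': [3510, -1], 'PP': [90, 1]}
```

Empty-net time would need to be removed from the segments, and the goals scored during it dropped, before a tally like this is run.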

I apologize, but I won't be sharing the script used to get this data until I've considered (and possibly taken advantage of) the academic publication potential of the results in the tables above. Also, the script is a modification of one used for a research paper that Dr. Paramjit Gill and I currently have in the submission stage, and I don't want to complicate that process.

Overtime and two-man advantage situations were ignored: neither the goals nor the time spent in them are included here. For what it's worth, the league-wide goal differential with a two-man advantage seems to be in the 12-16 goals-per-hour range, but there is nowhere near enough data to say anything about individual teams in these situations.

The league-wide measures are not the raw average of the values from each of the 30 teams, because some teams spend more time shorthanded or on the powerplay than others. Instead, they are computed by treating the whole league as a single team playing against itself. If a team draws more penalties and spends more time on the powerplay, that team's performance will count for more towards the league-wide average than other teams' performances. In practice, however, every team contributes between 3.0% and 3.6% towards each league measure.
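The difference between the pooled league figure and a raw average of team values can be seen in a short sketch. The two teams below are hypothetical and deliberately lopsided to make the gap visible; with real teams each contributing 3.0% to 3.6% of the time, the two numbers would sit much closer together.

```python
def pooled_gda(teams):
    """League-wide GDA: total net goals over total minutes, per 60 minutes.

    teams: list of (net_goals, minutes_in_state) pairs, one per team.
    """
    total_goals = sum(goals for goals, _ in teams)
    total_minutes = sum(minutes for _, minutes in teams)
    return 60.0 * total_goals / total_minutes


def raw_mean_gda(teams):
    """Unweighted average of each team's own GDA, shown for comparison."""
    return sum(60.0 * goals / minutes for goals, minutes in teams) / len(teams)


# Hypothetical teams: (net powerplay goals, powerplay minutes).
teams = [(40, 300), (20, 360)]
print(pooled_gda(teams))    # weights each team by its time on the powerplay
print(raw_mean_gda(teams))  # treats every team equally
```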

Any goals scored while a goalie was on the bench for an extra skater (e.g. during a delayed penalty call or when a team is losing at the very end of the game) were ignored. This includes goals on empty nets, goals by teams with empty nets, and that time the Flames did both. The time during which one or both goalies were off the ice was deducted from the appropriate situation. There was a mean of about 50 seconds of empty-net time per game, with a lot of games having none at all.

Finally, only regulation play (the first 60 minutes of each game) is considered. This is to filter out any confounding issues such as teams scoring less often in overtime than they do in regulation.