Conflicts Over Corsi: Hockey Circles Debate Advanced Statistical Analysis’ Usefulness and Prevalence

While you can’t find two sports more disparate than baseball and ice hockey, they have at least one thing in common: debate over advanced statistical analysis that goes beyond the basic box score. Baseball has seen this rise more markedly over the past three decades, starting with the work of Bill James and expanding to a variety of models and measures such as Baseball Prospectus’ Player Empirical Comparison and Optimization Test Algorithm (PECOTA) ratings, a greater emphasis on a team’s or player’s On Base Percentage (OBP), and Voros McCracken’s Batting Average on Balls in Play (BABIP) statistic. This led to a trend of Major League Baseball teams, most notably the Oakland Athletics and the Boston Red Sox, hiring higher-level statistical analysts to augment their player personnel decisions.

National Hockey League front offices are not at that point yet for a variety of reasons, but that doesn’t mean the day isn’t coming, especially in a salary cap era where your currency needs to have maximum value for whatever player you are investing in.

One of the advanced statistical models used in a number of hockey circles is Corsi. The model is named after former Edmonton Oilers goaltender and longtime Buffalo Sabres goalie coach Jim Corsi, who came up with the initial version. It was originally intended as an indicator of a goaltender’s workload and a means by which Corsi could create an effective training regimen.
The key to understanding Corsi numbers is that they are not just about shots on net, but about every puck directed at the net. Since then, Corsi’s original concepts have been expanded upon by Gabriel Desjardins of the noted hockey stat website Behind The At its basic core, Corsi is a type of plus-minus statistic in which shots on net, missed shots, and blocked shots are all part of the equation.

Corsi can be applied to both entire teams and individual players. To put it succinctly, if the New York Rangers direct 30 shots at the net and the Philadelphia Flyers direct 22 at even strength, the Rangers have a Corsi rating of +8. If Rangers forward Brad Richards is on the ice for 12 Rangers shots and 17 Flyers shots, then Richards’ individual Corsi rating would be -5.
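The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration, not an official formula from any stats site: the `ShotAttempts` class and the `corsi` function are names made up for this sketch, and because the example gives a single number per side, the whole total is entered as shots on goal with missed and blocked shots left at zero.

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    """Hypothetical container for the three kinds of attempts Corsi counts."""
    on_goal: int = 0
    missed: int = 0
    blocked: int = 0

    def total(self) -> int:
        # Corsi counts every puck directed at the net, not just shots on goal.
        return self.on_goal + self.missed + self.blocked


def corsi(attempts_for: ShotAttempts, attempts_against: ShotAttempts) -> int:
    """Plus-minus style differential: attempts for minus attempts against."""
    return attempts_for.total() - attempts_against.total()


# Team example from the text: Rangers 30, Flyers 22 at even strength.
rangers_team = corsi(ShotAttempts(on_goal=30), ShotAttempts(on_goal=22))

# Player example: 12 for and 17 against while Richards is on the ice.
richards = corsi(ShotAttempts(on_goal=12), ShotAttempts(on_goal=17))

print(f"Rangers: {rangers_team:+d}, Richards: {richards:+d}")
```

With real game data, the missed and blocked fields would be filled in as well, which is what separates Corsi from a plain shots-on-goal differential.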

Now the question is: why do Corsi ratings cause contention in hockey circles? The argument against the ratings is that, unlike baseball, hockey does not lend itself easily to deep statistical analysis, so the numbers might not give you a completely accurate picture of a player.

“Hockey is a team game and while raw Corsi numbers gives you a little bit of information about defensive impact, it can also be deceptive,” said writer Bill Meltzer of and Hockey Buzz. “The reason is that the number of shots attempted isn’t a great measure, in my opinion. Former Flyers defenseman Petr Svoboda likely had mediocre Corsis but he was still a highly effective NHL defenseman.”

Broad Street Hockey and SB Nation blogger Geoff Detweiler agrees that Corsi alone is not very effective in assessing a player’s defensive acumen. However, he points out that the key to using Corsi properly is understanding the context in which you are using it.

“In its raw form, good defensive players will look really bad since they are put into the defensive zone against the best offensive players.”

He adds that many bloggers understand this concept and use various types of Corsi stats to fit a specific context. According to Detweiler, Relative Corsi (CorsiRel) and the Quality of Competition metric based on Relative Corsi (CorsiRelQoC) are the most commonly used.

To clarify: CorsiRel is the difference between the team’s Corsi when a player is on the ice versus when the player is on the bench. CorsiRelQoC is the average CorsiRel of the opponents you skated against, which is weighted by how much time you had on the ice against them.
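Those two definitions translate fairly directly into code. The sketch below is a rough illustration under stated assumptions: the function names `corsi_rel` and `corsi_rel_qoc` are invented here, the input numbers are hypothetical, and the units are whatever the inputs use (raw differentials or per-minute rates), since the definitions above do not specify them.

```python
def corsi_rel(on_ice_corsi, off_ice_corsi):
    """Team Corsi with the player on the ice minus team Corsi with him on the bench."""
    return on_ice_corsi - off_ice_corsi


def corsi_rel_qoc(matchups):
    """Ice-time-weighted average of the opponents' CorsiRel.

    matchups: list of (opponent_corsi_rel, minutes_against) pairs.
    """
    total_minutes = sum(minutes for _, minutes in matchups)
    if total_minutes == 0:
        raise ValueError("no ice time recorded against any opponent")
    weighted_sum = sum(rel * minutes for rel, minutes in matchups)
    return weighted_sum / total_minutes


# Hypothetical player: the team is +6.0 with him on the ice, -2.0 without him.
player_rel = corsi_rel(on_ice_corsi=6.0, off_ice_corsi=-2.0)

# Hypothetical opponents: (their CorsiRel, minutes he spent against them).
matchups = [(10.0, 6.0), (-4.0, 3.0), (2.0, 1.0)]
qoc = corsi_rel_qoc(matchups)

print(player_rel, qoc)
```

The weighting matters: the opponent faced for six minutes pulls the average much harder than the one faced for a single minute.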

In a 2009 interview with America Online’s Adam Gretz, Desjardins pointed out the importance of the Quality of Competition statistic.

“It is a very good metric for determining who faced the toughest competition on a given team, and, in a general sense, it can isolate which players faced the toughest competition overall.”

Detweiler expanded the point further.

“That stated, if you look at zone starts, CorsiRel and CorsiRelQoC will show who faced the tough competition in bad situations and managed to drive play forward. It’s very effective as a proxy for puck possession and a predictor of future scoring.”

And that is where the proponents of Corsi find value in the statistics. If you are projecting whether or not a player makes a positive contribution against better competition, then it offers a window into their true value.

Detweiler stated that many bloggers will also use CorsiTied or CorsiClose (which show performance in tied or one-goal games, respectively) to account for score effects (the reason teams can outshoot their opponents and still lose). For Corsi proponents, this is a critical component that tells something beyond shots for or against, and in some ways, even wins and losses.
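One way to picture CorsiTied and CorsiClose is as the same differential computed only over attempts taken in a given score state. The sketch below assumes a simplified event format, a list of (shooting team, goal margin at the time of the attempt) pairs. That format, the function name `corsi_by_score_state`, and the sample events are all hypothetical, and treating "close" as a margin of one goal or less is a simplifying assumption.

```python
def corsi_by_score_state(events, team, max_margin):
    """Corsi differential for `team`, counting only attempts taken while
    the goal margin was at most `max_margin`.

    events: iterable of (shooting_team, goal_margin) pairs, where
    goal_margin is the absolute score difference when the attempt was taken.
    """
    differential = 0
    for shooter, margin in events:
        if margin > max_margin:
            continue  # skip attempts taken outside the chosen score state
        differential += 1 if shooter == team else -1
    return differential


# Made-up sequence: NYR controls play while the game is tight, then PHI
# piles up attempts once it trails by two or more.
events = [
    ("NYR", 0), ("PHI", 0), ("NYR", 0),
    ("NYR", 1), ("NYR", 1), ("PHI", 1),
    ("PHI", 2), ("PHI", 2), ("PHI", 3),
]

corsi_tied = corsi_by_score_state(events, "NYR", max_margin=0)
corsi_close = corsi_by_score_state(events, "NYR", max_margin=1)
corsi_all = corsi_by_score_state(events, "NYR", max_margin=99)

print(corsi_tied, corsi_close, corsi_all)
```

In this toy sequence the overall number is negative for NYR while the tied and close numbers are positive, which is the kind of distortion the score-state filters are meant to remove.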

David Johnson of Hockey somewhat backs up the point that CorsiTied gives a good indication of a team’s or player’s ability to control play (puck possession). However, he feels that it is not without flaws. The question he asks isn’t whether a team is any better at controlling play, but whether it is better at winning games: the ultimate objective. From a player’s perspective, he argues that it is not a great indicator of the value of a player’s defensive game.

Bruce McCurdy of the Edmonton Journal (cited by Meltzer in his July 22 Hockey Buzz blog) put together a research-based blog post showing how frequently, in the modern NHL, the team that outshoots its opponent winds up losing the game. In the 1,185 games of the 2010-2011 season that had an unequal shot distribution, McCurdy noted that the team that was outshot won 627 times and the outshooting team just 558. Meltzer expanded the point to include the Stanley Cup champion Boston Bruins, who were outshot in 16 of 25 playoff games and went 12-4 in those games. Therefore, the primary argument against Corsi is that missed shots and blocked shots are too problematic to count as a key indicator of production (especially if a team is firing shots that are off target by a couple of feet). In short, combining missed shots with real shots doesn’t work, because it’s not about shot totals. It’s the team that generates the most quality chances that’s more likely to taste victory.
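For reference, the win totals quoted from McCurdy and Meltzer can be turned into winning percentages with some quick arithmetic. The variable names below are just labels for this sketch; the counts themselves are the ones cited above.

```python
# League-wide, 2010-2011 games with an unequal shot distribution.
outshot_team_wins = 627
outshooting_team_wins = 558
games = outshot_team_wins + outshooting_team_wins  # matches the 1,185 cited

outshot_win_pct = outshot_team_wins / games
outshooting_win_pct = outshooting_team_wins / games

# Boston's playoff record in the 16 games in which it was outshot.
bruins_wins, bruins_losses = 12, 4
bruins_outshot_win_pct = bruins_wins / (bruins_wins + bruins_losses)

print(f"Outshot team wins:     {outshot_win_pct:.1%}")
print(f"Outshooting team wins: {outshooting_win_pct:.1%}")
print(f"Bruins when outshot:   {bruins_outshot_win_pct:.1%}")
```

The league-wide gap is about six percentage points in favor of the outshot team, which is the figure the critics lean on.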

Detweiler, in a response on Broad Street Hockey, countered that argument by stating that quality scoring chances correlate very strongly with Corsi. He drew upon a 2009 study by Vic Ferrari of the 2008-2009 Oilers season, which indicated that over 20-game rolling averages, scoring chance percentages and Corsi percentages were practically identical.
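A rolling comparison of that kind could be set up roughly as follows. This is a sketch of the general technique, not a reconstruction of Ferrari’s study: the function name `rolling_share` is invented, the per-game counts are made up purely to show the mechanics, and pooling the counts inside each window (instead of averaging per-game percentages) is an assumption. The example uses a two-game window only to keep the sample data short.

```python
def rolling_share(for_counts, against_counts, window=20):
    """Share of events belonging to the team over each run of `window` games.

    Returns one value per complete window: for / (for + against),
    or None if a window contains no events at all.
    """
    if len(for_counts) != len(against_counts):
        raise ValueError("for and against lists must cover the same games")
    shares = []
    for end in range(window, len(for_counts) + 1):
        total_for = sum(for_counts[end - window:end])
        total_against = sum(against_counts[end - window:end])
        total = total_for + total_against
        shares.append(total_for / total if total else None)
    return shares


# Made-up per-game counts for four games (illustrative only).
corsi_for, corsi_against = [50, 40, 60, 45], [50, 60, 40, 55]
chances_for, chances_against = [10, 9, 12, 9], [10, 12, 8, 11]

corsi_share = rolling_share(corsi_for, corsi_against, window=2)
chance_share = rolling_share(chances_for, chances_against, window=2)

for c, s in zip(corsi_share, chance_share):
    print(f"Corsi {c:.3f}  scoring chances {s:.3f}")
```

With real game logs and the default 20-game window, plotting the two series side by side would show how closely they track each other.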

Needless to say, the point-counterpoint debate could go on and on. The arguments from both sides are very impassioned, and understandably so. The debate really comes down to two points of contention: how useful and relevant is advanced statistical analysis, and how much is it used now and will it be used in the future?

It comes down to how pro hockey teams want to incorporate advanced statistical analysis such as Corsi. Do they blend it with traditional scouting? Resist it altogether? Use it as a cost-saving measure? We know that the Buffalo Sabres use it to some extent, especially with their goaltenders, and not coincidentally, because previous owner Tom Golisano cut back his scouting staff a few years ago.

Is the claim that advanced stats such as Corsi don’t apply to a team game like hockey a gold-plated truth, or is it just resistance to change?
The view here is that teams will always use whatever methods are necessary to get an advantage. Coaches, trainers, and management all use some form of research to advance and enhance their teams’ performance, but how do you quantify that performance with a statistical model if it doesn’t measure the ultimate result: did what you analyzed help you win or lose the game?
That is ultimately where the argument over Corsi and other advanced statistics lies.

Special thanks to Timo Seppa of Hockey Prospectus and Nina G. from Flyers who helped point me in the right direction for research on this topic. You can follow Timo (@timoseppa and @puckprospectus) and Nina (@potnoodlez), as well as Bill (@billmeltzer) and Geoff (@geoffdetweiler) on Twitter.
Follow me on Twitter @AnthonyMingioni

*photo by West