Statistics et al.: Peer review of "Algorithmically deconstructing shot locations as a method for shot quality in hockey"

This is an open peer review I did for the manuscript "Algorithmically deconstructing shot locations as a method for shot quality in hockey" by Devan G. Becker, Douglas G. Woolford and Charmaine B. Dean, as submitted to the Journal of Quantitative Analysis in Sports (JQAS) in 2020.

You can find the manuscript behind a $42 paywall put up by De Gruyter at https://www.degruyter.com/document/doi/10.1515/jqas-2020-0012/html

I don't usually do academic reviews because they are typically unpaid labour for a corporation, with no benefit to me or my institution. I make rare exceptions for papers that I would want to read and write about regardless, and for the journal Meta-Psychology because I strongly support their mission. This paper was relevant to my work at Sportlogiq, and I’m a fan of the authors.

Note that neither the authors nor the reviewers were compensated by the publisher, De Gruyter, for their labour. Typically, authors will share preprint copies of their manuscript if you contact them, but I can’t speak on behalf of individual authors.

Also, note that the published paper incorporates many of the revisions listed. These comments are published here mostly to show the review process when there are multiple rounds of revisions: a main review and a follow-up.

Conflicts of interest statement

Possible conflict 1: I work for a company (Sportlogiq) that has its own proprietary method for assessing hockey shot quality, and I use that method in my own work. However, I have no personal or professional incentive to either promote or suppress this work.

Possible conflict 2: Charmaine B. Dean was the founding chair of the statistics department at Simon Fraser University, the department in which I did my PhD. However, she left for U of Western Ontario around the same time I started there, so I still consider this to be an arm's-length relationship. Because of my new appointment at U of Waterloo, this is no longer an arm's-length relationship, but it was at the time of the review.

My general approach to reviews is to ask "what work does this need to be ready for publication?", not to ask "is this worthy enough to be in publication?". As such, I rarely give a major mod / reject recommendation to papers. If you have publication limitations, or a target rejection rate, please take that into account. The paper is fine; it ticks all the boxes a paper should. It's not important enough to be rushed to the front of the queue, though.

General assessment

"Algorithmically Deconstructing Shot Locations as a Method for Shot Quality in Hockey" is clear, well-written, and approachable despite using traditionally very hard-to-explain methods like INLA. The tone, scope, and style make this paper a good fit for JQAS, using the most recent editor's choice papers as a comparison. The research itself is a meaningful addition to the assessment of shot quality, specifically the characterization of shots into a small set of profiles, and accounting for the handedness of the shooter.

As I understand the manuscript, non-negative matrix factorization (NMF) is the method used to find these clusters, and a Log-Gaussian Cox Process (LGCP) is used to spread shots and goals across a surface of locations that would otherwise be too sparse and noisy to get useful information.
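To make the NMF step concrete, here is a minimal sketch of the general technique, not the authors' pipeline. It assumes a non-negative matrix with one row per player and one column per rink location, and factors it with the standard multiplicative updates for squared error. The function name `nmf`, the toy data, and the choice of two profiles are all made up for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=1000, seed=0, eps=1e-9):
    """Factor a non-negative matrix V as W @ H with W, H >= 0,
    using multiplicative updates for the squared (Frobenius) error."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps   # per-player weights on each profile
    H = rng.random((rank, n_cols)) + eps   # each row is one shot-location profile
    for _ in range(n_iter):
        # The updates only multiply by non-negative ratios,
        # so W and H stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (hypothetical): 6 players, 8 rink cells, 2 hidden profiles.
rng = np.random.default_rng(1)
true_H = np.array([[5, 4, 3, 0, 0, 0, 1, 0],
                   [0, 0, 1, 4, 5, 3, 0, 1]], dtype=float)
true_W = rng.random((6, 2))
V = true_W @ true_H

W, H = nmf(V, rank=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Each row of `H` plays the role of a shot-location profile, and each row of `W` says how much of each profile a given player uses. In the manuscript, as I understand it, the input to the factorization is the smoothed surface from the LGCP rather than raw counts.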

Regarding reproducibility, the data used is publicly available, and the method used is described well enough that someone could recreate the analysis given the dedication.

Regarding novelty, the work described isn't exactly groundbreaking, but it does contribute to the general understanding of hockey dynamics, and the sports analytics world is better off with this paper being published.

Specifics

I have one major and two minor issues with the manuscript, none of which is a problem with the research itself.

My major issue is that the results section is anemic. In the text, there are only two paragraphs in a single subsection explicitly dedicated to results. The paper is already short with 8 pages of writing outside of the captions and abstract. Five of the six figures have high-quality results, along with good commentary in the caption (the other figure is a diagnostic plot, it's fine too, but it's not results). There's room for a lot more commentary in the main text.

For example: "Figure 3 shows the coefficient estimates for a few selected players." This is a good opportunity to talk more about these notable players, either for their competence, their embodiment of an archetype like "left-handed center", or for their uniqueness. Pick a few players and tell the story that their shooting profiles reveal. Alternatively, what is the shooting percentage in each cluster?

My first minor issue is an obscure problem with the data source, the NHL real-time tracking: it requires normalization across different eras. For example, in January 2015, the NHL started using a new tracking system, and things like the average physical distance between recorded events suddenly increased. See: https://www.nhl.com/news/nhl-sportvision-test-program-to-track-players-puck/c-750201

At the beginning of the 2019-20 season, the distance to the net of shots suddenly changed. Micah McCurdy of HockeyViz, a website already mentioned in the manuscript, has commented on this change.

My second minor issue is with the use of the theoretical 'perfect player', which is the distribution of all shots by players of a given position that resulted in goals.

The authors mention that the top players have a different shot distribution than the all-goals perfect player. The explanation given is that "There are several strategies that players can adopt based on their strengths and weaknesses; the perfect players only represent one strategy." Strategy variation and game theory are one explanation, but there's something else: survivor bias.

The perfect player represents all the shots with the best outcomes, but those kinds of quality shots are not often available. A player, proficient scorer or not, cannot often choose to take a shot from where they are or from somewhere closer or at a better angle. Instead, when the opportunity to shoot presents itself, a player can only choose to take it or to trade that opportunity for the chance at a better one. The perfect player represents the times when the best opportunities were not just taken, but available.
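The selection effect described above can be illustrated with a small simulation. This is a sketch under made-up assumptions, not an analysis of real NHL data: shot opportunities arrive at uniformly random distances, and the chance of scoring decays with distance according to a hypothetical curve. Every shooter here has identical skill, yet the goals-only distribution sits much closer to the net than the distribution of shots actually available.

```python
import numpy as np

rng = np.random.default_rng(42)
n_shots = 100_000

# Hypothetical: opportunities arise uniformly between 5 and 60 feet from the net.
distance = rng.uniform(5, 60, size=n_shots)

# Hypothetical scoring curve: probability of a goal decays with distance.
p_goal = 0.3 * np.exp(-distance / 15)
is_goal = rng.random(n_shots) < p_goal

mean_all = distance.mean()             # where shots were available
mean_goals = distance[is_goal].mean()  # where the "perfect player" shoots from

print(f"mean distance, all shots:  {mean_all:.1f} ft")
print(f"mean distance, goals only: {mean_goals:.1f} ft")
```

The gap between the two means comes entirely from conditioning on the outcome, which is why a top player's shot map need not resemble the all-goals map even when their strategy is sound.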

I recommend that this paper be published in JQAS with moderate priority and minor revisions.

Follow Up comments after a round of revisions

In the previous version of this draft, there were three issues that I wanted to see addressed before giving a decision to 'accept'. Major 1: a lack of specific results (e.g. something a reader could use to describe it to a practitioner or a coach). Minor 1: two possible data inconsistencies relating to the NHL's event recording, one in early 2015 and one at the start of the 2019-20 season. Minor 2: the overreach / oversimplification of the concept of a perfect player.

The new results Section 4.1 "Estimated Basis Functions" addresses Major Issue 1 by describing the shot characterization of two example players, how they deviate from the theoretical perfect player, and the implications of this deviation. Section 4.1 also clarifies the strategic limitations of the Perfect Player concept, thus addressing Minor Issue 2. Both issues are addressed further in Section 4.2.

Minor Issue 1 was addressed by two comments in the concluding discussion. Specifically, the effect of the 2015 data inconsistency was checked, and the data used does not extend to the 2019-20 season, where the other inconsistency occurs.

Two new issues have arisen: There is a broken ref on Page 9, Line 30 to "Figure ??". Also, the capitalization of "Perfect Player" is inconsistent between Sections 4 and 5. I personally prefer it without capitalization, but the most important thing is to be consistent. These can be fixed in the copyediting stage.

Finally, this revision has several other details that weren't in the original, and I appreciate the thoroughness and rigour.

I am happy to recommend that this manuscript be accepted as is.

Thursday, 17 February 2022