Wednesday 22 August 2018

Draft Pairing Tournament Format

Worst-vs-first pairing structures for playoffs are designed to reward teams for doing well in the regular season. Sometimes this backfires. A better system would be a 'pairing draft' in which teams choose their first-round playoff opponents in order of regular season ranking.

Give me a few minutes to convince you.

Consider the following two-stage tournament setup:

The first stage is some balanced system to establish a ranking, such as a round-robin or a regular season schedule. These rankings are used to determine who goes on to the second stage, which is a collection of head-to-head eliminations. These rankings also determine which competitors play against each other in the first round of eliminations using a 'first vs worst' setup, where the first team plays against the worst (qualifying) team.

This general setup describes the regular and playoff seasons respectively of the NHL, NFL, CFL, MLB, MLS, and NBA. The two stages in this setup also describe the round robin and elimination stages respectively of the FIFA World Cup and many Olympic tournaments. The same is true of a lot of competitions outside the North American majors, like UEFA's, which use a round-robin followed by elimination.

Each of these has its own quirks like divisions, wildcards, and reseeding, but the primary goal of the elimination bracket is the same in all of them: Reward merit by pitting top teams against teams that qualified by the smallest margins.

This isn't just a scheme to maximize inequality in sports; it's a deliberate structure to incentivize competitors to win every match even when they have already performed well enough to clinch a spot in the elimination structure. It provides stakes for competitors to play for, at least most of the time.

However, on occasion, these seeding systems actually work to incentivize losing.

The notorious badminton case.

Consider the case of badminton at the 2012 Olympics in London.

To quote Justin Peters of Slate from this article ( https://slate.com/culture/2012/08/badminton-scandal-olympics-2012-why-were-those-olympic-badminton-players-trying-to-lose-and-why-is-the-sport-so-dirty.html ):

“...one of China’s two women’s doubles teams—Zhao Yunlei and Tian Qing—lost to Denmark’s Christinna Pedersen and Kamilla Rytter Juhl by the score of 22-20, 21-12.

That shocking result meant the two Chinese teams, the tournament favorites, would meet in the semifinals of the knockout round rather than the gold-medal game, depriving China of the chance to win both gold and silver. China’s only hope of putting two teams in the finals, then was for the country’s other team of Wang Xiaoli and Yu Yang to lose, thus pushing themselves to the opposite side of the bracket.

Once their South Korean opponents saw what the Chinese were up to, they decided it was also in their best interest to lose, that a defeat would give them better medal-round matchups as well.”

Official Results: https://www.olympic.org/london-2012/badminton

Video clip: https://youtu.be/JSxnQ-tgE3g

The key points are:

1) All four teams involved in intentionally trying to lose had already qualified for the next round, and were just playing to determine seeding.

2) There was a rational incentive for each of these teams to lose because of a surprise result earlier.

3) All four teams were disqualified for poor sportsmanship despite technically doing their best to win the tournament.

Why did this happen? What failed in the system to produce such a perverse incentive of this magnitude at this level of sport?

Essentially there was a disconnect between the latent ability of one team, China's (their expected performance on average), and their manifest performance (what actually happened). These ‘first vs. worst’ seeding systems only work as an incentive when the latent and manifest skill levels are close.

What could have happened differently to avoid this fiasco?

China could have won their game against Denmark, keeping the incentive structure in line with the spirit of the game. But over enough games, upsets are inevitable.

A longer qualifying phase could have been played to reduce the effect of any single upset, like a regular season instead of a round robin. This isn’t feasible for a tournament event like the FIFA World Cup or the Olympics.

All four teams could have actually tried to win. They could have played to the spirit of the competition, rather than the letter of its rules and its strategy. No comment.

Seeding could be randomized. Nate Silver of FiveThirtyEight has suggested this for FIFA as a means to prevent teams from colluding or intentionally losing or drawing games in order to avoid a particular strong opponent. In this particular case, where two teams from each pool of four qualify for the next round, this is a sensible solution. In cases where there is a large number of seeds, like the NHL playoffs, which have 8 seeds per conference, or the NCAA’s March Madness, which has 16 seeds per region, the loss of the incentive to be seeded highly could be detrimental.

What if we let them draft each other?

What if the highest seed was awarded the opportunity to choose their next opponent amongst those who qualified?

Example: An NHL Conference (ignoring divisions)

Consider a conference in which eight teams have qualified. How would a draft work?

Under a traditional pairing, we go ‘first vs. worst’, or

1st vs 8th
2nd vs 7th
3rd vs 6th
4th vs 5th

But what if that first seed has had really good luck against, say, the 7th seed? Or what if the 4th seed team’s captain has just sustained an injury? Then the draft might go like this:

- Team 1 would opt to play against Team 7.
- The highest remaining seed, Team 2, chooses Team 8.
- Then Team 3 chooses Team 4.
- Team 4, being chosen already, does not get to choose.
- Teams 5 and 6, being the only two remaining teams, play each other.
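The draft steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pairing_draft`, the `choose` callback, and the `wishes` table are hypothetical names invented here, and the sketch assumes an even number of qualified teams.

```python
def pairing_draft(seeds, choose):
    """Pair teams by letting the highest remaining seed pick an opponent.

    seeds  -- team labels ordered from best to worst regular-season rank
    choose -- hypothetical callback: choose(picker, available) -> opponent
    """
    remaining = list(seeds)
    if len(remaining) % 2 != 0:
        raise ValueError("this sketch assumes an even number of teams")
    pairings = []
    while len(remaining) > 2:
        picker = remaining.pop(0)            # highest seed not yet paired
        opponent = choose(picker, remaining)
        if opponent not in remaining:
            raise ValueError(f"{opponent!r} is not available")
        remaining.remove(opponent)           # a chosen team loses its own pick
        pairings.append((picker, opponent))
    if remaining:
        pairings.append(tuple(remaining))    # the last two teams play each other
    return pairings


# The example from the text: Team 1 wants Team 7, Team 2 wants Team 8,
# Team 3 wants Team 4. Any other picker defaults to the lowest seed left.
wishes = {1: 7, 2: 8, 3: 4}
pairings = pairing_draft(range(1, 9), lambda picker, available: wishes.get(picker, available[-1]))
print(pairings)  # [(1, 7), (2, 8), (3, 4), (5, 6)]
```

If every picker simply takes the lowest seed available, the same function reproduces the traditional ‘first vs. worst’ bracket, so the draft can be seen as a generalization of the existing system rather than a replacement for it.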

General Commentary

When merging round robin pools, as in the Olympic badminton tournament or the FIFA World Cup, we could randomly draw among the two first seeds to determine which one gets their pick of opponents. In the badminton case, the winner of either match would have had a better than 50% chance of a favourable result (a 50% chance to choose the weaker eligible opponent, and a non-zero chance to be chosen by the weaker eligible opponent).

Collusion on opponent preference could be avoided further by having each team submit a sealed preference ranking in advance. If that team gets an opportunity to pick, its highest-ranked preference that is still available would be the one selected.
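Resolving a sealed ranking is mechanical, which is part of its appeal. A minimal sketch follows, assuming each team submits an ordered list of opponents before the draft; the function name `sealed_pick` and the sample ranking are hypothetical.

```python
def sealed_pick(ranking, available):
    """Return the first team in a sealed preference ranking that is still available."""
    for team in ranking:
        if team in available:
            return team
    raise ValueError("no ranked opponent is still available")


# Hypothetical sealed ranking submitted by Team 2 before the draft.
team_2_ranking = [7, 8, 6, 5, 4, 3]
# Team 1 has already taken Team 7, so Team 2 gets its second preference.
choice = sealed_pick(team_2_ranking, available=[3, 4, 5, 6, 8])
print(choice)  # 8
```

Because the ranking is fixed in advance, a team cannot adjust its choice in response to side deals made once the draft order is known.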

The pairing draft system works as intended only when these latent parameters are known about equally well by all parties involved. If one team or competitor has more information, perhaps through scouting or analytics, then that team can benefit more from the draft than its seed should otherwise allow.

Another drawback is that the advantage compounds as the size of the bracket increases. The top seeds can handpick teams that are particularly weak against their own style of play.

A team could even reduce their chances of being picked by a top seed by disincentivizing being picked in ways unrelated to skill. My close friend Justin Bayes has proposed an ice hockey team composed entirely of enforcers. These are players that are not chosen for their skillful play but their ability to hinder, harass, and otherwise injure opponents. If a team like this made the playoffs, even if just barely, they would probably be given the luxury of not playing against a much higher seed, not because the team playing Justin’s Goons would lose, but because a win would be so costly.

This sort of strategy can be flipped on its head to an advantage as well. Consider a strong team looking to knock out a divisional rival early before they have to deal with them later in the playoffs. They could select that rival even if it was a stronger opponent than they would otherwise be required to play against.

Consider the regular seasons of sports leagues like the English Premier League, the National Hockey League, Major League Baseball, and Major League Soccer. A team in each league plays 38, 82, 162, and 34 matches, respectively, in its regular season.

In cases like these, a single upset is nearly meaningless, and a team’s season performance is a good approximation of their latent skill, ignoring changing effects like trades and recent injuries.

What additional advantage could a draft possibly confer on them? If skill can be simplified into a single parameter, then none: the regular season rankings would already be the optimal selections. A team would always choose to play against the lowest-ranked available opponent. If 8 teams qualify, then the first seed would choose to play against the 8th seed, the second seed would choose to play against the 7th seed, and so on.
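A small sketch can make the single-parameter case concrete. It assumes a logistic win probability based on the difference of two skill numbers (a common choice, but an assumption here, not something the leagues use), made-up skill values that fall with seed, and an even number of teams. The names `win_probability` and `draft_by_skill` are hypothetical.

```python
import math


def win_probability(skill_a, skill_b):
    """Logistic win probability from one skill number per team (an assumed model)."""
    return 1.0 / (1.0 + math.exp(-(skill_a - skill_b)))


def draft_by_skill(skill):
    """Run a pairing draft in which each picker maximizes its own win probability.

    skill -- dict mapping seed number (1 = best) to a single skill value
    """
    remaining = sorted(skill)                # seeds from best to worst
    pairings = []
    while remaining:
        picker = remaining.pop(0)
        opponent = max(remaining, key=lambda t: win_probability(skill[picker], skill[t]))
        remaining.remove(opponent)
        pairings.append((picker, opponent))
    return pairings


# Made-up skill values that decrease steadily with seed.
skill = {seed: 2.0 - 0.5 * seed for seed in range(1, 9)}
pairings = draft_by_skill(skill)
print(pairings)  # [(1, 8), (2, 7), (3, 6), (4, 5)]
```

With only one skill number per team, the draft collapses back to ‘first vs. worst’, so it does no harm in leagues where the rankings already tell the whole story.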

However, in many sports, skill is much better described as multiple parameters. The secondary parameters might embody individual properties of teams that make them situationally better or worse than their average season performance would suggest.

For example, the stadium of the New York Yankees has a strong bias favouring left-handed hitters. Would you choose to play a game against the Yankees if your team was composed entirely of right-handed hitters?

The Boston Bruins have a (possibly unfair) reputation for playing very rough hockey. Would you choose to play against them if your team relied heavily on finesse and was easily disrupted?

In individual head-to-head sports these parameters may be even more prominent. In a chess tournament, would you choose to play against someone with a higher Elo rating or someone with a proven record of beating your favorite opening? In a martial arts tournament, would you choose the higher-ranked opponent over one whose favored style counters your own?

Math Details: The CRSP model

We can model these differences in specialization with a CRSP model or ‘crisp’ model.

(Source: https://dl.acm.org/citation.cfm?id=2835787 Modeling Intransitivity in Matchup and Comparison Data by Shuo Chen and Thorsten Joachims)

The CRSP model treats competing teams in a team sport (or individuals in a solo competition) as if they have several latent skill parameters:

- A general skill parameter
- A ‘rock’ parameter
- A ‘scissors’ parameter

(a ‘paper’ parameter could theoretically be added for a better model fit if necessary)

The probability of one competitor beating another in a CRSP model is derived from the difference in their general skill and the angle between the two competitors’ rock and scissors parameters.

Under the CRSP model, a top seed will optimally choose the opponent they have the best chance of beating, which isn't necessarily the one with the lowest available general skill.
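To make this concrete, here is a rough sketch of a model in this spirit. The functional form is an assumption made for illustration, not the formula from the cited paper: the log-odds are taken to be the general skill gap plus the 2-D cross product of the two (rock, scissors) vectors, which depends on the angle between them and flips sign when the competitors are swapped. The `Team` type, the function `win_probability`, and all numbers are hypothetical.

```python
import math
from typing import NamedTuple


class Team(NamedTuple):
    general: float   # overall skill
    rock: float      # first style coordinate
    scissors: float  # second style coordinate


def win_probability(a, b):
    """Probability that a beats b under an assumed CRSP-like form."""
    skill_gap = a.general - b.general
    # 2-D cross product: |a| * |b| * sin(angle from a's style vector to b's).
    style_edge = a.rock * b.scissors - a.scissors * b.rock
    return 1.0 / (1.0 + math.exp(-(skill_gap + style_edge)))


# Made-up parameters: the 8th seed is weaker overall than the 7th seed,
# but its style is awkward for the top seed.
top = Team(general=1.0, rock=1.0, scissors=0.0)
candidates = {
    "7th": Team(general=-0.4, rock=0.0, scissors=1.0),
    "8th": Team(general=-0.6, rock=0.0, scissors=-1.0),
}
best = max(candidates, key=lambda name: win_probability(top, candidates[name]))
print(best)  # 7th
```

In this toy example the top seed prefers the 7th seed even though the 8th seed has lower general skill, which is the situation a pairing draft is meant to exploit.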

Follow-up questions

How should reseeding work? For example, in the WNBA, the arrangement of the teams is reseeded after the second round to reinforce this best-versus-worst situation.

How would multiple-round strategy work?

Currently, Major League Baseball has a one-game wild-card round. A pairing draft would logically happen after such a wild-card round because only two teams play in each wild-card round.

Chica is willing to get messy to scare off a dainty opponent.
