How much easier or harder is
the Jeopardy! College Championship than regular Jeopardy!? In this
tournament, one undergraduate student from each of 15 U.S.
postsecondary schools competes in a series of elimination rounds. The
categories are aimed at a different audience than those of the
regular show: some of the clues referred to new and popular video
games, and to neologisms like 'woke'. To me, the College Championship
questions seemed qualitatively easier, but for the sake of tracking, I
want to measure that. The following method will also work if you find
the College Championship questions harder than those from the regular
show.
Here is a chart of my
personal 'Coryat scores' since I started recording them on January
10th. Coryat scores, as described at http://www.pisspoor.com/jep.html,
are a means of standardizing Jeopardy! scores for home viewers. One
rule of Coryat scores is that Daily Doubles are not wagered, but are
treated as regular clues that a home viewer may guess on without a
penalty for an incorrect answer.
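As a rough illustration of that rule, here is a minimal R sketch that scores one game from a clue-by-clue log. The function name coryat_score() and the log layout are hypothetical, not part of the attached code, and the clue log is made up. It also assumes the usual Coryat convention that a wrong answer on an ordinary clue costs the clue's face value.

```r
# Hypothetical helper: Coryat score for one game from a clue log.
#   value        face value of each clue
#   result       "right", "wrong", or "pass" for each clue
#   daily_double TRUE if the clue was a Daily Double
coryat_score <- function(value, result, daily_double) {
  # A correct answer always earns the face value (no wagering).
  gained <- ifelse(result == "right", value, 0)
  # A wrong answer costs the face value, except on a Daily Double.
  lost <- ifelse(result == "wrong" & !daily_double, value, 0)
  sum(gained - lost)
}

# Made-up clue log, for illustration only.
value        <- c(200, 400, 800, 1000, 2000)
result       <- c("right", "wrong", "right", "wrong", "pass")
daily_double <- c(FALSE, FALSE, TRUE, TRUE, FALSE)

game_score <- coryat_score(value, result, daily_double)
print(game_score)
```

In this log the missed Daily Double (the 1000 clue) costs nothing, while the missed 400 clue is subtracted.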
The dotted lines in the
chart are at scores of 24,000 and 28,000, which Karl Coryat considers
the scores needed to be 'test ready' and 'show ready' respectively.
Filled circles and triangles represent days on which I got the
correct Final Jeopardy answer, and open shapes represent days on
which I missed it. Circles represent regular show days, including
Saturday reruns. Triangles represent College Championship days.
The red curve represents an
estimate of my average regular show Coryat at my current skill level.
In order to make this estimate without ignoring 10 of my data points,
I had to separate the 'effect' of the College Championship, which I
assumed adds a flat amount to my scores. I applied this assumption to
three regression models:
1. Score = α + β(CC game) + γ(Days played) + error
2. Score = α + β(CC game) + γ(sqrt(Days played)) + error
3. Score = α + β(CC game) + γ(exp(Days played/(Total Days)*log(0.70))) + error
In Model 1, I assume that improvement in mean scores is constant with
each passing day.
In Model 2, I assume that improvement in mean scores diminishes with
each passing day, but that it continues indefinitely (e.g. it will
take four times as much time to get twice the progress).
In Model 3, I assume that I am improving, but that there is some
limit to my progress, that I started at about 28% of that limit, and
that I am already at 70% of it. Under this model, my mean score will
approach some maximum exponentially and stay there.
Each model applies a fixed additive effect for playing a College
Championship game, symbolized by β.
I don't know which of these models is the most correct, so instead I
take an ensemble estimate using all three.
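To make the three models concrete, here is a minimal sketch of how they could be fit with lm() in R. It is not the attached analysis code: the column names (days_played, cc_game, score) are my own assumptions, and the data are simulated purely for illustration. The Model 3 covariate is transcribed directly from the formula given for that model.

```r
# Simulated data, for illustration only; these are NOT my real scores.
set.seed(1)
n <- 36
games <- data.frame(
  days_played = 1:n,
  cc_game     = as.integer(1:n %in% 20:29)  # hypothetical: 10 CC games
)
games$score <- 12000 + 4000 * games$cc_game +
  800 * sqrt(games$days_played) + rnorm(n, sd = 3000)

# Model 3 covariate: exp(Days played / Total Days * log(0.70))
potential_reached <- 0.70
total_days <- max(games$days_played)
games$decay <- exp(games$days_played / total_days * log(potential_reached))

fits <- list(
  linear = lm(score ~ cc_game + days_played,       data = games),
  root   = lm(score ~ cc_game + sqrt(days_played), data = games),
  decay  = lm(score ~ cc_game + decay,             data = games)
)

# Estimate and standard error of beta (the CC effect) in each model.
cc_effect <- t(sapply(fits, function(f) {
  summary(f)$coefficients["cc_game", c("Estimate", "Std. Error")]
}))
print(cc_effect)
```

Each row of cc_effect holds one model's estimate of β and its standard error, which are the two ingredients an ensemble estimate needs.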
The ensemble estimate is the weighted average of the three estimates
of β, using weights inversely proportional to the standard error of
each estimate. That is, if a model had more uncertainty about the
size of the CC effect, its estimate was used less in the ensemble.
Each model gave a standard error of 3000 to 3500 points on an
estimate of 4500 to 5000 points, which means that all three models
agree closely and that they are weighted nearly equally. Also, those
standard errors are so large that I can't claim with confidence that
there even is an effect of the College Championship on scores, but
that's mostly because I'm merely one at-home player with fewer than
40 days' worth of data.
From
my personal data, the College Championship adds 4282 points to my
score.
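The weighting step can be sketched in a few lines of R. The estimates and standard errors below are made-up placeholders, not the values from my fitted models, and the variable names are hypothetical.

```r
# Made-up estimates and standard errors of beta, for illustration only.
beta_hat <- c(linear = 4000, root = 5000, decay = 6000)
beta_se  <- c(linear = 3000, root = 3200, decay = 3500)

# Weights inversely proportional to the standard error, scaled to sum to 1.
w <- (1 / beta_se) / sum(1 / beta_se)

# Ensemble estimate: weighted average of the three estimates.
ensemble_beta <- sum(w * beta_hat)

print(round(w, 3))
print(ensemble_beta)
```

Because the standard errors are close to each other, the weights come out close to one third each, which is the "near equal weight" situation described above. Weighting by inverse variance instead of inverse standard error is a common alternative, but it is not what was used here.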
Each model can also predict the Coryat score for any number of 'days
played' and for either game type, a regular show or a College
Championship. The red curve is constructed by plugging the
appropriate number of days played and a 'regular show' game type into
each model, and getting an ensemble estimate. Similar to the ensemble
estimate for the CC effect, this is a weighted mean of the fitted
values, with weights inversely proportional to the RMSE (root mean
square error) of that model. In short, models that fit the observed
data better are weighted more heavily. However, every model was
weighted almost equally and had a similar estimate for my regular
show mean Coryat on the most recent day.
From
my personal data, I score an average of 16,957 points on a regular
show.
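Here is a minimal sketch of how such a curve could be built in R with predict(). Again, this is not the attached analysis code: the data are simulated, and the column and variable names are assumptions made for the example.

```r
# Simulated data, for illustration only; these are NOT my real scores.
set.seed(2)
n <- 36
games <- data.frame(
  days_played = 1:n,
  cc_game     = as.integer(1:n %in% 20:29)  # hypothetical: 10 CC games
)
games$decay <- exp(games$days_played / n * log(0.70))  # Model 3 covariate
games$score <- 12000 + 4000 * games$cc_game +
  800 * sqrt(games$days_played) + rnorm(n, sd = 3000)

fits <- list(
  linear = lm(score ~ cc_game + days_played,       data = games),
  root   = lm(score ~ cc_game + sqrt(days_played), data = games),
  decay  = lm(score ~ cc_game + decay,             data = games)
)

# RMSE of each model on the observed data, and the matching weights.
rmse <- sapply(fits, function(f) sqrt(mean(residuals(f)^2)))
w <- (1 / rmse) / sum(1 / rmse)

# Fitted scores for every day played, with the game type set to
# 'regular show' (cc_game = 0) regardless of what was actually played.
regular <- data.frame(days_played = games$days_played,
                      cc_game     = 0,
                      decay       = games$decay)
fitted_by_model <- sapply(fits, predict, newdata = regular)  # n x 3 matrix

# Ensemble curve: RMSE-weighted mean of the three fitted values per day.
red_curve <- as.vector(fitted_by_model %*% w)
current_level <- red_curve[n]  # estimate for the most recent day
print(current_level)
```

The last element of red_curve plays the role of the "current skill level" estimate: the ensemble's fitted regular-show score on the most recent day.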
You
can use the attached code and data to recreate this graph, and you
can put your own data in the attached .csv file's format to apply the
analysis to your own Coryat scores.
Download Analysis Code
Download Frontend Code
Download my Jeopardy! data
You
need only apply your own file names and directories, copy/paste the
code from the 'frontend' file into the R console, and execute it.
setwd("C:/Users/Jack/Desktop/Projects 2017")
jep = read.csv("JackJep 20170303.csv", as.is=TRUE)
source("Jeopardy analysis code.txt")
jeopardy_analysis(jep,"Jack")
In
this code, setwd() sets the working directory to the folder that
holds the analysis code and the data to be analyzed. The read.csv()
command loads the CSV file into R, and 'jep' is the name of that
dataset. The command source() tells R to run the code in this file,
which in this case is just the definition of the function
jeopardy_analysis().
The
function jeopardy_analysis() estimates the College Championship
effect, but only if there are College Championship games in the
dataset. It estimates the current Coryat score of the player in the
dataset 'jep'. It also creates a graph like the one you see above,
using the name specified in the second argument.
Other
optional arguments that you can use in the jeopardy_analysis()
function include:

potential_reached:
Numeric. The proportion of your score potential you have reached.
Used for model tuning. Defaults to 70%.

thresholds:
Vector of numeric values. Determines the vertical location of the
dotted lines, if any, on the graph. Choose NULL for no thresholds.
Defaults to c(24000,28000). More or fewer than 2 lines can be used.

threshold_names:
Vector of strings. Self explanatory. Ideally should be the same
number of values as thresholds. Defaults to c("test ready","show ready").

verbose:
Boolean. Toggles the printing of additional information about each
model, and their weights in each ensemble. Defaults to FALSE.

makeplot:
Boolean. Toggles the creation of the plot. Defaults to TRUE.

If there is sufficient
interest, I will publish updated data and analysis code after the
Tournament of Champions, in which the best 15 Jeopardy! contestants
of the year compete, and which has more difficult clues than the
regular show. This update would estimate the effect of both
tournaments, as well as employ a wider ensemble of models.
Also, if other people are
willing to share their data, I can compute a better estimate of the
College Championship effect, as well as test whether the effect is
non-additive. For example, does it raise a 10,000 Coryat player's
score by more than it raises a 30,000 Coryat player's score? Does it reduce scores for certain demographics?