Featured post

Textbook: Writing for Statistics and Data Science

If you are looking for my textbook, Writing for Statistics and Data Science, here it is for free in the Open Educational Resource Commons. Wri...

Sunday, 11 April 2021

Wow, what are the odds? (Part 1: American Odds, Decimal Odds, and Implied Probability)

The term "odds" is slippery because it's used to mean different things in different contexts. In layperson's terms, "odds" is often used as a synonym for probability. In proper statistical terms, "odds" is a function of probability, but it's not the same as probability. There are also other uses of the term "odds" in gambling contexts, which are functions of a parallel concept called "implied probability". In these notes, we're going to look at some common types of odds in statistics and gambling contexts, and some of the calculations to convert between them.
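As a rough illustration of the kind of conversions involved, here is a minimal Python sketch. It assumes the usual conventions, which this excerpt does not spell out: positive American odds give the profit on a 100-unit stake, negative American odds give the stake needed to win 100 units, decimal odds give the total return per unit staked, and implied probability is the reciprocal of decimal odds. The function names are made up for this example.

```python
def american_to_decimal(american: float) -> float:
    """Convert American (moneyline) odds to decimal odds.

    Assumed convention: +150 means a 100 stake wins 150 profit;
    -200 means a 200 stake wins 100 profit.
    """
    if american > 0:
        return 1 + american / 100
    if american < 0:
        return 1 + 100 / abs(american)
    raise ValueError("American odds of 0 are not defined")


def decimal_to_implied_probability(decimal_odds: float) -> float:
    """Implied probability is the reciprocal of the decimal odds."""
    if decimal_odds <= 1:
        raise ValueError("decimal odds must be greater than 1")
    return 1 / decimal_odds


def probability_to_statistical_odds(p: float) -> float:
    """Statistical odds in favour of an event: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)


if __name__ == "__main__":
    for line in (150, -200):
        dec = american_to_decimal(line)
        print(line, dec, decimal_to_implied_probability(dec))
```

Implied probabilities taken from a real betting line typically sum to more than 1 across all outcomes because of the bookmaker's margin, which is one reason they are a parallel concept to probability and not the same thing.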

Sunday, 14 February 2021

How does Polychoric Correlation Work? (aka Ordinal-to-Ordinal correlation)

Let's say you've got data of many paired cases of two ordinal variables, like you might when you ask a large number of people the same two Likert scale questions (e.g. "poor", "fair", "good", "very good", "excellent").

What could you learn from the data from those two questions?

Here are a few common approaches:

1) Compare the means of each variable by abusing a t-test.

2) Compare the distribution of each variable with a chi-squared goodness-of-fit test.

3) Check for a relationship between responses of each variable with a chi-squared independence test.

4) Estimate the strength of such a relationship with a Spearman correlation.

Saturday, 30 January 2021

Lottery tickets, baseball cards, and the coupon collector's problem

A man, a woman, an enby, and 27 elephants walk into a bar. The bartender looks at the group of 30 and asks "What is the probability that 2 or more of you have the same birthday?".
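For readers who would rather compute than guess, the bartender's question can be sketched in a few lines of Python. This assumes the standard simplification of 365 equally likely, independent birthdays; the function name is made up for this example.

```python
def prob_shared_birthday(group_size: int, days: int = 365) -> float:
    """Probability that at least two members of a group share a birthday.

    Assumes every birthday is independent and uniformly distributed
    over `days` possible days (a simplifying assumption).
    """
    if group_size > days:
        return 1.0  # pigeonhole principle: a match is certain
    prob_all_distinct = 1.0
    for k in range(group_size):
        # The (k+1)-th member must avoid the k birthdays already taken.
        prob_all_distinct *= (days - k) / days
    return 1.0 - prob_all_distinct


if __name__ == "__main__":
    print(prob_shared_birthday(30))
```

The calculation works through the complement: it is easier to find the chance that all 30 birthdays are distinct and subtract that from 1.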

The group proceeds to disregard twins, leap years, and building codes. They talk amongst themselves, leave a little space for the reader to calculate or guess for themselves,

Thursday, 21 January 2021

Fantasy Sports Explained

Information on fantasy sports ranges from the overly broad or impossible to apply, like personal stories about how fantasy sports changed lives, to the extremely narrow or short-lived, like opinions on players' recent performances. This article is intended to cover some of the middle ground and help you understand the basics of fantasy sports, with a worked example of how you might choose players at the beginning of a league.

Sunday, 20 December 2020

Borel Dice Edition - Brute Forcing Experiments

Borel and Borel: Dice Edition are educational games about probability. I picked up a copy of each because I thought they would be useful in introducing some ideas of probability and gambling without the cultural baggage of better known games.

I'm biased because it's my field, but Borel has a lot more play value than most games of its kind. The dice edition, which is much easier to find, and easier to get into and play, has a set of 7 dice (four 6-sided, and one each of a 10-sided, 20-sided and 30-sided die), and a deck of 100 "experiments", like Experiment 001:

Wednesday, 11 November 2020

T1B: Goodhart's Law and Baserunning

When I was about 11 years old, I was good at running, good enough to represent my elementary school as the anchor in a relay at a district track meet. This prompted all the grown-ups in my life to coach me on running. They told me all sorts of tricks about keeping my hands flat and remembering to breathe and to keep in a straight line and to keep from dragging my feet and to start early to get the baton.

The race came and I did all those things – hands, breathing, straight line, no dragging, start early. I forgot, however, to run. I blew a huge lead by running in perfect form, but in slow motion.

What's the lesson here?

Monday, 12 October 2020

Lost Chapter: Writing for your Career

This is one of the 'lost chapters' of the textbook "Writing for Statistics and Data Science", which was removed because information changes too quickly. This chapter covers data science resumes, describing class projects to businesses, and writing letters of introduction to potential grad supervisors.