Featured post

Textbook: Writing for Statistics and Data Science

If you are looking for my textbook Writing for Statistics and Data Science, here it is for free in the Open Educational Resource Commons. Wri...

Tuesday 19 October 2021

Sampling, conditional probability, and random number generation

Part of the motivation behind making the course Statistics and Gambling is to infuse new applicability into introductory or intermediate probability courses. This blog post is a look at how the course is going to cover familiar probability topics using examples from games of chance and a simulation-based (rather than theory-based) approach.

This post covers basic methods of random number generation (RNG) in R, and applying RNG to demonstrate core concepts in sampling, conditional probability, and conditional distributions. It is meant to be a very surface-level primer on the topics, just enough to give context for the deeper dives into specific games of chance.
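As a rough illustration of the simulation-based idea, here is a minimal sketch that estimates a conditional probability by random number generation. The post itself works in R; this sketch uses Python instead, and the dice scenario, function name, seed, and number of rolls are all assumptions chosen for the example rather than anything taken from the course.

```python
import random


def estimate_conditional(n_rolls=200_000, seed=2021):
    """Estimate P(sum of two dice >= 10 | first die is 6) by simulation."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    given = 0  # rolls where the conditioning event (first die is 6) happened
    both = 0   # rolls where the first die is 6 AND the sum is at least 10
    for _ in range(n_rolls):
        first = rng.randint(1, 6)
        second = rng.randint(1, 6)
        if first == 6:
            given += 1
            if first + second >= 10:
                both += 1
    # Conditional probability: keep only the rolls where the condition held.
    return both / given


if __name__ == "__main__":
    print(estimate_conditional())
```

The exact answer is 3/6 = 0.5, since the second die needs to show 4, 5, or 6, so the estimate should land close to that.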

Saturday 26 June 2021

The Bottleneck Retirement Plan

I do not have a voluntary retirement plan or a pension. I have the means to put money away specifically for retirement, but I choose not to. Instead, I use an investment strategy that has been described as "the most and least insane thing I've ever heard". Here is that strategy:


Sunday 11 April 2021

Wow, what are the odds? (Part 1: American Odds, Decimal Odds, and Implied Probability)

The term "odds" is slippery because it's used to mean different things in different contexts. In layperson terms, "odds" is often used as a synonym for probability. In proper statistical terms, "odds" is a function of probability, but it's not the same as probability. There are also other uses of the term "odds" in gambling contexts which are functions of a parallel concept called "implied probability". In these notes, we're going to look at some common types of odds in statistics and gambling contexts, and some of the calculations to convert between them.
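To make those conversions concrete, here is a small sketch using the standard formulas for American odds, decimal odds, and implied probability, plus statistical odds as a function of probability. The function names are made up for this example, and the implied probability here ignores any bookmaker margin.

```python
def american_to_decimal(american):
    """Convert American (moneyline) odds to decimal odds."""
    if american > 0:
        # +150 means a 100 stake wins 150, so 2.50 is returned in total per 1 staked.
        return 1 + american / 100
    if american < 0:
        # -200 means a 200 stake wins 100, so 1.50 is returned in total per 1 staked.
        return 1 + 100 / abs(american)
    raise ValueError("American odds cannot be zero")


def decimal_to_implied_probability(decimal_odds):
    """Implied probability is the reciprocal of the decimal odds."""
    return 1 / decimal_odds


def probability_to_odds(p):
    """Statistical odds in favour of an event with probability p."""
    return p / (1 - p)


if __name__ == "__main__":
    for american in (150, -200):
        dec = american_to_decimal(american)
        print(american, dec, round(decimal_to_implied_probability(dec), 4))
    print(probability_to_odds(0.75))
```

Note that a probability of 0.75 corresponds to statistical odds of 3 (that is, 3 to 1 in favour), which is why "odds" and "probability" are not interchangeable.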


Sunday 14 February 2021

How does Polychoric Correlation Work? (aka Ordinal-to-Ordinal correlation)

Let's say you've got data with many paired observations of two ordinal variables, like you might when you ask a large number of people the same two Likert scale questions (e.g. "poor", "fair", "good", "very good", "excellent").


What could you learn from the data from those two questions?

Here are a few common approaches:


1) Compare the means of each variable by abusing a t-test.

2) Compare the distribution of each variable with a chi-squared goodness-of-fit test.

3) Check for a relationship between responses of each variable with a chi-squared independence test.

4) Estimate the strength of such a relationship with a Spearman correlation.
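As a sketch of approach 4, a Spearman correlation can be computed by ranking each variable (giving tied responses their average rank) and then taking the ordinary Pearson correlation of the ranks. The helper names and the tiny response lists below are invented purely for illustration.

```python
from statistics import mean

# Ordered Likert levels, lowest to highest.
LEVELS = ["poor", "fair", "good", "very good", "excellent"]


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(q1, q2):
    """Spearman correlation between two lists of Likert responses."""
    codes1 = [LEVELS.index(r) for r in q1]
    codes2 = [LEVELS.index(r) for r in q2]
    return pearson(average_ranks(codes1), average_ranks(codes2))


if __name__ == "__main__":
    # Hypothetical answers from six respondents to two questions.
    q1 = ["poor", "fair", "good", "good", "very good", "excellent"]
    q2 = ["fair", "poor", "good", "very good", "very good", "excellent"]
    print(spearman(q1, q2))
```

With only five response levels, ties are everywhere, which is why the average-rank step matters here.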

Saturday 30 January 2021

Lottery tickets, baseball cards, and the coupon collector's problem

A man, a woman, an enby, and 27 elephants walk into a bar. The bartender looks at the group of 30 and asks "What is the probability that 2 or more of you have the same birthday?".

The group proceeds to disregard twins, leap years, and building codes. They talk amongst themselves, leave a little space for the reader to calculate or guess for themselves,
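For readers who would rather calculate than guess, here is a minimal sketch of the usual approach: work out the probability that everyone has a different birthday, then subtract from 1. It assumes 365 equally likely birthdays, in line with disregarding twins and leap years; the function name is made up for this example.

```python
def prob_shared_birthday(group_size, days=365):
    """Probability that at least two of group_size individuals share a birthday."""
    p_all_distinct = 1.0
    for i in range(group_size):
        # The (i+1)-th individual must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    print(prob_shared_birthday(30))
```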

Thursday 21 January 2021

Fantasy Sports Explained

Information on fantasy sports ranges from the overly broad or impossible to apply, like personal stories about how fantasy sports changed lives, to the extremely narrow or short-lived, like opinions on players' recent performances. This article is intended to cover some of the middle ground and help you understand the basics of fantasy sports, with a worked example of how you might choose players at the beginning of a league.