Saturday, 11 February 2017

Creating Homework Material for Statistical Writing Classes and Workshops.

This summer, I'll be offering a workshop on statistical writing to the graduate students in the SFU Stats department. In the year that follows, I hope to teach SFU's Stat 300 course, an undergraduate writing course that is mandatory for all statistics majors. I want this course to be offered at more universities, but the typical undergrad degree in stats is already overloaded, so adding it would need some additional motivation. Here's my anecdotal hook:

When graduate schools ask for a letter about your motivations for applying, or about your general background, there are a few things being examined. First, different graduate supervisors have different specialties, and they may be looking at your letter for signs of a good match. Second, the letter is a means of seeing firsthand your personal ability to communicate in English. They ask for something personal rather than a set topic to discourage plagiarism. They are not interested in your level of motivation; everyone says they are highly motivated.

I've recently been writing reference letters and filling out graduate school reference forms for students who have been in my previous 300-level classes. All of these schools are asking about the ability to write. Many statistics undergrads are applying to fields outside of pure statistics, for example economics, medical science, and business. All of them are asking their references about skills in English, often spoken English and always written. Likewise, the Statistical Society of Canada asks referees about spoken English as a part of their A.Stat accreditation.

In order to develop these skills in a program that usually focuses on mathematical theory and programming ability, I've sketched out some projects and exercises that could fit into a workshop or course.

The instructor (me) would ask colleagues from other fields that use quantitative data, such as sociology, anthropology, biology, business, ecological restoration, and history, for reports from previous or current projects.

The students would read the reports, identify the sampling method and the analysis that was done, and check whether the assumptions of those methods hold for the data. They would interpret the results in their own words (not in the words of the discussion or conclusion section), and compose a couple of questions about the research that weren't answered. These questions don't have to pertain to the domain; they should pertain to the methodology. These would be questions like: Were there any outliers that affected your results? Did you consider multiple testing? Did you consider regression/ANOVA?
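To make the multiple-testing question concrete, here is a minimal sketch of the kind of check a student might run on a report that presents several tests. It uses a Bonferroni adjustment, which is only one of several possible corrections, and the p-values are made up for illustration; they do not come from any real report. The function names are my own.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def count_significant(p_values, alpha=0.05):
    """Count how many p-values fall below the significance level."""
    return sum(1 for p in p_values if p < alpha)


# Hypothetical p-values from four tests in one report (illustration only).
reported = [0.01, 0.04, 0.03, 0.20]
adjusted = bonferroni_adjust(reported)

print("Significant before adjustment:", count_significant(reported))
print("Significant after adjustment: ", count_significant(adjusted))
```

If the number of significant results drops after the adjustment, that is a natural question to put to the report's authors.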

For example, take this research report: Water Quality of Stoney Creek and its Effects on Salmon Spawning, by SFU undergrads Oak, Tony; Thai, Michelle; Orgil, Indra; Ngo, Kevin; Lu, Jerry, available at http://summit.sfu.ca/item/12770

A writing assignment could be:

  • What were the parameters to be estimated?
  • What biologically important thresholds or critical values are mentioned for the parameters being estimated?
  • What sampling method was used? How many sampling units are there?
  • Were there any null hypotheses being tested? If so, were they rejected? Should they have been?
  • Describe the qualitative differences between the four sites. Use the map and Figs 4, 5, 6, and 7.
  • In your own words, summarize the quantitative data described in Figures 2 and 3.
  • Compose two suggestions or questions about the analysis that may not have been considered.
  • Describe one way you could build upon (e.g. expand, follow up) this study, the type of information you would collect from this new work, and what additional conclusions you could make if the data matched your expectations.

(Note: Please don't take these questions as a criticism of this research report. It was selected to show how material is available within one's own campus. Even a quick skim of the report will show it's truly excellent work for undergrads.)


Another exercise I'm thinking of doing is to have students copy edit some scientific writing. This could be a publication or a blog post. I would do this twice: once with a well-written work, and once with something of questionable quality, like the unedited papers that come out of discount publishers. If I used something from arXiv, with the authors' permission, I could use later, more complete versions of the same work as a 'key'. However, using arXiv is untenable for graded work because the 'key' is publicly available.


Also, there are the reading comprehension exercises. The ones I had posted earlier (here and here) were meant to be done in less than 2 hours, and had questions focused on specific statistical topics. Another possibility is to write more such assignments that require less statistical expertise, but more in-depth composition.