Tuesday, 3 October 2017

Book review of Improving How Universities Teach Science Part 2: Criticisms and comparison to the ISTLD



As far as pedagogical literature goes, Carl Wieman’s Improving How Universities Teach Science - Lessons from the Science Education Initiative was among my favourites. It keeps both jargon and length to a minimum, as it is barely more than 200 pages without counting the hiring guide at the end. The work and its presentation are strongly evidence-based and informative, and the transformation guide in the appendices provides concrete actions that the reader can take to improve their own classes. Most of its content applies broadly across science education.

 I have some criticisms, but please don’t take this as a condemnation of the book or of the SEI in general.

The term ‘transformation’ is vague, despite being used extensively throughout the first two thirds of the book. This is partially forgivable because it has to be vague in order to cover what could be considered a transformation across many different science fields. However, there could have been more examples, more elaboration, or a better definition of the term early on. The first concrete examples that clarify what transformation means don’t appear until the appendix, 170 pages in.

Dismissal of the UBC mathematics department, and of mathematics education in general.

The metric Wieman primarily used was the proportion of faculty that bought in to the program, that is, the proportion of faculty that transformed their courses (typically faculty transformed all of their courses or none of them). Many departments were considered a success in that 70% or more of the faculty transformed their classes. Those under 50% were mostly special cases that had entered the Science Education Initiative later and hadn't had the opportunity to transform. Among the all-stars was the UBC statistics department, where 88% of their 17 faculty with teaching appointments transformed their classes. Among the faculty of the UBC mathematics department, however, only 10% of their 150+-strong department bought in and transformed their classes. For contrast, $1.2 million was spent on the mathematics department while $300,000 was spent on the statistics department, so the mathematics people got more in total but the statistics people got more per faculty member. It's not the failure to transform the mathematics department that bothers me but the explanation for it.
Wieman boils down the failure to transform the mathematics department to two factors. The first was the culture within that particular department, which did not emphasize undergraduate education and seemed to assume that mathematics was an innate ability that students either had or did not have, regardless of the amount of effort put in. Before Wieman started attempting to transform this department, it had a policy of automatically failing the bottom few percentiles of every introductory calculus class regardless of performance. The second factor Wieman uses to explain the failure is that mathematics is inherently not empirical, which means that a lot of the active learning meant to make concepts more concrete would not have applied.

Having taught and been taught in both mathematics and statistics departments at multiple institutions myself, I don't buy these arguments. From personal experience, the most engaging and active classrooms I encountered were spread equally across mathematics and statistics. Within mathematics, the most memorable was abstract algebra, which by definition is non-empirical. Furthermore, at Simon Fraser University it's the mathematics department that has been leading the way on course transformation.
As for the argument about innate ability, this is an idea that extends far beyond university departments. I have no qualification to claim how true or false it is. However, it's not a useful assumption, because it makes many things related to teaching quality in mathematics automatically non-actionable.
Finally, it seems like a strange argument for a professor of physics to make about mathematics. I would have liked to see more investigation; perhaps it's covered in some of his other literature, but then I would have liked to see more references to that literature if it exists.

Compared to the Institute for the Study of Teaching and Learning in the Disciplines (ISTLD) at SFU, Wieman’s SEI is several times larger in scale and tackles the problem of university teaching entire departments at a time. The SEI works with department chairs directly and with faculty actively through its science education specialists. The ISTLD’s projects are self-contained course improvements, where staff and graduate student research assistants provided literature searches, initial guidance, and loose oversight of the course improvement projects. Both initiatives fostered large volumes of published and publicly presented research.
The funding for course improvement projects through the ISTLD was non-competitive; the only requirements to receive a grant were to submit a draft proposal, to attend some workshops on pedagogy, and to submit a new proposal guided by these workshops. Grants from the SEI at both UBC and CU were awarded through a competitive process, which Wieman used because, in his words, it was the only system familiar to science faculty.

In case you missed it, here is the first part of this book review, which discusses the content more directly.

Book review of Improving How Universities Teach Science, Part 1: Content.


Unlike much of the other literature on the subject, Carl Wieman’s Improving How Universities Teach Science - Lessons from the Science Education Initiative spends most of its pages on the administrative issues involved in improving university teaching. If you're familiar with recent pedagogical literature, this book doesn't come with many surprises. What set it apart for me is the scale of the work that Wieman undertook, and his emphasis on educational improvement being an integrative process across an entire department rather than a set of independent advances.

 
The Science Education Initiative (SEI) model is about changing entire departments in large multi-year, multi-million-dollar projects. The initiative focuses on transforming classes by getting faculty to buy into the idea of transforming them, rather than transforming the classes directly.

The content is based on Wieman’s experience developing a science education initiative at both the University of British Columbia (UBC) and the University of Colorado (CU). It starts with a vision of what an ideal education system at a university would look like, mostly as an inspiring goal rather than a practical milestone. It continues with a description of how change was enacted at both of these universities. The primary workforce behind these changes was a new staff position called the science education specialist, or SES. SES positions typically went to recent science PhD graduates that had a particular interest in education. These specialists were hired, trained in modern pedagogy and techniques to foster active learning, and then assigned as consultants or partners to faculty that had requested help with course transformation.

 
The faculty themselves were induced to help through formal incentives like money for research or teaching buy-outs that allowed them more time to work on research, and through informal incentives like consideration in teaching assignments and opportunities for co-authorship on scholarly research. Overcoming the already established incentive systems (e.g. publish or perish) that prioritized research over teaching was a common motif throughout this book.

 
The middle third of the book is reflective, and it’s also the meatiest part; if you’re short on time, read only Chapters 5, 6, and the coda.  Here, Wieman talks about which parts of the initiative worked immediately, which worked after changes, and which never worked and why. He talks about his change from a focus on changing courses to a focus on changing the attitudes of faculty. He talks about the differences in support he had at the different universities and how that affected the success of his program. For example, UBC got twice the financial support and direct leadership support from the dean. He also compares the success rate of different departments within the science faculty. Of particular interest to me are the UBC statistics and the UBC mathematics departments, which obtained radically different results. The statistics department almost unanimously transformed their courses, while the mathematics department almost unanimously didn’t.

 
Wieman also talks at length about ‘ownership’ of courses, and how faculty feeling that they own certain courses is a roadblock. He calls it a roadblock partly because of the habit of faculty to keep their lecture notes to themselves on the assumption that they are the only ones teaching a particular course. Furthermore, the culture of ownership was perceived to contribute to faculty resistance to changes to their courses.

 
Under Wieman's model, course material is to be shared with the whole department so that anyone teaching a particular course has access to all the relevant material the department has made for it. Although UBC managed to create a repository for course material, the onus of populating that repository was on the faculty, and few people actually contributed. However, where this matters most, in the introductory courses, even partial sharing was enough because many people tend to teach those courses.

 
The final third of the book is a set of appendices which include examples of learning activities and strategies in transformed courses, guiding principles for instruction, and several short essays on educational habits and references to much of the other work that Wieman has done. It also includes a hiring guide with sample interview questions for prospective science education specialists.

 
The book also includes a coda, which is an eight-page executive summary of the first two parts of the book. The coda serves as a good review and as a nicely packaged chapter that could be shared with decision makers such as deans and faculty chairs. Decision makers are exactly who I would recommend this book to; it has an excellent amount of information for the time and effort it takes to digest.


I had a few other thoughts about this book that were set aside for the sake of flow. You can find them in the second part of this book review.

Tuesday, 19 September 2017

Advice on Selecting the Right Journal

When I try to get something published in a journal, it's for the prestige and implied proof of quality.
Otherwise, if I wanted to write something and get attention for the idea quickly, I could write a blog post like this.

As such, I aim for a balance between the perceived importance of the journal and the chance of acceptance.

There are some obscure journals with few potential contributors that would accept almost any research paper given to them, but these journals typically have few readers, and the research your work will be compiled with may be of poor quality. The worst of these are 'predatory' journals, which charge substantial fees for publication and promise very quick acceptance.

The fee isn't the problem (many open access journals charge a fee to authors instead of readers); the problem is that any garbage that looks like research will be accepted by these journals, and potential readers may assume your work is also garbage just from the journal name. In short, by aiming too low, you don't get the prestige that publishing provides.

On the other hand, top tier journals (Science, Nature) receive so many articles that even work that is truly groundbreaking only has a small chance of being accepted.

One popular measure for the prestige of a journal is 'impact factor'.

Impact factors measure, with various adjustments, the average number of times that articles published in a journal are cited by papers in other journals. The general assumption is that articles that receive more citations are more important or have had a larger impact on science.
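As a rough sketch, the common two-year version of the calculation for a journal in year Y looks like the following; the 'various adjustments' differ between providers, so treat this as illustrative rather than definitive:

\[
\mathrm{IF}_{Y} \approx \frac{\text{citations received in year } Y \text{ by items published in years } Y-1 \text{ and } Y-2}{\text{number of citable items published in years } Y-1 \text{ and } Y-2}
\]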

Articles get cited for several reasons, including as an acknowledgement that the work is related or useful in the creation of the citing article. There are some political reasons for citation as well. Politics aside, some fields give, and receive, citations much more densely. Some types of papers, such as meta-studies, can end up receiving many citations for essentially summarizing previous work.
Personally, I feel that impact factor is an approximation of importance, but not a very good one; I doubt any single measure could summarize importance though.

Here's a good exercise: pick a journal, read or skim a few articles in it, and ask yourself:

Q: Is the quality of this work comparable to my own?  

Is it so well done that you would be wasting your time by submitting? Is it so bad that you would not want your work to be associated with it?

Avoid being too humble; a rejection isn't the end of the world, and you don't need to settle right away. Even if you get rejected, and you likely will, you may also receive excellent feedback from reviewers who have, hopefully, scrutinized your work extensively. Use rejections to improve the work, and submit again quickly! Start at the highest-tier journal that you have a reasonable chance at, and work your way down if/when you get rejected.

Also check the quality of the writing, and not just the science. This will get you used to the expectations of scientific writing in your field, and you may be able to use it to improve your own paper before submission. If there are more than a couple of typos and grammar errors in a published paper, then it's possible that the journal doesn't do its own round of copy editing. Copy editing is a service some journals do to improve their own legitimacy, as well as to add value to your paper. Having said that, fix all the errors you can find first rather than sending a rough draft; the paper you submit should appear perfect to you.


Q: Are the delays between submission, acceptance and publication reasonable? 

Many articles will include the date of submission and of acceptance, and the publication date can be inferred. Typical delays vary greatly between fields, but as a simple rule, more than six months between these dates is an indication of a slow review process or of multiple revision rounds. Treat these long delays as red flags.

It doesn't take that many hours of work to review and edit a paper, so the length of the delay has little to do with the time put into the review process. If you see other papers with long delays, there will be a delay for you too, and if you end up rejected, you've wasted more time waiting than you needed to.

You can always contact the editor and ask how long their backlog is; you may end up saving yourself a lot of waiting, and the editor some effort.

If you're early in your research career, speed of results should take priority over depth. Every year spent as a graduate student potentially costs you a year getting an industry or junior faculty wage. You will have time to do larger, longer work when your career is settled.

Q: Is my work within the scope of the journal? 

Is it relevant to the stated range of this journal's interests?

A good test for relevance is to find an article already published in the journal that you could put into the literature section of your own article, where you list similar work that has been done. Is there an article you could cite that would fit into your paper without being forced? If so, that's a good indication that your work is similar enough to what has already been published in the journal.

Don't just see if the citation would fit, actually put it in your paper. This is a signal to any reviewer that your work belongs in the journal. There's even a chance that one of the reviewers was the author of the paper you cited. This is because previously published authors are often contacted to be reviewers, especially if the work is relevant.

Relevance can work for you tremendously. If a journal is focused on a small part of a field, in other words has a narrow scope, then there may be less competition for space in the journal. This lack of competition doesn't present the same implied quality problems as a low-tier or predatory journal, because there is still a filter; it just happens that relevance is a strict part of that journal's filter. Finding a journal with a narrow scope that includes your work is fantastic, but by its very nature, such a journal is hard to find.

Finally,

You can't publish a paper in more than one place. You can write a second paper that's derivative of the first, but you can't take what's essentially the same work and put it in two journals. In fact, it's considered extremely poor form to even submit a paper to a second journal before you get a decision from the first journal. This is why turnaround time is important. It's also another reason to avoid predatory journals - once you put your paper in a trash tier journal, it's stuck there!

Tuesday, 12 September 2017

Left-brain creativity

One commonly cited way to improve or maintain mental health is to do something creative, such as painting, drawing, writing, dancing, making music, knitting or cooking. So what's the strategy to gain these benefits if you're not creatively inclined in these or any similar ways? What if you're, say, a statistician or a software engineer?

This post is about acknowledging other, more mathematical means of being creative that aren't generally thought of as traditionally creative. I'm calling these 'left-brain' creative means, which is reductionist, but easy to convey. Whether any of these are artistic in any way is irrelevant.

Martin Gardner was a master of left-brain creativity. He wrote books of mathematical puzzles and novelties, including a version of mini chess mentioned here. Making these challenges was absolutely creative. I would argue that the process of solving these puzzles would also be creative because it requires imagination and decisions that are novel to the solver.

Reiner Knizia has a PhD in mathematics and makes board games for a living. His visual artwork is rudimentary, which is fine because it's meant merely as dressing for the real creative work of abstract sets of rules meant to inspire clever player behaviour.

I mention these two first because I have been disparaged before for being un-creative when I would rely on similar abstractions as outlets. For instance, when told by a (now ex-) girlfriend to go try to do something creative, I started working on a farming game I had envisioned, and decided to start with a list of livestock and a draft of their prices in the game. This didn't impress her.

With building toys, I usually made abstract patterns rather than anything that would traditionally have been considered creative. With Lego / Mega Blocks, my most memorable builds were a giant hollow box for holding hockey pucks, and an extremely delicate staircase. With K'nex, my work was always abstract shapes made in an ad-hoc manner.

I enjoy the concept of building toys a lot more than actually building anything with them. It's a dream of mine that Capsella toys will make a return through 3D printing. Capsella was a toy made of clear plastic capsules with gears inside. It would be difficult, but doable.

There's also this game called Gravity Maze, in which the goal is to drop a marble in one tower of cubes and have it land in a target cube. The game comes with a set of puzzle cards which include a starting configuration of towers and a set of towers that you need to add to finish the maze. The game only comes with 60 such puzzle cards and additional ones aren't available for sale. On one vacation, I took it upon myself to draft a program that could randomly generate configurations and see if they were solutions. It's still in a notebook somewhere. Is this creative? It feels better if I think of it that way; doing this gave me the same joy I imagine someone gets from more traditional creative exploits.

On another vacation, I wrote a proof of concept for Gaussian elimination of a 4x4 matrix where the matrix was populated with fractions. The point was to write the entries of the resulting matrix each as a single fraction. That way, an ASIC (Application Specific Integrated Circuit) could later be made to solve such a matrix in fractions, which avoids the computationally slow operation of division, which is typically done through iterated subtraction. Was that creative? It felt a lot like doodling or sketching to decide upon this and solve it.
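For the curious, here is a minimal sketch of the same idea in Python using the standard fractions module. My original was pencil-and-paper, so the function name and the example matrix below are made up for illustration; the point is only that every entry stays an exact single fraction throughout, with no decimal rounding.

from fractions import Fraction

def eliminate(matrix):
    """Gaussian elimination with exact Fraction arithmetic (no rounding).
    Returns the row-reduced matrix; every entry remains a single fraction."""
    m = [[Fraction(x) for x in row] for row in matrix]  # copy and convert
    rows, cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(cols):
        if pivot_row >= rows:
            break
        # find a row with a nonzero entry in this column to use as the pivot
        pivot = next((r for r in range(pivot_row, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        # scale the pivot row so the pivot entry becomes 1
        scale = m[pivot_row][col]
        m[pivot_row] = [x / scale for x in m[pivot_row]]
        # clear this column in every other row
        for r in range(rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

# A made-up 4x4 system with fractional entries, augmented with a right-hand side.
# After elimination, the last column holds the exact fractional solution
# (assuming the system is nonsingular).
example = [
    [Fraction(1, 2), 1, 0, 3, 1],
    [2, Fraction(1, 3), 1, 0, 0],
    [0, 1, Fraction(3, 4), 2, Fraction(1, 2)],
    [1, 0, 2, Fraction(1, 5), 3],
]
for row in eliminate(example):
    print([str(x) for x in row])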

I was a big fan of Dungeons and Dragons, and later Rifts and GURPS, when I was younger. I almost never played roleplaying games, but I spent a lot of time reading rulebooks and compendiums, and writing my own material such as new monsters. To someone expecting creative work to look more like art, this probably resembled accounting.

This clearly isn't a new discovery to a lot of people. Just looking at websites for chess variants and chess puzzles, along with the large custom-card-making subset of the Magic: The Gathering community, tells me that much. There are many people that seem to enjoy making up challenges and rulesets and get creative joy out of it.

If there's a thesis to this post, it's that if you're not inclined to make what would be typically considered art, you can still reap the mental health benefits of being creative through more 'left-brain' means. Other activities worth mentioning, but not from personal experience, include making crossword puzzles, nurikabe puzzles, maps, and fractals. Do something that involves building or making and a lot of small decisions, and don't worry about whether it's expressive, artistic, or traditionally considered creative.

Sunday, 13 August 2017

Sports questions - Speculation, RISP, and PEDs


What will popular sports look like in the year 2030? (All sports)

Technology is rapidly making new sports possible. 

Will improved cameras and laser gates make races with running starts viable? I would love to know how much faster the 100 metre dash can be without starting from a standstill.

Will drone racing or drone hunting take flight?  Will e-sports continue their growth and penetration into the mainstream?

Will self driving cars start competing in Nascar? Formula One? Rally car racing? Will all of these racing formats survive or maintain their scale in the next 15-20 years?

Demographics are opening new possibilities too. 

Shrinking populations and urbanization are leaving many otherwise livable and usable buildings abandoned. Will terrain-based sports like airsoft and paintball take off with the abundance of good locations? Will GoPro and similar robust, portable cameras make such pursuits into spectator sports?

Will we see a shift of focus towards women in sport, following the trend of tennis? Will we see mixed-sex competition in sports where size and muscle mass mean less?

Will extreme sports see a revival, led by Red Bull sponsored events like Crashed Ice and Flugtag?

What sports will decline? 

Will UFC mixed martial-arts continue to eat into the viewing market share of WWE wrestling? Why didn't Texas Hold 'em keep its hold on the public? Could the NHL (and the KHL) mismanage ice hockey into a fringe sport? Can American Football maintain its popularity in the face of growing concern over brain injury? Will American football adapt? Can golf maintain its popularity given its cost?

What about stadiums?

Instead of building stadiums for specific sports, or a limited set of sports, will new sports emerge to fit into already made stadiums? Will existing sports start to use stadiums that were built for other purposes, such as softball in a baseball stadium, or soccer football in an American football stadium?




On RISP, Runners In Scoring Position. (Baseball)

Batters do better (or pitchers do worse) when there are runners in scoring position. Why? Is it just a result of skill auto-correlation, such as a pitcher's tendency to do poorly in streaks of batters, or is it something else? Is it the distraction to the pitcher of having a runner who could steal a base or read signs? Is it the effect of the fielders having to do more than one job at once?

A more measurable or actionable question: is the RISP advantage greater for certain batters? For example, does a player with a reputation for stealing bases give a larger 'RISP bonus' than another with ordinary base-running? Does the effect add with multiple runners? Does it change with base? Does it change with pickoff attempts? How much of this is balks being drawn?

Similarly, how should pickoff attempts be counted with regards to pitching load? My guess is that they have the effect of about half a pitch in terms of performance in that game and in that plate appearance.


What performance enhancers are 'fair'? (All sports)

A lot of drugs are banned from a lot of sports, but why? My assumption is that it makes the feats of one era comparable to another. We can take Usain Bolt's running records and compare them to Donovan Bailey's records from the '90s, and say with little or no argument that Bolt at his peak was faster than Bailey at his. The difference in their 100 metre dash times can be isolated to the runners and not the chemical technology of their respective eras.

My assumption comes from the qualifying statements in hockey and baseball about different eras of each sport defined by seemingly minor changes in the equipment or rules of the game. Hitting feats from the 1990s seasons of MLB baseball are qualified with comments about steroid use by superstar hitters. Steroid use was allowed at the time, I presume on the basis that every player had access to the steroids.

Why is chemical technology seen as unfair while other technology, like improved running shoes, is seen as fair? Probably because of the hidden nature of drugs, and the related difficulty of directly regulating the 'equipment' used. It's much simpler to enforce rules about the volume of a golf club face, or the curvature of a hockey stick, than about an acceptable dosage of steroids.

Things have gotten confusing lately.

Oscar Pistorius, who had both his legs amputated below the knee as an infant, was until recently a competitive Paralympic sprinter. He used springy blades, described here, to run. He also wanted to compete in general sprinting competition but was barred from it when it was found that his prosthetic feet were more efficient for running than baseline human feet. So, even though Paralympic competition was designed to provide viable competition to those with physical disabilities, the technology used to mitigate Pistorius's disability was deemed too effective.

In January 2017, the IOC (International Olympic Committee) released the results of testing they had done on various drugs for performance enhancement in, of all things, chess. They found that caffeine, Ritalin, and Adderall all improved performance in double-blind tests. So, if chess ever becomes an Olympic sport, should these drugs be banned and tested for? What happens if someone has a prescription for Ritalin; do they have to go without it to compete?


Things are about to get a lot more confusing.

CRISPR is a technology that may have the potential to arbitrarily rewrite genetic code. If it is applied to a human embryo to specialize the resulting human for a particular sport, what should the rules surrounding that be? Genetic editing seems like drugs and blood doping in that it's a hidden technology that would be very complicated to regulate other than to ban completely. It would be at least intended to be performance enhancing, and not every competitor would have access to the technology, at least not at first.

But changing the genetics of a person is not adding something foreign to the person; it is changing who that person is, and who that person will be through their entire life. Should we ban someone from competition for being 'naturally' too good at something as a result of a decision made before that person's birth?

Or, do we separate competitors into 'baseline' and 'enhanced' humans? This is starting to sound way more like a dystopian, racist dog show (with terms like 'best in breed') than the 'faster, higher, stronger' tone I was aiming for. It's something we collectively need to think about though, not just for sport but for all human interaction going forward.

Let's close with this thought on the subject by speedrunner Narcissa Wright: "All the categories are arbitrary".

Wednesday, 19 July 2017

Writing Seminar on Short Scientific Pieces

The following is the first of five seminars I am giving this and next week on statistical writing. 

About half of the material here has appeared in previous blog posts on making survey questions.


------------

Composing short pieces. A workshop about scientific writing, using the skill of survey writing as a catalyst.



Why start with surveys / questionnaires?

1. Survey writing tends to be left out of a classical statistics program, and is instead left to the social sciences. It is, however, a skill that is asked of statisticians because of the way it integrates into design of experiments.

2. Surveys have the potential to be very short, therefore it should not be an overwhelming task to create one.

3. Writing a proper survey absolutely requires that the writer imagine an intended reader and how someone else might understand what is written.

A survey, like most other writing that you as statisticians, graduate students, and/or faculty will be doing, will be read by others who know less about the subject at hand than you do. You are writing from the perspective of an expert.

This perspective is a major shift from much of the writing done as an undergraduate. Aside from peer evaluations, most undergrad writing is done to demonstrate understanding of a topic to someone who knows that topic better than you. That often reduces to using specific key terms and phrases that the grader, a professor or teaching assistant, is looking for, and making complete sentences. If any key parts that a casual reader would need to understand the topic are missing from that work, then the grader can fill in the missing parts with their own understanding.

When you write a research report, a scientific article, or a thesis, most people who read the material will do so with the intent of learning something from it. That means they won't be able to fill in any missing critical information with knowledge, because you have the knowledge and they don't. Even worse, they may fill in missing parts with incorrect knowledge.

In the case of a survey, even though respondents are the ones answering the questions, the burden of being understood rests with the agent asking the questions. As the survey writer, YOU are the one that knows the variables that you want to measure.

So even though this workshop is titled 'composing short pieces', a large amount of time will be spent on survey questions. Much of this applies to all scientific writing.

Writing Better Surveys

Tip 1. Make sure your questions are answerable. Anticipate cases where questions may not be answerable. For example, a question about family medical history should include an 'unknown' response for adoptees and others who wouldn't know. If someone has difficulty answering a survey question, that frustration lingers and they may guess a response, guess future responses, or quit entirely. Adding a not-applicable response, an open-ended 'other' option, or a means to skip a question are all ways to mitigate this problem.



Tip 2. Avoid logical negatives like 'not', 'against', or 'isn't' when possible. Some readers will fail to see the word 'not', and some will get confused by the logic and will answer a question contrary to their intended answer. If logical negatives are unavoidable, highlight them in BOLD, LARGE AND CAPITAL.


Tip 3. Minimize knowledge assumptions. Not everyone knows what initialisms like FBI or NHL stand for. Not everyone knows what the word 'initialism' means. Lower the language barrier by using the simplest language possible without losing meaning. Use full names like National Hockey League, or define the short forms and repeat the definitions regularly if the terms are used very often.


Tip 4. If a section of your survey, such as demographic questions, is not obviously related to the context of the rest of the survey, preface that section with a reason why you are asking those questions. Respondents may otherwise resent being asked questions they perceive as irrelevant.


Tip 5. Each question comes at a cost out of a respondent's attention budget. Don't include questions haphazardly or allow other researchers to piggyback questions onto your survey. Every increase in survey length increases the risk of missed or invalid answers. Engagement will drop off over time. See Tip 17.


Tip 6. Be specific about your questions, don't leave them open to interpretation. Minimize words with context specific definitions like 'framework', and avoid slang and non-standard language. Provide definitions for anything that could be ambiguous. This includes time frames and frequencies. For example, instead of 'very long' or 'often', use '3 months' or 'five or more times per day'.


Tip 7. Base questions on specific time frames like 'In the past week how many hours have you...', as opposed to imagined time frames like 'In a typical week how many hours have you...'. The random noise involved in people doing that activity more or less than typical should balance out in your sample. Time frames should be long enough to include relevant events and short enough to recall.




Exercise: Writing with exactness (also called exactitude)

Part 1 of 2: Consider
Why is it “in the last week” and not “in a typical week”?

If a question asks something like “in a typical week, how many alcoholic drinks have you consumed?”

- Respondents will tend to over-average and discount rare events.
- Respondents are invited to idealize their week, which may increase the potential for social desirability bias.
- Every respondent will draw their ‘typical week’ from a different time frame (imagined or real). However, “in the last week” fixes the same concrete time frame for every respondent.

Part 2 of 2: Create

Put yourself in the shoes of...

...wait, let me restart, that was an idiom. (See Tip 21)

Consider the perspective of a stakeholder in a survey. A stakeholder could be anyone involved in the survey or directly benefiting from what it reveals, such as a respondent, the surveying firm or company, or the client that paid for the survey. Discuss amongst your group the different consequences of choosing to ask about a respondent's place of residence in one of two ways:

Version 1:
Where is your main place of residence?

Version 2:
What was your place of residence on July 14, 2017?




Exercise: Sizes of Time Frames

Even if a human respondent is trying their best to be honest, memory is limited. Rare or noteworthy events may be recalled for years, but more mundane things won't be.

Discuss the benefits and drawbacks (the good and bad aspects) of the following three survey questions.

Version 1:
In the last week, how many movies did you see in a theater?

Version 2:
In the last year, how many movies did you see in a theater?

Version 3:
In the last ten years, how many movies did you see in a theater?



Tip 8. For sensitive questions (drug use, trauma, illegal activity), start with the negative or less socially desirable answers first and move towards the milder ones. That gives respondents a comparative frame of reference that makes their own response seem less undesirable.

Tip 9. Pilot your questions on potential respondents. If the survey is for an undergrad course, have some undergrads answer and critique the survey before a full release. Re-evaluate any questions that get skipped in the pilot. Remember, if you could predict the responses you will get from a survey, you wouldn't need to do the survey at all.

Tip 10. Hypothesize first, then determine the analysis and data format you'll need, and THEN write or find your questions.

Tip 11. Some numerical responses, like age and income, are likely to be rounded. Some surveys ask such questions as categories instead of open-response numbers, but information is lost this way. There are statistical methods to mitigate both problems, but only if you acknowledge the problems first.

Tip 12. Match your numerical categories to the respondent population. For example, if you are asking the age of respondents in a university class, use categories like 18 or younger, 19-20, 21-22, 23-25, 26 or older. These categories would not be appropriate for a general population survey.

Tip 13. For pick-one category (i.e. multiple choice, polytomous) responses, including numerical categories, make sure no categories overlap (i.e. mutually exclusive), and that all possible values are covered (i.e. exhaustive.)

Tip 14. When measuring a complex psychometric variable (e.g. depression), try to find a set of questions that has already been tested for reliability on a comparable population (e.g. CES-D). Otherwise, consult a psychometrics specialist. Reliability refers to the degree to which responses to a set of questions 'move together', or are measuring the same thing. Reliability can be computed after the survey is done.
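For instance, one common reliability measure is Cronbach's alpha, which can be computed from the item responses once the data are in. A minimal Python sketch; the function and the response data below are made up for illustration:

import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) array of scores:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 5 respondents answering 4 related Likert items
responses = [[4, 5, 4, 4],
             [2, 2, 3, 2],
             [5, 5, 4, 5],
             [3, 3, 3, 4],
             [1, 2, 2, 1]]
print(round(cronbach_alpha(responses), 2))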

Exercise - Synonyms
Pick an informative word from a short passage (e.g. Tip 14)
1. Find a synonym of that word.
2. Write a definition of that new word.
3. Consider how using the new word changes the sentence.
Example:
Replace the verb "compete" with "contest".
Contest (verb): To fight to control or hold.
"Share of the smartphone market was hotly contested."
Using "contest" implies more of a struggle or a fight than the original word "compete".


Tip 15. Ordinal answers in which a neutral answer is possible should include one. This prevents neutral people from guessing. However, not every ordinal answer will have a meaningful neutral response.


Tip 16. Answers that are degrees between opposites should be balanced. For each possible response, its opposite should also be included. For example, strongly agree / somewhat agree / no opinion / somewhat disagree / strongly disagree is a balanced scale.


Tip 17. Limit mental overhead - the amount of information that people need to keep in mind at the same time in order to answer your question. Try to limit the list of possible responses to 5-7 items. When this isn't possible, don't ask people to interact with every item. People aren't going to be able to rank 10 different objects 1st through 10th meaningfully, but they will be able to list the top or bottom 2-3. An ordered-response question rarely needs more than 5 levels from agree to disagree. See Tip 5.



Exercise – Information Density
Part 1 of 2: Consider

Consider the following two sentences. They convey the same information, but one version packs all that information into a single sentence with one independent clause. The other version splits this into two sentences and three independent clauses.


Version 1
“Reefs of Silurian age are of great interest. These are found in the Michigan basin, and they are pinnacle reefs.”


Version 2
“Pinnacle reefs of Silurian age in the Michigan basin are of great interest.”
(Inspiring Source: The Chicago Guide to Communicating Science, 2nd ed, page 46)


Each version is appropriate, but for different situations. When words are at a premium, such as when writing an abstract, giving a talk with very limited time, or giving priming information for a survey question, the shorter version is typically appropriate. However, readers and listeners, especially those who speak English as an additional language, will have a harder time parsing the shorter version, even if it takes less time to read or say.
The operative difference between the versions is information density. The longer version requires less effort to read because there are fewer possibilities for each word to modify or interact with the other words in its clause. This is done by adding syntax words that convey no additional information on their own.


Part 2 of 2: Create

On your own, take the following sentence and make a less information-dense version of it by breaking it into smaller sentences.

“Data transformations are commonly-used tools that can serve many functions in quantitative analysis of data, including improving normality of a distribution and equalizing variance to meet assumptions and improve effect sizes, thus constituting important aspects of data cleaning and preparing for your statistical analyses.“

Now take this following passage and condense it into a single sentence with greater information density.

“Many of us in the social sciences deal with data that do not conform to assumptions of normality and/or homoscedasticity/homogeneity of variance. Some research has shown that parametric tests (e.g., multiple regression, ANOVA) can be robust to modest violations of these assumptions.”

Source: Jason W. Osborne, Improving your data transformations: Applying the Box-Cox transformation, Practical Assessment, Research & Evaluation, Vol. 15, No. 12, Oct 2010.


Tip 18. Layout matters. Place every response field unambiguously next to its most relevant text. For an ordinal response question, make sure the ordering structure is apparent by lining up all the answers along one line or column of the page.


Tip 19. Randomize response order where appropriate. All else being equal, earlier responses in a list are chosen more often, especially when there are many items. To smooth out this bias, scramble the order of responses differently for each survey. This is only appropriate when responses are not ordinal. Example of an appropriate question: 'Which of the following topics in this course did you find the hardest?'
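For an online or computer-administered survey, this can be as simple as shuffling the (non-ordinal) response list independently for each respondent. A minimal Python sketch, with made-up response options:

import random

options = ["Hypothesis testing", "Regression", "Experimental design", "Bayesian methods"]

def randomized_options(options):
    """Return a fresh, independently shuffled copy of the response list for
    each respondent, so no option benefits from always appearing first."""
    shuffled = list(options)
    random.shuffle(shuffled)
    return shuffled

print(randomized_options(options))  # one respondent's ordering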


Tip 20. A missing value for a variable does not invalidate a survey. Even if the variable is used in an analysis, the value can be substituted with a set of plausible values by a method called imputation. A missing value is not as bad as a guessed value, because the uncertainty can at least be identified.
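As a toy illustration of the idea, and not a recommendation of any particular method, here is a Python sketch that fills a missing numeric answer with several plausible values drawn from the observed answers, so an analysis can be repeated across the completed copies. The function and the ages below are made up for illustration:

import random

def impute_plausible(values, n_draws=5, seed=1):
    """Replace each missing (None) entry with a value drawn from the observed
    answers, producing several completed copies of the data (a crude stand-in
    for proper multiple imputation)."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    completed = []
    for _ in range(n_draws):
        completed.append([v if v is not None else rng.choice(observed) for v in values])
    return completed

# Hypothetical ages with one respondent skipping the question
ages = [19, 22, None, 21, 25, 20]
for copy in impute_plausible(ages):
    print(copy)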


Tip 21. Restrict your language to 'international English' (assuming the questions are in English). This means that idioms and local names for things should be avoided when possible. When there are two or more competing names for a thing, rather than one internationally recognized one, use all major names that are in use among your target demographic.


[As time permits, Exercise prompt: Try to figure out what 'Your potato is baking.' means without knowledge of Brazilian Portuguese]


Main Inspiration Source for tips: Fink, Arlene (1995). "How to Ask Survey Questions" - The Survey Kit Vol. 2, Sage Publications.



Digital Writing Tools
Showcase:
Hemingway, find/replace, text diff, texrendr
Caveat / Pitfall 1: Digital tools are not a substitute for judgement. In one book, every instance of the word 'mage' was to be replaced with the synonym 'wizard', according to the style guide of the publisher. (Both 'mage' and 'wizard' refer to people with magic-using abilities, but the publisher may have preferred one term over the other for internal consistency.) Rather than make a case-by-case replacement, the person responsible for the change simply used a digital 'replace all' function, changing every instance of 'mage' to 'wizard'. Unfortunately, this particular text also included the word 'damage', which was changed automatically to the nonsense word 'dawizard'.
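The 'dawizard' problem is avoidable if the tool, or the person driving it, replaces only whole words. A small Python sketch of the difference; the example sentence is made up:

import re

text = "The mage cast a spell, but the damage was already done."

# Naive replace-all: also rewrites the 'mage' inside 'damage'
print(text.replace("mage", "wizard"))
# -> "The wizard cast a spell, but the dawizard was already done."

# Word-boundary replace: only standalone 'mage' is changed
print(re.sub(r"\bmage\b", "wizard", text))
# -> "The wizard cast a spell, but the damage was already done."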


Caveat / Pitfall 2: Another issue with digital tools is that they can't all be depended upon to be available in their current forms forever. Microsoft is moving towards a SaaS (software as a service) model, where access to tools like Word, with its grammar check, is based on a subscription rather than a one-time fee. This means that in the future you may lose access to a tool for reasons beyond your control. Web-based tools like Hemingway carry an even greater risk, because the server for Hemingway could be shut down without any warning and leave you without access.

Also, you may need to send your writing or other material (e.g. figures, tables) to a remote server to be processed in order to use those tools. If your writing contains sensitive or confidential material, you may be breaking legal agreements with your data providers by using these tools.




Further Homework and Reading

This is based on Chapter 8, "Designing Questions", of the book Successful Surveys - Research Methods and Practice by George Gray and Neil Guppy.
Q1. Give an example of a numerical (e.g. quantitative) open-ended question and a numerical closed-ended question.
Q2. Give an example of a non-numerical (e.g. nominal, text-based) open-ended question and a non-numerical closed-ended question.
Q3. In your OWN WORDS, give two advantages and disadvantages of open-ended questions.
Q4. In your OWN WORDS, give two advantages and disadvantages of closed-ended questions.
Q5. How do field coded questions combine the features of both open- and closed-ended questions?
Q6. For what kind of surveys are open-ended questions more useful? When are they less useful?
Q7. What are five features that make for well-worded questions?
Q8. What is the name used for a survey question that asks about and focuses on two distinct things?
Q9. What is a Likert scale?
Q10. What are four things that all have important effects on how people respond to survey questions?