Wednesday, 8 November 2017

Writing a Resume as a Data Scientist or Statistician.

Consider the reader:

Your audience is typically NOT another data scientist. In a large company, it will be someone in a human resources department who has been told to look for certain keywords and skills. They might read your resume for 20 seconds or less.

In a small company, your resume may get (a little) more time and may be read by someone closer to your specialty and (slightly) more familiar with the jargon of your little corner of your field. The same guiding principle governs both cases: make it as easy as possible for someone to evaluate and say 'yes'.

What is this reader going to want to know? “How can this person fill the gap in my organization RIGHT NOW?”

This means, even in highly qualified personnel jobs like those of data scientists, statisticians, programmers and researchers, that the potential for long term growth within a company is not a priority, at least not at the resume-reading stage. This is a major shift from the academic world, where timelines are typically much longer.


What this means for you, specifically:

- Opportunities come regularly, so don't panic if you don't get the 'right' position the first time it is posted.

- Future plans, like the answer to the stereotypical interview question “where do you see yourself in 5 years”, are now irrelevant to many employers. They shouldn't be mentioned on a resume either. Stick to what is solid: the past and present.

- Emphasize your skills as they are right now, not where they will be in 6 months. (Avoid statements like 'I am currently studying...')

- Promising company loyalty (e.g. 'I have always wanted to work at...') in cover letters and other correspondence is a waste of time at best, and comes across as insincere at worst.


On the subject of transcript grades:

- After a certain minimum passing threshold, grades are not a good indicator of job performance.

- If you have graduated, or are on track to graduate soon, then you are already above this threshold, and no more about your grades needs to be said.

- The one exception would be that it is a good idea to mention the awards you have received related to grades. This includes the dean's list and scholarships. (e.g. “Graduated in 2017 with distinction”, “Made the Dean's List in 2015”)

Hobbies and activities from high school are irrelevant unless they are programming related or you are applying to a position in fast food. If you're reading a book titled “Writing for Statisticians”, it had better be the first case or you are severely undervaluing your labour.

Rather than talk about the grades you earned or the courses you took, describe the projects you did in these courses as experiences. Be specific and clear without relying on jargon or writing too much, and keep in mind that the reader is unlikely to be familiar with the course numbers and titles from your institution.

Example 1:

Very bad: “Took Stat 485”

Bad: “Took a course in time-series”

Good: “Analyzed a time-series dataset of the economy of Kansas state.”

Better: “Investigated time-series econometric data, and wrote an executive report.”


Example 2:

Bad: “Took a course in big data.”

Good: “Scraped, cleaned, and applied a random-forest model to police call data in a Kaggle competition.”

Also good: “Developed a model to predict crime hotspots from a JSON database from the Seattle Police Department. Presented findings in a slide deck.”


In each of the 'good' examples, the experience is written in such a way as to demonstrate as many high-value skills as possible in a limited space.

The 'good' time-series example signals that:
- You (the writer) can analyze real data.
- You are familiar with time-series data.
- You are familiar with econometric data.


The 'better' time-series example signals that:
- You (the writer) can analyze real data.
- You are familiar with time-series data.
- You are familiar with econometric data.
As well as...
- You can communicate your findings to non-specialists.


The 'good' big data example communicates that:
- You can analyze big (as in 'high volume') data.
- You can scrape data from the web, or at least an internal database.
- You can prepare and clean data.
- You can format results for a common competition platform (i.e. Kaggle).

The 'also good' big data example communicates that:
- You can analyze big (as in 'high volume') data.
- You can build predictive, actionable models.
- You can work with JSON data.
- You can disseminate your findings to non-specialists, such as experts in fields other than your own.


Use 'business language' to subtly stretch the truth and frame things more favourably. For example, use the word 'setback' instead of 'failure', or use 'leverage' instead of 'use' or 'exploit'.


Use 'action verbs' as a helpful guide to demonstrate your experience, especially as the first word of each statement of your experience. These are verbs that typically imply leadership, teamwork, or productivity skills. Such words include, but are not limited to:

(distributed, produced, created, developed, disseminated (i.e. spread), maintained, updated, cleaned, scraped, prepared, built, wrote, analyzed, coded, investigated)

MAKE SURE YOU KNOW A WORD WELL BEFORE YOU USE IT.

In your experience and even your education, try to start as many sentences as possible with one of those action words. Remember to write about what you DID and not what you DO. In other words, use the past tense for everything including your current position.

One apparent exception is when describing duties instead of actions. A popular way to describe duties is to write “responsible for...”. This sounds like present tense; however, it's short for the past tense “I was responsible for...”, which brings us neatly to the next point:


Taking advantage of assumptions and formatting:

You may have noticed some things are missing from the 'good' examples of experience. Specifically, articles and some prepositions are missing. The statements on a resume should be closer to news headlines than to complete sentences.

Everything on a resume is assumed to be about the person whose name is at the top of the resume. Statements that start with 'I was' are already longer than necessary; you and the things you have done are the topics of your resume, so it makes the most sense to include key information only, as long as it is not ambiguous. The other relevant 'what's and 'who's in a good resume are typically made clear from formatting.


Consider the following example:

“Constructed the database management system for the company”

can be shortened to

“Constructed database management system.”

while retaining all or nearly all of the meaning from a resume standpoint. In this example, the article “the” isn't necessary because the statement means the same without it. Likewise, “for the company” is redundant. Who else would you be doing this work for, if not the company? (If it was a personal skill-building exercise, you would still leave that information out; the point is that you have the skill. Why you got it is not important.)

The fewer words you use while retaining the meaning, the greater the share of those precious 20 seconds of reading time that will be spent observing that you have the requested qualifications.


------

As a footnote, this is my 100th published post, and it is also an excerpt from my upcoming interactive textbook, Writing for Statisticians, which should be available to the public on TopHat in Summer of 2018.