data-viz-for-people-in-a-hurry

A quick introduction to data science, designed as a workshop for nwPlus.

Data Visualization (with R) for People in a Hurry

This workshop serves as a quick and dirty introduction to data science fundamentals, with a focus on visualization. It is currently a part of nwPlus’ Workshop Series happening this fall. The title is loosely inspired by Neil deGrasse Tyson’s “Astrophysics for People in a Hurry.”


“Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.”
―Josh Wills, Director of Data Engineering at Slack

Participate

Want to participate in this workshop? Join us on November 12th at 6pm. This workshop will be hosted on Zoom. RSVP here!

Psst… interested in using this workshop? Don’t be afraid to reach out!

Pre-requisites

None! To take part in this workshop, just show up ready to learn. All coding will take place on UBC’s Syzygy servers. Access the server here. If you don’t have access to UBC’s servers, that’s no problem! You can also access the starter file with Google Colab, available here. The key point is that no environment setup is required to participate.

Setup

Getting set up with Syzygy

  1. First download the starter file, available here.
  2. Then head to UBC’s Syzygy servers.
  3. Hit File > Upload from the top context menu.
  4. Upload the starter file.
  5. Open the file!

Once the file is open, uncomment every line in the first cell, then run that cell to install the necessary dependencies. This will be covered in the workshop.

Getting set up with Google Colab

  1. Open Google Colab.
  2. Hit File > Open from the top context menu, or CTRL + O (or CMD + O if you’re on a Mac).
  3. Choose the GitHub option, and enter the following link: https://github.com/nwplus/data-viz-for-people-in-a-hurry
  4. Select the file called 0-data-viz-for-people-in-a-hurry.ipynb (it should be first in the list of files that appear).
  5. And you’re done!

Once the file is open, uncomment every line in the first cell, then run that cell to install the necessary dependencies. This will be covered in the workshop.

Here are some helpful screenshots of the steps I just described.




About

In this workshop you will…

Data set

We will be using UCI’s Breast Cancer data set, available here.

Slides

The slide deck for this workshop is available here.

Feedback

Have feedback for the workshop? Fill out this form!

Contributing

Spot an issue or have something to add? PRs are absolutely welcome. Feel free to open an issue first so we can discuss the proposed changes.