Data Visualization (with R) for People in a Hurry
This workshop serves as a quick and dirty introduction to data science fundamentals, with a focus on visualization. It is currently a part of nwPlus’ Workshop Series happening this fall. The title is loosely inspired by Neil deGrasse Tyson’s “Astrophysics for People in a Hurry.”
“Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.”
―Josh Wills, Director of Data Engineering at Slack
Participate
Want to participate in this workshop? Join us on November 12th at 6pm. This workshop will be hosted on Zoom. RSVP here!
Psst… interested in using this workshop? Don’t be afraid to reach out!
Pre-requisites
None! To take part in this workshop, just show up ready to learn. All coding will take place on UBC’s Syzygy servers. Access the server here. If you don’t have access to UBC’s servers, that’s no problem! You can also access the starter file with Google Colab, available here. The key point is that no environment setup is required to participate.
Setup
Getting set up with Syzygy
- First download the starter file, available here.
- Then, head to UBC’s Syzygy servers.
- Hit File > Upload from the top context menu.
- Upload the starter file.
- Open the file!
Once you begin, you’ll need to uncomment all of the lines in the first cell and run the first cell to install the necessary dependencies. This will be covered in the workshop.
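As a rough idea of what that first cell might look like, here is a minimal sketch in R. The package list is an assumption made for illustration (the starter file defines the real one), and missing_packages is a hypothetical helper name, not something taken from the workshop materials. The install lines are left commented out, mirroring the uncomment-then-run step described above.

```r
# Hypothetical package list: the starter file's first cell defines the real one.
required_packages <- c("tidyverse")

# Return the entries of `pkgs` that are not yet installed in this R library.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Show which of the assumed dependencies still need installing.
to_install <- missing_packages(required_packages)
print(to_install)

# Uncomment the lines below to install whatever is missing.
# if (length(to_install) > 0) {
#   install.packages(to_install)
# }
```

Checking before installing keeps the cell quick to re-run, since packages that are already present are skipped.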
Getting set up with Google Colab
- Open Google Colab.
- Hit File > Open from the top context menu, or press CTRL + O (or CMD + O if you’re on a Mac).
- Choose the GitHub option, and enter the following link: https://github.com/nwplus/data-viz-for-people-in-a-hurry
- Select the file called 0-data-viz-for-people-in-a-hurry.ipynb (it should be first in the list of files that appear).
- And you’re done!
Once you begin, you’ll need to uncomment all of the lines in the first cell and run the first cell to install the necessary dependencies. This will be covered in the workshop.
Here are some helpful screenshots of the steps I just described.
About
In this workshop you will…
- Explore the data science process from start to finish
- Learn how to create effective visualizations
- Use your new skills to predict whether a breast tumor is benign or malignant (in part 2!)
Data set
We will be using UCI’s Breast Cancer data set, available here.
Slides
The slide deck for this workshop is available here.
Feedback
Have feedback for the workshop? Fill out this form!
Contributing
Spot an issue or have something to add? PRs are absolutely welcome—feel free to make an issue first so we can discuss the proposed changes.