Nobel Prize Winners

Notable topics: Data manipulation, Graphing for EDA (Exploratory Data Analysis)

Recorded on: 2019-05-23

Timestamps by: Alex Cookson

## Screencast

## Timestamps

Creating a stacked bar plot using geom_col and the aes function's fill argument (also bins years into decades with the integer division operator %/%)
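
A minimal sketch of the decade binning and stacked bars, assuming dplyr and ggplot2 are installed. The `prizes` data frame and its column names are made up for illustration and are not the screencast's actual dataset.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for the Nobel winners dataset
prizes <- tibble(
  prize_year = c(1901, 1905, 1912, 1918, 1923, 1929),
  category   = c("Physics", "Peace", "Physics", "Chemistry", "Peace", "Physics")
)

by_decade <- prizes %>%
  # %/% is integer division: 1905 %/% 10 is 190, and 10 * 190 is 1900
  mutate(decade = 10 * (prize_year %/% 10)) %>%
  count(decade, category)

# Mapping category to fill stacks the bars within each decade
decade_plot <- ggplot(by_decade, aes(decade, n, fill = category)) +
  geom_col()
```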

Using n_distinct function to quickly count unique years in a group
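
As a rough illustration, n_distinct can sit inside summarize next to n() to compare the number of rows with the number of unique years per group. The data and column names below are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: several prizes can share one year
prizes <- tibble(
  category   = c("Physics", "Physics", "Physics", "Peace", "Peace"),
  prize_year = c(1901, 1901, 1903, 1901, 1902)
)

years_awarded <- prizes %>%
  group_by(category) %>%
  summarize(
    n_prizes = n(),                    # number of rows in the group
    n_years  = n_distinct(prize_year)  # number of unique years in the group
  )
```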

Using distinct function and its .keep_all argument to de-duplicate data
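
A small sketch of the de-duplication pattern, with hypothetical column names. With .keep_all = TRUE, distinct keeps the first row it sees for each value and retains every other column.

```r
library(dplyr)

# Toy data: one laureate appears twice because she won two prizes
laureates <- tibble(
  full_name  = c("Marie Curie", "Marie Curie", "Albert Einstein"),
  prize_year = c(1903, 1911, 1921),
  category   = c("Physics", "Chemistry", "Physics")
)

# Without .keep_all = TRUE only the full_name column would be returned
one_row_each <- laureates %>%
  distinct(full_name, .keep_all = TRUE)
```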

Using coalesce function to replace NAs in a variable (similar to SQL's COALESCE function)
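
A minimal sketch of coalesce, which returns the first non-missing value at each position. The two country columns are assumptions for illustration, not necessarily the ones used in the screencast.

```r
library(dplyr)

# Hypothetical columns: fall back to the second when the first is NA
laureates <- tibble(
  birth_country        = c("Poland", NA, "Germany"),
  organization_country = c("France", "Switzerland", NA)
)

filled <- laureates %>%
  mutate(country = coalesce(birth_country, organization_country))
```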

Using year function from lubridate package to calculate (approx.) age of laureates at time of award
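
A sketch of the approximate age calculation, assuming a date-typed birth date column and a numeric prize year (both column names are hypothetical). The result is approximate because it ignores whether the birthday had passed by the award date.

```r
library(dplyr)
library(lubridate)

laureates <- tibble(
  full_name  = c("Marie Curie", "Albert Einstein"),
  birth_date = as.Date(c("1867-11-07", "1879-03-14")),
  prize_year = c(1903, 1921)
)

with_age <- laureates %>%
  # year() extracts the calendar year from a date
  mutate(age = prize_year - year(birth_date))
```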

Using fct_reorder function to arrange boxplot graph by the median age of winners
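
A rough sketch of the reordering step. fct_reorder sorts factor levels by a summary of another variable, and its default summary function is the median. The ages below are invented for illustration.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Invented ages, only to show the reordering
ages <- tibble(
  category = c("Physics", "Physics", "Physics", "Peace", "Peace", "Peace"),
  age      = c(30, 40, 50, 55, 60, 70)
)

ordered_ages <- ages %>%
  # Levels are sorted by the median age within each category
  mutate(category = fct_reorder(category, age))

age_plot <- ggplot(ordered_ages, aes(category, age)) +
  geom_boxplot()
```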

Defining a new variable within the count function (like doing a mutate and then a count, in a single step)
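
To illustrate, count accepts named expressions, so the two pipelines below should give the same result. The `prizes` data frame is a hypothetical stand-in.

```r
library(dplyr)

prizes <- tibble(prize_year = c(1901, 1905, 1912, 1918, 1923))

# Two steps: create the variable, then count it
two_step <- prizes %>%
  mutate(decade = 10 * (prize_year %/% 10)) %>%
  count(decade)

# One step: define the variable inside count()
one_step <- prizes %>%
  count(decade = 10 * (prize_year %/% 10))
```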

Creating a small multiples bar plot using geom_col and facet_wrap functions
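
A minimal sketch of small multiples: the same bar plot is drawn once per category by adding facet_wrap. The data and column names are assumptions for illustration.

```r
library(dplyr)
library(ggplot2)

prizes <- tibble(
  prize_year = c(1901, 1905, 1912, 1918, 1923, 1929),
  category   = c("Physics", "Peace", "Physics", "Peace", "Peace", "Physics")
)

per_decade <- prizes %>%
  count(category, decade = 10 * (prize_year %/% 10))

# One panel per category instead of stacking categories in a single panel
small_multiples <- ggplot(per_decade, aes(decade, n)) +
  geom_col() +
  facet_wrap(~ category)
```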

Importing income data from WDI package to explore relationship between high/low income countries and winners

Using fct_relevel to change the levels of a categorical income variable (e.g., "Upper middle income") so that the ordering makes sense
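
A sketch of the releveling, using the four World Bank income group labels. By default a factor's levels are sorted alphabetically, which does not follow income order, so the desired order is given explicitly.

```r
library(forcats)

income <- factor(c(
  "High income", "Low income", "Upper middle income", "Lower middle income"
))

# Put the levels in order from lowest to highest income
income_ordered <- fct_relevel(
  income,
  "Low income", "Lower middle income", "Upper middle income", "High income"
)
```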

Starting to explore new dataset of Nobel laureate publications

Taking the mean of a subset of data without needing to fully filter the data beforehand
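
One common way to do this, which may differ from the exact code in the screencast, is to subset the vector with a logical condition inside summarize. The `papers` data frame and its columns are hypothetical.

```r
library(dplyr)

papers <- tibble(
  laureate         = c("A", "A", "A", "B", "B"),
  is_prize_winning = c(FALSE, TRUE, FALSE, TRUE, FALSE),
  pub_year         = c(1950, 1955, 1960, 1970, 1980)
)

summary_by_laureate <- papers %>%
  group_by(laureate) %>%
  summarize(
    n_papers          = n(),             # still counts every paper
    mean_year_all     = mean(pub_year),  # mean over all rows
    # Mean over only the prize-winning rows, with no filter() beforehand
    mean_year_winning = mean(pub_year[is_prize_winning])
  )
```

Because the rows are never filtered out, summaries of the full group and of the subset can sit side by side in the same result.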

Using rank function and its ties.method argument to add the ordinal number of a laureate's publication (e.g., 1st paper, 2nd paper)
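
A minimal sketch of numbering each laureate's papers, with hypothetical data. Setting ties.method = "first" breaks ties by order of appearance, so two papers from the same year still get different numbers.

```r
library(dplyr)

papers <- tibble(
  laureate = c("A", "A", "A", "B", "B"),
  pub_year = c(1960, 1950, 1950, 1980, 1970)
)

ranked <- papers %>%
  group_by(laureate) %>%
  # The default ties.method ("average") would give both 1950 papers rank 1.5
  mutate(paper_number = rank(pub_year, ties.method = "first")) %>%
  ungroup()
```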

Lots of playing around with exploratory histograms (geom_histogram)

Discussion of right-censoring as an issue (people who have won the Nobel Prize but still have active careers)

Summary of screencast