Maryland Bridges

Notable topics: Data manipulation, Map visualization

Recorded on: 2018-11-26

Timestamps by: Alex Cookson

## Timestamps

Using geom_line to create an exploratory line graph

Using the %/% operator (integer division) to bin years into decades (e.g., 1980, 1984, and 1987 would all become "1980")
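A minimal sketch of the decade binning described above, using made-up years. Note that `%/%` alone drops the last digit (1987 becomes 198), so the result is multiplied by 10 to get back to a year:

```r
# Hypothetical years, not taken from the bridges data
years <- c(1980, 1984, 1987, 1991)

# Integer-divide by 10 to drop the last digit, then scale back up
decades <- 10 * (years %/% 10)
decades
# 1980 1980 1980 1990
```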

Converting two-digit year to four-digit year (e.g., "16" becomes "2016") by adding 2000 to each one
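As a rough illustration, assuming every two-digit year falls in the 2000s (the values below are invented for the example):

```r
# Hypothetical two-digit years stored as text
yy <- c("16", "17", "09")

# Convert to integer, then add 2000 to get the four-digit year
yyyy <- as.integer(yy) + 2000L
yyyy
# 2016 2017 2009
```

This shortcut would give wrong answers for any year before 2000, so it only fits data where that assumption holds.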

Using percent_format function from scales package to get nice-looking axis labels

Using geom_col to create a nicely ordered bar/column graph

Using replace_na to replace NA values with "Other"
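A small sketch of `replace_na` applied to a character vector. The category values are hypothetical:

```r
library(tidyr)

# Hypothetical categories with missing values
material <- c("Steel", NA, "Concrete", NA)

# Swap every NA for the label "Other"
material_filled <- replace_na(material, "Other")
material_filled
# "Steel" "Other" "Concrete" "Other"
```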

Starting exploration of average daily traffic

Using comma_format function from scales package to get more readable axis labels (e.g., "1e+05" becomes "100,000")
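To show the difference in labels, here is a short sketch comparing base R's default formatting with `comma_format`. In a plot, the same function would typically be passed to the `labels` argument of an axis scale:

```r
library(scales)

# Base R falls back to scientific notation for this value
default_label <- format(1e5)
default_label
# "1e+05"

# comma_format() returns a labelling function
comma_label <- comma_format()(1e5)
comma_label
# "100,000"
```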

Using cut function to bin continuous variable into customized breaks (also does a mutate within a group_by!)
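A minimal sketch of `cut` with custom breaks. The traffic values, break points, and labels are all assumptions chosen for illustration, not the ones used in the screencast:

```r
# Hypothetical average daily traffic counts
traffic <- c(50, 500, 5000, 50000, 150000)

# Each interval is open on the left and closed on the right by default,
# so 1000 itself would land in the first bin
traffic_bin <- cut(
  traffic,
  breaks = c(0, 1000, 10000, Inf),
  labels = c("up to 1,000", "1,001 to 10,000", "over 10,000")
)
traffic_bin
```

The result is a factor, so the bins keep their order when plotted or summarised.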

Starting to make a map

Encoding a continuous variable to colour, then using scale_colour_gradient2 function to specify colours and midpoint

Specifying the trans argument (transformation) of the scale_colour_gradient2 function to get a logarithmic scale

Using str_to_title function to get values to Title Case (first letter of each word capitalized)
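For example, with an invented upper-case value:

```r
library(stringr)

# Hypothetical upper-case county name
county <- "BALTIMORE CITY"

county_title <- str_to_title(county)
county_title
# "Baltimore City"
```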

Predicting whether bridges are in "Good" condition using logistic regression (remember to specify the family argument! Dave fixes this at 52:54)

Explanation of why we should NOT be using an OLS linear regression

Using the augment function from the broom package to illustrate why a linear model is not a good fit

Specifying the type.predict argument in the augment function so that we get the actual predicted probability
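The sketch below illustrates the two points above on simulated data: setting `family` so that `glm` fits a logistic rather than a linear model, and setting `type.predict = "response"` so that `augment` returns probabilities instead of log-odds. The data frame and its column names (`yr_built`, `good`) are hypothetical stand-ins, not the screencast's data:

```r
library(broom)

set.seed(2018)

# Simulated data: newer bridges are more likely to be in good condition
bridges <- data.frame(yr_built = sample(1900:2018, 500, replace = TRUE))
bridges$good <- rbinom(500, 1, plogis((bridges$yr_built - 1970) / 20))

# Without family = "binomial", glm() defaults to a gaussian (linear) model
model <- glm(good ~ yr_built, data = bridges, family = "binomial")

# Default: .fitted is on the link (log-odds) scale
link_scale <- augment(model)

# type.predict = "response": .fitted is a probability between 0 and 1
prob_scale <- augment(model, type.predict = "response")

range(link_scale$.fitted)
range(prob_scale$.fitted)
```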

Explanation of why the sigmoidal shape of logistic regression can be a drawback

Using a cubic spline model (a type of GAM, Generalized Additive Model) as an alternative to logistic regression

Explanation of the shape that a cubic spline model can take (which logistic regression cannot)
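To illustrate the difference in shape, here is a sketch on simulated data where the probability rises and then falls. It assumes the spline is a natural cubic spline fitted with `ns()` from the splines package inside `glm`; the source does not say which function was used, and the data and the choice of 4 degrees of freedom are invented:

```r
library(splines)

set.seed(2018)
n <- 1000

# Simulated data: probability peaks around 1960 and drops on both sides
yr_built <- sample(1900:2018, n, replace = TRUE)
p_true <- plogis(1 - ((yr_built - 1960) / 25)^2)
bridges <- data.frame(yr_built, good = rbinom(n, 1, p_true))

# Plain logistic regression: predicted probability can only go one way
logit_fit <- glm(good ~ yr_built, data = bridges, family = "binomial")

# Natural cubic spline term: the curve is free to bend
spline_fit <- glm(good ~ ns(yr_built, 4), data = bridges, family = "binomial")

# Predicted probabilities across the range of years
grid <- data.frame(yr_built = 1900:2018)
grid$p_logit <- predict(logit_fit, grid, type = "response")
grid$p_spline <- predict(spline_fit, grid, type = "response")
```

Plotting `p_logit` and `p_spline` against `yr_built` should show a single monotonic S-curve for the first and a curve that can rise and fall for the second.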

Visualizing the model in a different way, using a coefficient plot

Using geom_vline function to add a red reference line to a graph

Adding confidence intervals to the coefficient plot by specifying conf.int argument of tidy function and graphing using the geom_errorbarh function
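A rough sketch of such a coefficient plot, built on a simulated model with hypothetical predictors `x1` and `x2`. It combines `tidy(conf.int = TRUE)`, `geom_errorbarh`, and a red `geom_vline` at zero:

```r
library(broom)
library(ggplot2)

set.seed(1)

# Simulated data with two made-up predictors
d <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
d$y <- rbinom(300, 1, plogis(0.8 * d$x1 - 0.5 * d$x2))

model <- glm(y ~ x1 + x2, data = d, family = "binomial")

# conf.int = TRUE adds conf.low and conf.high columns
coefs <- tidy(model, conf.int = TRUE)
coefs <- coefs[coefs$term != "(Intercept)", ]

# Point estimates with horizontal intervals and a reference line at zero
p <- ggplot(coefs, aes(estimate, term)) +
  geom_point() +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high)) +
  geom_vline(xintercept = 0, colour = "red")
```

An interval that does not cross the red line suggests the coefficient is distinguishable from zero at the chosen confidence level.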

Brief explanation of log-odds coefficients
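As a worked illustration of reading log-odds, using a made-up coefficient of 0.7: exponentiating a coefficient gives the multiplicative change in the odds, and `plogis` converts a log-odds value back to a probability.

```r
# Hypothetical coefficient on the log-odds scale
coef_log_odds <- 0.7

# exp() turns it into an odds ratio: the odds roughly double
odds_ratio <- exp(coef_log_odds)

# Log-odds of 0 correspond to even odds, i.e. a probability of 0.5
p_even <- plogis(0)

# Moving from log-odds 0 to 0.7 raises the probability above 0.5
p_shifted <- plogis(0 + coef_log_odds)
```

The same 0.7 shift changes the probability by different amounts depending on the starting point, which is why coefficients are easier to interpret as odds ratios than as probability changes.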

Summary of screencast