Car Fuel Efficiency

Notable topics: Natural splines for regression

Recorded on: 2019-10-14

Timestamps by: Alex Cookson

## Screencast

## Timestamps

Using the select, sort, and colnames functions to sort variables in alphabetical order
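A minimal sketch of the column-sorting idiom. The `cars` tibble built from `mtcars` is a stand-in chosen for illustration, not the dataset used in the screencast.

```r
library(dplyr)

# Stand-in data (an assumption): four columns from the built-in mtcars dataset.
cars <- as_tibble(mtcars) %>% select(mpg, wt, cyl, hp)

# colnames() returns the names, sort() orders them alphabetically,
# and select() reorders the columns to match.
cars_sorted <- cars %>% select(sort(colnames(cars)))

colnames(cars_sorted)
```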

Adding geom_abline for y = x to a scatter plot for comparison
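A rough sketch of the reference-line idea, assuming ggplot2's built-in `mpg` dataset as a stand-in and city versus highway mileage as the two axes (both are assumptions, not taken from the screencast).

```r
library(ggplot2)

# Stand-in data (an assumption): ggplot2's built-in mpg dataset.
p <- ggplot(mpg, aes(cty, hwy)) +
  geom_point() +
  # Reference line y = x: points above it have hwy greater than cty.
  geom_abline(slope = 1, intercept = 0, color = "red")

p
```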

Visualising using geom_boxplot for mpg by vehicle class (size of car)

Start of explanation of prediction goals

Creating train and test sets, along with a trick using the sample_frac function to randomly rearrange all rows in a dataset
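The shuffle trick might look like the sketch below. The 80/20 split ratio, the seed, and the use of ggplot2's `mpg` data are assumptions made for illustration.

```r
library(dplyr)
library(ggplot2)  # used only for its built-in mpg dataset

set.seed(2019)  # arbitrary seed so the shuffle is reproducible

# sample_frac(1) draws 100% of the rows without replacement,
# which returns every row exactly once in random order.
shuffled <- mpg %>% sample_frac(1)

# Hypothetical 80/20 split: the first 80% of shuffled rows train, the rest test.
n_train <- floor(0.8 * nrow(shuffled))
train <- shuffled %>% slice(1:n_train)
test  <- shuffled %>% slice((n_train + 1):n())
```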

First step of developing linear model: visually adding geom_smooth

Using augment function to add extra variables from model to original dataset (fitted values and residuals, especially)
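As a sketch of what augment returns, assuming a simple linear model of highway mileage on engine displacement fitted to ggplot2's `mpg` data (the model and data are stand-ins):

```r
library(broom)
library(ggplot2)  # used only for its built-in mpg dataset

# Hypothetical model: highway mileage as a linear function of displacement.
model <- lm(hwy ~ displ, data = mpg)

# augment() returns the original rows plus model columns prefixed with a dot,
# including .fitted (predicted values) and .resid (residuals).
augmented <- augment(model, data = mpg)

head(augmented[, c("displ", "hwy", ".fitted", ".resid")])
```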

Creating residuals plot and explaining what you want and don't want to see

Explanation of splines

Visualising effect of regressing using natural splines

Creating a tibble to test different degrees of freedom (1:10) for natural splines

Using unnest function to get tidy versions of different models
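One way the fit-many-models pattern could be written is sketched below. The column name `spline_df`, the formula, and the `mpg` stand-in data are assumptions; only the 1:10 range and the use of natural splines, augment, and unnest come from the notes above.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)
library(splines)
library(ggplot2)  # used only for its built-in mpg dataset

# One row per candidate number of spline degrees of freedom,
# with the fitted model stored in a list column.
spline_models <- tibble(spline_df = 1:10) %>%
  mutate(model = map(spline_df, ~ lm(hwy ~ ns(displ, df = .x), data = mpg)))

# augment() each model, then unnest() so every fitted value from every
# model ends up in one tidy data frame keyed by spline_df.
spline_fits <- spline_models %>%
  mutate(augmented = map(model, augment, data = mpg)) %>%
  select(spline_df, augmented) %>%
  unnest(augmented)
```

The resulting data frame can be passed straight to ggplot with `color = factor(spline_df)` to overlay the fitted curves.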

Visualising fitted values of all 6 different models at the same time

Investigating whether the model got "better" as we added degrees of freedom to the natural splines, using the glance function
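A sketch of the model-level comparison, under the same assumptions as before (`mpg` stand-in data, hypothetical formula). The degrees-of-freedom column is named `spline_df` here because glance returns its own `df` column, and the two would clash when unnesting.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)
library(splines)
library(ggplot2)  # used only for its built-in mpg dataset

# glance() gives one row of summary statistics per model
# (r.squared, adj.r.squared, AIC, and so on).
spline_glanced <- tibble(spline_df = 1:10) %>%
  mutate(model = map(spline_df, ~ lm(hwy ~ ns(displ, df = .x), data = mpg)),
         glanced = map(model, glance)) %>%
  select(spline_df, glanced) %>%
  unnest(glanced)

spline_glanced %>% select(spline_df, adj.r.squared, AIC)
```

Adjusted R-squared and AIC both penalise extra parameters, so they are more informative here than raw R-squared.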

Using ANOVA to perform a statistical test on whether natural splines as a group explain variation in MPG
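One possible form of this test is a nested-model F-test, sketched below. The two formulas, the choice of 4 degrees of freedom, and the `mpg` stand-in data are assumptions; the screencast may set the comparison up differently.

```r
library(splines)
library(ggplot2)  # used only for its built-in mpg dataset

# Hypothetical nested pair: the full model adds four natural spline terms
# for displacement on top of the reduced model.
reduced <- lm(hwy ~ cyl, data = mpg)
full    <- lm(hwy ~ cyl + ns(displ, df = 4), data = mpg)

# anova() on nested models runs a single F-test of all added terms jointly,
# i.e. whether the spline terms as a group explain additional variation.
comparison <- anova(reduced, full)
comparison
```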

Exploring collinearity of predictor variables (displacement and cylinders)
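A quick first check is the correlation between the two variables, sketched here on ggplot2's `mpg` data as a stand-in (an assumption).

```r
library(ggplot2)  # used only for its built-in mpg dataset

# Pearson correlation between the two variables; a value near 1 or -1 means
# they carry largely overlapping information, which makes their individual
# coefficients hard to interpret when both are in the same model.
predictor_cor <- cor(mpg$displ, mpg$cyl)
predictor_cor
```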

Binning years into every two years using floor function
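The binning arithmetic can be sketched as follows; the example years are made up for illustration.

```r
# Hypothetical years, chosen only to show the arithmetic.
years <- c(1984, 1985, 1986, 1987, 2019)

# Divide by the bin width, round down, then multiply back:
# each year maps to the even year that starts its two-year bin.
two_year_bin <- 2 * floor(years / 2)

two_year_bin
# 1984 1984 1986 1986 2018
```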

Using summarise_at function to do quick averaging of multiple variables
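A short sketch of summarise_at, assuming ggplot2's `mpg` data and grouping by vehicle class as a stand-in (both are assumptions).

```r
library(dplyr)
library(ggplot2)  # used only for its built-in mpg dataset

# Average several columns at once: vars() picks the columns,
# and the function is applied to each one within every group.
class_averages <- mpg %>%
  group_by(class) %>%
  summarise_at(vars(cty, hwy), mean)

class_averages
```

In current dplyr, summarise_at is superseded by `summarise(across(c(cty, hwy), mean))`, which gives the same result.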