Palmer Penguins

Modeling (logistic regression, k-nearest neighbors, decision tree, multiclass logistic regression) with cross-validated accuracy

Published: July 27, 2020

Recorded on: 2020-07-27

Timestamps by: Eric Fletcher

Timestamps

pivot_longer, geom_histogram, facet_wrap
tidyr, ggplot2

Create a pivoted histogram plot to visualize the distribution of penguin metrics using pivot_longer, geom_histogram, and facet_wrap

geom_density, facet_wrap
ggplot2

Create a pivoted density plot to visualize the distribution of penguin metrics using geom_density and facet_wrap

geom_boxplot, facet_wrap
ggplot2

Create a pivoted boxplot to visualize the distribution of penguin metrics using geom_boxplot and facet_wrap

geom_bar
ggplot2

Create a bar plot to show how penguin species counts changed over time

geom_bar
ggplot2

Create a bar plot to show species counts per island

initial_split, training, logistic_reg, set_engine, fit, fct_lump, predict, metrics, vfold_cv, fit_resamples, collect_metrics
tidymodels, rsample, parsnip, yardstick

Create a logistic regression model to predict whether a penguin is Adelie or not using bill length, with cross-validation of metrics

initial_split, training, logistic_reg, set_engine, fit, fct_lump, predict, metrics, vfold_cv, fit_resamples, collect_metrics
tidymodels, rsample, parsnip, yardstick

Create a second logistic regression model using four predictors (bill length, bill depth, flipper length, body mass) and then compare the accuracy of both models

nearest_neighbor, initial_split, training, logistic_reg, set_engine, fit, fct_lump, predict, metrics, vfold_cv, fit_resamples, collect_metrics
tidymodels, rsample, parsnip, yardstick

Create a k-nearest neighbors model and then compare its accuracy against the logistic regression models to see which has the highest cross-validated accuracy

testing, predict, metrics
rsample, stats, yardstick

What is the accuracy of the k-nearest neighbors model on the testing holdout data?

decision_tree, set_engine
parsnip

Create a decision tree and then compare its accuracy against the previous models to see which has the highest cross-validated accuracy, plus how to extract the fitted decision tree

multinom_reg, set_engine, fit_resamples
parsnip, tune

Perform multiclass (multinomial) regression using multinom_reg

Summary of screencast
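
The modeling entries above share one workflow: split the data, define a model, resample with cross-validation, and compare accuracy. The following is a minimal sketch of that workflow for the two logistic regression models, not the code from the screencast. It assumes the tidymodels, forcats, and palmerpenguins packages are installed. The seed, the number of folds, and all object names (penguins_binary, cv_one, cv_four, cv_accuracy) are illustrative choices.

```r
library(tidymodels)      # attaches rsample, parsnip, tune, yardstick, dplyr
library(palmerpenguins)  # provides the `penguins` data frame

set.seed(2020)  # arbitrary seed so the splits are reproducible

# Binary outcome: keep the most common species (Adelie), lump the rest into "Other".
# Rows with missing measurements are dropped first.
penguins_binary <- penguins %>%
  filter(!is.na(bill_length_mm)) %>%
  mutate(species = forcats::fct_lump(species, n = 1))

# Hold out a testing set, then build cross-validation folds on the training set only
penguin_split <- initial_split(penguins_binary, strata = species)
penguin_train <- training(penguin_split)
penguin_test  <- testing(penguin_split)
folds <- vfold_cv(penguin_train, v = 10)  # 10 folds is an assumption of this sketch

log_spec <- logistic_reg() %>% set_engine("glm")

# Model 1: bill length only
cv_one <- fit_resamples(log_spec, species ~ bill_length_mm, resamples = folds)

# Model 2: all four measurements (glm may warn if the classes separate almost perfectly)
cv_four <- fit_resamples(
  log_spec,
  species ~ bill_length_mm + bill_depth_mm + flipper_length_mm + body_mass_g,
  resamples = folds
)

# Put the cross-validated accuracy of both models side by side
cv_accuracy <- bind_rows(
  collect_metrics(cv_one)  %>% mutate(model = "bill length"),
  collect_metrics(cv_four) %>% mutate(model = "four measurements")
) %>%
  filter(.metric == "accuracy") %>%
  select(model, mean, std_err)

cv_accuracy

# Accuracy on the testing holdout for a model fitted on the full training set
holdout_metrics <- fit(log_spec, species ~ bill_length_mm, data = penguin_train) %>%
  predict(new_data = penguin_test) %>%
  bind_cols(penguin_test) %>%
  metrics(truth = species, estimate = .pred_class)

holdout_metrics
```

Swapping log_spec for another parsnip specification, such as nearest_neighbor(mode = "classification") with the "kknn" engine or decision_tree(mode = "classification") with the "rpart" engine, should reuse the same folds and comparison step, provided the engine packages are installed.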