NYC Restaurant Inspections

Published: December 10, 2018

Notable topics: Multiple t-test models (broom package), Principal Component Analysis (PCA)

Recorded on: 2018-12-10

Timestamps by: Alex Cookson

Timestamps

separate

Separating column using separate function
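As a minimal sketch of this step, the snippet below splits one column into two with tidyr's separate. The data frame, the column names, and the " / " delimiter are assumptions made up for illustration; they are not taken from the screencast.

```r
library(tidyr)

# Hypothetical data: column name and values are invented for illustration
inspections <- data.frame(
  inspection_type = c("Cycle Inspection / Initial Inspection",
                      "Cycle Inspection / Re-inspection")
)

# Split the single column into two at the " / " delimiter
separated <- separate(
  inspections,
  inspection_type,
  into = c("inspection_program", "inspection_stage"),
  sep = " / "
)

print(separated)
```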

distinct

Taking distinct observations while keeping the remaining variables, using the distinct function with the .keep_all argument
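A small sketch of what .keep_all does, assuming a hypothetical table with one row per inspection and a restaurant identifier called camis (both invented here): distinct keeps the first row for each identifier and, with .keep_all = TRUE, retains the other columns too.

```r
library(dplyr)

# Hypothetical data: one row per inspection, several rows per restaurant
inspections <- tibble(
  camis   = c(1, 1, 2),
  cuisine = c("Pizza", "Pizza", "Thai"),
  score   = c(10, 12, 7)
)

# Keep the first row for each restaurant, along with all other columns
one_per_restaurant <- distinct(inspections, camis, .keep_all = TRUE)

print(one_per_restaurant)
```

Without .keep_all = TRUE, the result would contain only the camis column.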

nest
t.test
broom

Using broom package and nest function to perform multiple t-tests at the same time

broom

Tidying nested t-test models using broom package
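The nest, t.test, and broom steps can be sketched roughly as follows. The data are simulated and the column names are hypothetical, and the nest(data = -cuisine) syntax assumes tidyr 1.0 or later, so the code typed in the screencast may look different.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Simulated scores for two hypothetical cuisines (not the real inspection data)
set.seed(2018)
scores <- tibble(
  cuisine = rep(c("Pizza", "Thai"), each = 20),
  score   = c(rnorm(20, mean = 15, sd = 4), rnorm(20, mean = 10, sd = 4))
)

by_cuisine <- scores %>%
  nest(data = -cuisine) %>%                    # one row per cuisine, data in a list column
  mutate(
    model  = map(data, ~ t.test(.x$score)),    # one-sample t-test per cuisine
    tidied = map(model, tidy)                  # broom turns each test into a one-row tibble
  ) %>%
  unnest(tidied)                               # estimate, conf.low, conf.high become columns

print(select(by_cuisine, cuisine, estimate, conf.low, conf.high))
```

The estimate, conf.low, and conf.high columns are the ones a plot of means with confidence intervals would draw on.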

Creating TIE fighter plot of estimates of means and their confidence intervals

Recoding long descriptions using regex to remove everything after a parenthesis
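One way this kind of recoding might look, assuming the goal is to drop the opening parenthesis and everything that follows it. The example strings and the exact pattern are illustrative guesses, not the ones used in the screencast.

```r
library(stringr)

# Hypothetical descriptions, invented for illustration
descriptions <- c(
  "Evidence of mice (live or dead) in facility",
  "Food not protected"
)

# Remove a space, an opening parenthesis, and everything after it
shortened <- str_remove(descriptions, " \\(.*")

print(shortened)
```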

cut

Using cut function to manually bin data along user-specified intervals
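A short sketch of manual binning with base R's cut. The scores, break points, and labels below are chosen for illustration and are not necessarily the intervals used in the screencast.

```r
# Hypothetical inspection scores
scores <- c(5, 13, 14, 27, 28, 40)

# Bin along user-specified break points; include.lowest = TRUE keeps a score of 0
score_band <- cut(
  scores,
  breaks = c(0, 13, 27, Inf),
  labels = c("0-13", "14-27", "28+"),
  include.lowest = TRUE
)

print(table(score_band))
```

By default cut treats intervals as closed on the right, so a score of 13 falls in the first band and 14 in the second.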

Asking, "What types of violations tend to occur more in some cuisines than others?"

semi_join

Using semi_join function to get the most recent inspection of all the restaurants
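A possible shape for this step, assuming hypothetical columns camis (restaurant identifier) and inspection_date: first find each restaurant's latest date, then use semi_join as a filter that keeps only the matching rows.

```r
library(dplyr)

# Hypothetical data: one row per violation found at an inspection
inspections <- tibble(
  camis           = c(1, 1, 2, 2),
  inspection_date = as.Date(c("2018-01-05", "2018-06-01",
                              "2017-11-20", "2018-03-15")),
  violation_code  = c("04L", "10F", "02G", "08A")
)

# Latest inspection date per restaurant
most_recent <- inspections %>%
  group_by(camis) %>%
  summarise(inspection_date = max(inspection_date))

# semi_join keeps rows of inspections that have a match in most_recent,
# without adding any columns from most_recent
latest_rows <- inspections %>%
  semi_join(most_recent, by = c("camis", "inspection_date"))

print(latest_rows)
```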

Asking, "What violations tend to occur together?"

pairwise_cor
widyr

Using widyr package function pairwise_cor (pairwise correlation) to find co-occurrence of violation types
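A minimal sketch of pairwise_cor on invented data, assuming one row per restaurant-violation pair. When no value column is supplied, widyr treats each pair as present or absent, so the result is a correlation of co-occurrence.

```r
library(dplyr)
library(widyr)

# Hypothetical data: which violation types were found at which restaurant
violations <- tibble(
  camis     = c(1, 1, 2, 2, 3, 4),
  violation = c("mice", "roaches", "mice", "roaches", "mice", "cold food")
)

# Correlate violation types by the restaurants they appear in
correlations <- violations %>%
  pairwise_cor(violation, camis, sort = TRUE)

print(correlations)
```

Each row of the output pairs two violation types (item1, item2) with their correlation; positive values mean the two tend to appear at the same restaurants.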

widely_svd

Beginning of PCA (Principal Component Analysis) using widely_svd function

widely_svd

Actually typing in the widely_svd function

widely_svd

Reviewing and explaining output of widely_svd function
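To give a rough idea of the output being reviewed, here is a sketch of widely_svd on a tiny invented table. The column names and counts are assumptions, and the table lists every violation-restaurant combination explicitly so that no missing values need to be filled in.

```r
library(dplyr)
library(widyr)

# Hypothetical counts: every violation type for every restaurant
violation_counts <- tibble(
  violation = rep(c("mice", "roaches", "cold food"), each = 4),
  camis     = rep(1:4, times = 3),
  n         = c(1, 1, 1, 0,
                1, 1, 0, 0,
                0, 0, 0, 1)
)

# Singular value decomposition of the violation-by-restaurant matrix,
# returned in tidy form: one row per violation and dimension
components <- violation_counts %>%
  widely_svd(violation, camis, n)

print(components)
```

The result has the item column (here violation), a dimension column, and a value column, which is the shape that makes it easy to compare the most positive and most negative items within a dimension.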

Creating graph of opposing elements of a PCA dimension

str_sub

Shortening string using str_sub function
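For illustration, str_sub takes a start and end position, so shortening a label to a fixed width could look like this. The string and the width of 8 characters are arbitrary choices.

```r
library(stringr)

# Keep only characters 1 through 8 of a hypothetical label
short_label <- str_sub("Food not protected", 1, 8)

print(short_label)
```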

Reference to Julia Silge's PCA walkthrough using StackOverflow data: https://juliasilge.com/blog/stack-overflow-pca/