Coffee Ratings
Notable topics: Ridgeline plot, Pairwise correlation, Network plot, Singular value decomposition, Linear model
Recorded on: 2020-07-06
Timestamps by: Eric Fletcher
Screencast
Timestamps
Using fct_lump within count and then mutate to lump all coffee varieties together except for the most frequent ones
Create a geom_boxplot to visualize the variety and the distribution of total_cup_points
Create a geom_histogram to visualize the variety and the distribution of total_cup_points
Using fct_reorder to reorder variety by sorting it along total_cup_points in ascending order
Using summarize with across to calculate the percent of missing data (NA) for each rating variable
Create a bar chart using geom_col with fct_lump to visualize the frequency of top countries
Using pivot_longer to pivot the rating metrics from wide format to long format
Create a geom_line chart to see if the sum of the rating categories equals the total_cup_points column
Create a geom_density_ridges chart to show the distribution of ratings across each rating metric
Using summarize with mean and sd to show the average rating per metric with its standard deviation
Using pairwise_cor to find correlations amongst the rating metrics
Create a network plot to show the clustering of the rating metrics
Using widely_svd to find the biggest source of variation in the rating metrics (Singular value decomposition)
Create a geom_histogram to visualize the distribution of altitude
Using pmin to set a maximum numeric altitude value of 3000
Create a geom_point chart to visualize the correlation between altitude and quality (total_cup_points)
Using summarize with cor to show the correlation between altitude and each rating metric
Create a linear model (lm) for each rating metric, then visualize the results using a geom_line chart to show how each kilometer of altitude contributes to the score
Summary of screencast
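
The altitude steps listed above (capping with pmin, then fitting lm per kilometer) can be sketched in base R. This is a minimal illustration, not the code from the screencast: the data frame, its values, and the helper column names altitude_capped and altitude_km are invented for the example, and only one score column is modeled rather than every rating metric.

```r
# Hypothetical toy data: altitude in meters and a quality score.
# These values are made up for illustration and are not from the real dataset.
coffee <- data.frame(
  altitude_mean_meters = c(900, 1200, 1500, 1800, 2100, 190164),
  total_cup_points     = c(80, 81, 82, 83, 84, 85)
)

# pmin() takes the element-wise minimum, so any altitude above 3000 becomes 3000.
coffee$altitude_capped <- pmin(coffee$altitude_mean_meters, 3000)

# Convert to kilometers so the slope reads as "points per kilometer of altitude".
coffee$altitude_km <- coffee$altitude_capped / 1000

# Simple correlation between capped altitude and the score.
altitude_cor <- cor(coffee$altitude_capped, coffee$total_cup_points)

# Linear model: the altitude_km coefficient is the estimated change in score
# for each additional kilometer of altitude.
fit <- lm(total_cup_points ~ altitude_km, data = coffee)
slope_per_km <- coef(fit)[["altitude_km"]]
```

Capping before modeling keeps implausible altitude entries (such as the invented 190164 above) from dominating the fit. Repeating the lm call for each rating metric would give one slope per metric, which is what the geom_line chart in the screencast compares.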