Government Spending on Kids

Notable topics: Data Manipulation, Functions, Embracing, Reading in Many .csv Files, Pairwise Correlation

Recorded on: 2020-09-14

Timestamps by: Eric Fletcher

## Screencast

## Timestamps

Using `geom_line` and `summarize` to visualize education spending over time: first for all states, then for individual states, then for small groups of states using `%in%`, then for random groups of size n using `%in%` and `sample` with `unique`. `fct_reorder` is used to reorder the `state` factor levels by sorting along the `inf_adj` variable.
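
As a rough illustration of the steps above, here is a minimal sketch on made-up data. The `kids_toy` table, its values, and the `n_states` and `picked` names are assumptions for illustration and are not taken from the screencast.

```r
library(dplyr)
library(ggplot2)
library(forcats)

# Toy stand-in for the real data; the values are random and purely illustrative
set.seed(2020)
kids_toy <- expand.grid(
  state = c("Alabama", "Alaska", "Arizona", "Arkansas", "California"),
  year = 1997:2016,
  stringsAsFactors = FALSE
) %>%
  mutate(inf_adj = runif(n(), 1e6, 5e6))

# Total spending per year across all states
kids_toy %>%
  group_by(year) %>%
  summarize(total = sum(inf_adj)) %>%
  ggplot(aes(year, total)) +
  geom_line()

# A random group of n states: sample() from the unique() names, filter with %in%
n_states <- 3
picked <- sample(unique(kids_toy$state), n_states)

kids_toy %>%
  filter(state %in% picked) %>%
  mutate(state = fct_reorder(state, inf_adj)) %>%  # order levels by inf_adj
  ggplot(aes(year, inf_adj, color = state)) +
  geom_line()
```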

`geom_vline` used to add a reference line for the 2009 financial crisis.

Take the previous chart and set the `inf_adj_perchild` value for the first year (`1997`) to `100%`, in order to show each state's change over time relative to `1997` rather than its absolute value. `geom_hline` is used to add a reference line at the `100%` starting point. David ends up changing the starting point from `100%` to `0%`.
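
One way to express this kind of indexing is to divide each state's values by its first-year value and subtract 1, so the first year sits at `0%`. The sketch below uses a made-up `spending` table; the `change` column name is an assumption, and the screencast's exact code may differ.

```r
library(dplyr)

# Made-up per-child spending for two hypothetical states
spending <- tibble(
  state = rep(c("A", "B"), each = 3),
  year = rep(c(1997, 1998, 1999), times = 2),
  inf_adj_perchild = c(2, 3, 4, 10, 10, 5)
)

indexed <- spending %>%
  group_by(state) %>%
  arrange(year, .by_group = TRUE) %>%
  # Relative change versus the first year (1997) within each state
  mutate(change = inf_adj_perchild / first(inf_adj_perchild) - 1) %>%
  ungroup()

indexed
```

Dropping the `- 1` would give the `100%` baseline version instead.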

`fct_reorder` with `max` used to reorder the plots in descending order based on highest peak values.

David briefly mentions the small multiples approach to analyzing data.

Create a `function` named `plot_changed_faceted` to make it easier to visualize the many other variables included in the dataset.

Create a `function` named `plot_faceted` with a `{{ y_axis }}` embracing argument. Adding this function creates two stages: one for data transformation and another for plotting.
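
A minimal sketch of what an embraced plotting helper can look like is shown here. The function body and the `toy` data are assumptions for illustration; the screencast's own `plot_faceted` may differ in its details.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: {{ y_axis }} lets the caller pass a bare column name
plot_faceted <- function(tbl, y_axis) {
  tbl %>%
    ggplot(aes(year, {{ y_axis }})) +
    geom_line() +
    facet_wrap(~ state)
}

# Made-up data for two hypothetical states
toy <- tibble(
  state = rep(c("A", "B"), each = 3),
  year = rep(1997:1999, times = 2),
  inf_adj_perchild = c(2, 3, 4, 10, 10, 5)
)

# Transformation stage first, plotting stage second
p <- toy %>%
  filter(year >= 1997) %>%
  plot_faceted(inf_adj_perchild)
```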

Use the `dir` function with `pattern` and the `purrr` package's `map_df` function to read in many different `.csv` files with GDP values for each state.
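
The pattern can be sketched as follows. To stay self-contained, the example first writes two small files to a temporary folder; the folder name, file names, and columns are made up and are not the screencast's actual GDP files.

```r
library(purrr)
library(readr)

# Create two small demo files in a temporary folder (illustrative data only)
tmp <- file.path(tempdir(), "gdp_demo")
dir.create(tmp, showWarnings = FALSE)
write_csv(data.frame(year = c(1997, 1998), gdp = c(100, 110)),
          file.path(tmp, "Alabama_gdp.csv"))
write_csv(data.frame(year = c(1997, 1998), gdp = c(50, 55)),
          file.path(tmp, "Alaska_gdp.csv"))

# dir() with pattern lists only the .csv files; full.names keeps the path
files <- dir(tmp, pattern = "\\.csv$", full.names = TRUE)

# map_df() reads each file and row-binds the results into one table
gdp <- map_df(files, read_csv, col_types = cols())
```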

Troubleshooting the `Can't combine <character> and <double> columns` error using a `function` and `mutate` with `across` and `as.numeric`.
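
This error typically appears when the same column is parsed as text in one file and as a number in another. The sketch below reproduces the idea with two made-up tables and a hypothetical `to_numeric` helper; which columns to convert is an assumption here.

```r
library(dplyr)
library(purrr)

a <- tibble(year = 1997, gdp = "100")  # gdp came in as character
b <- tibble(year = 1997, gdp = 50)     # gdp came in as double

# bind_rows(a, b) would fail because gdp is <character> in one and <double> in the other

# Hypothetical helper: coerce every column except year to numeric
to_numeric <- function(tbl) {
  tbl %>%
    mutate(across(-year, as.numeric))
}

# Apply the helper to each table before the rows are combined
combined <- map_df(list(a, b), to_numeric)
```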

Extract the state name from the filename using `extract` from `tidyr` and a regular expression.
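
As an illustration, assuming filenames shaped like `gdp/<state>_gdp.csv` (a made-up convention, not necessarily the one in the screencast), the capture group in the regular expression becomes the new `state` column:

```r
library(tidyr)
library(tibble)

# Made-up filenames for illustration
files <- tibble(filename = c("gdp/Alabama_gdp.csv", "gdp/New York_gdp.csv"))

# The (.*) group is captured into a new column named "state"
with_state <- tidyr::extract(
  files,
  filename,
  into = "state",
  regex = "gdp/(.*)_gdp\\.csv",
  remove = FALSE
)
```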

Unsuccessful attempt at importing state population data from a not-user-friendly `census.gov` dataset by skipping the first 3 rows of the Excel file.

Use `geom_col` to see which states spend the most for each child for a single variable and multiple variables using `%in%`.

Use `scale_fill_discrete` with `guide_legend(reverse = TRUE)` to change the ordering of the legend.
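
A small sketch of a stacked `geom_col` with a reversed legend, on made-up data (the `toy` table and its values are assumptions for illustration):

```r
library(ggplot2)

# Made-up per-child spending for two hypothetical states and two variables
toy <- data.frame(
  state = rep(c("A", "B"), each = 2),
  variable = rep(c("PK12ed", "highered"), times = 2),
  inf_adj_perchild = c(10, 2, 12, 3)
)

p <- ggplot(toy, aes(inf_adj_perchild, state, fill = variable)) +
  geom_col() +
  # Reverse the legend so its order matches the stacking order of the bars
  scale_fill_discrete(guide = guide_legend(reverse = TRUE))
```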

Use `geom_col`

and 'pairwise_corr`to visualize the correlation between variables across states in 2016 using`

pairwise correlation`.
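
A minimal sketch of a pairwise correlation in this shape, assuming the `widyr` package's `pairwise_cor` and a made-up long-format table (one row per state and variable):

```r
library(dplyr)
library(widyr)

# Made-up values: "highered" rises with "PK12ed", "libraries" falls
toy <- tibble(
  state = rep(c("A", "B", "C", "D"), times = 3),
  variable = rep(c("PK12ed", "highered", "libraries"), each = 4),
  inf_adj_perchild = c(1, 2, 3, 4, 2, 4, 6, 8, 4, 3, 2, 1)
)

# Correlate variables with each other, using states as the observations
cors <- toy %>%
  pairwise_cor(variable, state, inf_adj_perchild, sort = TRUE)

cors
```

The result has one row per ordered pair of variables, which can then be filtered to one variable of interest and passed to `geom_col`.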

Use `geom_point` to plot `inf_adj_perchild_PK12ed` versus `inf_adj_perchild_highered`. `geom_text` used to apply state names to each point.
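
One way to get two variables into separate columns for a scatter plot is `pivot_wider`. The sketch below assumes a made-up long-format table; the `names_prefix` choice is an assumption made so the column names match those mentioned above.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up per-child spending for three hypothetical states
toy <- tibble(
  state = rep(c("A", "B", "C"), times = 2),
  variable = rep(c("PK12ed", "highered"), each = 3),
  inf_adj_perchild = c(10, 12, 9, 2, 3, 1)
)

# One row per state, one column per variable
wide <- toy %>%
  pivot_wider(
    names_from = variable,
    values_from = inf_adj_perchild,
    names_prefix = "inf_adj_perchild_"
  )

p <- ggplot(wide, aes(inf_adj_perchild_PK12ed, inf_adj_perchild_highered)) +
  geom_point() +
  geom_text(aes(label = state), vjust = 1, hjust = 1)  # label each point
```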

Summary of screencast.