Art Collections

Notable topics: geom_area plot, distributions, calculating area (square meters) and ratio (width / height)

Recorded on: 2021-01-11

Timestamps by: Eric Fletcher

## Screencast

## Timestamps

Use `clean_names` (from the `janitor` package) to convert variable names from `camelCase` to `snake_case`.
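As a minimal sketch of the renaming step, the example below runs `clean_names` on a small tibble. The column names and values are hypothetical stand-ins, not the screencast's actual data.

```r
library(tibble)
library(janitor)

# Hypothetical columns with camelCase names (illustrative only)
artwork_raw <- tibble(
  artistRole = c("artist", "attributed to"),
  acquisitionYear = c(1922, 1997)
)

# clean_names() converts the column names to snake_case by default
artwork <- clean_names(artwork_raw)
names(artwork)
```

Only the column names change; the values are left untouched.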

Use `fct_reorder` to reorder `geom_col` columns in ascending order.

Use `extract` to extract a character column into multiple columns using the regular expression `"(.*) on (.*)"`. At `6:05`, David decides to change this to `separate` with `sep = " on "`, `fill = "left"`, and `extra = "merge"` to control what happens when there are not enough or too many pieces. At `7:10`, David decides to change to `fill = "right"`.
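To make the effect of `fill` and `extra` concrete, here is a small sketch of the `separate` call. The medium strings and the output column names `medium` and `surface` are assumptions for illustration.

```r
library(tibble)
library(tidyr)

# Hypothetical medium descriptions: one clean split, one with too many
# pieces, and one with too few
media <- tibble(medium = c(
  "Oil paint on canvas",
  "Graphite on paper on board",
  "Bronze"
))

separated <- separate(
  media, medium,
  into = c("medium", "surface"),  # assumed output column names
  sep = " on ",
  fill = "right",   # too few pieces: pad with NA on the right
  extra = "merge"   # too many pieces: keep the remainder in the last column
)
separated
```

With `fill = "right"`, a value such as "Bronze" stays in the first column and the second becomes `NA`. With `fill = "left"` it would land in the second column instead.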

Use `replace_na` to replace NAs with specified values. In this case, replace them with `Missing`.

Use `fct_lump` to lump `artist` and `medium` levels except for the n most frequent. At `11:30`, David decides to use `filter(fct_lump(artist, 16) != "Other")` to get rid of the artist `Other` category.

Create a `geom_area` plot to show the distribution of paintings by medium over time. At `15:35`, David decides to change from count to percentage to make it easier to show the difference in composition using `mutate(pct = n / sum(n))`.
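The lumping filter and the percentage step can be sketched together on a toy dataset. This sketch makes three assumptions: the data is hypothetical, the filter is applied to `medium` rather than `artist`, and the percentage is computed within each decade as `n / sum(n)`.

```r
library(dplyr)
library(forcats)
library(tibble)

# Hypothetical miniature dataset (illustrative only)
artwork <- tibble(
  decade = c(1800, 1800, 1800, 1800, 1810, 1810, 1810, 1810),
  medium = c("Oil paint", "Oil paint", "Graphite", "Ink",
             "Oil paint", "Graphite", "Graphite", "Chalk")
)

by_decade_medium <- artwork %>%
  # Keep only the 2 most frequent media; the rest are lumped into "Other"
  # and then dropped by the filter
  filter(fct_lump(medium, 2) != "Other") %>%
  count(decade, medium) %>%
  group_by(decade) %>%
  # Share of each medium within its decade (assumed grouping)
  mutate(pct = n / sum(n)) %>%
  ungroup()

by_decade_medium
```

Because the shares are computed per decade, they sum to 1 within each decade, which is what makes the composition comparable across time.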

Bucket the `year` variable into decades using `round(year, -1)` to round the year to the nearest 10.
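A quick sketch shows what rounding with a negative `digits` argument does. The years are made up, and the floor-based variant is an alternative added here for comparison, not something taken from the screencast.

```r
# Hypothetical years (illustrative only)
years <- c(1843, 1848, 1851, 1906)

# Negative digits round to the left of the decimal point:
# -1 means the nearest 10
decades <- round(years, -1)
decades

# Alternative (an assumption, not from the screencast): truncate to the
# start of the decade instead of rounding to the nearest 10
decade_start <- 10 * (years %/% 10)
decade_start
```

The two approaches differ for years ending in 5 through 9: rounding moves them to the next decade, while truncating keeps them in the decade they started in.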

Use `scale_y_continuous(labels = scales::percent)` to change y-axis labels to percent format.

Turn the `geom_area` plot into a faceted `geom_col`.

Calculate the percentage of artists for each medium per decade.

Calculate the distribution of the area (square meters) and ratio (width / height) of the art pieces.

Categorize the pieces by shape (landscape, portrait, square) based on their ratio, then plot using `geom_area` to look at the composition over time.
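The area, ratio, and shape calculations might look like the sketch below. It assumes `width` and `height` are recorded in millimeters, and the cutoffs of 1.2 and 0.8 are hypothetical thresholds chosen for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical dimensions, assumed to be in millimeters
pieces <- tibble(
  width = c(900, 400, 500),
  height = c(600, 700, 510)
) %>%
  mutate(
    # Convert mm to m before multiplying to get square meters
    area = (width / 1000) * (height / 1000),
    ratio = width / height,
    # Hypothetical cutoffs: wider than tall, taller than wide, or close to 1
    shape = case_when(
      ratio > 1.2 ~ "Landscape",
      ratio < 0.8 ~ "Portrait",
      TRUE ~ "Square"
    )
  )

pieces
```

A ratio above 1 means the piece is wider than it is tall, so the "Square" band catches anything close to a 1:1 ratio.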

Create a `line plot` showing the median ratio by decade over time.

Create a `line plot` showing the median area by decade over time.

Create a `boxplot` showing the distribution of area over time.

Create various `summary statistics` for the artists such as `avg_year`, `first_year`, `last_year`, `n_pieces`, `median_area`, `median_ratio`.
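A per-artist summary with these column names could be built roughly as follows. The input data is hypothetical, and using `mean`, `min`, `max`, `n`, and `median` is an assumption based on what the column names suggest.

```r
library(dplyr)
library(tibble)

# Hypothetical pieces with precomputed area and ratio (illustrative only)
pieces <- tibble(
  artist = c("A", "A", "A", "B", "B"),
  year = c(1800, 1810, 1820, 1900, 1910),
  area = c(0.5, 1.0, 1.5, 2.0, 4.0),
  ratio = c(1.5, 0.8, 1.0, 1.2, 0.6)
)

by_artist <- pieces %>%
  group_by(artist) %>%
  summarize(
    avg_year = mean(year),
    first_year = min(year),
    last_year = max(year),
    n_pieces = n(),
    median_area = median(area),
    median_ratio = median(ratio)
  )

by_artist
```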

Create a `boxplot` showing the distribution of ratio over time for n amount of artists. Use `glue` to concatenate the number of pieces for each artist on the y axis.

Create a `boxplot` showing the distribution of ratio over time for each medium. Use `glue` to concatenate the number of pieces for each medium on the y axis.
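The labelling step can be sketched as below. The data, the `label` column, and the `"{medium} ({n_pieces})"` format are assumptions, and ordering the labels with `fct_reorder` is an optional extra rather than a confirmed part of this step.

```r
library(dplyr)
library(forcats)
library(ggplot2)
library(glue)
library(tibble)

# Hypothetical pieces (illustrative only)
pieces <- tibble(
  medium = c("Oil paint", "Oil paint", "Oil paint", "Graphite", "Graphite"),
  ratio = c(1.5, 1.3, 0.8, 0.7, 0.9)
)

labelled <- pieces %>%
  add_count(medium, name = "n_pieces") %>%
  mutate(
    # Append the number of pieces to each medium name
    label = as.character(glue("{medium} ({n_pieces})")),
    # Order the labels by median ratio so the boxes appear sorted
    label = fct_reorder(label, ratio)
  )

# Labels on the y axis, ratio on the x axis
p <- ggplot(labelled, aes(ratio, label)) +
  geom_boxplot()
```

The same pattern works for artists by swapping `medium` for `artist`.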

Summary of screencast