Plants in Danger

Data manipulation, Web scraping (rvest package) and SelectorGadget

Published: August 17, 2020

Notable topics: Data manipulation, Web scraping (rvest package) and SelectorGadget

Recorded on: 2020-08-17

Timestamps by: Eric Fletcher

Timestamps

count, fct_lump, fct_reorder
dplyr, forcats

Using count, fct_lump, and fct_reorder to get an overview of categorical data
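As a rough sketch of this overview pattern: the toy `plants` tibble and its `country` values below are hypothetical stand-ins, not the screencast's data.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data standing in for the plants dataset
plants <- tibble(country = c("A", "A", "A", "B", "B", "C", "D"))

country_counts <- plants %>%
  # Keep the 2 most common countries and lump the rest into "Other"
  mutate(country = fct_lump(country, n = 2)) %>%
  count(country) %>%
  # Order the factor levels by count so a bar chart would come out sorted
  mutate(country = fct_reorder(country, n))

country_counts
```

On real data the same chain would typically end in a ggplot2 bar chart, where the reordered levels control the bar order.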

fct_relevel
forcats

Using fct_relevel to move the "Before 1900" level to the first position, leaving the other levels in their existing order
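A minimal illustration, assuming period labels of this general shape (the exact labels in the dataset may differ):

```r
library(forcats)

# Illustrative period labels; factor() sorts them alphabetically,
# which puts "Before 1900" last
year_last_seen <- factor(c("1900-1919", "Before 1900", "2000-2020", "1920-1939"))
levels(year_last_seen)

# Move "Before 1900" to the front; all other levels keep their relative order
releveled <- fct_relevel(year_last_seen, "Before 1900")
levels(releveled)
# "Before 1900" "1900-1919" "1920-1939" "2000-2020"
```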

fct_reorder
forcats

Using n and sum in fct_reorder to reorder factor levels when there are multiple categories in count
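One way this might look, using a made-up count table with two grouping columns (the column names and numbers are assumptions for illustration):

```r
library(dplyr)
library(forcats)

# Hypothetical output of count(continent, red_list_category):
# each continent appears once per category
counts <- tibble(
  continent = c("Africa", "Africa", "Europe", "Europe", "Asia"),
  red_list_category = c("Extinct", "Extinct in the Wild",
                        "Extinct", "Extinct in the Wild", "Extinct"),
  n = c(10, 2, 1, 1, 5)
)

# Summing n per continent orders the levels by total count,
# rather than by the default median of the per-category counts
ordered <- counts %>%
  mutate(continent = fct_reorder(continent, n, sum))

levels(ordered$continent)
# "Europe" "Asia" "Africa"
```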

reorder_within, scale_y_reordered
tidytext

Using reorder_within and scale_y_reordered such that the values are ordered within each facet
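A small sketch of the within-facet ordering, with invented `threats` data (column names and values are assumptions):

```r
library(dplyr)
library(ggplot2)
library(tidytext)

# Hypothetical counts of threat types per continent
threats <- tibble(
  continent = c("Africa", "Africa", "Asia", "Asia"),
  threat_type = c("Agriculture", "Logging", "Agriculture", "Logging"),
  n = c(5, 9, 7, 3)
)

p <- threats %>%
  # Order threat_type by n separately within each continent
  mutate(threat_type = reorder_within(threat_type, n, continent)) %>%
  ggplot(aes(n, threat_type)) +
  geom_col() +
  # Strip the suffix that reorder_within appends to each level
  scale_y_reordered() +
  # Free y scales so each facet shows only its own ordered levels
  facet_wrap(~ continent, scales = "free_y")
```

Without `scales = "free_y"`, every facet would show the levels of all facets, which defeats the per-facet ordering.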

axis.text.x
ggplot2

Using axis.text.x in theme to rotate overlapping axis labels
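For illustration, here is the theme setting on ggplot2's built-in `mpg` data, used only as a stand-in; the 90 degree angle is one common choice, not necessarily the one used in the screencast.

```r
library(ggplot2)

# mpg ships with ggplot2 and has enough manufacturers for the labels to crowd
p <- ggplot(mpg, aes(manufacturer)) +
  geom_bar() +
  # Rotate the x-axis labels and right-align them against the axis
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
```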

filter, fct_lump
dplyr, forcats

Using filter and fct_lump to lump all levels except for the 8 most frequent facet panels
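One plausible reading of this step is to lump the rare levels and then drop the resulting "Other" group, so only the most frequent levels become facet panels. The sketch below assumes that reading and uses toy data with n = 2 instead of 8.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data; the faceting variable is assumed to be country
plants <- tibble(country = c(rep("A", 3), rep("B", 2), "C", "D"))

top_countries <- plants %>%
  # Rows whose country would be lumped into "Other" are removed
  filter(fct_lump(country, n = 2) != "Other")

top_countries
```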

separate
tidyr

Using separate to separate the character column binomial_name into multiple columns (genus and species)
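A minimal sketch with two illustrative binomial names (the values are examples, not a claim about the dataset's contents):

```r
library(dplyr)
library(tidyr)

# Illustrative values for the binomial_name column
plants <- tibble(binomial_name = c("Abutilon pitcairnense", "Acaena exigua"))

split_names <- plants %>%
  # Split on the space; remove = FALSE keeps the original column as well
  separate(binomial_name, c("genus", "species"), sep = " ", remove = FALSE)

split_names
```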

fct_lump
forcats

Using fct_lump within count to lump all levels except for the 8 most frequent genera
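The lumping can be written directly inside the count call, roughly as follows (toy genus values, and n = 2 instead of 8 to keep the example small):

```r
library(dplyr)
library(forcats)

# Hypothetical genus column
plants <- tibble(genus = c(rep("Acacia", 3), rep("Cyanea", 2), "Viola", "Rosa"))

genus_counts <- plants %>%
  # count() accepts a named expression, so no separate mutate() is needed
  count(genus = fct_lump(genus, n = 2), sort = TRUE)

genus_counts
```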

read_html, html_nodes, html_text
rvest

Using rvest and SelectorGadget to web scrape a list of species
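The general shape of such a scrape is sketched below. To keep it runnable offline, it parses an inline HTML string; the markup, the made-up species names, and the `li.species` selector are all hypothetical. In practice the selector would come from SelectorGadget and `read_html` would be given the page URL.

```r
library(rvest)

# Hypothetical markup standing in for the real page
html <- "<html><body><ul>
  <li class='species'><i>Examplea alba</i> Smith </li>
  <li class='species'><i>Samplea nigra</i> Jones </li>
</ul></body></html>"

species_raw <- read_html(html) %>%
  html_nodes("li.species") %>%  # CSS selector, e.g. as suggested by SelectorGadget
  html_text()                   # extract the text content of each matched node

species_raw
```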

str_trim
stringr

Using str_trim to remove whitespace from character string
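For example, on a made-up scraped string with stray spaces and a trailing newline:

```r
library(stringr)

# Hypothetical scraped text with leading and trailing whitespace
raw <- "  Examplea alba Smith \n"

trimmed <- str_trim(raw)  # removes whitespace from both ends by default
trimmed
# "Examplea alba Smith"
```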

separate
tidyr

Using separate to split a character string into genus, species, and rest/citation columns, with extra = "merge" to merge any extra pieces into the rest/citation column
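A short sketch of the `extra = "merge"` behaviour, using invented strings and an assumed column name `citation` for the remainder:

```r
library(dplyr)
library(tidyr)

# Hypothetical scraped strings: genus, species, then a citation of varying length
scraped <- tibble(text = c("Examplea alba (L.) Smith", "Samplea nigra Jones"))

parsed <- scraped %>%
  # Split at most into three pieces; everything after the second space
  # stays together in the last column instead of being dropped
  separate(text, c("genus", "species", "citation"), sep = " ", extra = "merge")

parsed
```

With the default `extra = "warn"`, the first row would lose "Smith" and produce a warning.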

read_html, html_nodes, html_text, html_attr, inner_join, paste0, map
rvest, dplyr, purrr

Using rvest and SelectorGadget to web scrape image links
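How these functions might fit together is sketched below. Everything specific here is an assumption for illustration: the inline HTML pages, the `img` selector, the `alt` attribute holding the species name, the `https://example.org` base URL, and the helper `scrape_images`.

```r
library(rvest)
library(dplyr)
library(purrr)

# Hypothetical pages, given as inline HTML so the sketch runs offline
pages <- list(
  "<html><body><img alt='Examplea alba' src='/img/alba.jpg'></body></html>",
  "<html><body><img alt='Samplea nigra' src='/img/nigra.jpg'></body></html>"
)

# Hypothetical helper: one row per image with its species name and full URL
scrape_images <- function(html) {
  imgs <- read_html(html) %>% html_nodes("img")
  tibble(
    binomial_name = html_attr(imgs, "alt"),
    # Relative links are turned into absolute ones with an assumed base URL
    image_url = paste0("https://example.org", html_attr(imgs, "src"))
  )
}

# Scrape every page, then stack the results
images <- map(pages, scrape_images) %>% bind_rows()

# Hypothetical plants table; inner_join keeps only species that have an image
plants <- tibble(binomial_name = c("Examplea alba", "Samplea nigra", "Otherea rubra"))
plants_with_images <- inner_join(plants, images, by = "binomial_name")

plants_with_images
```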

Summary of screencast