class: center, middle, inverse, title-slide

# Functional programming and iteration with purrr
### Malcolm Barrett
### 2019-09-26

---

class: middle
try me!
### Interactive app: [malcolmbarrett.shinyapps.io/purrr_exercises](https://malcolmbarrett.shinyapps.io/purrr_exercises)

### Or try `exercises.Rmd` in the repo: [github.com/malcolmbarrett/lawrug_purrr](https://github.com/malcolmbarrett/lawrug_purrr)

---

background-image: url(http://hexb.in/hexagons/purrr.png)
background-position: 90% 26%

# purrr: A functional programming toolkit for R

<br/><br/><br/>

## Complete and consistent set of tools for working with functions and vectors

---

class: inverse, middle

# Problems we want to solve:

1. Making code clear
2. Making code safe
3. Working with lists and data frames

---

# Lists, vectors, and data.frames (or tibbles)

```r
c(a = "hello", b = 1)
```

```
##       a       b
## "hello"     "1"
```

---

# lists can contain any object

```r
list(a = "hello", b = 1, c = mean)
```

```
## $a
## [1] "hello"
##
## $b
## [1] 1
##
## $c
## function (x, ...)
## UseMethod("mean")
## <bytecode: 0x7fe91ec2c958>
## <environment: namespace:base>
```

---

# data frames are lists

```r
x <- list(a = "hello", b = 1)
as.data.frame(x)
```

```
##       a b
## 1 hello 1
```

---

# data frames are lists

```r
library(gapminder)
head(gapminder$pop)
```

```
## [1]  8425333  9240934 10267083 11537966 13079460 14880372
```

---

# data frames are lists

```r
gapminder[1:6, "pop"]
```

```
## # A tibble: 6 x 1
##        pop
##      <int>
## 1  8425333
## 2  9240934
## 3 10267083
## 4 11537966
## 5 13079460
## 6 14880372
```

---

# data frames are lists

```r
head(gapminder[["pop"]])
```

```
## [1]  8425333  9240934 10267083 11537966 13079460 14880372
```

---

# vectorized functions don't work on lists

```r
sum(rnorm(10))
```

```
## [1] -3.831574
```

```r
sum(list(x = rnorm(10), y = rnorm(10), z = rnorm(10)))
```

```
## Error in sum(list(x = rnorm(10), y = rnorm(10), z = rnorm(10))): invalid 'type' (list) of argument
```

---

background-image: url(http://hexb.in/hexagons/purrr.png)
background-position: 95% 2%

# map(.x, .f)

--

## .x: a vector, list, or data frame

--

## .f: a function

--

## Returns a list

---
try me!
# Using map()

```r
library(purrr)
x_list <- list(x = rnorm(10), y = rnorm(10), z = rnorm(10))
map(x_list, mean)
```

```
## $x
## [1] -0.6097971
##
## $y
## [1] -0.2788647
##
## $z
## [1] 0.6165922
```

---

<img src="img/purrr_list.png" width="50%" height="50%" style="display: block; margin: auto;" />

---

<img src="img/purrr_f_list.png" width="912" style="display: block; margin: auto;" />

---

<img src="img/purr_x_input.png" width="901" style="display: block; margin: auto;" />

---
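# Using map(): what `.x` can be

A minimal sketch of the three kinds of input, using made-up objects (`nums`, `df`) that are not part of the talk. `map()` walks over the elements of a vector or list, or over the columns of a data frame, and returns a list each time.

```r
library(purrr)

# Hypothetical inputs, for illustration only
nums <- c(1, 4, 9)
df <- data.frame(a = 1:3, b = 4:6)

# A vector: .f is applied to each element
roots <- map(nums, sqrt)

# A data frame: .f is applied to each column
col_sums <- map(df, sum)

# Either way, the result is a list
str(roots)
str(col_sums)
```

---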
try me!
## using `map()` with data frames

```r
gapminder %>%
  dplyr::select_if(is.numeric) %>%
  map(sd)
```

```
## $year
## [1] 17.26533
##
## $lifeExp
## [1] 12.91711
##
## $pop
## [1] 106157897
##
## $gdpPercap
## [1] 9857.455
```

---

# Review: writing functions

```r
x <- x^2
x <- scale(x)
x <- max(x)
```

---

# Review: writing functions

```r
x <- x^2
x <- scale(x)
x <- max(x)

y <- x^2
y <- scale(y)
y <- max(y)

z <- z^2
z <- scale(x)
z <- max(z)
```

---

# Review: writing functions

```r
x <- x^2
x <- scale(x)
x <- max(x)

*y <- x^2
y <- scale(y)
y <- max(y)

z <- z^2
*z <- scale(x)
z <- max(z)
```

---

# Review: writing functions

```r
*x <- x^3
x <- scale(x)
x <- max(x)

*y <- x^2
y <- scale(y)
y <- max(y)

*z <- z^2
z <- scale(x)
z <- max(z)
```

---

# Review: writing functions

```r
.f <- function(x) {
  x <- x^3
  x <- scale(x)

  max(x)
}

.f(x)
.f(y)
.f(z)
```

---

class: inverse, center, middle, takeaway

# **If you copy and paste your code three times, it's time to write a function**

---

class: inverse

# Three ways to pass functions to `map()`

1. pass directly to `map()`
2. use an anonymous function
3. use ~

---

<img src="img/purr_f_input1.png" width="904" style="display: block; margin: auto;" />

---

<img src="img/purr_f_input2.png" width="968" style="display: block; margin: auto;" />

---

<img src="img/purr_f_input3.png" width="839" style="display: block; margin: auto;" />

---

## Anonymous functions
try me!
```r
map(gapminder, ~length(unique(.x)))
```

---

## Anonymous functions
try me!
```
## $country
## [1] 142
##
## $continent
## [1] 5
##
## $year
## [1] 12
##
## $lifeExp
## [1] 1626
##
## $pop
## [1] 1704
##
## $gdpPercap
## [1] 1704
```

---

# Returning types

| map | returns |
|:--|:--|
|`map()` | list |
|`map_chr()` | character vector |
|`map_dbl()` | double vector (numeric) |
|`map_int()` | integer vector |
|`map_lgl()` | logical vector |
|`map_dfc()` | data frame (by column) |
|`map_dfr()` | data frame (by row) |

---
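# Returning types: a sketch

A small illustration of the typed variants in the table, using a made-up list (`num_list`) rather than data from the talk. Each `map_*()` returns an atomic vector of the named type, and it should raise an error when `.f` returns something that does not fit that type.

```r
library(purrr)

# Hypothetical list, for illustration only
num_list <- list(x = c(1, 2, 3), y = c(4, 5, 6))

means   <- map_dbl(num_list, mean)        # named double vector
sizes   <- map_int(num_list, length)      # named integer vector
is_num  <- map_lgl(num_list, is.numeric)  # named logical vector
classes <- map_chr(num_list, class)       # named character vector

# class() returns a string, so asking for doubles is an error
# rather than a silent coercion
wrong_type <- tryCatch(
  map_dbl(num_list, class),
  error = function(e) "error"
)
```

---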
try me!
# Returning types

```r
map_int(gapminder, ~length(unique(.x)))
```

```
##   country continent      year   lifeExp       pop gdpPercap
##       142         5        12      1626      1704      1704
```

---

# map2(.x, .y, .f)

--

## .x, .y: a vector, list, or data frame

--

## .f: a function

--

## Returns a list

---

<img src="img/purr_x2_input.png" width="80%" height="80%" style="display: block; margin: auto;" />

---

<img src="img/purr_x2_input_warn.png" width="80%" height="80%" style="display: block; margin: auto;" />

---

<img src="img/purr_f2_input.png" width="80%" height="80%" style="display: block; margin: auto;" />

---
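# map2(): a deterministic sketch

A minimal sketch of how `map2()` pairs its inputs, with made-up vectors (`a`, `b`) that are not from the talk. Inside a `~` formula, `.x` and `.y` refer to the current pair of elements. A length-one input is recycled, while other length mismatches should raise an error.

```r
library(purrr)

# Hypothetical inputs, for illustration only
a <- c(1, 2, 3)
b <- c(10, 20, 30)

# .x comes from `a`, .y from `b`, one pair at a time
sums <- map2_dbl(a, b, ~ .x + .y)

# A length-one .y is recycled against every element of .x
scaled <- map2_dbl(a, 10, ~ .x * .y)

# Lengths 3 and 2 cannot be paired up, so this is an error
mismatch <- tryCatch(
  map2(a, c(1, 2), ~ .x + .y),
  error = function(e) "error"
)
```

---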
try me!
# map2()

```r
means <- c(-3, 4, 2, 2.3)
sds <- c(.3, 4, 2, 1)

map2_dbl(means, sds, rnorm, n = 1)
```

```
## [1] -2.997932  2.178125  1.266952  2.948287
```

---

class: middle, center

| input 1 | input 2 | returns |
|:--|:--|:--|
|`map()` | `map2()` | list |
|`map_chr()` | `map2_chr()` | character vector |
|`map_dbl()` | `map2_dbl()` | double vector (numeric) |
|`map_int()` | `map2_int()` | integer vector |
|`map_lgl()` | `map2_lgl()` | logical vector |
|`map_dfc()` | `map2_dfc()` | data frame (by column) |
|`map_dfr()` | `map2_dfr()` | data frame (by row) |

---

# Other mapping functions

## **pmap()** and friends: take n lists, or a data frame with argument names

---

# Other mapping functions

## ~~pmap() and friends: take n lists, or a data frame with argument names~~
## **walk()** and friends: for side effects like plotting; returns input invisibly

---

# Other mapping functions

## ~~pmap() and friends: take n lists, or a data frame with argument names~~
## ~~walk() and friends: for side effects like plotting; returns input invisibly~~
## **imap()** and friends: also pass the index (or name) of each element

---

# Other mapping functions

## ~~pmap() and friends: take n lists, or a data frame with argument names~~
## ~~walk() and friends: for side effects like plotting; returns input invisibly~~
## ~~imap() and friends: also pass the index (or name) of each element~~
## **map_if()**, **map_at()**: Apply only to certain elements

---

class: middle, center

| input 1 | input 2 | input n | returns |
|:--|:--|:--|:--|
|`map()` | `map2()` | `pmap()` | list |
|`map_chr()` | `map2_chr()` | `pmap_chr()` | character vector |
|`map_dbl()` | `map2_dbl()` | `pmap_dbl()` | double vector (numeric) |
|`map_int()` | `map2_int()` | `pmap_int()` | integer vector |
|`map_lgl()` | `map2_lgl()` | `pmap_lgl()` | logical vector |
|`map_dfc()` | `map2_dfc()` | `pmap_dfc()` | data frame (by column) |
|`map_dfr()` | `map2_dfr()` | `pmap_dfr()` | data frame (by row) |
|`walk()` | `walk2()` | `pwalk()` | input (side effects!) |

---

# Base R

| base R | purrr |
|:--|:--|
|`lapply()` | `map()` |
|`vapply()` | `map_*()` |
|`sapply()` | ? |
|`x[] <- lapply()` | `map_dfc()` |
|`mapply()` | `map2()`, `pmap()` |

---

class: inverse

# Benefits of purrr

1. Consistent
2. Type-safe
3. ~f(.x)

---

# Efficient Loops

<img src="https://media.giphy.com/media/fwilDvlZ1GVHcI9BlT/giphy.gif" style="display: block; margin: auto;" />

---
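# Base R vs purrr: a sketch

A rough illustration of the base R table and the "type-safe" benefit, using a made-up list (`num_list`) that is not from the talk. The first three comparisons pair a base function with its purrr counterpart. The last part hints at why `sapply()` gets a `?`: its return type depends on the input, while `vapply()` and `map_int()` always give the type you asked for.

```r
library(purrr)

# Hypothetical list, for illustration only
num_list <- list(a = c(1, 2, 3), b = c(4, 5, 6))

# lapply() and map() both return a list
same_list <- identical(lapply(num_list, mean), map(num_list, mean))

# vapply() and map_dbl() both return a double vector
same_dbl <- identical(
  vapply(num_list, mean, numeric(1)),
  map_dbl(num_list, mean)
)

# mapply() and map2_dbl() both iterate over two inputs in parallel
same_pairs <- identical(
  mapply(function(m, s) m + s, c(1, 2), c(10, 20)),
  map2_dbl(c(1, 2), c(10, 20), ~ .x + .y)
)

# With empty input, sapply() falls back to a list,
# while the typed functions keep their promise
empty_sapply <- sapply(list(), length)
empty_vapply <- vapply(list(), length, integer(1))
empty_map    <- map_int(list(), length)
```

---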
try me!
## Loops vs functional programming

```r
x <- rnorm(10)
y <- map(x, mean)
```

```r
x <- rnorm(10)
y <- vector("list", length(x))
for (i in seq_along(x)) {
*  y[[i]] <- mean(x[[i]])
}
```

---

class: center, middle, inverse

# **Of course someone has to write loops. It doesn’t have to be you.**
# **--Jenny Bryan**

---

# Working with lists and nested data

<img src="img/cheatsheet_lists.png" width="568" style="display: block; margin: auto;" />

---

# Working with lists and nested data

<img src="img/cheatsheet_nested.png" width="845" style="display: block; margin: auto;" />

---

# Adverbs: Modify function behavior

<img src="img/cheatsheet_adverbs.png" width="317" style="display: block; margin: auto;" />

---

class: inverse, center

# Learn more!

## [Jenny Bryan's purrr tutorial](https://jennybc.github.io/purrr-tutorial/): A detailed introduction to purrr. Free online.

## [R for Data Science](http://r4ds.had.co.nz/): A comprehensive but friendly introduction to the tidyverse. Free online.

## [RStudio Primers](https://rstudio.cloud/learn/primers): Free, interactive courses, including purrr.

---

class: inverse, center, middle

### *Thanks for coming!*
#### *map(kittens, wag_tail)*

<img src="img/cat-min.gif" width="25%" height="25%" style="display: block; margin: auto;" />

### [malcolmbarrett](https://github.com/malcolmbarrett/)
### [@malco_barrett](https://twitter.com/malco_barrett)

Slides created via the R package [xaringan](https://github.com/yihui/xaringan).