Functional programming and iteration with purrr

Malcolm Barrett

2019-09-26


Interactive app: malcolmbarrett.shinyapps.io/purrr_exercises

Or try exercises.Rmd in the repo: github.com/malcolmbarrett/lawrug_purrr


purrr: A functional programming toolkit for R




Complete and consistent set of tools for working with functions and vectors


Problems we want to solve:

  1. Making code clear
  2. Making code safe
  3. Working with lists and data frames

Lists, vectors, and data.frames (or tibbles)

c(a = "hello", b = 1)
##       a       b
## "hello"     "1"

lists can contain any object

list(a = "hello", b = 1, c = mean)
## $a
## [1] "hello"
##
## $b
## [1] 1
##
## $c
## function (x, ...)
## UseMethod("mean")
## <bytecode: 0x7fe91ec2c958>
## <environment: namespace:base>

data frames are lists

x <- list(a = "hello", b = 1)
as.data.frame(x)
##       a b
## 1 hello 1
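
A quick way to check the claim that data frames are lists is to ask R directly. The sketch below uses a small made-up data frame (`df` is an illustrative name, not from the slides) and base R only.

```r
# A data frame is stored as a list of equal-length column vectors.
df <- data.frame(a = c("hello", "world"), b = c(1, 2))

is.list(df)                  # is the data frame a list?
typeof(df)                   # underlying storage type
length(df)                   # number of list elements, i.e. columns
identical(df[["b"]], df$b)   # list-style extraction returns the column
```

Because each column is a list element, anything that iterates over a list will iterate over the columns of a data frame.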

library(gapminder)
head(gapminder$pop)
## [1] 8425333 9240934 10267083 11537966 13079460 14880372

gapminder[1:6, "pop"]
## # A tibble: 6 x 1
##        pop
##      <int>
## 1  8425333
## 2  9240934
## 3 10267083
## 4 11537966
## 5 13079460
## 6 14880372

head(gapminder[["pop"]])
## [1] 8425333 9240934 10267083 11537966 13079460 14880372

vectorized functions don't work on lists

sum(rnorm(10))
## [1] -3.831574
sum(list(x = rnorm(10), y = rnorm(10), z = rnorm(10)))
## Error in sum(list(x = rnorm(10), y = rnorm(10), z = rnorm(10))): invalid 'type' (list) of argument

map(.x, .f)

.x: a vector, list, or data frame

.f: a function

Returns a list

Using map()

library(purrr)
x_list <- list(x = rnorm(10), y = rnorm(10), z = rnorm(10))
map(x_list, mean)
## $x
## [1] -0.6097971
##
## $y
## [1] -0.2788647
##
## $z
## [1] 0.6165922

using map() with data frames

gapminder %>%
  dplyr::select_if(is.numeric) %>%
  map(sd)
## $year
## [1] 17.26533
##
## $lifeExp
## [1] 12.91711
##
## $pop
## [1] 106157897
##
## $gdpPercap
## [1] 9857.455

Review: writing functions

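x <- x^3       # exponent changed here, but not in the copies below
x <- scale(x)
x <- max(x)
y <- x^2       # copy-paste slip: uses x instead of y
y <- scale(y)
y <- max(y)
z <- z^2
z <- scale(x)  # copy-paste slip: uses x instead of z
z <- max(z)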

Review: writing functions

.f <- function(x) {
  x <- x^3
  x <- scale(x)
  max(x)
}

.f(x)
.f(y)
.f(z)
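
Once the repeated steps live in a function, the three separate calls can themselves be replaced by a single map call. This is a minimal sketch with made-up inputs; the name `cube_scale_max` and the values in `inputs` are illustrative and not from the slides.

```r
library(purrr)

# Same steps as the function above, under a descriptive (hypothetical) name
cube_scale_max <- function(x) {
  x <- x^3
  x <- scale(x)   # center and scale; returns a one-column matrix
  max(x)
}

# Illustrative inputs standing in for x, y, and z
inputs <- list(x = c(1, 2, 3), y = c(4, 5, 6, 7), z = c(2, 4, 8))

# One call instead of three; map_dbl() returns a named double vector
out <- map_dbl(inputs, cube_scale_max)
out
```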

If you copy and paste your code three times, it's time to write a function


Three ways to pass functions to map()

  1. pass directly to map()
  2. use an anonymous function
  3. use ~
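
The three options can be compared side by side. The sketch below assumes a small illustrative list (`x_list` here uses fixed values rather than the random draws used elsewhere in the slides) and shows that all three forms give the same result.

```r
library(purrr)

x_list <- list(x = c(1, 2, 3), y = c(10, 20, 30))

# 1. Pass the function directly
by_name <- map(x_list, mean)

# 2. Use an anonymous function
by_anon <- map(x_list, function(v) mean(v))

# 3. Use the ~ shorthand, where .x stands for the current element
by_formula <- map(x_list, ~ mean(.x))

identical(by_name, by_anon)
identical(by_name, by_formula)
```

The ~ form is mostly useful when the call needs extra arguments or wraps several functions, as in `~ length(unique(.x))`.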

Anonymous functions

map(gapminder, ~length(unique(.x)))
## $country
## [1] 142
##
## $continent
## [1] 5
##
## $year
## [1] 12
##
## $lifeExp
## [1] 1626
##
## $pop
## [1] 1704
##
## $gdpPercap
## [1] 1704

Returning types

map          returns
map()        list
map_chr()    character vector
map_dbl()    double vector (numeric)
map_int()    integer vector
map_lgl()    logical vector
map_dfc()    data frame (by column)
map_dfr()    data frame (by row)

map_int(gapminder, ~length(unique(.x)))
##   country continent      year   lifeExp       pop gdpPercap
##       142         5        12      1626      1704      1704
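
The typed variants are strict about what the function returns. As a small sketch on illustrative data (`x_list` is made up for this example), each variant returns an atomic vector of the stated type, and a result that does not fit raises an error instead of being silently reshaped.

```r
library(purrr)

x_list <- list(x = c(1, 2, 3), y = c(10, 20, 30))

dbl <- map_dbl(x_list, mean)             # named double vector
chr <- map_chr(x_list, ~ class(.x))      # named character vector
lgl <- map_lgl(x_list, ~ all(.x > 5))    # named logical vector

# range() returns two values per element, so it cannot fit in a
# double vector with one entry per element; map_dbl() errors.
res <- tryCatch(
  map_dbl(x_list, range),
  error = function(e) "map_dbl() refused a length-2 result"
)
res
```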

map2(.x, .y, .f)

.x, .y: a vector, list, or data frame

.f: a function

Returns a list

map2()

means <- c(-3, 4, 2, 2.3)
sds <- c(.3, 4, 2, 1)
map2_dbl(means, sds, rnorm, n = 1)
## [1] -2.997932 2.178125 1.266952 2.948287

input 1      input 2       returns
map()        map2()        list
map_chr()    map2_chr()    character vector
map_dbl()    map2_dbl()    double vector (numeric)
map_int()    map2_int()    integer vector
map_lgl()    map2_lgl()    logical vector
map_dfc()    map2_dfc()    data frame (by column)
map_dfr()    map2_dfr()    data frame (by row)

Other mapping functions

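pmap() and friends: take any number of inputs, supplied as a list of vectors or as a data frame whose column names match the function's argument names

walk() and friends: call a function for its side effects, such as plotting; return the input invisibly

imap() and friends: also pass each element's index (its name, or its position if the input is unnamed)

map_if(), map_at(): apply the function only to certain elements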
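
Each of these can be tried on small inputs. The following is a minimal sketch; the data (`params`, the labelled vector, and the mixed list) are made up for illustration.

```r
library(purrr)

# pmap(): columns n, mean, sd are matched to rnorm()'s arguments by name
params <- data.frame(n = c(1, 2, 3), mean = c(0, 10, 100), sd = c(1, 1, 1))
draws <- pmap(params, rnorm)
lengths(draws)   # one draw, then two, then three

# walk(): print() is called for its side effect; the input comes back invisibly
out <- walk(c("a", "b"), print)

# imap(): .x is the element, .y is its name (or position if unnamed)
labels <- imap_chr(c(a = 1, b = 2), ~ paste0(.y, "=", .x))

# map_if(): only elements passing the predicate are transformed
mixed <- map_if(list(1, "a", 3), is.numeric, ~ .x * 10)
```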

input 1      input 2       input n       returns
map()        map2()        pmap()        list
map_chr()    map2_chr()    pmap_chr()    character vector
map_dbl()    map2_dbl()    pmap_dbl()    double vector (numeric)
map_int()    map2_int()    pmap_int()    integer vector
map_lgl()    map2_lgl()    pmap_lgl()    logical vector
map_dfc()    map2_dfc()    pmap_dfc()    data frame (by column)
map_dfr()    map2_dfr()    pmap_dfr()    data frame (by row)
walk()       walk2()       pwalk()       input (side effects!)

Base R

base R             purrr
lapply()           map()
vapply()           map_*()
sapply()           ?
x[] <- lapply()    map_dfc()
mapply()           map2(), pmap()
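
One way to read this table is to run the pairs side by side. The sketch below uses a small illustrative list and suggests why sapply() has no direct counterpart: its return type depends on the input, so no single typed purrr function matches it.

```r
library(purrr)

x_list <- list(x = c(1, 2, 3), y = c(10, 20, 30))

# lapply() and map() agree
same_list <- identical(lapply(x_list, mean), map(x_list, mean))

# vapply() with a declared type agrees with the typed map variant
same_dbl <- identical(vapply(x_list, mean, numeric(1)), map_dbl(x_list, mean))

# mapply() over two inputs agrees with map2_int()
same_two <- identical(
  mapply(function(a, b) a + b, 1:3, 4:6),
  map2_int(1:3, 4:6, function(a, b) a + b)
)

# sapply() changes its return type with the input
is.numeric(sapply(x_list, mean))   # a vector
is.matrix(sapply(x_list, range))   # a matrix
is.list(sapply(list(), mean))      # an empty list
```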

Benefits of purrr

  1. Consistent
  2. Type-safe
  3. ~f(.x)

Efficient Loops


Loops vs functional programming

# functional programming with purrr
x <- rnorm(10)
y <- map(x, mean)

# the equivalent for loop
x <- rnorm(10)
y <- vector("list", length(x))
for (i in seq_along(x)) {
  y[[i]] <- mean(x[[i]])
}

Of course someone has to write loops. It doesn’t have to be you.

--Jenny Bryan


Working with lists and nested data
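
The slide text does not include an example here, so the following is only a sketch of one common pattern: pulling a field out of every element of a nested list. The `people` list is hypothetical. Passing a string instead of a function asks the map functions to extract the element with that name.

```r
library(purrr)

# Hypothetical nested list: one inner list per person
people <- list(
  list(name = "Ada", langs = c("R", "Python")),
  list(name = "Grace", langs = "COBOL")
)

# A string in place of .f extracts the element with that name
person_names <- map_chr(people, "name")

# A function works on each inner list as usual
n_langs <- map_int(people, ~ length(.x$langs))
```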


Adverbs: Modify function behavior
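
As an illustration of what an adverb does, the sketch below wraps `log()` with two of purrr's adverbs, `safely()` and `possibly()`, so that one bad input does not stop the whole map. The `inputs` list is made up for this example.

```r
library(purrr)

inputs <- list(1, "a", 100)

# safely() returns a function that never errors; each result is a
# list with a `result` and an `error` element (one of them NULL)
safe_log <- safely(log)
results <- map(inputs, safe_log)
results[[1]]$result   # value for the valid input
results[[2]]$error    # captured error for the character input

# possibly() substitutes a default value when the call fails
possibly_log <- possibly(log, otherwise = NA_real_)
logs <- map_dbl(inputs, possibly_log)
logs
```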


Learn more!

Jenny Bryan's purrr tutorial: A detailed introduction to purrr. Free online.

R for Data Science: A comprehensive but friendly introduction to the tidyverse. Free online.

RStudio Primers: Free, interactive courses, including purrr.


Thanks for coming!

map(kittens, wag_tail)

malcolmbarrett

@malco_barrett

Slides created via the R package xaringan.
