Tidy Evaluation

Colin A Gross

Purpose

“The client will be supplying the column specification.”

“I need to use that spec to process the data.”

Example

Data

Id Time Perf
Alice 1 10
Alice 2 11
Alice 3 11
Bob 1 7
Bob 2 7
Bob 3 9

Configuration

"columns": [
  { "datatype": "integer",
    "name": "Id"
  },
  { "datatype": "integer",
    "name": "Time"
  },
  { "datatype": "integer",
    "name": "Perf"
  }
]

Dplyr using known names

df %>%
  group_by(Id) %>%
  summarise(has_mastery = eval_mastery(Perf))

Result

Id has_mastery
Alice TRUE
Bob FALSE
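To make the known-names example reproducible, here is a minimal sketch. The talk does not show eval_mastery, so the version below is a hypothetical stand-in (every score at least 10) chosen only because it agrees with the result table above.

```r
library(dplyr)

# Example data copied from the Data slide.
df <- data.frame(
  Id   = rep(c("Alice", "Bob"), each = 3),
  Time = rep(1:3, times = 2),
  Perf = c(10, 11, 11, 7, 7, 9),
  stringsAsFactors = FALSE
)

# Hypothetical rule (an assumption, not from the talk):
# mastery means every score is at least 10.
eval_mastery <- function(perf) all(perf >= 10)

# Column names are typed directly into the verbs.
result <- df %>%
  group_by(Id) %>%
  summarise(has_mastery = eval_mastery(Perf))

print(result)
```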

Try Normal Eval

Assigning column names as variables

id_cols <- c('Id')
perf_cols <- c('Perf')
time_col <- c('Time')

Try to use column names stored in variables

mastery_summ <- df %>%
  group_by(id_cols) %>%
  summarise(eval_mastery(perf_cols))

Failure

Error in grouped_df_impl(data, unname(vars), drop) : 
  Column `id_cols` is unknown

Solution Example

Passing column names as variables

df %>%
  group_by_at(.vars = id_cols) %>%
  summarise_at(.vars = perf_cols, .funs = eval_mastery) %>%
  # .funs receives the old column name and returns the new one
  rename_at(.vars = perf_cols, .funs = function(name) "has_mastery")

Results

Id has_mastery
Alice TRUE
Bob FALSE
Carol TRUE
Dan TRUE
Elaine TRUE

Dplyr scoped variants

Standard verb              Scoped variant
select(.data, ...)         select_at(.tbl, .vars,
rename(.data, ...)         rename_at(.tbl, .vars,
group_by(.data, ...,       group_by_at(.tbl, .vars,
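A short sketch of how the scoped variants accept column names as plain character vectors. The tiny data frame and the variable keep are made up for illustration.

```r
library(dplyr)

# Small illustrative data frame (not the talk's data set).
df <- data.frame(
  Id   = c("Alice", "Bob"),
  Time = c(1L, 1L),
  Perf = c(10L, 7L),
  stringsAsFactors = FALSE
)

keep <- c("Id", "Perf")

# Each *_at verb takes the column names through .vars as strings.
selected <- df %>% select_at(.vars = keep)
renamed  <- df %>% rename_at(.vars = "Perf", .funs = toupper)
grouped  <- df %>% group_by_at(.vars = "Id")

names(selected)
names(renamed)
group_vars(grouped)
```

In dplyr 1.0 and later these scoped verbs are marked as superseded, though they still work.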

Non-Standard Evaluation

common use case

df %>% group_by(Id)
# A tibble: 50 x 3
# Groups:   id [5]
      id  time  perf
   <chr> <int> <int>
 1 Alice     0    15
 2 Alice     1    14

Standard Evaluation

does not apply: group_by looks for a column literally named colname

colname <- "Id"
df %>% group_by(colname)
Error in grouped_df_impl(data, unname(vars), drop) : 
  Column `colname` is unknown

What group_by is doing with ...

# group_by method for data.frame
group_by.data.frame <- function(.data, ...,
groups <- group_by_prepare(.data, ...,

group_by_prepare <- function(.data, ...,
new_groups <- c(quos(...),

quos()


quos() quotes its arguments and returns them as a list of quosures
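A small sketch of what that means in practice. The names Id and Perf do not need to exist, because quos() captures the expressions without evaluating them.

```r
library(rlang)

# Capture two expressions without evaluating them.
qs <- quos(Id, Perf + 1)

length(qs)              # one quosure per argument
is_quosure(qs[[1]])     # each element is a quosure
quo_get_expr(qs[[1]])   # the captured expression: the symbol Id
quo_get_expr(qs[[2]])   # the captured expression: the call Perf + 1
```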

Scoped form does not quos() your .vars

group_by_at <- function(.tbl, .vars, 
funs <- manip_at(.tbl, .vars,

End of Sufficient Explanation

Issue Resolved

Add, Commit, Push


Quosure

R code and Environment

vals <- c(1:5)   # [1] 1 2 3 4 5
sum(vals) # 15

ls() # [1] "vals"

R code + Environment

library(rlang)

q <- quo(sum(vals))
print(q) # <quosure: global>
# quo(sum(vals))
# ~sum(vals)

eval_tidy(q) # 15

Quosures take their enclosing environment with them.

foo <- function(x) {
  vals <- c(1:3)
  print(ls())       # "vals" "x"
  print(x)          # ~sum(vals)
  print(sum(vals))  # 6, uses the local vals
  eval_tidy(x)      # uses the vals captured by the quosure
}
foo(q) # 15

vals <- c(1:10)
foo(q) # 55

Overscope

When data scope comes first

z  <- 2 * c(1:5)
x <- "Not even a number."
df <- data.frame(x=c(1:5),y=c(1:5))
lm( y ~ x, data=df)   # Coefficients:
# (Intercept) x
# 1.192e-15 1.000e+00
lm( y ~ z, data=df)   # Coefficients:
# (Intercept) z
# 1.192e-15 5.000e-01

Same for dplyr

z  <- 2 * c(1:5)
x <- c(1:9999)
df <- data.frame(x=c(1:5), y=c(1:5))
df %>% summarize(sum(x), sum(y), sum(z))
# sum(x) sum(y) sum(z)
# 1 15 15 30
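The same lookup order can be sketched with rlang alone: eval_tidy() accepts a data argument, and names found there win over the quosure's environment. The variable names mirror the slide; the combined expression is just an illustration.

```r
library(rlang)

x  <- c(1:9999)                    # shadowed by the column df$x
z  <- 2 * c(1:5)                   # not in df, so found in the environment
df <- data.frame(x = c(1:5), y = c(1:5))

q <- quo(sum(x) + sum(z))

# With data supplied, x resolves to df$x (sum 15) and z to the
# environment value (sum 30).
res <- eval_tidy(q, data = df)
print(res)
```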

Don't Quote Me

Strings to Symbols

var_name <- "id"          # "id"
var_sym <- sym(var_name)  # id

Still breaks

df %>% group_by( var_sym )

# Error in grouped_df_impl(data, unname(vars), drop)
# Column `var_sym` is unknown

Unquoting

Use UQ() or !! to unquote an argument. The unquoted expression is evaluated immediately in the surrounding context.

df %>% group_by( UQ(var_sym) )

# A tibble: 50 x 3
# Groups: id [5]

Equivalent Solutions

df %>%
  group_by_at(.vars = id_cols) %>%
  summarise_at(.vars = perf_cols, .funs = eval_mastery) %>%
  # .funs receives the old column name and returns the new one
  rename_at(.vars = perf_cols, .funs = function(name) "has_mastery")

id_sym <- sym(id_cols)
perf_sym <- sym(perf_cols)

df %>%
  group_by(UQ(id_sym)) %>%
  summarise(has_mastery = eval_mastery(UQ(perf_sym)))
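sym() handles a single name. If the client's spec supplies several id columns, one possible extension is syms() with the splice operator !!!. This is a sketch under that assumption; the two-column id_cols below is made up for illustration.

```r
library(dplyr)
library(rlang)

# Illustrative data in the shape of the Data slide.
df <- data.frame(
  Id   = rep(c("Alice", "Bob"), each = 3),
  Time = rep(1:3, times = 2),
  Perf = c(10, 11, 11, 7, 7, 9),
  stringsAsFactors = FALSE
)

# Assumed spec with more than one grouping column.
id_cols <- c("Id", "Time")

# syms() converts each string to a symbol; !!! splices the list
# into group_by() as separate arguments.
id_syms <- syms(id_cols)
grouped <- df %>% group_by(!!!id_syms)

group_vars(grouped)
```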

Resources

End of Line