Compose data into a list suitable to be passed into a Bayesian model (JAGS, BUGS, Stan, etc.).

compose_data(..., .n_name = n_prefix("n"))

Arguments

...

Data to be composed into a list suitable for being passed into Stan, JAGS, etc. Named arguments will have their name used as the name argument to as_data_list when translated; unnamed arguments that are not lists or data frames will have their bare value (passed through make.names) used as the name argument to as_data_list. Each argument is evaluated using eval_tidy in an environment that includes all list items composed so far.

.n_name

A function that is used to form the names of dimension index variables (variables whose value is the number of levels in a factor or the number of rows in a data frame in ...). For example, if a data frame with 20 rows and a factor "foo" (having 3 levels) is passed to compose_data, the list returned by compose_data will include an element named .n_name("foo"), which by default would be "n_foo", containing the value 3, and an element named "n" containing the value 20. See n_prefix().

Value

A list where each element is a translated variable as described above.

Details

This function recursively translates each argument into list elements using as_data_list(), merging all resulting lists together. By default this means that:

  • numerics are included as-is.

  • logicals are translated into numeric using as.numeric().

  • factors are translated into numeric using as.numeric(), and an additional element named .n_name(argument_name) is added with the number of levels in the factor. The default .n_name function prefixes "n_" before the factor name; e.g. a factor named foo will have an element named n_foo added containing the number of levels in foo.

  • character vectors are converted into factors then translated into numeric in the same manner as factors are.

  • lists are translated by translating all elements of the list (recursively) and adding them to the result.

  • data.frames are translated by translating every column of the data.frame and adding them to the result. A variable named "n" (or .n_name(argument_name) if the data.frame is passed as a named argument argument_name) is also added containing the number of rows in the data frame.

  • NULL values are dropped. Setting a named argument to NULL can be used to drop that item from the resulting list (if an unwanted element was added to the list by a previous argument, such as a column from a data frame that is not needed in the model).

  • all other types are dropped (and a warning is given).
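The translation rules above can be sketched on a small data frame. This is an illustrative example only: the data frame df_types and its column names are hypothetical, and the comments restate the rules listed above rather than showing captured output.

```r
library(tidybayes)

# Hypothetical data frame with one column per commonly translated type
df_types = data.frame(
  y = c(1.5, 2.5, 3.5),          # numeric: included as-is
  flag = c(TRUE, FALSE, TRUE),   # logical: translated to numeric
  group = c("a", "b", "a"),      # character: converted to a factor, then numeric
  unused = c(10, 20, 30),        # a column the model does not need
  stringsAsFactors = FALSE
)

# Setting the named argument `unused` to NULL drops that element again
data_list = compose_data(df_types, unused = NULL)

# Expect elements y, flag, group, n_group (number of levels in group)
# and n (number of rows), with no element called unused
str(data_list)
```

Dropping a column with a NULL argument keeps the original data frame untouched, which can be convenient when the same data frame feeds several models.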

As in functions like dplyr::mutate(), each expression is evaluated in an environment containing the data list built up so far.

For example, this means that if the first argument to compose_data is a data frame, subsequent arguments can include direct references to columns from that data frame. This allows you, for example, to easily use x_at_y() to generate indices for nested models.
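As a minimal sketch of this sequential evaluation, the following call derives new elements from columns of a data frame passed earlier in the same call. The data frame df_obs and the names x_mean and x_centered are hypothetical and chosen only for illustration.

```r
library(tidybayes)

# Hypothetical observations
df_obs = data.frame(
  x = c(1, 2, 3, 4),
  y = c(2.1, 3.9, 6.2, 7.8)
)

data_list = compose_data(
  df_obs,
  # `x` refers to the column already added to the list by df_obs
  x_mean = mean(x),
  # `x_mean` was added by the previous argument, so it can be used here
  x_centered = x - x_mean
)

str(data_list)
```

Because arguments are evaluated in order, an expression can only refer to elements created by arguments that come before it.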

If you wish to add support for additional types not described above, provide an implementation of as_data_list() for the type. See the implementations of as_data_list.numeric, as_data_list.logical, etc., for examples.
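One possible shape for such an implementation is sketched below for Date objects, which are not among the types listed above. The method as_data_list.Date is hypothetical and not part of the package; it assumes the generic has the signature as_data_list(object, name = "", ...) and simply delegates to the numeric translation, so dates are passed to the model as days since 1970-01-01.

```r
library(tidybayes)

# Hypothetical S3 method (not provided by the package): translate Date
# objects by converting them to their numeric representation
# (days since 1970-01-01) and reusing the numeric translation.
as_data_list.Date = function(object, name = "", ...) {
  as_data_list(as.numeric(object), name = name, ...)
}

# Hypothetical input: two dates ten days apart
data_list = compose_data(
  day = as.Date(c("1970-01-01", "1970-01-11"))
)

str(data_list)
```

Delegating to an existing method keeps the new method short and means that naming and merging behave the same way as for the built-in types.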

Author

Matthew Kay

Examples


library(tidybayes)  # provides compose_data(), n_prefix(), and x_at_y()
library(magrittr)   # provides the %>% pipe

df = data.frame(
  plot = factor(paste0("p", rep(1:8, times = 2))),
  site = factor(paste0("s", rep(1:4, each = 2, times = 2)))
)

# without changing `.n_name`, compose_data() will prefix indices
# with "n" by default
df %>%
  compose_data()
#> $plot
#>  [1] 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
#> 
#> $n_plot
#> [1] 8
#> 
#> $site
#>  [1] 1 1 2 2 3 3 4 4 1 1 2 2 3 3 4 4
#> 
#> $n_site
#> [1] 4
#> 
#> $n
#> [1] 16
#> 

# you can use n_prefix() to define a different prefix (e.g. "N"):
df %>%
  compose_data(.n_name = n_prefix("N"))
#> $plot
#>  [1] 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
#> 
#> $N_plot
#> [1] 8
#> 
#> $site
#>  [1] 1 1 2 2 3 3 4 4 1 1 2 2 3 3 4 4
#> 
#> $N_site
#> [1] 4
#> 
#> $N
#> [1] 16
#> 

# If you have nesting, you may want a nested index, which can be generated using x_at_y()
# Here, site[p] will give the site for plot p
df %>%
  compose_data(site = x_at_y(site, plot))
#> $plot
#>  [1] 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
#> 
#> $n_plot
#> [1] 8
#> 
#> $site
#> [1] 1 1 2 2 3 3 4 4
#> 
#> $n_site
#> [1] 4
#> 
#> $n
#> [1] 16
#>