Compose data for input into a Bayesian model

Compose data into a list suitable to be passed into a Bayesian model (JAGS, BUGS, Stan, etc.).

compose_data(..., .n_name = n_prefix("n"))

Arguments

...: Data to be composed into a list suitable for being passed into Stan, JAGS, etc. Named arguments will have their name used as the name argument to as_data_list when translated; unnamed arguments that are not lists or data frames will have their bare value (passed through make.names) used as the name argument to as_data_list. Each argument is evaluated using eval_tidy in an environment that includes all list items composed so far.
.n_name: A function that is used to form the names of dimension index variables (variables whose value is the number of levels in a factor or the number of rows in a data frame in ...). For example, if a data frame with 20 rows and a factor "foo" (having 3 levels) is passed to compose_data, the list returned by compose_data will include an element named .n_name("foo"), which by default would be "n_foo", containing the value 3, and an element named "n" containing the value 20. See n_prefix().
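
As a small illustration of the naming scheme described for .n_name, the sketch below calls the function returned by n_prefix() directly and then passes a hand-written alternative. The helper count_name() is hypothetical (it is not part of the package), and it assumes that the row count of an unnamed data frame is requested with an empty name.

```r
library(tidybayes)

# n_prefix() returns a function that builds index-variable names
n_name = n_prefix("N")
n_name("foo")
n_name("")

# Hypothetical replacement: "num_<name>" for factors, and "num_rows" when
# the name is empty (assumed to be the case for an unnamed data frame)
count_name = function(name) {
  if (name == "") "num_rows" else paste0("num_", name)
}

composed = compose_data(
  data.frame(g = factor(c("a", "b", "a"))),
  .n_name = count_name
)
names(composed)
```

Any function that takes a single name and returns a single string should be usable here, so the naming can be matched to whatever the model code expects.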

Value

A list where each element is a translated variable as described above.

Details

This function recursively translates each argument into list elements using as_data_list(), merging all resulting lists together. By default this means that:

numerics are included as-is.
logicals are translated into numeric using as.numeric().
factors are translated into numeric using as.numeric(), and an additional element named .n_name(argument_name) is added with the number of levels in the factor. The default .n_name function prefixes "n_" before the factor name; e.g. a factor named foo will have an element named n_foo added containing the number of levels in foo.
character vectors are converted into factors then translated into numeric in the same manner as factors are.
lists are translated by translating all elements of the list (recursively) and adding them to the result.
data.frames are translated by translating every column of the data.frame and adding them to the result. A variable named "n" (or .n_name(argument_name) if the data.frame is passed as a named argument argument_name) is also added containing the number of rows in the data frame.
NULL values are dropped. Setting a named argument to NULL can be used to drop that item from the resulting list (if an unwanted element was added to the list by a previous argument, such as a column from a data frame that is not needed in the model).
all other types are dropped (and a warning is given).
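
The default rules listed above can be tried out with a few small vectors. This is only a sketch: the argument names (x, flag, group, label) are made up for illustration, and the comments describe what the rules above imply rather than captured output.

```r
library(tidybayes)

# One named argument per type covered by the default rules
composed = compose_data(
  x = c(1.5, 2.5, 3.5),               # numeric: included as-is
  flag = c(TRUE, FALSE, TRUE),        # logical: translated with as.numeric()
  group = factor(c("a", "b", "a")),   # factor: numeric codes, plus n_group
  label = c("u", "v", "u")            # character: handled like a factor, plus n_label
)

# Inspect the names and types of the resulting elements
str(composed)
```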

As in functions like dplyr::mutate(), each expression is evaluated in an environment containing the data list built up so far.

For example, this means that if the first argument to compose_data is a data frame, subsequent arguments can include direct references to columns from that data frame. This allows you, for example, to easily use x_at_y() to generate indices for nested models.
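
A minimal sketch of this mutate-like evaluation, combined with dropping an element by setting it to NULL, might look as follows. The data frame and its column names (y, group, row_id) are invented for the example.

```r
library(tidybayes)

df = data.frame(
  y = c(2.1, 3.4, 1.9, 4.0),
  group = factor(c("a", "a", "b", "b")),
  row_id = 1:4
)

composed = compose_data(
  df,
  y_mean = mean(y),   # `y` refers to the column already composed from df
  row_id = NULL       # drops the row_id element that df contributed
)

names(composed)
```

Because later arguments see the list built so far, derived quantities such as y_mean can be computed in the same call instead of being added to the data frame beforehand.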

If you wish to add support for additional types not described above, provide an implementation of as_data_list() for the type. See the implementations of as_data_list.numeric, as_data_list.logical, etc., for examples.
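
As a hedged sketch of what such an implementation could look like, the method below adds support for Date vectors by converting them to numeric (days since 1970-01-01) and delegating to the existing numeric handling. The method as_data_list.Date is an assumption made for illustration and is not provided by the package; it also assumes the method is defined at the top level of a script, so that S3 dispatch can find it (inside a package it would need to be registered as an S3 method).

```r
library(tidybayes)

# Hypothetical method: represent Date values as days since 1970-01-01,
# then reuse the numeric translation
as_data_list.Date = function(object, name = "", ...) {
  as_data_list(as.numeric(object), name = name, ...)
}

composed = compose_data(
  visit = as.Date(c("2020-01-01", "2020-01-15"))
)

composed$visit
```

Delegating to an existing method keeps the new type consistent with how numerics are already composed; a type that needs its own index variable would have to add that element itself.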

Author

Matthew Kay

Examples


library(tidybayes)
library(magrittr)

df = data.frame(
  plot = factor(paste0("p", rep(1:8, times = 2))),
  site = factor(paste0("s", rep(1:4, each = 2, times = 2)))
)

# without changing `.n_name`, compose_data() will prefix indices
# with "n" by default
df %>%
  compose_data()
#> $plot
#>  [1] 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
#> 
#> $n_plot
#> [1] 8
#> 
#> $site
#>  [1] 1 1 2 2 3 3 4 4 1 1 2 2 3 3 4 4
#> 
#> $n_site
#> [1] 4
#> 
#> $n
#> [1] 16
#> 

# you can use n_prefix() to define a different prefix (e.g. "N"):
df %>%
  compose_data(.n_name = n_prefix("N"))
#> $plot
#>  [1] 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
#> 
#> $N_plot
#> [1] 8
#> 
#> $site
#>  [1] 1 1 2 2 3 3 4 4 1 1 2 2 3 3 4 4
#> 
#> $N_site
#> [1] 4
#> 
#> $N
#> [1] 16
#> 

# If you have nesting, you may want a nested index, which can be generated using x_at_y()
# Here, site[p] will give the site for plot p
df %>%
  compose_data(site = x_at_y(site, plot))
#> $plot
#>  [1] 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
#> 
#> $n_plot
#> [1] 8
#> 
#> $site
#> [1] 1 1 2 2 3 3 4 4
#> 
#> $n_site
#> [1] 4
#> 
#> $n
#> [1] 16
#>
