Skip to content

ppc_stat_grouped() fails when subset() is used inside mvbf() #323

Open
@dholstius

Description

@dholstius

Summary

  • Thanks for an amazing package!
  • ppc_stat_grouped() works with a multivariate model fit via brms::brm()
  • ppc_stat_grouped() fails with a multivariate model when subset() is used inside the mvbf()
  • Passing pre-filtered newdata is a workaround; other workarounds welcome
  • May apply to ppc_* other than ppc_stat_grouped(); that's all I tested

In the second example below, I get the error message:

Using all posterior draws for ppc type 'stat_grouped' by default.
Error in `validate_group()`:
! length(group) must be equal to the number of observations.
Run `rlang::last_trace()` to see where the error occurred.
> rlang::last_trace(drop = FALSE)
<error/rlang_error>
Error in `validate_group()`:
! length(group) must be equal to the number of observations.
---
Backtrace:
     ▆
  1. ├─bayesplot::pp_check(...)
  2. └─brms:::pp_check.brmsfit(...)
  3.   └─brms::do_call(ppc_fun, ppc_args)
  4.     └─brms:::eval2(call, envir = args, enclos = envir)
  5.       └─base::eval(expr, envir, ...)
  6.         └─base::eval(expr, envir, ...)
  7.           ├─bayesplot (local) .fun(y = .x1, yrep = .x2, group = .x3, stat = .x4)
  8.           │ └─base::eval(ungroup_call("ppc_stat", call), parent.frame())
  9.           │   └─base::eval(ungroup_call("ppc_stat", call), parent.frame())
 10.           └─bayesplot::ppc_stat(...)
 11.             └─bayesplot::ppc_stat_data(...)
 12.               └─bayesplot:::validate_group(group, length(y))
 13.                 └─rlang::abort("length(group) must be equal to the number of observations.")

Reprex

library(tidyverse)
library(brms)
library(bayesplot)

rbernoulli <- function (n, prob = 0.5) {
  sample(c(0L, 1L), size = n, replace = TRUE, prob = c(1 - prob, prob))
}

dat <-
  tibble(
    grp = LETTERS[1:9],
    y0 = map(seq(0.1, 0.9, by = 0.1), ~ rbernoulli(300, prob = .)),
    y1 = map(seq(0.1, 0.9, by = 0.1), ~ rbernoulli(300, prob = . / 2))) %>%
  unnest(
    c(y0, y1))

#' Just a little helper function to simplify the cases below
fit_logistic <- function (...) {
  brm(
    ...,
    data = dat,
    cores = 4, backend = "cmdstanr", # comment out as you see fit
    family = bernoulli())
}

#' Works great
fit <- fit_logistic(mvbf(y0 ~ (1|grp), y1 ~ (1|grp)))
pp_check(fit, resp = "y1", group = "grp", type = "stat_grouped", stat = "mean")

#' Variation: use `subset(...)` inside the `mvbf()`
#' Fails with "length(group) must be equal to the number of observations"
fit_sub <- fit_logistic(mvbf(y0 ~ (1|grp), y1 | subset(y0 == 1) ~ (1|grp)))
pp_check(fit_sub, resp = "y1", group = "grp", type = "stat_grouped", stat = "mean")

#' If we pass in `newdata` that matches the `subset()` criteria --- here, `y0 == 1` --- then success again
newdata <- filter(dat, y0 == 1)
pp_check(fit_sub, newdata = newdata, resp = "y1", group = "grp", type = "stat_grouped", stat = "mean")

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions