6  Mediation Analysis with a Multicategorical Antecedent

“Historically, investigators interested in doing a mediation analysis with a multicategorical antecedent \(X\) have resorted to some less optimal strategies than the one [Hayes] discuss[ed] in this chapter” (Hayes, 2018, p. 188). Happily, the approach outlined in this chapter avoids such gaffes. Hayes’s procedure “does not require discarding any data; the entire sample is analyzed simultaneously. Furthermore, the multicategorical nature of \(X\) is respected and retained” (p. 189).

6.1 Relative total, direct, and indirect effects

In review of regression analysis in Chapter 2, we saw that a multicategorical antecedent variable with \(g\) categories can be used as an antecedent variable in a regression model if it is represented by \(g - 1\) variables using some kind of group coding system (see section 2.7). [Hayes] described indicator or dummy coding as one such system, where groups are represented with \(g - 1\) variables set to either zero or one (see Table 2.1). With indicator coding, one of the \(g\) groups is chosen as the reference group. Cases in the reference group receive a zero on all \(g - 1\) variables coding \(X\). Each of the remaining \(g - 1\) groups gets its own indicator variable that is set to 1 for cases in that group, with all other cases set to zero. Using such a system, which of the \(g\) groups a case is in is represented by its pattern of zeros and ones on the \(g - 1\) indicator variables. These \(g - 1\) indicator variables are then used as antecedent variables in a regression model as a stand-in for \(X\). (pp. 189–190, emphasis in the original)
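
To make that coding concrete before we continue, here’s a toy example of indicator coding for a hypothetical three-group variable, where the first group serves as the reference. The group labels are made up for illustration.

library(tidyverse)

tibble(x = c("control", "treatment_a", "treatment_b")) %>% 
  mutate(d1 = if_else(x == "treatment_a", 1, 0),
         d2 = if_else(x == "treatment_b", 1, 0))
# A tibble: 3 × 3
  x              d1    d2
  <chr>       <dbl> <dbl>
1 control         0     0
2 treatment_a     1     0
3 treatment_b     0     1

The reference group, control, gets zeros on both indicators.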

6.1.1 Relative indirect effects

When our \(X\) is multicategorical, we end up with \(g - 1\) \(a\) coefficients. Presuming the \(M\) variable is continuous or binary, this will yield \(g - 1\) relative indirect effects, \(a_j b\).
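
In equation form, with \(g = 3\) and two indicator variables \(D_1\) and \(D_2\), the two sub-models follow the form

\[\begin{align*}
M & = i_M + a_1 D_1 + a_2 D_2 + e_M \\
Y & = i_Y + c_1' D_1 + c_2' D_2 + b M + e_Y,
\end{align*}\]

where the relative indirect effect for the \(j\)th group, compared with the reference group, is the product \(a_j b\).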

6.1.2 Relative direct effects

Similar to above, when our \(X\) is multicategorical, we end up with \(g - 1\) \(c'\) coefficients, each of which is a relative direct effect.

6.1.3 Relative total effects

With the two prior subsections in mind, when our \(X\) is multicategorical, we end up with \(g - 1\) \(c\) coefficients, each of which is a relative total effect. These follow the form

\[c_j = c_j' + a_j b,\]

where \(j\) indexes a given group.

6.2 An example: Sex discrimination in the workplace

Here we load a couple necessary packages, load the data, and take a glimpse().

library(tidyverse)

protest <- read_csv("data/protest/protest.csv")

glimpse(protest)
Rows: 129
Columns: 6
$ subnum   <dbl> 209, 44, 124, 232, 30, 140, 27, 64, 67, 182, 85, 109, 122, 69, 45, 28, 170, 66, 168, 97, 7, 29, 116, 35, 13, 12…
$ protest  <dbl> 2, 0, 2, 2, 2, 1, 2, 0, 0, 0, 2, 2, 0, 1, 1, 0, 1, 2, 2, 1, 2, 1, 1, 2, 2, 0, 1, 1, 0, 1, 1, 1, 0, 2, 1, 2, 2, …
$ sexism   <dbl> 4.87, 4.25, 5.00, 5.50, 5.62, 5.75, 5.12, 6.62, 5.75, 4.62, 4.75, 6.12, 4.87, 5.87, 4.87, 4.62, 4.50, 5.50, 4.2…
$ angry    <dbl> 2, 1, 3, 1, 1, 1, 2, 1, 6, 1, 2, 5, 2, 1, 1, 1, 2, 1, 3, 4, 1, 1, 1, 5, 1, 5, 1, 1, 2, 1, 1, 1, 4, 1, 2, 6, 1, …
$ liking   <dbl> 4.83, 4.50, 5.50, 5.66, 6.16, 6.00, 4.66, 6.50, 1.00, 6.83, 5.00, 5.66, 5.83, 6.50, 4.50, 4.83, 6.16, 4.33, 4.8…
$ respappr <dbl> 4.25, 5.75, 4.75, 7.00, 6.75, 5.50, 5.00, 6.25, 3.00, 5.75, 5.25, 7.00, 4.50, 6.25, 5.00, 5.25, 5.25, 5.00, 4.5…

Here are the ungrouped means and \(\textit{SD}\)’s for respappr and liking shown at the bottom of Table 6.1.

protest %>%
  pivot_longer(liking:respappr) %>% 
  group_by(name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value)) %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 2 × 3
  name      mean    sd
  <chr>    <dbl> <dbl>
1 liking    5.64  1.05
2 respappr  4.87  1.35

We compute the summaries for respappr and liking, grouped by protest, like so.

protest %>%
  pivot_longer(liking:respappr) %>% 
  group_by(protest, name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value)) %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 6 × 4
# Groups:   protest [3]
  protest name      mean    sd
    <dbl> <chr>    <dbl> <dbl>
1       0 liking    5.31 1.30 
2       0 respappr  3.88 1.46 
3       1 liking    5.83 0.819
4       1 respappr  5.14 1.08 
5       2 liking    5.75 0.936
6       2 respappr  5.49 0.936

It looks like Hayes has a typo in the \(\textit{SD}\) for liking when protest == 0. It seems he accidentally entered the value for when protest == 1 in that slot.

You’ll have to wait a minute to see where the adjusted \(Y\) values came from.

With a little if_else(), computing the dummies d1 and d2 is easy enough.

protest <- protest %>% 
  mutate(d1 = if_else(protest == 1, 1, 0),
         d2 = if_else(protest == 2, 1, 0))

We’re almost ready to fit the model. Let’s load brms.

library(brms)

This is the first time we’ve had a simple univariate regression model in a while–no special mvbind() syntax or multiple bf() formulas, just straight up brms::brm().

model6.1 <- brm(
  data = protest, 
  family = gaussian,
  liking ~ 1 + d1 + d2,
  cores = 4,
  file = "fits/model06.01")

Check the coefficient summaries.

fixef(model6.1)
           Estimate Est.Error         Q2.5     Q97.5
Intercept 5.3156338 0.1707610  4.986492764 5.6538640
d1        0.5134009 0.2353311  0.047812633 0.9800288
d2        0.4375344 0.2307265 -0.007203555 0.8897587

Our \(R^2\) differs a bit from the OLS version in the text. This shouldn’t be surprising when the estimate is so close to its lower boundary of zero.

bayes_R2(model6.1)
     Estimate  Est.Error        Q2.5     Q97.5
R2 0.05806353 0.03639497 0.004798362 0.1429082

Here’s its shape. For the plots in this chapter, we’ll take a few formatting cues from Edward Tufte (2001), courtesy of the ggthemes package. The theme_tufte() function will change the default font and remove some chart junk. We will take our color palette from Pokemon via the palettetown package (Lucas, 2016).

library(ggthemes)
library(palettetown)

bayes_R2(model6.1, summary = F) %>% 
  data.frame() %>% 
  
  ggplot(aes(x = R2)) +
  geom_density(linewidth = 0, fill = pokepal(pokemon = "plusle")[2]) +
  scale_y_continuous(NULL, breaks = NULL) +
  coord_cartesian(xlim = 0:1) +
  xlab(expression(italic(R)^2)) +
  theme_tufte() +
  theme(legend.title = element_blank(),
        plot.background = element_rect(fill = pokepal(pokemon = "plusle")[8]))

To use the model-implied equations to compute the means for each group on the criterion, we’ll extract the posterior draws.

draws <- as_draws_df(model6.1)

draws %>% 
  mutate(Y_np = b_Intercept + b_d1 * 0 + b_d2 * 0,
         Y_ip = b_Intercept + b_d1 * 1 + b_d2 * 0,
         Y_cp = b_Intercept + b_d1 * 0 + b_d2 * 1) %>% 
  pivot_longer(contains("Y_")) %>%
  # this line will order our output the same way Hayes did in the text (p. 197)
  mutate(name = factor(name, levels = c("Y_np", "Y_ip", "Y_cp"))) %>% 
  group_by(name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value))
# A tibble: 3 × 3
  name   mean    sd
  <fct> <dbl> <dbl>
1 Y_np   5.32 0.171
2 Y_ip   5.83 0.162
3 Y_cp   5.75 0.155

What Hayes called the “relative total effects” \(c_1\) and \(c_2\) are the d1 and d2 lines in our fixef() output, above.
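
Because fixef() returns a matrix with named rows, you can pull just those two lines with a little bracket indexing.

fixef(model6.1)[c("d1", "d2"), ]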

Here are the sub-models for the mediation model.

m_model <- bf(respappr ~ 1 + d1 + d2)
y_model <- bf(liking   ~ 1 + d1 + d2 + respappr)

There’s a third way to fit multivariate models in brms. It uses either the mvbrmsformula() function, or its abbreviated version, mvbf(). With these, we first define our submodels with bf() statements like before. We then combine them within mvbf(), separated with a comma. If we’d like to avoid estimating a residual correlation, which we do in this project, we then set rescor = FALSE. Here’s how it looks for our second model.

model6.2 <- brm(
  data = protest, 
  family = gaussian,
  mvbf(m_model, y_model, rescor = FALSE),
  cores = 4,
  file = "fits/model06.02")
print(model6.2)
 Family: MV(gaussian, gaussian) 
  Links: mu = identity
         mu = identity 
Formula: respappr ~ 1 + d1 + d2 
         liking ~ 1 + d1 + d2 + respappr 
   Data: protest (Number of observations: 129) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
respappr_Intercept     3.88      0.19     3.51     4.24 1.00     5285     2838
liking_Intercept       3.72      0.31     3.13     4.32 1.00     5877     3393
respappr_d1            1.26      0.26     0.75     1.76 1.00     5756     3184
respappr_d2            1.61      0.25     1.12     2.10 1.00     5539     3229
liking_d1             -0.00      0.22    -0.45     0.43 1.00     4017     3171
liking_d2             -0.22      0.24    -0.68     0.24 1.00     4025     3392
liking_respappr        0.41      0.07     0.27     0.55 1.00     4573     3234

Further Distributional Parameters:
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_respappr     1.18      0.08     1.04     1.34 1.00     6940     2909
sigma_liking       0.93      0.06     0.82     1.05 1.00     7109     2888

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Behold the Bayesian \(R^2\) posteriors.

bayes_R2(model6.2, summary = F) %>% 
  data.frame() %>% 
  pivot_longer(everything()) %>% 
  
  ggplot(aes(x = value, fill = name)) +
  geom_density(linewidth = 0, alpha = 2/3) +
  annotate("text", x = 0.2, y = 7, label = "liking", 
           color = pokepal(pokemon = "plusle")[2], family = "Times") +
  annotate("text", x = 0.355, y = 6, label = "respappr", 
           color = pokepal(pokemon = "plusle")[6], family = "Times") +
  scale_fill_manual(values = pokepal(pokemon = "plusle")[c(2, 6)]) +
  scale_x_continuous(NULL, limits = c(0:1)) +
  scale_y_continuous(NULL, breaks = NULL) +
  labs(title = expression(The~italic(R)^2*" densities overlap near perfectly, both hovering around .25.")) +
  theme_tufte() +
  theme(legend.position = "none",
        plot.background = element_rect(fill = pokepal(pokemon = "plusle")[8]))

To get the model summaries as presented in the second two columns in Table 6.2, we use as_draws_df(), rename a bit, and summarize. Like in the last chapter, here we’ll do so with a little help from tidybayes.

library(tidybayes)

draws <- as_draws_df(model6.2) %>% 
  mutate(a1       = b_respappr_d1,
         a2       = b_respappr_d2,
         b        = b_liking_respappr,
         c1_prime = b_liking_d1,
         c2_prime = b_liking_d2,
         i_m      = b_respappr_Intercept,
         i_y      = b_liking_Intercept)

draws %>% 
  pivot_longer(a1:i_y) %>% 
  group_by(name) %>% 
  mean_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 7 × 7
  name      value .lower .upper .width .point .interval
  <chr>     <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 a1        1.26   0.749  1.76    0.95 mean   qi       
2 a2        1.62   1.12   2.10    0.95 mean   qi       
3 b         0.411  0.273  0.551   0.95 mean   qi       
4 c1_prime -0.004 -0.45   0.43    0.95 mean   qi       
5 c2_prime -0.219 -0.675  0.24    0.95 mean   qi       
6 i_m       3.88   3.51   4.24    0.95 mean   qi       
7 i_y       3.72   3.13   4.32    0.95 mean   qi       

Working with the \(\overline M_{ij}\) formulas on page 199 is quite similar to what we did above.

draws %>% 
  mutate(M_np = b_respappr_Intercept + b_respappr_d1 * 0 + b_respappr_d2 * 0,
         M_ip = b_respappr_Intercept + b_respappr_d1 * 1 + b_respappr_d2 * 0,
         M_cp = b_respappr_Intercept + b_respappr_d1 * 0 + b_respappr_d2 * 1) %>% 
  pivot_longer(starts_with("M_")) %>%
  # this line will order our output the same way Hayes did in the text (p. 199)
  mutate(name = factor(name, levels = c("M_np", "M_ip", "M_cp"))) %>% 
  group_by(name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value))
# A tibble: 3 × 3
  name   mean    sd
  <fct> <dbl> <dbl>
1 M_np   3.88 0.186
2 M_ip   5.15 0.180
3 M_cp   5.50 0.174

The \(\overline Y^*_{ij}\) formulas are more of the same.

draws <- draws %>% 
  mutate(Y_np = b_liking_Intercept + b_liking_d1 * 0 + b_liking_d2 * 0 + b_liking_respappr * mean(protest$respappr),
         Y_ip = b_liking_Intercept + b_liking_d1 * 1 + b_liking_d2 * 0 + b_liking_respappr * mean(protest$respappr),
         Y_cp = b_liking_Intercept + b_liking_d1 * 0 + b_liking_d2 * 1 + b_liking_respappr * mean(protest$respappr))

draws %>% 
  pivot_longer(starts_with("Y_")) %>% 
  mutate(name = factor(name, levels = c("Y_np", "Y_ip", "Y_cp"))) %>% 
  group_by(name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value))
# A tibble: 3 × 3
  name   mean    sd
  <fct> <dbl> <dbl>
1 Y_np   5.71 0.166
2 Y_ip   5.71 0.142
3 Y_cp   5.50 0.144

Note, these are where the adjusted \(Y\) values came from in Table 6.1.

This is as fine a spot as any to introduce coefficient plots. The brms, tidybayes, and bayesplot packages all offer convenience functions for coefficient plots. Before we get all lazy using convenience functions, it’s good to know how to make coefficient plots by hand. Here’s ours for those last three \(\overline Y^*_{ij}\)-values.

draws %>% 
  pivot_longer(starts_with("Y_")) %>% 
  
  ggplot(aes(x = value, y = name, color = name)) +
  stat_summary(geom = "pointrange",
               fun = median,
               fun.min = function(x) {quantile(x, probs = 0.025)},
               fun.max = function(x) {quantile(x, probs = 0.975)},
               size = 0.75) +
  stat_summary(geom = "linerange",
               fun.min = function(x) {quantile(x, probs = 0.25)},
               fun.max = function(x) {quantile(x, probs = 0.75)},
               linewidth = 1.5) +
  scale_color_manual(values = pokepal(pokemon = "plusle")[c(3, 7, 9)]) +
  labs(x = NULL, y = NULL) +
  theme_tufte() +
  theme(axis.ticks.y = element_blank(),
        legend.position = "none",
        plot.background = element_rect(fill = pokepal(pokemon = "plusle")[8]))

The points are the posterior medians, the thick inner lines the 50% intervals, and the thinner outer lines the 95% intervals. For kicks, we distinguished the three values by color.
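
The brms package also offers a lazier alternative with the mcmc_plot() function. Here’s a sketch; by the bayesplot defaults, the inner and outer lines will depict 50% and 90% intervals, rather than the 50% and 95% intervals in our hand-made version. The "^b_" regular expression restricts the plot to the regression coefficients.

mcmc_plot(model6.2, variable = "^b_", regex = TRUE)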

If we want to examine \(R^2\) change for dropping the dummy variables, we’ll first fit a model that omits them.

model6.3 <- brm(
  data = protest, 
  family = gaussian,
  liking ~ 1 + respappr,
  cores = 4,
  file = "fits/model06.03")

Here are the competing \(R^2\) distributions.

# get the R2 draws and wrangle
r2 <- rbind(
  bayes_R2(model6.2, resp = "liking", summary = F),
  bayes_R2(model6.3, summary = F)) %>% 
  data.frame() %>% 
  set_names("R2") %>% 
  mutate(fit = rep(c("model6.2", "model6.3"), each = n() / 2))

# plot!
r2 %>% 
  ggplot(aes(x = R2, fill = fit)) +
  geom_density(linewidth = 0, alpha = 2/3) +
  annotate("text", x = 0.15, y = 6.4, label = "model3", 
           color = pokepal(pokemon = "plusle")[7], family = "Times") +
  annotate("text", x = 0.33, y = 6.9, label = "model2", 
           color = pokepal(pokemon = "plusle")[6], family = "Times") +
  scale_fill_manual(values = pokepal(pokemon = "plusle")[c(6, 7)]) +
  scale_x_continuous(NULL, limits = 0:1) +
  scale_y_continuous(NULL, breaks = NULL) +
  ggtitle(expression(The~italic(R)^2*" densities for LIKING overlap a lot.")) +
  theme_tufte() +
  theme(legend.position = "none",
        plot.background = element_rect(fill = pokepal(pokemon = "plusle")[8]))

If you want to compare them with a change score, do something like this.

r2 %>%
  mutate(iter = rep(1:4000, times = 2)) %>% 
  pivot_wider(names_from = fit, values_from = R2) %>% 

  ggplot(aes(x = model6.2 - model6.3)) +
  geom_density(linewidth = 0, fill = pokepal(pokemon = "plusle")[4]) +
  geom_vline(xintercept = 0, color = pokepal(pokemon = "plusle")[8]) +
  scale_y_continuous(NULL, breaks = NULL) +
  labs(title = expression(The~Delta*italic(R)^2~distribution),
       subtitle = "Doesn't appear we have a lot of change.",
       x = NULL) +
  theme_tufte() +
  theme(legend.title = element_blank(),
        plot.background = element_rect(fill = pokepal(pokemon = "plusle")[8]))
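
If you’d like a numeric summary of that difference, mean_qi() works here, too. Here’s a sketch; consistent with the plot, expect an interval hovering around zero.

r2 %>%
  mutate(iter = rep(1:4000, times = 2)) %>% 
  pivot_wider(names_from = fit, values_from = R2) %>% 
  mutate(difference = model6.2 - model6.3) %>% 
  mean_qi(difference)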

Now compute the posterior means and 95% intervals for \(a_1 b\) and \(a_2 b\), the relative indirect effects.

draws %>% 
  mutate(a1b = a1 * b,
         a2b = a2 * b) %>%
  pivot_longer(a1b:a2b) %>% 
  group_by(name) %>% 
  mean_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 2 × 7
  name  value .lower .upper .width .point .interval
  <chr> <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 a1b   0.518  0.271  0.805   0.95 mean   qi       
2 a2b   0.663  0.39   0.994   0.95 mean   qi       

6.3 Using a different group coding system

Here we’ll make our alternative dummies, what we’ll call d_1 and d_2, with orthogonal contrast coding.

protest <- protest %>% 
  mutate(d_1 = if_else(protest == 0, -2/3, 1/3),
         d_2 = if_else(protest == 0, 0, 
                       if_else(protest == 1, -1/2, 1/2)))
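
These codes are orthogonal in the sense that each sums to zero across the three groups, and their dot product is zero, too. Here’s a quick check based on the three distinct coding patterns.

protest %>% 
  distinct(protest, d_1, d_2) %>% 
  summarize(sum_d_1 = sum(d_1),
            sum_d_2 = sum(d_2),
            dot     = sum(d_1 * d_2))

All three quantities come out to exactly zero.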

Here are the sub-models.

m_model <- bf(respappr ~ 1 + d_1 + d_2)
y_model <- bf(liking   ~ 1 + d_1 + d_2 + respappr)

Now we fit using the mvbf() approach.

model6.4 <- brm(
  data = protest, 
  family = gaussian,
  mvbf(m_model, y_model, rescor = FALSE),
  cores = 4,
  file = "fits/model06.04")

Here are our intercepts and regression coefficient summaries.

fixef(model6.4)
                     Estimate  Est.Error       Q2.5     Q97.5
respappr_Intercept  4.8417337 0.10365550  4.6403229 5.0445784
liking_Intercept    3.6395870 0.35737005  2.9421210 4.3434059
respappr_d_1        1.4329954 0.22776019  0.9809119 1.8840704
respappr_d_2        0.3465410 0.24804578 -0.1261091 0.8294025
liking_d_1         -0.1091479 0.20319804 -0.5111210 0.2860424
liking_d_2         -0.2186369 0.20343717 -0.6164157 0.1849367
liking_respappr     0.4112150 0.07115662  0.2730185 0.5492890

It’s important to note that these will not correspond to the “TOTAL EFFECT MODEL” section of the PROCESS output in Figure 6.3. Hayes’s PROCESS code used the mcx=3 command, which tells the program to reparametrize the orthogonal contrasts. brms has no such command.
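
If you wanted relative total effects for this coding scheme directly from a model, one workaround would be to fit the total-effect model with the contrast codes as the sole predictors. Here’s a sketch; the model name and file path are our own invention, and the d_1 and d_2 coefficients from such a fit should approximate the c1 and c2 values we’ll compute by hand at the end of this section.

model6.4b <- brm(
  data = protest, 
  family = gaussian,
  liking ~ 1 + d_1 + d_2,
  cores = 4,
  file = "fits/model06.04b")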

For now, we’ll have to jump to Equation 6.8 towards the bottom of page 207. Those parameters are evident in our output. For good measure, here we’ll practice with posterior_summary().

posterior_summary(model6.4) %>% 
  data.frame() %>% 
  rownames_to_column("parameter") %>% 
  filter(str_detect(parameter, "b_respappr"))
             parameter Estimate Est.Error       Q2.5     Q97.5
1 b_respappr_Intercept 4.841734 0.1036555  4.6403229 5.0445784
2       b_respappr_d_1 1.432995 0.2277602  0.9809119 1.8840704
3       b_respappr_d_2 0.346541 0.2480458 -0.1261091 0.8294025

Thus it’s easy to get the \(\overline M_{ij}\) means with a little posterior manipulation.

draws <- as_draws_df(model6.4) %>% 
  mutate(M_np = b_respappr_Intercept + b_respappr_d_1 * -2/3 + b_respappr_d_2 *    0,
         M_ip = b_respappr_Intercept + b_respappr_d_1 *  1/3 + b_respappr_d_2 * -1/2,
         M_cp = b_respappr_Intercept + b_respappr_d_1 *  1/3 + b_respappr_d_2 *  1/2)

draws %>% 
  pivot_longer(starts_with("M_")) %>% 
  mutate(name = factor(name, levels = c("M_np", "M_ip", "M_cp"))) %>% 
  group_by(name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value))
# A tibble: 3 × 3
  name   mean    sd
  <fct> <dbl> <dbl>
1 M_np   3.89 0.183
2 M_ip   5.15 0.182
3 M_cp   5.49 0.176

With these in hand, we can compute \(a_1\) and \(a_2\).

draws <- draws %>% 
  mutate(a1 = (M_ip + M_cp) / 2 - M_np,
         a2 = M_cp - M_ip)

draws %>% 
  pivot_longer(a1:a2) %>% 
  group_by(name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value))
# A tibble: 2 × 3
  name   mean    sd
  <chr> <dbl> <dbl>
1 a1    1.43  0.228
2 a2    0.347 0.248

Happily, our model output will allow us to work with Hayes’s \(\overline Y^*_{ij}\) equations in the middle of page 210.

draws <- draws %>% 
  mutate(Y_np = b_liking_Intercept + b_liking_d_1 * -2/3 + b_liking_d_2 *    0 + b_liking_respappr * mean(protest$respappr),
         Y_ip = b_liking_Intercept + b_liking_d_1 *  1/3 + b_liking_d_2 * -1/2 + b_liking_respappr * mean(protest$respappr),
         Y_cp = b_liking_Intercept + b_liking_d_1 *  1/3 + b_liking_d_2 *  1/2 + b_liking_respappr * mean(protest$respappr))

draws %>% 
  pivot_longer(starts_with("Y_")) %>% 
  mutate(name = factor(name, levels = c("Y_np", "Y_ip", "Y_cp"))) %>% 
  group_by(name) %>%
  summarize(mean = mean(value),
            sd   = sd(value))
# A tibble: 3 × 3
  name   mean    sd
  <fct> <dbl> <dbl>
1 Y_np   5.71 0.157
2 Y_ip   5.71 0.145
3 Y_cp   5.49 0.147

And with these in hand, we can compute \(c'_1\) and \(c'_2\).

draws <- draws %>% 
  mutate(c1_prime = (Y_ip + Y_cp) / 2 - Y_np,
         c2_prime = Y_cp - Y_ip)

draws %>% 
  pivot_longer(c1_prime:c2_prime) %>% 
  group_by(name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value))
# A tibble: 2 × 3
  name       mean    sd
  <chr>     <dbl> <dbl>
1 c1_prime -0.109 0.203
2 c2_prime -0.219 0.203

It appears Hayes has a typo in the formula for \(c'_2\) on page 211. The value he has down for \(\overline Y^*_{IP}\), 5.145, is incorrect. It’s not the one he displayed at the bottom of the previous page and it also contradicts the analyses herein. So it goes… These things happen.
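
To see the problem, work the arithmetic with the correct values from our \(\overline Y^*_{ij}\) summaries, above:

\[c_2' = \overline Y^*_{CP} - \overline Y^*_{IP} = 5.49 - 5.71 \approx -0.22,\]

which matches our c2_prime summary.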

We haven’t spelled it out, but the \(b\) parameter is currently labeled b_liking_respappr in our draws object. Here we’ll make a b column to make things easier. While we’re at it, we’ll compute the indirect effects, too.

draws <- draws %>%
  mutate(b = b_liking_respappr) %>% 
  mutate(a1b = a1 * b,
         a2b = a2 * b)

draws %>% 
  pivot_longer(a1b:a2b) %>% 
  group_by(name) %>% 
  mean_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 2 × 7
  name  value .lower .upper .width .point .interval
  <chr> <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 a1b   0.59   0.339  0.893   0.95 mean   qi       
2 a2b   0.142 -0.05   0.359   0.95 mean   qi       

Now we can compute and summarize() our \(c_1\) and \(c_2\).

draws <- draws %>% 
  mutate(c1 = c1_prime + a1b,
         c2 = c2_prime + a2b)

draws %>% 
  pivot_longer(c1:c2) %>% 
  group_by(name) %>% 
  summarize(mean = mean(value),
            sd   = sd(value))
# A tibble: 2 × 3
  name     mean    sd
  <chr>   <dbl> <dbl>
1 c1     0.481  0.197
2 c2    -0.0765 0.224

6.4 Some miscellaneous issues [unrelated to those Hayes covered in the text]

Do you recall how way back in Chapter 2 we covered an alternative way to fit models with multicategorical grouping variables? Well, we did. The basic strategy is to save our grouping variable as a factor and then enter it into the model with the special 0 + syntax, which removes the typical intercept. Since this chapter is all about multicategorical variables, it might make sense to explore what happens when we use this approach. For our first step, we’ll prepare the data.

protest <- protest %>% 
  mutate(group = factor(protest,
                        levels = 0:2,
                        labels = c("none", "individual", "collective")))

protest %>% 
  select(protest, group)
# A tibble: 129 × 2
   protest group     
     <dbl> <fct>     
 1       2 collective
 2       0 none      
 3       2 collective
 4       2 collective
 5       2 collective
 6       1 individual
 7       2 collective
 8       0 none      
 9       0 none      
10       0 none      
# ℹ 119 more rows

Before we fit a full mediation model, we should warm up. Here we fit a univariable model for liking. This is an alternative to what we did way back with model6.1.

model6.5 <- brm(
  data = protest, 
  family = gaussian,
  liking ~ 0 + group,
  cores = 4,
  file = "fits/model06.05")

Check the summary.

print(model6.5)
 Family: gaussian 
  Links: mu = identity 
Formula: liking ~ 0 + group 
   Data: protest (Number of observations: 129) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
groupnone           5.31      0.17     4.99     5.64 1.00     4556     2860
groupindividual     5.82      0.16     5.51     6.13 1.00     4428     2820
groupcollective     5.75      0.15     5.45     6.05 1.00     3912     2758

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     1.04      0.07     0.92     1.18 1.00     4089     2910

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

There’s no conventional intercept parameter. Rather, each of the levels of group gets its own conditional intercept. To get a sense of what this model is, let’s practice our coefficient-plotting skills. This time we’ll compute the necessary values before plugging them into ggplot2.

# compute the means for `liking` by `group`
group_means <-
  protest %>% 
  group_by(group) %>% 
  summarize(mu_liking = mean(liking))

# pull the posterior summaries and wrangle
fixef(model6.5) %>% 
  data.frame() %>% 
  rownames_to_column("parameter") %>% 
  mutate(group = str_remove(parameter, "group")) %>% 
  
  # plot!
  ggplot(aes(y = group)) +
  # this is the main function for our coefficient plots
  geom_pointrange(aes(x = Estimate, xmin = Q2.5, xmax = Q97.5, color = group),
                  linewidth = 3/4) +
  geom_point(data = group_means,
             aes(x = mu_liking)) +
  scale_color_manual(values = pokepal(pokemon = "plusle")[c(3, 7, 9)]) +
  labs(x = NULL, y = NULL) +
  theme_tufte() +
  theme(axis.text.y = element_text(hjust = 0),
        axis.ticks.y = element_blank(),
        legend.position = "none",
        plot.background = element_rect(fill = pokepal(pokemon = "plusle")[8]))

The results from the model are in colored point ranges. The black dots in the foreground are the empirical means. It looks like our model did a good job estimating the group means for liking.

Let’s see how this coding approach works when you fit a full mediation model. First, define the sub-models with two bf() lines.

m_model <- bf(respappr ~ 0 + group)
y_model <- bf(liking   ~ 0 + group + respappr)

Now fit model6.6 using the mvbf() approach.

model6.6 <- brm(
  data = protest, 
  family = gaussian,
  mvbf(m_model, y_model, rescor = FALSE),
  cores = 4,
  file = "fits/model06.06")

What will the summary hold?

print(model6.6)
 Family: MV(gaussian, gaussian) 
  Links: mu = identity
         mu = identity 
Formula: respappr ~ 0 + group 
         liking ~ 0 + group + respappr 
   Data: protest (Number of observations: 129) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
respappr_groupnone           3.88      0.18     3.53     4.24 1.00     3568     2459
respappr_groupindividual     5.15      0.18     4.80     5.51 1.00     4109     3193
respappr_groupcollective     5.50      0.17     5.17     5.84 1.00     4065     2561
liking_groupnone             3.71      0.31     3.08     4.32 1.00     1140     1775
liking_groupindividual       3.71      0.40     2.92     4.48 1.00     1070     1861
liking_groupcollective       3.49      0.42     2.67     4.30 1.00     1171     1834
liking_respappr              0.41      0.07     0.27     0.56 1.00     1100     1524

Further Distributional Parameters:
               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_respappr     1.18      0.08     1.04     1.33 1.00     3368     2596
sigma_liking       0.93      0.06     0.82     1.05 1.00     4006     2611

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

If you flip back to page 199, you’ll notice the posterior means in the first three rows (i.e., respappr_groupnone through respappr_groupcollective) correspond to the estimates for \(\overline M_{NP}\), \(\overline M_{IP}\), and \(\overline M_{CP}\), respectively.
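
Here’s a quick way to isolate those rows with fixef(), following the same wrangling pattern from earlier in the chapter.

fixef(model6.6) %>% 
  data.frame() %>% 
  rownames_to_column("parameter") %>% 
  filter(str_detect(parameter, "respappr_group"))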

Let’s get to the \(a_j b\) estimates.

draws <- as_draws_df(model6.6) %>% 
  mutate(a1 = b_respappr_groupnone,
         a2 = b_respappr_groupindividual,
         a3 = b_respappr_groupcollective,
         b  = b_liking_respappr) %>% 
  mutate(a1b = a1 * b,
         a2b = a2 * b,
         a3b = a3 * b)

draws %>% 
  pivot_longer(a1b:a3b) %>% 
  group_by(name) %>% 
  mean_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 3 × 7
  name  value .lower .upper .width .point .interval
  <chr> <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 a1b    1.60   1.05   2.19   0.95 mean   qi       
2 a2b    2.12   1.4    2.91   0.95 mean   qi       
3 a3b    2.26   1.49   3.07   0.95 mean   qi       

With this parameterization, it’s a little difficult to interpret the \(a_j b\) estimates. None of them are in comparison to anything. However, this approach is quite useful once you compute their various difference scores.

# compute the difference scores
draws <- draws %>% 
  mutate(diff_individual_minus_none       = a2b - a1b,
         diff_collective_minus_none       = a3b - a1b,
         diff_collective_minus_individual = a3b - a2b)

# summarize
draws %>% 
  pivot_longer(contains("diff")) %>% 
  group_by(name) %>% 
  mean_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 3 × 7
  name                             value .lower .upper .width .point .interval
  <chr>                            <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 diff_collective_minus_individual 0.144 -0.062  0.374   0.95 mean   qi       
2 diff_collective_minus_none       0.665  0.381  0.996   0.95 mean   qi       
3 diff_individual_minus_none       0.521  0.273  0.827   0.95 mean   qi       

If you look back to our results from model6.2, you’ll see that diff_individual_minus_none and diff_collective_minus_none correspond to a1b and a2b, respectively. But with our model6.6 approach, we get the additional information of what the indirect effects would have been had we used a different coding scheme for our original set of dummy variables. That is, we also get diff_collective_minus_individual.
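
If you’d like to double-check that correspondence, here’s a quick recomputation of a1b and a2b from the model6.2 draws. Since these are the same posterior draws we summarized earlier in the chapter, the results are identical to the ones from before.

as_draws_df(model6.2) %>% 
  mutate(a1b = b_respappr_d1 * b_liking_respappr,
         a2b = b_respappr_d2 * b_liking_respappr) %>% 
  pivot_longer(a1b:a2b) %>% 
  group_by(name) %>% 
  mean_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 2 × 7
  name  value .lower .upper .width .point .interval
  <chr> <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1 a1b   0.518  0.271  0.805   0.95 mean   qi       
2 a2b   0.663  0.39   0.994   0.95 mean   qi       

Compare these with diff_individual_minus_none (0.521) and diff_collective_minus_none (0.665); the small discrepancies are simulation error between the two model fits.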

Session info

sessionInfo()
R version 4.5.1 (2025-06-13)
Platform: aarch64-apple-darwin20
Running under: macOS Ventura 13.4

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tidybayes_3.0.7   palettetown_0.1.1 ggthemes_5.1.0    brms_2.23.0       Rcpp_1.1.0        lubridate_1.9.4   forcats_1.0.1    
 [8] stringr_1.6.0     dplyr_1.1.4       purrr_1.2.1       readr_2.1.5       tidyr_1.3.2       tibble_3.3.1      ggplot2_4.0.1    
[15] tidyverse_2.0.0  

loaded via a namespace (and not attached):
 [1] svUnit_1.0.8            tidyselect_1.2.1        farver_2.1.2            loo_2.9.0.9000          S7_0.2.1               
 [6] fastmap_1.2.0           TH.data_1.1-4           tensorA_0.36.2.1        digest_0.6.39           estimability_1.5.1     
[11] timechange_0.3.0        lifecycle_1.0.5         StanHeaders_2.36.0.9000 survival_3.8-3          magrittr_2.0.4         
[16] posterior_1.6.1.9000    compiler_4.5.1          rlang_1.1.7             tools_4.5.1             utf8_1.2.6             
[21] knitr_1.51              labeling_0.4.3          bridgesampling_1.2-1    htmlwidgets_1.6.4       curl_7.0.0             
[26] bit_4.6.0               pkgbuild_1.4.8          plyr_1.8.9              RColorBrewer_1.1-3      abind_1.4-8            
[31] multcomp_1.4-29         withr_3.0.2             grid_4.5.1              stats4_4.5.1            xtable_1.8-4           
[36] inline_0.3.21           emmeans_1.11.2-8        scales_1.4.0            MASS_7.3-65             cli_3.6.5              
[41] mvtnorm_1.3-3           rmarkdown_2.30          crayon_1.5.3            generics_0.1.4          RcppParallel_5.1.11-1  
[46] rstudioapi_0.17.1       reshape2_1.4.5          tzdb_0.5.0              rstan_2.36.0.9000       splines_4.5.1          
[51] bayesplot_1.15.0.9000   parallel_4.5.1          matrixStats_1.5.0       vctrs_0.7.0             V8_8.0.1               
[56] Matrix_1.7-3            sandwich_3.1-1          jsonlite_2.0.0          arrayhelpers_1.1-0      hms_1.1.4              
[61] bit64_4.6.0-1           ggdist_3.3.3            glue_1.8.0              codetools_0.2-20        distributional_0.5.0   
[66] stringi_1.8.7           gtable_0.3.6            QuickJSR_1.8.1          pillar_1.11.1           htmltools_0.5.9        
[71] Brobdingnag_1.2-9       R6_2.6.1                vroom_1.6.6             evaluate_1.0.5          lattice_0.22-7         
[76] backports_1.5.0         rstantools_2.5.0.9000   coda_0.19-4.1           gridExtra_2.3           nlme_3.1-168           
[81] checkmate_2.3.3         xfun_0.55               zoo_1.8-14              pkgconfig_2.0.3        
