From Multilevel Modeling to GEE: Revisiting the Within- and Between-Person Debate with Binary Predictors and Outcomes
Author
Affiliation
Ward B. Eiling
Utrecht University
Published
May 12, 2024
1 Introduction
This document provides supplementary materials for the manuscript “From Multilevel Modeling to GEE: Revisiting the Within- and Between-Person Debate with Binary Predictors and Outcomes”. We include:
A comparison of the hybrid and Mundlak’s contextual method.
Details on the treatment of extreme parameter estimates in Generalized Estimating Equations (GEEs).
All analyses were conducted in R. Package citations are provided at the end of this document.
Code
# Load packageslibrary(purrr) # Functional programminglibrary(dplyr) # Data manipulationlibrary(tidyr) # Data reshapinglibrary(stringr) # String handlinglibrary(ggplot2) # Plottinglibrary(here) # File pathslibrary(patchwork) # Combining ggplotslibrary(knitr) # For bib generation# Save citations to BibTeX for reproducibilitywrite_bib(c("purrr", "dplyr", "tidyr", "stringr", "ggplot2", "here", "patchwork", "knitr"), file ="packages.bib")
2 Loading Simulation Results
First, we need to load the simulation results. As the simulations were ran in two parts, we will load the results from both parts. The first part includes DGMs 2-4 (binary X and continuous Y, continuous X and binary Y, and binary X and Y), while the second part includes DGM 1 (continuous predictor and outcome). The results are saved in the output folder.
Code
### PART 1: DGM 2 (binary X and continuous Y), DGM 3 (continuous X and binary Y) and DGM 4 (binary X and Y) ----runname <-"April10_fullsimulation"# retrieve the design and parameterizationsettings <-readRDS(here(paste0("output/", runname, "/settings.RDS")))design <- settings$designparametrization <- settings$parametrizationnsim <- settings$nsim# repeat each row of the design nsim times and create replication indexdesign_long <- design[rep(1:nrow(design), each = nsim), ] %>%mutate(replication =rep(1:nsim, times =nrow(design)),design_id =rep(1:nrow(design), each = nsim))# initialize list to store simulation resultsresults_list <-list()# loop over the designfor (idesign in1:nrow(design)) {### Extract parameter values from design g.00<- design$g.00[idesign] g.01<- design$g.01[idesign] g.10<- design$g.10[idesign]# determine the beta values based on the parametrizationif (parametrization =="centeredX") { beta_between <- g.01 beta_within <- g.10 beta_contextual <- beta_between - beta_within } elseif (parametrization =="mundlak") { beta_contextual <- g.01 beta_within <- g.10 beta_between <- beta_within + beta_contextual }# read in the results parallel_results_setting <-readRDS(here(paste0("output/", runname, "/", idesign, ".RDS")))# unlist the lists inside the list df <-map_dfr(parallel_results_setting, function(rep) {map_dfr(rep, ~as.data.frame(as.list(.x)), .id ="model") }, .id ="replication") df_wide <- df %>%select(-X.Intercept.) %>%pivot_wider(id_cols = replication, names_from = model, values_from =c("X", "X.cent", "X.cluster.means"), names_glue ="{model}_{.value}") %>%select(where(~!all(is.na(.)))) %>%mutate(replication =as.integer(replication)) # ensure numeric# compute bias df_wide2 <- df_wide %>%mutate(l1_g.10_bias = l1_X - beta_within,l2_g.10_bias = l2_X.cent - beta_within,l3a_g.10_bias = l3a_X.cent - beta_within,l3a_g.01_bias = l3a_X.cluster.means - beta_between,l4_g.10_bias = l4_X - beta_within,l4_g.01_bias = l4_X.cluster.means - beta_contextual,# also for GEE# parametrization 1g.independence1_g.10_bias = g.independence1_X - beta_within,g.exchangeable1_g.10_bias = g.exchangeable1_X - beta_within,g.ar11_g.10_bias = g.ar11_X - beta_within,# parametrization 2g.independence2_g.10_bias = g.independence2_X.cent - beta_within,g.exchangeable2_g.10_bias = g.exchangeable2_X.cent - beta_within,g.ar12_g.10_bias = g.ar12_X.cent - beta_within,# parametrization 3g.independence3_g.10_bias = g.independence3_X.cent - beta_within,g.independence3_g.01_bias = g.independence3_X.cluster.means - beta_between,g.exchangeable3_g.10_bias = g.exchangeable3_X.cent - beta_within,g.exchangeable3_g.01_bias = g.exchangeable3_X.cluster.means - beta_between,g.ar13_g.10_bias = g.ar13_X.cent - beta_within,g.ar13_g.01_bias = g.ar13_X.cluster.means - beta_between,# parametrization 4g.independence4_g.10_bias = g.independence4_X - beta_within,g.independence4_g.01_bias = g.independence4_X.cluster.means - beta_contextual,g.exchangeable4_g.10_bias = g.exchangeable4_X - beta_within,g.exchangeable4_g.01_bias = g.exchangeable4_X.cluster.means - beta_contextual,g.ar14_g.10_bias = g.ar14_X - beta_within,g.ar14_g.01_bias = g.ar14_X.cluster.means - beta_contextual,# add design_id for merging laterdesign_id = idesign) # store the result results_list[[idesign]] <- df_wide2}# combine all resultssim_results_all <-bind_rows(results_list)# merge with the long designfinal_df <-left_join(design_long, sim_results_all, by =c("design_id", "replication"))# save the final data framesaveRDS(final_df, here(paste0("output/", runname, "/plotting_bias_df1.RDS")))### PART 2: DGM 1 (continuous X and Y) ----runname <-"April17_fullsimulation_contXY"# retrieve the design and parameterizationsettings <-readRDS(here(paste0("output/", runname, "/settings.RDS")))design <- settings$designparametrization <- settings$parametrizationnsim <- settings$nsim# repeat each row of the design nsim times and create replication indexdesign_long <- design[rep(1:nrow(design), each = nsim), ] %>%mutate(replication =rep(1:nsim, times =nrow(design)),design_id =rep(1:nrow(design), each = nsim) +378)# initialize list to store simulation resultsresults_list <-list()# loop over the designfor (idesign in1:nrow(design)) {### Extract parameter values from design g.00<- design$g.00[idesign] g.01<- design$g.01[idesign] g.10<- design$g.10[idesign]# determine the beta values based on the parametrizationif (parametrization =="centeredX") { beta_between <- g.01 beta_within <- g.10 beta_contextual <- beta_between - beta_within } elseif (parametrization =="mundlak") { beta_contextual <- g.01 beta_within <- g.10 beta_between <- beta_within + beta_contextual }# read in the results parallel_results_setting <-readRDS(here(paste0("output/", runname, "/", idesign, ".RDS")))# unlist the lists inside the list df <-map_dfr(parallel_results_setting, function(rep) {map_dfr(rep, ~as.data.frame(as.list(.x)), .id ="model") }, .id ="replication") df_wide <- df %>%select(-X.Intercept.) %>%pivot_wider(id_cols = replication, names_from = model, values_from =c("X", "X.cent", "X.cluster.means"), names_glue ="{model}_{.value}") %>%select(where(~!all(is.na(.)))) %>%mutate(replication =as.integer(replication)) # ensure numeric# compute bias df_wide2 <- df_wide %>%mutate(l1_g.10_bias = l1_X - beta_within,l2_g.10_bias = l2_X.cent - beta_within,l3a_g.10_bias = l3a_X.cent - beta_within,l3a_g.01_bias = l3a_X.cluster.means - beta_between,l4_g.10_bias = l4_X - beta_within,l4_g.01_bias = l4_X.cluster.means - beta_contextual,# also for GEE# parametrization 1g.independence1_g.10_bias = g.independence1_X - beta_within,g.exchangeable1_g.10_bias = g.exchangeable1_X - beta_within,g.ar11_g.10_bias = g.ar11_X - beta_within,# parametrization 2g.independence2_g.10_bias = g.independence2_X.cent - beta_within,g.exchangeable2_g.10_bias = g.exchangeable2_X.cent - beta_within,g.ar12_g.10_bias = g.ar12_X.cent - beta_within,# parametrization 3g.independence3_g.10_bias = g.independence3_X.cent - beta_within,g.independence3_g.01_bias = g.independence3_X.cluster.means - beta_between,g.exchangeable3_g.10_bias = g.exchangeable3_X.cent - beta_within,g.exchangeable3_g.01_bias = g.exchangeable3_X.cluster.means - beta_between,g.ar13_g.10_bias = g.ar13_X.cent - beta_within,g.ar13_g.01_bias = g.ar13_X.cluster.means - beta_between,# parametrization 4g.independence4_g.10_bias = g.independence4_X - beta_within,g.independence4_g.01_bias = g.independence4_X.cluster.means - beta_contextual,g.exchangeable4_g.10_bias = g.exchangeable4_X - beta_within,g.exchangeable4_g.01_bias = g.exchangeable4_X.cluster.means - beta_contextual,g.ar14_g.10_bias = g.ar14_X - beta_within,g.ar14_g.01_bias = g.ar14_X.cluster.means - beta_contextual,# add design_id for merging laterdesign_id = idesign) # store the result results_list[[idesign]] <- df_wide2}# combine all resultssim_results_all <-bind_rows(results_list) %>%mutate(design_id = design_id +378) # adjust design_id to match the first simulation# merge with the long designfinal_df <-left_join(design_long, sim_results_all, by =c("design_id", "replication"))# save the final data framesaveRDS(final_df, here(paste0("output/", runname, "/plotting_bias_df2.RDS")))
Let us now combine the results from both simulations
Code
### COMBINE RESULTS FROM BOTH SIMULATIONS ### ----# read in the first simulation resultsrunname1 <-"April10_fullsimulation"final_df1 <-readRDS(here(paste0("output/", runname1, "/plotting_bias_df1.RDS")))# read in the second simulation resultsrunname2 <-"April17_fullsimulation_contXY"final_df2 <-readRDS(here(paste0("output/", runname2, "/plotting_bias_df2.RDS")))# combine the two data framesfinal_df <-bind_rows(final_df1, final_df2)# save the final data framenewrunname <-"April18_fullsimulation_combined"saveRDS(final_df, here(paste0("output/", newrunname, "/plotting_bias_df.RDS")))
Code
# read in the final data framenewrunname <-"April18_fullsimulation_combined"final_df <-readRDS(here(paste0("output/", newrunname, "/plotting_bias_df.RDS")))
3 Comparison of Hybrid and Mundlak’s Contextual Method
To demonstrate numerical similarity between the hybrid (Model 4) and mundlak’s contextual (Model 3a) specifications, we computed the bias difference per replication. Results show that the within-person effects are equivalent across all scenarios. For contextual effects, we find no systematic differences under continuous outcomes (a linear link), but do find systematic differences in GEE under binary outcomes (a logit link). However, these do not affect conclusions made in the manuscript.
For continuous outcomes (DGMs 1 and 2; see Figure 1), we can see that there are no differences in the within-person effect estimated by the two methods across estimation frameworks and the 1000 replications. In contrast, we see slight differences in the contextual effect estimated by the two methods, although the distribution is closely centered around 0, implying that any imbalance in the estimates is very small and will balance out in the long run.
Code
# filter for the hybrid and contextual methodhybrid_contextual_df <- final_df %>%select(-c(ends_with("_success"), ends_with("_X"), ends_with("_X.cent"), ends_with("_X.cluster.means"),"l1_X", "g.independence1_X", "g.exchangeable1_X", "g.ar11_X", contains("2_"))) %>%filter(outcome.type =="continuous") %>%# create a column for the difference for within-person effect beta_1mutate(GLMM_beta1_dif = l4_g.10_bias - l3a_g.10_bias,GEE_indep_beta1_dif = g.independence4_g.10_bias - g.independence3_g.10_bias,GEE_exch_beta1_dif = g.exchangeable4_g.10_bias - g.exchangeable3_g.10_bias,GEE_ar_beta1_dif = g.ar14_g.10_bias - g.ar13_g.10_bias) %>%# create columns for the difference in contextual effect mutate(GLMM_g01_dif = l4_g.01_bias - l3a_g.01_bias,GEE_indep_g01_dif = g.independence4_g.01_bias - g.independence3_g.01_bias,GEE_exch_g01_dif = g.exchangeable4_g.01_bias - g.exchangeable3_g.01_bias,GEE_ar_g01_dif = g.ar14_g.01_bias - g.ar13_g.01_bias)# plot the differences in a density plotplot1 <- hybrid_contextual_df %>%ggplot(aes(x = GLMM_beta1_dif)) +geom_density() +labs(x ="Difference in within-person effect", y ="Density", title ="GLMM within-person effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot2 <- hybrid_contextual_df %>%ggplot(aes(x = GLMM_g01_dif)) +geom_density() +labs(x ="Difference in contextual effect", y ="Density", title ="GLMM contextual effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot3 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_indep_beta1_dif)) +geom_density() +labs(x ="Difference in within-person effect", y ="Density", title ="GEE-indep within-person effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot4 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_indep_g01_dif)) +geom_density() +labs(x ="Difference in contextual effect", y ="Density", title ="GEE-indep contextual effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot5 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_exch_beta1_dif)) +geom_density() +labs(x ="Difference in within-person effect", y ="Density", title ="GEE-exch within-person effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot6 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_exch_g01_dif)) +geom_density() +labs(x ="Difference in contextual effect", y ="Density", title ="GEE-exch contextual effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot7 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_ar_beta1_dif)) +geom_density() +labs(x ="Difference in within-person effect", y ="Density", title ="GEE-AR(1) within-person effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot8 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_ar_g01_dif)) +geom_density() +labs(x ="Difference in contextual effect", y ="Density", title ="GEE-AR(1) contextual effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")# combine plots with titlecombined_plot <- (plot1 + plot2) / (plot3 + plot4) / (plot5 + plot6) / (plot7 + plot8)combined_plot
Figure 1: Differences in Estimates Between Hybrid and Contextual Methods in DGM 1 and 2 (continuous outcome)
For binary outcomes (DGM 3 and 4; see Figure 2), the story is different. We see again that the within-person effect estimates are similar across the two methods. For the contextual effect, we see that GLMM shows substantially more variation than for the continuous outcomes, but that the distribution is still symmetric and centered around 0. However, for the GEEs estimation approaches, we see bi-modal distributions (two peaks) for the contextual effect estimates and that the distribution is not centered around 0 anymore. This indicates a difference in estimation of the identity (linear) and logit (non-linear) link functions.
Code
# filter for the hybrid and contextual methodhybrid_contextual_df <- final_df %>%select(-c(ends_with("_success"), ends_with("_X"), ends_with("_X.cent"), ends_with("_X.cluster.means"),"l1_X", "g.independence1_X", "g.exchangeable1_X", "g.ar11_X", contains("2_"))) %>%filter(outcome.type =="binary") %>%# create a column for the difference for within-person effect beta_1mutate(GLMM_beta1_dif = l4_g.10_bias - l3a_g.10_bias,GEE_indep_beta1_dif = g.independence4_g.10_bias - g.independence3_g.10_bias,GEE_exch_beta1_dif = g.exchangeable4_g.10_bias - g.exchangeable3_g.10_bias,GEE_ar_beta1_dif = g.ar14_g.10_bias - g.ar13_g.10_bias) %>%# create columns for the difference in contextual effect mutate(GLMM_g01_dif = l4_g.01_bias - l3a_g.01_bias,GEE_indep_g01_dif = g.independence4_g.01_bias - g.independence3_g.01_bias,GEE_exch_g01_dif = g.exchangeable4_g.01_bias - g.exchangeable3_g.01_bias,GEE_ar_g01_dif = g.ar14_g.01_bias - g.ar13_g.01_bias)# plot the differences in a density plotplot1 <- hybrid_contextual_df %>%ggplot(aes(x = GLMM_beta1_dif)) +geom_density() +labs(x ="Difference in within-person effect", y ="Density", title ="GLMM within-person effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot2 <- hybrid_contextual_df %>%ggplot(aes(x = GLMM_g01_dif)) +geom_density() +labs(x ="Difference in contextual effect", y ="Density", title ="GLMM contextual effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot3 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_indep_beta1_dif)) +geom_density() +labs(x ="Difference in within-person effect", y ="Density", title ="GEE-indep within-person effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot4 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_indep_g01_dif)) +geom_density() +labs(x ="Difference in contextual effect", y ="Density", title ="GEE-indep contextual effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot5 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_exch_beta1_dif)) +geom_density() +labs(x ="Difference in within-person effect", y ="Density", title ="GEE-exch within-person effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot6 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_exch_g01_dif)) +geom_density() +labs(x ="Difference in contextual effect", y ="Density", title ="GEE-exch contextual effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot7 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_ar_beta1_dif)) +geom_density() +labs(x ="Difference in within-person effect", y ="Density", title ="GEE-AR(1) within-person effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")plot8 <- hybrid_contextual_df %>%ggplot(aes(x = GEE_ar_g01_dif)) +geom_density() +labs(x ="Difference in contextual effect", y ="Density", title ="GEE-AR(1) contextual effect") +theme_bw() +xlim(-5, 5) +theme(legend.position ="bottom")# combine plots with titlecombined_plot <- (plot1 + plot2) / (plot3 + plot4) / (plot5 + plot6) / (plot7 + plot8)combined_plot
Figure 2: Differences in Estimates Between Hybrid and Contextual Methods in DGM 3 and 4 (binary outcome)
Let us investigate whether this influenced the figures shown in the manuscript for binary outcomes. Let us first consider DGM 3 (continuous predictor, binary outcome), which is visualized in Figure 3 . As expected based on the density plots, we see that the GLMM estimates of the two methods are very similar, but that the GEE estimates are not. Specifically, we see that the HB method produces estimates that have greater bias than the MuCo method. Nevertheless, this does not change any conclusions, as both methods under GEE with a binary outcome consistently fail to retrieve the true contextual effect.
Code
runname <-"April18_fullsimulation_combined"# read in the final data framefinal_df <-readRDS(here(paste0("output/", runname, "/plotting_bias_df.RDS")))# select relevant variables and cases (select and filter)plot_df <- final_df %>%# select T = 20, N = 200, sd.u0 = 1, predictor.type = "binary" and outcome.type = "continuous"filter(T_total %in%c(5, 20), N_total ==200, sd.u0 %in%c(1, 3), predictor.type =="continuous", outcome.type =="binary", sdX.between ==3, g.01==3) %>%select(-c(ends_with("_success"), ends_with("_X"), ends_with("_X.cent"), ends_with("_X.cluster.means")))plot_df_g01 <- plot_df %>%select(-ends_with("_g.10_bias")) %>%# turn the bias variables into long format, with a new column indicating the model namepivot_longer(cols =ends_with("_bias"), names_to ="model", values_to ="g01_bias") %>%# remove the "_g.01_bias" suffix from the model namesmutate(model =str_remove(model, "_g.01_bias")) %>%# change model namesmutate(model =recode(model,"l3a"="GLMM-HB","l4"="GLMM-MuCo","g.independence3"="GEE-independence-HB","g.independence4"="GEE-independence-MuCo","g.exchangeable3"="GEE-exchangeable-HB","g.exchangeable4"="GEE-exchangeable-MuCo","g.ar13"="GEE-AR1-HB","g.ar14"="GEE-AR1-MuCo")) %>%# set factor levels of model to ensure correct order in the plotmutate(model =factor(model, levels =c("GLMM-HB", "GLMM-MuCo", "GEE-independence-HB", "GEE-independence-MuCo","GEE-exchangeable-HB", "GEE-exchangeable-MuCo","GEE-AR1-HB", "GEE-AR1-MuCo"))) %>%# create new variable indicating method type (so M1 and G1 are "Method 1")mutate(method_type =case_when(str_detect(model, "HB") ~"HB",str_detect(model, "MuCo") ~"MuCo",)) %>%mutate(estimation_type =case_when(str_detect(model, "GLMM") ~"GLMM",str_detect(model, "independence") ~"GEE-independence",str_detect(model, "exchangeable") ~"GEE-exchangeable",str_detect(model, "AR1") ~"GEE-AR(1)")) %>%# set factor levels of method_type to ensure correct order in the plotmutate(estimation_type =factor(estimation_type, levels =c("GLMM", "GEE-independence", "GEE-exchangeable", "GEE-AR(1)")),method_type =factor(method_type, levels =c("HB", "MuCo"))) %>%# Turn label variables (sdX.between, g.01 and sd.u0) into strings with an underscoremutate(sd.u0_label =factor(sd.u0,levels =c(1, 3),labels =c(expression(sigma[u] ==1), expression(sigma[u] ==3)))) %>%mutate(T_total_label =factor(T_total,levels =c(5, 20),labels =c("T == 5", "T == 20"))) %>%# remove all bias values exceeding 100mutate(g01_bias =ifelse(abs(g01_bias) >100, NA, g01_bias)) # For the contextual effectggplot(plot_df_g01, aes(x = method_type, y = g01_bias, col = estimation_type)) +geom_boxplot() +geom_hline(yintercept =0, linetype ="dashed") +# Dashed horizontal line at 0coord_cartesian(ylim =c(-1.5, 1.5)) +# Set y-axis limits# add tick mark at Y for every 0.5scale_y_continuous(breaks =seq(-1.5, 1.5, by =0.5)) +labs(x ="Method", y ="Bias") +facet_grid(sd.u0_label ~ T_total_label, labeller = label_parsed) +# Show T and N values in labelstheme_bw() +# remove X axis labelstheme(# remove vertical grid linespanel.grid.major.x =element_blank(),# remove X tick marksaxis.ticks.x =element_blank(),# increase font size for grid titlesstrip.text.x =element_text(size =12),strip.text.y =element_text(size =12),# increase font size for X entriesaxis.text.x =element_text(size =12),axis.text.y =element_text(size =12),axis.title.y =element_text(size =13),axis.title.x =element_text(size =13),# remove legend# increase legend font sizelegend.text =element_text(size =11),legend.title =element_text(size =13),legend.position ="bottom",) +# change legend title to "Estimation"scale_color_brewer(name ="Estimation", palette ="Spectral")
Figure 3: Bias in contextual effect estimates for DGM 3 (continuous predictor, binary outcome)
For DGM 4 (binary predictor and outcome), we see the same pattern found for DGM 3, indicating that the differences in bias between the two methods do not alter the conclusions.
Code
runname <-"April18_fullsimulation_combined"# read in the final data framefinal_df <-readRDS(here(paste0("output/", runname, "/plotting_bias_df.RDS")))# select relevant variables and cases (select and filter)plot_df <- final_df %>%# select T = 20, N = 200, sd.u0 = 1, predictor.type = "binary" and outcome.type = "continuous"filter(T_total %in%c(5, 20), N_total ==200, sd.u0 %in%c(1, 3), predictor.type =="binary", outcome.type =="binary", sdX.between ==3, g.01==3) %>%select(-c(ends_with("_success"), ends_with("_X"), ends_with("_X.cent"), ends_with("_X.cluster.means")))plot_df_g01 <- plot_df %>%select(-ends_with("_g.10_bias")) %>%# turn the bias variables into long format, with a new column indicating the model namepivot_longer(cols =ends_with("_bias"), names_to ="model", values_to ="g01_bias") %>%# remove the "_g.01_bias" suffix from the model namesmutate(model =str_remove(model, "_g.01_bias")) %>%# change model namesmutate(model =recode(model,"l3a"="GLMM-HB","l4"="GLMM-MuCo","g.independence3"="GEE-independence-HB","g.independence4"="GEE-independence-MuCo","g.exchangeable3"="GEE-exchangeable-HB","g.exchangeable4"="GEE-exchangeable-MuCo","g.ar13"="GEE-AR1-HB","g.ar14"="GEE-AR1-MuCo")) %>%# set factor levels of model to ensure correct order in the plotmutate(model =factor(model, levels =c("GLMM-HB", "GLMM-MuCo", "GEE-independence-HB", "GEE-independence-MuCo","GEE-exchangeable-HB", "GEE-exchangeable-MuCo","GEE-AR1-HB", "GEE-AR1-MuCo"))) %>%# create new variable indicating method type (so M1 and G1 are "Method 1")mutate(method_type =case_when(str_detect(model, "HB") ~"HB",str_detect(model, "MuCo") ~"MuCo",)) %>%mutate(estimation_type =case_when(str_detect(model, "GLMM") ~"GLMM",str_detect(model, "independence") ~"GEE-independence",str_detect(model, "exchangeable") ~"GEE-exchangeable",str_detect(model, "AR1") ~"GEE-AR(1)")) %>%# set factor levels of method_type to ensure correct order in the plotmutate(estimation_type =factor(estimation_type, levels =c("GLMM", "GEE-independence", "GEE-exchangeable", "GEE-AR(1)")),method_type =factor(method_type, levels =c("HB", "MuCo"))) %>%# Turn label variables (sdX.between, g.01 and sd.u0) into strings with an underscoremutate(sd.u0_label =factor(sd.u0,levels =c(1, 3),labels =c(expression(sigma[u] ==1), expression(sigma[u] ==3)))) %>%mutate(T_total_label =factor(T_total,levels =c(5, 20),labels =c("T == 5", "T == 20"))) %>%# remove all bias values exceeding 100mutate(g01_bias =ifelse(abs(g01_bias) >100, NA, g01_bias)) # For the contextual effectggplot(plot_df_g01, aes(x = method_type, y = g01_bias, col = estimation_type)) +geom_boxplot() +geom_hline(yintercept =0, linetype ="dashed") +# Dashed horizontal line at 0coord_cartesian(ylim =c(-1.5, 1.5)) +# Set y-axis limits# add tick mark at Y for every 0.5scale_y_continuous(breaks =seq(-1.5, 1.5, by =0.5)) +labs(x ="Method", y ="Bias") +facet_grid(sd.u0_label ~ T_total_label, labeller = label_parsed) +# Show T and N values in labelstheme_bw() +# remove X axis labelstheme(# remove vertical grid linespanel.grid.major.x =element_blank(),# remove X tick marksaxis.ticks.x =element_blank(),# increase font size for grid titlesstrip.text.x =element_text(size =12),strip.text.y =element_text(size =12),# increase font size for X entriesaxis.text.x =element_text(size =12),axis.text.y =element_text(size =12),axis.title.y =element_text(size =13),axis.title.x =element_text(size =13),# remove legend# increase legend font sizelegend.text =element_text(size =11),legend.title =element_text(size =13),legend.position ="bottom",) +# change legend title to "Estimation"scale_color_brewer(name ="Estimation", palette ="Spectral")
Figure 4: Bias in contextual effect estimates for DGM 4 (binary predictor and outcome)
4 Appendix: Handling of Extreme Parameter Estimates in GEEs
In the simulations, we found that GEEs tended to yield extreme values rather than terminating with an error. These extreme values are likely a sign of non-convergence. In this section, we will diagnose the extent of this issue and how it relates to the design parameters.
In Table 1 we can see that extreme values only occurred in scenarios with binary outcome variables, that they occurred most frequently for the AR(1) GEE approach, followed by rare occurrences in the exchangeable GEE approach and a complete absence in the independence GEE approach. Here we can see that the g.ar12 with a continuous predictor and binary outcome had the most extreme values (4714 extreme values).
Code
# checking how many GEEs yielded extreme valuesrunname <-"April18_fullsimulation_combined"# read in the final data framefinal_df <-readRDS(here(paste0("output/", runname, "/plotting_bias_df.RDS")))# scenario 1GEE_check_df <- final_df %>%# remove columns ending with "g.01", "X", "X.cent", "X.cluster.means"select(-ends_with("_g.01_bias"), -ends_with("_X"), -ends_with("_X.cent"), -ends_with("_X.cluster.means")) %>%pivot_longer(cols =ends_with("_g.10_bias"),names_to ="model", values_to ="beta1_bias") %>%mutate(model =str_remove(model, "_g.10_bias"),model =factor(model, levels =c("l1", "l2", "l3a", "l4", "g.exchangeable1", "g.ar11", "g.independence1","g.exchangeable2", "g.ar12", "g.independence2", "g.exchangeable3", "g.ar13", "g.independence3", "g.exchangeable4", "g.ar14", "g.independence4"))) %>%# select only the GEE modelsfilter(model %in%c("g.exchangeable1", "g.ar11", "g.independence1","g.exchangeable2", "g.ar12", "g.independence2", "g.exchangeable3", "g.ar13", "g.independence3", "g.exchangeable4", "g.ar14", "g.independence4")) # define a threshold for "extreme" biasthreshold1 <-1E+11# can also be set to 5, doesn't affect the number of extreme valuesthreshold2 <-20# can also be set to 5, doesn't affect the number of extreme values# create a table with the number of extreme scenarios per modelextreme_count_df <- GEE_check_df %>%group_by(model, predictor.type, outcome.type) %>%summarize(num_extreme1 =sum(abs(beta1_bias) > threshold1),num_extreme2 =sum(abs(beta1_bias) > threshold2),.groups ="drop" ) %>%arrange(desc(num_extreme1))# create tableknitr::kable(extreme_count_df[1:10,], format ="html")
Table 1: Design Features with Greatest Number of Extreme Values in GEEs
model
predictor.type
outcome.type
num_extreme1
num_extreme2
g.ar12
continuous
binary
4714
4714
g.ar13
continuous
binary
1003
1003
g.ar14
continuous
binary
1003
1003
g.ar13
binary
binary
612
612
g.ar14
binary
binary
611
611
g.ar12
binary
binary
57
57
g.exchangeable3
continuous
binary
26
26
g.exchangeable4
continuous
binary
26
26
g.ar11
continuous
binary
6
6
g.exchangeable1
binary
binary
0
0
All counts are the same across the two thresholds, indicating that when values are extreme, they are very extreme.
Code
# check whether the counts are the same (TRUE or FALSE)counts_equal <- extreme_count_df$num_extreme1 == extreme_count_df$num_extreme2# check if there are exclusively TRUE valuesall(counts_equal)
[1] TRUE
Another way of investigating this is by evaluating the proportion of extreme replications per scenario. Table 2 shows that there is one scenario (design_id = 299) with nearly half of the replications resulting in extreme values (according to both thresholds).
Code
# create a table with the proportion of extreme replications per scenarioextreme_prop_df <- GEE_check_df %>%group_by(model, design_id) %>%summarize(proportion_extreme1 =mean(abs(beta1_bias) > threshold1),proportion_extreme2 =mean(abs(beta1_bias) > threshold2),.groups ="drop" ) %>%arrange(desc(proportion_extreme1))knitr::kable(extreme_prop_df[1:20,])
Table 2: Designs with Greatest Proportion of Extreme Values in GEEs
model
design_id
proportion_extreme1
proportion_extreme2
g.ar12
299
0.484
0.484
g.ar12
300
0.453
0.453
g.ar12
353
0.451
0.451
g.ar12
354
0.425
0.425
g.ar13
263
0.375
0.375
g.ar14
263
0.375
0.375
g.ar12
227
0.257
0.257
g.ar12
173
0.245
0.245
g.ar12
47
0.223
0.223
g.ar13
264
0.216
0.216
g.ar14
264
0.216
0.216
g.ar12
101
0.210
0.210
g.ar12
174
0.195
0.195
g.ar13
257
0.190
0.190
g.ar14
257
0.190
0.190
g.ar12
228
0.184
0.184
g.ar12
83
0.181
0.181
g.ar12
209
0.181
0.181
g.ar12
102
0.151
0.151
g.ar12
48
0.141
0.141
All proportions are the same across the two thresholds, indicating that when values are extreme, they are very extreme.
Code
# check whether the proportions are the same by checking if there is any false in the vector extreme_prop_df$proportion_extreme1 == extreme_prop_df$proportion_extreme2proportions_equal <- extreme_prop_df$proportion_extreme1 == extreme_prop_df$proportion_extreme2# check if there are exclusively TRUE valuesall(proportions_equal)
[1] TRUE
There are 76 scenarios that have at least one extreme replication.
Code
# remove zeros from the extreme_prop_dfextreme_prop_df_nozeros <- extreme_prop_df %>%filter(proportion_extreme1 >0)# count number of unique design_idslength(unique(extreme_prop_df_nozeros$design_id))
[1] 76
Table 3 shows the GEE scenarios for which the most extreme values occurred in a descending order. We can see here that the scenarios with the most extreme values had a large number of time points (T = 20), a continuous predictor, a binary outcome a non-zero random intercept value (sd.u0). The model with the most extreme values was the AR(1) GEE approach with the CWC method (g.ar12), followed by a tie between Mundlak’s contextual and the Hybrid methods (g.ar13 and g.ar14).
Code
# filter the first 8 columnsextreme_scenarios <- extreme_prop_df[1:8,]# filter the GEE_check_df for the extreme scenariosGEE_check_df_extreme <- GEE_check_df %>%right_join(extreme_scenarios, by =c("model", "design_id")) %>%select(-c(beta1_bias, replication, g.00, sdX.within, g.10, true_cluster_means, design_id, sd.u1, proportion_extreme2)) %>%distinct() %>%rename(prop_extreme = proportion_extreme1) %>%arrange(desc(prop_extreme))# turn this into tableknitr::kable(GEE_check_df_extreme, format ="html")
Table 3: Scenarios with Greatest Proportion of Extreme Values in GEEs
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org
Wickham, H. (2023). Stringr: Simple, consistent wrappers for common string operations. https://stringr.tidyverse.org
Wickham, H., Chang, W., Henry, L., Pedersen, T. L., Takahashi, K., Wilke, C., Woo, K., Yutani, H., Dunnington, D., & van den Brand, T. (2025). ggplot2: Create elegant data visualisations using the grammar of graphics. https://ggplot2.tidyverse.org
Wickham, H., François, R., Henry, L., Müller, K., & Vaughan, D. (2023). Dplyr: A grammar of data manipulation. https://dplyr.tidyverse.org
Xie, Y. (2014). Knitr: A comprehensive tool for reproducible research in R. In V. Stodden, F. Leisch, & R. D. Peng (Eds.), Implementing reproducible computational research. Chapman; Hall/CRC.
Xie, Y. (2015). Dynamic documents with R and knitr (2nd ed.). Chapman; Hall/CRC. https://yihui.org/knitr/
Xie, Y. (2025). Knitr: A general-purpose package for dynamic report generation in r. https://yihui.org/knitr/