Title: | Variational Inference for Hierarchical Generalized Linear Models |
---|---|
Description: | Estimates hierarchical models using variational inference. At present, it can estimate logistic, linear, and negative binomial models. It can accommodate models with an arbitrary number of random effects and requires no integration to estimate. It also provides the ability to improve the quality of the approximation using marginal augmentation. Goplerud (2022) <doi:10.1214/21-BA1266> and Goplerud (2024) <doi:10.1017/S0003055423000035> provide details on the variational algorithms. |
Authors: | Max Goplerud [aut, cre] |
Maintainer: | Max Goplerud <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.6 |
Built: | 2025-02-20 06:00:58 UTC |
Source: | https://github.com/mgoplerud/vglmer |
Given a model estimated using vglmer
, this function
performs marginally augmented variational Bayes (MAVB) to improve the
approximation quality.
MAVB(object, samples, verbose = FALSE, var_px = Inf)
MAVB(object, samples, verbose = FALSE, var_px = Inf)
object |
Model fit using |
samples |
Number of samples to draw. |
verbose |
Show progress in drawing samples. |
var_px |
Variance of working prior for marginal augmentation. Default
( |
This function returns the improved estimates of the parameters. To use MAVB when generating predictions, one should use predict_MAVB. At present, MAVB is only enabled for binomial models.
This function returns a matrix with samples
rows and columns
for each fixed and random effect.
Goplerud, Max. 2022. "Fast and Accurate Estimation of Non-Nested Binomial Hierarchical Models Using Variational Inference." Bayesian Analysis. 17(2): 623-650.
This function draws samples from the estimated variational
distributions. If using MAVB
to improve the quality of the
approximating distribution, please use MAVB or predict_MAVB.
posterior_samples.vglmer(object, samples, verbose = FALSE)
posterior_samples.vglmer(object, samples, verbose = FALSE)
object |
Model fit using |
samples |
Number of samples to draw. |
verbose |
Show progress in drawing samples. |
This function returns a matrix with samples
rows and columns
for each fixed and random effect.
These functions integrate vglmer
(or glmer
) into
SuperLearner
. Most of the arguments are standard for
SuperLearner
functions.
SL.vglmer( Y, X, newX, formula, family, id, obsWeights, control = vglmer_control() ) ## S3 method for class 'SL.vglmer' predict(object, newdata, allow_missing_levels = TRUE, ...) SL.glmer(Y, X, newX, formula, family, id, obsWeights, control = NULL) ## S3 method for class 'SL.glmer' predict(object, newdata, allow.new.levels = TRUE, ...) add_formula_SL(learner, env = parent.frame())
SL.vglmer( Y, X, newX, formula, family, id, obsWeights, control = vglmer_control() ) ## S3 method for class 'SL.vglmer' predict(object, newdata, allow_missing_levels = TRUE, ...) SL.glmer(Y, X, newX, formula, family, id, obsWeights, control = NULL) ## S3 method for class 'SL.glmer' predict(object, newdata, allow.new.levels = TRUE, ...) add_formula_SL(learner, env = parent.frame())
Y |
From |
X |
From |
newX |
From |
formula |
The formula used for estimation. |
family |
From |
id |
From |
obsWeights |
From |
control |
Control object for estimating |
object |
Used in |
newdata |
Dataset to use for predictions. |
allow_missing_levels |
Default ( |
... |
Not used; included to maintain compatibility with existing methods. |
allow.new.levels |
From |
learner |
Character name of model from |
env |
Environment to assign model. See "Details" for how this is used. |
This documentation describes two types of function.
Estimating Hierarchical Models in SuperLearner: Two methods for
estimating hierarchical models are provided one for variational methods
(SL.vglmer
) and one for non-variational methods (SL.glmer
).
The accompanying prediction functions are also provided.
Formula with SuperLearner: The vglmer
package provides a way
to estimate models that require or use a formula with SuperLearner
.
This allows for a design to be passed that contains variables that are
not used in estimation. This can be used as follows (see
"Examples"). One calls the function add_formula_SL
around the quoted
name of a SuperLearner
model, e.g. add_formula_SL(learner =
"SL.knn")
. This creates a new model and predict function with the suffix
"_f"
. This requires a formula to be provided for estimation.
With this in hand, "SL.knn_f"
can be passed to SuperLearner
with the
accompanying formula argument and thus one can compare models with
different formula or design on the same ensemble. The env
argument
may need to be manually specified to ensure the created functions can be
called by SuperLearner
.
The functions here return different types of output. SL.vglmer
and SL.glmer
return fitted models with the in-sample predictions as
standard for SuperLearner
. The predict
methods return vectors
of predicted values. add_formula_SL
creates two objects in the
environment (one for estimation model_f
and one for prediction
predict.model_f
) used for SuperLearner
.
set.seed(456) if (requireNamespace('SuperLearner', quietly = TRUE)){ require(SuperLearner) sim_data <- data.frame( x = rnorm(100), g = sample(letters, 100, replace = TRUE) ) sim_data$y <- rbinom(nrow(sim_data), 1, plogis(runif(26)[match(sim_data$g, letters)])) sim_data$g <- factor(sim_data$g) sl_vglmer <- function(...){SL.vglmer(..., formula = y ~ x + (1 | g))} SL.glm <- SuperLearner::SL.glm add_formula_SL('SL.glm') sl_glm_form <- function(...){SL.glm_f(..., formula = ~ x)} SuperLearner::SuperLearner( Y = sim_data$y, family = 'binomial', X = sim_data[, c('x', 'g')], cvControl = list(V = 2), SL.library = c('sl_vglmer', 'sl_glm_form') ) }
set.seed(456) if (requireNamespace('SuperLearner', quietly = TRUE)){ require(SuperLearner) sim_data <- data.frame( x = rnorm(100), g = sample(letters, 100, replace = TRUE) ) sim_data$y <- rbinom(nrow(sim_data), 1, plogis(runif(26)[match(sim_data$g, letters)])) sim_data$g <- factor(sim_data$g) sl_vglmer <- function(...){SL.vglmer(..., formula = y ~ x + (1 | g))} SL.glm <- SuperLearner::SL.glm add_formula_SL('SL.glm') sl_glm_form <- function(...){SL.glm_f(..., formula = ~ x)} SuperLearner::SuperLearner( Y = sim_data$y, family = 'binomial', X = sim_data[, c('x', 'g')], cvControl = list(V = 2), SL.library = c('sl_vglmer', 'sl_glm_form') ) }
This function estimates splines in vglmer
, similar to s(...)
in
mgcv
albeit with many fewer options than mgcv
. It allows for
truncated (linear) splines (type="tpf"
), O'Sullivan splines
(type="o"
), or kernel ridge regression (type="gKRLS"
). Please
see vglmer for more discussion and examples. For information on kernel
ridge regression, please consult gKRLS.
v_s( ..., type = "tpf", knots = NULL, by = NA, xt = NULL, by_re = TRUE, force_vector = FALSE, outer_okay = FALSE )
v_s( ..., type = "tpf", knots = NULL, by = NA, xt = NULL, by_re = TRUE, force_vector = FALSE, outer_okay = FALSE )
... |
Variable name, e.g. |
type |
Default ( |
knots |
Default ( |
by |
A categorical or factor covariate to interact the spline with; for
example, |
xt |
Arguments passed to |
by_re |
Default ( |
force_vector |
Force that argument to |
outer_okay |
Default ( |
This function returns a list of class of vglmer_spline
that is
passed to unexported functions. It contains the arguments noted above where
...
is parsed into an argument called term
.
Chang, Qing, and Max Goplerud. 2024. "Generalized Kernel Regularized Least Squares." Political Analysis 32(2):157-171.
Wand, Matt P. and Ormerod, John T. 2008. "On Semiparametric Regression with O'Sullivan Penalized Splines". Australian & New Zealand Journal of Statistics. 50(2): 179-198.
Wood, Simon N. 2017. Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC.
This function estimates hierarchical models using mean-field variational
inference. vglmer
accepts standard syntax used for lme4
, e.g.,
y ~ x + (x | g)
. Options are described below. Goplerud (2022; 2024)
provides details on the variational algorithms.
vglmer(formula, data, family, control = vglmer_control())
vglmer(formula, data, family, control = vglmer_control())
formula |
|
data |
|
family |
Options are "binomial", "linear", or "negbin" (experimental).
If "binomial", outcome must be either binary ( |
control |
Adjust internal options for estimation. Must use an object created by vglmer_control. |
Estimation Syntax: The formula
argument takes syntax designed
to be a similar as possible to lme4
. That is, one can specify models
using y ~ x + (1 | g)
where (1 | g)
indicates a random intercept. While
not tested extensively, terms of (1 | g / f)
should work as expected. Terms
of (1 + x || g)
may work, although will raise a warning about duplicated
names of random effects. (1 + x || g)
terms may not work with spline
estimation. To get around this, one can might copy the column g
to
g_copy
and then write (1 | g) + (0 + x | g_copy)
.
Splines: Splines can be added using the term v_s(x)
for a
spline on the variable x
. These are transformed into hierarchical
terms in a standard fashion (e.g. Ruppert et al. 2003) and then estimated
using the variational algorithms. At the present, only truncated linear
functions (type = "tpf"
; the default) and O'Sullivan splines (Wand and
Ormerod 2008) are included. The options are described in more detail at
v_s.
It is possible to have the spline vary across some categorical predictor by
specifying the "by"
argument such as v_s(x, by = g)
. In effect,
this adds additional hierarchical terms for the group-level deviations from
the "global" spline. Note: In contrast to the typical presentation of
these splines interacted with categorical variables (e.g., Ruppert et al.
2003), the default use of "by"
includes the lower order interactions
that are regularized, i.e. (1 + x | g)
, versus their unregularized
version (e.g., x * g
); this can be changed using the by_re
argument described in v_s. Further, all group-level deviations from
the global spline share the same smoothing parameter (same prior
distribution).
Default Settings: By default, the model is estimated using the
"strong" (i.e. fully factorized) variational assumption. Setting
vglmer_control(factorization_method = "weak")
will improve the quality
of the variance approximation but may take considerably more time to
estimate. See Goplerud (2022) for discussion.
By default, the prior on each random effect variance () uses a Huang-Wand prior (Huang
and Wand 2013) with hyper-parameters
and
.
This is designed to be proper but weakly informative. Other options are
discussed in vglmer_control under the
prior_variance
argument.
By default, estimation is accelerated using SQUAREM (Varadhan and Roland
2008) and (one-step-late) parameter expansion for variational Bayes. Under
the default "strong"
factorization, a "translation" expansion is used;
under other factorizations a "mean" expansion is used. These can be adjusted
using vglmer_control. See Goplerud (2024) for more discussion of
these methods.
This returns an object of class vglmer
. The available methods
(e.g. coef
) can be found using methods(class="vglmer")
.
Contains the estimated distribution of the fixed effects
(). It is multivariate normal.
mean
contains the means;
var
contains the variance matrix; decomp_var
contains a matrix
such that
equals the full variance matrix.
Contains the estimated distribution of the random effects
(). They are all multivariate normal.
mean
contains the
means; dia.var
contains the variance of each random effect. var
contains the variance matrix of each random effect (j,g). decomp_var
contains a matrix such that
equals the full variance of
the entire set of random effects.
If factorization_method="weak"
, this is a list with one
element (decomp_var
) that contains a matrix such that
equals the full variance matrix between the fixed and random effects
. The marginal variances are included in
beta
and
alpha
. If the factorization method is not "weak"
, this is
NULL
.
Contains the estimated distribution of each random
effect covariance ; all distributions are Inverse-Wishart.
cov
contains a list of the estimated scale matrices. df
contains a list of the degrees of freedom.
If a Huang-Wand prior is used (see Huang and Wand 2013 or Goplerud
2024 for more details), then the estimated distribution. Otherwise, it is
NULL
. All distributions are Inverse-Gamma. a
contains a list of
the scale parameters. b
contains a list of the shape parameters.
If family="linear"
, this contains a list of the
estimated parameters for ; its distribution is Inverse-Gamma.
a
contains the scale parameter; b
contains the shape
parameter.
If family="negbin"
, this contains the variational
parameters for the log dispersion parameter .
mu
contains
the mean; sigma
contains the variance.
Family of outcome.
Contains the ELBO at the termination of the algorithm.
data.frame
tracking the ELBO per iteration.
Contains the control parameters from vglmer_control
used in estimation.
Variety of internal parameters used in post-estimation functions.
Contains the formula used for estimation; contains the
original formula, fixed effects, and random effects parts separately for
post-estimation functions. See formula.vglmer
for more details.
Goplerud, Max. 2022. "Fast and Accurate Estimation of Non-Nested Binomial Hierarchical Models Using Variational Inference." Bayesian Analysis. 17(2): 623-650.
Goplerud, Max. 2024. "Re-Evaluating Machine Learning for MRP Given the Comparable Performance of (Deep) Hierarchical Models." American Political Science Review. 118(1): 529-536.
Huang, Alan, and Matthew P. Wand. 2013. "Simple Marginally Noninformative Prior Distributions for Covariance Matrices." Bayesian Analysis. 8(2):439-452.
Ruppert, David, Matt P. Wand, and Raymond J. Carroll. 2003. Semiparametric Regression. Cambridge University Press.
Varadhan, Ravi, and Christophe Roland. 2008. "Simple and Globally Convergent Methods for Accelerating the Convergence of any EM Algorithm." Scandinavian Journal of Statistics. 35(2): 335-353.
Wand, Matt P. and Ormerod, John T. 2008. "On Semiparametric Regression with O'Sullivan Penalized Splines". Australian & New Zealand Journal of Statistics. 50(2): 179-198.
set.seed(234) sim_data <- data.frame( x = rnorm(100), y = rbinom(100, 1, 0.5), g = sample(letters, 100, replace = TRUE) ) # Run with defaults est_vglmer <- vglmer(y ~ x + (x | g), data = sim_data, family = "binomial") # Simple prediction predict(est_vglmer, newdata = sim_data) # Summarize results summary(est_vglmer) # Extract parameters coef(est_vglmer); vcov(est_vglmer) # Comparability with lme4, # although ranef is formatted differently. ranef(est_vglmer); fixef(est_vglmer) # Run with weaker (i.e. better) approximation vglmer(y ~ x + (x | g), data = sim_data, control = vglmer_control(factorization_method = "weak"), family = "binomial") # Use a spline on x with a linear outcome vglmer(y ~ v_s(x), data = sim_data, family = "linear")
set.seed(234) sim_data <- data.frame( x = rnorm(100), y = rbinom(100, 1, 0.5), g = sample(letters, 100, replace = TRUE) ) # Run with defaults est_vglmer <- vglmer(y ~ x + (x | g), data = sim_data, family = "binomial") # Simple prediction predict(est_vglmer, newdata = sim_data) # Summarize results summary(est_vglmer) # Extract parameters coef(est_vglmer); vcov(est_vglmer) # Comparability with lme4, # although ranef is formatted differently. ranef(est_vglmer); fixef(est_vglmer) # Run with weaker (i.e. better) approximation vglmer(y ~ x + (x | g), data = sim_data, control = vglmer_control(factorization_method = "weak"), family = "binomial") # Use a spline on x with a linear outcome vglmer(y ~ v_s(x), data = sim_data, family = "linear")
This function controls various estimation options for vglmer
.
vglmer_control( iterations = 1000, prior_variance = "hw", factorization_method = c("strong", "partial", "weak"), parameter_expansion = "translation", do_SQUAREM = TRUE, tolerance_elbo = 1e-08, tolerance_parameters = 1e-05, force_whole = TRUE, print_prog = NULL, do_timing = FALSE, verbose_time = FALSE, return_data = FALSE, linpred_method = "joint", vi_r_method = "VEM", verify_columns = FALSE, debug_param = FALSE, debug_ELBO = FALSE, debug_px = FALSE, quiet = TRUE, quiet_rho = TRUE, px_method = "dynamic", px_numerical_it = 10, hw_inner = 10, init = "EM_FE" )
vglmer_control( iterations = 1000, prior_variance = "hw", factorization_method = c("strong", "partial", "weak"), parameter_expansion = "translation", do_SQUAREM = TRUE, tolerance_elbo = 1e-08, tolerance_parameters = 1e-05, force_whole = TRUE, print_prog = NULL, do_timing = FALSE, verbose_time = FALSE, return_data = FALSE, linpred_method = "joint", vi_r_method = "VEM", verify_columns = FALSE, debug_param = FALSE, debug_ELBO = FALSE, debug_px = FALSE, quiet = TRUE, quiet_rho = TRUE, px_method = "dynamic", px_numerical_it = 10, hw_inner = 10, init = "EM_FE" )
iterations |
Default of 1000; this sets the maximum number of iterations used in estimation. |
prior_variance |
Prior distribution on the random effect variance
Estimation may fail if an improper prior ( |
factorization_method |
Factorization assumption for the variational
approximation. Default of |
parameter_expansion |
Default of |
do_SQUAREM |
Default ( |
tolerance_elbo |
Default ( |
tolerance_parameters |
Default ( |
force_whole |
Default ( |
print_prog |
Default ( |
do_timing |
Default ( |
verbose_time |
Default ( |
return_data |
Default ( |
linpred_method |
Default ( |
vi_r_method |
Default ( |
verify_columns |
Default ( |
debug_param |
Default ( |
debug_ELBO |
Default ( |
debug_px |
Default ( |
quiet |
Default ( |
quiet_rho |
Default ( |
px_method |
When code |
px_numerical_it |
Default of 10; if L-BFGS_B is needed for a parameter expansion, this sets the number of steps used. |
hw_inner |
If |
init |
Default ( |
This function returns a named list with class vglmer_control
.
It is passed to vglmer
in the argument control
. This argument
only accepts objects created using vglmer_control
.
Goplerud, Max. 2022. "Fast and Accurate Estimation of Non-Nested Binomial Hierarchical Models Using Variational Inference." Bayesian Analysis. 17(2): 623-650.
Goplerud, Max. 2024. "Re-Evaluating Machine Learning for MRP Given the Comparable Performance of (Deep) Hierarchical Models." American Political Science Review. 118(1): 529-536.
Huang, Alan, and Matthew P. Wand. 2013. "Simple Marginally Noninformative Prior Distributions for Covariance Matrices." Bayesian Analysis. 8(2):439-452.
Varadhan, Ravi, and Christophe Roland. 2008. "Simple and Globally Convergent Methods for Accelerating the Convergence of any EM Algorithm." Scandinavian Journal of Statistics. 35(2): 335-353.
These functions calculate the estimated linear predictor using
the variational distributions. predict.vglmer
draws predictions
using the estimated variational distributions; predict_MAVB
does so
using the MAVB procedure described in Goplerud (2022).
## S3 method for class 'vglmer' predict( object, newdata, type = "link", samples = 0, samples_only = FALSE, summary = TRUE, allow_missing_levels = FALSE, ... ) predict_MAVB( object, newdata, samples = 0, samples_only = FALSE, var_px = Inf, summary = TRUE, allow_missing_levels = FALSE )
## S3 method for class 'vglmer' predict( object, newdata, type = "link", samples = 0, samples_only = FALSE, summary = TRUE, allow_missing_levels = FALSE, ... ) predict_MAVB( object, newdata, samples = 0, samples_only = FALSE, var_px = Inf, summary = TRUE, allow_missing_levels = FALSE )
object |
Model fit using |
newdata |
Dataset to use for predictions. It cannot be missing. |
type |
Default ( |
samples |
Number of samples to draw. Using |
samples_only |
Default ( |
summary |
Default ( |
allow_missing_levels |
Default ( |
... |
Not used; included to maintain compatibility with existing methods. |
var_px |
Variance of working prior for marginal augmentation. Default
( |
This function returns an estimate of the linear predictor. The
default returns the expected mean, i.e. . If
samples > 0
, these functions return a
summary of the prediction for each observation, i.e. the estimated mean and
variance. If summary = FALSE
, the sampled values of the linear
predictor are returned as a matrix. predict_MAVB
performs MAVB as
described in Goplerud (2022) before returning the linear predictor.
If allow_missing_levels = TRUE
, then observations with a new
(unseen) level for the random effect are given a value of zero for that
term of the prediction.
set.seed(123) sim_data <- data.frame( x = rnorm(100), y = rbinom(100, 1, 0.5), g = sample(letters, 100, replace = TRUE) ) # Run with defaults est_vglmer <- vglmer(y ~ x + (x | g), data = sim_data, family = "binomial") # Simple prediction predict(est_vglmer, newdata = sim_data) # Return 10 posterior draws of the linear predictor for each observation. predict_MAVB(est_vglmer, newdata = sim_data, summary = FALSE, samples = 10) # Predict with a new level; note this would fail if # allow_missing_levels = FALSE (the default) predict(est_vglmer, newdata = data.frame(g = "AB", x = 0), allow_missing_levels = TRUE )
set.seed(123) sim_data <- data.frame( x = rnorm(100), y = rbinom(100, 1, 0.5), g = sample(letters, 100, replace = TRUE) ) # Run with defaults est_vglmer <- vglmer(y ~ x + (x | g), data = sim_data, family = "binomial") # Simple prediction predict(est_vglmer, newdata = sim_data) # Return 10 posterior draws of the linear predictor for each observation. predict_MAVB(est_vglmer, newdata = sim_data, summary = FALSE, samples = 10) # Predict with a new level; note this would fail if # allow_missing_levels = FALSE (the default) predict(est_vglmer, newdata = data.frame(g = "AB", x = 0), allow_missing_levels = TRUE )
vglmer
uses many standard methods from lm
and lme4
with
limited changes. These provide summaries of the estimated variational
distributions.
## S3 method for class 'vglmer' fixef(object, ...) ## S3 method for class 'vglmer' sigma(object, ...) ## S3 method for class 'vglmer' ranef(object, ...) ## S3 method for class 'vglmer' coef(object, ...) ## S3 method for class 'vglmer' vcov(object, ...) ## S3 method for class 'vglmer' fitted(object, ...) ## S3 method for class 'vglmer' print(x, ...) ## S3 method for class 'vglmer' summary(object, display_re = TRUE, ...) ## S3 method for class 'vglmer' formula(x, form = "original", ...) format_vglmer(object) format_glmer(object) ELBO(object, type = c("final", "trajectory"))
## S3 method for class 'vglmer' fixef(object, ...) ## S3 method for class 'vglmer' sigma(object, ...) ## S3 method for class 'vglmer' ranef(object, ...) ## S3 method for class 'vglmer' coef(object, ...) ## S3 method for class 'vglmer' vcov(object, ...) ## S3 method for class 'vglmer' fitted(object, ...) ## S3 method for class 'vglmer' print(x, ...) ## S3 method for class 'vglmer' summary(object, display_re = TRUE, ...) ## S3 method for class 'vglmer' formula(x, form = "original", ...) format_vglmer(object) format_glmer(object) ELBO(object, type = c("final", "trajectory"))
object |
Model fit using |
... |
Not used; included to maintain compatibility with existing methods. |
x |
Model fit using |
display_re |
Default ( |
form |
Describes the type of formula to report:
|
type |
Default ( |
The accompanying functions are briefly described below.
coef
and vcov
return the mean and variance of the fixed effects
().
fixef
returns the mean of the fixed effects.
ranef
extracts the random effects () in a similar,
although slightly different format, to
lme4
. It includes the estimated
posterior mean and variance in a list of data.frames with one entry per
random effect .
fitted
extracts the estimated expected linear predictor, i.e.
at convergence.
summary
reports the estimates for all fixed effects as in lm
as
well as some summaries of the random effects (if display_re=TRUE
).
format_vglmer
collects the mean and variance of the fixed and random
effects into a single data.frame. This is useful for examining all of the
posterior estimates simultaneously. format_glmer
converts an object
estimated with [g]lmer
into a comparable format.
ELBO
extracts the ELBO from the estimated model. type
can be
set equal to "trajectory"
to get the estimated ELBO at each iteration
and assess convergence.
sigma
extracts the square root of the posterior mode of
if a linear model is used.
formula
extracts the formula associated with the vglmer
object.
By default, it returns the formula provided. The fixed and random effects
portions can be extracted separately using the form
argument.
The functions here return a variety of different objects depending on the specific function. "Details" describes the behavior of each one. Their output is similar to the typical behavior for the corresponding generic functions.