| Title: | Imputation Methods for Multivariate Multinomial Data |
|---|---|
| Description: | Implements imputation methods using EM and Data Augmentation for multinomial data following the work of Schafer 1997 <ISBN: 978-0-412-04061-0>. |
| Authors: | Alex Whitworth [aut, cre] |
| Maintainer: | Alex Whitworth <[email protected]> |
| License: | GPL-3 |
| Version: | 0.8.4 |
| Built: | 2026-05-13 09:09:06 UTC |
| Source: | https://github.com/alexwhitworth/imputemulti |
Creates a data-dependent prior for p-dimensional multinomial distributions
using a conjugate prior (e.g. a Dirichlet prior) based on 20

    data_dep_prior_multi(dat)

| Argument | Description |
|---|---|
| dat | A data.frame. |

A data.frame containing identifiers for all possible observation patterns and
the associated prior counts.
Darnieder, William Francis. Bayesian methods for data-dependent priors. Dissertation. The Ohio State University, 2011.
A multivariate multinomial model imputed by EM or Data Augmentation is
represented as a mod_imputeMulti object. A complete
dataset and model is represented as an imputeMulti object.
Inherits from mod_imputeMulti. Additional slots are supplied for (1) the
call to multinomial_impute; (2) the missing and imputed data;
and (3) the number of observations with missing values.

    ## S4 method for signature 'imputeMulti'
    show(object)
    get_imputations(object)
    ## S4 method for signature 'imputeMulti'
    get_imputations(object)
    n_miss(object)

| Argument | Description |
|---|---|
| object | an object of class "imputeMulti" |

| Slot | Description |
|---|---|
| Gcall | the call to multinomial_impute |
| method | the modeling method |
| mle_call | the call to the estimation function |
| mle_iter | the number of iterations in estimation |
| mle_log_lik | the final log-likelihood |
| mle_cp | the conjugate prior, if any |
| mle_x_y | the MLE estimate of the sufficient statistics and parameters |
| data | a list of the missing and imputed data |
| nmiss | the number of observations with missing data |

Objects are created by calls to
multinomial_impute, multinomial_em, or
multinomial_data_aug.
See also: multinomial_impute, multinomial_em, multinomial_data_aug
Function that checks if the target object is an imputeMulti object.

    is.imputeMulti(x)

| Argument | Description |
|---|---|
| x | any R object. |

Returns TRUE if its argument has class "imputeMulti" among its classes and
FALSE otherwise.
Function that checks if the target object is a mod_imputeMulti object.

    is.mod_imputeMulti(x)

| Argument | Description |
|---|---|
| x | any R object. |

Returns TRUE if its argument has class "mod_imputeMulti" among its classes and
FALSE otherwise.
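
As a rough illustration of how such class predicates behave when one S4 class inherits from another, the sketch below uses two toy classes. The names toy_model, toy_imputed and is_toy_model are hypothetical and are not part of the package; the package's own class definitions may differ.

```r
library(methods)

# Hypothetical toy classes (not the package's definitions): a model class
# and a class that extends it with one extra slot.
setClass("toy_model",
         slots = c(method = "character", mle_iter = "numeric"))
setClass("toy_imputed",
         contains = "toy_model",
         slots = c(nmiss = "numeric"))

# Hypothetical predicate: TRUE if x has "toy_model" among its classes.
is_toy_model <- function(x) is(x, "toy_model")

obj <- new("toy_imputed", method = "EM", mle_iter = 12, nmiss = 3)
is_toy_model(obj)  # an instance of the subclass also counts as a toy_model
is_toy_model(1:3)  # an ordinary integer vector does not
```

Because the check follows the inheritance chain, a predicate for the parent class also accepts objects of the child class.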
Merge the imputed dataset from an imputeMulti object with the original dataset.
Merging is done by rownames, since imputeMulti maintains row-order during imputation.

    merge_imputed(impute_obj, y, ...)

| Argument | Description |
|---|---|
| impute_obj | An object of class "imputeMulti". |
| y | The dataset from which the missing data was imputed. |
| ... | Arguments to be passed to other methods |

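
The idea of merging by rownames can be sketched in base R. This is only an illustration under assumed toy data: the data frames original and imputed and their columns are hypothetical, and the sketch does not reproduce the internals of merge_imputed.

```r
# Hypothetical toy data: `original` has a missing value, `imputed` holds
# filled-in values for the same rows (row names are preserved).
original <- data.frame(
  id  = c("a", "b", "c"),
  sex = factor(c("M", NA, "F")),
  row.names = c("r1", "r2", "r3")
)
imputed <- data.frame(
  sex_imputed = factor(c("M", "F", "F")),
  row.names = c("r1", "r2", "r3")
)

# Base R merge on row names; the result gains a "Row.names" key column.
merged <- merge(original, imputed, by = "row.names", sort = FALSE)
merged
```

Joining on row names only works if both data frames keep the same row names, which is why preserving row order and names during imputation matters.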
A multivariate multinomial model imputed by EM or Data Augmentation is
represented as a mod_imputeMulti object. A complete
dataset and model is represented as an imputeMulti object.
Slots for mod_imputeMulti objects include: (1) the modeling method;
(2) the call to the estimation function; (3) the number of iterations in estimation;
(4) the final log-likelihood; (5) the conjugate prior if any; (6) the MLE estimate of
the sufficient statistics and parameters.

    ## S4 method for signature 'mod_imputeMulti'
    show(object)
    get_parameters(object)
    ## S4 method for signature 'mod_imputeMulti'
    get_parameters(object)
    get_prior(object)
    ## S4 method for signature 'mod_imputeMulti'
    get_prior(object)
    get_iterations(object)
    ## S4 method for signature 'mod_imputeMulti'
    get_iterations(object)
    get_logLik(object)
    ## S4 method for signature 'mod_imputeMulti'
    get_logLik(object)
    get_method(object)
    ## S4 method for signature 'mod_imputeMulti'
    get_method(object)
    ## S4 method for signature 'imputeMulti'
    n_miss(object)

| Argument | Description |
|---|---|
| object | an object of class "mod_imputeMulti" |

| Slot | Description |
|---|---|
| method | the modeling method |
| mle_call | the call to the estimation function |
| mle_iter | the number of iterations in estimation |
| mle_log_lik | the final log-likelihood |
| mle_cp | the conjugate prior, if any |
| mle_x_y | the MLE estimate of the sufficient statistics and parameters |

Objects are created by calls to
multinomial_impute, multinomial_em, or
multinomial_data_aug.
See also: multinomial_impute, multinomial_em, multinomial_data_aug
Implement the Data Augmentation algorithm for multivariate multinomial data given
observed counts of complete and missing data (x_y and z_Os_y). Allows for specification
of a Dirichlet conjugate prior.

    multinomial_data_aug(
      x_y,
      z_Os_y,
      enum_comp,
      conj_prior = c("none", "data.dep", "flat.prior", "non.informative"),
      alpha = NULL,
      burnin = 100,
      post_draws = 1000,
      verbose = FALSE
    )

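| Argument | Description |
|---|---|
| x_y | A data.frame of observed-data sufficient statistics, as returned by multinomial_stats(dat, output = "x_y"). |
| z_Os_y | A data.frame of marginally-observed summary statistics, as returned by multinomial_stats(dat, output = "z_Os_y"). |
| enum_comp | A data.frame enumerating all possible observed patterns, as returned by multinomial_stats(dat, output = "possible.obs"). |
| conj_prior | A string specifying the conjugate prior. One of "none", "data.dep", "flat.prior", or "non.informative". |
| alpha | The vector of prior counts for the conjugate prior. Defaults to NULL. |
| burnin | A scalar specifying the number of iterations to use as a burn-in. Defaults to 100. |
| post_draws | An integer specifying the number of draws from the posterior distribution. Defaults to 1000. |
| verbose | Logical flag for verbose output. Defaults to FALSE. |
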
An object of class mod_imputeMulti.
See also: multinomial_em, multinomial_impute

    ## Not run:
    data(tract2221)
    x_y <- multinomial_stats(tract2221[,1:4], output = "x_y")
    z_Os_y <- multinomial_stats(tract2221[,1:4], output = "z_Os_y")
    x_possible <- multinomial_stats(tract2221[,1:4], output = "possible.obs")
    imputeDA_mle <- multinomial_data_aug(x_y, z_Os_y, x_possible,
                                         conj_prior = "none", verbose = TRUE)
    ## End(Not run)

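
To make the Data Augmentation idea concrete, here is a minimal base R sketch on a hypothetical 2 x 2 table in which some observations have only the first variable recorded. It alternates an imputation step (I-step) and a posterior step (P-step) with a Dirichlet prior. All counts, the prior alpha and the helper rdirichlet1 are assumptions made for illustration; this is not the package's implementation.

```r
set.seed(1)

# Hypothetical 2 x 2 table, cells ordered
# (Y1=1,Y2=1), (Y1=1,Y2=2), (Y1=2,Y2=1), (Y1=2,Y2=2).
x_full <- c(30, 10, 15, 25)  # counts with both variables observed
z_y1   <- c(12, 8)           # counts with only Y1 observed (Y2 missing)
alpha  <- rep(1, 4)          # assumed Dirichlet prior counts
burnin <- 100
post_draws <- 1000

# Hypothetical helper: one draw from Dirichlet(a) via normalized gammas.
rdirichlet1 <- function(a) {
  g <- rgamma(length(a), shape = a)
  g / sum(g)
}

theta <- rep(0.25, 4)
draws <- matrix(NA_real_, nrow = post_draws, ncol = 4)
for (iter in seq_len(burnin + post_draws)) {
  # I-step: impute the missing Y2 for partially observed rows given theta.
  y1_1 <- rbinom(1, z_y1[1], theta[1] / (theta[1] + theta[2]))
  y1_2 <- rbinom(1, z_y1[2], theta[3] / (theta[3] + theta[4]))
  completed <- x_full + c(y1_1, z_y1[1] - y1_1, y1_2, z_y1[2] - y1_2)
  # P-step: draw theta from its Dirichlet posterior given completed counts.
  theta <- rdirichlet1(alpha + completed)
  # Keep only the draws made after the burn-in period.
  if (iter > burnin) draws[iter - burnin, ] <- theta
}
post_mean <- colMeans(draws)
post_mean
```

The first burnin iterations are discarded so that the retained draws depend less on the arbitrary starting value of theta.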
Implement the EM algorithm for multivariate multinomial data given
observed counts of complete and missing data (x_y and z_Os_y). Allows for
specification of a Dirichlet conjugate prior.

    multinomial_em(
      x_y,
      z_Os_y,
      enum_comp,
      n_obs,
      conj_prior = c("none", "data.dep", "flat.prior", "non.informative"),
      alpha = NULL,
      tol = 5e-07,
      max_iter = 10000,
      verbose = FALSE
    )

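| Argument | Description |
|---|---|
| x_y | A data.frame of observed-data sufficient statistics, as returned by multinomial_stats(dat, output = "x_y"). |
| z_Os_y | A data.frame of marginally-observed summary statistics, as returned by multinomial_stats(dat, output = "z_Os_y"). |
| enum_comp | A data.frame enumerating all possible observed patterns, as returned by multinomial_stats(dat, output = "possible.obs"). |
| n_obs | An integer specifying the number of observations in the original data. |
| conj_prior | A string specifying the conjugate prior. One of "none", "data.dep", "flat.prior", or "non.informative". |
| alpha | The vector of prior counts for the conjugate prior. Defaults to NULL. |
| tol | A scalar specifying the convergence criterion. Defaults to 5e-07. |
| max_iter | An integer specifying the maximum number of allowable iterations. Defaults to 10000. |
| verbose | Logical flag for verbose output. Defaults to FALSE. |
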
An object of class mod_imputeMulti.
See also: multinomial_data_aug, multinomial_impute

    ## Not run:
    data(tract2221)
    x_y <- multinomial_stats(tract2221[,1:4], output = "x_y")
    z_Os_y <- multinomial_stats(tract2221[,1:4], output = "z_Os_y")
    x_possible <- multinomial_stats(tract2221[,1:4], output = "possible.obs")
    imputeEM_mle <- multinomial_em(x_y, z_Os_y, x_possible, n_obs = nrow(tract2221),
                                   conj_prior = "none", verbose = TRUE)
    ## End(Not run)

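
The EM iteration itself can be sketched in base R on a hypothetical 2 x 2 table where some observations have only the first variable recorded. The counts below are invented, no prior is used, and the stopping rule (largest absolute change in the cell probabilities below tol) is an assumption; this is an illustration of the technique, not the package's code.

```r
# Hypothetical 2 x 2 table, cells ordered
# (Y1=1,Y2=1), (Y1=1,Y2=2), (Y1=2,Y2=1), (Y1=2,Y2=2).
x_full <- c(30, 10, 15, 25)  # counts with both variables observed
z_y1   <- c(12, 8)           # counts with only Y1 observed (Y2 missing)
n_obs  <- sum(x_full) + sum(z_y1)
row_of <- c(1, 1, 2, 2)      # Y1 level of each cell

tol <- 5e-07
max_iter <- 10000
theta <- rep(0.25, 4)        # starting values for the cell probabilities

for (iter in seq_len(max_iter)) {
  # E-step: split each partially observed count across its compatible
  # cells in proportion to the current conditional probabilities.
  row_tot  <- as.vector(tapply(theta, row_of, sum))
  expected <- x_full + z_y1[row_of] * theta / row_tot[row_of]
  # M-step: re-estimate cell probabilities from the expected counts.
  theta_new <- expected / n_obs
  converged <- max(abs(theta_new - theta)) < tol
  theta <- theta_new
  if (converged) break
}
theta
```

For this toy table the estimates can be checked by hand: the Y1 margin uses all 100 observations, while the conditional distribution of Y2 given Y1 uses only the fully observed rows.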
Impute values for multivariate multinomial data using either EM or Data Augmentation.

    multinomial_impute(
      dat,
      method = c("EM", "DA"),
      conj_prior = c("none", "data.dep", "flat.prior", "non.informative"),
      alpha = NULL,
      verbose = FALSE,
      ...
    )

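| Argument | Description |
|---|---|
| dat | A data.frame of multivariate multinomial data. |
| method | A string specifying the imputation method. One of "EM" or "DA" (Data Augmentation). |
| conj_prior | A string specifying the conjugate prior. One of "none", "data.dep", "flat.prior", or "non.informative". |
| alpha | The vector of prior counts for the conjugate prior. Defaults to NULL. |
| verbose | Logical flag for verbose output. Defaults to FALSE. |
| ... | Arguments to be passed to other methods |
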
An object of class imputeMulti.
Schafer, Joseph L. Analysis of Incomplete Multivariate Data. Chapter 7. CRC Press, 1997.
See also: data_dep_prior_multi, multinomial_em

    ## Not run:
    data(tract2221)
    imputeEM <- multinomial_impute(tract2221[,1:4], method = "EM",
                                   conj_prior = "none", verbose = TRUE)
    imputeDA <- multinomial_impute(tract2221[,1:4], method = "DA",
                                   conj_prior = "non.informative", verbose = TRUE)
    ## End(Not run)

Calculate observed-data sufficient statistics, marginally-observed summary statistics or enumerate all possible observed patterns from a multivariate multinomial dataset.

    multinomial_stats(dat, output = c("x_y", "z_Os_y", "possible.obs"))

| Argument | Description |
|---|---|
| dat | A data.frame of multivariate multinomial data. |
| output | A string specifying the desired output. One of "x_y", "z_Os_y", or "possible.obs". |

A data.frame containing either sufficient statistics or possible observed patterns.

    ## Not run:
    data(tract2221)
    obs_suff_stats <- multinomial_stats(tract2221, output = "x_y")
    marg_obs_suff_stats <- multinomial_stats(tract2221, output = "z_Os_y")
    ## End(Not run)

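
The two simplest quantities described here, the enumeration of all possible patterns and the counts of fully observed patterns, can be approximated in base R. The toy data frame and the column name counts are hypothetical, and the layout returned by multinomial_stats itself may differ.

```r
# Hypothetical toy data with two factor variables and some missing values.
dat <- data.frame(
  gender   = factor(c("M", "F", "F", NA, "M"), levels = c("F", "M")),
  employed = factor(c("yes", "no", NA, "yes", "yes"), levels = c("no", "yes"))
)

# Enumerate every combination of factor levels (all possible patterns).
possible <- expand.grid(lapply(dat, levels))

# Count how often each pattern occurs among the fully observed rows.
complete <- dat[complete.cases(dat), ]
counts <- as.data.frame(table(complete), responseName = "counts")

possible
counts
```

Patterns that never occur among the complete rows still appear with a count of zero, because table() tabulates over all factor levels.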
Summary method for class "imputeMulti".

    ## S4 method for signature 'imputeMulti'
    summary(object, ...)

| Argument | Description |
|---|---|
| object | an object of class "imputeMulti" |
| ... | further arguments passed to or from other methods. |

Summary method for class "mod_imputeMulti".

    ## S4 method for signature 'mod_imputeMulti'
    summary(object, ...)

| Argument | Description |
|---|---|
| object | an object of class "mod_imputeMulti" |
| ... | further arguments passed to or from other methods. |

Computes the supremum (sup) of the L1 distance between x and y.

    supDistC(x, y)

| Argument | Description |
|---|---|
| x | A numeric vector. |
| y | A numeric vector. |

a numeric scalar.
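
One reading of this description is the largest absolute element-wise difference between two numeric vectors. The base R sketch below follows that reading; the helper sup_dist is hypothetical, and both the interpretation and the comparison against a tolerance are assumptions rather than a statement of how supDistC is implemented or used.

```r
# Hypothetical helper (not the package's supDistC): largest absolute
# element-wise difference between two numeric vectors of equal length.
sup_dist <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  max(abs(x - y))
}

theta_old <- c(0.20, 0.30, 0.50)
theta_new <- c(0.25, 0.30, 0.45)
d <- sup_dist(theta_old, theta_new)

# Assumed use: compare the distance with a convergence tolerance.
d < 5e-07
```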
A dataset containing attributes of 3974 individuals living in census tract 2221 in Los Angeles County, CA. Data comes from the 5-year American Community Survey with end year 2014. Missing values have been inserted.

    tract2221

A data.frame with 3974 rows and 10 variables. All variables are of class factor:

- The individual's age, coded in roughly 5-year age buckets.
- The individual's gender: Male or Female.
- The individual's marital status. Takes one of 5 levels: never_mar (never married); married (married); mar_apart (married but living apart); divorced (divorced); widowed (widowed).
- The individual's educational attainment. Takes one of 7 levels: lt_hs (less than high school); some_hs (completed some high school but did not graduate); hs_grad (high school graduate); some_col (completed some college but did not graduate); assoc_dec (completed an associate's degree); ba_deg (obtained a bachelor's degree); grad_deg (obtained a graduate or professional degree).
- The individual's employment status. Takes one of 3 levels: employed (in the labor force and employed); unemployed (in the labor force and unemployed); not_in_labor_force (not in the labor force).
- The individual's nativity status. Takes one of 4 values: born_state_residence (born in the state of residence); born_other_state (born in another US state); born_out_us (a US citizen born outside the US); foreigner (foreign born).
- The individual's poverty status in the past year. Takes one of 2 levels: below_pov_level (below the poverty level); at_above_pov_level (at or above the poverty level).
- The individual's geographic mobility in the last year. Takes one of 5 values: same house (lived in the same house); same county (moved within the same county); same state (moved from a different county within the same state); diff state (moved from a different state); moved from abroad (moved from another country).
- The individual's annual income. Takes one of 9 levels: no_income (no income); 1_lt10k (income < $10,000); 10k_lt15k ($10,000-$14,999); 15k_lt25k ($15,000-$24,999); 25k_lt35k ($25,000-$34,999); 35k_lt50k ($35,000-$49,999); 50k_lt65k ($50,000-$64,999); 65k_lt75k ($65,000-$74,999); gt75k ($75,000+).
- The individual's ethnicity.