R/eim-class.R
bootstrap.Rd
This function runs the Expectation-Maximization (EM) algorithm nboot times. It then computes, for each component, the standard deviation across the nboot estimated probability matrices.
It supports both non-parametric and parametric models; the parametric mode is enabled by providing V and only supports method = "mult".
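The resampling idea described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `estimate_prob` is a hypothetical stand-in for the EM fit (it only computes pooled candidate shares), the data and sizes are made up, and drawing ballot boxes with replacement is an assumption about how the replicates are formed.

```r
set.seed(42)

# Made-up data: b ballot boxes, n_cand candidates, n_group groups (all assumed sizes)
b <- 50; n_cand <- 3; n_group <- 2
X <- matrix(rpois(b * n_cand, lambda = 40), nrow = b, ncol = n_cand)
W <- matrix(rpois(b * n_group, lambda = 60), nrow = b, ncol = n_group)

# Hypothetical stand-in for the EM fit: returns a (g x c) matrix whose rows
# are the pooled candidate shares. The real estimator is run_em(), not this.
estimate_prob <- function(X, W) {
  shares <- colSums(X) / sum(X)
  matrix(shares, nrow = ncol(W), ncol = ncol(X), byrow = TRUE)
}

nboot <- 30
# One (g x c) estimate per replicate, stacked into a (g x c x nboot) array
boot <- replicate(nboot, {
  idx <- sample.int(b, size = b, replace = TRUE)  # assumed: draw boxes with replacement
  estimate_prob(X[idx, , drop = FALSE], W[idx, , drop = FALSE])
})

sd_prob  <- apply(boot, c(1, 2), sd)    # elementwise standard deviation
avg_prob <- apply(boot, c(1, 2), mean)  # elementwise average
```

Here `sd_prob` and `avg_prob` play the role of the sd and avg_prob fields; each is a matrix with one row per group and one column per candidate.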
bootstrap(
  object = NULL,
  X = NULL,
  W = NULL,
  V = NULL,
  json_path = NULL,
  nboot = 100,
  allow_mismatch = TRUE,
  seed = NULL,
  maxnewton = 1,
  ...
)
object
An object of class eim, which can be created using the eim function. This parameter should not be used if either (i) the X and W matrices or (ii) json_path is supplied. See Note.
X
A (b x c) matrix representing candidate votes per ballot box.
W
A (b x g) matrix representing group votes per ballot box.
V
Optional (b x a) matrix with the attributes for each ballot box. This is only used for parametric models.
json_path
A path to a JSON file containing X, W (and optionally V) fields, stored as nested arrays. It may contain additional fields with other attributes, which will be added to the returned object.
nboot
Integer specifying how many times to run the EM algorithm.
allow_mismatch
Boolean. If TRUE, allows a mismatch between the voters and votes for each ballot box. If FALSE, throws an error if there is a mismatch. Default is TRUE. See Note for more details.
seed
An optional integer indicating the random seed for the randomized algorithms. This argument is only applicable if initial_prob = "random" or method is either "mcmc" or "mvn_cdf". Additionally, it sets the random draws of the ballot boxes.
maxnewton
Maximum number of Newton iterations used in the parametric M-step. Default is 1. Ignored if no covariates are provided (i.e., V = NULL).
...
Additional arguments passed to the run_em function that will execute the EM algorithm.
Returns an eim object with the sd field containing the estimated standard deviations of the probabilities and the avg_prob field with the average bootstrapped probability matrix. If an eim object is provided, its attributes (see run_em) are retained in the returned object.
For parametric models, it returns sd_beta and sd_alpha instead of sd and avg_prob.
This function can be executed using one of three mutually exclusive approaches:
By providing an existing eim object.
By supplying both input matrices (X and W) directly.
By specifying a JSON file (json_path) containing the matrices.
Exactly one of these options must be provided; supplying more than one, or none, results in an error.
When called with an eim object, the function updates the object with the computed results.
If an eim object is not provided, the function will create one internally using either the
supplied matrices or the data from the JSON file before executing the algorithm.
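The "exactly one input mode" rule can be illustrated with a small check. `resolve_input_mode` is a hypothetical helper written for this sketch, not a function of the package, and the error messages are invented; it also assumes that X and W must always be given together.

```r
# Hypothetical helper (not part of the package): reports which input mode was
# used, and stops if zero or several modes were supplied.
resolve_input_mode <- function(object = NULL, X = NULL, W = NULL, json_path = NULL) {
  # Assumption: X and W only make sense as a pair
  if (xor(is.null(X), is.null(W))) {
    stop("X and W must be supplied together.")
  }
  modes <- c(
    object   = !is.null(object),
    matrices = !is.null(X) && !is.null(W),
    json     = !is.null(json_path)
  )
  if (sum(modes) != 1) {
    stop("Provide exactly one of: an eim object, X and W, or json_path.")
  }
  names(modes)[modes]
}

mode <- resolve_input_mode(json_path = "path/to/election_data.json")
```

Calling the helper with no arguments, or with both matrices and a JSON path, raises an error, which mirrors the behaviour described in this note.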
If there are ballot boxes with a mismatch between W and X and allow_mismatch = TRUE, then: if method = "exact", D'Hondt is applied at each mismatched ballot box to add or remove the necessary voters from W so that its total matches the total number of votes (X); if method is "mvn_pdf", "mvn_cdf" or "mcmc", the number of voters (W) of each mismatched ballot box is scaled to match its total number of votes (X).
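The two corrections can be sketched in base R. Both functions below are hypothetical illustrations, not the package's code. In particular, how D'Hondt is used to add or remove voters is not spelled out here; this sketch reapportions the whole ballot-box total among the groups, which is only one plausible reading. The small matrices are made up.

```r
# Proportional scaling: rescale each row of W so that its total equals the
# corresponding row total of X. The result is generally non-integer.
# Assumes every row of W has a positive total.
scale_voters <- function(X, W) {
  W * (rowSums(X) / rowSums(W))
}

# D'Hondt apportionment: hand out `total` integer units one at a time, each to
# the group with the largest quotient weight / (units already received + 1).
dhondt <- function(weights, total) {
  seats <- integer(length(weights))
  for (i in seq_len(total)) {
    k <- which.max(weights / (seats + 1))
    seats[k] <- seats[k] + 1L
  }
  seats
}

# Made-up example: 2 ballot boxes, 2 candidates, 2 groups, with mismatched totals
X <- matrix(c(30, 20, 50, 45), nrow = 2, byrow = TRUE)  # vote totals: 50, 95
W <- matrix(c(25, 30, 60, 40), nrow = 2, byrow = TRUE)  # voter totals: 55, 100

W_scaled <- scale_voters(X, W)
W_dhondt <- t(sapply(seq_len(nrow(W)), function(i) dhondt(W[i, ], sum(X[i, ]))))
```

After either correction, the row totals of the adjusted W equal the row totals of X; the D'Hondt variant additionally keeps the counts as integers.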
# \donttest{
# Example 1: Using an 'eim' object directly
simulations <- simulate_election(
  num_ballots = 200,
  num_candidates = 5,
  num_groups = 3
)
model <- eim(X = simulations$X, W = simulations$W)
model <- bootstrap(
  object = model,
  nboot = 30,
  method = "mult",
  maxiter = 500,
  verbose = FALSE
)
#> Applying a D'Hondt correction for correcting mismatches in W
# Access the standard deviations through 'model'
print(model$sd)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0.007315623 0.009894301 0.012529311 0.006347813 0.007244017
#> [2,] 0.005345960 0.003848299 0.013209998 0.003900690 0.010052641
#> [3,] 0.004304229 0.006052943 0.008320448 0.007353700 0.008801976
# Example 2: Providing 'X' and 'W' matrices directly
model <- bootstrap(
  X = simulations$X,
  W = simulations$W,
  nboot = 15,
  method = "mvn_pdf",
  maxiter = 100,
  maxtime = 5,
  param_threshold = 0.01,
  allow_mismatch = FALSE
)
print(model$sd)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0.006828030 0.006229797 0.009471162 0.005453572 0.005902559
#> [2,] 0.005896021 0.004157179 0.014612996 0.006067319 0.010650224
#> [3,] 0.005956931 0.004287437 0.008102596 0.006138259 0.013142616
# }
# Example 3: Using a JSON file as input
if (FALSE) { # \dontrun{
  model <- bootstrap(
    json_path = "path/to/election_data.json",
    nboot = 70,
    method = "mult"
  )
  print(model$sd)
} # }
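Example 3 presupposes a JSON file with X and W fields stored as nested arrays. The sketch below writes such a file with base R only. `matrix_to_json` is a hypothetical helper, the matrices are made up, and the layout (one inner array per ballot box, i.e. row by row) is an assumption about what "nested arrays" means here.

```r
# Hypothetical helper: serialise a numeric matrix as a JSON array of row arrays,
# for example [[1,2],[3,4]]
matrix_to_json <- function(m) {
  rows <- apply(m, 1, function(r) paste0("[", paste(r, collapse = ","), "]"))
  paste0("[", paste(rows, collapse = ","), "]")
}

# Made-up data: 2 ballot boxes, 2 candidates, 2 groups
X <- matrix(c(30, 20, 50, 45), nrow = 2, byrow = TRUE)
W <- matrix(c(25, 25, 60, 35), nrow = 2, byrow = TRUE)

json_text <- paste0('{"X":', matrix_to_json(X), ',"W":', matrix_to_json(W), '}')

json_file <- tempfile(fileext = ".json")  # temporary path, for illustration only
writeLines(json_text, json_file)
```

The resulting path could then be passed as json_path. A dedicated JSON library such as jsonlite would be a more robust way to produce the file in practice.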