In addition to the non-covariate approach presented in the main vignette (see demonstration.Rmd), fastei also supports a covariate approach that incorporates electoral-unit–level covariates. This approach follows the same general strategy of maximizing the log-likelihood via an EM algorithm, with the M-step carried out through a multinomial formulation.

The covariate specification requires computing a Hessian matrix, which can be computationally costly; therefore, an approximation is used in practice. In addition, parameter updates within each M-step are obtained via a Newton–Raphson procedure. Empirically, performing a single Newton step (i.e., maxnewton = 1) is typically sufficient to achieve convergence.
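To fix ideas, a single Newton–Raphson update takes the form sketched below. This is purely illustrative of the step that maxnewton controls; loglik_grad() and loglik_hess() are hypothetical placeholders for the gradient and the (approximated) Hessian of the log-likelihood, not functions exported by fastei.

newton_step <- function(theta, loglik_grad, loglik_hess) {
    g <- loglik_grad(theta) # gradient of the log-likelihood at theta
    H <- loglik_hess(theta) # (approximated) Hessian at theta
    theta - solve(H, g) # theta_new = theta - H^(-1) g
}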

The main difference from the non-parametric approach is that the probability matrix is defined at the electoral-unit level rather than as a single global probability matrix. This enables the inclusion of electoral-unit covariates, which may better capture and explain variation in voting behavior across units. Because there is no single probability matrix whose changes can be monitored, convergence of the EM algorithm is assessed solely through increases in the log-likelihood.
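For reference, a stopping rule based on log-likelihood increases can be sketched as follows; ll_trace stands for a hypothetical vector of per-iteration log-likelihood values and is not a field of the object returned by fastei.

# Declare convergence once the log-likelihood gain between consecutive
# EM iterations falls below a tolerance (EM guarantees the gain is non-negative)
em_converged <- function(ll_trace, tol = 1e-6) {
    n <- length(ll_trace)
    n > 1 && (ll_trace[n] - ll_trace[n - 1]) < tol
}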

Under this model, probabilities are computed using a multinomial softmax. Specifically, letting $p_{bgc}$ denote the probability associated with electoral unit $b$, group $g$, and candidate $c$, we define:

$$
p_{bgc} = \frac{\exp\!\big(\beta_{gc} + (V\alpha^{T})_{bc}\big)}{1 + \sum_{j=1}^{C-1}\exp\!\big(\beta_{gj} + (V\alpha^{T})_{bj}\big)}
$$

Thus, the initial probabilities depend on the parameters $\alpha$ and $\beta$, which default to zero if not provided.
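To make the formula concrete, the following sketch evaluates it directly for hypothetical dimensions and parameter values; it does not reproduce fastei's internal representation. The last candidate acts as the reference category, so its numerator is implicitly $\exp(0) = 1$.

set.seed(1)
B <- 4; G <- 2; C <- 3; A <- 2 # units, groups, candidates, covariates
V <- matrix(runif(B * A), B, A) # hypothetical covariate matrix [b x a]
beta <- matrix(rnorm(G * (C - 1)), G, C - 1) # group effects; candidate C is the baseline
alpha <- matrix(rnorm((C - 1) * A), C - 1, A) # covariate effects

eta <- V %*% t(alpha) # (V alpha^T)_{bc}, a [b x (C - 1)] matrix
p <- array(NA_real_, dim = c(B, G, C))
for (b in seq_len(B)) {
    for (g in seq_len(G)) {
        num <- exp(beta[g, ] + eta[b, ]) # numerators for candidates 1..C-1
        p[b, g, ] <- c(num, 1) / (1 + sum(num)) # probabilities sum to one over candidates
    }
}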

Simulating a parametric election

An election can be simulated using the function simulate_election(). When employing the covariate approach, it is necessary to specify both the number of covariates (num_covariates) and the number of districts (num_districts). In practice, a district may represent a collection of ballot boxes that share the same covariate values. In the example below, we simulate an election with three candidates, two demographic groups, 20 covariates, and 10 districts.

library(fastei)

election <- simulate_election(
    num_ballots = 100,
    num_candidates = 3,
    num_groups = 2,
    num_covariates = 20,
    num_districts = 10,
    seed = 42
)

Note that the covariate setting must be explicitly enabled by setting both num_covariates and num_districts to values different from zero.

The results of the simulation can be inspected by printing the resulting object. The simulated election is returned as an eim object.

election
## eim ecological inference model
## Candidates' vote matrix (X) [b x c]:
##      [,1] [,2] [,3]
## [1,]   82    3   15
## [2,]   82    2   16
## [3,]   82    0   18
## [4,]   85    3   12
## [5,]   41   44   15
## .
## .
## .
## Group-level voter matrix (W) [b x g]:
##      [,1] [,2]
## [1,]   73   27
## [2,]   70   30
## [3,]   73   27
## [4,]   76   24
## [5,]   68   32
## .
## .
## .
## Attribute matrix (V) [b x a]:
##           [,1]      [,2]       [,3]       [,4]       [,5]       [,6]      [,7]
## [1,] 0.1233006 0.7853494 0.04989215 0.73170760 0.44456953 0.97933430 0.3937769
## [2,] 0.3110496 0.4533034 0.18735671 0.31526080 0.06038559 0.49690343 0.1419089
## [3,] 0.9457392 0.1357424 0.98265941 0.38645401 0.32750602 0.09307467 0.2798067
## [4,] 0.5000251 0.8852210 0.32827407 0.33244599 0.87842905 0.21177366 0.5648222
## [5,] 0.1352305 0.3367135 0.17099639 0.08977797 0.93060489 0.93050075 0.9351395
##             [,8]       [,9]     [,10]     [,11]     [,12]     [,13]     [,14]
## [1,] 0.002378107 0.09091831 0.5763365 0.6525101 0.3220339 0.4700924 0.5247906
## [2,] 0.160429915 0.79859664 0.0736678 0.2795735 0.7582701 0.3650141 0.6568109
## [3,] 0.399272951 0.35978525 0.1646274 0.9799076 0.1044129 0.2801246 0.2295194
## [4,] 0.675319577 0.04048758 0.7398908 0.6438641 0.7102779 0.5997159 0.7212260
## [5,] 0.480372016 0.04108634 0.4757110 0.5825784 0.9664774 0.8185696 0.4907504
##          [,15]     [,16]      [,17]      [,18]     [,19]     [,20]
## [1,] 0.5132395 0.5815542 0.48275690 0.54551435 0.9213298 0.3968551
## [2,] 0.6307262 0.2401496 0.97917358 0.44745733 0.3626018 0.6969568
## [3,] 0.4187716 0.7218879 0.81151679 0.08388484 0.8551350 0.6593197
## [4,] 0.8792659 0.1459287 0.54291282 0.93013369 0.3009062 0.4073507
## [5,] 0.1079871 0.1528388 0.07236709 0.01644819 0.4656624 0.3069202
## .
## .
## .

Running the model

The model is estimated using the same function as in the non-covariate case. A minimal example is shown below:

result <- run_em(election)

The quality of the estimated solution can be assessed by computing the mean absolute error (MAE):

mae <- mean(abs(result$prob - result$real_prob))
mae
## [1] 0.01356109

If a global probability matrix analogous to the one obtained in the non-parametric approach is desired, it can be computed as a weighted average using the group sizes contained in W. Formally, the global probability for group $g$ and candidate $c$ is defined as

$$
P_{gc} = \frac{\sum_b W_{bg}\, p_{bgc}}{\sum_b W_{bg}}
$$

The following code snippet implements this aggregation (note that the global matrix could also be obtained with the as.matrix() function):

global_prob <- function(W, prob) {
    # Weighted average of the unit-level probabilities, weighting each
    # electoral unit by the corresponding group size recorded in W
    G <- dim(prob)[1]
    C <- dim(prob)[2]
    P <- matrix(NA_real_, G, C)
    for (g in seq_len(G)) {
        w <- W[, g] # voters of group g in each electoral unit
        P[g, ] <- colSums(t(prob[g, , ]) * w) / sum(w)
    }
    return(P)
}

Using this function, the estimated global probability matrix can be obtained as follows:

estimated_global_probability <- global_prob(result$W, result$prob)
estimated_global_probability
##           [,1]       [,2]      [,3]
## [1,] 0.6225445 0.16127307 0.2161825
## [2,] 0.5684556 0.06672677 0.3648177

The MAE for the global probabilities is then computed by comparing the estimated matrix with the true global probabilities:

real_global_probability <- global_prob(election$W, election$real_prob)

parametric_global_mae <- mean(abs(estimated_global_probability - real_global_probability))
parametric_global_mae
## [1] 0.007670176

Finally, the contribution of covariates to estimation accuracy can be evaluated by comparing the covariate model with the non-covariate approach. This is achieved by removing the covariate matrix V and re-estimating the model:

non_parametric_election <- election
non_parametric_election$V <- NULL
non_parametric_result <- run_em(non_parametric_election)

non_parametric_mae <- mean(abs(non_parametric_result$prob - real_global_probability))

improvement <- (non_parametric_mae - parametric_global_mae) /
    non_parametric_mae * 100
improvement
## [1] 9.298436

The resulting value represents the percentage reduction in MAE achieved by incorporating covariates, relative to the non-parametric baseline.

Reducing dimensionality with PCA

When working with a large number of covariates, it may be advantageous to reduce dimensionality using Principal Component Analysis (PCA). Dimensionality reduction can improve both computational efficiency and model interpretability. The PCA() function allows the covariate matrix V to be transformed prior to running the EM algorithm.

Dimensionality can be reduced either by explicitly specifying the number of principal components to retain via the argument components, or by providing a threshold for the proportion of explained variance using sd_threshold. The example below illustrates how PCA can be applied to the covariate matrix V in the simulated election:

PCA_election <- PCA(election, components = 2)
dim(PCA_election$V)
## [1] 100   2

As expected, the resulting covariate matrix contains only two columns, corresponding to the two retained principal components. The EM algorithm can then be run on this modified election object:

pca_result <- run_em(PCA_election)
pca_global_probability <- global_prob(pca_result$W, pca_result$prob)
pca_global_mae <- mean(abs(pca_global_probability - real_global_probability))

In some cases, reducing the number of covariates can lead to improved estimation accuracy. This can be evaluated by comparing the PCA-based model with the non-parametric baseline:

pca_improvement <- (non_parametric_mae - pca_global_mae) /
    non_parametric_mae * 100
pca_improvement
## [1] 48.14994

The resulting value represents the percentage reduction in MAE achieved by applying PCA relative to the non-parametric approach.
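As an alternative to fixing the number of components, the reduction can instead be driven by the explained-variance criterion mentioned above. The call below is a sketch that assumes, as described earlier, that sd_threshold takes a proportion of explained variance; the value 0.9 is purely illustrative.

PCA_election_var <- PCA(election, sd_threshold = 0.9)
dim(PCA_election_var$V)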