This function simulates an election by creating matrices of candidate votes (X) and voter counts per demographic group (W) across a specified number of ballot boxes. It either (i) receives as input or (ii) generates a probability matrix (prob) indicating how likely each demographic group is to vote for each candidate. It supports both non-parametric and parametric simulations; set num_covariates and num_districts greater than zero to run the parametric simulation, which additionally generates V, real_alpha, and real_beta.

By default, the number of voters per ballot box (ballot_voters) is set to a vector of 100 with length num_ballots. You can optionally override this by providing a custom vector.

Optional parameters are available to control the distribution of votes:

  • group_proportions: A vector of length num_groups specifying the overall proportion of each demographic group. Entries must sum to one and be non-negative.

  • prob: A user-supplied probability matrix of dimension (num_groups x num_candidates). If provided, this matrix is used directly. Otherwise, voting probabilities for each group are drawn from a Dirichlet distribution.
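
As a rough illustration of the default behaviour when prob is not supplied, the sketch below draws each row of a probability matrix from a Dirichlet distribution with every parameter set to one, by normalizing independent Gamma draws. The helper name draw_prob is hypothetical and is not part of the package; the package's internal sampler may differ.

```r
# Hypothetical helper (not part of the package): draw each row of a
# (num_groups x num_candidates) matrix from a Dirichlet(1, ..., 1)
# by normalizing independent Gamma(shape = 1) draws.
draw_prob <- function(num_groups, num_candidates) {
  g <- matrix(rgamma(num_groups * num_candidates, shape = 1),
              nrow = num_groups, ncol = num_candidates)
  g / rowSums(g)  # divide every row by its total so it sums to one
}

set.seed(42)
prob <- draw_prob(num_groups = 4, num_candidates = 3)
dim(prob)      # one row per group, one column per candidate
rowSums(prob)  # every row sums to one
```

A matrix built this way has the shape expected by the prob argument, so it could in principle be passed to simulate_election() in place of the default.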

simulate_election(
  num_ballots,
  num_candidates,
  num_groups,
  ballot_voters = rep(100, num_ballots),
  lambda = 0.5,
  seed = NULL,
  group_proportions = rep(1/num_groups, num_groups),
  prob = NULL,
  num_covariates = 0,
  num_districts = 0
)

Arguments

num_ballots

Number of ballot boxes (b).

num_candidates

Number of candidates (c).

num_groups

Number of demographic groups (g).

ballot_voters

A vector of length num_ballots representing the number of voters per ballot box. Defaults to rep(100, num_ballots).

lambda

A numeric value between 0 and 1 that represents the fraction of voters that are randomly assigned to ballot-boxes. The remaining voters are assigned sequentially according to their demographic group.

  • lambda = 0: The assignment of voters to ballot-boxes is fully sequential in terms of their demographic group. This leads to a high heterogeneity of the voters' groups across ballot-boxes.

  • lambda = 1: The assignment of voters to ballot-boxes is fully random. This leads to a low heterogeneity of the voters' group across ballot-boxes.

Default value is set to 0.5. See Shuffling Mechanism for more details.

seed

If provided, overrides the current global seed. Defaults to NULL.

group_proportions

Optional. A vector of length num_groups that indicates the fraction of voters that belong to each group. Defaults to rep(1/num_groups, num_groups), so that all groups are of the same size.

prob

Optional. A user-supplied probability matrix of dimension (g x c). If provided, this matrix is used as the underlying voting probability distribution. If not supplied, each row is sampled from a Dirichlet distribution with each parameter set to one.

num_covariates

Optional. Number of covariates (a) used to build the parametric covariates matrix V.

num_districts

Optional. Number of districts to which ballot boxes are assigned. Used only when num_covariates is not zero.

Value

An eim object. For the non-parametric case it contains:

X

A (b x c) matrix with candidates' votes for each ballot box.

W

A (b x g) matrix with voters' groups for each ballot-box.

real_prob

A (g x c) matrix with the probability that a voter from each group votes for each candidate. If prob is provided, real_prob is equal to it.

outcome

A (b x g x c) array with the number of votes for each candidate in each ballot box, broken down by group.
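
The relationship between these four elements can be sketched in base R. The snippet below is only an illustration under a stated assumption: that the votes of each group in each ballot box are multinomial draws using that group's row of the probability matrix. This page does not confirm that the package samples this way. The example values of W and prob are made up.

```r
# Sketch only: assumes multinomial voting within each (box, group) cell.
set.seed(7)
W <- matrix(c(60, 40,
              30, 70,
              50, 50), nrow = 3, byrow = TRUE)      # (b x g) = (3 x 2)
prob <- matrix(c(0.8, 0.2,
                 0.3, 0.7), nrow = 2, byrow = TRUE) # (g x c) = (2 x 2)

# k is used for the number of candidates to avoid masking base::c
b <- nrow(W); g <- ncol(W); k <- ncol(prob)
outcome <- array(0L, dim = c(b, g, k))
for (i in seq_len(b)) {
  for (j in seq_len(g)) {
    # split the W[i, j] voters of group j in box i among the candidates
    outcome[i, j, ] <- rmultinom(1, size = W[i, j], prob = prob[j, ])
  }
}

X <- apply(outcome, c(1, 3), sum)       # (b x c): sum over groups
W_back <- apply(outcome, c(1, 2), sum)  # (b x g): sum over candidates
```

Summing outcome over the group dimension gives X, and summing it over the candidate dimension recovers W, so each row of X and the matching row of W add up to the same number of voters.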

When num_covariates and num_districts are not zero, it returns:

X

A (b x c) matrix with candidates' votes for each ballot box.

W

A (b x g) matrix with voters' groups for each ballot-box.

V

A (b x a) matrix with ballot-box attributes.

real_prob

A (g x c x b) array with ballot-box probabilities.

real_alpha

A ((c-1) x a) matrix of true attribute parameters.

real_beta

A (g x (c-1)) matrix of true group parameters.

Shuffling Mechanism

Without loss of generality, consider an order relation of groups and ballot-boxes. The shuffling step is controlled by the lambda parameter and operates as follows:

  1. Initial Assignment: Voters are assigned to each ballot-box sequentially according to their demographic group. More specifically, the first ballot-boxes receive voters from the first group. Then, the next ballot-boxes receive voters from the second group. This continues until all voters have been assigned. Note that most ballot-boxes will contain voters from a single group (as long as the number of ballot-boxes exceeds the number of groups).

  2. Shuffling: A fraction lambda of voters who have already been assigned is selected at random. Then, the ballot-box assignment of this sample is shuffled. Hence, different lambda values are interpreted as follows:

    • lambda = 0 means no one is shuffled (the initial assignment remains).

    • lambda = 1 means all individuals are shuffled.

    • Intermediate values like lambda = 0.5 shuffle half the voters.

Using a high value of lambda (greater than 0.7) is not recommended, because higher values make ballot boxes more similar in terms of voters' group composition, which can make the voting probabilities difficult to identify.
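
The two steps above can be sketched at the voter level in base R. The helper shuffle_sketch below is hypothetical and is not the package's implementation; in particular, the rounding of group sizes (leftover voters go to the last group) and the choice to permute ballot boxes only among the selected voters are assumptions made for this illustration.

```r
# Hypothetical helper (not the package's code): voter-level sketch of the
# initial assignment followed by the shuffling step.
shuffle_sketch <- function(ballot_voters, group_proportions, lambda) {
  n <- sum(ballot_voters)
  num_groups <- length(group_proportions)
  # Step 1: voters sorted by group fill the ballot boxes in order.
  # Assumption: sizes are floored and the last group takes the remainder.
  sizes <- floor(n * group_proportions)
  sizes[num_groups] <- n - sum(sizes[-num_groups])
  group <- rep(seq_len(num_groups), times = sizes)
  box <- rep(seq_along(ballot_voters), times = ballot_voters)
  # Step 2: select a fraction lambda of voters and permute their ballot
  # boxes among themselves, which keeps every box at its original size.
  idx <- sample.int(n, size = round(lambda * n))
  box[idx] <- box[idx[sample.int(length(idx))]]
  # W: number of voters per (ballot box, group)
  W <- table(factor(box, levels = seq_along(ballot_voters)),
             factor(group, levels = seq_len(num_groups)))
  unclass(W)
}

set.seed(1)
bv <- rep(100, 20)
W0 <- shuffle_sketch(bv, rep(1/4, 4), lambda = 0)
W1 <- shuffle_sketch(bv, rep(1/4, 4), lambda = 1)
sum(rowSums(W0 > 0) == 1)  # boxes holding a single group, no shuffling
sum(rowSums(W1 > 0) == 1)  # same count after shuffling everyone
```

With lambda = 0 every ballot box in this configuration holds a single group, while with lambda = 1 the boxes become mixtures that resemble the overall group proportions, which is the loss of heterogeneity that the recommendation above refers to.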

Examples

# Example 1: Default usage with 200 ballot boxes, each having 100 voters
result1 <- simulate_election(
    num_ballots = 200,
    num_candidates = 3,
    num_groups = 5
)

# Example 2: Using a custom ballot_voters vector
result2 <- simulate_election(
    num_ballots = 340,
    num_candidates = 3,
    num_groups = 7,
    ballot_voters = rep(200, 340)
)

# Example 3: Supplying group_proportions
result3 <- simulate_election(
    num_ballots = 93,
    num_candidates = 3,
    num_groups = 4,
    group_proportions = c(0.3, 0.5, 0.1, 0.1)
)

# Example 4: Providing a user-defined prob matrix
custom_prob <- matrix(c(
    0.9,  0.1,
    0.4,  0.6,
    0.25, 0.75,
    0.32, 0.68,
    0.2,  0.8
), nrow = 5, byrow = TRUE)

result4 <- simulate_election(
    num_ballots = 200,
    num_candidates = 2,
    num_groups = 5,
    lambda = 0.3,
    prob = custom_prob
)

result4$real_prob == custom_prob
#>      [,1] [,2]
#> [1,] TRUE TRUE
#> [2,] TRUE TRUE
#> [3,] TRUE TRUE
#> [4,] TRUE TRUE
#> [5,] TRUE TRUE
# The real_prob element of the output matches the input custom_prob.