vignettes/apportionment_scenarios.Rmd
apportionment_scenarios.Rmd
This vignette explores the implementation of different seat apportionment scenarios. There are generally two ways to assign seats of a parliament: Either you distribute seats within each district independently or you distribute seats according to the vote share for the whole election region (e.g. on a national level). We will focus on the difference in seat shares compared to vote shares these two approaches can produce.
We’ll mainly use the data set for the 2019 Finnish parliamentary
elections. The finland2019
data set contains two
data.frames: One with the number of votes for each party, the other
data.frame tells us how many seats (election_mandates
) each
district has.
library(proporz)
str(finland2019)
#> List of 2
#> $ votes_df :'data.frame': 229 obs. of 3 variables:
#> ..$ party_name : chr [1:229] "KOK" "SDP" "VIHR" "PS" ...
#> ..$ district_name: chr [1:229] "UUS" "UUS" "HEL" "UUS" ...
#> ..$ votes : int [1:229] 114243 97107 90662 86691 84141 78486 73626 66109 59722 55244 ...
#> $ district_seats_df:'data.frame': 12 obs. of 2 variables:
#> ..$ district_name: chr [1:12] "HEL" "HÄM" "KAA" "KES" ...
#> ..$ seats : int [1:12] 22 14 17 10 7 18 19 8 15 36 ...
To make comparisons easier, we’ll use matrices instead of
data.frames. We create a votes_matrix
from the given
data.frame.
votes_matrix = pivot_to_matrix(finland2019$votes_df)
dim(votes_matrix)
#> [1] 44 12
# Let's look at all parties with at least 10k votes
knitr::kable(votes_matrix[rowSums(votes_matrix) > 10000,])
HÄM | HEL | KAA | KES | LAP | OUL | PIR | SAT | SKA | UUS | VAA | VAR | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
KD | 11733 | 7253 | 10875 | 8774 | 1077 | 5728 | 17180 | 3361 | 17389 | 14912 | 16741 | 5121 |
KESK | 20892 | 11015 | 40715 | 30597 | 29145 | 78486 | 26536 | 20878 | 50459 | 35348 | 50053 | 29796 |
KOK | 39511 | 84141 | 44224 | 19707 | 11243 | 28382 | 55244 | 17666 | 27699 | 114243 | 29530 | 52367 |
Nyt | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 24778 | 0 | 0 |
NYT | 0 | 13529 | 0 | 0 | 0 | 1593 | 0 | 0 | 0 | 0 | 0 | 0 |
PIR | 996 | 5840 | 687 | 1239 | 280 | 1244 | 2018 | 253 | 686 | 4083 | 462 | 1244 |
PS | 43199 | 47276 | 46196 | 27851 | 17160 | 52771 | 51819 | 30090 | 39996 | 86691 | 42843 | 52913 |
RKP | 500 | 20348 | 0 | 0 | 112 | 432 | 327 | 178 | 0 | 49625 | 52880 | 15238 |
SDP | 49359 | 52393 | 59722 | 29201 | 13479 | 26588 | 66109 | 31475 | 38274 | 97107 | 33608 | 49156 |
SIN | 821 | 2056 | 3212 | 705 | 483 | 1801 | 3342 | 0 | 4128 | 9498 | 2470 | 1427 |
STL | 1028 | 1285 | 829 | 389 | 667 | 847 | 1371 | 199 | 769 | 2754 | 777 | 451 |
VAS | 15172 | 42899 | 10371 | 12681 | 14128 | 33820 | 24341 | 12435 | 15908 | 26257 | 8315 | 35481 |
VIHR | 17354 | 90662 | 23032 | 17639 | 9706 | 20376 | 37152 | 7461 | 21362 | 73626 | 10515 | 25309 |
We also create a named vector that defines how many seats each
district has (district_seats
). Note that the order of
district names in district_seats
and columns in
votes_matrix
differ. However, this does not affect the
analysis as we access districts by name (as does
biproporz()
by the way).
district_seats = finland2019$district_seats_df$seats
names(district_seats) <- finland2019$district_seats_df$district_name
district_seats
#> HEL HÄM KAA KES LAP OUL PIR SAT SKA UUS VAA VAR
#> 22 14 17 10 7 18 19 8 15 36 16 17
In Finland and many other jurisdictions, seats are assigned for each
district independently. There is no biproportional apportionment among
all districts. The following function, apply_proporz
,
calculates the seat distribution for each district and returns the
number of seats per party and district as a matrix. The user can specify
the apportionment method and a quorum threshold.
apply_proporz = function(votes_matrix, district_seats, method, quorum = 0) {
seats_matrix = votes_matrix
seats_matrix[] <- NA
# calculate proportional apportionment for each district (matrix column)
for(district in names(district_seats)) {
seats_matrix[,district] <- proporz(votes_matrix[,district],
district_seats[district],
quorum = quorum,
method = method)
}
return(seats_matrix)
}
The Finnish election system uses the D’Hondt method. We’ll calculate the seat distribution as a baseline to compare it with other methods.
bydistrict_v0 = apply_proporz(votes_matrix, district_seats, "d'hondt")
bydistrict_v0[rowSums(bydistrict_v0) > 0,]
#> district_name
#> party_name HÄM HEL KAA KES LAP OUL PIR SAT SKA UUS VAA VAR
#> KD 1 0 0 0 0 0 1 0 1 1 1 0
#> KESK 1 0 3 3 3 6 2 1 4 2 4 2
#> KOK 3 6 3 1 1 2 4 1 2 9 2 4
#> Nyt 0 0 0 0 0 0 0 0 0 1 0 0
#> PS 3 3 4 2 1 4 4 2 3 6 3 4
#> RKP 0 1 0 0 0 0 0 0 0 3 4 1
#> SDP 4 3 5 2 1 2 5 3 3 7 2 3
#> VAS 1 3 0 1 1 3 1 1 1 2 0 2
#> VIHR 1 6 2 1 0 1 2 0 1 5 0 1
We’ll now look at alternative apportionment methods. The Sainte-Laguë method (standard rounding) for example is impartial to party size. On the other hand, the Huntington–Hill method favors small parties. Generally, Huntington-Hill is used with a quorum, otherwise all parties with more than zero votes get at least one seat. In this example, the quorum is 3% of votes in a district.
bydistrict_v1 = apply_proporz(votes_matrix, district_seats,
method = "sainte-lague")
bydistrict_v2 = apply_proporz(votes_matrix, district_seats,
method = "huntington-hill",
quorum = 0.03)
Let’s compare the seat distributions for these three methods. We get
the number of party seats on the national level with
rowSums
.
df_bydistrict = data.frame(
D.Hondt = rowSums(bydistrict_v0),
Sainte.Lague = rowSums(bydistrict_v1),
Huntington.Hill = rowSums(bydistrict_v2)
)
# sort table by D'Hondt seats
df_bydistrict <- df_bydistrict[order(df_bydistrict[[1]], decreasing = TRUE),]
# print parties with at least one seat
knitr::kable(df_bydistrict[rowSums(df_bydistrict) > 0,])
D.Hondt | Sainte.Lague | Huntington.Hill | |
---|---|---|---|
SDP | 40 | 36 | 36 |
PS | 39 | 37 | 37 |
KOK | 38 | 35 | 35 |
KESK | 31 | 29 | 29 |
VIHR | 20 | 25 | 27 |
VAS | 16 | 18 | 18 |
RKP | 9 | 8 | 8 |
KD | 5 | 7 | 6 |
Nyt | 1 | 2 | 2 |
NYT | 0 | 1 | 1 |
SIN | 0 | 1 | 0 |
The actual political analysis is better left for people familiar with the party system. However, it is fair to say that the election system within districts has a significant impact on the number of seats in parliament. For example, the party with the most seats changes with Sainte-Laguë or Huntington-Hill.
Let’s now compare the vote shares on the national level with the seat
share in parliament. Since every entry in votes_matrix
is
only one voter, we can simply use the row sums to get the vote shares.
Otherwise, we’d have to weigh votes in each district according to the
number of seats. Since disproportionality analysis is not the focus of
this package, we’ll simply look at the difference in shares for the
actual distribution method.
vote_shares = rowSums(votes_matrix)/sum(votes_matrix)
shares = data.frame(
seats = rowSums(bydistrict_v0)/sum(district_seats),
votes = vote_shares
)
shares$difference <- shares$seats-shares$votes
shares <- round(shares, 4)
# Only look at parties with at least 0.5 % of votes
shares <- shares[shares$votes > 0.005,]
shares <- shares[order(shares$difference),]
shares
#> seats votes difference
#> VIHR 0.1005 0.1154 -0.0149
#> KD 0.0251 0.0391 -0.0140
#> SIN 0.0000 0.0098 -0.0098
#> PIR 0.0000 0.0062 -0.0062
#> Nyt 0.0050 0.0081 -0.0030
#> VAS 0.0804 0.0821 -0.0017
#> RKP 0.0452 0.0455 -0.0003
#> KESK 0.1558 0.1381 0.0176
#> KOK 0.1910 0.1707 0.0202
#> PS 0.1960 0.1756 0.0204
#> SDP 0.2010 0.1781 0.0229
The difference between seat and vote share varies among parties from -1.5 to +2.3 percentage points.
We’ll now look at biproportional apportionment and whether it better
matches the seat share with the national vote share. Keep in mind that
simply using existing data sets with biproporz
is not
really suitable from a modeling perspective. Vote distributions are
likely to be different if, for example, people know their vote also
counts on a national level or if smaller parties choose to run in more
districts.
With that being said, let’s assume our data set is properly modeled
already. We need to consider that voters in Finland can only vote for
one person in each district. They can’t distribute as many
votes as there are seats in their district among candidates/parties.
Thus we need to use use_list_votes=FALSE
as a parameter in
biproporz
.
seats_biproportional = biproporz(votes_matrix,
district_seats,
use_list_votes = FALSE)
# show only parties with seats
seats_biproportional[rowSums(seats_biproportional) > 0,]
HÄM | HEL | KAA | KES | LAP | OUL | PIR | SAT | SKA | UUS | VAA | VAR | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
KD | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 |
KESK | 1 | 1 | 3 | 2 | 2 | 6 | 2 | 1 | 3 | 2 | 3 | 2 |
KOK | 3 | 5 | 3 | 1 | 1 | 2 | 4 | 1 | 2 | 7 | 2 | 4 |
KP | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
Nyt | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
NYT | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
PIR | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
PS | 3 | 3 | 3 | 2 | 1 | 4 | 3 | 2 | 3 | 5 | 3 | 3 |
RKP | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 3 | 1 |
SDP | 4 | 3 | 4 | 2 | 1 | 2 | 4 | 2 | 3 | 6 | 2 | 3 |
SIN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
STL | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
VAS | 1 | 2 | 1 | 1 | 1 | 2 | 2 | 1 | 1 | 2 | 1 | 2 |
VIHR | 1 | 5 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 5 | 1 | 2 |
Let’s now look at the difference between vote and seat share.
vote_shares = rowSums(votes_matrix)/sum(votes_matrix)
seat_shares = rowSums(seats_biproportional)/sum(seats_biproportional)
range(vote_shares - seat_shares)
#> [1] -0.005144852 0.002429257
As we can see, the difference between vote and seat share ranges from -0.5 to +0.2 percentage point. Biproportional apportionment matches the national vote share better than apportionment by district. This is expected however, since biproportional apportionment actually considers the national vote share. Discussing the pros and cons of a regional representation compared to a priority on national vote shares is not within the scope of this vignette. The following chunk shows the seat changes.
seat_changes = seats_biproportional-bydistrict_v0
knitr::kable(seat_changes[rowSums(abs(seat_changes)) > 0,colSums(abs(seat_changes))>0])
HEL | KAA | KES | LAP | OUL | PIR | SAT | SKA | UUS | VAA | VAR | |
---|---|---|---|---|---|---|---|---|---|---|---|
KD | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
KESK | 1 | 0 | -1 | -1 | 0 | 0 | 0 | -1 | 0 | -1 | 0 |
KOK | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -2 | 0 | 0 |
KP | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
Nyt | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
NYT | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
PIR | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
PS | 0 | -1 | 0 | 0 | 0 | -1 | 0 | 0 | -1 | 0 | -1 |
RKP | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | -1 | 0 |
SDP | 0 | -1 | 0 | 0 | 0 | -1 | -1 | 0 | -1 | 0 | 0 |
SIN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
STL | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
VAS | -1 | 1 | 0 | 0 | -1 | 1 | 0 | 0 | 0 | 1 | 0 |
VIHR | -1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
Normally the number of seats per district is defined before an
election, based on census data. However, you could also assign
the number of seats per district, based on the actual votes cast in
every district. This way, each districts seat share is proportional to
its vote share. To do this, we use the total number of seats for the
district_seats
parameter.
full_biproportional = biproporz(votes_matrix,
district_seats = sum(district_seats),
use_list_votes = FALSE)
# party seat distribution has not changed
rowSums(full_biproportional) - rowSums(seats_biproportional)
#> Asyl E117 E118 E154 E168 E169 E185 E191 E192 E287 E491
#> 0 0 0 0 0 0 0 0 0 0 0
#> E492 E493 EOP FP IP KD KESK KOK KP KTP LIBE
#> 0 0 0 0 0 0 0 0 0 0 0
#> LII LIIK LN LNLY LNY LNYL Nyt NYT PEL PIR PS
#> 0 0 0 0 0 0 0 0 0 0 0
#> REF Reform RKP RLI SDP SIN SKE SKP STL VAS VIHR
#> 0 0 0 0 0 0 0 0 0 0 0
# district seat distribution is different
colSums(full_biproportional) - colSums(seats_biproportional)
#> HÄM HEL KAA KES LAP OUL PIR SAT SKA UUS VAA VAR
#> -1 3 -1 0 0 -1 0 0 -1 0 0 1
As we can see, the number of party seats does not change. However, districts where more people voted get a higher share of the seats. 2 districts gain seats, 4 districts lose a seat.
As a last example, we’ll look at the changes in seat distribution biproportional apportionment had in the 2020 parliament election in the Swiss canton of Uri. In Uri, 2020 was the first year the cantonal parliament was elected with biproportional apportionment for some municipalities. Previously, these four municipalities used the Hagenbach-Bischoff method. We’ll ignore all other municipalities which use a majoritarian electoral system.
seats_old_system = apply_proporz(uri2020$votes_matrix, uri2020$seats_vector, "hagenbach-bischoff")
seats_new_system = biproporz(uri2020$votes_matrix, uri2020$seats_vector)
seats_new_system-seats_old_system
#> Altdorf Bürglen Erstfeld Schattdorf
#> CVP 1 0 0 0
#> SPGB -1 0 0 0
#> FDP 0 0 0 0
#> SVP 0 0 0 0
Compared to the previous election system there was a change of one seat (out of 37).