This lab journal is created for the Summer-School project BIGSSS - Segregation and Polarization.

In this section we determine the political positions of Dutch parties. We use multiple data sources and approaches to determine party positions. The end product is your own custom dataframe of the political positions of Dutch parties!

1 Getting started

1.1 Clean up

rm(list = ls())

1.2 General custom functions

fpackage.check: Check if packages are installed (and install if not) in R (source).
fsave: Function to save data with a date stamp in the correct directory.
colorize: Function to wrap text in a colored HTML span.

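fsave <- function(x, file, location = "./data/processed/", ...) {
    # create the target directory (including missing parent folders) if needed
    if (!dir.exists(location))
        dir.create(location, recursive = TRUE)
    # date stamp in the form YYYYMMDD
    datename <- substr(gsub("[:-]", "", Sys.time()), 1, 8)
    totalname <- paste(location, datename, file, sep = "")
    print(paste("SAVED: ", totalname, sep = ""))
    # note: the object is stored under the name "x"
    save(x, file = totalname)
}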

fpackage.check <- function(packages) {
    lapply(packages, FUN = function(x) {
        if (!require(x, character.only = TRUE)) {
            install.packages(x, dependencies = TRUE)
            library(x, character.only = TRUE)
        }
    })
}

colorize <- function(x, color) {
    sprintf("<span style='color: %s;'>%s</span>", color, x)
}
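Because fsave stores its argument under the name x, loading a saved file back creates an object called x rather than one with the original name. A minimal sketch of a loader that accounts for this follows. The name fload and the demo file name are assumptions for illustration, and the demo mimics what fsave writes instead of calling it, so that it runs on its own.

```r
# Illustrative helper (the name fload is an assumption, not defined above):
# read back an object that was written with save(x, file = ...), as fsave() does.
fload <- function(filename) {
    env <- new.env()
    load(filename, envir = env)  # load into a private environment
    env$x                        # the object is always stored under the name "x"
}

# Demo with a throwaway file that mimics what fsave() writes
x <- data.frame(party = c("a", "b"), score = c(1, 2))
tmp <- file.path(tempdir(), "20240101demo.RData")
save(x, file = tmp)

restored <- fload(tmp)
```

Loading into a separate environment avoids overwriting an existing x in your workspace.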

1.3 Packages

haven: Read and write various data formats used by other statistical packages (e.g., Stata and SPSS).
tidyverse: for data handling.

packages <- c("haven", "tidyverse")
fpackage.check(packages)

2 Data

We use 3 data sources / approaches to determining party positions:

Election tool data from the Dutch Kieskompas (Vote compass). Using the Vote compass, individuals can see how their views align with those of the parties running for election. Users are presented with approximately 30 statements, divided over 8 themes. The user's result appears in a two-dimensional axis system (or spectrum), positioned among the different political parties. We have prepared a dataset based on Kieskompas. Download kieskompas_df.csv and place it in folder ./data/.¹
Expert surveys: positions of parties with respect to multiple political issues are evaluated by multiple political experts. We use:
- CHES (Jolly et al. 2022). Download CHES2019_experts.dta and place it in folder ./data/. There are many datasets on the website of CHES. Download the correct one. :-)
- POPPA (Meijers and Zaslove 2020). Download the zip file, unzip the files and place the dataset expert_data_stata.dta in your ./data/
Voter survey data of the Dutch Parliamentary Election Study (DPES) (Jacobs et al. 2022). We are allowed to use this dataset for the summer school; normally, individual researchers have to ask for permission. Thus DO NOT DISTRIBUTE THIS DATASET AND DO NOT PUBLISH IT ON GIT! The dataset covers respondents' political attitudes and expectations of the government. You may download the data here and the codebook here. Place the files in folder ./data/.

2.1 KiesKompas

The website of Kieskompas contains 30 items divided over 8 themes (in what follows, these are labeled build, democracy, social, climate, education and care, immigration, foreign policy, and justice). Party positions on each of these items/issues are given on a 5-point Likert scale, indicating the extent to which parties (dis)agree with the statements (1=Fully disagree; 5=Fully agree). For each party, we also included their position on the left-right scale (x) and the conservative-progressive scale (y), both ranging from -2 to 2.

# load in the data
kieskom <- read.csv("./data/kieskompas_df.csv")

kieskom <- kieskom[-1]  # exclude the indicator column
names(kieskom)[-1] <- paste0("kieskom_", names(kieskom)[-1])  # add a kieskom label to the column names
names(kieskom)[1] <- "party"
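After loading, it can be worth checking that the values fall inside the ranges described above (1 to 5 for the items, -2 to 2 for the two axes). The sketch below does this on a small made-up data frame. The column names kieskom_x, kieskom_y and kieskom_item1/2 are assumptions, and the real file may name its columns differently.

```r
# Toy stand-in for kieskom; values and column names are made up for illustration
toy <- data.frame(party = c("A", "B"),
                  kieskom_x = c(-1.2, 0.8),
                  kieskom_y = c(0.5, -1.9),
                  kieskom_item1 = c(1, 5),
                  kieskom_item2 = c(3, 4))

# TRUE if all non-missing values lie within [lo, hi]
in_range <- function(v, lo, hi) all(v >= lo & v <= hi, na.rm = TRUE)

# the two axes should lie between -2 and 2
axes_ok <- sapply(toy[c("kieskom_x", "kieskom_y")], in_range, lo = -2, hi = 2)

# the Likert items should lie between 1 and 5
items_ok <- sapply(toy[grep("^kieskom_item", names(toy))], in_range, lo = 1, hi = 5)
```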

2.2 Expert surveys

We load in the data of the POPPA and CHES; we make 2 data-frames of expert-level responses about Dutch parties.

2.2.1 POPPA data

# public data include expert-level data (i.e., expert judgements) as well as mean and median
# judgements; we use the data at the expert level
pop <- read_dta("./data/expert_data_stata.dta")

# subset Dutch parties (country id=19)
pop <- pop[which(pop$country_id == 19), ]

# subset party and dimensions
# names(pop)
pop <- pop[, c(4, 6:21)]
# add poppa label to variables
# names(pop)[-c(1)] <- paste0('pop_',names(pop)[-c(1)])

2.2.2 CHES data

ches <- read_dta("./data/CHES2019_experts.dta")
ches <- as.data.frame(ches)

dutch <- 1001:1051  #Dutch party ids
ches <- ches[ches$party_id %in% dutch, ]  #subset Dutch parties

# subset party and dimensions
# names(ches)
ches <- ches[, c(3, 5:49)]

# add ches label to variables
# names(ches)[-c(1)] <- paste0('ches_',names(ches)[-c(1)])
# unique(pop$party)
# unique(ches$party_name)

Political scientists make frequent use of expert surveys. The goal of expert surveys is often to aggregate the responses from many experts, typically by taking the mean.

2.2.3 Assignment 1

A. Read the paper by Lindstädt, Proksch, and Slapin (2020).

B. What do the authors conclude about inferring a party's position from the mean of expert responses?

C. Lindstädt and colleagues state: “Studies using expert surveys largely rely on mean expert placement, using standard deviations or standard errors to assess uncertainty. Yet, the shape of expert placement distributions can vary drastically across the items in a survey. Political parties, for example, can have similar estimated mean party positions based on very different distributions of expert placements.” Pick an item that reflects a policy dimension you are interested in, from either the POPPA or CHES dataset. Find Dutch parties that have approximately equal mean scores on that particular item, and investigate how the shape of the expert placement distributions varies across these parties. Use histograms to illustrate.

# example with experts' party left-right placement:

# we want to include means as well!
means <- pop %>%
    group_by(party) %>%
    summarize(mean = mean(lroverall, na.rm = TRUE))

# color is a fixed outline color, so it is set outside aes()
p <- ggplot(pop, aes(x = lroverall, fill = party)) +
    geom_histogram(position = "identity", binwidth = 1, color = "black") +
    geom_vline(data = means, aes(xintercept = mean), linetype = "dashed") +  # party means
    scale_x_continuous(breaks = seq(0, 10, by = 1)) +
    labs(title = "Distribution of expert placements of Dutch parties on the left-right scale (0-10)",
        subtitle = "dashed lines mark party means") +
    theme(legend.position = "none", axis.text.x = element_blank(), axis.ticks.x = element_blank())

# disaggregate per party
p + facet_grid(~party)
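The point made in the quote can also be seen in a tiny numerical example. The sketch below uses made-up placements for two hypothetical parties (not taken from POPPA or CHES) that share the same mean while the shapes of their distributions, and hence their medians and standard deviations, differ.

```r
# Made-up expert placements on a 0-10 scale for two hypothetical parties
party_a <- c(4, 5, 5, 5, 5, 5, 6)    # experts largely agree
party_b <- c(3, 3, 3, 3, 4, 9, 10)   # most experts say 3, a few say 9 or 10

# compare mean, median and standard deviation per party
summary_tab <- data.frame(
    party = c("a", "b"),
    mean = c(mean(party_a), mean(party_b)),
    median = c(median(party_a), median(party_b)),
    sd = c(sd(party_a), sd(party_b))
)
print(summary_tab)
```

Both parties have a mean of 5, but the median of party b is 3: the mean is pulled upward by a few high placements.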

D. Based on the article and your own empirical insights, make an informed decision about how to aggregate the expert-level responses regarding this dimension to the party level. Use this strategy to aggregate the responses and construct a dataframe (named: df) with rows reflecting the parties and columns reflecting the aggregated scores on the picked dimension(s).

# example on median left-right placement

# tidyverse solution
df <- pop %>%
    group_by(party) %>%
    summarize(median = median(lroverall, na.rm = T))  # strip missings before aggregation!

# base R solution
df2 <- data.frame(party = unique(pop$party), pop_median_LR = NA)
for (i in unique(df2$party)) {
    df2$pop_median_LR[which(df2$party == i)] <- median(pop$lroverall[which(pop$party == i)], na.rm = T)
}
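Whichever aggregate you choose, you may also want a rough sense of how stable it is given the small number of experts per party. One generic option, shown here purely as an illustration and not as the procedure proposed in the article, is a percentile bootstrap of the median. The placements below are made up.

```r
set.seed(1)  # make the resampling reproducible

# Made-up expert placements for one hypothetical party
placements <- c(3, 3, 3, 3, 4, 9, 10)

# resample the experts with replacement and recompute the median each time
boot_medians <- replicate(2000, median(sample(placements, replace = TRUE)))

med <- median(placements)
ci <- quantile(boot_medians, probs = c(0.025, 0.975))  # percentile interval
```

A wide interval signals that the party-level score depends heavily on which experts happened to respond.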

Let us merge the two dataframes:

# first translate party characters to lower case
kieskom$party <- tolower(kieskom$party)
df$party <- tolower(df$party)

# merge by party;
merged <- merge(df, kieskom, by = "party", all = T)
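Merging by party only works when both sources spell the party names identically. With all = T, mismatched names do not raise an error; they silently end up as separate rows filled with missing values. The sketch below shows one way to spot and repair such mismatches on toy data. The party names, scores and recode table are all made up.

```r
# Toy data: the third party is spelled differently in the two sources
experts <- data.frame(party = tolower(c("AAA", "BBB", "CCC")),
                      pop_median_LR = c(7, 3, 2))
compass <- data.frame(party = tolower(c("AAA", "BBB", "C.C.C.")),
                      kieskom_x = c(1, -1, -1.5))

# names that occur in only one of the two sources
only_experts <- setdiff(experts$party, compass$party)
only_compass <- setdiff(compass$party, experts$party)

# harmonize with a small lookup table (old name = new name)
recode <- c("c.c.c." = "ccc")
hit <- compass$party %in% names(recode)
compass$party[hit] <- recode[compass$party[hit]]

toy_merged <- merge(experts, compass, by = "party", all = TRUE)
```

After recoding, toy_merged has one complete row per party instead of extra rows with missing values.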

2.3 DPES

Aggregating responses using the mean may produce biased estimates of the latent concepts we want to measure! One major difference between expert surveys and voter surveys is that, in expert surveys, standard deviations/errors indicate a form of uncertainty, while in voter surveys (e.g., DPES) they indicate heterogeneity in attitudes.

We load in the DPES dataset:

dpes <- read_spss("./data/DPES2021 v1.0.sav")

2.3.1 Assignment 2

A. Pick one (or more) items you are interested in. Use the codebook to navigate through the dataset.

B. Based on empirical insights, pick a strategy to aggregate individual responses on this item to the party-level.

C. Enrich the merged dataset with this party position indicator. Add a dpes-label to the variable. Hint: V76 indicates which party the respondent intends to vote for. This is a numeric variable with labels indicating the party name.
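The hint about labelled numeric variables can be sketched on toy data. Variables read with haven keep their value labels in a "labels" attribute, which can be used to translate codes into party names before aggregating. Everything below is made up: the codes, the party labels, the item values and the name dpes_median_item. The median is used only as an example, so choose your own aggregation strategy. On the real data, haven::as_factor() is an alternative way to turn labelled codes into names.

```r
# Toy stand-in for a labelled vote-intention variable such as V76:
# numeric codes with a "labels" attribute mapping party names to codes
vote <- structure(c(1, 2, 1, 2, 2, 1), labels = c("Party A" = 1, "Party B" = 2))
item <- c(2, 6, 3, 7, NA, 2)  # made-up attitude item, one missing value

# translate the numeric codes into party names via the labels attribute
labs <- attr(vote, "labels")
party <- names(labs)[match(as.numeric(vote), labs)]

toy <- data.frame(party = tolower(party), item = item)

# aggregate to the party level; the formula interface drops rows with missing values
dpes_pos <- aggregate(item ~ party, data = toy, FUN = median)
names(dpes_pos)[2] <- "dpes_median_item"  # add a dpes label
```

The resulting dataframe has one row per party and can be merged by party in the same way as the expert and Kieskompas data.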

3 Save

Last, make sure to save your custom dataset using our custom function!

You are going to use this dataset to calculate the level of polarization at each polling station.

fsave(..., "positions_data.RData")

References

Jacobs, KTE, M Lubbers, T Sipma, N Spierings, and TWG van der Meer. 2022. “DUTCH PARLIAMENTARY ELECTION STUDY 2021 (DPES/NKO 2021).” SKON.

Jolly, Seth, Ryan Bakker, Liesbet Hooghe, Gary Marks, Jonathan Polk, Jan Rovny, Marco Steenbergen, and Milada Anna Vachudova. 2022. “Chapel Hill Expert Survey Trend File, 1999–2019.” Electoral Studies 75: 102420.

Lindstädt, R., S. O. Proksch, and J. B. Slapin. 2020. “When Experts Disagree: Response Aggregation and Its Consequences in Expert Surveys.” Political Science Research and Methods 8 (3): 580–88. https://doi.org/10.1017/psrm.2018.52.

Meijers, Maurits, and Andrej Zaslove. 2020. “Populism and Political Parties Expert Survey 2018 (POPPA).” Harvard Dataverse. https://doi.org/10.7910/DVN/8NEL7B.

¹ The positions of the parties on these topics are determined by … experts.

Political Polarization - Political positions of Dutch parties