Survey weighting in R
I think that I figured out a way to use R to construct survey weights by raking a sample to known population percentages. I got most of the R code below from here and other code from here. The dataset that I'll use for the illustration is here.
# This installs the survey package (only needed once per R installation):
install.packages("survey")
# This loads the survey package into an open session of R:
library(survey)
# This imports the CSV dataset:
DATASET <- read.csv(file.choose(),header=TRUE)
# read.csv() already returns a data frame, so this step is a harmless safeguard that also gives the data a shorter name:
DATA <- as.data.frame(DATASET)
# Let's get sample percentages from the CSV file (table() on its own returns counts, so prop.table() converts them to proportions):
prop.table(table(DATA$gender)) # 62% Male, 38% Female
prop.table(table(DATA$race)) # 67% White, 13% Black, 20% Other
# Let's use these population percentages as the targets to weight to: 49% Male, 51% Female, 64% White, 12% Black, 24% Other
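# This next set of commands produces the weights. svydesign() will warn that no weights were supplied and that it is assuming equal probability; that is expected here.
# The codes below assume that gender is coded 0 = Male, 1 = Female and that race is coded 1 = White, 2 = Black, 3 = Other, to match the order of the percentages above. Check the codes in your own data.
data.svy.unweighted <- svydesign(ids=~1, data=DATA)
gender.dist <- data.frame(gender=c(0,1), Freq=nrow(DATA)*c(.49,.51))
race.dist <- data.frame(race=c(1,2,3), Freq=nrow(DATA)*c(.64,.12,.24))
# Arguments inside a function call take "=", not "<-"; with "<-" the call still runs but also creates stray variables in the workspace.
data.svy.rake <- rake(design = data.svy.unweighted,
                      sample.margins = list(~gender, ~race),
                      population.margins = list(gender.dist, race.dist))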
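# For intuition, here is a minimal base-R sketch of the idea behind raking:
# adjust the weights to match one margin, then the other, and repeat.
# It is an illustration only, not the survey package's implementation.
# The toy data, the names toy, toy.w, toy.gender.target, and toy.race.target,
# and the fixed 25 passes are all assumptions made up for this sketch.
toy <- data.frame(gender = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1),
                  race   = c(1, 1, 1, 1, 2, 3, 1, 1, 2, 3))
toy.gender.target <- c("0" = .49, "1" = .51)
toy.race.target   <- c("1" = .64, "2" = .12, "3" = .24)
toy.w <- rep(1, nrow(toy)) # start from equal weights
for (pass in 1:25) {
  # Scale the weights so that the weighted gender shares match the targets:
  share <- tapply(toy.w, toy$gender, sum) / sum(toy.w)
  adj <- toy.gender.target[names(share)] / as.vector(share)
  toy.w <- toy.w * unname(adj[as.character(toy$gender)])
  # Do the same for race, which nudges the gender shares slightly off again:
  share <- tapply(toy.w, toy$race, sum) / sum(toy.w)
  adj <- toy.race.target[names(share)] / as.vector(share)
  toy.w <- toy.w * unname(adj[as.character(toy$race)])
}
# After repeated passes, both sets of weighted shares should sit close to the targets:
round(tapply(toy.w, toy$gender, sum) / sum(toy.w), 3)
round(tapply(toy.w, toy$race, sum) / sum(toy.w), 3)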
# The next command trims the weights so that no weight is too small or too large. For this, I used 0.3 as the low end and 3 as the high end, but those aren't the only reasonable values.
data.svy.rake.trim <- trimWeights(data.svy.rake, lower=.3, upper=3, strict=TRUE)
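# To show roughly what trimming involves, here is a base-R sketch on made-up weights.
# It caps weights at the bounds and shares the trimmed-off weight equally among the
# weights that were not trimmed, so that the total stays the same, and it repeats
# in case that sharing pushes another weight past a bound.
# This is my understanding of the general approach, not a copy of trimWeights();
# the toy weights, the names trim.w, trim.lower, and trim.upper, and the cap of
# 20 passes are assumptions for illustration.
trim.w <- c(0.1, 0.5, 0.8, 1.0, 1.2, 1.4, 5.0)
trim.lower <- 0.3
trim.upper <- 3
for (pass in 1:20) {
  outside <- trim.w < trim.lower | trim.w > trim.upper
  if (!any(outside) || all(outside)) break # nothing to trim, or nowhere to put the excess
  capped <- pmin(pmax(trim.w, trim.lower), trim.upper)
  excess <- sum(trim.w - capped) # net weight removed by capping
  trim.w <- capped
  trim.w[!outside] <- trim.w[!outside] + excess / sum(!outside)
}
range(trim.w) # every weight should now be within the bounds
sum(trim.w)   # the total weight should be unchanged by the trimming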
# This puts the weights in a WEIGHTS variable:
DATA$WEIGHTS <- weights(data.svy.rake)
# This puts the trimmed weights in a WEIGHTS.TRIM variable.
DATA$WEIGHTS.TRIM <- weights(data.svy.rake.trim)
# This lists each distinct weight and the number of respondents who received it:
stack(table(DATA$WEIGHTS))
# This does the same for the trimmed weights:
stack(table(DATA$WEIGHTS.TRIM))
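# This indicates the weighted percentages for gender when the trimmed weights are applied. svytable() on its own returns weighted counts, so prop.table() converts them to proportions. Trimming can move these slightly off the population targets.
prop.table(svytable(~gender, data.svy.rake.trim))
# This indicates the weighted percentages for race when the trimmed weights are applied:
prop.table(svytable(~race, data.svy.rake.trim))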
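# This writes a CSV dataset with the weights (row.names=FALSE leaves out R's row-number column; change the path to a folder on your own machine):
write.csv(DATA, "F:/DATA.csv", row.names=FALSE)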