Selective reporting in "Priming Racial Resentment without Stereotypic Cues"
In a survey experiment reported in LaFleur Stephens-Dougan's 2016 Journal of Politics article, "Priming Racial Resentment without Stereotypic Cues", respondents were shown a campaign mailer for a white candidate named Greg Davis, with experimental manipulations of the candidate's party (Democrat or Republican) and the photos on the candidate's mailers (five photos of whites, five photos of blacks, or a mixture of photos of whites and blacks).
One key finding described in the abstract is that "white Democratic candidates are penalized for associating with blacks, even if blacks are portrayed in a positive manner".
The JOP article describes the analysis in Chapter 5 of Stephens' dissertation, which reported on a July 2011 YouGov/Polimetrix survey. Dissertation page 173 indicated that the survey had 13 experimental conditions, but the JOP article reports only six conditions, omitting the control condition and the six conditions in which candidate Greg Davis was black. Stephens-Dougan might plan to report on these omitted conditions in a subsequent publication, so I'll concentrate on the outcome variables.
---
Many potential outcome variables were not reported on in the JOP article, based on a comparison of the survey questionnaire in Appendix D of the dissertation to the outcome variables mentioned in the main text or the appendix of the article. The box below describes each post-treatment item on the survey questionnaire except for the Q7 manipulation check: regular font with a [*] indicates items reported on in the article, and boldface indicates items not reported on in the article. See pages 224 to 230 of the dissertation for the exact item wording.
Q8. Feeling thermometer for the candidate.
Q9 [*]. Likelihood of voting for Greg Davis.
Q10. How well a series of terms describe Greg Davis:
- Intelligent
- Inexperienced
- Trustworthy
- Hardworking
- Fair [*]
- Competent
Q11. Perception of Greg Davis' political ideology.
Q12. Whether Democrats, Republicans, or neither party would be better at:
- Helping Senior Citizens
- Improving Health Care
- Improving the Economy
- Reducing Crime
- Reforming Public Education
Q13. Like, dislike, or neither for the Democratic Party.
Q14. Like, dislike, or neither for the Republican Party.
Q15. Perception of how well Greg Davis would handle:
- Helping Senior Citizens
- Improving Health Care
- Improving the Economy
- Reducing Crime [*]
- Reforming Public Education
Q16 [*]. Whether Greg Davis' policies will favor Whites over Blacks, Blacks over Whites, or neither.
Q17. Perception of which groups Greg Davis will help if elected:
- Teachers
- Latinos
- Corporate Executives
- Farmers
- Senior Citizens
- African Americans
- Homeowners
- Students
- Small Business Owners
- Whites
Q18 [*]. Perception of Greg Davis' position on affirmative action for Blacks in the workplace.
Q19. Perception of Greg Davis' position on the level of federal spending on Social Security.
Q20. Job approval, disapproval, or neither for Barack Obama as president.
The boldface above indicates that many potential outcome variables were not reported on in the JOP article. For example, Q10 asked respondents how well the terms "intelligent", "inexperienced", "trustworthy", "hardworking", "fair", and "competent" describe the candidate, but readers are told about results only for "fair"; readers are told results for Q16 about the candidate's perceived preference for policies that help whites over blacks or vice versa, but readers are not told results for "Whites" and "Blacks" on Q17 about the groups that the candidate is expected to help.
Perhaps the estimates and inferences are identical for the omitted and included items, but prior analyses [e.g., here and here] suggest that omitted items often produce different estimates and sometimes produce different inferences than included items.
Data for most omitted potential outcome variables are not in the article's dataset at the JOP Dataverse, but the dataset did contain a "thermgregdavis" variable that ranged from 1 to 100, which is presumably the feeling thermometer in Q8. I used the model in line 14 of Stephens-Dougan's code, but -- instead of using the reported Q9 outcome variable for likelihood of voting for Greg Davis -- I used "thermgregdavis" as the outcome variable and changed the estimation technique from ordered logit to linear regression: the p-value for the difference between the all-white photo condition and the mixed photo condition was p=0.693, and the p-value for the difference between the all-white photo condition and the all-black photo condition was p=0.264.
---
This sort of selective reporting is not uncommon in social science [see here, here, and here], but I'm skeptical that researchers with the flexibility to report the results they want based on post-hoc research design choices will produce replicable estimates and unbiased inferences, especially in the politically-charged racial discrimination subfield. I am also skeptical that selective reporting across publications will balance out in a field in which a supermajority of researchers fall on one side of the political spectrum.
So how can such selective reporting be prevented? Researchers can preregister their research designs. Journals can preaccept articles based on a preregistered research design. For non-preregistered studies, journals can require as a condition of publication the declaration of omitted studies, experiments, experimental conditions, and outcome variables. Peer reviewers can ask for these declarations, too.
---
It's also worth comparing the hypotheses as expressed in the dissertation to the hypotheses as expressed in the JOP article. First, the hypotheses from dissertation Chapter 5, on page 153:
H1: Democratic candidates are penalized for an association with African Americans.
H2: Republican candidates are rewarded for an association with African Americans.
H3: The racial composition of an advertisement influences voters' perceptions of the candidates' policy preferences.
Now, the JOP hypotheses:
H1. White Democratic candidates associated with blacks will lose vote support and will be perceived as more likely to favor blacks over whites and more likely to support affirmative action relative to white Democratic candidates associated with images of only whites.
H2. Counterstereotypic images of African Americans paired with a white Democratic candidate will prime racial attitudes on candidate evaluations that are implicitly racial relative to a comparable white Democratic candidate associated with all whites.
H3. Counterstereotypic images of African Americans paired with a white Republican candidate will be inconsequential such that they will not be associated with a main effect or a racial priming effect.
So hypotheses became more specific for Democratic candidates and switched from Republicans being rewarded to Republicans not experiencing a consequential effect. My sense is that hypothesis modification is not uncommon in social science, but the reason for the survey items asking about personal characteristics of the candidate (e.g., trustworthy, competent) is clearer in light of the dissertation's hypotheses about candidates being penalized or rewarded for an association with African Americans. After all, the feeling thermometer and the other Q10 characteristic items can be used to assess a penalty or reward for candidates.
---
In terms of the substance of the penalty, the abstract of the JOP article notes: "I empirically demonstrate that white Democratic candidates are penalized for associating with blacks, even if blacks are portrayed in a positive manner."
My analysis of the data indicated that, based on a model with no controls and no cases dropped, and comparing the all-white photo condition to the all-black photo condition, there is evidence of this penalty in the Q9 vote item at p=0.074. However, evidence for this penalty is weak in the feeling thermometer (p=0.248) and in the "fair" item (p=0.483), and I saw no evidence in the article or dissertation that the penalty can be detected in the items omitted from the dataset.
Moreover, much of the estimated penalty might reflect only the race of persons in the photos providing a signal about candidate Greg Davis' ideology. Compared to respondents in the all-white photo condition, respondents in the mixed photo condition and the all-black photo condition rated Greg Davis as more liberal (p-values of 0.014 and 0.004), and the p=0.074 penalty in the Q9 vote item inflates to p=0.710 when including the measure of Greg Davis' perceived ideology, with corresponding p-values ranging from p=0.600 to p=0.964 for models predicting a penalty in the thermometer and the "fair" item.
---
NOTES:
1. H/T to Brendan Nyhan for the pointer to the JOP article.
2. The JOP article emphasizes the counterstereotypical nature of the mailer photos of blacks, but the experiment did not vary the photos of blacks, so the experiment provides no evidence about the influence of the counterstereotypical nature of the photos.
3. The JOP article reports four manipulation checks (footnote 6), but the dissertation reports five manipulation checks (footnote 65, p. 156). The omitted manipulation check concerned whether the candidate tried to appeal to racial feelings. The dataset for the article at the JOP Dataverse has a "manipchk_racialfeelings" variable that is presumably this omitted manipulation check.
4. The abstract reports that "Racial resentment was primed such that white Democratic candidates associated with blacks were perceived as less fair, less likely to reduce crime, and less likely to receive vote support." However, Table 2 of the article and my analysis indicate that no photo condition comparison produced a statistically-significant main effect for the "fair" item and only the all-white vs. mixed photo comparison produced a statistically-significant main effect for perceptions of the likelihood of reducing crime, with this one main effect reaching statistical significance only under the article's generous convention of using a statistical significance asterisk for a one-tailed p-value less than 0.10 (the p-value was p=0.142).
Table 4 of the article indicated a statistically-significant interaction between photo conditions and racial resentment when predicting the "fair" item and perceptions of the likelihood of reducing crime, so I think that this interaction is what is referred to in the abstract statement that "Racial resentment was primed such that white Democratic candidates associated with blacks were perceived as less fair, less likely to reduce crime, and less likely to receive vote support."
5. The 0.142 p-value referred to in the previous item inflates to p=0.340 when the controls are removed from the model. There are valid reasons for including demographic controls in a regression predicting results from a survey experiment, but the particular set of controls should be preregistered to prevent researchers from estimating models without controls and with different combinations of controls and then selecting a model or models to report based on the corresponding p-value or effect size.
6. Code for the new analyses:
- reg votegregdavis i.whitedem_treatments [pweight = weight]
- reg thermgregdavis i.whitedem_treatments [pweight = weight]
- reg fair_gregdavis i.whitedem_treatments [pweight = weight]
- reg ideo_gregdavis i.whitedem_treatments [pweight = weight]
- reg votegregdavis i.whitedem_treatments ideo_gregdavis [pweight = weight]
- reg thermgregdavis i.whitedem_treatments ideo_gregdavis [pweight = weight]
- reg fair_gregdavis i.whitedem_treatments ideo_gregdavis [pweight = weight]
- ologit fair_gregdavis i.whitedem_treatments gender educ income pid7 south [pweight = weight]
- ologit gregdavis_redcrim i.whitedem_treatments gender educ income pid7 south [pweight = weight]
- ologit gregdavis_redcrim i.whitedem_treatments [pweight = weight]
7. I emailed Dr. Stephens-Dougan, asking whether there was a reason for the exclusion of items and about access to a full dataset. I received a response and invited her to comment on this post.