1.

In May, I published a blog post about deviations from the pre-analysis plan for the Stephens-Dougan 2022 APSR letter, and I tweeted a link to the blog post that tagged @LaFleurPhD and asked her directly about the deviations from the pre-analysis plan. I don't recall receiving a response from Stephens-Dougan, and, a few days later, on May 31, I emailed the APSR about my post, listing three concerns:

* The Stephens-Dougan 2022 description of racially prejudiced Whites not matching how the code for Stephens-Dougan 2022 calculated estimates for racially prejudiced Whites.

* The substantial deviations from the pre-analysis plan.

* Figure 1 of the APSR letter reporting weighted estimates, but the evidence being much weaker in unweighted analyses.

Six months later, on December 5, the APSR published a correction to Stephens-Dougan 2022. The correction addresses each of my three concerns, although not perfectly, as I'll discuss below, along with other aspects of Stephens-Dougan 2022 and its correction. I'll refer to the original APSR letter as "Stephens-Dougan 2022" and the correction as "the correction".

---

2.

The pre-analysis plan associated with Stephens-Dougan 2022 listed four outcomes at the top of its page 4, but only one of these outcomes (referred to as "Individual rights and freedom threatened") was reported on in Stephens-Dougan 2022. However, Table 1 of Stephens-Dougan 2022 reported results for three outcomes that were not mentioned in the pre-analysis plan.

The t-statistics for the key interaction term for the three outcomes included in Table 1 of Stephens-Dougan 2022 but not mentioned in the pre-analysis plan were 2.6, 2.0, and 2.1, all of which are large enough for statistical significance at the conventional p<0.05 threshold. The t-statistics for the key interaction term for the three outcomes mentioned in the pre-analysis plan but omitted from Stephens-Dougan 2022 were 0.6, 0.6, and 0.6, none of which come close to that threshold.

I calculated the t-statistics of 2.6, 2.0, and 2.1 from Table 1 of Stephens-Dougan 2022 by dividing each coefficient by its standard error. I wasn't able to use the correction to calculate the t-statistics of 0.6, 0.6, and 0.6, because the relevant data for these three omitted pre-analysis plan outcomes are not in the correction but instead are in Table A12 of a "replication-final.pdf" file hosted at the Dataverse.
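For readers who want to redo this arithmetic, here is a minimal sketch in R (the coefficient and standard error are illustrative placeholders, not values from Table 1):

coef_est <- 0.52                        # illustrative coefficient
se_est <- 0.20                          # illustrative standard error
t_stat <- coef_est / se_est             # 2.6
p_two_sided <- 2 * pnorm(-abs(t_stat))  # normal approximation, about 0.009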

That's part of what I meant about an imperfect correction: a reader cannot use information published in the APSR itself to calculate the evidence provided by the outcomes that were planned to be reported on in the pre-analysis plan, or, for that matter, to see how there is substantially less evidence in the unweighted analysis. Instead, a reader needs to go to the Dataverse and dig through table after table of results.

The correction refers to deviations from the pre-analysis plan, but doesn't indicate the particular deviations and doesn't indicate what happens when these deviations are not made. The "Supplementary Materials Correction-Final.docx" file at the Dataverse for Stephens-Dougan 2022 has a discussion of deviations from the pre-analysis plan, but, as far as I can tell, the discussion does not provide a reason why the results should not be reported for the three omitted outcomes, which were labeled in Table A12 as "Slow the Spread", "Stay Home", and "Too Long to Loosen Restrictions".

It seems to me to be a bad policy to permit researchers to deviate from a pre-analysis plan without justification and to merely report results from a planned analysis on, say, page 46 of a 68-page file on the Dataverse. But a bigger problem might be that, as far as I can tell, many journals don't even attempt to prevent misleading selective reporting for survey research for which there is no pre-analysis plan. Journals could require researchers reporting on surveys to submit or link to the full questionnaire for the surveys or at least to declare that the main text reports on results for all plausible measured outcomes and moderators.

---

3.

Next, let me discuss a method used in Stephens-Dougan 2022 and the correction, which I think is a bad method.

The code for Stephens-Dougan 2022 used measures of stereotypes about Whites and Blacks on the traits of being hardworking and intelligent to create a variable called "negstereotype_endorsement". The code divided respondents into three categories, coded 0 for respondents who did not endorse a negative stereotype about Blacks relative to Whites, 0.5 for respondents who endorsed exactly one of the two negative stereotypes about Blacks relative to Whites, and 1 for respondents who endorsed both negative stereotypes about Blacks relative to Whites. In both Stephens-Dougan 2022 and the correction, Figure 3 reported, for each reported outcome, an estimate of how much the average treatment effect among prejudiced Whites (defined as those coded 1) differed from the average treatment effect among unprejudiced Whites (defined as those coded 0).

The most straightforward way to estimate this difference in treatment effects is to [1] calculate the treatment effect for prejudiced Whites coded 1, [2] calculate the treatment effect for unprejudiced Whites coded 0, and [3] calculate the difference between these treatment effects. The code for Stephens-Dougan 2022 instead estimated this difference using a logit regression that had three predictors: the treatment, the 0/0.5/1 measure of prejudice, and an interaction of the prior two predictors. But, by this method, the estimated difference in treatment effect between the 1 respondents and the 0 respondents depends on the 0.5 respondents. I can't think of a valid reason why responses from the 0.5 respondents should influence an estimated difference between the 0 respondents and the 1 respondents.
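As a sketch of the contrast between the two approaches, assuming a data frame d with a binary outcome y, a 0/1 treatment, and the 0/0.5/1 prejudice measure (these variable names are hypothetical, not from the Stephens-Dougan 2022 code):

# One logit over all respondents, as in the letter's code: the 0.5
# respondents contribute to the fit and so can influence the
# treatment-by-prejudice interaction estimate
m_all <- glm(y ~ treatment * prejudice, data = d, family = binomial)

# Dropping the 0.5 respondents, so the interaction term compares
# only the 0 respondents to the 1 respondents
m_ends <- glm(y ~ treatment * prejudice, data = subset(d, prejudice %in% c(0, 1)), family = binomial)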

See my Stata output file for more on that. The influence of the 0.5 respondents might not be major in most or all cases, but an APSR reader won't know, based on Stephens-Dougan 2022 or its correction, the extent to which the 0.5 respondents influenced the estimates for the comparison of the 0 respondents to the 1 respondents.

Now about those 0.5 respondents…

---

4.

Remember that the Stephens-Dougan 2022 "negative stereotype endorsement" variable has three levels: 0 for the 74% of respondents who did not endorse a negative stereotype about Blacks relative to Whites, 0.5 for the 16% of respondents who endorsed exactly one of the two negative stereotypes about Blacks relative to Whites, and 1 for the 10% of respondents who endorsed both negative stereotypes about Blacks relative to Whites.

The correction indicates that "I discovered an error in the description of the variable, negative stereotype endorsement" and that "there was no error in the code used to create the variable". So was the intent for Stephens-Dougan 2022 to measure racial prejudice so that only the 1 respondents are considered prejudiced? Or was the intent to consider the 0.5 respondents and the 1 respondents to be prejudiced?

The pre-analysis plan seems to indicate a different method for measuring the moderator of negative stereotype endorsement:

The difference between the rating of Blacks and Whites is taken on both dimensions (intelligence and hard work) and then averaged.

But the pre-analysis plan also indicates that:

For racial predispositions, we will use two or three bins, depending on their distributions.

So, even ignoring the plan to average the stereotype ratings, the pre-analysis plan is inconclusive about whether the intent was to use two or three bins. Let's try this passage from Stephens-Dougan 2022:

A nontrivial fraction of the nationally representative sample—26%—endorsed either the stereotype that African Americans are less hardworking than whites or that African Americans are less intelligent than whites.

So that puts the 16% of respondents at the 0.5 level of negative stereotype endorsement into the same bin as the 10% at the 1 level of negative stereotype endorsement. Stephens-Dougan 2022 doesn't report the percentage that endorsed both negative stereotypes about Blacks. Reporting the 26% figure is what would be expected if the intent was to place into one bin any respondent who endorsed at least one of the negative stereotypes about Blacks, so I'm a bit skeptical of the claim in the correction that the description is in error and the code was correct. Maybe I'm missing something, but I don't see how someone who intends to have three bins reports the 26% and does not report the 10%.

Moreover, Stephens-Dougan 2022 has only three figures: Figure 1 reports results for racially prejudiced Whites, Figure 2 reports results for non-racially prejudiced Whites, and Figure 3 reports on the difference between racially prejudiced Whites and non-racially prejudiced Whites. Did Stephens-Dougan 2022 intend to not report results for the group of respondents who endorsed exactly one of the negative stereotypes about Blacks? Did Stephens-Dougan 2022 intend to suggest that respondents who rate Blacks as lazier in general than Whites aren't racially prejudiced as long as they rate Blacks equal to or higher than Whites in general on intelligence?

---

5.

Stephens-Dougan 2022 and the correction depict 84% confidence intervals in all figures. Stephens-Dougan 2022 indicated (footnote omitted) that:

For ease of interpretation, I plotted the predicted probability of agreeing with each pandemic measure in Figure 1, with 84% confidence intervals, the graphical equivalent to p < 0.05.

The 84% confidence interval is good for assessing a p=0.05 difference between estimates, but not for assessing at p=0.05 whether an estimate differs from a particular number such as zero. So 84% confidence intervals make sense for Figures 1 and 2, in which the key comparisons are of the control estimate to the treatment estimate. But 84% confidence intervals don't make as much sense for Figure 3, which plots only one estimate per outcome and for which the key assessment is whether the estimate differs from zero (in Figure 3 of Stephens-Dougan 2022) or from 1 (in the correction).
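For what it's worth, the arithmetic behind the roughly-84% convention can be sketched in R: for two independent estimates with roughly equal standard errors, just-touching intervals imply a difference of 2*z*SE, and the p=0.05 criterion for the difference is about 1.96*sqrt(2)*SE.

z_star <- 1.96 * sqrt(2) / 2        # about 1.386
coverage <- 1 - 2 * pnorm(-z_star)  # about 0.834, conventionally rounded up to 84%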

---

6.

I didn’t immediately realize why, in Figure 3 in Stephens-Dougan 2022, two of the four estimates cross zero, but in Figure 3 in the correction, none of the four estimates cross zero. Then I realized that the estimates plotted in Figure 3 of the correction (but not Figure 3 in Stephens-Dougan 2022) are odds ratios.

The y-axis for odds ratios in Figure 3 of the correction ranges from 0 to 30-something, using a linear scale. The odds ratio that indicates no effect is 1, and an odds ratio can't be negative, so that is why none of the four estimates cross zero in the corrected Figure 3.

It seems like a good idea for a plot of odds ratios to have a guideline at 1, so that readers can assess whether an odds ratio indicating no effect is a plausible value. And a log scale seems like a good idea for odds ratios, too. A relevant prior post mentions that Fenton and Stephens-Dougan 2021 described a "very small" 0.01 odds ratio as "not substantively meaningful".
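As a sketch of that suggestion in R (the odds ratios and interval bounds below are made up for illustration):

library(ggplot2)

# Hypothetical odds ratios with 84% interval bounds
or_dat <- data.frame(outcome = c("A", "B", "C", "D"), or = c(3, 5, 9, 14), lo = c(0.9, 1.2, 1.6, 2.0), hi = c(10, 21, 30, 36))

ggplot(or_dat, aes(x = outcome, y = or)) +
  geom_pointrange(aes(ymin = lo, ymax = hi)) +
  geom_hline(yintercept = 1, linetype = "dashed") +  # guideline for no effect
  scale_y_log10()                                    # log scale for odds ratios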

None of the 84% confidence intervals in Figure 3 of the correction cross 1, but an 84% confidence interval in Figure A3 of "Supplementary Materials Correction-Final.docx" does.

---

7.

Often, when I alert an author or journal to an error in a publication, the subsequent correction doesn't credit me for my work. Sometimes the correction even suggests that the authors themselves caught the error, like the correction to Stephens-Dougan 2022 seems to do:

After reviewing my code, I discovered an error in the description of the variable, negative stereotype endorsement.

I guess it's possible that Stephens-Dougan "discovered" the error. For instance, maybe after she submitted page proofs, for some reason she decided to review her code, and just happened to catch the error that she had missed before, and it's a big coincidence that this was the same error that I blogged about and alerted the APSR to.

And maybe Stephens-Dougan also discovered that her APSR letter misleadingly deviated from the relevant pre-analysis plan, so that I don't deserve credit for alerting the APSR to that.

---

Some research includes measures of attitudes about certain groups but not about obvious comparison groups, such as research that includes attitudes about Blacks but not Whites or includes attitudes about women but not men. Feeling thermometers can help avoid this, which I'll illustrate with data from the Democracy Fund Voter Study Group's Views of the Electorate Research (VOTER) Survey.

---

The outcome is this item, from the 2018 wave of the VOTER survey:

Do you approve or disapprove of football players protesting by kneeling during the national anthem?

I coded responses 1 for strongly approve and somewhat approve and 0 for somewhat disapprove, strongly disapprove, don't know, and skipped. The key predictor was measured in 2017 and is based on 0-to-100 feeling thermometer ratings about Blacks and Whites, coded into six categories (a coding sketch follows the list):

* Rated Whites equal to Blacks

* Rated Whites under 50 and Blacks at 50 or above

* Residual ratings of Whites lower than Blacks

* Rated Blacks under 50 and Whites at 50 or above

* Residual ratings of Blacks lower than Whites

* Did not rate Whites and/or Blacks
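Here is a sketch of that coding in R, assuming 0-to-100 thermometer variables ft_white and ft_black in a data frame d (the names are hypothetical):

d$category <- with(d, ifelse(is.na(ft_white) | is.na(ft_black), "Did not rate Whites and/or Blacks",
  ifelse(ft_white == ft_black, "Rated Whites equal to Blacks",
  ifelse(ft_white < 50 & ft_black >= 50, "Rated Whites under 50 and Blacks at 50 or above",
  ifelse(ft_white < ft_black, "Residual ratings of Whites lower than Blacks",
  ifelse(ft_black < 50 & ft_white >= 50, "Rated Blacks under 50 and Whites at 50 or above",
  "Residual ratings of Blacks lower than Whites"))))))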

The plot below controls for only participant race measured in 2018, with error bars indicating 83.4% confidence intervals and survey weights applied.

The plot suggests that attitudes about anthem protests are associated with negative attitudes about Blacks and with negative attitudes about Whites. These are presumably obvious results, but measures such as racial resentment probably won't be interpreted as suggesting both results.

---

NOTE

1. Stata code and output. The output reports results that had more extensive statistical control.

---

Social Science & Medicine published Skinner-Dorkenoo et al 2022 "Highlighting COVID-19 racial disparities can reduce support for safety precautions among White U.S. residents", with data for Study 1 fielded in September 2020. Stephens-Dougan had a similar Time-sharing Experiments for the Social Sciences study "Backlash effect? White Americans' response to the coronavirus pandemic", fielded starting in late April 2020 according to the TESS page for the study.

You can check tweets about Skinner-Dorkenoo et al 2022 and what some tweeters said about White people. But you can't tell from the Skinner-Dorkenoo et al 2022 publication or the Stephens-Dougan 2022 APSR article whether any detected effect is distinctive to White people.

Limiting samples to Whites doesn't seem to be a good idea if the purpose is to understand racial bias. But it might be naive to think that all social science research is designed to understand.

---

There might be circumstances in which it's justified to limit a study of racial bias to White participants, but I don't think such circumstances include:

* The Kirgios et al 2022 audit study that experimentally manipulated the race and gender of an email requester, but for which "Participants were 2,476 White male city councillors serving in cities across the United States". In late April, I tweeted a question to the first author of Kirgios et al 2022 about why the city councilor sample was limited to White men, but I haven't yet gotten a reply.

* Studies that collect sufficient data on non-White participants but do not report results from these data in the eventual publications (examples here and here).

* Proposals for federally funded experiments that request that the sample be limited to White participants, such as in the Stephens-Dougan 2020 proposal: "I want to test whether White Americans may be more resistant to efforts to curb the virus and more supportive of protests to reopen states when the crisis is framed as disproportionately harming African Americans".

---

One benefit of not limiting the subject pool by race is to limit unfair criticism of entire racial groups. For example, according to the analysis below from Bracic et al 2022, White nationalism among non-Whites was at least as influential as White nationalism among Whites in predicting support for a family separation policy net of controls.

So, to the extent that White nationalism is responsible for support for the family separation policy, that applies to White respondents and to non-White respondents.

Of course, Bracic et al 2022 doesn't report how the association for White nationalism compares to the association for, say, Black nationalism or Hispanic nationalism or how the association for the gendered nationalist belief that "the nation has gotten too soft and feminine" compares to the association for the gendered nationalist belief that, say, "the nation is too rough and masculine".

---

And consider this suggestion from Rice et al 2022 to use racial resentment items to screen Whites for jury service:

At the practical level, our research raises important empirical and normative questions related to the use of racial resentment items during jury selection in criminal trials. If racial resentment affects jurors' votes and reasoning, should racial resentment items be used to screen white potential jurors?

Given evidence suggesting that Black juror bias is on average at least as large as White juror bias, I don't perceive a good justification to limit this suggestion to White potential jurors, although I think that the Rice et al decision to not report results for Black mock jurors makes it easier to limit this suggestion to White potential jurors.

---

NOTES

1. I caught two flaws in Skinner-Dorkenoo et al 2022, which I discussed on Twitter: [1] For the three empathy items, more than 700 respondents selected "somewhat agree" and more than 500 selected "strongly agree", but no respondent selected "agree", suggesting that the data were miscoded. [2] The below-0.05 p-value for the empathy inference appears to result from the analysis controlling for a post-treatment measure; see the second model referred to by the lead author in the Twitter thread. I didn't conduct a full check of the Skinner-Dorkenoo et al 2022 analysis. Stata code and output for my analyses of Skinner-Dorkenoo et al 2022, with data here. Note the end of the output, indicating that the post-treatment control was affected by the treatment.

2. I have a prior post about the Stephens-Dougan TESS survey experiment reported on in the APSR that had substantial deviations from the pre-analysis plan. On May 31, I contacted the APSR about that and the error discussed in that post. I received an update in September, but the Stephens-Dougan 2022 APSR article hasn't been corrected as of Oct 2.

---

Suppose that Bob at time 1 believes that Jewish people are better than every other group, but Bob at time 2 changes his belief to be that Jewish people are no better or worse than every other group, and Bob at time 3 changes his belief to be that Jewish people are worse than every other group.

Suppose also that these changes in Bob's belief about Jewish people have a causal effect on his vote choices. Bob at time 1 will vote 100% of the time for a Jewish candidate running against a non-Jewish candidate, no matter the relative qualifications of the candidates. At time 2, a candidate's Jewish identity is irrelevant to Bob's vote choice, so that, if given a choice between a Jewish candidate and an all-else-equal non-Jewish candidate, Bob will flip a coin and vote for the Jewish candidate only 50% of the time. Bob at time 3 will vote 0% of the time for a Jewish candidate running against a non-Jewish candidate, no matter the relative qualifications of the candidates.

Based on this setup, what is your estimate of the influence of antisemitism on Bob's voting decisions?

---

I think that the effect of antisemitism is properly understood as the effect of negative attitudes about Jewish people, so that the effect can be estimated in the above setup as the difference between Bob's voting decisions at time 2, when Bob is indifferent to a candidate's Jewish identity, and Bob's voting decisions at time 3, when Bob has negative attitudes about Jewish people. Thus, the effect of antisemitism on Bob's voting decisions is a 50 percentage point decrease, from 50% to 0%.

For the first decrease, from 100% to 50%, neither belief -- the belief that Jewish people are better than every other group, nor the belief that Jewish people are no better or worse than every other group -- is antisemitic, so none of this decrease should be attributed to antisemitism. Generally, I think that this means that respondents who have positive attitudes about a group should not be used to estimate the effect of negative attitudes about that group.

---

So let's discuss the Race and Social Problems article: Sharrow et al 2021 "What's in a Name? Symbolic Racism, Public Opinion, and the Controversy over the NFL's Washington Football Team Name". The key predictor was a measure of resentment against Native Americans, built from responses to the statements below, measured on a 5-point scale from "strongly agree" to "strongly disagree":

Most Native Americans work hard to make a living just like everyone else.

Most Native Americans take unfair advantage of privileges given to them by the government.

My analysis indicates that 39% of the 1500 participants (N=582) provided consistently positive responses about Native Americans on both items, agreeing or strongly agreeing with the first statement and disagreeing or strongly disagreeing with the second statement. I don't see why these 582 respondents should be included in an analysis that attempts to estimate the effect of negative attitudes about Native Americans, if these participants do not fall along the indifferent-to-negative-attitudes continuum about Native Americans.

So let's check what happens after removing these respondents from the analysis.

---

I first conducted an unweighted OLS regression using the full sample and controls to predict the summary Team Name Index outcome, which measured support for the Washington football team's name placed on a 0-to-1 scale. For this regression (N=1024), the measure of resentment against Native Americans ranged from 0 for respondents who selected the most positive responses to both resentment items to 1 for respondents who selected the most negative responses to both resentment items. In this regression, the coefficient was 0.26 (t=6.31) for resentment against Native Americans.

I then removed respondents who provided positive responses about Native Americans for both resentment items. For this next unweighted OLS regression (N=572), the measure of resentment against Native Americans still had a value of 1 for respondents who provided the most negative responses to both resentment items; however, 0 was now for participants who were neutral on one resentment item but provided the most positive response on the other resentment item, such as strongly agreeing that "Most Native Americans work hard to make a living just like everyone else" but neither agreeing nor disagreeing that "Most Native Americans take unfair advantage of privileges given to them by the government". In this regression, the coefficient was 0.12 (t=2.23) for resentment against Native Americans.

The drop is similar when the regressions include only the measure of resentment against Native Americans and no other predictors: the coefficient is 0.44 for the full sample, but is 0.22 after dropping respondents who provided positive responses about Native Americans for both resentment items.
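A sketch of this comparison in R, with hypothetical variable names (team_name_index for the 0-to-1 outcome, resentment for the 0-to-1 scale, and 1-to-5 item responses work_hard and unfair_advantage, with higher values indicating more agreement):

# Full sample: bivariate coefficient of about 0.44 in my analysis
m_full <- lm(team_name_index ~ resentment, data = d)

# Drop respondents with positive responses on both items: agreement
# with "work hard" and disagreement with "unfair advantage"
m_drop <- lm(team_name_index ~ resentment, data = subset(d, !(work_hard >= 4 & unfair_advantage <= 2)))  # about 0.22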

---

So I think that Sharrow et al 2021 might report substantial overestimates of the effect of resentment of Native Americans, because the estimates in Sharrow et al 2021 about the effect of negative attitudes about Native Americans included the effect of positive attitudes about Native Americans.

---

NOTES

1. About 20% of the Sharrow et al 2021 sample reported a negative attitude on at least one of the two measures of resentment against Native Americans. About 6% of the sample reported a negative attitude on both measures of resentment against Native Americans.

2. Sharrow et al 2021 indicated that "Our conclusions illustrate that symbolic racism toward Native Americans is central to interpreting the public's resistance toward changing the name, in sharp contrast to Snyder's claim that the name is about 'respect.'" (p. 111).

For what it's worth, the Sharrow et al 2021 data indicate that a nontrivial percentage of respondents with positive views of Native Americans somewhat or strongly disagreed with the claim that the Washington football team name is offensive (in an item that reported the name of the team at the time): 47% of respondents who provided positive responses about Native Americans for both resentment items, 47% of respondents who rated Native Americans at 100 on a 0-to-100 feeling thermometer, 40% of respondents who provided positive responses about Native Americans for both resentment items and rated Native Americans at 100 on a 0-to-100 feeling thermometer, and 32% of respondents who provided the most positive responses about Native Americans for both resentment items and rated Native Americans at 100 on a 0-to-100 feeling thermometer (although this 32% was only 22% in unweighted analyses).

3. Sharrow et al 2021 indicated a module sample of 1,500, but the sample size fell to 1,024 in model 3 of Table 1. My analysis indicates that this is largely due to missing values on the outcome variable (N=1,362), the NFL sophistication index (N=1,364), and the measure of resentment of Native Americans (N=1,329).

4. Data for my analysis. Stata code and output.

5. Social Science Quarterly recently published Levin et al 2022 "Validating and testing a measure of anti-semitism on support for QAnon and vote intention for Trump in 2020", which also has the phenomenon of estimating the effect of negative attitudes about a target group but not excluding participants who favor the target group.

---

The American Political Science Review recently published a letter: Stephens-Dougan 2022 "White Americans' reactions to racial disparities in COVID-19".

Figure 1 of the Stephens-Dougan 2022 APSR letter reports results for four outcomes among racially prejudiced Whites, with the 84% confidence interval in the control overlapping with the 84% confidence interval in the treatment for only one of the four reported outcomes (zooming in on Figure 1, the confidence intervals for the parks outcome don't seem to overlap, and the code returns 0.1795327 for the upper bound for the control and 0.18800818 for the lower bound for the treatment). And even results with obviously overlapping 84% confidence intervals seem to be interpreted as sufficient evidence of an effect, with all four reported outcomes discussed in the passage below:

When racially prejudiced white Americans were exposed to the racial disparities information, there was an increase in the predicted probability of indicating that they were less supportive of wearing face masks, more likely to feel their individual rights were being threatened, more likely to support visiting parks without any restrictions, and less likely to think African Americans adhere to social distancing guidelines.

---

There are at least three things to keep track of: [1] the APSR letter; [2] the survey questionnaire, located at the OSF site for the Time-sharing Experiments for the Social Sciences project; and [3] the pre-analysis plan, located at the OSF and in the appendix of the APSR article. I'll use the PDF of the pre-analysis plan. The TESS site also has the proposal for the survey experiment, but I won't discuss that in this post.

---

The pre-analysis plan does not mention all potential outcome variables that are in the questionnaire, but the pre-analysis plan section labeled "Hypotheses" includes the passage below:

Specifically, I hypothesize that White Americans with anti-Black attitudes and those White Americans who attribute racial disparities in health to individual behavior (as opposed to structural factors), will be more likely to disagree with the following statements:

The United States should take measures aimed at slowing the spread of the coronavirus while more widespread testing becomes available, even if that means many businesses will have to stay closed.

It is important that people stay home rather than participating in protests and rallies to pressure their governors to reopen their states.

I also hypothesize that White Americans with anti-Black attitudes and who attribute racial health disparities to individual behavior will be more likely to agree with the following statements:

State and local directives that ask people to "shelter in place" or to be "safer at home" are a threat to individual rights and freedom.

The United States will take too long in loosening restrictions and the economic impact will be worse with more jobs being lost

The four outcomes mentioned in the passage above correspond to items Q15, Q18, Q16, and Q21 in the survey questionnaire, but, of these four outcomes, the APSR letter reported on only Q16.

The outcome variables in the APSR letter are described as: "Wearing facemasks is not important", "Individual rights and freedom threatened", "Visit parks without any restrictions", and "Black people rarely follow social distancing guidelines". These outcome variables correspond to survey questionnaire items Q20, Q16, Q23A, and Q22A.

---

The pre-analysis plan PDF mentions moderators, with three moderators about racial dispositions: racial resentment, negative stereotype endorsement, and attributions for health disparities. The plan indicates that:

For racial predispositions, we will use two or three bins, depending on their distributions. For ideology and party, we will use three bins. We will include each bin as a dummy variable, omitting one category as a baseline.

The APSR letter reported on only one racial predispositions moderator: negative stereotype endorsement.

---

I'll post a link in the notes below to some of my analyses of the "Specifically, I hypothesize" outcomes, but I don't want to focus on the results, because I wanted this post to focus on deviations from the pre-analysis plan. Regardless of whether the estimates from the analyses in the APSR letter are similar to the estimates from the planned analyses in the pre-analysis plan, I think that it's bad that readers can't trust the APSR to ensure that a pre-analysis plan is followed, or at least to provide an explanation about why a pre-analysis plan was not followed, especially given that this APSR letter described itself as reporting on "a preregistered survey experiment" and included the pre-analysis plan in the appendix.

---

NOTES

1. The Stephens-Dougan 2022 APSR letter suggests that the negative stereotype endorsement variable was coded dichotomously ("a variable indicating whether the respondent either endorsed the stereotype that African Americans are less hardworking than whites or the stereotype that African Americans are less intelligent than whites"), but the code and the appendix of the APSR letter indicate that the negative stereotype endorsement variable was measured so that the highest level is for respondents who reported a negative relative stereotype about Blacks for both stereotypes. From Table A7:

(unintelligentstereotype2 + lazystereotype2)/2

In the data after running the code for the APSR letter, the negative stereotype endorsement variable is a three-level variable coded 0 for respondents who did not report a negative relative stereotype about Blacks for either stereotype, 0.5 for respondents who reported a negative relative stereotype about Blacks for exactly one stereotype, and 1 for respondents who reported a negative relative stereotype about Blacks for both stereotypes.

2. The APSR letter indicated that:

The likelihood of racially prejudiced respondents in the control condition agreeing that shelter-in-place orders threatened their individual rights and freedom was 27%, compared with a likelihood of 55% in the treatment condition (p < 0.05 for a one-tailed test).

My analysis using survey weights got 44% and 29% among participants who reported a negative relative stereotype about Blacks for at least one of the two stereotype items, and my analysis got 55% and 26% among participants who reported negative relative stereotypes about Blacks for both stereotype items, with a trivial overlap in 84% confidence intervals.

But the 55% and 26% in a weighted analysis were 43% and 37% in an unweighted analysis with a large overlap in 84% confidence intervals, suggesting that at least some of the results in the APSR letter might be limited to the weighted analysis. I ran the code for the APSR letter removing the weights from the glm command and got the revised Figure 1 plot below. The error bars in the APSR letter are described as 84% confidence intervals.
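The weights check amounts to fitting the same model with and without the weights argument, as in this sketch (the outcome name is from the letter's code; the treatment and weight names are my placeholders):

m_wt <- glm(individualrights_dichotomous ~ treat * negstereotype_endorsement, family = binomial, data = d, weights = weight)
m_unwt <- glm(individualrights_dichotomous ~ treat * negstereotype_endorsement, family = binomial, data = d)
# Non-integer weights trigger a warning from the binomial family, which
# quasibinomial avoids; either way, compare the two sets of estimates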

I think that it's fine to favor the weighted analysis, but I'd prefer that publications indicate when results from an experiment are not robust to the application or non-application of weights. Relevant publication.

3. Given the results in my notes [1] and [2], maybe the APSR letter's Figure 1 estimates are for only respondents who reported negative relative stereotype about Blacks for both stereotypes. If so, the APSR letter's suggestion that this population is the 26% that reported anti-Black stereotypes for either stereotype might be misleading, if the Figure 1 analyses were estimated for only the 10% that reported negative relative stereotype about Blacks for both stereotypes.

For what it's worth, the R code for the APSR letter has code that doesn't use the 0.5 level of the negative stereotype endorsement variable, such as:

# Below are code for predicted probabilities using logit model

# Predicted probability "individualrights_dichotomous"

# Treatment group, negstereotype_endorsement = 1

p1.1 <- invlogit(coef(glm1)[1] + coef(glm1)[2] * 1 + coef(glm1)[3] * 1 + coef(glm1)[4] * 1)
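# Note: invlogit() -- e.g., from the arm package -- converts the summed
# log-odds to a probability; the sum above is intercept + treatment +
# moderator + interaction, each evaluated at 1, so the 0.5 level of
# negstereotype_endorsement never enters this predicted probability.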

It's possible to see what happens to the Figure 1 results when the negative stereotype endorsement variable is coded 1 for respondents who endorsed at least one of the stereotypes. Run this at the end of the Stata code for the APSR letter:

replace negstereotype_endorsement = ceil((unintelligentstereotype2 + lazystereotype2)/2)
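// ceil() rounds the 0.5 level up to 1, so this recodes respondents who endorsed one or both stereotypes to 1 and leaves 0 unchanged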

Then run the R code for the APSR letter. Below is the plot I got for a revised Figure 1, with weights applied and the sample limited to respondents who endorsed at least one of the stereotypes:

Estimates in the figure above were close to estimates in my analysis using these Stata commands after running the Stata code from the APSR letter. Stata output.

4. Data, Stata code, and Stata output for my analysis about the "Specifically, I hypothesize" passage of the Stephens-Dougan pre-analysis plan.

My analysis in the Stata output had seven outcomes: the four outcomes mentioned in the "Specifically, I hypothesize" part of the pre-analysis plan as initially measured (corresponding to questionnaire items Q15, Q18, Q16, and Q21), with no dichotomization of five-point response scales for Q15, Q18, and Q16; two of these outcomes (Q15 and Q16) dichotomized as mentioned in the pre-analysis plan (e.g., "more likely to disagree" was split into disagree / not disagree categories, with the not disagree category including respondent skips); and one outcome (Q18) dichotomized so that one category has "Not Very Important" and "Not At All Important" and the other category has the other responses and skips, given that the pre-analysis plan had this outcome dichotomized as disagree but response options in the survey were not on an agree-to-disagree scale. Q21 was measured as a dichotomous variable.
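A sketch of the dichotomizations described above, with hypothetical codings (q15 on a five-point agree-to-disagree scale coded 1 to 5, and q18 on a four-point importance scale coded 1 for "Very Important" to 4 for "Not At All Important"):

d$q15_disagree <- ifelse(!is.na(d$q15) & d$q15 >= 4, 1, 0)      # disagree vs. not, skips coded 0
d$q18_notimportant <- ifelse(!is.na(d$q18) & d$q18 >= 3, 1, 0)  # "Not Very"/"Not At All Important" vs. the rest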

The analysis was limited to presumed racially prejudiced Whites, because I think that that's what the pre-analysis plan hypotheses quoted above focused on. Moreover, that analysis seems more important than a mere difference between groups of Whites.

Note that, for at least some results, a p<0.05 treatment effect might be in the unintuitive direction, so be careful before interpreting a p<0.05 result as evidence for the hypotheses.

My analyses aren't the only analyses that can be conducted, and it might be a good idea to combine results across outcomes mentioned in the pre-analysis plan or across all outcomes in the questionnaire, given that the questionnaire had at least 12 items that could serve as outcome variables.

For what it's worth, I wouldn't be surprised if a lot of people who respond to survey items in an unfavorable way about Blacks backlashed against a message about how Blacks were more likely than Whites to die from covid-19.

5. The pre-analysis plan included a footnote that:

Given the results from my pilot data, it is also my expectation that partisanship will moderate the effect of the treatment or that the treatment effects will be concentrated among Republican respondents.

Moreover, the pre-analysis plan indicated that:

The condition and treatment will be blocked by party identification so that there are roughly equal numbers of Republicans and Democrats in each condition.

But the lone mention of "Repub-" in the APSR letter is:

The sample was 39% self-identified Democrats (including leaners) and 46% self-identified Republicans (including leaners).

6. Link to tweets about the APSR letter.

---

1.

In 2003, Melissa V. Harris-Lacewell wrote (p. 222):

The defining works of White racial attitudes fail to grapple with the complexities of African American political thought and life. In these studies, Black people are a static object about which White people form opinions.

Researchers still sometimes make it difficult to analyze data from Black participants or don't report interesting data on Black participants. Helping to address this, Darren W. Davis and David C. Wilson have a new book Racial Resentment in the Political Mind (RRPM), with an entire chapter on African Americans' resentment toward Whites.

RRPM is a contribution to research on Black political attitudes, and its discussion of measurement of Whites' resentment toward Blacks is nice, especially for people who don't realize that standard measures of "racial resentment" aren't good measures of resentment. But let me discuss some elements of the book that I consider flawed.

---

2.

RRPM draws, at a high level, a parallel between Whites' resentment toward Blacks and Blacks' resentment toward Whites (p. 242):

In essence, the same model of a just world and appraisal of deservingness that guides Whites' racial resentment also guides African Americans' racial resentment.

That seems reasonable, to have the same model for resentment toward Whites and resentment toward Blacks. But RRPM proposes different items for a battery of resentment toward Blacks and for a battery of resentment toward Whites, and I think that different batteries for each type of resentment will undercut comparison of the size of the effects of these two different resentments, because one battery might capture true resentment better than another battery.

Thus, especially for general surveys such as the ANES that presumably can't or won't devote space to batteries measuring resentments tailored to each racial group, it might be better to measure resentment toward various groups with generalizable items such as agreement/disagreement to statements such as "Whites have gotten more than they deserve" and "Blacks have gotten more than they deserve", which hopefully would produce more valid comparisons of the estimated effect of resentments toward different groups, compared to comparison of batteries of different items.

---

3.

RRPM suggests that all resentment batteries not be given to all respondents (p. 241):

A clear outcome of this chapter is that African Americans should not be presented the same classic racial resentment survey items that Whites would answer (and perhaps vice versa)...

And from page 30:

African Americans and Whites have different reasons to be resentful toward each other, and each group requires a unique set of measurement items to capture resentment.

But not giving participants items measuring resentment of their own racial group doesn't seem like a good idea. A White participant could think that Whites have received more than they deserve on average, and a Black participant could think that Blacks have received more than they deserve on average. So omitting measures such as White resentment of Whites could plausibly bias estimates of the effect of resentment, if resentment of one's own racial group influences a participant's attitudes about political phenomena.

---

RRPM discusses asking Blacks to respond to racial resentment items toward Blacks: "No groups other than African Americans seem to be asked questions about self-hate" (p. 249). RRPM elsewhere qualifies this with "rarely": "That is, asking African Americans to answer questions about disaffection toward their own group is a task rarely asked of other groups" (p. 215).

The ANES 2016 pilot study did ask White participants about White guilt (e.g., "How guilty do you feel about the privileges and benefits you receive as a white American?") without asking any other racial groups about parallel guilt. Moreover, the CCES had (in 2016 and 2018 at least) an agree/disagree item asked of Whites and others that "White people in the U.S. have certain advantages because of the color of their skin", with no equivalent item about color-of-skin advantages for people who are not White.

But even if Black participants disproportionately receive resentment items directed at Blacks, the better way to address this inequality and to understand racial attitudes is to add resentment items directed at other groups.

---

4.

RRPM seems to suggest an asymmetry in that only Whites' resentment is normatively bad (p. 25):

In the end, African Americans' quest for civil rights and social justice is resented by Whites, and Whites' maintenance of their group dominance is resented by African Americans.

Davis and Wilson discussed RRPM in a video on the UC Public Policy Channel, with Davis suggesting that "a broader swath of citizens need to be held accountable for what they believe" (at 6:10) and that "...the important conversation we need to have is not about racists. Okay. We need to understand how ordinary American citizens approach race, approach values that place them in the same bucket as racists. They're not racists, but they support the same thing that racists support" (at 53:37).

But, from what I can tell, the ordinary American citizens in the same bucket as racists don't seem to be, say, people who support hiring preferences for Blacks for normatively good reasons and just happen to have the same policy preferences as people who support hiring preferences for Blacks because of racism against Whites. Instead, my sense is that the racism in question is limited to racism that causes racial inequality: David C. Wilson at 3:24 in the UC video:

And so, even if one is not racist, they can still exacerbate racial injustice and racial inequality by focusing on their values rather than the actual problem and any solutions that might be at bay to try and solve them.

---

Another apparent asymmetry is that RRPM mentions legitimizing racial myths throughout the book (vii, 3, 8, 21, 23, 28, 35, 47, 48, 50, 126, 129, 130, 190, 243, 244, 247, 261, 337, and 342), but legitimizing racial myths are not mentioned in the chapter on African Americans' resentment toward Whites (pp. 214-242). RRPM page 8 figure 1.1 is a model of resentment that has an arrow from legitimizing racial myths to resentment, but RRPM doesn't indicate what, if any, legitimizing racial myths inform resentment toward Whites.

Legitimizing myths are conceptualized on page 8 as follows:

Appraisals of deservingness are shaped by legitimizing racial myths, which are widely shared beliefs and stereotypes about African Americans and other minorities that justify their mistreatment and low status. Legitimizing myths are any coherent set of socially accepted attitudes, beliefs, values, and opinions that provide moral and intellectual legitimacy to the unequal distribution of social value (Sidanius, Devereux, and Pratto 1992).

But I don't see why legitimizing myths couldn't add legitimacy to unequal *treatment*. Presumably resentment flows from beliefs about the causes of inequality, so the belief that Whites are a cause, the main cause, or the only cause of Black/White inequality could legitimize resentment toward Whites and, consequently, discrimination against Whites.

---

5.

The 1991 National Race and Politics Survey had a survey experiment, asking for agreement/disagreement to the item:

In the past, the Irish, the Italians, the Jews and many other minorities overcame prejudice and worked their way up.

Version 1: Blacks...
Version 2: New immigrants from Europe...

...should do the same without any special favors?

This experiment reflects the fact that responses to items measuring general phenomena applied to a group might be influenced by the general phenomena and/or the group.

Remarkably, the RRPM measurement of racial schadenfreude (Chapter 7) does not address this ambiguity, with items measuring participant feelings about only President Obama, such as the schadenfreude felt at "Barack Obama's being identified as one of the worst presidents in history". At least RRPM acknowledges this (p. 206):

Without a more elaborate research design, we cannot really determine whether the schadenfreude experienced by Republicans is due to his race or to some other issue.

---

6.

For an analysis of racial resentment in the political mind, RRPM remarkably doesn't substantively consider Asians, even if only as a target of resentment to help test alternate explanations about the cause of resentment, given that, like Whites, Asians on average have relatively positive outcomes in income and related measures, but do not seem to be blamed for U.S. racial inequality as much as Whites are.

---

NOTES

1. From RRPM (p. 241):

When items designed on one race are automatically applied to another race under the assumption of equal meaning, it creates measurement invariance.

Maybe the intended meaning is something such as "When items designed on one race are automatically applied to another race, it assumes measurement invariance".

2. RRPM Figure 2.1 (p. 68) reports how resentment correlates with feeling thermometer ratings about Blacks and with feeling thermometer ratings about Whites, but not with the more intuitive measure of the *difference* in feeling thermometer ratings about Blacks and about Whites.
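A sketch of that suggested check in R, with hypothetical names:

# Correlate resentment with the difference in thermometer ratings,
# rather than with each thermometer separately
cor(d$resentment, d$ft_black - d$ft_white, use = "complete.obs")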

---

The recent Rhodes et al 2022 Monkey Cage post indicated that:

...as [Martin Luther] King [Jr.] would have predicted, those who deny the existence of racial inequality are also those who are most willing to reject the legitimacy of a democratic election and condone serious violations of democratic norms.

Regarding this inference about the legitimacy of a democratic election, Rhodes et al 2022 reported results for an item that measured perceptions about the legitimacy of Joe Biden's election as president in 2020. But a potential confound is that reported perceptions of the legitimacy of the U.S. presidential election in 2020 are due to who won that election and are not about elections per se. One way to address this confound is to use a measure of reported perceptions of the legitimacy of the U.S. presidential election *in 2016*, which Donald Trump won.

I checked data from the Democracy Fund Voter Study Group VOTER survey for responses to the items below, which can help address this confound:

[from 2016 and 2020] Over the past few years, Blacks have gotten less than they deserve.

[from 2016] How confident are you that the votes in the 2016 election across the country were accurately counted?

[from 2020] How confident are you that votes across the United States were counted as voters intended in the elections this November?
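A sketch of the comparison in R, assuming a long-format data frame with hypothetical names (year, deserve for the "less than they deserve" item, not_confident coded 1 for not-confident responses, and weight):

library(dplyr)

voter %>%
  filter(deserve %in% c("Strongly disagree", "Strongly agree")) %>%
  group_by(year, deserve) %>%
  summarise(pct_not_confident = 100 * weighted.mean(not_confident, weight, na.rm = TRUE))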

Results are below:

The dark columns are for respondents who strongly disagreed that Blacks have gotten less than they deserve, so that these respondents can plausibly be described as denying the existence of unfair racial inequality. The light columns are for respondents who strongly agreed that Blacks have gotten less than they deserve, so that these respondents can plausibly be described as most strongly asserting the existence of unfair racial inequality.

Comparison of the 2020 column for "strongly disagree" to the 2020 column for "strongly agree" suggests that, as expected based on Rhodes et al 2022, skepticism about votes in 2020 being counted accurately was more common among respondents who most strongly denied the existence of unfair racial inequality than among respondents who most strongly asserted the existence of unfair racial inequality.

But comparison of the 2016 column for "strongly disagree" to the 2016 column for "strongly agree" suggests that the general phrasing of "those who deny the existence of racial inequality are also those who are most willing to reject the legitimacy of a democratic election" does not hold for every election, such as the presidential election immediately prior to the election that was the focus of the relevant item in Rhodes et al 2022.

---

NOTE

1. Data source. Stata do file. Stata output. Code for the R plot.
