Omitted condition from Monkey Cage post on paying college athletes
The Monkey Cage published a post, "Racial prejudice is driving opposition to paying college athletes. Here's the evidence." I tweeted about this post in several threads, but I'm posting the information here for possible future reference and for anyone who reads the blog.
Here's the key figure from the post. The left side of the figure indicates that white respondents expressed more opposition to paying college athletes after exposure to a picture of black athletes than in a control condition with no picture.
After reading the post, I noted two oddities about the figure. First, based on the logic of an experiment -- change one thing only to assess the effect of that thing -- the proper comparison for assessing racial bias among white respondents would have been to compare the effect of a photo of black athletes to the effect of a photo of white athletes; that comparison would have ruled out the alternative explanations that respondents expressed more opposition because a photo was shown, or because a photo of athletes was shown, and not necessarily because a photo of *black* athletes was shown. Second, the data were from the CCES, which typically has team samples of 1,000 respondents; these samples are presumably intended to be representative of the national population, so there should have been more than 411 whites in a 1,000-respondent sample.
Putting two and two together suggested that there was an unreported condition in which respondents were shown a photo of white athletes. I emailed the three authors of the blog post, and to their credit I received substantive replies to my questions about the experiment. Based on the team's responses, the experiment did have a condition in which respondents were shown a photo of white athletes, and opposition to paying college athletes in this "white athletes" photo condition did not differ at p<0.05 (two-tailed test) from opposition to paying college athletes in the "black athletes" photo condition.
Nice that they fessed up, but how can they justify the omission? Why should we ever trust them again?
And why only give the result of a hypothesis test, rather than the estimated effect size? I can't stand it when people focus only on p-values.
This seems like garbage science.
I didn't ask for a justification, but here is a quote from one of the research team members in the email exchange: "You are right. One of the experimental conditions was to show respondents pictures of white faces (with white names). The results here were not interesting and muddied things up for the purposes of this short article. We discuss them at length in the paper but they do not change the substantive story we report in the article."
That might reflect drafting a simplified post for a general audience and keeping the more complicated details for an academic audience. If so, I don't think that simplification justifies omitting results that undercut a narrative, but at least the team was very quick and transparent in answering my questions and did note that the paper mentions the omitted result. But I can't imagine that it's good for social science or society in general to report evidence of antiblack bias that is stronger than is warranted by the data.
I'd expect a nonzero influence of racial prejudice on opposition to paying college athletes, so, as you indicated, the important result is the estimated effect size; the eventual article should therefore report estimated effect sizes for the white athletes photo condition and for the correlational part of the analysis. In any event, it's possible that the data might be made public at some point, based on this notice.
I have a question. In your post you say that, based on the authors' responses, "opposition to paying college athletes in this 'white athletes' photo condition did not differ at p<0.05 (two-tailed test) from opposition to paying college athletes in the 'black athletes' photo condition."
But in this comment you're saying that the authors simply wrote that the 'white players condition' did not have any interesting results and doesn't change the gist of their main finding. Which one is it?
Hi DZK,
If you are asking about what the authors indicated in the email exchange, the first return message that I received from a member of the team contained the "not interesting" description of the results that I mentioned in my comment. I then sent a return email message to the team, asking at the end of the message:
"Are the data or the paper available for review? If not, could you indicate whether the experiment permitted detection of a difference at conventional levels of statistical significance (e.g., p<0.05, two-tailed test) in white support for paying college athletes when comparing responses in the condition with the white athletes photo and white names to the condition with the black athletes photo and black names?" The message that I received in response to that second email indicated the lack of a statistical difference between the black athletes and white athletes photo conditions; this second return message was the message referenced in the initial post. Let me know if that doesn't answer your question.
"The results were not interesting."
If you weren't interested in the answer, why did you go through the considerable trouble and expense of performing the experiment? I guess the answer is only "interesting" if you can use it to support a (likely pre-determined) narrative.
"But I can't imagine that it's good for social science or society in general to report evidence of antiblack bias that is stronger than is warranted by the data."
Totally agree, but seems very few think that way.
Of course I also totally agree there are probably true nonzero effects, since the null hypothesis is never really true in social science. I'm a big fan of Gelman's idea of "type S and type M" errors.
http://andrewgelman.com/2004/12/29/type_1_type_2_t/
Gelman's type M and type S errors are a great way to think about things: in terms of direction and effect size and not in terms of statistical significance. I think that dichotomous statistical-significance thinking might foster labeling something as "driving" opposition when it at most causes an increase from 62% opposition to 72% opposition, as in the Monkey Cage post.
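To make the type M and type S idea concrete, here is a rough simulation sketch of my own (not from the authors' paper); the true difference of 3 percentage points and the 200 respondents per condition are assumed numbers chosen only for illustration. With power that low, the replications that happen to reach statistical significance overstate the true difference by a wide margin.

```python
# Rough simulation (mine, not from the authors' paper) of type M and type S errors
# for a two-proportion experiment. The assumed true difference of 3 percentage
# points and the 200 respondents per condition are illustrative numbers only.
import numpy as np

rng = np.random.default_rng(0)
p_control, p_treat, n = 0.65, 0.68, 200   # assumed true rates and per-condition n
reps = 20_000

diffs, sig = [], []
for _ in range(reps):
    treat = rng.binomial(n, p_treat) / n      # e.g., black athletes photo condition
    comp = rng.binomial(n, p_control) / n     # comparison condition
    diff = treat - comp
    pooled = (treat + comp) / 2
    se = np.sqrt(2 * pooled * (1 - pooled) / n)
    diffs.append(diff)
    sig.append(abs(diff) > 1.96 * se)         # "significant" at the 5% level

diffs, sig = np.array(diffs), np.array(sig)
print("share of replications reaching significance:", sig.mean())
print("mean estimate among significant replications:", diffs[sig].mean())   # type M: exaggerated
print("share of significant estimates with the wrong sign:", (diffs[sig] < 0).mean())  # type S
```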
Thank you. I hope we get the complete results soon. I suppose the effect of being shown pictures of white athletes was called "uninteresting" because it was between zero and a larger effect from being shown pictures of black athletes, without being different enough from either to pass the significance test. It would be nice to know.
No problem. I'd agree with your expectation about the effect size, even though it would be more interesting if the estimate in the white photo condition differed statistically from the estimate in the no photo condition.
Thank you, that answered my question.
So their second email response completely debunks their own theory then, doesn't it? If they admitted that there was no statistically significant difference between reactions to the white and black photo conditions, that renders the whole Monkey Cage article moot!
Hi DZK,
I think it's correct that the Monkey Cage post would have appeared closer to moot had the post indicated that the test for a difference between the black athletes and white athletes photo conditions was not statistically significant.
For debunking, I think it's necessary to see what the estimated effect size is for the white athletes condition. Based on the figure, opposition in the black athletes photo condition was about 73%, so if opposition in the white athletes photo condition were, say, 68% but the 73%/68% difference was not statistically significant, then there might be a relatively small overall antiblack bias that the experiment did not have enough power to detect. But if opposition in the white athletes photo condition were, say, 73% or higher, then that would clearly be a debunking of the argument in the post.
I think that the experiment would have been better had respondents from the control condition been distributed to the white athletes and black athletes conditions, to increase the power of the experiment to detect a difference between the racial conditions. Results from the control condition can provide evidence of whether photos of athletes matter in general, but that does not seem to be the question that the research was primarily interested in.
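For a rough sense of the power issue, here is a back-of-the-envelope calculation for detecting a 73% versus 68% difference between the two photo conditions; the figure of about 200 white respondents per condition is my assumption, not a number reported for the study.

```python
# Back-of-the-envelope power for detecting a 73% vs. 68% difference in opposition
# between the two photo conditions. The 200 white respondents per condition is an
# assumption for illustration, not a figure reported by the authors.
import numpy as np
from scipy.stats import norm

p1, p2, n = 0.73, 0.68, 200
diff = p1 - p2
se = np.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
power = (1 - norm.cdf(1.96 - diff / se)) + norm.cdf(-1.96 - diff / se)
print(f"approximate power: {power:.2f}")   # roughly 0.2, well under the usual 0.80 target
```

Under those assumptions the power is only around 0.2, which is consistent with a real but modest antiblack bias going undetected.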
Data and code for the "paying college athletes" survey experiment have been posted. Links are below.
Data: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GEPCEL
Article: http://journals.sagepub.com/doi/abs/10.1177/1065912916685186
For the experiment reported on in the Monkey Cage post, on a 0-to-1 scale in which 1 represents the strongest opposition to paying college athletes, the weighted level of opposition to paying college athletes among all whites was 0.647 in the control condition with no photograph, 0.760 in the black athletes condition, and 0.767 in the white athletes condition. For whites who scored 0.5 or higher on the racial resentment scale, levels of opposition were 0.663 for the control condition, 0.801 for the black athletes condition, and 0.796 for the white athletes condition.
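For anyone who wants to check these numbers against the posted data, here is a minimal sketch of the weighted-mean calculation; the file name and column names in the sketch are hypothetical placeholders, because the actual variable names in the Dataverse file may differ.

```python
# Minimal sketch of the weighted-mean calculation by experimental condition.
# The file name and the column names (white, condition, oppose_pay, weight,
# resentment) are hypothetical placeholders, not the dataset's actual names.
import numpy as np
import pandas as pd

df = pd.read_stata("paying_athletes.dta")                  # hypothetical file name
whites = df[df["white"] == 1]

def weighted_mean(group):
    return np.average(group["oppose_pay"], weights=group["weight"])

# Opposition (0 to 1) among all whites, by condition
print(whites.groupby("condition").apply(weighted_mean))

# Same calculation for whites at or above 0.5 on the racial resentment scale
high_rr = whites[whites["resentment"] >= 0.5]
print(high_rr.groupby("condition").apply(weighted_mean))
```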
The article reported on two survey experiments conducted after the Monkey Cage post was published. These additional experiments had more subtle primes and provided stronger evidence of bias. The March study had a coefficient of 0.062, a standard error of 0.042, and a p-value of 0.139; the April study had respective values of 0.072, 0.044, and 0.109. Both analyses applied the attention check restrictions but not the weights, and did not include the racial resentment main effect or the interaction term. Higher values indicated more opposition to paying college athletes, and the treatment was coded higher than the control. The two studies had a combined effect size of 0.067 and a p-value of 0.030.
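One standard way to pool two estimates like these is a fixed-effect, inverse-variance combination, which comes out very close to the combined numbers above; the sketch below shows that calculation, though the pooling method here is my assumption rather than a description of the article's procedure.

```python
# Fixed-effect (inverse-variance) pooling of the two reported coefficients. This
# is one standard way to combine the estimates and is my assumption here, not a
# description of the article's procedure; it lands very close to 0.067 and p = 0.03.
import numpy as np
from scipy.stats import norm

est = np.array([0.062, 0.072])   # March and April coefficients
se = np.array([0.042, 0.044])    # reported standard errors

w = 1 / se**2
combined = np.sum(w * est) / np.sum(w)
combined_se = np.sqrt(1 / np.sum(w))
z = combined / combined_se
p = 2 * (1 - norm.cdf(abs(z)))

print(f"combined estimate: {combined:.3f}, SE: {combined_se:.3f}, p = {p:.3f}")
```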