You might have seen a Tweet or Facebook post about a recent study on sex bias in teacher grading.

Here is the relevant section from Claire Cain Miller's Upshot article in the New York Times describing the study's research design:

Beginning in 2002, the researchers studied three groups of Israeli students from sixth grade through the end of high school. The students were given two exams, one graded by outsiders who did not know their identities and another by teachers who knew their names.

In math, the girls outscored the boys in the exam graded anonymously, but the boys outscored the girls when graded by teachers who knew their names. The effect was not the same for tests on other subjects, like English and Hebrew. The researchers concluded that in math and science, the teachers overestimated the boys' abilities and underestimated the girls', and that this had long-term effects on students' attitudes toward the subjects.

The Upshot article does not mention that the study's first author had previously published another study that used the same methodology but found a teacher grading bias against boys:

The evidence presented in this study confirms that the previous belief that schoolteachers have a grading bias against female students may indeed be incorrect. On the contrary: on the basis of a natural experiment that compared two evaluations of student performance–a blind score and a non-blind score–the difference estimated strongly suggests a bias against boys. The direction of the bias was replicated in all nine subjects of study, in humanities and science subjects alike, at various level of curriculum of study, among underperforming and best-performing students, in schools where girls outperform boys on average, and in schools where boys outperform girls on average (p. 2103).

This earlier study does not appear to have ever been mentioned in the New York Times. The Upshot article appeared in the print version of the Times, so it seems that Dr. Lavy has also conducted a natural experiment in media bias: publish two studies with the same methodology but opposite conclusions, and test whether the New York Times will report only on the study that agrees with liberal sensibilities. That hypothesis has been confirmed.


Social science correlations over 0.90 are relatively rare, at least for correlations of items that aren't trying to measure the same thing, so I thought I'd post about the 0.92 correlation that I came across in the data from the Leslie et al. 2015 Science article. Leslie et al. co-author Andrei Cimpian emailed me the data in Excel form, which made the analysis a lot easier.

Leslie et al. asked faculty, postdoctoral fellows, and graduate students in a given discipline to respond to this item: "Even though it's not politically correct to say it, men are often more suited than women to do high‐level work in [discipline]." Responses were made on a scale from 1 (strongly disagree) to 7 (strongly agree). Responses to that suitability stereotype item correlated at -0.19 (p=0.44, n=19) with the mean GRE verbal reasoning score for a discipline and at 0.92 (p<0.0001, n=19) with the mean GRE quantitative reasoning score for a discipline [source].

Suitability Stereotype by Mean GRE Score


Female representation varies across academic fields, as illustrated in the figure below [source]. Females earned 46% of all doctoral degrees awarded in 2013, but earned substantially more than that percentage in education (68%) and the social sciences (59%), and substantially less in the physical sciences (29%) and engineering (23%). The gray line indicates the percentage of all doctoral degrees that women earned in 2013 (46%).

Female Representation by Sex and Field

---

I began to look into this topic after reading the Leslie et al. 2015 Science article. I have since come across treatments that better cover the same ground [Scott Alexander, Ceci et al. 2014], so I'll post a few figures here with brief commentary.

---

One possible explanation for the variation in female representation across academic fields is sex differences in mathematics and verbal performance. The figure below reports SAT-Math scores [source] and SAT-Reading scores [source] by sex in 2009. The figure indicates cumulative numbers, so the point at 40% female representation and a 650 SAT-Math score indicates that females were 40% of all SAT test takers who scored 650 or higher on the SAT-Math section. The 40% is not adjusted for the fact that more females than males took the SAT: of the 224,230 persons who scored 650 or higher on the 2009 SAT-Math test, 88,896 were female and 135,334 were male.
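The roughly-40% figure can be checked directly from the counts given above:

```python
# Females as a share of all 2009 SAT-Math test takers scoring 650 or higher,
# using the counts given in the text.
females_650_plus = 88_896
males_650_plus = 135_334
total_650_plus = females_650_plus + males_650_plus  # 224,230

female_share = females_650_plus / total_650_plus
print(f"{female_share:.1%}")  # prints 39.6%
```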

SAT Scores by Sex

---

Scott Alexander linked to this Hsu and Schombert article, which has this passage:

Thus, a strong interpretation of our result would be that even the most determined student is unlikely to master undergraduate Physics or Mathematics if their quantitative ability is below 85th percentile in the overall population. To have a 50 percent or greater chance of success (i.e., for a person of average conscientiousness or work ethic), one needs SAT-M well above 700, or in the top few percent of the overall population.

In the 2009 SAT data, females represented 36% of SAT-Math test takers who scored a 700 or above, 34% of SAT-Math test takers who scored a 750 or above, and 31% of SAT-Math test takers who scored an 800. So -- presuming no downstream discrimination or sex differences in interest or other non-academic factors -- we'd expect females to represent roughly 1/3 of PhDs in math-intensive fields such as engineering or the physical sciences. But females represent less than 1/3 of PhDs in math-intensive fields: in the 2013 data plotted in the first figure, females were 29% of physical sciences PhDs and 23% of engineering PhDs.

This is where the discussion must involve discrimination, bias, interests, life choices, and life constraints. The relevant question seems to be the relative effect size of each factor, not the presence or absence of each. Ceci et al. 2009 and 2014 are good places to find a discussion of such factors.

---

One way to look for evidence of discrimination against women in representation across academic fields is to plot sex differences in verbal and quantitative abilities by field. If women with high quantitative ability were being systematically dissuaded from STEM fields such as engineering and the physical sciences and diverted into non-STEM fields such as the social sciences and the humanities, then these displaced high-ability women should inflate the mean quantitative reasoning scores of women in non-STEM fields. That does not appear to be the case, based on the figure below [source], which plots the 2013 mean GRE verbal reasoning and quantitative reasoning scores among male and female GRE test takers in each field. Of course, women with high quantitative ability might instead have been dissuaded from STEM fields into non-academic careers; there might also be upstream discrimination that causes the observed differences in GRE scores; and there might be selection issues in who takes the GRE exam.

GRE by Field By Sex

---

This final figure plots 2013 female representation across academic fields [source] against 2013 mean male GRE verbal reasoning and quantitative reasoning scores, by broad academic field [source]. The GRE scores are limited to males to account for the fact that some fields, such as education, have a relatively high percentage of females and a relatively low mean GRE quantitative reasoning score. The patterns for quantitative reasoning scores match those of Randal Olson and Scott Alexander, with female representation falling as male GRE quantitative reasoning scores rise; but the patterns for verbal reasoning scores don't match Olson's: in his data there is no correlation, while in the data below there is a positive correlation. One large difference is that Olson's data were disaggregated into smaller fields.

Female Representation by Male GRE Score

---

Notes:

1. [2015-01-25 edit] Corrected the spelling of Randal Olson, and revised the SAT graph to read "SAT Critical Reading" instead of "SAT Verbal."


Here is the title and abstract to my 2015 MPSA proposal:

A Troublesome Belief? Social Inequality and Belief in Human Biological Differences

In A Troublesome Inheritance, Nicholas Wade speculated that biological differences might help explain inequality of outcomes between human groups. Reviewers suggested that Wade's speculations might encourage xenophobia, so, to understand the possible attitudinal consequences of such a belief, I develop predictions based on the expectation that belief in a biological explanation for group-level social inequalities reduces the perceived need for policies to reduce these inequalities. General Social Survey data supported predictions that this belief is correlated with lower support for policies to reduce sexual inequalities, greater support for social distance between racial groups, more support for traditional sex roles, and less support for immigration, but did not indicate a correlation with aggregate support for policies to reduce racial inequalities. I further developed and tested predictions regarding the possibility that persons who perceive biological differences to have resulted from unguided processes such as Darwinian evolution adopt more progressive attitudes toward social inequalities than persons who perceive biological differences to have resulted from guided processes such as intelligent design.

---

UPDATE (Nov 5, 2014)

The "Troublesome Belief" proposal was accepted for the MPSA public opinion panel, "Using public opinion to gauge democracy and the good life."

---

UPDATE (Mar 10, 2015)

Draft of the manuscript for the MPSA presentation is here. Data are here. Code is here.

---

UPDATE (Mar 23, 2015)

Updated draft of the manuscript for the MPSA presentation is here. Data are here. Code is here. Thanks to Emil Ole William Kirkegaard for helpful comments.

---

UPDATE (Feb 13, 2016)

Updated draft of the manuscript is here.


Here's the abstract of a PLoS One article, "Racial Bias in Perceptions of Others' Pain":

The present work provides evidence that people assume a priori that Blacks feel less pain than do Whites. It also demonstrates that this bias is rooted in perceptions of status and the privilege (or hardship) status confers, not race per se. Archival data from the National Football League injury reports reveal that, relative to injured White players, injured Black players are deemed more likely to play in a subsequent game, possibly because people assume they feel less pain. Experiments 1–4 show that White and Black Americans–including registered nurses and nursing students–assume that Black people feel less pain than do White people. Finally, Experiments 5 and 6 provide evidence that this bias is rooted in perceptions of status, not race per se. Taken together, these data have important implications for understanding race-related biases and healthcare disparities.

Here are descriptions of the samples for each experiment, after exclusions of respondents who did not meet criteria for inclusion:

  • Experiment 1: 240 whites from the University of Virginia psychology pool or MTurk
  • Experiment 2: 35 blacks from the University of Virginia psychology pool or MTurk
  • Experiment 3: 43 registered nurses or nursing students
  • Experiment 4: 60 persons from MTurk
  • Experiment 5: 104 persons from MTurk
  • Experiment 6: 245 persons from MTurk

Not the most representative samples, of course. If you're thinking that it would be interesting to see whether the results hold in a large, nationally representative sample, that was tried, with a survey experiment conducted as part of Time-sharing Experiments for the Social Sciences (TESS). Here's the description of the results listed on the TESS site for the study:

Analyses yielded mixed evidence. Planned comparison were often marginal or non-significant. As predicted, White participants made (marginally) lower pain ratings for Black vs. White targets, but only when self-ratings came before target ratings. When target ratings came before self-ratings, White participants made (marginally) lower pain ratings for White vs. Black targets. Follow-up analyses suggest that White participants may have been reactant. White participants reported that they were most similar to the Black target and least similar to the White target, contrary to prediction and previous work both in our lab and others' lab. Moreover, White participants reported that Blacks were most privileged and White participants least privileged, again contrary to prediction and previous work both in our lab and others' lab.

The results of this TESS study do not invalidate the results of the six experiments and one archival study reported in the PLoS One article, but the non-reporting of the TESS study does raise questions about whether there were other unreported experiments and archival studies.

The TESS study had an unusually large and diverse sample: 586 non-Hispanic whites, 526 non-Hispanic blacks, 520 non-Hispanic Asians, and 528 Hispanics. It's too bad that these data were placed into a file drawer.


Here are some graphs to follow up on this Monkey Cage post.

DW-Nominate_SD

Bailey_SD

---

UPDATE (Sept 12, 2014)

Here's another graph of the Bailey data, to illustrate the narrowing and the moderate cross-time shift of the congressional GOP and northern Democratic ideal point estimates, alongside a more dramatic leftward cross-time shift in the congressional southern Democratic ideal point estimates.

bailey_sort

---

UPDATE (Sept 13, 2014)

Here are the do files I used to create a database with Bailey scores and state and party codes for each case (House, Senate). Here is a description of the process that I used to create the do files.
