Tour of research on student evaluations of teaching [31-33]: Bray and Howard 1980, Bennett 1982, and Brooks 1982
Let's continue our discussion of studies in Holman et al. 2019 "Evidence of Bias in Standard Evaluations of Teaching" listed as "finding bias". See here for the first entry in the series and here for other entries.
---
31.
Bray and Howard 1980 "Interaction of Teacher and Student Sex and Sex Role Orientations and Student Evaluations of College Instruction" selected 6 instructional faculty at the College of Social Sciences at the University of Houston from each of the following categories: feminine man, androgynous man, masculine man, feminine woman, androgynous woman, and masculine woman. For each of these 36 faculty members, students in one of the faculty member's undergraduate courses were given, during the final week of classes, the IDEA questionnaire for measuring student evaluation of teaching and the Bem Sex Role Inventory (BSRI).
Bray and Howard 1980 indicate that the IDEA had 40 items across four parts: 20 instructor items, 10 student-rated progress items, 3 course evaluation items, and 7 items for the student self-rating of the course (p. 243). The article indicates that "During the last full week of classes (prior to final examination week) the appropriate students in each target class completed BSRI and IDEA questionnaires" (p. 244) and that "Two sections of IDEA were employed as outcome criteria: (1) the average of the responses on Part 2: Student Rated Progress (progress), and (2) student satisfaction with the instructor (satisfaction; IDEA question 37)" (p. 243). From what I can tell, the article thus uses only 11 of the 40 measured IDEA items and does not report results for any of the 20 instructor items from the first part of the IDEA questionnaire.
Results indicate that "Androgynous teachers received somewhat higher student evaluations than did masculine and feminine teachers" (p. 246), but there is no statistical control to assess whether any differences are due to differences in teaching styles or other factors.
---
32.
Bennett 1982 "Student Perceptions of and Expectations for Male and Female Instructors: Evidence Relating to the Question of Gender Bias in Teaching Evaluation" reported on data from 253 liberal arts college students in nonscience introductory courses. Students were instructed to respond to a self-returned questionnaire about a specific course.
Results in Table 2 indicate that, of the four factors in questionnaire responses, female instructors had higher ratings than did male instructors for non-authoritarian interpersonal style and charisma (potency) and did not differ at p<0.05 from male instructors on the self-assurance factor or the instructional approach factor.
Students reported more scheduled office visits for female instructors than for male instructors and were more likely to report feeling free to contact a female instructor at home than to contact a male instructor at home.
Results also suggested that the explanatory power of certain predictors of performance ratings differed for female instructors and for male instructors. From page 176:
Additionally, a highly structured instructional approach—described by students as communicating greater professionalism—was consistently more important for women's performance ratings than for men's. This was especially true for students' ratings of instructor's organization, clarity, and coherence in classroom presentation (rs = .76 and .53, for instructional approach for women and men, respectively), command of material for classroom presentation (rs = .49 and .27), and overall evaluation (rs = .57 and .30). Although in the collective student mind men and women do not differ in instructional approach, students are clearly more tolerant of what they perceive as a lack of formal professionalism in the conduct of teaching from their male professors, demanding of women a higher standard of formal preparation and organization.
Moreover, from pages 177 and 178:
...male instructors are judged independently of students' personal experiences of contact and access, whereas female instructors are judged far more closely in this regard. In this sense women are negatively evaluated when they fail to meet this gender appropriate expectation (and rewarded when they do so), although this study cannot provide evidence of the obverse—whether women are devalued for achievement in stereotypically masculine domains.
The "(and rewarded when they do so)" reflects how evidence that the explanatory power of the predictors of performance ratings differs for female instructors and for male instructors is only mixed evidence of unfair bias in these performance ratings. If it is true that, for example, organization matters more for the performance ratings of female instructors than for male instructors, that might merely mean that female instructors have more ability to influence their performance ratings through their level of organization. That's not obviously worse to me than, say, male instructors not being rewarded for higher levels of organization. For a recent discussion of a related finding, here is a passage from Hesli et al. 2012 (p. 486, emphasis in the original):
Another critical finding that we note with some consternation is that among women, the probability of being an associate professor over an assistant professor is unrelated (given other controls) to the total number of publications (model 4D). This confirms that the promotion process at this level involves different dynamics for men as compared with women.
I'm not sure that the Hesli et al. 2012 results indicate that there is a p<0.05 difference in the explanatory power of publications. The coefficient for publications is statistically significant for men and not for women, but that does not necessarily indicate that a difference between the coefficients can be inferred. For what it's worth, the "publications" logit coefficient for men is 0.713 with a 0.230 standard error and for women is 0.303 with a 0.453 standard error.
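One rough way to check this is a z-test for the difference between the two coefficients, sketched below in Python. The sketch assumes that the coefficients for men and for women come from separate models fit to non-overlapping samples, so that the two estimates are independent, and it relies on a normal approximation. The function name is mine, for illustration only, and nothing here is from Hesli et al. 2012 other than the two coefficients and their standard errors.

```python
import math

def coefficient_difference_test(b1, se1, b2, se2):
    """Approximate z-test for the difference between two coefficients.

    Assumes the two estimates are independent (e.g., from separate
    models fit to non-overlapping samples) and approximately normal.
    """
    diff = b1 - b2
    # With independent estimates, the variances of the estimates add.
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = diff / se_diff
    # Two-tailed p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, se_diff, z, p

# "Publications" logit coefficients and standard errors reported above.
diff, se_diff, z, p = coefficient_difference_test(0.713, 0.230, 0.303, 0.453)
print(f"difference = {diff:.3f}, SE = {se_diff:.3f}, z = {z:.2f}, p = {p:.2f}")
```

Under those assumptions, the difference of 0.410 has a standard error of about 0.508, for a z of about 0.81 and a two-tailed p-value of about 0.42, which is not close to p<0.05.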
---
33.
Brooks 1982 "Sex Differences in Student Dominance Behavior in Female and Male Professors' Classrooms" isn't a study about student evaluations of teaching. Rather, the study involved analyzing behaviors of first-year students in a master's of social work program, such as how often students spoke in class, interrupted a fellow student, and interrupted the professor.
Results indicated that "...male students interrupted both male and female professors significantly more often than female students (p < .01 and p < .001, respectively), and interrupted female professors significantly more often than male professors (p < .001)" (p. 687). Interruptions by female students were more evenly distributed than interruptions by male students: of all female interruptions of professors, 45 percent were of female professors; of all male interruptions of professors, 74 percent were of female professors.
---
Comments are open if you disagree, but I don't think that data from the early 1980s is relevant for discussions of whether student evaluations of teaching should be used in employment decisions made in 2019 or beyond.
From what I can tell, the main evidence of bias in student evaluations of teaching to be concerned with in these three studies is from Bennett 1982. But, from what I can tell, performance evaluations being better predicted by certain criteria for female instructors than for male instructors doesn't necessarily cause a gender-wide bias. For example, imagine these performance evaluation scores:
2 for a disorganized male instructor
2 for an organized male instructor
1 for a disorganized female instructor
3 for an organized female instructor
Organization matters more for the female instructors than for the male instructors, but that particular difference in evaluation criteria does not cause a gender-wide bias. Following Hesli et al. 2012, it could be noted with some consternation that performance evaluation scores for men are unrelated to the instructor's organization.
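The arithmetic behind that example can be sketched in a few lines of Python. The scores are the hypothetical ones listed above, not data from any study, and the comparison of gender-wide averages assumes equal numbers of organized and disorganized instructors within each gender. The function names are mine, for illustration only.

```python
# Hypothetical performance evaluation scores from the example above.
scores = {
    ("male", "disorganized"): 2,
    ("male", "organized"): 2,
    ("female", "disorganized"): 1,
    ("female", "organized"): 3,
}

def mean_score(gender):
    """Average score for a gender, weighting each organization level equally."""
    values = [score for (g, _), score in scores.items() if g == gender]
    return sum(values) / len(values)

def organization_effect(gender):
    """Score for organized instructors minus score for disorganized instructors."""
    return scores[(gender, "organized")] - scores[(gender, "disorganized")]

for gender in ("male", "female"):
    print(gender, "mean:", mean_score(gender),
          "organization effect:", organization_effect(gender))
```

Both genders have a mean of 2, even though the organization effect is 0 for the male instructors and 2 for the female instructors. With unequal numbers of organized and disorganized instructors, the gender-wide averages could differ, so the example shows only that a difference in evaluation criteria does not by itself produce a gender-wide gap.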