Comments on Lacina 2022 "Nearly all NFL head coaches are White. What are the odds?"
The Monkey Cage recently published "Nearly all NFL head coaches are White. What are the odds?" [archived], by Bethany Lacina.
Lacina reported analyses that compared observed racial percentages of NFL head coaches to benchmark percentages that are presumably intended to represent what racial percentages of NFL head coaches would occur absent racial bias. For example, Lacina compared the percentage of Whites among NFL head coaches hired since February 2021 (8 of 10, or 80%) to the percentage of Whites among the set of NFL offensive coordinators, defensive coordinators, and recently fired head coaches (which was between 70% and 80% White).
Lacina indicated that:
If the hiring process did not favor White candidates, the chances of hiring eight White people from that pool is only about one in four — or plus-322 in sportsbook terms.
I think that Lacina might have reported the probability that *exactly* eight of the ten recent NFL coach hires were White. But for assessing unfair bias favoring White candidates, it makes more sense to report the probability that *at least* eight of the ten recent NFL coach hires were White: that probability is 38% using a 70% White pool and is 67% using an 80% White pool. See Notes 1 through 3 below.
---
Lacina also conducted an analysis for the one Black NFL head coach among the 14 NFL head coaches in 2021 to 2022 who were young enough to have played in the NCAA between 1999 and 2007, given that demographic data from her source were available starting in 1999. Benchmark percentages were 30% Black from NCAA football players and 44% Black from NCAA Division I football players.
The correctness of Lacina's calculations for this analysis doesn't seem to matter, because the benchmark does not seem to be a reasonable representation of how NFL head coaches are selected. For example, quarterback is the most important player position, and quarterbacks presumably need to know football strategy relatively well compared to players at most or all other positions. So I think that the per capita probability of a college quarterback becoming an NFL head coach is likely nontrivially higher than the per capita probability for players at other positions. However, Lacina's benchmark doesn't adjust for player position.
---
None of the above analysis should be interpreted to suggest that selection of NFL head coaches has been free from racial bias. But I think that it's reasonable to suggest that the Lacina analysis isn't very informative either way.
---
NOTES
1. Below is R code for a simulation that returns a probability of about 24%, for the probability that *exactly* eight of ten candidates are White, drawn without replacement from a candidate pool of 32 offensive coordinators and 32 defensive coordinators that is overall 70% White:
SET <- c(rep_len(1,45), rep_len(0,19))
LIST <- c()
for (i in 1:100000){
  LIST[i] <- sum(sample(SET, 10, replace=FALSE))
}
table(LIST)
length(LIST[LIST==8])/length(LIST)
The probability is about 32% if the pool of 64 is 80% White. Adding in a few recently fired head coaches doesn't change the percentage much.
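As a cross-check on the simulation, the same probability can be computed exactly from the hypergeometric distribution, which describes draws without replacement. The sketch below uses base R's dhyper. The 45-of-64 split comes from the simulation code above, but the 51-of-64 split for the roughly 80% White pool is an assumed coding, since the code above only covers the 70% case.

```r
# Exact P(exactly 8 White of 10 hires) for draws without replacement.
# dhyper(x, m, n, k): x = White drawn, m = White in pool,
#                     n = non-White in pool, k = number of draws.
p_exact_70 <- dhyper(8, 45, 19, 10)  # 45 of 64 White, as in the simulation code
p_exact_80 <- dhyper(8, 51, 13, 10)  # 51 of 64 White: ASSUMED split for an ~80% White pool

print(round(c(pool_70 = p_exact_70, pool_80 = p_exact_80), 3))
```

Both values should land close to the simulated percentages, with any small gap attributable to simulation noise or to a different coding of the 80% pool.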
2. In reality, 8 White candidates were hired for the 10 NFL head coaching positions. So how do we assess the extent to which this observed result suggests unfair bias in favor of White candidates? Let's first get results from the simulation...
For my 100,000-run simulation using the above code and a random seed of 123, the simulation produced exactly zero White head coaches zero times, exactly 1 White head coach 5 times, exactly 2 White head coaches 52 times, exactly 3 White head coaches 461 times, exactly 4 White head coaches 2654 times, exactly 5 White head coaches 9255 times, exactly 6 White head coaches 20987 times, exactly 7 White head coaches 29307 times, exactly 8 White head coaches 24246 times, exactly 9 White head coaches 10978 times, and exactly 10 White head coaches 2055 times.
The simulation indicated that, if candidates were randomly drawn from a 70% White pool, exactly 8 of 10 coaches would be White about 24% of the time (24,246/100,000). This 8-of-10 result represents a selection of candidates from the pool that is perfectly fair with no evidence of bias for *or against* White candidates.
The 8-of-10 result would be the proper focus if our interest were bias for *or against* White candidates. But the Lacina post didn't seem concerned about evidence of bias against White candidates, so the simulation results for 9 White of 10 and for 10 White of 10 should be added to the 8-of-10 total, which gives about 37% (37,279/100,000). The 9-of-10 and 10-of-10 results are simulated outcomes relative to which White candidates were underrepresented in reality. So, measured against the observed hires, the 8 of 10 represents no bias, the 9 of 10 and the 10 of 10 represent bias against Whites, and everything else represents bias favoring Whites.
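The arithmetic in this note can be checked directly from the tallies reported above. This is a minimal sketch that re-enters those tallies by hand. The vector name counts is only illustrative.

```r
# Tallies from the 100,000-run simulation, for 0 through 10 White hires.
counts <- c(0, 5, 52, 461, 2654, 9255, 20987, 29307, 24246, 10978, 2055)
names(counts) <- 0:10

total <- sum(counts)                       # should equal the number of runs
p_exactly_8  <- counts[["8"]] / total      # share of runs with exactly 8 White hires
p_at_least_8 <- sum(counts[c("8", "9", "10")]) / total  # share with 8, 9, or 10

print(c(total = total, exactly_8 = p_exactly_8, at_least_8 = p_at_least_8))
```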
3. Below is R code for a simulation that returns a probability of about 37%, for the probability that *at least* eight of ten candidates are White, drawn without replacement from a candidate pool of 32 offensive coordinators and 32 defensive coordinators that is overall 70% White:
SET <- c(rep_len(1,45), rep_len(0,19))
LIST <- c()
for (i in 1:100000){
  LIST[i] <- sum(sample(SET, 10, replace=FALSE))
}
table(LIST)
length(LIST[LIST>=8])/length(LIST)
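The "at least eight" probability also has an exact form as the upper tail of the hypergeometric distribution. The sketch below uses base R's phyper with lower.tail = FALSE. As before, 45 of 64 White matches the simulation code, while 51 of 64 White is an assumed split for the roughly 80% White pool.

```r
# Exact P(at least 8 White of 10 hires) for draws without replacement.
# phyper(q, m, n, k, lower.tail = FALSE) returns P(X > q), so q = 7 gives P(X >= 8).
p_atleast_70 <- phyper(7, 45, 19, 10, lower.tail = FALSE)  # 45 of 64 White
p_atleast_80 <- phyper(7, 51, 13, 10, lower.tail = FALSE)  # 51 of 64 White: ASSUMED split

print(round(c(pool_70 = p_atleast_70, pool_80 = p_atleast_80), 3))
```

Because these are exact values, they will differ slightly from any single simulation run but should agree to within about a percentage point.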
---
UPDATE
I corrected some misspellings of "Lacinda" to "Lacina" in the post.
---
UPDATE 2 (March 18, 2022)
Bethany Lacina discussed her calculation with me. She indicated that she did calculate the probability of at least eight of ten, but she used a joint probability method that I don't think is correct, because random error would bias the inference toward unfair selection of coaches by race. Given the extra information that Bethany provided, here is a revised calculation that produces a probability of about 60%:
# In 2021: 2 non-Whites hired of 6 hires.
# In 2022: 0 non-Whites hired of 4 hires (up to the point of the calculation).
# The simulation below is for the probability that at least 8 of the 10 hires are White.
SET.2021 <- c(rep_len(0,12), rep_len(1,53)) ## 1=White candidate
SET.2022 <- c(rep_len(0,20), rep_len(1,51)) ## 1=White candidate
LIST <- c()
for (i in 1:100000){
  DRAW.2021 <- sum(sample(SET.2021, 6, replace=FALSE))
  DRAW.2022 <- sum(sample(SET.2022, 4, replace=FALSE))
  LIST[i] <- DRAW.2021 + DRAW.2022
}
table(LIST)
length(LIST[LIST>=8])/length(LIST)
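The revised calculation can also be checked without simulation by combining two independent hypergeometric draws, one per hiring year. This is a sketch under the same pool sizes as the code above (53 White and 12 non-White candidates in 2021, 51 White and 20 non-White in 2022). The variable names are illustrative only.

```r
# Exact distribution of White hires in each year (draws without replacement).
w2021 <- 0:6                         # possible White hires among the 6 hires in 2021
w2022 <- 0:4                         # possible White hires among the 4 hires in 2022
p2021 <- dhyper(w2021, 53, 12, 6)    # pool: 53 White, 12 non-White
p2022 <- dhyper(w2022, 51, 20, 4)    # pool: 51 White, 20 non-White

# Treat the two years as independent: joint probability of each (2021, 2022) pair.
joint <- outer(p2021, p2022)
total_white <- outer(w2021, w2022, "+")

# Probability that the combined total is at least 8 White hires of 10.
p_at_least_8 <- sum(joint[total_white >= 8])
print(round(p_at_least_8, 3))
```

The result should sit close to the simulated value of about 60%.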