Implicit Bias, Stereotype Threat, and Political Correctness in Philosophy

This paper offers an unorthodox appraisal of empirical research bearing on the question of the low representation of women in philosophy. It contends that fashionable views in the profession concerning implicit bias and stereotype threat are weakly supported, that philosophers often fail to report the empirical work responsibly, and that the standards for evidence are set very low—so long as you take a certain viewpoint.


Introduction
The "tendency for researchers to fall in line with a particular point of view, especially in the more politically charged branches of science" in a way that "distorts research" and can "stifle original thinking" is a growing concern for some [1] (p. 405). Here "political correctness" will be narrowly taken as the use of political criteria in the justification of scientific claims, though to forestall confusion we need to distinguish between prescriptive and descriptive senses. The prescriptive sense is often assumed by opponents of "political correctness," as in the characterization of feminist epistemology as holding that "values should determine what theories are accepted [and that] inquiry should be politicized" [2] (pp. 12, 15). In reply, proponents have insisted that although values should (and do) inform the context of scientific discovery (e.g., what projects are worth pursuing, hypothesis formation, etc.), they are not saying politics ought to constrain reasoning about truth or warrant [3] (pp. 81-82), [4] (p. 54). However, this is not the end of the matter.
In observing that philosophers are increasingly receptive to epistemic arguments for political agendas, Antony [5] counters that some do indeed politicize epistemic norms in arguing that "variation in social position"1 can "pay epistemic dividends," improving the generation of knowledge in the sciences. According to Antony, Longino contends that demographic diversity is important for both discovery (in light of "a greater diversity of ideas") and justification (as it will "increase the rigor of scientific review") [5] (p. 168). While it is platitudinous to say that good scientific practice is social in the sense of depending on communities of knowers offering multiple perspectives, why think demographic diversity will tend to improve epistemic diversity? Antony not only finds this claim highly dubious, but also objects that Longino fails to provide any arguments [5] (p. 170). Antony recommends that philosophers committed to "progressive programs" including "democratizing science and the academy" should instead attend to "another kind of epistemic argument for pursuing [demographic] diversity," [5] (p. 171) one that does not depend on risky claims about its epistemic value. This strategy turns to implicit bias research, which indicates we "flout our avowed criteria" for admission to "knowledge institutions" and "thus learn empirically that our current gatekeeping policies ... are, very likely ... at the expense of more able members of socially subordinate groups" [5] (p. 172). My interest lies not in litigating the dispute about prescriptive correctness but strictly in the descriptive sense. Are political considerations, nonetheless, constraining reasoning about implicit bias and stereotype threat within the context of justification? There are grounds for suspecting that this happens all the time with philosophers writing about implicit bias.
That philosophy is an outlier in the humanities regarding the underrepresentation of women has been the occasion for a lot of discussion about possible effects of subtle forms of prejudice, including implicit bias and stereotype threat. Implicit associations tend to be involuntary and unconscious and can diverge from a person's declared beliefs, affecting our actions, judgments, and attitudes. Implicit bias might influence how we treat junior colleagues from socially stigmatized groups when it comes to sharing opportunities for professional development and advancement, and in evaluating scholarly potential and credentials. For example, a departmental committee might implicitly prefer a male candidate to a female candidate with the same qualifications despite holding conscious and explicit attitudes about the equality of the sexes. Meanwhile, stereotype threat occurs when awareness of and identification with stereotypes (such as that philosophy is for white males) results in heightened anxiety, performance disparities, and reduced interest.
These ideas have become familiar to the philosophical community, which has responded by implementing policy initiatives and other measures for improving demographic diversity, such as making syllabi and conference line-ups more inclusive, adjusting the management of professional organizations, and reforming journal and hiring practices. However, the philosophers promoting these agendas could do much better at representing the state of the empirical work. In fact, leading hypotheses about implicit bias in philosophy are not at all well supported. The following sections will critically appraise reasons often given for thinking that implicit bias and stereotype threat are major factors driving outcomes in philosophy. Space doesn't permit a comprehensive review of the entire body of literature in social psychology, but fresh insights about prominent examples, some old, others more recent, will be offered.

Concern about Bias in Philosophy
The attention this topic draws is illustrated by departmental websites, which now often include a statement about climate issues. A typical example is the University of Washington, which warns against the effects of implicit bias, microaggressions, and stereotype threat [6]. It also refers the reader to works at the Bias Project [7],2 which lists dozens of affiliated philosophers and psychologists at major universities [8] who, though not necessarily agreeing that implicit bias explains underrepresentation, overwhelmingly tend to agree that implicit bias is real and has important real-world effects; this is at least suggestive of an expectation of effects in the profession. The APA, the CPA, Colorado, Sheffield, Rutgers, and others maintain webpages with similar content. There are a growing number of initiatives that take reducing the real harm of implicit bias as their starting point, such as the "policies aimed at overcoming implicit bias and stereotype threat" at Trent University [9]. Similarly, at the University of California, strategies to "improve the situation for women in philosophy, in the face of the problems of implicit bias and stereotype threat" have "influenced the pedagogical training of PhD students" [9]. It therefore seems reasonable to infer that many philosophers are taking these ideas very seriously. It is also important to note that philosophers' beliefs on this matter might often be held informally and can be influential despite not always being expressed or defended in published work, appearing instead on syllabi and social media, for instance.3 These websites are the public face of departments that are lending their authority to questions of scientific dispute, and the policy changes mentioned above often go beyond the mere sharing of information for interest's sake. In the case of Rutgers the reader learns the Department believes in "making information on implicit bias and stereotype threat" as well as "how to counteract them . . . publicly accessible," as this is "crucial" to "developing an excellent professional philosophical climate for all philosophers." This information is not presented in such a way as to merely generate interest, as it fails to draw attention to critiques and weaknesses in the implicit bias literature, and tends towards advocacy, as in the normative conclusion that it is "crucial" for us to "counteract" these implicit biases. Of course, perhaps philosophers' beliefs are empirically correct and so they are right to be acting. Are they correct?
2 As of May 18 2017 these materials are offline, though many of the references can be found at the entry for Implicit Bias at The Stanford Encyclopedia of Philosophy [10].

What Does the Research Show?
One important basis for research on implicit bias is the Implicit Association Test (IAT). IAT results might seem to show that people harbor discriminatory attitudes even when they claim to be free of prejudice. For instance, subjects tend to pair black faces with negative words and white faces with positive ones. But it would be hasty to infer much from these results. I don't doubt that philosophers would exhibit implicit associations; however, some of these findings may only reflect the inductive absorption of cultural knowledge rather than agonistic attitudes [16], such as that teenagers are more rebellious, moody, and vulgar than the elderly, or that men are engineers more often than women [17,18]. Other results might have an explanation besides racism or sexism, such as that "outsiders" are implicitly associated with negative words regardless of their racial background [19]. Still other results might be due to conscious and explicit prejudices that would be embarrassing to reveal to researchers [20,21]. Further, the effects are small and the connection between negative associations and actual discriminatory behavior is, as is well known, contested.4 Though philosophers often assume implicit biases are not accessible through introspection or under conscious control,5 psychologists have largely rejected these assumptions, and now favor the view that automatic processes only tend to have a combination of features that may or may not include access and controllability [27]. Let us acknowledge these are ongoing scientific controversies, with one prominent expert going so far as to declare the IAT "completely bogus."6 However, my point is not to dismiss this research, but to motivate an attitude of wary skepticism when thinking about its application to the philosophy profession.
The Rutgers page [36] devoted to implicit bias mentions studies about recommendation letters, gender-swapping names on cvs, anonymous reviewing at an ecology journal, and blind auditions at symphony orchestras. The net impression on the reader is that bias manifests in a wide range of real-world contexts where professional competence is evaluated. However, the totality of evidential support is rather underwhelming. A telling example is Rutgers' endorsement of a discredited study about gender bias in manuscript reviewing. Though the study was initially feted by Nature as "one bright light" [37], in a retraction in 2008 Nature acknowledged that it failed to provide "compelling evidence" and stated that "we no longer stand by the statement . . . that double-blind peer review reduces bias against authors with female first names" [38] (as of May 16, 2017 the Rutgers webpage has not been revised). I will return to name-swapping experiments further on, but next consider blind auditions at symphony orchestras [39].
3 Here are five examples seeming to portray the science as settled, such as a course titled "Implicit Bias: A new (invisible) form of oppression" [11][12][13][14][15].
4 For a sense of this debate, see Oswald et al. (2015) [22] responding to Greenwald et al. (2014) [23]. See also Oswald et al. (2013) [24] responding to Greenwald et al. (2009) [25].
5 E.g., Saul writes: "Psychologists have established that these biases are not (readily) amenable to direct conscious control ... a conscious, direct effort to simply not be biased is unlikely to succeed . . ." [26] (pp. 257, 259).
In its simplicity and vividness this example motivates worries about unconscious bias for many: "Wear army boots or sneakers to blind auditions," it has long been said, so the judges will not pick up hints about a woman's identity [40]. It is also often cited by philosophers, such as Saul, who remarks that blind auditions "worked beautifully with orchestras . . . dramatically increasing their percentages of female members" [41]. But, attired in wary skepticism, I wonder: What did this study show exactly? Given the visibility it enjoys, it warrants a careful examination, and in the next section it is considered in detail.

Blind Ambition
In reviewing this work one is surprised to learn there was no direct testing of an empirical hypothesis. Instead, Goldin and Rouse [39] examined decades of employment and audition records at major orchestras. They found that during the period when candidates were required to play behind a screen concealing their identity, the proportion of women who were ultimately hired increased. But why did it increase?
While the researchers attribute 30% of the rise to the change in auditioning practices [39] (p. 738), this conclusion is speculative and, as they mention in the abstract, is advanced with reservations. As they note, in the 1960s and 1970s trade unionism led to democratization of the workplace and somewhat curtailed the conductor's tyrannical power. Among the revised auditioning practices adopted in the 1970s and 1980s was the stipulation that panels would draw on rank-and-file players. This was just one aspect of broader efforts to shift the power dynamics in the management of orchestras towards self-governance. We must therefore be wary of a post hoc fallacy. There is an alternative hypothesis.
Perhaps the improved representation of women in orchestras would have happened anyway. One reason to suspect this, as the authors acknowledge [39] (pp. 718, 723), is that women sought more training and employment opportunities in many professions throughout the 1970s and 1980s. Women didn't need blind interviews to make inroads in science, law, medicine and academia, so why should orchestras be any different?7 In addition, the uptick begins prior to the adoption of blind auditions, and by the late 1960s the process is already under way. Since screens were being adopted at the same time that orchestras were becoming more egalitarian, is it not plausible that the very changes making orchestras friendly to screens would also tend to make them friendlier towards women? To this, the authors retort that sex composition probably had little effect on the initial adoption of blind auditioning [39] (p. 723). Yet this is irrelevant because it is compatible with the post hoc alternative. Orchestras adopted screens for different reasons: some willingly, in light of more meritocratic attitudes taking root, others as a response to various kinds of activism and pressure, including lawsuits and union contracts. There is no expectation for a correlation between the initial adoption of screens and sex composition.
There are also some odd patterns in the data that ought to give one further pause. Goldin and Rouse found that women's chances improved from preliminaries (37% of candidates) to finals (43% of candidates) even when the auditions were not blind [39] (p. 725). Also noteworthy is their share of the candidate pool (only 33% female in the 1970s up to 39% in the 1990s), suggesting that women perhaps enjoyed an advantage in the non-blind context. Indeed, women seemed to fare worse behind the screen, leading Goldin and Rouse to propose that blind auditions attracted female candidates of lower quality [39] (p. 726). When the data are restricted to candidates who performed under both conditions, the effect disappears. In this case the success rate is "almost always" higher with the screen when it comes to being advanced in preliminaries, finals, and hiring. On the other hand, while this might ensure the quality of women performers is held constant, there is no reason to assume that the quality of the judges did not vary. It is not just candidate quality, but standards for assessing quality that must be held constant. This is because auditions without a screen were more likely to be administered by judges who were explicitly prejudiced against women, since these took place in orchestras most resistant to social change. In any case, since the variable is uncontrolled, the result cannot be trusted.
7 This parallels the reasoning that led Nature to withdraw its endorsement of the study about gender bias in manuscript reviewing.
Their data also contains the "anomalous result" [39] (p. 727) that the screen once again appears to work sharply against women in semifinals. Goldin and Rouse attempt to dismiss this paradoxical finding as an artifact of the small size of the sample for non-blind preliminaries with semifinal rounds (there were only three such preliminaries, comprising 23 distinct auditions). However, the reader should consider that they also excluded audition data in which only men or only women competed. As a result many candidates who were left out of the data set from certain preliminaries reappeared when they competed later in a mixed-sex semifinal round; this presumably explains why there were over 200 separate auditions in the crucial semifinal rounds (comprising 89 blind and 25 non-blind audiences) where the anomaly actually occurs. It is, in other words, not so easily dismissed.
Despite these difficulties, Goldin and Rouse ultimately conclude that women improved their chances while playing behind a screen [39] (p. 726). What are we to make of this claim? Certainly, it flatters common sense that blind auditioning would remove possible sources of bias, and the required effort is meager. This is all well and good. Nevertheless, we ought to be circumspect in the absence of rigorous and replicated hypothesis testing. All in all, there is a great deal of variation and the evidence is hard to square with any great confidence that a woman's chances of being hired improved from about 23.5 to 30%. To their credit, Goldin and Rouse acknowledge that some of their data "do not pass standard tests of statistical significance" [39] (p. 737), conceding the effect is literally "nil" when semifinals are included [39] (p. 734)!
In short, we have a hiring audit that is compatible with a hypothesis about implicit bias, but did not examine cognitive mechanisms, and provides ambiguous correlative evidence (but only if semifinals are excluded). Certainly there are grounds for further study: for instance, do members of hiring committees implicitly associate female musicians with lesser musical ability? Little work seems to have been done, though at least one experimental study suggests awareness of gender makes no difference to assessments of musical performance [42] (p. 76). Clearly we ought to reserve judgment about the efficacy of blind auditions. This skepticism ought to be magnified when it comes to applying this very tentative and conjectural research out of context to publishing and hiring in philosophy where, nonetheless, hyperbolic claims about bias in hiring and manuscript review are routine and often accompanied by calls for double, or even triple blind reviewing, with outright quotas sometimes mooted.8
Next, how have philosophers and others responded to the audition study? Besides Saul's unjustified contention that screens "worked beautifully," the Rutgers website suggests implicit bias works against women musicians and that this has some sinister relevance to philosophy. University press offices, news services, and hucksters9 can take some of the blame for uncritically promoting the idea that blind auditions "boost the chances" [45]. But philosophers strain their credibility when they fail to get basic facts right, such as Haslanger, who confuses Goldin and Rouse's investigation with name-swapping experiments on "identical term papers, CVs and the like" [46].10 Saul and Haslanger are especially relevant because their articles are credited with having "brought the matter of implicit bias to the attention of the philosophical community" [47] (p. 206).
9 The author shares the view expressed on Alternet that Malcolm Gladwell is America's most successful propagandist and corporate shill.

Philosophers on Bias in Hiring
Granted, little can be inferred from the blind-audition study, so let us next consider what other data philosophers have relied upon. Several think implicit bias is real, in the sense of having measurable effects on real-world outcomes, including that "many pernicious and ubiquitous forms of prejudice are perpetuated because of people's implicit biases" [48] (p. 629). Next, special attention will be given to what philosophers have said about cv biases and, especially, tenure-track hiring in academia and philosophy.
On this score, first consider this general claim from Saul:
These are unconscious, automatic tendencies to associate certain traits with members of particular social groups, in ways that lead to some very disturbing errors: we tend to judge members of stigmatized groups more negatively, in a whole host of ways [41] (p. 244).
She offers the example of cv bias, where "The right name makes the reader rate one as more likely to be interviewed, more likely to be hired . . . and a better prospect for mentoring" [41] (p. 244). It is "almost certain" this "unfairness" also occurs "within philosophy", and so "there are almost certain to be some excellent students receiving lower marks and less encouragement than they should; some excellent philosophers not getting the jobs they should get" [41] (p. 246), and "Philosophy as a field is the worse for this" [41] (p. 247). For support she offers a few name-swapping studies [41] (p. 244, fn. 4), most notably by Moss-Racusin et al. [44] and Steinpreis et al. [49], which took place in artificial contexts and only in the latter case concerned an academic position, though in a different field: psychology. Curiously, Steinpreis et al. found no gendered effects for tenure decisions, but perhaps most importantly, the sample sizes of around 65 respondents for the female-cv and 60 for the male were not huge; the discrepancy in responses is just at the fringes of the margin of error.11 Steinpreis et al.'s results have not been corroborated by STEM hiring audits [77], nor replicated in experiments performed by Bertrand and Mullainathan [63] and Williams and Ceci [51]. Despite this, philosophers overlook these limitations and go on to make exaggerated claims about hiring biases. Given the zeal with which departments are promoting diversity initiatives, it is not hard to explain why this is happening. Elsewhere Saul writes that "[a]cademics are clearly affected" by their implicit biases [52] (p. 41), though this remark is only supported by the discredited study of blind review mentioned earlier. Saul continues: "Both stereotype and implicit bias may have strong effects on a woman's performance" [52] (p. 45) on the philosophy job market. This is because:
CVs with women's names are likely to be seen as less good than CVs with men's names. As we have noted, letters of recommendation are likely to be weaker for women than for men. And women may well have had more trouble than men at getting publications. Women will also very likely face stereotype threat, often in the form of an overwhelmingly (or wholly) male team of interviewers adding to the stress of the already hideously stressful interview process [52] (p. 45).
Saul also writes as if it is "well-established that the presence of a male or female name on a CV has a strong effect on how that CV is evaluated," [52] (p. 41) though with support only from the two previously mentioned works by Moss-Racusin et al. [44] and Steinpreis et al. [49].
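As a rough check on the earlier point about sample sizes of around 65 and 60 respondents, the following Python sketch computes a normal-approximation margin of error for a single proportion and for the difference between two proportions. The function names, the worst-case proportion of 0.5, and the 95% level (z = 1.96) are illustrative assumptions; none of them is taken from Steinpreis et al. [49], and the sketch does not reproduce their analysis.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate interval for one proportion.

    Assumes a simple random sample and the normal approximation;
    p=0.5 is the worst case (widest interval), z=1.96 is roughly 95%.
    """
    return z * math.sqrt(p * (1 - p) / n)


def margin_of_difference(n1, n2, p1=0.5, p2=0.5, z=1.96):
    """Half-width of an approximate interval for p1 - p2 (independent groups)."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


if __name__ == "__main__":
    # Group sizes as described in the text (approximate).
    n_female_cv, n_male_cv = 65, 60
    print(f"female-cv group: +/- {margin_of_error(n_female_cv):.3f}")
    print(f"male-cv group:   +/- {margin_of_error(n_male_cv):.3f}")
    print(f"difference:      +/- {margin_of_difference(n_female_cv, n_male_cv):.3f}")
```

Under these assumptions, two observed proportions would need to differ by roughly 17 to 18 percentage points before the gap clearly exceeds sampling noise, which illustrates why groups of this size leave modest discrepancies hard to interpret.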
Saul is joined by Antony, who also claims that women are likely not taken as seriously in philosophy, are at a disadvantage in hiring competitions, and are left out of special opportunities for early-career publishing. Antony and Cudd have written about a mentoring project designed to "focus on the specific problems of implicit bias, sexual harassment and underrepresentation of women in philosophy" [53] (p. 462) by supporting efforts to neutralize (among other things) presumed "forces of implicit bias" [53] (p. 461). In this case there is no scientific evidence presented. No stand is taken here on the legitimacy of mentoring programs, and nothing I have written is intended to minimize the impact of conscious and explicit bias, such as sexual harassment (which anecdotally strikes me as a very serious problem at certain elite programs). I am only questioning the inclusion of implicit bias as part of the explanation of underrepresentation in the absence of compelling empirical evidence.
Nor is this to deny there is nuance in their views: Antony, for instance, doesn't say implicit bias is the only factor (indeed, elsewhere she invokes a "perfect storm" of vectors) [54]. All the same, she also finds the evidence supports a normative recommendation: she advocates (and acts on) steps to counteract implicit bias, for otherwise "Women academics are very likely to have their work systemically undervalued" as "multiple studies have shown that academic papers and professional CVs are rated more highly if they carry a man's name" [54] (p. 236). However, only one study is cited: once again it is Steinpreis et al. [49].
Brennan likewise regards implicit bias as one of a suite of factors with a real-world impact. She suggests "The most compelling explanations of the situation of women which do not focus on intentional wrongdoing, such as harassment and deliberate discrimination, look to the twin-causes of implicit bias and micro-inequalities" [55] (p. 183), as these help explain failure to achieve "equity in hiring" at Canadian philosophy departments [55] (p. 184). Micro-inequalities are described as "unconscious" or "often unintentional" prejudicial behaviors with "wide-ranging effects," [55] (pp. 184-185) yet no scientific literature is provided. Later Brennan proposes unconscious "bias demonstrated in various CV evaluation studies" among other "mistakes in our thinking" are "something we all do," though once again scientific citations are absent [55] (p. 190).
Meanwhile Holroyd reminds "hiring committees" of their "epistemic responsibilities, which include familiarizing themselves with that body of knowledge" such as "differential evaluations of the same cv" since "recent studies indicate" a tendency towards the delusion that we are "immune to bias" [56] (pp. 512-514, 519, 521). Yet she offers only one study about hiring, by Dovidio and Gaertner. Holroyd (with some qualifications) seems to endorse Washington and Kelly's view that "gatekeepers" are "culpably ignorant" if they fail to agree: "they can be held responsible (and perhaps blamed) for not knowing what they should - given the availability of the relevant information, and the role they occupy, be aware of" [56] (pp. 517, 522); see also [57]. Let us therefore take a moment to consider what Dovidio and Gaertner's findings were [58].
Undergraduates were asked to evaluate candidates for a peer-counseling program, and it was found they sometimes preferred a member of a mostly white fraternity to someone from the Black Students Union, but this was only when the qualifications were ambiguous. Complicating matters, subjects were twice as likely to select the BSU member when candidates' qualifications were clearly weak, and slightly more likely to do so when qualifications were clearly strong. These results were consistent over two time periods (the late 1980s and 1990s) for which the researchers also found a decline in overt expressions of prejudice. They proposed subjects in the later time period exhibited implicit racial bias despite attempting to maintain a non-prejudiced self-image. Once again, this ignores other plausible mechanisms, and in any case nothing here is suggestive of implicit bias in the philosophy profession: how often are candidates with less than outstanding qualifications in the running for a tenure-track job?
The researchers suggest ambiguous qualifications might be interpreted differently depending on the candidates' perceived ethnicity, yet even this might be explained by explicit stereotypes and conscious prejudice, such as stereotypes deeming whites disciplined and blacks undisciplined (only the strong candidates were described as "disciplined"), or even by other background knowledge: perhaps the fraternity (which was on campus) had a good reputation and the BSU did not. Certainly subjects were more likely to either be friends with, or even belong to, the former than the latter (raising the possibility of an in-group effect that is not necessarily racial). In any case, no evidence is given for ruling out the possibility that some subjects held explicit prejudices, but knew it would be embarrassing or "politically incorrect" to reveal as much. Dovidio and Gaertner even raise possible "cautiousness by whites about being too negative in evaluations of blacks (and thus appearing biased)" [58] (p. 318). Hence, Holroyd errs in drawing such definitive conclusions about philosophers on the basis of a single questionable study about undergraduates making hypothetical judgments in a non-scholarly, non-professional context. Notwithstanding these difficulties, others, such as Levy, seem to uncritically recycle Holroyd's analysis of the Dovidio and Gaertner study [59] as support for thinking that "implicit attitudes make a difference" for such things as "being hired" [59] (p. 803).
Elsewhere, Holroyd repeats the claim that evaluations are "less positive" when a cv "bears a woman's name rather than a man's" [60] (p. 276). This time she cites no primary literature, though she includes Valian's Why so slow? The advancement of women [61], which in turn cites only two dated works: a cv study in psychology from 1975 and a meta-analysis by Olian, Schwab and Haberfeld [62]. Looking further, "The phenomenon of overrating men and underrating women appears to be widespread," Valian writes in 1998 [61] (p. 128) despite Olian et al.'s statement (in their abstract) that "this effect was not consistent and accounted for only 4% of the variance in hiring recommendations," before concluding "with some methodological reservations" that the overall effect is "marginal" [62]. Perhaps slight effects can snowball into larger outcomes, though randomness and defects in experimental design can also account for marginal findings.
Some, such as Haslanger, may seem to commit themselves only to the weaker claim that "evaluation bias" is merely "potentially" relevant, though at every career stage, including "applications for jobs" [46] (p. 213), and when a cv is "'read' as inferior" [46] (p. 214). Yet as she finds gender-swapping experiments "plausibly" showing we indeed do "interpret information . . . differently" depending on perceived gender [46] (p. 213), it seems reasonable to upgrade her view to something like a high expectation. Only two studies (besides the one about blind auditions) are mentioned [46] (p. 213): once again, Steinpreis et al. [49] along with a callback study for non-academic jobs conducted by Bertrand and Mullainathan [63] about biases against applicants with "black-sounding" names.12 One might think that if study-A [49] shows that applicants with female names are rated lower, and study-B [63] shows that "black" names are rated lower, together they must indicate that female and black-sounding names are rated lower. Yet while the second study indeed found applicants with "black" names got fewer callbacks, it also found female names got slightly more! Notwithstanding these confusions Haslanger's paper is "influential" [64] (p. 1), prompting some to wonder whether women should abandon philosophy for fields that "value our work properly" [65] (p. 22).
While Haslanger does not claim implicit bias explains everything, or that it is the only factor, or even the primary factor, she has in some sense increased her confidence, more recently writing that she is "convinced" implicit bias "plays a role" in the explanation of persistent inequality [66] (p. 1). This is compatible with her claim that implicit bias is a consequence of deeper social factors [66] (p. 12). Haslanger writes, "there is empirical evidence to support the claim that we are all biased" [66] (p. 12), yet no experimental literature is provided, and her example about hiring [66] (p. 3) suggests that she continues to be unaware of Bertrand and Mullainathan's [63] mixed findings. It appears that her general outlook is becoming orthodox. Evidentiary standards, already low, are giving way to a default politicized "liberation" ideology founded on the assumption that implicit bias is real and well-substantiated, and interest is shifting towards abstract musings on how to best understand it and what the normative implications are.13
Despite these weaknesses in empirical support, cv biases (among other examples) are being taken as so conclusively established as to make a firm basis for reflections on our blameworthiness in the "rapidly growing" [59] (p. 819) literature on implicit bias and epistemic responsibility (see also Levy [68]). A different lesson that could be drawn about epistemic responsibility is that philosophers ought to quit making claims about implicit bias that far outstrip the evidence. Though the focus has only been on evaluative bias in hiring contexts, these philosophers have diminished credibility when they turn to other examples.

Other Evidence Concerning Implicit Bias in Academia
What about other recent experimental work on judgments of professional competence in academia? Certainly, there is some evidence women's credentials are sometimes evaluated more harshly in STEM disciplines. 14 But others failed to replicate at least some findings [69], with some contending a systematic look at twenty years of evidence shows no signs of discrimination in publishing, grant competitions, or hiring in the sciences [70]. These results have been reinforced by an experimental study about the evaluation of hypothetical tenure track candidates, which found no anti-female bias across several STEM disciplines but turned up strong anti-male bias by a factor of 2 [51]. Certainly Ceci and Williams' research is also controversial, 15 and there is no assumption here that it should be accorded more weight than any other study, though we must consider whether some of the negative responses they have received are motivated by hostility to their findings; the reader is challenged to find even one favorable mention of their work in a mainstream philosophy journal. 16
There has been little direct work in the humanities. One exception is a widely reported recent study by Milkman et al. [72] purporting to show that prospective graduate students were more likely to get a reply if their email used a stereotypically "white sounding" name. Though this work seems to have gone unnoticed by philosophers, certainly they often take claims about racial bias and callbacks at face value [73][74][75]. And yet on the assumption that the privileges of race and gender tend to increase callbacks, what are we to make of their finding that Hispanic women were favored as much as white males? The researchers also found there was no effect within any discipline if the request was urgent; this is a little curious assuming that conscious and deliberative judgments would occur less often in such cases and that implicit biases would be more likely to affect snap judgments. But for the sake of argument let us be charitable and take their results at face value. Even so, the study's support for bias in philosophy is basically zero. This is because almost half the evidence indicating favoritism towards white men was attributable to faculty in Business and Education, with Human and Health Services, and Health Sciences accounting for the vast majority of the rest. 17 There was little evidence of prejudice in the hard sciences, even less in the Social Sciences, and essentially none at all in the Humanities (and even reverse discrimination in the Fine Arts) [76]. Thus, Milkman et al. [72] report no significant evidence of a "temporal discrimination effect" for underrepresented groups in Arts and Sciences. Indeed, A&S appears to be mostly welcoming regardless of a person's background, though I acknowledge, again attired in skepticism, that this is only one possibly flawed study. However, when a study fails to find evidence of disproportionate treatment, the default supposition is not that implicit bias exists and is in hiding. The burden of argument is not on the skeptic. But let us keep looking.
13 See Benétreau-Dupin and Beaulac [67] for some further criticism of Haslanger.
14 Most notably, Steinpreis et al. [49] and Moss-Racusin et al. [44].
15 The reaction in the philosophical blogosphere was mostly negative and dismissive. Williams responds to many alleged weaknesses in their method and other criticisms in a series of blog posts [50].
16 Though see Sesardic and de Clercq [71]. Their article is cogently argued, carefully researched, and worthy of philosophers' attention despite appearing in the obscure organ of a rightwing think-tank. Their vehicle of publication perhaps says more about editors and referees at philosophy journals than the authors' political orientation (the frosty reception of the present manuscript by reviewers at various other venues, sometimes rising even to sexist ad hominem, has been instructive to this author).
17 Perhaps I am stereotyping, but I would not be surprised to find explicit prejudices in business schools. Meanwhile, in at least two of the other fields exhibiting temporal discrimination (health and education) men are so underrepresented that this might be a factor in recruitment.

Stereotype Threat and Curriculum Inclusivity
Another possible factor explaining the low representation of women is the existence of harmful stereotypes, which might inhibit academic performance and reduce interest. However, this research is even more controversial than research on IAT tasks [78,79], and the experimental studies beginning to trickle out [80][81][82][83] offer little to no evidence of climate issues in introductory philosophy courses, where "the only statistically significant" decline in women's participation occurs (from 43% in introductory courses to 35% of majors) [80] (p.2). Thompson et al. [80] asked introductory classes over fifty climate-related questions (let us note that it is not obvious which questions and answers ought to count as pertaining to climate-related bias), and their results led them to propose that philosophers should consider changing their course methods and content to better fit women's preferences, such as by deemphasizing thought experiments [80] (pp.18-19, 21, 26). Oddly, they also recommended including more science-related readings despite their finding that women cared little (and much less than men) about curricular changes that would incorporate more scientific approaches, and showed a significantly stronger preference for "non-philosophical texts" such as literature and newspapers [80] (pp.26, 28). Notably, both males and females agreed strongly that students were treated with equal respect, regardless of race or gender [80] (p.24).
Thompson et al. also found that women were not as excited about philosophy as men: they enjoyed the class less and found it less interesting and less relevant; they also didn't think they'd be as good at it, and were less likely to think they had a lot in common with the typical philosophy major. But are they less interested because they think they aren't as good or don't fit in, or do they think they aren't as good or don't fit in because interest is an important driver when it comes to performance and satisfaction? Thompson et al.'s study cannot answer this question, and errs in assuming that philosophy has a burden to transform into something more non-philosophical just because some students might find it more enjoyable. Meanwhile, a recent University of Sydney study "found no statistically significant evidence to support the classroom effect hypothesis" [81] (p.469) in an introductory philosophy course. Another noteworthy finding of Thompson et al. is that students who are not white men are more likely to perceive philosophy as impractical when it comes to obtaining a good job. Although a brief presentation about philosophy's utility seemed to improve this perception, it did not change the frequency with which women and visible minorities joined the major [80] (p.18). Some might find this discouraging. But while it is often assumed there should be efforts at changing stereotypes, there is a degree of naiveté in the suggestion. Why ought perceptions of stereotypes be changed if they are correct (such as that philosophy is probably far from an optimal major when it comes to finding a secure career)?
18 This brings me to what I will call the Prudential Reasoning hypothesis: perhaps men enjoying a high socioeconomic status (SES) are more likely to take risks with their future careers because they are more confident that everything will turn out all right. Meanwhile, perhaps women and lower-SES men have a more practical outlook and seek educational opportunities accordingly. In other words, if the perceptions of women and lower-SES men about the practical value of a philosophy degree are right, this turns the question of poor representation upside-down. Why is there such a surplus of high-SES individuals? Perhaps high-SES men tend to be worse at prudential reasoning. Or perhaps they are more likely to believe that what they study doesn't matter, since their career prospects depend on other things, such as social networking, racial or gender privilege, etc. Or perhaps it has nothing to do with prudential reasoning and their interests or abilities differ for some other reason. I am not committed to the prudential reasoning hypothesis and only raise it to encourage the reader to consider further factors that might contribute to gender disparities.
19 Another way to counter possible negative stereotypes is curriculum reform, which may help foster women's interest and sense of belonging in philosophy classrooms [86,87]. The most elaborate defense of this strategy I could find is Norlock's, who writes that "empirical evidence" attests to the benefits of more "inclusive curricula including diverse offerings on syllabi" [88] (p.348). Norlock recommends "the deliberate inclusion of perspectives of women and nonwhite scholars at those points in a course curriculum at which such perspectives are traditionally marginalized or overlooked", especially "the pointed effort to include women on introductory syllabi" [88] (p.354). The benefits of inclusive curricula, she argues, are "repeatedly verified" when it comes to student engagement, learning outcomes and critical thinking. This is strongly supported, according to Norlock, by empirical research in STEM and business-related fields. Though she concedes there is no direct evidence for philosophy, "reasonable inferences" extend these results, since literally "every source to which I turn cites the affective benefits" [88] (p.361, n. 37). However, in reviewing the dozen or so studies she provides, I found that few even spoke to the issue of inclusive curricula, let alone offered evidence. For example, a paper by Tonso says literally nothing about curricula [89]. Another by Anderson [90] shares Norlock's outlook but offers no empirical support. Still another conducted an "explanatory survey" of MBA students on gender issues [91]. The most relevant result here was that a majority of the 18% self-selected respondents agreed with the anodyne proposal that "inclusion of both female and male perspectives would positively affect learning" [91] (p.163). A 300-page tome by something called the Committee on Science, Engineering, and Public Policy mentions curriculum a handful of times, but offers no new data and indeed says exactly nothing about the inclusivity of syllabi [92]; the same goes for a paper by Lawal [93]. Mills and Ayre describe efforts at making the engineering curriculum at an Australian school more inclusive, but caution that no conclusions can be drawn about its effectiveness in promoting retention or success [94]. While it is possible I somehow overlooked something, study after study turned up no scientific basis for Norlock's assertions on this matter. 20 Some of these works discuss stereotype threat, but they offer little to no empirical evidence for the benefits of inclusive syllabi. There has been a small amount of research on this issue in philosophy, though so far the results have been sharply negative. A recent field experiment at Georgia State found that requiring at least 20% female authors on syllabi had no effect on the recruitment of new majors [80] (pp.15, 20). Perhaps Georgia State will try again, though I wonder if students pay much attention to such things and, to the extent that they do, how much weight they are given in deliberations about what subjects are worth devoting several years of one's life to.
There are also general difficulties for claims about stereotype threat in philosophy. Stereotype threat is supposed to involve explicit awareness of and identification with stereotypes specific to a given domain and associated anxiety for those who do not fit in. It also is presumed to cause performance disparities. Yet none of this seems to be true of philosophy undergraduates. Newcomers to philosophy don't seem to have stereotypes comparable to other subjects such as math. There is no evidence of performance disparities, especially in the crucial earliest stages of the career pipeline. Some studies even suggest males can be disadvantaged on high school and college entrance exams, including those testing for philosophical acumen [96,97].
Others, echoing Norlock, call for syllabus reform in light of the "large body of research" indicating it is "vital to the process of inviting female students to the study and profession of philosophy" [98] (p.156). Yet Walker only mentions two or three studies of dubious scientific value, including Margolis and Romero [99], whose method was restricted to qualitative interviews of a few sociology graduate students, and works by Hall and Sandler, who are known for popularizing the phrase "chilly climate." Less well known is that "they did not collect any data" [100] (p.30). Once again we are denied any substantial body of evidence for believing that diversifying syllabi would be likely to increase the participation of women.
To be sure, the fact that real-world studies tend to be messy and controlled experiments artificial doesn't mean that each may be dismissed, albeit for different reasons. We must think of research in terms of an overall aggregate in which the limitations of one approach are compensated for by another type. I have only scrutinized a few individual studies, and I acknowledge there may be other works that better substantiate the claims the philosophers are making. However, if this is so, they should produce them. Nor can appeals to aggregates be used simply to dismiss criticism of individual studies. Philosophers are invited to show exactly how the experimental studies compensate for the hiring audits, and vice versa. For example, in the case of blind auditions, experiments could be devised, though as noted, the one study that essentially did so failed to find any gender bias in the non-blind context.
Instead of relying on anecdotes, intuitions, pop-science, and the selective and uncritical employment of contested empirical literature, I have taken a different approach. The reader has been alerted to ongoing debates between psychologists about the significance of IAT tasks and the evidence for Stereotype Threat. Commonly cited claims about blind refereeing, auditions, returned calls to prospective graduate students, and the purported efficacy of syllabus reform have been scrutinized. In every case we have found the evidence for disadvantage wanting. This analysis contrasts with the confidence of some philosophers who place their trust in rather shaky findings and seem unaware of even social psychology's well-publicized internal difficulties, such as the debate about its "replication crisis" [101][102][103]. It is clear that the dominant narrative about implicit bias is overdue for a reassessment.
The evidence for thinking implicit bias explains patterns of underrepresentation in philosophy is undeniably lackluster, yet despite this, leading figures have consistently and overwhelmingly misrepresented key findings and made grossly unsubstantiated claims. Though my language is strong, I feel it is necessary to call attention to a pattern of presenting marginal results in prominent journals, conferences, departmental websites, and professional blogs as settled science, while clear misses and alternative hypotheses are routinely ignored, and reporting is insufficiently critical. This is not necessarily to say that bias is nonexistent in philosophy, or that women are favored, e.g., on the job market. My claim here is only that the basis for thinking implicit bias is an important factor when it comes to explaining the low representation of women (and perhaps other underrepresented groups) is unimpressive.
Finally, what does this have to do with political correctness? Philosophers have been seeking epistemic grounds for political aims, with Antony making a clear and explicit statement about the most promising avenue: implicit bias [5]. There is nothing wrong with appealing to scientific findings, but the concern of Haack and others about the danger of political values constraining the context of justification with a net loss in objectivity seems highly plausible. Underdetermined theories, and perhaps even false theories, are being accepted as true when we should at least suspend judgment. Indeed, this is the case with Antony herself when she uncritically relies on two now familiar, though questionable, studies in order to make quite sweeping claims about hiring biases [5] (p.158). She also joins those who erroneously treat Goldin and Rouse's correlational hiring audit as if it were authoritative on the matter of the "power" of blind auditions [5] (p.166). To my knowledge, there are no examples of philosophers indicating awareness of even Goldin and Rouse's self-acknowledged limitations.
Perhaps it will be objected that I am not in a position to determine what cognitive biases or ideological frameworks are influencing others. I am certainly not claiming there is an intentional effort to undermine attempts at an objective assessment of the science. However, there is evidence that philosophers writing about implicit bias tend to share a commitment to progressive reforms such as affirmative action initiatives [104]. Alongside this are errors and omissions otherwise difficult to explain. To give one final example, Washington and Kelly [57] (p.17) make the sole acknowledgement I could find in the main text of a recent two-volume collection that there was anything controversial about implicit bias research, though only to remark that certain unnamed disputes would be ignored. Despite a declaration about sticking to "textbook facts," they go on to overstate results on "shooter" bias and selectively report other results, such as failing to note that Bertrand and Mullainathan's [63] findings did not suggest any gender effect in hiring. Certainly this doesn't mean there is no gender effect, but surely it calls into question the replicability of claims about evaluative biases with respect to gender. Washington and Kelly also exemplify a common error, henceforth dubbed the "glass box" fallacy, which is inferring the specific character of cognitive information processing mechanisms on the basis of crude behavioral measures (which is to say they treat a "black box" as if its inner workings were transparent). They also overlook disconfirming evidence, such as a similarly run correlational study, which found no bias in hiring with respect to race [105].
21 And as we have seen, they are not unusual in this respect. Is something other than a concern with truth and warrant influencing their ideas and arguments? Although the motivational mental states of others are often hard to confirm, central players acknowledge in print that philosophers are indeed seeking empirical support for a political agenda. While it is logically possible that this plays no role in the assessment of evidence, it would be reckless to presume that this is actually the case in light of the problem of confirmation biases. When the same agenda is constantly and explicitly stated, and the errors are so egregious and always tending towards favoring that agenda, it seems reasonable to hypothesize that political bias may be exerting a distorting effect. But supposing I am mistaken, it is still reasonable to at least air the concern. The possibility that political agendas might be corrupting the review of empirical hypotheses should loom over these discussions, but doesn't. It should be on the minds of philosophers, not least those attempting to put those hypotheses into the service of politics.