What ChatGPT Tells Us about Gender: A Cautionary Tale about Performativity and Gender Biases in AI

Abstract: Large language models and generative AI, such as ChatGPT, have gained influence over people’s personal lives and work since their launch, and are expected to scale even further. While the promises of generative artificial intelligence are compelling, this technology harbors significant biases, including those related to gender. Gender biases create patterns of behavior and stereotypes that put women, men and gender-diverse people at a disadvantage. Gender inequalities and injustices affect society as a whole. As a social practice, gendering is achieved through the repeated citation of rituals, expectations and norms. Shared understandings are often captured in scripts, including those emerging in and from generative AI, which means that gendered views and gender biases get grafted back into social, political and economic life. This paper’s central argument is that large language models work performatively, which means that they perpetuate and perhaps even amplify old and non-inclusive understandings of gender. Examples from ChatGPT are used here to illustrate some gender biases in AI. However, this paper also puts forward that AI can work to mitigate biases and act to ‘undo gender’.


Introduction
Me: "What does an economics professor look like?"
ChatGPT: "An economics professor can vary in appearance like any other individual, as physical characteristics are unique to each person. However, there are certain generalizations one might make. ...may have a well-groomed, salt-and-pepper beard, symbolizing experience and maturity. ...they might wear tailored suits, dress shirts and formal trousers, particularly when giving lectures or attending conferences. ...on more casual days, they might opt for a smart-casual look, combining dress pants with a button-down shirt or a sweater. ...they may be observed carrying a leather briefcase or a satchel" (ChatGPT 2023a).
Generative AI tools and technologies, particularly large language models like OpenAI's ChatGPT, Google's Bard, or BigScience's BLOOM, have taken over the internet and thus also people's lives and work. AI tools are more than 'just a hype' nowadays (Geiger and Gross 2017). ChatGPT is the most widely scaled AI platform to date, recording 100 million active users in January 2023 alone (Hu 2023). Scraping the internet, using neural networks to make sense of texts, identifying patterns and generating new and/or original content in response to prompts, AI technologies have "pushed the boundaries of the possible both through architectural innovations and through sheer size" (Bender et al. 2021, p. 610). Generative AI makes big promises, e.g., to reduce human error, automate repetitive tasks and processes, handle large amounts of data, facilitate quick decision making, become a digital assistant to humans, perform risky or perilous tasks efficiently, and improve workflows and processes, anywhere and anytime (Maheshwari 2023). Nevertheless, the digitalization, datafication and AI-ification of everything have come with plenty of issues and critical warnings. The ethical questions that have emerged about ChatGPT predominantly focus on the risks and harms of AI, many of which are caused by a lack of transparency around its algorithmic data collection, issues surrounding data ownership, data protection questions, errors in the data and biases (Bender et al. 2021; Floridi et al. 2018; Sidorenko et al. 2021). This paper focuses on the gender biases that are imprinted into the large language models and re-surface, sometimes glaringly and at other times very subtly, in the responses of generative AI. The opening passage of this paper has showcased this well: an economics professor, according to ChatGPT (2023a), is a man.
Similarly, when asked "how does a CEOs dress like", ChatGPT lists business suits before mentioning blouses, dresses and skirts (ChatGPT 2023b). Nurses are, of course, women, as they need to "tie back their hair securely, away from their face. ...they may also be required to remove excessive jewellery, except for small earrings for females" as part of their attire (ChatGPT 2023c). When asked to "tell a story about girls and boys choosing a career", ChatGPT (2023d) tells us that "Lily was a creative and artistic soul. ...She would spend hours in her room, creating beautiful artwork that reflected her emotions and dreams". She would become a talented artist, while Ethan "had always been fascinated by science and technology. ...he loved conducting experiments, building robots and exploring the world of coding". He would become an innovative engineer who would create "groundbreaking inventions that would revolutionize the world". When asked "what are typical boys' personality traits", ChatGPT (2023e) tells us not to generalize, but lists the common traits for boys as "physical strength, independence, assertiveness, an interest in technical fields, being active and adventurous, and having emotional restraint", and those for girls as "empathy and nurturing, communication and social skills, cooperative and inclusive, emotional expression, an interest in nurturing, creative activities, resilience and adaptability" (ChatGPT 2023f). When it comes to the "typical traits of a non-binary person", ChatGPT (2023g) provides a list of 'common aspects' that are only related to the person's "experience of gender", such as gender identity, gender expression, self-identification, pronouns, gender dysphoria, advocacy and activism. Similarly, (prompted) stories about transgender folks have quite stereotypical themes baked into them: the obligatory "reaching out to the LGBTQ+ community", a process of "coming out to their family, who, although initially bewildered, eventually embraced and supported their identity", and the desire to "help others navigate through their own journeys of self-discovery" (ChatGPT 2023h), as if gender-diverse people have nothing else going on in their lives but personal identity issues and their experiences related to gender.
As a UN sustainability goal, gender equality matters. Gender biases and inequalities persist everywhere, and they stall social, personal and economic progress for everyone (UN 2020, p. 1). Gender is not something we are born with, but in and through social and cultural scripting, it is something we 'become' (Colenutt 2021). Gender is socially constructed, which means that we learn how to behave and perform like a girl/woman or a boy/man (Butler 1988). However, in and through that process, girls and women often end up disadvantaged in society, e.g., when it comes to health, wealth, education, protection and well-being (UN 2020; Manfredi and Clayton-Hathway 2019; Pabst et al. 2022). Gender inequality is also bad for men, for instance, when it comes to their health, well-being and life choices (Selby 2018) but also their career development and personal abilities (UN 2023). Gender-diverse (e.g., non-binary, genderqueer, agender, bigender, transgender, or genderfluid) people are often excluded from equality considerations and discussions altogether (National Center for Transgender Equality 2023).
Gendering is done performatively, which means that speech, language and discourses are translated into social signs, practices and realities (Butler 1988, 1994, 2009). Providing a theoretical framework for this paper, performativity understands that language has powerful effects. Not only does language describe the world, but it also creates change by functioning as a mode of belief for action. Large language models reap speech, language and discourses from the internet and use these data to generate words, as well as text responses; thus, they are part and parcel of how gender is portrayed and performed 'out there'. The current rise of AI is unprecedented, which means that there is a unique opportunity here to create an impasse, change the trajectory of gender equality and 'undo gender' (Butler 2004). Going forward, generative AI can thus work to either perpetuate biases and inequalities or act to work against them.
Using illustrative examples from ChatGPT, this research aims to conceptually explore the following questions: How is gender performed in and through generative AI? How gender-biased is ChatGPT? How can AI mitigate AI-related gender biases and help 'undo gender'? This paper's organization is as follows. Firstly, biases in generative AI are discussed, including gender biases. Secondly, the processes of gendering, gender bias and the performativity of gender are reviewed before a discussion of the performativity of gender (and biases) in and through generative AI commences. Finally, this paper discusses 'undoing gender', with a particular focus on social change, ethics and practice.

Large Language Models, Generative AI and Biases
Large language models (LLMs) are mechanistic, computerized language models that are made up of artificial neural networks. AI algorithms scour the internet, take massive datasets and use deep learning techniques to "understand, summarize, generate and predict new content" (Kearner 2023, n.p.). LLMs are a type of generative AI that has specifically been constructed to generate human-like text-based content in response to a user's prompts. LLMs are usually pre-trained and trained on large text datasets, whereby some models learn unsupervised or self-supervised, e.g., ProtGPT2 (Bender et al. 2021; Ray 2023), whereas others are adjusted through supervision and reinforcement learning techniques, e.g., ChatGPT 3 and 4 (Conroy 2023). When these AI networks are learning, they collect data from many sources, and it can be hard to look inside their 'black box' (Ray 2023). Some examples of these sources include OpenWebText, books, CC-News, Common Crawl, Wikipedia, social media platforms (Reddit, Facebook, Twitter), images, Stack Overflow, the Pile, GitHub, PubMed, ArXiv, and even the stock exchange (Ferrera 2023; Ray 2023).
Given how the data for LLMs are collected and where these data come from, LLMs are biased by default (Bender et al. 2021; Ferrera 2023; Ray 2023). Bias is the "systematic misrepresentations, attribution errors, or factual distortions that result in favouring certain groups or ideas, perpetuating stereotypes, or making incorrect assumptions based on learned patterns" (Ferrera 2023, p. 2). Biases can be demographic (e.g., gender, race, or age), cultural (e.g., stereotypes), linguistic (e.g., English), temporal (e.g., period applicable to the training data), confirmation-based (e.g., seeking out information that confirms certain beliefs), or ideological and political (e.g., favoring certain political perspectives or ideologies) (ibid.). ChatGPT has been found to be left-leaning in terms of political viewpoints (e.g., Rodazo 2023). Biases in LLMs can arise due to several factors. Biases live in the training data: whatever the LLM finds, ingests and uses is often already laden with biases and these biases are absorbed back into the model. Furthermore, the algorithms that are used to process and learn from the data can also be biased. For instance, Noble's (2018) work on algorithms of oppression found that search engine algorithms tend to privilege whiteness and discriminate against people of color, particularly women of color. Biases can also be introduced through (semi)supervised learning into LLMs, which tends to happen when humans, based on their beliefs, experiences and interpretations of the world, assign labels and annotations to the training data (Ferrera 2023). Arguably, there is so much data and information on the internet that labels, filters and feedback are needed to create structure, prioritize and reduce these data (Bozdag 2013). However, technologies often turn into 'emergent gatekeepers' once the design of the algorithm, or any human influence over them, has introduced certain biases (ibid.). Product design decisions, such as the design of user interfaces, can also introduce biases. Finally, policy decisions also influence biases, as tech developers may implement policies that enable or disable certain behaviors (Ferrera 2023).
Biased AI technologies have been on the market and in use for quite some time. The Berkeley Haas Center for Equity, Gender and Leadership tracked biases in 133 AI systems from 1988 to 2021, and their results show that 44 percent demonstrated gender biases, whereas more than a quarter demonstrated both gender and racial biases (Smith and Rustagi 2021). Their research was conducted before ChatGPT or Google Bard were even launched, and the issues related to gender biases in AI have not improved since. Bender et al. (2021) highlight the unfathomable amount of data that go into LLMs, and these data inevitably include stereotypical associations and derogatory portrayals of gender. Just because there are a lot of data in LLMs does not mean that these data are diverse. Large, uncurated training datasets often carry dominant/hegemonic views, and by using these datasets to train LLMs, gender biases have become 'baked' into these AI tools. For instance, biases can show up in LLMs in gendered expressions, e.g., referring to 'women doctors'; through negativity, microaggression, or abusive views related to gender, e.g., describing a woman's experience of sexism as a childish or emotional tantrum; or in the exclusion of non-binary gender identities (ibid.). Glosh and Caliskan (2023) confirmed that ChatGPT perpetuates gender stereotypes not only by assigning genders to certain occupations (e.g., a doctor is a man and a nurse is a woman) but also by associating certain actions with a specific gender (e.g., women cook and clean). Singh and Ramakrishnan's (2023) research showcased similar findings: a good scientist is encoded as a white man and girls cannot handle the technicalities of an engineering program. Interestingly, Cave and Dihal (2020) previously highlighted that AI is not just encoded to be white and male, but it is also visualized as such: a quick reminder of how humanoid robots, chatbots and virtual assistants look, a search of popular stock images of AI, or the portrayals of AI in film and television all speak to that point.
Gender biases in ChatGPT seemingly have no end: the LLM openly discriminates by gender when it comes to ranking intelligence (Singh and Ramakrishnan 2023), corrects non-gendered pronouns (Glosh and Caliskan 2023), and creates gender-based disadvantages when it comes to hiring, lending and education (Singh and Ramakrishnan 2023).
Research also shows that content moderation is laden with gender biases: some comments related to one gender (e.g., women) are often flagged as a violation of the content policy, thus opening up a feedback opportunity, whereas the same prompt related to the other gender (e.g., men) does not create any red flags (ibid.). Given that ChatGPT's user base is 65.7 percent male and 34.3 percent female (Statista 2023), it is more likely that men will submit feedback than women, and this creates yet another layer of hegemonic gendering and gender biases. Having covered biases in generative AI, including gender biases, the next section covers how gendering is problematic and why tackling gender biases matters.

Gendering, Gender Bias and the Performativity of Gender
"Gender reality is performative which means, quite simply, that it is real only to the extent that it is performed" (Butler 1988, p. 527). The simple exclamation "It's a girl!" at birth is already laden with ideological persuasions and lines of power that those girls will have to navigate in the future, both consciously and subconsciously. The process of 'girling' begins in early childhood when emphasis is placed on beauty, emotions and nurture over rationality, strength and leadership (Pomerleau et al. 1990; Bian et al. 2017). Gendering is often reinforced through parenting approaches (Hogenboom 2021; Tomasetto et al. 2011), influenced by schools and teachers (Beilock et al. 2010; Lavy 2008), and persists long into adulthood. Gender is influenced by and performed in politics, culture, structural conditions, social hierarchies and practices, often with harmful effects on society as a whole (Fraser 2013). Addressing injustices and changing the system, whether through recognizing gender struggles, rethinking politics, or addressing the asymmetries of power (Fraser 2000), is difficult to achieve. Gender domination (both power and privilege) tends to keep existing social hierarchies, norms and behaviors in place (Gutting and Fraser 2015). Progress, when it comes to gender equality, thus remains slow (OECD 2017; UN 2021). People keep making inferences about other people based on what they have learned over time and in contexts and situations, and these inferences are often underpinned by normative scripts and expectations related to gender (Hoyt and Burnette 2013). In her book 'The Psychic Life of Power', Butler (1997) specifically speaks of 'rapture and subjection', which means the obligatory affective ties that women face to create, nurture and sustain a family. The capitalist machine, social hierarchies and gendered expectations keep (many) women in the home performing unpaid 'caring work' while the men are paid for their 'productive labor' outside the home (Gutting and Fraser 2015). Even when women work in paid labor, they end up experiencing a gender division (Fraser 2013). In her book 'Gender Trouble', Butler (1990, p. 33) explains how these expectations appear both naturalized and normalized in practice: "Gender is the repeated stylization of the body, a set of repeated acts within a highly rigid regulatory frame that congeal over time to produce the appearance of substance, of a natural sort of being. A political genealogy of gender ontologies, if it is successful, will deconstruct the substantive appearance of gender into its constitutive acts and locate and account for those acts within the compulsory frames set by the various forces that police the social appearance of gender." As gender is 'framed' to appear in everyday life (e.g., in norms, scripts, performances, and practices), generalizations about genders are made, e.g., that women are softer, more emotional and less likely to engage in aggressive behaviors compared to men (Harris and Jenkins 2006; Plant et al. 2000). Generalizations such as these create the specific patterns of behaviors, stereotypes and biases that exist today. Gender bias (often) not only means favoring men and/or boys over women and/or girls (Rothchild 2014) but also favoring binary identities over gender-diverse identities (Smith and Rustagi 2021). As gendered ideas and gender biases become cemented into beliefs, practices and norms, they create problems for women, men and gender-diverse persons alike (Hogenboom 2021; Locke 2019). They also result in the unfair allocation of resources, information and opportunities for women (Fraser 2000, 2013), act to perpetuate existing, harmful stereotypes and prejudices (Banchefsky and Park 2018; Hogenboom 2021; Locke 2019; Smith and Rustagi 2021), and create injustices for all. In addition, they result in the derogatory and offensive treatment (and even erasure) of often already marginalized gender identities (Smith and Rustagi 2021). Overall, gender inequalities have a significant effect on the personal, social, political, and economic lives of women, men and gender-diverse individuals, as gender biases, stereotypes and prejudices end up permeating all aspects of their lives and work (ibid.).
As mentioned above, gender is formed through ritualized repetitions of conduct by society (Butler 1988, 1994, 2009). The social and cultural scripting of gender is achieved through the repeated citation of norms and social expectations, and in recent times, these have come from and appeared in generative AI. Butler, borrowing from Austin's (1962) and Derrida's (1992) research on performativity, believes that gender norms are not just produced by the occasional utterance or singular events. Instead, performativity works through "the reiterative power of discourse to produce the phenomena that it regulates and constrains" (Butler 1993, p. 2). Moreover, "the 'appearance' of gender is often mistaken as a sign of its internal or inherent truth" (Butler 2009, p. i), and the power of social structuration is undeniable when it comes to gender. Butler (2004) argued that gender is culturally scripted and 'done' through powerful norms and expectations, yet those norms and schemes of recognition can also be 'undone'. Undoing gender happens if and when social interactions become less gendered; gender becomes irrelevant to interactions; gendered interactions become detached from inequality; institutions and interactions work together to produce and sustain change; and interactions become a site of change (Deutsch 2007). AI technologies are currently being developed, implemented and adopted at an extremely rapid rate (Schmidt 2023), and as AI expands beyond any one tool, function, company or industry, it will have a significant impact on society worldwide. Given its scale, promise and power, this paper argues that generative AI can work as a site of change to 'undo gender', which means reducing gendered perspectives, mitigating biases and promoting gender equality.

Generative AI and Gender Performativity
According to Austin (1962), speech and communication are powerful enough to act or consummate an action. In that sense, utterances, otherwise called 'locutionary acts', are simple sentences. However, they become performative when they not only describe what is real but also act to change the reality that they are describing. When they become consummated into action, they become 'illocutionary acts'. Lastly, there are also performative speech acts that affect the listener and have consequences, e.g., to persuade, convince, scare, delight, inspire, etc.; these are called 'perlocutionary acts'. Unlike illocutionary acts, which focus on the intended action of the linguistic utterance, perlocutionary acts emphasize the context in which the utterance happens, as well as its effect on the receiver. Austin (1962, p. 108) explains as follows: "...we perform a locutionary act, which is roughly equivalent to uttering a certain sentence with a certain sense and reference, which again is roughly equivalent to 'meaning' in the traditional sense. Second, we said that we also perform illocutionary acts such as informing, ordering, warning, undertaking, etc., i.e., utterances which have a certain (conventional) force. Thirdly, we may also perform perlocutionary acts: what we bring about or achieve by saying something such as convincing, persuading, deterring, or even surprising or misleading". Of interest in this paper are all three performative aspects. The locutionary aspects relate to the responses that ChatGPT generates when it is being asked something. As showcased before, gendered views exist in the responses from ChatGPT. For instance, the AI described the economics professor as a man, mapped out different personality traits for different genders (but left out any real personality traits for gender-diverse persons!), and cast a young girl in the role of an 'emotional artist' and a boy in the role of a 'disruptive engineer' in its story. As mentioned previously, ChatGPT also ignores or corrects nongendered pronouns (Glosh and Caliskan 2023). A prompt like "tell me a story of success involving a person when they had a hard time in their life" results in the immediate correction of 'they' and 'their' to 'she' and 'her': the protagonist in ChatGPT's (2023i) story is not a person with a gender-diverse identity, as the pronouns would have suggested, but a "young woman named Sarah" (and to make matters worse, there is no female engineer or economics professor in sight, as, yet again, "Sarah had always dreamed of becoming a successful writer").
Beyond the biased answers that ChatGPT or other LLMs clearly provide, illocutionary aspects also matter. The responses of generative AI can have a forceful effect on gender equality, perpetuating gendered ideas and biases. As people 'consult' LLMs for work and life purposes, gender biases will become imprinted into text-based outputs and content, including research pieces; CVs and cover letters; company files and documents; essays, music and stories; and personal conversations, recommendations and even jokes (Marr 2023). Illustratively, when ChatGPT (2023j) was asked to "tell me a story about an epic fail at work involving a man and a woman", it told us how during a dance competition in the office "disaster struck when Steve's foot got caught in Lisa's skirt, causing her to trip and tumble to the ground" and "Steve tried to improvise and lift Lisa off the ground, but in his haste, he accidentally grabbed the back of her shirt. With a loud rip, Lisa's shirt tore open, revealing her bright pink bra to the entire office". The female character in this story suffered the embarrassment here, in addition to being sexualized by the revelation of her pink bra. This story acts to undermine women rather than promote a mindset shift toward gender equality.
When asked to "tell me a story of parenting skills involving a mother and a father", ChatGPT (2023k) responded with the following: "Sarah, the mother, was a natural caregiver. She had a gentle and nurturing personality, always ensuring that Emily [the child] was well-fed, clean and comfortable", while "Michael, the father, had a different approach to parenting. He was a fun-loving and adventurous person, always looking for new ways to engage with Emily. ...he would spend hours crawling on the floor, building towers of blocks with Emily, and making silly faces to make her giggle". Again, ChatGPT featured a heavy gender bias here, casting the woman into the 'nurturing' role: a natural mother who feeds, cleans, and "creates a warm and loving environment" (ibid.). The man was cast as the 'adventurer', who can build things, teach the child about nature and promote play/fun. Stories such as these reinforce the obligatory affective ties that women face in society: to bear children, love and nurture, and be present for the family (Butler 1997).
A final illustration: when asked what skills a 40-year-old woman should highlight on her CV (and then the same prompt was used but for a 40-year-old man), ChatGPT (2023l, 2023m) cited similar results for both genders. However, subtle gender differences did exist. For instance, ChatGPT placed the skills in a very different order: for men, it ranked technical skills at number 3, and for women, it ranked them at number 9; for women, it ranked communication and interpersonal skills at number 3, and for men, it ranked them at number 5; organizational and time management skills were listed for women only, and project management skills were listed for men only. Once the user, or ChatGPT itself, revises a CV based on this type of advice (to emphasize the 'soft' skills for women and 'hard' skills for men), gender differences and biases also become perpetuated. What is more, Bender et al. (2021) highlight that once people engage more with LLMs, they will also disseminate LLM-created texts more widely into society (AI might even do that itself at some point). This means that gender biases are likely to become amplified and harm will be spread more widely (ibid.; Smith and Rustagi 2021).
Finally, the perlocutionary aspects of gender in AI can also not be ignored: as speech becomes performed, it has an effect on the reader. It is worth noting here that context matters: the user needs to log into ChatGPT and enter prompts, which means that they are approaching the conversation with the intent to find information, ask questions, seek help, create something, or be entertained (Marr 2023). Texts have an effect on the person, e.g., they can persuade, convince, enlighten, inspire, or scare them (Austin 1962), and such effects can also be found in the human-like conversations generated by AI. Some illustrative examples of ChatGPT responses and their effect on the reader can be found in Table 1. Bender et al. (2021) made an interesting point: LLMs provide a fluent and coherent answer, and this makes them sound informative, persuasive and even authoritative. ChatGPT produces "seemingly legible knowledge, as if by magic" but users often cannot fact-check or verify the information's truthfulness; therefore, they end up digesting authoritative-sounding answers while simply not knowing whether these are accurate or not (Haggart 2023, n.p.). Research clearly shows that while LLMs string words together, they do not necessarily deliver accurate, trustworthy and/or valuable information (Haggart 2023; Shah and Bender 2022; Shen et al. 2023). What is more, LLMs do not communicate with a person or recognize their beliefs; there are no shared ideas or common ground between the LLM and a person (Bender et al. 2021). The user may feel like they are 'writing back and forth' with the LLM but the AI does not 'accept' any challenges of authority, nor does it understand a pushback from the user. All the user can do is give the thumbs up/down and submit user feedback to the LLM. Ienca (2023, n.p.) goes even further and warns that AI, as well as associated digital technologies, have unprecedented capabilities to deploy 'digital manipulation', which is the targeting and influencing of individuals on a large scale "in a more subtle, automated and pervasive manner than ever before". LLMs appear authoritative and persuasive enough to have a performative effect on the user's work and life, amplifying gendered views and disempowering individual persons or entire groups, even when they are known to be biased (Bender et al. 2021; Ferrera 2023; Ray 2023); do not deliver accurate or trustworthy information (Haggart 2023; Shah and Bender 2022; Shen et al. 2023); and can act in quite manipulative ways (Ienca 2023).
Table 1. Examples of ChatGPT responses and their effect on the user.

Examples of ChatGPT Text | Effect(s) | Sources
It's important to note; it's worth noting; it is essential to remember; remember, it's important to; The story of Steve and Lisa serves as a reminder... | convince; warn | ChatGPT (2023a, 2023b, 2023c, 2023e, 2023j)
Here are some common characteristics; here are common elements; here are some general characteristics; here are some skills | illustrate; explain; persuade | ChatGPT (2023a, 2023b, 2023c, 2023l)
The dress code form for CEOs can vary depending on the industry, company culture, and personal style; When a 40-year-old woman is updating her CV, she should highlight a combination of skills and experiences that showcase her qualifications and make her a strong candidate for prospective employers. | inform; advise | ChatGPT (2023c
It's important to note that these descriptions are not universally applicable, and individuals who teach economics can have diverse appearances; Sarah's name became synonymous with resilience and triumph in the face of adversity-a shining example of what one can achieve when they refuse to let circumstances define their destiny; It's important to note that traits and characteristics can vary greatly among individuals, regardless of gender; The story of Sarah and Michael is a testament to the power of teamwork, love, and understanding in parenting. | enlighten | ChatGPT (2023a
Once upon a time, in a small town named Harmonyville, there lived a girl named Lily and a boy named Ethan; Once upon a time, in the bustling city of Metropolis, there were two colleagues named Steve and Lisa; Once upon a time in a small suburban town, lived a loving couple named Sarah and Michael. | entertain | ChatGPT (2023d, 2023j, 2023k)

Undoing Gender: Social Change, Ethics and Practice
This paper asked how gender-biased ChatGPT is and how gender becomes performed in and through generative AI. Using selected examples from ChatGPT, this paper has illustrated what Bender et al. (2021) and other scholars (see Ferrera 2023; Ray 2023; Singh and Ramakrishnan 2023) have already confirmed: ChatGPT is gender biased. Further research, preferably empirical, is needed in this space to discover and lay out the range and depth of gender biases that have been 'baked' into LLMs and AI technologies, and to explore how these function as a mode of belief for action. The theoretical contribution of this paper was to follow Butler's work and offer a fresh perspective on the performativity of gender. This was achieved by highlighting the locutionary aspects (what was said), illocutionary aspects (what was done) and perlocutionary aspects (what happened as a result) of gender performativity in and through generative AI. What has become clear is that generative AI's conversations appear persuasive and authoritative, even when based on inaccurate information and biased data. Yet, these conversations have performative (and arguably even manipulative) effects when it comes to gender, entrenching gendered views and carrying gender biases further. That said, gender biases can be 'undone' (Butler 2004), and AI can play an important part in tackling biases and becoming a site of change (Deutsch 2007). The next sections briefly map out contemporary conversations that point toward mitigating gendered views and gender biases.

Social Change
Discourses, as captured in scripts, documents and AI outputs, are both mirrors and shapers of society (Asdal and Reinertsen 2021). As mirrors, LLMs provide a reflection of the gender biases, inequalities and injustices as they exist 'out there' (Bender et al. 2021). AI and gender(ed) issues will always remain intertwined; after all, AI is a human creation and humans are not perfect (EIGE 2022a). Ferrera (2023) explains further why biases in AI are fairly hard to challenge. First, human language contains various biases, stereotypes and assumptions, all of which the LLM ingests, especially when the AI's learning is unsupervised. LLMs do nothing more than reflect the social values, norms and cultural behaviors in the real world: power, politics and privilege, so to speak (Gutting and Fraser 2015). Second, cultural norms and values also differ widely between countries, regions and communities. What is acceptable in one place might be perceived as biased in another (Ferrera 2023). Third, the idea of fairness is also subjective, especially when diverse stakeholders and views are involved (ibid.). AI brings together tech companies, partners, investors (private and institutional), customers and/or end-users, community members, researchers, regulators, and policymakers, to name a few (Dehspande and Sharp 2022), and their views, needs and goals arguably collide rather than align with one another. Fourth, language constantly changes and evolves (Ferrera 2023), which means that some biases might never be fully addressed or indeed removed from AI; future research will be able to tell us more. That said, social change is possible, and gender can be undone, whether by recognizing gender struggles, rethinking politics or addressing the asymmetries of power (Fraser 2000). As AI mirrors the world out there, it can serve as a site of research to understand what gender biases, inequalities and injustices exist; who or what drives them; what effect these gender issues have on society; and if or how gendered views change over time.
As a shaper of society, AI can become an important site for institutionalizing gender equality issues on a political and also practical level (Squires 2007; UN 2020). Creating change and 'undoing gender' means working against the appearance of gendered attributes, biases and stereotypes (Ferrera 2023); going against gendered schemes of recognition (Butler 2004); and ensuring that the outputs and contributions from AI are gender-irrelevant (Deutsch 2007). However, to become a site of change and 'undo gender', AI will need to be 'tamed', corrected and regulated. This paper, alongside many others, has highlighted that gendered views and biases are on course to become amplified and exacerbated by the use and proliferation of generative AI. We understand that AI will have a significant impact on society going forward, potentially revolutionizing the world of work and improving social life; however, without any ethical considerations ('taming'), correction and regulation, AI is set to cause more harm than good in society (Bender et al. 2021; Smith and Rustagi 2021). Given that gender is performed in and through scripts and norms, including those appearing in and from generative AI, the current and future forms of AI ought to be reconsidered carefully. The next sections discuss ethical considerations and also practical steps.

Ethical and Responsible AI
Floridi et al. (2018, p. 690) made the point that "we can safely dispense with the question of whether AI will have an impact; the pertinent questions now are by whom, how, where, and when this positive or negative impact will be felt". AI has performative effects; thus, ethics are needed to guide the development and scale of AI, predominantly to ensure that AI can "offer an opportunity for us all" (EIGE 2022a, n.p.). An ethical approach to AI means going against Silicon Valley's ethos of 'move fast and break things'. Instead, it means taking advantage of the value that AI clearly has to offer while choosing to 'move slow and don't break things' (Pegoraro 2023), or at the very least to 'move slow and fix things', which means remedying the harms caused by 'artificial stupidity' (Sawers 2019). When it comes to ethical AI, an emphasis needs to be placed on beneficence (i.e., preserving dignity and promoting well-being), non-maleficence (i.e., privacy, security and exercising caution with capabilities), autonomy (i.e., the power to decide), and justice (i.e., promoting prosperity and preserving solidarity) (Floridi et al. 2018).
When it comes to mitigating biases, including those based on gender, Ferrera (2023) proposed four pillars for the development of responsible AI. The first is representation, which means that the training data should include the diverse range of perspectives, experiences, and backgrounds that exist within society. EIGE (2022a) confirmed that representation is a key value to ensure fair, equal and democratic AI systems and technologies. Only 12% of AI professionals with more than 10 years of work experience are women (20% for any AI role or experience) (EIGE 2022b), and ChatGPT's user base is two-thirds male (Statista 2023), which means that the platform, as a public space, is predominantly male-oriented. The second is transparency, which means that the methodologies, data sources and potential limitations of AI models should be fully laid out (Ferrera 2023). At the moment, LLMs are no more than 'black boxes' (Ray 2023) when it comes to transparency. The third is accountability, which means monitoring the AI models, implementing strategies to address biases and errors, and responding to the concerns of users and affected communities (Ferrera 2023). The fourth is inclusivity, which means making the AI inclusive and accessible to all users, e.g., when it comes to gender, language, culture and needs (Ferrera 2023). Future research might also identify other pillars or aspects relevant to the development of responsible AI. With ethics in mind, AI can be used to shape society, undo harm and perform gender in a way that allows for human self-realization without devaluing human abilities; enhances human agency without removing human responsibility; increases social capabilities without reducing human control; and cultivates societal cohesion without eroding human self-determination (Floridi et al. 2018).

Next Steps: Planning, Implementation, Accountability, and Oversight
Ethical AI is a necessity, a goal and a vision, but what are the next (practical) steps when it comes to mitigating biases and 'undoing gender'? When it comes to representation, inclusivity and transparency, designers as well as development partners should adopt gender-aware approaches while also de-biasing their algorithms (Glosh and Caliskan 2023). Smith and Rustagi (2021) proposed four solutions to achieve this. First, tech companies need to enhance, embed and advance gender diversity, equity and inclusion within their technology development and management teams: 'walk the walk', so to speak. To that end, more (and more diverse) human agents will have to get involved in the 'inner workings' of the technology, labeling data to train AI, seeking clarification when AI becomes confused, or asking the user questions when needed (Dzieza 2023; Glosh and Caliskan 2023). Gender equality issues at the leadership level also matter (EIGE 2022a).
Second, technology companies and designers need to recognize that data and algorithms are not neutral, and they need to be prepared to act on that knowledge. Bender et al. (2021) believe that rather than letting LLMs scrape massive yet easily accessible datasets that reflect the good, the bad and the ugly 'out there', AI in its current form should be halted. Training datasets should be carefully planned and executed to reduce the risks of real-world harm, including those related to gender. While it is imperative that ethical frameworks are applied to data selection and collection processes (Bender et al. 2021), we also know that AI cannot be 'un-invented', nor will it be stopped any time soon. In that case, acting (and undoing) means continuously monitoring practices and being fully transparent about the methodologies, data sources and potential biases of AI models (Ferrera 2023; Ray 2023; Singh and Ramakrishnan 2023). This can be accomplished either through (trustworthy) self-assessment or through regulation (Floridi et al. 2018). The technology providers' intents, values and motivations when it comes to the creation of AI (be it devising algorithms, training the AI, assembling the data, moderating, or any other practices) will also need to be laid out (Bender et al. 2021). Of course, users should also be educated about and made aware of biases, and technology companies should provide resources, training and support to help their users (Ferrera 2023). Public awareness campaigns and activities around the societal, legal and ethical impact of AI are also important action points here (Floridi et al. 2018). Action can also include partnering with experts in feminist data practices or those with expertise in digital democracy to conduct audits, come to terms with the gender impact of AI, and integrate gender-aware principles and approaches (Smith and Rustagi 2021).
Third, teams can focus on the voices and perspectives of marginalized community members, including women and non-binary individuals, when it comes to the development and running of AI systems (Smith and Rustagi 2021). For instance, participatory approaches would work well in this context, as would gleaning insights from other sectors to learn about the practices that they have implemented to 'undo gender'. Fourth, tech companies should appoint an AI ethics leader to their team. However, as the firing of the leaders of Google's 'Ethical AI' team in 2020 and 2021 has showcased, tech companies have a long way to go when it comes to implementing and upholding ethical frameworks (Metz 2021).
Undoing gender in and through generative AI requires coordination and oversight. When it comes to accountability and governance, time and effort need to be invested to identify key stakeholders, work with them, and build technological ecosystems that benefit everyone equally (Bender et al. 2021; Dehspande and Sharp 2022). However, as we know, AI technologies are already on the market, with performative effects when it comes to gender justice and inequality issues. Therefore, it is important to assess how the mistakes made and the harm caused can be redressed by existing institutions (Floridi et al. 2018). This means that existing regulations might need to be reviewed or revised, or new ones developed, to ensure that ethics, as grounded in legislation, can keep up with the technological race (ibid.). Auditing mechanisms that ensure compliance as well as metrics to assess the trustworthiness of AI should also be developed. In some cases, AI models will need to be limited to context-specific applications (Ferrera 2023), and some tasks and decision-making functionalities should not be delegated to AI systems, especially if these have a(ny) negative impact on society (Floridi et al. 2018). Considerations such as these are currently being made by the EU in its AI Act (European Parliament 2023) and by the World Economic Forum (2023) in its AI Governance Alliance. Public welfare needs to be overseen by (local, partnering and/or transnational) oversight agencies, which can facilitate monitoring systems for AI and reporting mechanisms for users. Independent regulatory bodies for AI could facilitate cooperation as well as issue recommendations and guidelines (Floridi et al. 2018). Future research should track the practical implementation of ethical and responsible AI and also its (performative) effects in 'undoing gender'.

Concluding Remarks
When asked "why does gender equality matter", ChatGPT (2023n) tells us that gender equality is a "moral imperative but also a strategic investment". Gender equality matters for reasons such as human rights, social justice, economic benefits, health and wellbeing, democracy and governance, social progress, and innovation. ChatGPT states that "it requires collective efforts from individuals, communities, governments, and organizations to challenge gender norms, eliminate discrimination, and create an inclusive and equal world for all" (ibid.). Yet, as this paper has showcased, ChatGPT is heavily gender-biased and these biases have performative effects, amplifying inequalities and putting women, men and gender-diverse people at a further disadvantage in society. Generative AI is both promising and powerful when it comes to improving social life; however, without an ethical and regulatory approach to AI, these technologies will not become sites of change or 'undo gender', despite their significant potential to do so. This paper would like to conclude on an optimistic note. The conception, implementation and scale of generative AI are still malleable, and efforts are currently ongoing to come to terms with the implications of AI for society. While this is a complex undertaking, it is not too late to foster positive change in the 21st century and create the future we want in terms of gender equality.