Article

Using AI Tools to Enhance Educational Robotics to Bridge the Gender Gap in STEM

by
Dialekti A. Voutyrakou
* and
Constantine Skordoulis
Department of Primary Education, National and Kapodistrian University of Athens, 106 76 Athens, Greece
*
Author to whom correspondence should be addressed.
Educ. Sci. 2025, 15(6), 711; https://doi.org/10.3390/educsci15060711
Submission received: 3 May 2025 / Revised: 29 May 2025 / Accepted: 3 June 2025 / Published: 6 June 2025

Abstract

Bridging the gender gap in STEM remains a critical challenge, with nearly 70% of the STEM workforce being male. Prior research suggests that integrating educational robotics (ER) into STEM curricula can boost young girls’ motivation before gender stereotypes and societal norms discourage them from pursuing STEM-related studies and career paths. The success of these activities depends on inclusive topics, materials, and teaching approaches, as a lack of diversity in these elements may lead to disengagement among young girls. To address this, educators must develop gender-neutral ER curricula and activities. However, due to the persistence of unconscious bias and gender stereotypes, determining whether an activity is truly gender-neutral can be difficult. This study explores the potential of ChatGPT 3.5, a widely used Artificial Intelligence (AI) tool, to assist educators in designing gender-neutral ER activities. Specifically, we investigate whether ChatGPT can (i) generate gender-neutral activities that serve as a foundation for educators, and (ii) identify unconscious bias, gender stereotypes, or demotivating elements (e.g., activity topics, materials) in suggested ER activities. To ensure consistency and depth in our analysis, we performed several repetitions of each prompt, examining the variations and commonalities across outputs. Our results indicate that AI tools like ChatGPT can both highlight biases in existing activities and assist in the development of more inclusive, unbiased learning experiences.

1. Introduction

The U.S. Bureau of Labor Statistics (2024) estimates that employment in science, technology, engineering, and mathematics (STEM) occupations will grow by approximately 10.8% over the next decade—about five times faster than non-STEM occupations. However, according to the UNESCO Institute for Statistics (2017), women remain underrepresented in these fields, making up less than 35% of STEM graduates worldwide. Moreover, women are less likely to pursue STEM careers, and those who do so are more likely to leave them prematurely (Mich & Ghislandi, 2019).
This gender disparity is not only a matter of representation but also has significant implications for scientific innovation, economic growth, and societal equity. According to the United Nations (2021), increasing gender diversity in STEM leads to more inclusive technological advancements and fosters economic resilience. Addressing this gap requires targeted efforts to engage young women in STEM education from an early stage.
Educational robotics (ER) has been proposed as a promising approach to introduce more young people to STEM, as it fosters creativity, curiosity, and innovation (Ferreira et al., 2019; V. Vasconcelos et al., 2023). Studies suggest that encouraging girls to participate in pre-college robotics activities can create pathways toward STEM careers, where female representation remains low (Atmatzidou & Demetriadis, 2016; Golecki et al., 2022; Dirsuweit & Gouws, 2023). Furthermore, UNESCO’s Policy Guidelines on Inclusion for Education highlight ER’s potential to promote gender balance and support inclusive education (Daniela & Lytras, 2019). Research also indicates that ER activities can enhance girls’ motivation in STEM by familiarizing them with coding and engineering concepts in an engaging way (Mauk et al., 2020; Kucuk & Sisman, 2020).
However, simply introducing ER into the curriculum is not enough. The effectiveness of ER in closing the gender gap depends on how activities are designed and implemented. Educators must understand what increases young girls’ interest in STEM and how to design activities that align with their learning styles. Research suggests that girls tend to prefer collaborative, socially engaging projects over competitive tasks (Schön et al., 2020; Dirsuweit & Gouws, 2023). Additionally, most ER equipment and themes traditionally align with interests typically associated with boys’ play, potentially alienating female students (Dirsuweit & Gouws, 2023). Effective ER activities must therefore be carefully designed, ensuring that topics, materials, and team structures are inclusive and appealing to a diverse range of learners. Studies indicate that girls are more motivated when activities emphasize real-world impact, such as the use of robotics in healthcare or education, rather than purely technical challenges (Schön et al., 2020; Sullivan & Bers, 2019).
Given these considerations, educators require practical tools to design ER scenarios that actively reduce gender bias and foster inclusivity. Despite growing interest in Artificial Intelligence (AI)-driven educational tools, there is a lack of research exploring the role of AI in supporting educators in designing gender-inclusive ER activities. This study seeks to fill that gap by investigating the potential of widely used AI tools, such as ChatGPT, to assist educators in identifying and addressing gender biases in ER. Specifically, this study explores how AI can support educators in key aspects of ER activity design, including team formation, topic selection, and material choice.
To achieve this, the study addresses the following research question:
  • Can AI tools, such as ChatGPT, develop gender-neutral ER scenarios that serve as foundations for educators?
This question explores whether AI can proactively assist educators in designing gender-neutral ER activities, providing a foundation for educators to create more inclusive educational materials. We also seek to determine the following:
  • Can AI tools, such as ChatGPT, highlight potential gender biases in ER scenarios introduced by educators?
By examining this question, the study explores whether AI can help modify activity topics to be more inclusive, encouraging students of all genders to participate in ER activities. Additionally, it investigates whether AI can suggest more gender-inclusive materials and team formation strategies to engage all students in ER.
By examining the effectiveness of AI in these areas, this study seeks to understand its potential as a resource for promoting equity in STEM education.
The following sections build upon this introduction by first presenting a literature review that examines key barriers discouraging women from pursuing STEM careers, the role of ER in fostering inclusive education, and strategies to enhance girls’ engagement in STEM through ER. Additionally, the review explores the potential of AI tools in education, particularly within STEM learning environments. The methodology section then outlines the development of the specific prompts used as inputs for ChatGPT. This is followed by an analysis of the tool’s generated responses. The results section evaluates these outputs in relation to gender inclusivity, leading to a discussion of findings and potential applications in educational settings.

2. Literature Review

Women still face numerous barriers in pursuing academic and career paths in STEM, leading to their underrepresentation in relevant fields. This gender imbalance is a multifaceted issue, with several contributing factors identified in the literature. One significant challenge is stereotype threat—the anxiety that one’s performance will be judged based on a negative stereotype. Existing biases and societal norms shape women’s confidence and interest in these traditionally male-dominated fields (Sullivan & Bers, 2016; Sullivan & Bers, 2019; Mauk et al., 2020; Avalos et al., 2024).
Research suggests that children begin developing gender-related stereotypes as early as age five and apply them to themselves and others (Sullivan & Bers, 2016). By the age of eight, they start making assumptions based on these stereotypes. For instance, they may associate trucks with masculinity because “boys like trucks” (Sullivan & Bers, 2019). During the early years of schooling, gender-based stereotypes become ingrained and normalized within the educational environment through unconscious biases (Golecki et al., 2022; Dirsuweit & Gouws, 2023). Additionally, parents’ and teachers’ perceptions and stereotypes influence children’s decisions and performance (Blackburn, 2017; Mich & Ghislandi, 2019; Schön et al., 2020).
Career choices are largely shaped by the age of thirteen (Bagattini et al., 2021), and some students decide against pursuing engineering studies as early as elementary school (i.e., by age twelve) (Golecki et al., 2022). Stereotypes and biases restrict young girls’ educational and career choices, limiting their freedom to pursue STEM-related fields (Avalos et al., 2024). S. Wang et al. (2019) emphasize that although 67% of elementary-aged children express an interest in science, the percentage of girls interested drops significantly as they enter middle school.
Beyond ingrained stereotypes, additional challenges hinder women’s participation in STEM. Many women experience lower self-efficacy, questioning their ability to succeed in these fields or balance a STEM career with their personal lives (Bagattini et al., 2021; Dirsuweit & Gouws, 2023). Research indicates that women experience the “family penalty” more acutely than men (Swafford & Anderson, 2020). Despite evidence that high self-efficacy enhances task persistence (Tosato & Banzato, 2017), studies show that girls exhibit lower self-efficacy in STEM than boys as early as elementary and secondary school, widening the confidence gap between women and their male peers (Kucuk & Sisman, 2020).
The current male dominance in STEM fields impacts both educational and workplace environments. In education, studies highlight the shortage of female STEM teachers, which may hinder the creation of a supportive learning atmosphere and an inclusive curriculum for students of all genders. This lack of representation further reduces the availability of female role models, limits mentoring opportunities for young girls, and perpetuates stereotypes and unconscious biases (Mim, 2019).
Similarly, in the workplace, women in STEM report several challenges, including isolation, a lack of support, inflexible and high-pressure schedules, sexism, and ambiguous policies (Swafford & Anderson, 2020; Bagattini et al., 2021; Avalos et al., 2024). Compared to their male peers, they often face heightened competition and pressure to establish their credibility and authority in STEM fields (Blackburn, 2017).
To address these disparities, it is crucial to emphasize that gender does not determine ability or interest, and that everyone has the potential to succeed in STEM. Research shows no significant differences in academic performance between girls and boys in STEM subjects (Mim, 2019; Yabas et al., 2022), and studies confirm that women possess the same intellectual capabilities as men (Ferreira et al., 2019). While men tend to exhibit greater confidence in their abilities during introductory STEM courses, all students demonstrate equivalent competency by the end of these courses, reaching similar levels of computational skill development (Atmatzidou & Demetriadis, 2016). Moreover, although initial self-confidence levels may vary, both boys and girls perform equally well in STEM-related subjects like coding and robotics when completing a full curriculum. Notably, research indicates that women’s confidence in STEM increases with practice (Tosato & Banzato, 2017).
Many studies highlight that gender disparities in STEM interest, performance, and attitudes begin to emerge between ages 11 and 13, during the transition from primary to secondary education (Bagattini et al., 2021). Even girls who excel in mathematics and science at this stage may experience a decline in engagement and representation as they progress through their education (Yabas et al., 2022). To prevent young women from leaving STEM fields, early intervention and support are essential. Addressing gender imbalances in STEM is not only crucial for promoting equity and inclusion but also for meeting the growing demand for STEM professionals in an increasingly technology-driven world. Ensuring equal access to STEM education and careers allows all individuals, regardless of gender, to actively engage with technology, contribute to its development, and shape the future rather than merely consume it (S. Wang et al., 2019; Ferreira et al., 2019; Kucuk & Sisman, 2020).
To foster female interest in STEM and reduce gender disparities, targeted educational interventions and early exposure to scientific and technical fields are essential (Sullivan & Bers, 2019; Tarrés-Puertas et al., 2023). Research suggests that introducing children to programming through age-appropriate digital platforms at an early stage enhances engagement and facilitates the understanding of key concepts (V. Vasconcelos et al., 2023). Moreover, engaging students in STEM activities during early childhood—before gender stereotypes become deeply ingrained—can positively shape their academic and career aspirations (Kucuk & Sisman, 2020).
Well-structured, interactive STEM activities that emphasize enjoyment and real-world applications can encourage students to explore these fields and sustain their interest over time (Avalos et al., 2024). In particular, showcasing the collaborative nature, societal impact, and diverse applications of STEM careers can be instrumental in attracting young girls to these disciplines (S. Wang et al., 2019; Swafford & Anderson, 2020; Yabas et al., 2022; Dirsuweit & Gouws, 2023). One effective method for achieving these objectives is ER, which provides a hands-on, engaging introduction to STEM from an early age.
ER is a comprehensive process that involves the development of a robotic prototype, beginning with conceptualization and research within an educational context, followed by design, construction, programming, and testing (Scaradozzi et al., 2019; Voutyrakou & Panos, 2022). Its origins trace back to Seymour Papert, the creator of the Logo programming language, who was influenced by constructivist learning theory. He proposed that learning is most effective when students actively engage with concepts and discover them through hands-on experience (Sullivan & Bers, 2016).
As a STEM education tool, ER enhances computational and critical thinking skills while also fostering essential soft skills such as teamwork, conceptual understanding, and project and time management in a collaborative learning environment (Atmatzidou & Demetriadis, 2016). Beyond cognitive and technical benefits, ER has been found to boost students’ self-confidence, decision-making, and autonomy, while also providing teachers and mentors with a dynamic, interactive approach to STEM instruction (Evripidou et al., 2020).
Numerous studies highlight ER’s effectiveness in encouraging students to pursue STEM careers, largely due to its learning-by-doing approach, which allows students to take on roles as researchers and engineers (Kucuk & Sisman, 2020). Robotics lessons have been shown to increase interest in technical studies (van Wassenaer et al., 2023) and serve as a playful yet meaningful introduction to STEM, significantly influencing students’ future subject choices (Sullivan & Bers, 2016; Tosato & Banzato, 2017). Furthermore, because robotics integrates concepts from physics and mathematics, it fosters curiosity and deepens students’ engagement with STEM disciplines (Ferreira et al., 2019).
Additionally, ER helps simplify complex scientific concepts, solve challenging problems, and demonstrate real-world STEM applications, thereby increasing students’ motivation and interest (Khanlari, 2016). It provides an effective method for teaching difficult principles through a hands-on approach, even to young learners (Pedersen et al., 2021). It also encourages creative and innovative thinking by allowing students to explore ideas and apply STEM concepts in practical ways (Ferreira et al., 2019).
The benefits of participating in a robotics team during school have also been emphasized, as this experience fosters teamwork, inclusion, and a stronger appreciation for STEM fields (Fernández-de la Peña et al., 2021). For instance, students who participated in STEM programs such as the FLL Explore Program demonstrated significant increases in enthusiasm for STEM careers, identification with STEM, and the comprehension of related concepts (Yabas et al., 2022).
Even though ER can be an effective approach to encouraging more women to work in STEM, the selection of materials and topics for ER activities plays a crucial role in shaping girls’ interest and participation. Sullivan and Bers (2016) conducted experiments using commonly used ER equipment, such as Legos, the KIWI robot, and computers, to explore gender perceptions in early childhood. Among children aged 3 to 6, 64% responded that “boys would enjoy playing with Lego”, with reasons including the color of the blocks and the perception that boys enjoy building tasks more. Similarly, 33% of the children believed boys would prefer playing with the KIWI robot, again citing its blue color as a determining factor.
Another study found that girls were more engaged with soft materials and crafting techniques, such as sewing cloth, rather than using motors and gears (Tosato & Banzato, 2017). Additionally, the often-overlooked gender characteristics assigned to robots during their design and development can influence human–robot interaction. A gender-sensitive approach is therefore necessary, not only in designing a robot’s appearance but also in conceptualizing its interactions with users and its environment (Nomura, 2017; Weiss et al., 2023). Research also suggests that humanoid robots, such as NAO, are more likely to inspire young girls to pursue STEM (Pedersen et al., 2021).
In addition to materials, the selection of topics for ER activities plays an important role in shaping girls’ interest and participation. In girls-only STEM programs, such as coding classes, tasks are often designed around social issues to provide more meaningful contexts for learners (Mauk et al., 2020). This focus on social applications of robotics can make the learning environment more attractive to girls (Barco et al., 2019; Pedersen et al., 2021). For instance, Barco et al. (2019) introduced both mechanical and social robotics courses, finding that girls were notably more engaged in the social robotics classes, as they preferred contexts that emphasized social relevance over mechanical, task-focused, or industrial settings.
Researchers suggest that framing STEM in relation to global issues may be an effective way to motivate more girls to participate. The idea that STEM can offer solutions to real-world problems can spark excitement and motivation (Fernández-de la Peña et al., 2021; Yabas et al., 2022). Quantitative findings show that girls are particularly drawn to application areas that focus on helping people, while qualitative results reveal a preference for robots designed to assist humans or animals. Girls are more inclined toward topics involving direct human–robot interaction, rather than more abstract tasks such as ocean environment tracking (Pedersen et al., 2021).
This shift in focus has been reflected by several competitions and research projects that now prioritize social topics. For example, robots designed to care for others, rather than those intended for combat (Barco et al., 2019), or robots simulating animal behavior and communication, rather than football players or warriors (Bagattini et al., 2021), are increasingly preferred. This shift has enhanced girls’ participation and enjoyment in STEM activities.
Adding to the above, linguistic sexism in educational contexts has been shown to impact girls’ interest and participation, particularly by reinforcing gender stereotypes that discourage girls from engaging in certain subjects or activities. Research suggests that when language is gendered, it can perpetuate traditional gender roles, implying that certain fields or tasks are more suited to one gender over another. For instance, using gendered terms such as “fireman” or “housewife” may unintentionally signal to girls that they are less likely to belong in activities or careers associated with certain roles (Carli, 2002). Moreover, the use of gendered pronouns or descriptions that emphasize stereotypical traits, such as emotional sensitivity or nurturing qualities in girls, can undermine confidence and willingness to participate in more technical or competitive activities (Cheryan et al., 2015). When educators use inclusive, gender-neutral language, however, they create a more welcoming environment for girls by promoting equality and challenging outdated gender norms. Research suggests that inclusive language not only reduces the impact of gender stereotypes but also encourages students to see themselves as capable and competent, regardless of gender (Horvath et al., 2016; Lindqvist et al., 2019).
However, addressing linguistic sexism in educational settings is not always straightforward. Many educational resources, including textbooks and curricula, still contain gendered language that reinforces traditional roles and stereotypes. Studies have shown that gender bias is prevalent in educational books worldwide, with male characters often depicted in dominant, professional, or leadership roles, while females are more likely to be portrayed in supportive, domestic, or nurturing roles (Carli, 2002; Tiedt & Tiedt, 2005). In fact, research from various countries has highlighted that gendered language in textbooks is a widespread issue.
For instance, a study on textbooks in the United Kingdom found that males were more likely to be portrayed as active participants in historical events and scientific achievements, while females were often excluded or depicted in domestic settings (Dawar & Anand, 2017). Similarly, in South Africa, gender bias was observed in schoolbooks, where male characters were more frequently assigned professional roles while female characters were often depicted as caregivers or in passive roles (Nkosi, 2013). Furthermore, a study in Turkey found that school textbooks commonly presented gendered language, with male figures occupying roles of authority and leadership, while females were relegated to familial and nurturing roles (Çağlak, 2025). Therefore, addressing linguistic sexism in educational settings is a critical step toward fostering greater gender equity and ensuring that all students feel empowered to pursue a wide range of interests and careers, regardless of their gender.
Given the above, educators must thoughtfully choose the topic, learning scheme, language, and materials of an ER activity to ensure that it is inclusive and engages students of all genders in ER and, consequently, in STEM fields. While the additional effort and time educators need to plan and develop a curriculum, along with activities that incorporate these characteristics, are indeed significant, these costs are outweighed by the anticipated benefits (Pedersen et al., 2021).
The effort required to generate and evaluate inclusive ER activities can potentially be reduced by introducing AI tools that assist in this process or even generate ER activities independently. This paper, therefore, aims to explore the potential use of AI tools in developing unbiased and inclusive ER activities. As mentioned earlier, ChatGPT is the primary AI tool employed in this study. Launched in November 2022 by OpenAI, ChatGPT is a conversational AI tool designed to generate human-like text in response to natural language input, supporting a wide range of languages. Since its launch, ChatGPT has been used by over 180.5 million people. Numerous studies highlight its potential benefits in creating adaptive and interactive learning activities, offering personalized learning experiences tailored to individual needs and interests (Božić & Poola, 2023; Oranga, 2023; Opara et al., 2023). ChatGPT has also been found to be useful in research projects, brainstorming sessions, and content generation, as well as when developing test cases for various scenarios (Adeshola & Adepoju, 2023). Additionally, ChatGPT can adjust the difficulty of proposed activities or scenarios based on a student’s prior knowledge (Baidoo-Anu & Ansah, 2023), helping educators assess whether a scenario is suitable for a particular student or group. Beyond that, ChatGPT can assist educators in enhancing teaching methods, enabling them to create and integrate interactive classroom activities (Sok & Heng, 2023). The use of ChatGPT in STEM education has gained significant attention in recent years, and it is increasingly recognized as an important tool for designing STEM activities (M. A. R. Vasconcelos & Santos, 2023). Its benefits in STEM are particularly notable, supporting educators in brainstorming STEM topics and providing students with instant access to vast information on STEM subjects as well as assistance with complex problems (Verma, 2023; Liang et al., 2023). 
ChatGPT has been effectively utilized by educators for lesson planning, material development, and student support (M. A. R. Vasconcelos & Santos, 2023), thereby enhancing the overall quality of teaching and learning in STEM education (van den Berg & du Plessis, 2023).

3. Materials and Methods

This study employed an explanatory sequential mixed-methods design, integrating both quantitative and qualitative approaches to provide a comprehensive understanding of how ChatGPT generates or processes ER scenarios through a gender-inclusive perspective. This enabled us to first establish quantitative patterns and consistencies across AI responses, which were then further examined through qualitative analysis to uncover underlying themes and interpretive depth.
In the quantitative phase, we focused on detecting recurring patterns across multiple generations of ChatGPT responses. To ensure data reliability and minimize bias from prior interactions, each prompt and iteration was submitted in a new conversation thread using anonymous browsing without an account login, ensuring responses were generated independently without contextual memory. Each prompt was run 50 times, and a response feature (e.g., mention of collaboration, gendered language, or competitive framing) was recorded as significant if it appeared in at least 70% of outputs; this threshold aligns with prior AI content analysis research (e.g., Zhu et al., 2023). This approach systematically measured the consistency of ChatGPT’s suggestions and filtered out anomalous or idiosyncratic responses.
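The frequency-threshold procedure described above can be sketched as follows. This is a minimal illustration, not the study's actual analysis script: the feature names and the two annotated outputs are hypothetical placeholders standing in for the 50 manually coded responses per prompt.

```python
from collections import Counter

# Hypothetical annotations: each ChatGPT output is represented by the set of
# coded features observed in it. In the study there were 50 outputs per prompt;
# only two illustrative entries are shown here.
outputs = [
    {"collaboration", "real_world_impact"},
    {"collaboration", "competitive_framing"},
]

def significant_features(annotated_outputs, threshold=0.70):
    """Return features that appear in at least `threshold` of the outputs,
    mapped to their observed frequency."""
    counts = Counter(f for output in annotated_outputs for f in output)
    n = len(annotated_outputs)
    return {f: c / n for f, c in counts.items() if c / n >= threshold}
```

Applying `significant_features` to the full set of 50 annotated outputs per prompt retains only features meeting the 70% consistency criterion, filtering out anomalous or idiosyncratic responses.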
The qualitative phase employed directed content analysis (Hsieh & Shannon, 2005), using a coding scheme informed by the existing literature on gender-inclusive STEM and ER. This approach enabled us to systematically evaluate ChatGPT’s responses, not only regarding surface content but also their alignment with educational equity principles. We developed predefined codes across three overarching dimensions identified in the literature: (1) the presence or absence of gender stereotypes, (2) the use of inclusive versus biased language, and (3) the framing of engagement strategies. These dimensions were further specified through five targeted themes: (a) collaborative vs. competitive learning environments, (b) real-world relevance and social impact, (c) creative and empathetic engagement, (d) non-stereotypical gender framing and team formation, and (e) inclusive language use.
This thematic approach was applied across all responses to explore how ChatGPT framed activities intended to motivate students of different genders, and we examined whether responses reinforced traditional gender roles or promoted equitable engagement. Linking these thematic codes to quantitative findings allowed us to assess not only which patterns were present, but also how and why they reflected, or failed to reflect, gender-inclusive practices.
To address the first research question, we asked ChatGPT to generate ER scenarios for final-year elementary school students (ages 10–11). We began with a general prompt: “Can you propose an educational robotics activity?” To refine the request, we specified the class size and student age: “Can you propose an educational robotics activity for a classroom of 25 students aged 10–11?” A class size of 25 was chosen as it reflects the average in Greek classrooms.
To explore gender-related aspects, we modified the prompts to include different motivational aims: “Can you propose an educational robotics activity for a classroom of 25 students aged 10–11? The aim of the scenario should be to motivate young boys in educational robotics”, “Can you propose an educational robotics activity for a classroom of 25 students aged 10–11? The aim of the scenario should be to motivate young girls in educational robotics”, and “Can you propose an educational robotics activity for a classroom of 25 students aged 10–11? The aim of the scenario should be to equally motivate young girls and boys in educational robotics”.
To further explore ChatGPT’s recommendations, we requested lists of elements that could appeal to specific student groups. Specifically, we asked for ten topics that could attract young boys, young girls, or both equally, as well as suggestions for materials and equipment, learning schemes, and teaching approaches that could be more engaging for different genders. The prompts used were as follows:
  • Topics investigation: “Can you suggest 10 topics for ER activities that could attract young boys/girls/both equally?”
  • Materials and equipment suggestions: “Can you suggest materials and equipment that could equally attract young boys/girls/both equally?”
  • Learning schemes: “Can you suggest a learning scheme for ER activities that could attract young boys/girls/both equally?”
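The prompt families above follow a fixed template with a varying motivational aim, which can be generated systematically as sketched below. Note that in the study prompts were submitted manually in fresh, logged-out browser sessions; the template mechanics here are an illustrative assumption, not the authors' tooling.

```python
# Base prompt shared by all aim-specific variants (see Section 3).
BASE = ("Can you propose an educational robotics activity "
        "for a classroom of 25 students aged 10-11?")

# Motivational aims appended to the base prompt.
AIMS = {
    "boys": "motivate young boys",
    "girls": "motivate young girls",
    "both": "equally motivate young girls and boys",
}

def build_prompts(base=BASE, aims=AIMS):
    """Build one prompt per motivational aim from the shared template."""
    return {key: f"{base} The aim of the scenario should be to "
                 f"{aim} in educational robotics."
            for key, aim in aims.items()}
```

Keeping the template fixed and varying only the aim clause ensures that any differences across ChatGPT's responses can be attributed to the motivational framing rather than to incidental wording changes.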
Finally, to assess whether ChatGPT’s suggestions align with existing literature on increasing female participation in ER, we asked the following question: “What would you suggest including in an ER activity to motivate more young girls to participate?”.
To address the second research question, we analyzed two widely used ER scenarios: Robot Sumo and Robot Soccer. These activities were selected due to their popularity in ER education and competitions. We used two types of prompts to explore ChatGPT’s capacity to identify gender bias and suggest inclusive strategies:
  • A general prompt using only the activity name: “Do you think there are any elements in the Robot Sumo/Robot Soccer educational robotics activity that could demotivate young girls and affect their interest or participation?”
  • A more detailed prompt including a full description of the activity: “FULL DESCRIPTION OF THE ACTIVITY. How can the educator ensure that this activity engages young girls and boys equally?”
We then asked ChatGPT a further question to explore whether it can suggest alternative, inclusive topics: “Do you think another educational robotics topic, instead of Robot Sumo and Robot Soccer, would better engage girls in robotics?”
The variation between using only the activity name and providing a full description was intentional. Our goal was to simulate realistic educator input, recognizing that, in practice, educators may either refer to activities generally or provide detailed context when seeking AI support. Using both prompt styles allowed us to evaluate whether ChatGPT could generate meaningful responses based on its pre-trained knowledge of common ER activities, even when limited information was provided.
In addition to evaluating existing ER activities, we developed our own scenarios to examine whether ChatGPT could detect elements identified in the literature as potentially discouraging for young girls. To create these scenarios, we first reviewed the literature on gender biases in STEM education and identified key factors that may dissuade young girls from participating in ER. These factors included highly competitive challenges; themes traditionally associated with male interests, such as trucks and battle-style robotics, which can reinforce gender stereotypes; and stereotypical role assignments in team-based tasks, where boys are often given programming tasks while girls are assigned assembly or decorative roles.
Using this information, we developed scenarios that intentionally incorporated these elements to examine whether ChatGPT could detect and address potential biases. For example, we designed a scenario with a time-limited robot race and one with a robot-building task themed around trucks, and asked ChatGPT the following question: “Do you think there are any elements in this scenario that could demotivate young girls and affect their interest or participation?” This question was followed by a detailed description of the scenario.
We also investigated whether ChatGPT could detect gender bias in team formation. To achieve this, we created student teams using different groupings, such as all-girl and all-boy teams, as well as mixed-gender teams with predefined stereotypical roles. For instance, in some mixed-gender teams, boys were given leadership or programming roles, while girls were assigned support roles, such as assembly or decoration. These mixed-gender team formations were created to reflect stereotypical role divisions that could affect how boys and girls engage in the activity. We then asked ChatGPT to evaluate these team formations for potential gender biases and to suggest improvements, focusing on how team dynamics might influence the engagement of the students.
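The biased team configurations described above can be sketched programmatically. The following snippet is an illustrative reconstruction, not part of the study’s materials: the helper name `stereotyped_team` and the specific role labels are our own assumptions about how such a stereotyped mixed-gender team might be encoded before being submitted to ChatGPT for evaluation.

```python
def stereotyped_team(boys, girls):
    """Assign technical/leadership roles to boys and support roles to girls,
    mirroring the stereotypical role divisions fed to ChatGPT for evaluation.
    NOTE: role names are illustrative assumptions, not the study's wording."""
    tech_roles = ["leader", "programmer"]
    support_roles = ["builder", "decorator", "presenter"]
    assignments = {}
    for name, role in zip(boys, tech_roles):
        assignments[name] = role
    for name, role in zip(girls, support_roles):
        assignments[name] = role
    return assignments

# Example: a 5-member mixed-gender team with predefined stereotypical roles.
team = stereotyped_team(["Alex", "Nikos"], ["Maria", "Eleni", "Sofia"])
```

An unbiased baseline could then shuffle the combined role list across all five members, giving ChatGPT both configurations to compare.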
To further investigate ChatGPT’s potential in supporting inclusive ER education, we conducted an additional experiment to explore its ability to detect linguistic sexism in the wording of educational robotics topics. Drawing from the European Parliament’s Gender-Neutral Language Guidelines (European Parliament, 2018), we designed a set of 25 ER topic titles that included gendered language likely to reflect implicit linguistic bias. These were systematically categorized into five stereotype-reinforcing domains:
(1) Job titles with gender bias: A Robot Fireman; A Robot Housewife; The Actress Robot; The Waiter Bot; The Maid Bot.
(2) Domestic and caregiving stereotypes: MommyBot: Taking Care of Kids; Miss MaidBot: The Elegant Cleaner; NannyBot; DaddyBot: The Repairman; The King of the House Bot.
(3) Emotional or appearance-based framing: The Emotional Support Girl-Bot; FashionistaBot: “She” Knows the Right Fit for You; The Gentleman’s Fashion Expert Bot; Ladies’ CareBot: Always There for You; SisterBot: Always Ready to Listen.
(4) Leadership and technical roles: The Robot Chairman; The Engineer Robot: The Guy Who Fixes Things; The Technical Expert: Mr. Solution Specialist; Mr. President Bot; Sir Know-it-All Bot.
(5) Education and learning contexts: Study Buddy Bot; The HeadmistressBot; The Male Graduate Bot; The “Miss” Class President Bot; Professor Bot: He’ll Teach You Robotics!
Each of the 25 gendered topic titles was submitted to ChatGPT in 50 separate iterations, and we analyzed whether the model was able to identify and flag elements of linguistic sexism.
To ensure a balanced comparison, we then used the same five categories but replaced the original titles with equivalent, rephrased versions that adhered to gender-neutral language principles. These revised topic titles were designed to convey the same activity or concept while avoiding gendered terminology, in line with the European Parliament’s Gender-Neutral Language Guidelines. Each of the 25 gender-neutral titles was again submitted to ChatGPT 50 times, and we examined whether the model still detected any instances of linguistic sexism. This comparative approach allowed us to assess not only ChatGPT’s sensitivity to biased language, but also its recognition of improved, inclusive phrasing.
The revised, gender-neutral topic titles were grouped under the same five stereotype-related domains:
(1) Job titles: A Robot Firefighter; A Robot Homemaker; The Actor Robot; The Server Bot; The Cleaning Assistant Bot.
(2) Domestic and caregiving: ParentBot: Taking Care of Kids; CleanMate Bot: Household Assistant Bot; CaregiverBot; ParentBot: The Repair Specialist; Household Expert Bot.
(3) Emotional or appearance-based framing: The Emotional Support Bot; Fashion-Bot: “They” Know the Right Fit for You; The Fashion Expert Bot; CareBot: Always There for You; SiblingBot: Always Ready to Listen.
(4) Leadership and technical roles (rephrased): The Robot Chairperson; The Engineer Robot: They Fix Things; The Technical Expert: Mx. Solution Specialist; Mx. President Bot; Head Know-it-All Bot.
(5) Education and learning contexts: Learning Partner Bot; Head Teacher Bot; The Graduate Bot; The Class Leader Bot; ProfessorBot: They’ll Teach You Robotics!
The prompt for the two aforementioned experiments was: “Is there linguistic sexism in the following ER topic TOPIC NAME? If so, what is your suggestion for making it gender-neutral?”.
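Procedurally, each of these two experiments amounts to expanding the prompt template over the 25 titles, submitting each resulting prompt 50 times, and tallying how often linguistic sexism was flagged. A minimal bookkeeping sketch follows; the helper names and the boolean-flag representation of ChatGPT’s answers are our assumptions, and the actual model calls are omitted.

```python
PROMPT_TEMPLATE = (
    "Is there linguistic sexism in the following ER topic {topic}? "
    "If so, what is your suggestion for making it gender-neutral?"
)

def build_prompts(topics, iterations=50):
    """Expand each topic title into (title, prompt) pairs, one per iteration."""
    return [(topic, PROMPT_TEMPLATE.format(topic=topic))
            for topic in topics
            for _ in range(iterations)]

def detection_rates(results):
    """results maps each title to a list of per-iteration booleans
    (True = ChatGPT flagged linguistic sexism). Returns the per-title
    fraction of iterations in which sexism was flagged."""
    return {title: sum(flags) / len(flags) for title, flags in results.items()}
```

Running `detection_rates` on both the gendered and the rephrased title sets then yields the comparison described above.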

4. Results

4.1. Experiment 1: ER Activities Generated by ChatGPT

The goal of this experiment was to address the first research question: “Can AI tools, such as ChatGPT, develop gender-neutral ER scenarios that serve as a foundation for educators?”. Hence, we designed a study to evaluate ChatGPT’s ability to generate inclusive ER activities for a classroom of 25 students aged 10–11 years (the final year of Greek elementary school). The goal was to assess whether these scenarios promote gender neutrality and engage all students equally.
Initially, we asked ChatGPT to provide an ER activity without any additional information. The response was a scenario entitled “Build and Program a Line-following Robot”. ChatGPT suggested specific materials, such as a robotics kit and a computer, and gave a step-by-step analysis of the pre-activity setup. It also provided a guide for the educator on how to divide the activity into sections. At the end of the activity, possible extensions were analyzed and the learning outcomes were assessed.
To assess the variability of ChatGPT’s responses, we repeated the same prompt 50 times. Notably, no single scenario appeared more than 20% of the time. The suggested scenarios varied in both thematic focus and complexity.
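The 20% figure reflects a simple frequency tally over the 50 returned scenario titles. A sketch of that tally, with `title_share` as a hypothetical helper name, might look like:

```python
from collections import Counter

def title_share(titles):
    """Given the scenario titles returned across repetitions, report each
    title's share of the total, e.g. to check that no single scenario
    dominates the output distribution."""
    counts = Counter(titles)
    total = len(titles)
    return {t: n / total for t, n in counts.most_common()}
```

The same helper can be reused for the later prompts, where the most frequent scenario reached at most 26% of repetitions.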
Because the prompt did not specify the students’ age or prior experience, the complexity of the proposed scenarios ranged widely. Some responses assumed minimal prior knowledge, suggesting beginner-level activities focused on basic movement or sensor calibration. Others introduced more advanced tasks involving logic programming, environmental sensing, or multistep design thinking processes.
Despite these differences, nearly all responses followed a consistent structure: they included guidance for educators, suggested materials, and clearly defined learning outcomes. While the more frequent scenarios emphasized technical mastery, many alternatives incorporated narrative elements, adventure, or teamwork.
Next, the prompt was rephrased to better reflect the context of our classroom, as outlined in the experimental design described in Section 3. We specified parameters such as student age and group size and asked ChatGPT to generate an ER activity tailored to this environment. The response included a complete breakdown of the required materials and classroom preparation, along with a structured division of the activity into steps, each with suggested time allocations. At the conclusion of the activity, a summary of expected learning outcomes was provided. According to ChatGPT, the ER activities were designed to be fun, engaging, and educational, with a clear focus on promoting interest in STEM subjects among young students.
The most frequent scenario generated in response to this prompt was titled “Build and Program a Smart Robot Car”, although once again no single scenario appeared in more than 26% of the repetitions. This activity divided students into five teams, each using a robotics kit (e.g., LEGO Mindstorms or VEX IQ) to build and program a robot capable of navigating a basic obstacle course. The programming elements involved commands like “move forward”, “turn”, and “stop”, and teams were encouraged to iterate their designs before showcasing their final robots to the class. This scenario emphasized teamwork, problem-solving, and engineering basics, in line with age-appropriate learning goals.
In addition to this scenario, ChatGPT generated a variety of alternatives, which varied in narrative tone and learning objectives: while some focused on physical construction and movement, others integrated storytelling or themed problem-solving. Across all scenarios, ChatGPT maintained a consistent structure, namely, a materials list, activity steps, team-based organization, and clearly articulated learning goals, while adjusting tone and complexity to match the provided classroom context.
The educator may also include time limitations directly within the prompt. For example, we specified that the activity must be completed within three lessons of 45 min each. In all iterations where such constraints were provided, ChatGPT effectively structured the activity to fit the allotted time. It divided the tasks into clearly defined sessions, assigning time blocks to each step while maintaining pedagogical coherence. For instance, in the “Smart Robot Car” activity, Session 1 typically covered basic construction and introduction to the programming environment; Session 2 focused on obstacle avoidance programming and testing; and Session 3 was dedicated to optimization, peer collaboration, and team presentations. This ability to adapt instruction to time constraints demonstrates ChatGPT’s potential to support real-world lesson planning, where fitting content into limited instructional periods is often a critical challenge for educators.
To explore gender-related aspects, we added the aim of the activity to the given prompt, conducted three different experiments (each repeated 50 times), and then compared the results. The generated activities included the proposed team formation (5 groups of 5 students) and the roles each team member may have. An example was an activity entitled “Robot Rescue Mission”, accompanied by a further analysis of the possible steps the educator could follow (along with the required time for each). At the end of the scenario, the learning outcomes were analyzed, with an emphasis on why this activity may motivate young boys, as shown in Figure 1. The reasons included the competitive and adventure-based topic, the leadership roles, and the collaboration of the students to overcome the challenges. Specifically, the competitive and adventure-based theme was mentioned in 94% of the responses generated, hands-on learning was noted in 100% of the responses, and leadership roles were mentioned in 72% of the outputs. Many responses (around 74%) also proposed optional time-based elements or “missions” to enhance excitement and engagement. Altogether, the responses reflected a clear tendency to emphasize themes often associated with action, challenge, and teamwork.
Then, we repeated the experiment for 50 iterations, this time with a different aim: motivating young girls to participate in the ER activity. An example was a generated activity centered around a rescue mission, titled “The Rescue Mission: Building a Robot for a Cause”. As in the previous experiment, the objective of the scenario and the required materials were clearly outlined. Under the “Mission Briefing” section, the details of the challenge were provided, with the goal of the robot being to navigate and rescue a trapped person after a natural disaster. Each step of the activity was briefly described, followed by an analysis of why this activity could fulfill its initial aim (i.e., the “Why This Activity Works” section), as presented in Figure 2. Key reasons included its promotion of teamwork, its connection to real-world problems, and its encouragement of creativity. When specifying the aim of motivating young girls, 94% of the responses highlighted teamwork, fun, and engagement, while 76% of the answers emphasized the importance of the topic’s connection to real-world problems. Finally, creativity was highlighted in 82% of the responses.
Lastly, we tasked ChatGPT 50 times with generating an ER activity for the specified classroom, with the aim of equally motivating young girls and boys. An example of a generated activity was titled “Robotic Exploration Challenge”, and its structure was similar to that of the other experiments. At the beginning, an overview of the method was provided, and the required materials were outlined. Then, a brief description of the scenario followed, and the specific goals were enumerated. The roles of the team members were proposed, and the steps, along with the timing for each, were provided to support educators in setting up the activity in their classrooms. The learning outcomes and possible extensions were also suggested, as shown in Figure 3.
In terms of engagement strategies, 90% of the responses emphasized the importance of a hands-on learning approach to motivate both genders equally, while 88% highlighted the significance of team formation as a key factor. Another common suggestion, noted in 82% of the responses generated, was the incorporation of storytelling to add purpose and relevance to the activity. This would spark the students’ curiosity and provide a meaningful real-world application of the topic. Furthermore, 90% of the responses encouraged all students to experience different roles and aspects of the project, fostering collaboration and teamwork. Flexibility in roles and task assignments was also a frequently mentioned element, appearing in 74% of the responses and further supporting the need for an inclusive, cooperative learning environment.
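The percentages reported throughout this section imply a coding step in which each of the 50 responses is checked for a given theme. A minimal keyword-matching sketch is shown below; the study does not specify its exact coding scheme, so the function name and the keyword lists in the example are illustrative assumptions.

```python
def theme_frequency(responses, theme_keywords):
    """Fraction of responses mentioning each theme, where a theme counts as
    present if any of its keywords appears (case-insensitive substring match).
    NOTE: the keyword lists supplied by the caller are an assumption about
    how responses were coded, not the study's published scheme."""
    n = len(responses)
    freq = {}
    for theme, keywords in theme_keywords.items():
        hits = sum(
            any(kw.lower() in resp.lower() for kw in keywords)
            for resp in responses
        )
        freq[theme] = hits / n
    return freq
```

In practice, manual coding or a more robust matcher (handling synonyms and paraphrases) would be needed, since ChatGPT rarely repeats identical wording across iterations.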

4.2. Experiment 2: Suggestions for Materials, Topics and Teaching Practices to Include More Young Girls in ER

This experiment also sought to explore the first research question by investigating whether ChatGPT could contribute to developing gender-neutral ER scenarios through suggestions for materials, topics, learning schemes, and teaching strategies to foster inclusivity and engage young girls. By addressing the unique challenges of promoting gender balance in ER, ChatGPT’s recommendations were evaluated as tools to assist educators in designing activities that encourage participation across genders.
The first part of this experiment tasked ChatGPT 50 times with proposing ten topics for ER activities that could attract young boys, young girls, or both equally. The aim was to compare the responses and evaluate their effectiveness alongside the findings from the literature review.
In the topics suggested for young boys, options such as battle bots, which involve competition and fighting, and robot races were frequently proposed. These activities emphasized action, rivalry, and physical challenge. Specifically, 86% of the responses included competitive activities such as robot battles or races, and 84% highlighted the thrill of winning, goal-oriented missions, or leadership roles. Additionally, hands-on problem-solving appeared in 92% of the responses, and action-packed or adventure-driven narratives were present in 78%. The majority of topics were centered around technology, with a strong emphasis on achievement, performance under pressure, and individual or team-based success. These patterns suggest a consistent focus on competition, leadership, and physical engagement in the scenarios aimed at motivating young boys.
The topics proposed for young girls, in contrast, often included more real-world applications, social impact projects, and activities that encouraged empathy and teamwork. Specifically, 88% of the generated topics were based on practical problem-solving, with a strong emphasis on collaboration and community impact. Common suggestions included building robots for social good, environmental sustainability, and supporting people in everyday life challenges, e.g., rescue missions, assistive technologies, or eco-friendly robots. Notably, 80% of these responses highlighted cooperative work, group discussions, and the need to foster empathy among participants. Additionally, 72% of the responses incorporated creative elements, such as storytelling or the integration of design and the arts into robotics activities, while 76% emphasized the importance of creating gender-inclusive and supportive environments, further reinforcing a holistic and inclusive approach to engagement.
Lastly, the topics proposed for both young girls and boys appeared to be more gender-neutral and focused on real-world applications. The suggested activities often included practical scenarios that could engage all genders equally. For instance, in the sports competition topic, there was a notable shift: rather than being limited to a soccer-related task (as it was for young boys), the topic was expanded to include multiple sports such as basketball, volleyball, and track and field. This broader scope aimed to cater to a wider range of interests and allowed for more inclusive participation. Furthermore, 92% of the gender-neutral topics generated emphasized the importance of teamwork, with a clear focus on collaborative efforts as key to success.
In addition, 86% of the suggestions for both genders included problem-solving challenges that could be applied in real-world contexts, reinforcing the notion that the activities should be engaging, purposeful, and reflective of societal challenges. Overall, 80% of the generated topics incorporated a hands-on learning component, where students actively interacted with the materials and scenarios to enhance engagement. Interestingly, 70% of the topics proposed also featured real-world applications such as environmental sustainability, community development, and technology for social good, demonstrating a strong focus on social impact. Moreover, 78% of the activities emphasized the inclusion of gender-neutral tools and robots that could be customized to reflect the students’ diverse interests, while 76% advocated for creating a supportive learning environment where all students were encouraged to explore different roles within the activity.
Following this, we rephrased the prompt to focus on the materials and equipment suggested for each of the target audiences described. To begin with, the proposed equipment was similar across all prompts and included several options for educational robotics kits that are widely used for such purposes, such as LEGO Mindstorms, VEX Robotics, and Arduino kits. However, the responses varied slightly for each prompt. In the response focusing on materials that attract young boys, the emphasis was placed on different types of equipment, with explanations of why each one works and the possible activities it could support. Notably, 80% of the materials suggested for young boys focused on kits that encourage building and competition, such as LEGO Mindstorms, which allow for customization and programmable robot challenges. Another 72% highlighted the use of sensors and motors that could be used in competitive racing or battle-style challenges.
In the response for young girls, in addition to robotics kits, other engaging materials and software tools were suggested, such as 3D pens (mentioned in 70% of the responses) or interactive storytelling resources. These suggestions were aimed at encouraging creativity and providing a platform for expressive activities. Moreover, 76% of the responses for young girls included extensions that could enhance enjoyment, such as mentorship programs, books that promote inclusivity, and resources for social–emotional learning. The integration of these elements reflected an emphasis on providing a more holistic and supportive environment for learning. Furthermore, the learning environment for young girls emphasized the importance of teamwork-based projects and projects with social impact (80% of responses). These elements were considered important for making the experience more relatable and engaging.
Lastly, in terms of the response targeting both young girls and boys, several robotics kits were mentioned, including the widely used LEGO Mindstorms and VEX Robotics kits. These kits were highlighted for their gender-neutral appeal, with 90% of the responses stressing their versatility and accessibility for both genders. Additionally, key considerations were provided, including the incorporation of storytelling elements (78% of responses), a collaborative setup (86%), and the importance of real-world problems in the topics (88%). These considerations reflect a broader focus on fostering teamwork and real-world relevance, which is essential for ensuring engagement from both genders.
The final part of this experiment explored possible learning schemes to attract young boys, young girls, or both equally. ChatGPT effectively tailored the challenges to engage each target group, emphasizing the importance of feedback and recognition. This aligns with ER’s role in developing not only hard skills but also soft skills in students, such as teamwork, communication, and problem-solving.
In the learning scheme proposed for young boys, action-oriented and gamified elements were incorporated, along with competitive yet team-based challenges. These elements appeared in 86% of the responses for young boys. Additionally, real-world applications (mentioned in 72% of the responses) and storytelling (70%) were highlighted as key components. The focus on competition and action was designed to attract young boys through dynamic and engaging activities. Apart from these, 78% of the responses also emphasized the importance of leadership roles, aiming to provide young boys with opportunities to take charge and make decisions.
For young girls, the learning scheme placed significant emphasis on role models (86%), a collaborative learning environment (90%), and self-confidence (80%). Additionally, real-world applications (86%), mentorship (82%), feedback (80%), and storytelling (78%) were all critical components of the proposed scheme. These elements aimed to foster a supportive environment for young girls, encouraging them to explore STEM fields and develop their skills. The scheme also stressed the importance of independence and problem-solving, which were identified as key aspects of building confidence and empowering young girls to take on leadership roles. Apart from these, 72% of the responses suggested integrating group discussions to encourage communication and collaborative problem-solving.
Regarding the common learning scheme for both groups, several strategies were identified to promote fun, inclusivity, and engagement. Engagement strategies included storytelling (80% of responses), real-world applications (86%), and gender-neutral toys and robot models (72%) that students could customize. These approaches ensured that the activities remained flexible and adaptable, catering to the interests and preferences of both genders. The learning experience was designed to be interactive and collaborative (96%), fostering an inclusive environment where students could work together to solve challenges. Additionally, female role models (72% of the responses) were suggested to inspire young girls and demonstrate that they too could excel in STEM fields. Incorporating social issues into robotics projects was another key focus, showing how robots can contribute to a better world and help to address global challenges. This concept was mentioned in 70% of responses for both groups. Lastly, 74% of the responses recommended celebrating diversity through different cultural perspectives and encouraging students to appreciate diverse backgrounds.
Educators were encouraged to promote problem-solving and empathy by guiding students to consider how their robots could assist people. The final step of the learning scheme involves a celebration of diversity, where role models can be introduced and all students receive recognition for their contributions. This approach aims to create an environment where students feel valued and appreciated for their unique strengths and contributions.
General recommendations were also provided regarding the use of language in instructions and activities to motivate both boys and girls in ER, as shown in Figure 4, ensuring it remained neutral and inclusive. Additionally, mentorship opportunities (72%) and a variety of challenges (76%) were suggested to accommodate diverse interests and learning styles, helping all students engage meaningfully with the material. Beyond these, 80% of responses emphasized the importance of fostering a safe and supportive learning environment where students feel encouraged to take risks and explore new ideas.
We also evaluated ChatGPT’s ability to propose strategies for increasing young girls’ engagement in ER. An example of an AI-generated response is presented in Figure 5. Several key strategies were consistently emphasized across responses. Specifically, creativity was highlighted in 92% of the outputs, while collaborative learning environments were mentioned in 88% of the cases. Designing activities around real-world challenges and applications appeared in 84% of the responses.
Furthermore, the importance of mentorship—particularly through female role models—was emphasized in 76% of the outputs generated. The need for continuous encouragement and positive reinforcement was found in 72% of responses. ChatGPT also frequently suggested integrating design and artistic components into robotics activities, with STEAM-focused approaches mentioned in 78% of iterations.
Lastly, the use of gender-neutral learning materials and language to ensure inclusivity was noted in 82% of the responses. In addition, 76% of the responses emphasized the need for hands-on, experiential learning to keep girls engaged, ensuring that they can actively participate in constructing and testing their robots. Finally, 70% of responses highlighted the importance of integrating social justice or environmental themes into robotics projects to encourage girls to see the broader impact of their work.

4.3. Experiment 3: Evaluating Commonly Used ER Scenarios with Respect to Gender Bias

The third experiment aimed to address the second research question: “Can AI tools, such as ChatGPT, highlight potential gender biases in ER scenarios introduced by educators?”.
To explore this, we presented the Robot Sumo and Robot Soccer scenarios to ChatGPT using two types of prompts: one that included only the name of the activity, and another that provided a full description. This approach allowed us to observe whether ChatGPT could identify relevant gender-related concerns and generate inclusive recommendations based on either minimal or detailed input, reflecting different ways educators might interact with the tool in practice.
First, we asked whether the activity might contain elements that could demotivate young girls. We then prompted ChatGPT to suggest strategies educators could use to ensure equal engagement of boys and girls.
Our initial focus was on the Robot Sumo activity, a widely recognized ER challenge used in both educational settings and competitions. In this activity, teams design robots to compete in a ring, attempting to push their opponent out while staying inside. The fighting ring is marked with tape on the ground.
Regarding potentially demotivating elements, ChatGPT consistently identified general concerns that could apply across ER scenarios, including team formation issues (mentioned in 84% of responses) and limited access to mentorship or female role models (72%). These factors were often flagged as potential barriers to engagement, particularly for young girls, as they could feel isolated or underrepresented in predominantly male-focused activities. Specifically for Robot Sumo, competition-focused dynamics were flagged as a possible deterrent in 90% of the iterations. ChatGPT noted that such activities might prioritize aggressive or combative elements, which may not appeal to all students, particularly young girls who may gravitate toward collaborative and creative activities; the latter were identified in 80% of the responses as more engaging for girls, highlighting a need for more cooperative, problem-solving elements within the competition. Another common concern was gender stereotypes, with 82% of responses pointing out that activities like Robot Sumo often perpetuate the idea that boys are more suited to competitive or mechanical tasks, while girls are less inclined to participate in these types of challenges. This stereotype can discourage girls from fully engaging in robotics, particularly if the activity is framed in a way that reinforces traditional gender roles. Examples of demotivating elements are shown in Figure 6a.

ChatGPT suggested several strategies to mitigate these concerns. The formation of mixed-gender teams was recommended in 76% of responses, as it helps to break down gendered expectations and encourages collaborative efforts. Integrating mentorship opportunities (suggested in 72%) was also highlighted as a way to provide girls with support, guidance, and role models, especially those who are underrepresented in STEM fields.
The inclusion of female role models was proposed in 74% of responses to demonstrate that girls can succeed in competitive robotics and to encourage them to pursue similar paths. These mitigation steps were all aimed at fostering a more inclusive and supportive learning environment, as shown for instance in Figure 6b.
When asked about strategies to ensure equal engagement, ChatGPT emphasized the importance of a proactive and inclusive pedagogical approach. Overall, 94% of the outputs highlighted the need for equal participation across all stages of the activity, including design, programming, and testing. This was seen as essential to ensure that both genders are equally involved and that no student feels sidelined or excluded at any point in the process. Additionally, teamwork and collaboration were promoted in 88% of responses, underscoring the importance of collective problem-solving, communication, and shared responsibility in successful projects. Gender-neutral roles and tasks were mentioned in 80% of responses, with specific recommendations to avoid traditional gender roles (e.g., boys always handling mechanical tasks and girls focusing on design or documentation) in order to ensure that all students had the opportunity to explore different aspects of robotics.
Moreover, ChatGPT frequently recommended encouraging creative exploration (in 82% of responses), stressing the importance of providing students with room to experiment, innovate, and express themselves through their robot designs, which could appeal to a wider range of interests. The model also emphasized the importance of prioritizing collaboration over competition (92%) to help foster a more inclusive atmosphere where all students can contribute ideas and be recognized for their input, rather than focusing solely on who wins or loses. It was further noted that aligning activities with student interests (72%) can play a crucial role in keeping students engaged. For example, allowing students to incorporate their personal passions into the robot’s design or functionality could increase their motivation to participate fully in the activity.
In addition, the importance of a supportive environment was highlighted in 82% of the responses. ChatGPT recommended creating open communication channels and establishing feedback loops, where students could discuss challenges, reflect on their progress, and feel encouraged to keep improving. This would help ensure that all participants felt heard and valued. It also suggested the use of low-pressure settings (mentioned in 74%) to reduce stress and competition, allowing students to focus more on learning than on the fear of failure. The tool consistently recommended the use of diverse and inclusive challenges (84%) that could appeal to a variety of interests, such as environmental sustainability or social issues, while ensuring the activities remained engaging. Fun and engaging contexts (78%) were also recommended, including storytelling elements or scenarios that tied back to real-world problems, to make the activity more relevant and enjoyable for students of all genders. Some of these suggestions are illustrated in Figure 7.
Next, we examined the Robot Soccer challenge, another popular ER activity. In this version, each team builds two robots, and the educator sets up a soccer field with two goals marked with tape on the ground. A single ball is placed at the center of the field, and each team’s objective is to move the ball toward the opposing team’s goal. Whenever a goal is scored, the clock stops, and the ball is reset at the center. The team that scores the most goals within 180 s wins.
To begin, we prompted ChatGPT to identify potentially demotivating elements for young girls in the activity. In 88% of the responses, it highlighted general challenges such as team formation and prevalent social norms about girls’ roles in robotics. Specifically for Robot Soccer, 92% of responses flagged the aggressive and competitive nature of the task as a potential barrier. Additionally, in 76% of cases, ChatGPT emphasized that gender stereotypes surrounding soccer could reduce engagement among girls. The demotivating elements identified in one iteration of this prompt are presented in Figure 8a.
To address these concerns, ChatGPT proposed several strategies focused on empowerment and inclusion. Mentorship and role models were mentioned in 70% of responses, emphasizing the value of having female role models and mentors available to guide and inspire young girls. Encouragement and affirmation strategies appeared in 78% of responses, with a significant emphasis on reinforcing positive behaviors and offering praise throughout the activity. Mixed-gender teams were recommended in 84% of cases, with a strong focus on balancing team dynamics to foster cooperation and break down gender stereotypes. Additionally, collaborative team dynamics were frequently stressed, appearing in 90% of the responses; these responses highlighted the importance of fostering cooperation over competition in order to engage all participants equally. An example of these mitigation strategies is shown in Figure 8b.
We then asked ChatGPT to propose ways for educators to ensure equal participation. An example response from ChatGPT is shown in Figure 9. Overall, 92% of the responses stressed the need to prioritize teamwork over competition, urging educators to create a supportive atmosphere where every participant has the opportunity to contribute. ChatGPT also recommended that educators ensure equal access to leadership roles (86%), so that both boys and girls are empowered to take charge in different aspects of the activity. Additionally, gender-neutral materials and instructions were recommended in 78% of responses to avoid reinforcing any existing biases. The promotion of technology and innovation was mentioned as a central theme of the activity in 80% of responses, shifting the focus away from traditional gendered sports narratives and instead framing the activity as a learning experience grounded in technology and creativity.
Furthermore, ChatGPT frequently advised educators to create a safe space for feedback (74%), where all students, regardless of gender, could express their thoughts, ask questions, and receive constructive feedback. Lastly, incorporating diverse problem-solving approaches was recommended in 78% of responses, so that the tasks could be approached from multiple angles, remained open to a variety of learning styles, and supported broad participation and engagement.
The final prompt aimed to explore ChatGPT’s ability to propose alternative topics that could better engage girls in robotics, and we repeated it 50 times. As both Robot Sumo and Robot Soccer are well-known activities, in this final prompt, we referred to them by name only (avoiding lengthy descriptions) to realistically simulate how educators might pose such questions. An example of the suggestions is presented in Figure 10.
ChatGPT suggested exploring alternative topics that align more closely with young girls’ interests. Initially, it noted that while both scenarios mentioned previously could inspire many students, they were heavily based on stereotypically masculine themes, which could be demotivating for some girls. ChatGPT emphasized the importance of making robotics relevant and accessible for all students, particularly by focusing on applications with a social impact. In 84% of the responses, ChatGPT recommended topics related to environmental sustainability, such as building robots that can clean up waste or monitor ecosystems, recognizing that young girls are often more drawn to projects with real-world applications. These types of projects, which have a direct positive effect on the environment, can make robotics feel more meaningful and align with values that are often emphasized in educational contexts that appeal to girls.
Social and healthcare robotics were also proposed in 76% of the responses. Examples included designing robots that assist in elderly care, provide healthcare services, or help people with disabilities. These topics were considered to be more inclusive and impactful, fostering empathy and problem-solving skills. By involving robotics in real-world healthcare challenges, the activities not only become more relevant to girls but also more motivating, as they offer the opportunity to directly contribute to improving lives.
Apart from these, ChatGPT also proposed the incorporation of robotics into the creative arts, which was mentioned in 70% of the responses. This suggestion included building robots that could engage in artistic endeavors such as painting, designing sculptures, or creating digital art. By allowing young girls to explore robotics through the lens of creativity, this approach makes robotics more accessible to those who may not identify with traditional technical roles, thus engaging them in a more expressive and less intimidating manner.
In 78% of the responses, ChatGPT suggested the idea of using robots for social good, such as creating robots that help address social challenges like improving accessibility for disabled individuals or providing assistance during disaster response. These projects combine empathy with technology, encouraging young girls to see robotics as a way to contribute positively to society. This theme also reinforces the idea that robotics can be used to tackle real-world issues that young girls might care about, further increasing the relevance of the activity.
In each case, ChatGPT highlighted the engagement factor of the scenarios. For example, the environmental robotics scenarios emphasized the impact that the robots would have on local communities and ecosystems, which made the activities feel relevant and connected to real-world problems. Similarly, the social good projects stressed the importance of collaboration and teamwork, allowing students to experience the power of working together toward a common goal. These projects were designed to make girls feel that they could not only be part of the robotics field but also use their creativity and problem-solving abilities to solve important societal challenges.
These suggestions show a distinct shift towards robotics topics that focus on creativity, social good, and collaboration. By moving away from stereotypically masculine themes and embracing subjects that resonate with young girls’ interests, such as environmental and social impact, the proposed topics offer an inclusive approach to engaging girls in robotics. Moreover, the emphasis on teamwork and real-world applications supports the idea that robotics can be accessible, meaningful, and aligned with girls’ values, encouraging them to explore and excel in the field.

4.4. Experiment 4: Evaluate ER Activities That Reinforce Gender Bias

This experiment aimed to further investigate the second research question. We developed three scenarios to examine how gender biases might be embedded in ER activities. The first two scenarios explicitly incorporated societal norms and stereotypes that could discourage boys (Scenario 1) and girls (Scenario 2) from participating. The third scenario explored gender bias in team role assignments and composition.
In Scenario 1, students were asked to construct and program a robotic doll to make children happy. The doll was described as wearing pink clothes, moving its hands to give hugs, and increasing its temperature to provide warmth and comfort. After presenting this scenario, we asked ChatGPT whether it considered the activity to be gender-biased. The AI tool identified multiple elements that could contribute to gender bias.
Across all 50 repetitions (100%), ChatGPT flagged the use of pink clothing as reinforcing traditional gender associations that link the color pink to femininity, potentially discouraging boys from engaging with the task. In 96% of responses, ChatGPT also pointed out that referring to the doll as “she” could perpetuate the stereotype that dolls, and by extension, caregiving or emotional roles, are inherently female. Furthermore, in 92% of outputs, the AI noted that the nurturing function of the doll (hugging, warming, comforting) aligns with stereotypically feminine traits, reinforcing the association between emotional labor and women or girls.
To address the gender bias identified in Scenario 1, ChatGPT proposed a variety of modifications aimed at making the activity more gender-neutral and inclusive. In 94% of responses, the AI suggested allowing students to choose the doll’s clothing color, emphasizing that personal preference should guide design choices rather than defaulting to stereotypical colors like pink. This change was framed as a way to support student agency while minimizing associations between color and gender identity.
In 88% of the outputs, ChatGPT proposed expanding the doll’s functionality to include tasks beyond nurturing, such as storytelling, puzzle-solving, or simple games that involve logic and creativity. These features were seen as a way to diversify the robot’s purpose and avoid reinforcing the stereotype that emotional care is inherently linked to femininity. Furthermore, in 82% of responses, ChatGPT recommended using non-gendered language when referring to the doll, such as “it” or “they”, instead of assigning the pronoun “she”. This adjustment was seen as a straightforward step toward avoiding linguistic cues that imply gendered expectations. Apart from these frequent suggestions, other ideas were presented at lower rates.
In Scenario 2, students were instructed to design and develop a blue robotic truck that explores materials, crafts tools, builds structures, and competes against other students’ robots. The goal was for the robot to be the “last survivor” at the end of the competition. When presented with this scenario, ChatGPT identified multiple potential sources of gender bias. An example response from ChatGPT is shown in Figure 11.
In 98% of responses, ChatGPT flagged the use of trucks, particularly in a combat-like or survival-oriented context, as reinforcing traditionally masculine themes, which could reduce the activity’s appeal to some students, particularly girls. In 94% of responses, the AI noted that the competitive framing of the task, focused on being the “last survivor”, aligns with a zero-sum mindset that may not resonate with students who prefer cooperative, creative, or socially relevant projects.
In 86% of outputs, ChatGPT proposed that the activity could be more inclusive by broadening themes beyond traditional “masculine-coded” narratives like survival and dominance. Suggestions included reframing the robot’s mission to involve constructive or socially beneficial objectives, such as delivering supplies in a disaster simulation or helping to build a sustainable city.
Additionally, in 72% of responses, ChatGPT recommended providing students with more agency in choosing the robot’s design and function, such as offering alternative robot types (e.g., a builder, helper, explorer) that allow students to align tasks with their interests.
In the third scenario, our goal was to use ChatGPT to propose a realistic team role distribution that we could implement in our experimental setup. For this reason, we asked ChatGPT to suggest team roles for five students from our classroom. This prompt was executed only once, as the purpose was not to evaluate the correctness of the suggested roles or to examine patterns across multiple iterations, but simply to obtain a representative set of roles for the activity. The roles proposed by ChatGPT were as follows:
i. Team Leader/Project Manager
ii. Programmer/Coder
iii. Builder/Engineer
iv. Design Specialist/Planner
v. Tester/Quality Control
These roles were then used as a foundation for designing the subsequent experiment. Based on this structure, we designed an experimental condition in which roles were intentionally assigned according to traditional gender stereotypes: three boys were designated as Team Leader, Programmer, and Builder, while the two girls were assigned the roles of Design Specialist and Tester.
In our extended analysis, we further examined ChatGPT’s ability to detect implicit gender bias by presenting it with role assignment scenarios rooted in traditional stereotypes. Across all 50 repetitions (100%), ChatGPT consistently identified the distribution of roles, where the three boys were assigned to be the Team Leader, Programmer, and Builder, and the two girls were assigned to be Design Specialist and Tester, as a reflection of gender bias. The AI tool emphasized that assigning construction- and programming-related roles predominantly to boys, while giving girls tasks related to esthetics and testing, reinforces stereotypical assumptions about gendered competencies in STEM fields.
In 98% of the responses, ChatGPT recommended assigning roles based on students’ interests, capabilities, and strengths, rather than gender. It also encouraged educators to engage students in reflective discussions prior to role distribution in order to promote agency and fairness. In 92% of the responses, the AI explicitly suggested role rotation, noting that this approach can help ensure equitable exposure to all aspects of the ER activity and build confidence across a range of skills.
Some less frequent but insightful suggestions also appeared. For instance, in 74% of the outputs, there was a suggestion of involving students in the role-assignment process through self-selection or democratic group decision-making, which ChatGPT argued could empower students and reduce educator-imposed bias.
Additionally, when we posed a scenario where teams were divided strictly by gender, with one all-girls team and one all-boys team, 96% of the responses flagged this structure as potentially reinforcing gender divides. ChatGPT warned that such separation may limit inclusive peer interaction and perpetuate assumptions about which gender is better suited to certain tasks. Instead, it recommended mixed-gender teams, a strategy that it cited as conducive to collaboration, mutual learning, and stereotype reduction.
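The role-rotation strategy that ChatGPT recommended can be made concrete with a short sketch. The Python snippet below is purely illustrative and not part of the study’s protocol: it cycles five students (hypothetical names) through the five roles listed above, so that after five sessions every student has held every role once.

```python
from collections import deque

# The five team roles proposed by ChatGPT in the third scenario
ROLES = [
    "Team Leader/Project Manager",
    "Programmer/Coder",
    "Builder/Engineer",
    "Design Specialist/Planner",
    "Tester/Quality Control",
]

def rotation_schedule(students, sessions):
    """Return one {role: student} mapping per session.

    Each session shifts every student to the next role, so after
    len(ROLES) sessions everyone has held every role exactly once.
    """
    if len(students) != len(ROLES):
        raise ValueError("expected exactly one student per role")
    queue = deque(students)
    schedule = []
    for _ in range(sessions):
        schedule.append(dict(zip(ROLES, queue)))
        queue.rotate(1)  # shift assignments for the next session
    return schedule

# Example with five hypothetical students over five sessions
plan = rotation_schedule(["Alex", "Maria", "Nikos", "Eleni", "Petros"], sessions=5)
for i, session in enumerate(plan, start=1):
    print(f"Session {i}: {session}")
```

A schedule like this guarantees the equitable exposure to programming, building, and leadership tasks that the role-rotation suggestion aims for, independent of gender.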

4.5. Experiment 5: Explore Linguistic Sexism in ER Topics

Our final experiment also addressed the second research question. Specifically, its aim was to investigate whether ChatGPT could identify potential instances of linguistic sexism in given ER topic titles. Furthermore, the tool was tasked with suggesting alternative non-gendered phrasings.
Each of the 25 topic prompts was submitted to ChatGPT 50 times. The results were consistent: in every instance, the AI successfully identified gendered or stereotypical language and flagged it as potentially problematic in an educational context (100%).
While the specific wording of ChatGPT’s suggestions varied, the responses consistently preserved the original pedagogical intent while removing biased or stereotypical phrasing. For example, when analyzing job titles with gender bias, such as “A Robot Fireman”, the most frequent suggestion was “A Robot Firefighter” (84% of responses), followed by “A Firefighting Robot” (72%). Similarly, for the prompt “A Robot Housewife”, neutral alternatives included “A Robot Homemaker” (92%) and “Home Helper Bot” (76%). In the case of “The Actress Robot”, ChatGPT proposed inclusive phrasing such as “The Performer Robot” (78%), while “The Waiter Bot” was often rephrased as “The Server Bot” (86%). Lastly, “The Maid Bot” was most commonly reformulated as “Cleaning Assistant Bot” (74%).
The title “Miss MaidBot: The Elegant Cleaner” was transformed into more inclusive alternatives such as “CleanMate Bot” (88%) and “Household Assistant Bot” (76%). Similarly, “NannyBot” was reimagined as “Childcare Bot” (94%) or “CaregiverBot” (78%). When presented with the prompt “DaddyBot: The Repairman”, ChatGPT recommended neutral phrasing like “ParentBot: The Repair Specialist” (76%). For “The King of the House Bot”, the most common alternative was “Household Expert Bot” (80%).
In prompts emphasizing emotional traits or physical appearance, ChatGPT offered gender-neutral alternatives that retained emotional sensitivity while avoiding stereotypes. For example, “The Emotional Support Girl-Bot” was frequently reformulated as “Emotional Support Bot” (88%), while “FashionistaBot: ‘She’ Knows the Right Fit for You” became “StyleMatch Bot” (72%). In the case of “The Gentleman’s Fashion Expert Bot”, no single alternative appeared with a frequency higher than 60%, with “Fashion Expert Bot” being the most common suggestion. Likewise, “Ladies’ CareBot: Always There for You” was revised to “CareBot: Always There for You” (78%), and “SisterBot: Always Ready to Listen” was commonly replaced with “SiblingBot” (76%).
When it came to leadership and technical roles, ChatGPT consistently avoided gendered job framing. The prompt “The Robot Chairman” was frequently rephrased as “The Robot Chairperson” (94%). For “The Engineer Robot: The Guy Who Fixes Things”, common alternatives included “The Engineer Robot: They Fix Things” (86%). “The Technical Expert: Mr. Solution Specialist” was often reformulated as “Problem Solver Bot” (72%). However, for prompts like “Mr. President Bot” and “Sir Know-it-All Bot”, no single alternative appeared in more than 70% of responses, with the most frequent suggestions being “Mx. President Bot” (68%) and “Head Know-it-All Bot” (58%), respectively.
Finally, in education and learning contexts, gendered or role-specific prompts were smoothly neutralized. For “Study Buddy Bot”, inclusive alternatives included “Learning Partner Bot” (72%). “The HeadmistressBot” was commonly reformulated as “Head Teacher Bot” (84%), while “The Male Graduate Bot” became “The Graduate Bot” (82%). In response to “The ‘Miss’ Class President Bot”, ChatGPT suggested alternatives such as “Class Leader Bot” (76%). Lastly, “Professor Bot: He’ll Teach You Robotics!” was adjusted to “Professor Bot: They’ll Teach You Robotics!” (70%). An example of ChatGPT’s responses can be seen in Figure 12.
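For comparison, the simplest form of such a check can be approximated with a rule-based lookup table. The sketch below is ours, not the study’s method (the study relied on ChatGPT itself), and its small lexicon is an assumed sample drawn from the examples above rather than an exhaustive list; it cannot detect the subtler, context-dependent cues that the language model flagged.

```python
import re

# Small illustrative lexicon of gendered terms and neutral alternatives,
# drawn from the examples discussed above; deliberately not exhaustive.
NEUTRAL_ALTERNATIVES = {
    "fireman": "firefighter",
    "chairman": "chairperson",
    "housewife": "homemaker",
    "waiter": "server",
    "headmistress": "head teacher",
    "actress": "performer",
}

def _match_case(neutral, matched):
    # Preserve an initial capital letter from the original term
    return neutral.capitalize() if matched[0].isupper() else neutral

def flag_gendered_language(title):
    """Return (flags, suggestion): gendered terms found in the title,
    plus a rephrased version using the neutral alternatives."""
    flags = []
    suggestion = title
    for term, neutral in NEUTRAL_ALTERNATIVES.items():
        pattern = re.compile(rf"\b{term}\b", re.IGNORECASE)
        if pattern.search(suggestion):
            flags.append(term)
            suggestion = pattern.sub(
                lambda m: _match_case(neutral, m.group()), suggestion
            )
    return flags, suggestion

print(flag_gendered_language("A Robot Fireman"))
# → (['fireman'], 'A Robot Firefighter')
```

The gap between this word-list approach and the model’s behavior is exactly what the second part of the experiment probes: a lookup table passes titles like “A Robot Homemaker” as neutral, whereas ChatGPT still flagged their residual gendered connotations.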
Following the initial analysis of explicitly gendered ER topic titles, the second part of the experiment shifted focus to a revised set of prompts that had been deliberately rephrased to align with the European Parliament’s guidelines on gender-neutral language (European Parliament, 2018). These updated titles aimed to remove overtly gendered references and adopt more inclusive phrasing. The purpose was to assess whether ChatGPT would still detect any residual linguistic bias and to examine the types of alternative suggestions it might offer when presented with prompts that were already designed to be neutral. In 19 out of the 25 cases, ChatGPT consistently indicated, across all repetitions (100%), that the titles did not contain linguistic sexism. Interestingly, even when no bias was identified, the model still proposed thoughtful alternatives or refinements, suggesting ways to further enhance inclusivity or clarity. An example response is presented in Figure 13.
In the remaining six cases, namely, “The Actor Robot”, “ParentBot: Taking Care of Kids”, “CleanMate Bot: Household Assistant Bot”, “A Robot Homemaker”, “Fashion-Bot: ‘They’ Know the Right Fit for You”, and “CareBot: Always There for You”, ChatGPT identified potential instances of linguistic sexism. Although these titles did not overtly violate gender-neutral language guidelines, the tool flagged subtle cues that could carry gendered connotations. Across 50 iterations, the tool consistently indicated that these titles may include linguistic sexism due to the following issues:
  • Gendered roles: Titles such as “A Robot Homemaker” and “CleanMate Bot: Household Assistant Bot” were flagged for reinforcing traditional, feminized domestic roles. Similarly, “The Actor Robot” was noted as subtly sexist. Although “actor” is increasingly used as a gender-neutral term in the industry, the word has historically been associated with male performers. It may still be perceived as masculine by default, especially in non-entertainment contexts or by younger learners or non-native speakers. This reflects a male-default bias, potentially excluding or minimizing female or non-binary identities. Overall, 35 out of 50 iterations (70%) identified these as concerns.
  • Emotional labor: Titles like “CareBot: Always There for You” and “Fashion-Bot: ‘They’ Know the Right Fit for You” were noted for emphasizing emotional support or appearance-based functions, which are traditionally feminized roles. Overall, 38 out of 50 iterations (76%) highlighted this issue.
  • Caregiving associations: For the title “ParentBot: Taking Care of Kids”, the tool explained that while “ParentBot” is gender-neutral, the phrase “Taking Care of Kids” may implicitly reinforce the stereotype that caregiving is primarily a female responsibility. This association with caregiving, historically and culturally linked to women, subtly perpetuates gendered expectations of parental roles. Overall, 41 out of 50 iterations (82%) mentioned this as a potential issue.
The most frequent alternative formulations suggested varied. It was often recommended that titles be rephrased to make them more neutral or inclusive. For example, “A Robot Homemaker” could be reframed as “A Robot Domestic Assistant” (80%), while “CleanMate Bot” could be adjusted to “Home Assistant Bot” (70%) or “TaskBot” (72%). “The Actor Robot” could be rephrased as “The Performer Robot” (84%) or “The Creative Robot” (72%) to avoid reinforcing gender biases in professional roles. “ParentBot: Taking Care of Kids” could be reworded as “ParentBot: Childcare Companion” (74%) or “ParentBot: Supporting Young Minds” (82%) to move away from the traditional caregiving association. The tool also suggested shifting titles like “CareBot” to something more functionally descriptive, such as “SupportBot: Always There for You” (84%). Similarly, “Fashion-Bot: ‘They’ Know the Right Fit for You” could be altered to “StyleBot: Perfect Fit for You” (76%) to minimize appearance-based emotional labor. An example is shown in Figure 14.

5. Discussion

Previous research has highlighted the potential of ER activities to introduce students to STEM fields and contribute to narrowing the gender gap (Sullivan & Bers, 2018; Zhao et al., 2024). Achieving this goal requires the creation of opportunities for students to engage with robotics in ways that align with their interests, positioning robotics as a creative, innovative, and inclusive discipline. Key factors influencing student engagement and motivation in ER activities include topic selection, the accessibility of materials, team dynamics, and an emphasis on collaboration over competition (Schön et al., 2020). Importantly, activities must be designed to engage both genders equally, ensuring that they provide opportunities for young girls to develop self-confidence and interest in STEM fields. However, designing and implementing such inclusive activities demands substantial time and effort from educators, underscoring the need for tools that can facilitate this process.
This study investigates whether AI tools like ChatGPT can assist educators in designing ER activities that promote gender equity and student engagement. Across five experiments, we evaluated ChatGPT’s capacity to generate, assess, and revise ER content using a directed content analysis approach (Hsieh & Shannon, 2005). Our analysis was guided by three key dimensions drawn from the literature on gender-inclusive STEM education: (1) the presence or absence of gender stereotypes, (2) the use of inclusive versus biased language, and (3) the framing of engagement strategies, such as collaboration, real-world relevance, and creativity. These dimensions formed the basis of a predefined coding scheme applied throughout the study.
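The frequency figures reported throughout Section 4 follow from this coding scheme: each of the 50 responses per prompt was checked for the predefined themes, and the share of responses containing each theme was reported as a percentage. The sketch below only illustrates that tallying arithmetic with naive keyword matching on toy data; the study’s actual coding was applied by the authors using the scheme above, and the theme keywords and responses here are hypothetical.

```python
from collections import Counter

# Hypothetical themes and signal keywords; the study's real coding
# scheme was applied manually, not by keyword matching.
THEME_KEYWORDS = {
    "teamwork over competition": ["teamwork", "collaborat"],
    "real-world relevance": ["real-world", "social impact"],
    "gender-neutral language": ["gender-neutral", "inclusive language"],
}

def theme_frequencies(responses):
    """Return the percentage of responses mentioning each theme."""
    counts = Counter()
    for text in responses:
        lowered = text.lower()
        for theme, keywords in THEME_KEYWORDS.items():
            if any(k in lowered for k in keywords):
                counts[theme] += 1  # at most once per response per theme
    n = len(responses)
    return {theme: round(100 * counts[theme] / n) for theme in THEME_KEYWORDS}

# Toy example with 4 stand-in responses instead of the study's 50
responses = [
    "Emphasize teamwork and collaboration over competition.",
    "Frame the task around real-world problems with social impact.",
    "Use gender-neutral, inclusive language and collaborative roles.",
    "Encourage creative exploration and storytelling.",
]
print(theme_frequencies(responses))
```

Counting a theme at most once per response, then dividing by the number of repetitions, yields figures directly comparable to the “X% of responses” values reported in the experiments.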
Research Question 1: Can AI tools, such as ChatGPT, develop gender-neutral ER scenarios that serve as a foundation for educators?
In our first experiment, ChatGPT was prompted to generate ER scenarios tailored to educators’ needs. Without prior contextual information, such as the age or number of students, the AI’s responses tended to be more generic, exhibiting wide variations in difficulty and complexity. However, when prompts included specific details about the target student group, ChatGPT adapted its scenarios more precisely to align with the needs of that audience.
To better understand ChatGPT’s capability to tailor responses according to the needs of specific target groups, we conducted a follow-up investigation by explicitly specifying the intended student audience in the prompts. ChatGPT adapted its responses accordingly. For instance, in a scenario designed to engage young boys, the AI tool emphasized competition and adventure. Conversely, in a scenario aimed at motivating young girls, the focus shifted toward real-world problems and teamwork. When targeting both young girls and boys, ChatGPT emphasized collaboration and storytelling. These findings align with the existing literature on themes that effectively engage young girls in ER (Schön et al., 2020; Sullivan & Bers, 2019) and illustrate how the AI tool dynamically adjusted the engagement framing based on gendered expectations, without rigidly reinforcing stereotypes.
Regarding general topic recommendations, ChatGPT suggested social impact projects, creativity-driven activities, and real-world issues as effective ways to increase girls’ motivation and participation. For scenarios designed for all students, the suggested topics balanced adventure and creativity. Even when competitive elements were included, ChatGPT underscored the importance of teamwork, highlighting the educator’s role in fostering collaboration.
Similarly, in our second experiment, we evaluated ChatGPT’s ability to propose materials and learning schemes to enhance participation in ER activities for girls, boys, or both. The AI tool recommended various ER kits, collaborative setups, and engaging themes while emphasizing the importance of diverse role models and actionable strategies by which educators can foster inclusivity. Notably, ChatGPT adjusted its responses based on the target audience. For activities designed for both young girls and boys, the AI tool highlighted the gender-neutral appeal of each suggestion. The learning schemes were also successfully adapted to fit different activity goals. Additionally, ChatGPT proposed several strategies to encourage young girls in ER, including the integration of female role models and mentorship and efforts to ensure that topics and materials support creativity. These recommendations mirror best practices for fostering inclusivity in STEM education, as suggested by previous studies (Sullivan & Bers, 2018; Sullivan & Bers, 2019), and address both gender stereotypes (e.g., underrepresentation) and engagement strategies that emphasize autonomy and relevance.
These recommendations reflect principles found in Feminist Pedagogy, which promotes equity in education through inclusive practices, empowerment, and shared authority in the classroom (Shrewsbury, 1993). By suggesting collaborative approaches, real-world relevance, and female representation in ER scenarios, ChatGPT aligns with feminist pedagogical strategies aimed at disrupting traditional power structures and gender hierarchies within STEM education. The tool’s emphasis on mentorship and the visibility of diverse role models contributes to building inclusive learning environments where all students, particularly girls, can see themselves as capable and valued participants in technology-rich domains (Crabtree et al., 2009).
Overall, the findings suggest that ChatGPT can be a valuable tool for educators seeking to design inclusive ER activities. The AI tool successfully generated gender-neutral scenarios, tailored recommendations based on available student information, and provided actionable strategies to foster engagement across genders. Notably, the consistency of these findings across 50 prompt repetitions reinforces ChatGPT’s reliability as a resource for developing more inclusive robotics curricula. These results align with established best practices in gender-inclusive STEM education and demonstrate the potential of AI tools to support the creation of equitable learning experiences. While specific responses varied across repetitions, key themes and goals—such as emphasizing teamwork over competition, incorporating real-world applications, or using storytelling—remained consistently aligned with the needs of the targeted student group, particularly when the target group was clearly defined in the prompt. Future research should explore how incorporating more detailed contextual information, including prior STEM knowledge and available materials, may further improve the relevance and effectiveness of AI-generated scenarios.
Research Question 2: Can AI tools, such as ChatGPT, highlight gender biases introduced by educators into ER scenarios?
In our third experiment, we tasked ChatGPT with evaluating established ER activities, specifically Robot Sumo and Robot Soccer. Over the 50 repetitions, the AI tool consistently identified potential gender biases within these scenarios, particularly the highly competitive nature of the activities, which could limit their appeal to diverse student groups. This observation reflects a sensitivity to engagement framing, where high-stakes competition may alienate some learners, especially girls, who favor more inclusive and collaborative approaches. The tool also suggested modifications, including incorporating themes centered on real-world problems or social impact, aligning with research showing that socially relevant, collaborative projects are more effective at engaging girls in STEM (Sullivan & Bers, 2019; Schön et al., 2020).
Beyond detecting biases within specific scenarios, ChatGPT also highlighted broader demotivating factors, such as limited access to mentorship, role models, and supportive team dynamics. To further assess its potential, we asked ChatGPT to propose strategies for making these activities more inclusive. The AI tool emphasized the importance of teamwork, fostering an inclusive environment, and ensuring guidance is provided by mentors and role models. Additional recommendations included forming mixed-gender teams, encouraging leadership opportunities for all students, and using gender-neutral language when presenting the activities, strategies that address both stereotypes and language inclusivity. When asked to suggest alternative topics that might better engage girls in robotics, ChatGPT consistently proposed real-world challenges and projects with social impact, aligning with research findings that highlight these themes as particularly effective in increasing young girls’ participation in ER (Schön et al., 2020; Sullivan & Bers, 2019).
In the fourth experiment, ChatGPT analyzed ER scenarios that explicitly incorporated societal norms and gender stereotypes, e.g., assigning team roles based on traditional gender expectations or using materials commonly associated with a specific gender (Nomura, 2017; Weiss et al., 2023). The AI tool successfully identified these biases and proposed solutions, including the use of gender-neutral pronouns (e.g., they instead of he/she) and a more conscious approach to color choices that might reinforce stereotypes. These suggestions directly address concerns around language inclusivity and representation raised in the literature review. Additionally, ChatGPT flagged potentially demotivating elements, such as competitive structures and the use of toys subtly linked to specific genders; these observations echo previous research on children’s preferences and gendered play (Sullivan & Bers, 2019).
To further examine its capacity to mitigate gender bias, we tested ChatGPT’s response to scenarios in which boys were consistently assigned leadership and technical roles while girls were given supportive or design roles. The AI tool promptly identified these biases, flagged their potential impact on student engagement, and proposed strategies to counteract them. Specifically, it recommended assigning roles based on student interests, strengths, and capabilities, implementing role rotation, and forming mixed-gender teams to promote collaboration among all students. These strategies directly address gender stereotypes and seek to establish more equitable participation structures within ER activities.
These suggestions closely align with Self-Determination Theory (SDT), which identifies autonomy, competence, and relatedness as essential components of intrinsic motivation (Deci & Ryan, 1985). By encouraging autonomy through interest-based role assignments, supporting competence through equal access to leadership roles, and fostering relatedness via inclusive team dynamics, ChatGPT’s guidance supports key motivational drivers of student engagement in STEM. The AI tool’s responses also resonate with Feminist Pedagogy: by recognizing and challenging the gendered distribution of roles, suggesting inclusive language, and promoting collaborative leadership, ChatGPT advances the principles of a more just and equitable educational approach.
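The team-structuring strategies discussed above (mixed-gender teams plus role rotation) can be made concrete with a short sketch. The code below is an illustrative assumption of how an educator might operationalize these recommendations; the student names, the four roles, and the round-robin rotation scheme are hypothetical and not part of the study's protocol.

```python
import itertools
from collections import defaultdict

# Hypothetical role set for an ER activity; any balanced set of technical,
# leadership, and design roles would serve the same purpose.
ROLES = ["builder", "programmer", "team lead", "designer"]

def make_mixed_teams(students, team_size=4):
    """Interleave students grouped by reported gender so each team is mixed.

    `students` is a list of (name, gender) pairs.
    """
    by_gender = defaultdict(list)
    for name, gender in students:
        by_gender[gender].append(name)
    # Round-robin across the gender groups so consecutive slots alternate.
    interleaved = [name
                   for group in itertools.zip_longest(*by_gender.values())
                   for name in group if name is not None]
    return [interleaved[i:i + team_size]
            for i in range(0, len(interleaved), team_size)]

def rotate_roles(team, session):
    """Shift role assignments by one position each session, so every
    student eventually holds every role (including leadership)."""
    return {member: ROLES[(i + session) % len(ROLES)]
            for i, member in enumerate(team)}
```

Rotating by session index guarantees that, over `len(ROLES)` sessions, each team member cycles through every role once, which directly implements the "role rotation" and "equal access to leadership" suggestions.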
The fifth experiment investigated ChatGPT’s capacity to identify linguistic sexism in ER topic titles and to generate inclusive, gender-neutral alternatives. Results across all 25 tested prompts were consistent: in each of the 50 repetitions per topic, ChatGPT successfully flagged instances of gendered language and framed them as potentially problematic in educational contexts. These findings speak directly to the dimension of language inclusivity, highlighting how subtle linguistic cues, such as gendered job titles, can reinforce normative assumptions and limit students’ identification with STEM roles. Although the specific alternatives varied across iterations, all preserved the core educational objective of the original ER topic while removing stereotypical or gendered elements.
For example, gendered job titles such as “A Robot Fireman” were reformulated as “A Robot Firefighter”, and stereotyped characterizations like “Miss MaidBot” were replaced with neutral alternatives such as “CleanMate Bot” or “Household Assistant Bot”. These revisions reflect broader research showing that gender-inclusive language can reduce bias and promote equity in learning environments (Stout & Dasgupta, 2011; Sczesny et al., 2016; Formanowicz & Hansen, 2022). The suggestions also align with the European Parliament’s Gender-Neutral Language Guidelines (European Parliament, 2018).
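The kind of substitution described here can also be pre-screened mechanically before (or alongside) consulting an AI tool. The sketch below is an illustrative rule-based pass over ER topic titles; the term list is a small assumed sample, not an exhaustive lexicon, and subtler cases (such as "Miss MaidBot") would still require an AI tool or human review.

```python
import re

# Small illustrative mapping of gendered terms to neutral alternatives,
# in the spirit of the substitutions reported in the study.
GENDERED_TERMS = {
    "fireman": "firefighter",
    "policeman": "police officer",
    "mailman": "mail carrier",
    "stewardess": "flight attendant",
    "actress": "actor",
    "he/she": "they",
}

def flag_gendered_language(title):
    """Return the gendered terms found in a title (whole-word match)."""
    lowered = title.lower()
    return [term for term in GENDERED_TERMS
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)]

def neutralize_title(title):
    """Replace known gendered terms, keeping the original capitalization
    of the first letter of each replaced word."""
    result = title
    for term, neutral in GENDERED_TERMS.items():
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        def _swap(match, neutral=neutral):
            text = match.group(0)
            return neutral.capitalize() if text[0].isupper() else neutral
        result = pattern.sub(_swap, result)
    return result
```

For example, `neutralize_title("A Robot Fireman")` yields "A Robot Firefighter", matching the reformulation discussed above, while a title with no listed term is returned unchanged.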
Given the well-documented presence of subtle linguistic biases in educational materials, including textbooks and classroom discourse (Blumberg, 2008; Yasin et al., 2012; Vahdatinejad, 2017; Alexopoulos et al., 2022), ChatGPT’s consistent performance suggests it could be a valuable tool for educators aiming to identify and revise problematic phrasing. Moreover, its ability to propose inclusive alternatives provides much-needed support for those who may lack the time, resources, or confidence to rephrase content in a gender-neutral way.
In the second part of the experiment, we examined a new set of ER topic titles that were intentionally written using gender-neutral language, following the European Parliament’s guidelines. In nearly all cases, ChatGPT recognized the absence of gendered language and confirmed that the titles were inclusive. Nonetheless, the model continued to offer thoughtful alternative suggestions in every repetition, demonstrating its usefulness not only in flagging biased language but also in fine-tuning phrasing for clarity and inclusivity. This consistency reinforces ChatGPT’s potential as a practical aid in promoting more equitable communication in educational contexts.
However, a few noteworthy cases emerged where the model flagged terms already considered gender-neutral according to official guidance. For instance, in the case of “The Robot Actor”, ChatGPT identified potential bias, despite the term “actor” being widely recognized today as gender-neutral (European Parliament, 2018). According to ChatGPT, this may be due to lingering cultural associations in which “actor” traditionally referred to men, while women were categorized separately as “actresses”, reinforcing unnecessary gender distinctions. Even so, the model’s suggested alternatives, such as “The Performer Robot”, were both accurate and inclusive. While this level of caution may occasionally exceed what is required, it reflects a proactive approach to minimizing unintended bias and supports the broader goal of using gender-inclusive language in educational settings.
Based on our directed content analysis, ChatGPT demonstrated a strong ability to recognize gender biases in both pre-existing and hypothetical ER scenarios, providing educators with detailed and actionable feedback for designing more inclusive activities; this directly answers Research Question 2. Beyond content generation, the tool can assist educators in phrasing activities in a gender-neutral manner, structuring student teams, and mitigating unconscious biases, strategies that are well documented in the literature as effective ways to counteract gender stereotypes in STEM education (Blackburn, 2017; Mich & Ghislandi, 2019; Schön et al., 2020). ChatGPT’s suggestions also aligned consistently with key recommendations from that literature, emphasizing inclusive team dynamics, gender-neutral language, and creativity and real-world applications in ER activities (Sullivan & Bers, 2018; Schön et al., 2020). Importantly, the tool’s responses consistently addressed our three analytical dimensions: challenging gender stereotypes, promoting inclusive language, and reframing engagement to emphasize collaboration, relevance, and creativity.
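The tallying step of a directed content analysis like the one described, counting how often each analytical dimension surfaces across repeated outputs, can be sketched as follows. The dimension keywords below are illustrative assumptions for demonstration; in the study itself, coding was performed by human researchers, not by keyword matching.

```python
# Hypothetical cue lists for the three analytical dimensions; substring
# stems (e.g., "collaborat") match several word forms.
DIMENSIONS = {
    "gender stereotypes": ["stereotype", "gendered role", "bias"],
    "inclusive language": ["gender-neutral", "pronoun", "inclusive language"],
    "engagement framing": ["collaborat", "real-world", "creativ", "competit"],
}

def code_outputs(outputs):
    """Count, per dimension, how many outputs mention at least one cue."""
    counts = {dim: 0 for dim in DIMENSIONS}
    for text in outputs:
        lowered = text.lower()
        for dim, cues in DIMENSIONS.items():
            if any(cue in lowered for cue in cues):
                counts[dim] += 1
    return counts

def consistency(outputs, dim):
    """Share of repetitions (0.0 to 1.0) in which a dimension appeared."""
    return code_outputs(outputs)[dim] / len(outputs)
```

Applied to the 50 repetitions of a prompt, `consistency` would quantify claims such as "the tool consistently addressed engagement framing" as a proportion rather than an impression.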
Interpreting ChatGPT’s outputs through theories such as SDT and Feminist Pedagogy further validates its recommendations, positioning it as a valuable tool for fostering inclusivity in ER, while emphasizing the need for human oversight in interpreting and implementing its suggestions. More precisely, while ChatGPT shows promise in generating gender-neutral ER scenarios and providing valuable insights to educators, the quality of its output depends heavily on the specificity of the prompts. Our findings underscore the importance of including detailed context in prompts, as the AI tool’s effectiveness is directly influenced by the clarity of the information provided (Crawford & Calo, 2016; Taddeo & Floridi, 2018). For example, in our first experiment, ChatGPT initially failed to produce inclusive suggestions because the prompt did not explicitly state the goal of engaging both girls and boys; only when this objective was clearly specified did the AI adjust its responses, offering more balanced and gender-sensitive recommendations. This illustrates that while ChatGPT can detect and mitigate gender bias, it relies heavily on how the prompt is framed. Educators must be mindful when using AI tools like ChatGPT, as its suggestions reflect the biases and limitations inherent in the tool’s training data (Binns, 2020; Noble, 2018). The quality of AI-generated content depends on how carefully prompts are crafted, and while the AI tool can provide support, it should complement, not replace, educators’ expertise (Brynjolfsson & McAfee, 2017). By combining ChatGPT’s capabilities with human judgment and pedagogical knowledge, more inclusive and impactful ER activities can be developed, ultimately fostering a diverse and equitable future in STEM (Z. Wang, 2020).
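The prompt-framing effect described here can be illustrated with a minimal sketch contrasting a vague template with a context-rich one that states the inclusivity goal explicitly. The wording below is a hypothetical example, not the study's exact prompt.

```python
# Vague template: omits the inclusivity objective, so (as observed in our
# first experiment) a model may default to non-inclusive suggestions.
VAGUE = ("Suggest an educational robotics activity "
         "for a class of {n} students aged {age}.")

# Context-rich template: states the goal of engaging girls and boys equally
# and names the desired inclusive features up front.
SPECIFIC = ("Suggest an educational robotics activity "
            "for a class of {n} students aged {age}. "
            "The activity must engage girls and boys equally: "
            "use gender-neutral language, mixed-gender teams, "
            "and a real-world, collaborative theme.")

def build_prompt(template, n, age):
    """Fill class context into a prompt template; richer templates
    steer the model toward the intended framing more reliably."""
    return template.format(n=n, age=age)
```

For instance, `build_prompt(SPECIFIC, 25, "10-11")` encodes both the class context used in our experiments and the explicit inclusivity objective that proved decisive.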

6. Conclusions

Increasing the representation of women in STEM is not only a matter of equity but also a key driver of innovation, economic growth, and social progress. A diverse STEM workforce brings a broader range of perspectives, fostering more comprehensive problem-solving and technological advancements. ER has proven to be a promising tool for sparking early interest in STEM, offering engaging, hands-on learning experiences. However, to effectively bridge the gender gap, ER activities must be intentionally designed to be inclusive, ensuring all students, regardless of gender, are encouraged to participate.
This study explored the potential of AI tools, particularly ChatGPT, to support educators in developing gender-inclusive ER activities. Our findings demonstrate that ChatGPT can indeed generate gender-neutral ER scenarios that align with gender-inclusive practices. The AI tool was able to create activities that appealed to both genders equally, highlighting topics that encourage collaboration, creativity, and real-world problem-solving. ChatGPT was also able to dynamically adjust its suggestions based on the target student group, ensuring that the framing of the activity was inclusive and appropriate. The study also revealed that ChatGPT is effective at identifying gender biases in existing ER activities. For example, it flagged scenarios that were overly competitive or gender-stereotyped and suggested more inclusive alternatives. The AI tool demonstrated its capacity to pinpoint issues like the use of gendered job titles, imbalanced role assignments, and competitive structures that might deter certain groups from engaging. Furthermore, it proposed strategies to address these biases, such as promoting teamwork, using gender-neutral language, and ensuring the equitable representation of both genders in roles and activities.
While AI tools like ChatGPT provide substantial assistance in detecting and mitigating gender biases, their recommendations should be critically assessed. AI models operate on data that may contain inherent biases, and their suggestions might not always align with the specific needs of diverse educational settings. Educators must carefully evaluate and refine AI-generated suggestions to ensure meaningful engagement for all students. To maximize the impact of such tools, professional development programs should equip educators with the skills to incorporate AI effectively while fostering an understanding of its limitations.
The quality and diversity of training data are crucial to AI’s effectiveness. If AI models rely on biased historical data, they may inadvertently perpetuate those biases. Future research should explore ways to improve AI training datasets to ensure fairness and inclusivity in STEM education. Additionally, evaluating the direct impact of AI-assisted ER activities on student engagement—using tools like the Draw an Engineer Test, along with qualitative feedback from both students and educators—could offer valuable insights into their real-world effectiveness.
In conclusion, this study highlights the promising role of AI tools in designing more equitable ER activities. By fostering gender-neutral language, promoting inclusivity in team dynamics, and emphasizing real-world applications, AI tools like ChatGPT can enhance the creation of activities that engage all students. However, while AI offers valuable support, it should complement, not replace, the expertise and judgment of educators. When integrated thoughtfully into educational practices, AI tools can play a pivotal role in promoting diversity, equity, and inclusion in STEM fields.

Author Contributions

Conceptualization, D.A.V.; methodology, D.A.V. and C.S.; writing—original draft preparation, D.A.V.; writing—review and editing, C.S.; supervision, C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ER	Educational Robotics
AI	Artificial Intelligence
STEM	Science, Technology, Engineering, and Mathematics
SDT	Self-Determination Theory

References

1. Adeshola, I., & Adepoju, A. P. (2023). The opportunities and challenges of ChatGPT in education. Interactive Learning Environments, 32, 6159–6172.
2. Alexopoulos, C., Stamou, A. G., & Papadopoulou, P. (2022). Gender representations in the Greek primary school language textbooks: Synthesizing content with critical discourse analysis. International Journal on Social and Education Sciences, 4(2), 257–274.
3. Atmatzidou, S., & Demetriadis, S. (2016). Advancing students’ computational thinking skills through educational robotics: A study on age and gender relevant differences. Robotics and Autonomous Systems, 75, 661–670.
4. Avalos, S., Granados, C., Tafur, M., Arroyo, D., & Roncal, S. J. (2024, March). Allybot: Design studio to enhance girls’ participation in technology and art. In Companion of the 2024 ACM/IEEE international conference on human-robot interaction (pp. 219–222). Association for Computing Machinery.
5. Bagattini, D., Miotti, B., & Operto, F. (2021). Educational robotics and the gender perspective. In Makers at school, educational robotics and innovative learning environments: Research and experiences from FabLearn Italy 2019, in the Italian schools and beyond (pp. 249–254). Springer International Publishing.
6. Baidoo-Anu, D., & Ansah, L. O. (2023). Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning. Journal of AI, 7(1), 52–62.
7. Barco, A., Walsh, R. M., Block, A., Loveys, K., McDaid, A., & Broadbent, E. (2019, March 11–14). Teaching social robotics to motivate women into engineering and robotics careers. 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI) (pp. 518–519), Daegu, Republic of Korea.
8. Binns, J. (2020). Artificial intelligence and the ethical implications in education: How biases impact AI-generated content. AI and Ethics, 1(1), 23–30.
9. Blackburn, H. (2017). The status of women in STEM in higher education: A review of the literature 2007–2017. Science & Technology Libraries, 36(3), 235–273.
10. Blumberg, R. L. (2008). The invisible obstacle to educational equality: Gender bias in textbooks. Prospects, 38(3), 345–361.
11. Božić, V., & Poola, I. (2023). Chat GPT and education. Preprint. Available online: https://www.researchgate.net/publication/369926506_Chat_GPT_and_education (accessed on 2 June 2025).
12. Brynjolfsson, E., & McAfee, A. (2017). The second machine age: Work, progress, and prosperity in a time of brilliant technologies. W.W. Norton & Company.
13. Carli, L. L. (1999). Gender, interpersonal power, and social influence. Journal of Social Issues, 55(1), 81–99.
14. Cheryan, S., Master, A., & Meltzoff, A. N. (2015). Cultural stereotypes as gatekeepers: Increasing girls’ interest in computer science and engineering by diversifying stereotypes. Frontiers in Psychology, 6, 49.
15. Crabtree, R. D., Sapp, D. A., & Licona, A. C. (2009). Feminist pedagogy: Looking back to move forward. Johns Hopkins University Press.
16. Crawford, K., & Calo, R. (2016). There is a blind spot in AI research. Nature, 538(7623), 311–313.
17. Çağlak, I. (2025). Gender representation and sexism in international language coursebooks for tertiary education. Journal of Language Teaching and Learning, 15(1), 20–45. Available online: https://www.jltl.com.tr/index.php/jltl/article/view/727 (accessed on 2 June 2025).
18. Daniela, L., & Lytras, M. D. (2019). Educational robotics for inclusive education. Technology, Knowledge and Learning, 24, 219–225.
19. Dawar, T., & Anand, S. (2017). Gender bias in textbooks across the world. International Journal of Applied Home Science, 4(3&4), 224–235.
20. Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Plenum.
21. Dirsuweit, T., & Gouws, P. (2023). Barriers to the full participation of girls in robotics: A case study of a South African community of practice. Interchange, 54(4), 439–457.
22. European Parliament. (2018). Gender-neutral language in the European parliament. Available online: https://www.europarl.europa.eu/cmsdata/151780/GNL_Guidelines_EN.pdf (accessed on 2 June 2025).
23. Evripidou, S., Georgiou, K., Doitsidis, L., Amanatiadis, A. A., Zinonos, Z., & Chatzichristofis, S. A. (2020). Educational robotics: Platforms, competitions and expected learning outcomes. IEEE Access, 8, 219534–219562.
24. Fernández-de la Peña, C. P., Gómez-Aladro, V. A., Álvarez-Palacios, L., & Díaz-Tufinio, C. A. (2021, April 21–23). Work in progress: Safe environments and female role models: Important factors for girls approaching STEM-related careers through robotics initiatives. 2021 IEEE Global Engineering Education Conference (EDUCON) (pp. 25–29), Vienna, Austria.
25. Ferreira, M. E., Lima, D. A., & Silva, A. (2019, October). Data analysis for robotics and programming project evaluation involving female students participation. In 2019 Latin American robotics symposium (LARS), 2019 Brazilian symposium on robotics (SBR) and 2019 workshop on robotics in education (WRE) (pp. 417–422). IEEE.
26. Formanowicz, M., & Hansen, K. (2022). Subtle linguistic cues affecting gender in (equality). Journal of Language and Social Psychology, 41(2), 127–147.
27. Golecki, H., Lamer, S., McNeela, E., Tran, T., & Adnan, A. (2022, June 26–29). Understanding impacts of soft robotics project on female students’ perceptions of engineering (work in progress). 2022 ASEE Annual Conference & Exposition, Minneapolis, MN, USA.
28. Horvath, L. K., Merkel, E. F., Maass, A., & Sczesny, S. (2016). Does gender-fair language pay off? The social perception of professions from a cross-linguistic perspective. Frontiers in Psychology, 6, 2018.
29. Hsieh, H.-F., & Shannon, S. E. (2005). Three approaches to qualitative content analysis. Qualitative Health Research, 15(9), 1277–1288.
30. Khanlari, A. (2016, October 12–15). Long term effects of educational robots on a grade 9 girl’s perceptions of science and math. 2016 IEEE Frontiers in Education Conference (FIE) (pp. 1–4), Eire, PA, USA.
31. Kucuk, S., & Sisman, B. (2020). Students’ attitudes towards robotics and STEM: Differences based on gender and robotics experience. International Journal of Child-Computer Interaction, 23, 100167.
32. Liang, Y., Zou, D., Xie, H., & Wang, F. L. (2023). Exploring the potential of using ChatGPT in physics education. Smart Learning Environments, 10(1), 52.
33. Lindqvist, A., Renström, E. A., & Gustafsson Sendén, M. (2019). Reducing a male bias in language? Establishing the efficiency of three different gender-fair language strategies. Sex Roles, 81(1–2), 109–117.
34. Mauk, M., Willett, R., & Coulter, N. (2020). The can-do girl goes to coding camp: A discourse analysis of news reports on coding initiatives designed for girls. Learning, Media and Technology, 45(4), 395–408.
35. Mich, O., & Ghislandi, P. (2019). Young girls and scientific careers: May a course on robotics change girls’ aspirations about their future? The ROBOESTATE project. QWERTY-Interdisciplinary Journal of Technology, Culture and Education, 14(2), 88–109.
36. Mim, S. A. (2019). Women missing in STEM careers: A critical review through the gender lens. Journal of Research in Science, Mathematics and Technology Education, 2, 59–70.
37. Nkosi, Z. P. (2013). Exploring gender stereotypes in secondary school literary texts. South African Journal of African Languages, 33(2), 133–142.
38. Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. New York University Press.
39. Nomura, T. (2017). Robots and gender. Gender and the Genome, 1(1), 18–26.
40. Opara, E., Mfon-Ette Theresa, A., & Aduke, T. C. (2023). ChatGPT for teaching, learning and research: Prospects and challenges. Global Academic Journal of Humanities and Social Sciences, 5, 33–40.
41. Oranga, J. (2023). Benefits of artificial intelligence (ChatGPT) in education and learning: Is Chat GPT helpful? International Review of Practical Innovation, Technology and Green Energy (IRPITAGE), 3(3), 46–50.
42. Pedersen, B. K. M. K., Weigelin, B. C., Larsen, J. C., & Nielsen, J. (2021, August 8–12). Using educational robotics to foster girls’ interest in STEM: A systematic review. 2021 30th IEEE international conference on robot & human interactive communication (RO-MAN) (pp. 865–872), Vancouver, BC, Canada.
43. Scaradozzi, D., Screpanti, L., & Cesaretti, L. (2019). Towards a definition of educational robotics: A classification of tools, experiences and assessments. In Smart learning with educational robotics: Using robots to scaffold learning outcomes (pp. 63–92). Springer.
44. Schön, S., Rosenova, M., Ebner, M., & Grandl, M. (2020). How to support girls’ participation at projects in makerspace settings. Overview on current recommendations. In Educational robotics in the context of the maker movement (pp. 193–196). Springer International Publishing.
45. Sczesny, S., Formanowicz, M., & Moser, F. (2016). Can gender-fair language reduce gender stereotyping and discrimination? Frontiers in Psychology, 7, 154379.
46. Shrewsbury, C. M. (1993). What is feminist pedagogy? Women’s Studies Quarterly, 21(3/4), 8–16.
47. Sok, S., & Heng, K. (2023). ChatGPT for education and research: A review of benefits and risks. Cambodian Journal of Educational Research, 3(1), 110–121.
48. Stout, J. G., & Dasgupta, N. (2011). When he doesn’t mean you: Gender-exclusive language as ostracism. Personality and Social Psychology Bulletin, 37(6), 757–769.
49. Sullivan, A., & Bers, M. U. (2016). Girls, boys, and bots: Gender differences in young children’s performance on robotics and programming tasks. Journal of Information Technology Education. Innovations in Practice, 15, 145.
50. Sullivan, A., & Bers, M. U. (2018). The impact of teacher gender on girls’ performance on programming tasks in early elementary school. Journal of Information Technology Education. Innovations in Practice, 17, 153.
51. Sullivan, A., & Bers, M. U. (2019). Investigating the use of robotics to increase girls’ interest in engineering during early elementary school. International Journal of Technology and Design Education, 29(5), 1033–1051.
52. Swafford, M., & Anderson, R. (2020). Addressing the gender gap: Women’s perceived barriers to pursuing STEM careers. Journal of Research in Technical Careers, 4(1), 61–74.
53. Taddeo, M., & Floridi, L. (2018). The ethics of artificial intelligence. In M. M. L. Rodrigues (Ed.), The Cambridge handbook of information and computer ethics (pp. 264–289). Cambridge University Press.
54. Tarrés-Puertas, M. I., Costa, V., Pedreira Alvarez, M., Lemkow-Tovias, G., Rossell, J. M., & Dorado, A. D. (2023). Child–robot interactions using educational robots: An ethical and inclusive perspective. Sensors, 23(3), 1675.
55. Tiedt, P. L., & Tiedt, I. M. (2005). Multicultural teaching: A handbook of activities, information, and resources. Pearson/Allyn & Bacon.
56. Tosato, P., & Banzato, M. (2017, July 3–6). Gender difference in handmade robotics for children. IFIP World Conference on Computers in Education (pp. 209–220), Dublin, Ireland.
57. UNESCO Institute for Statistics. (2017). Cracking the code: Girls’ and women’s education in science, technology, engineering and mathematics (STEM). Available online: https://www.unesco.org/en/gender-equality/education/stem (accessed on 23 July 2024).
58. United Nations. (2021). Progress on the sustainable development goals: The gender snapshot 2021. United Nations. Available online: https://unstats.un.org/sdgs/gender-snapshot/2021/ (accessed on 1 February 2025).
59. U.S. Bureau of Labor Statistics. (2024, April 17). Employment in STEM occupations. Available online: https://www.bls.gov/emp/tables/stem-employment.htm (accessed on 26 July 2024).
60. Vahdatinejad, S. (2017). Linguistic sexism in the Iranian high school EFL textbooks. PEOPLE: International Journal of Social Sciences, 3(2), 746–761.
61. van den Berg, G., & du Plessis, E. (2023). ChatGPT and generative AI: Possibilities for its contribution to lesson planning, critical thinking and openness in teacher education. Education Sciences, 13(10), 998.
62. van Wassenaer, N., Tolboom, J., & van Beekum, O. (2023). The effect of robotics education on gender differences in STEM attitudes among Dutch 7th and 8th grade students. Education Sciences, 13(2), 139.
63. Vasconcelos, M. A. R., & Santos, R. P. D. (2023). Enhancing STEM learning with ChatGPT and Bing Chat as objects to think with: A case study. arXiv, arXiv:2305.02202.
64. Vasconcelos, V., Almeida, R., Marques, L., & Bigotte, E. (2023, June 14–16). Scratch4All project-educate for an all-inclusive digital society. 2023 32nd Annual Conference of the European Association for Education in Electrical and Information Engineering (EAEEIE) (pp. 1–5), Eindhoven, The Netherlands.
65. Verma, M. (2023). The digital circular economy: ChatGPT and the future of STEM education and research. International Journal of Trend in Scientific Research and Development, 7(3), 178–182.
66. Voutyrakou, D. A., & Panos, A. (2022). Educational robotics: Towards a structured, interdisciplinary definition based on the curriculum in Greek schools. European Journal of Electrical Engineering and Computer Science, 6(2), 48–58.
67. Wang, S., Andrei, S., Urbina, O., & Sisk, D. A. (2019, October 16–19). A coding/programming academy for 6th-grade females to increase knowledge and interest in computer science. 2019 IEEE Frontiers in Education Conference (FIE) (pp. 1–8), Covington, KY, USA.
68. Wang, Z. (2020). AI, equity, and the future of STEM education: A review of trends and challenges. Journal of Educational Technology, 37(2), 175–190.
69. Weiss, A., Zauchner, S. A., Ploessnig, M., Sturm, N., Kirilova, S., & Schmoigl, M. (2023). Navigating gender sensitivity in robot design: Unveiling the challenges and avoiding pitfalls. International Journal of Gender, Science and Technology, 15(2), 211–236.
70. Yabas, D., Kurutas, B. S., & Corlu, M. S. (2022). Empowering girls in STEM: Impact of the girls meet science project. School Science and Mathematics, 122(5), 247–258.
71. Yasin, M. S. M., Hamid, B. A., Keong, Y. C., Zarina, Z., & Azhar, J. (2012). Linguistic sexism in Qatari primary Mathematics textbooks. GEMA Online Journal of Language Studies, 12, 53–68.
72. Zhao, Y., & Wang, Y. (2024). The effects of educational robotics in STEM education: A multilevel meta-analysis. International Journal of STEM Education, 11(1), 1–19.
73. Zhu, J., Li, X., Zhu, X., Wang, X., & Wang, W. Y. (2023, December 6–10). Multilingual benchmarks for evaluating large language models. 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), Singapore.
Figure 1. Excerpt from a ChatGPT-generated educational robotics scenario for motivating young boys in a classroom of 25 students aged 10–11.
Figure 2. Excerpt from a ChatGPT-generated educational robotics scenario for motivating young girls in a classroom of 25 students aged 10–11.
Figure 3. Excerpt from a ChatGPT-generated educational robotics scenario for motivating all students in a classroom of 25 students aged 10–11.
Figure 4. Tips for a gender-inclusive learning scheme generated by ChatGPT to motivate both young girls and boys.
Figure 5. Proposed ER strategies to engage more young girls in ER.
Figure 6. Demotivating elements for young girls in the Robot Sumo ER activity (a), along with mitigation strategies (b).
Figure 7. Strategies to engage both young girls and boys in the Robot Sumo ER activity.
Figure 8. Demotivating elements for young girls in the Robot Soccer ER activity (a), along with mitigation strategies (b).
Figure 9. Strategies to engage both young girls and boys in the Robot Soccer ER activity.
Figure 10. Alternative topics for engaging young girls in robotics, as suggested by ChatGPT.
Figure 11. Demotivating elements identified by ChatGPT in an ER activity.
Figure 12. ChatGPT analysis of linguistic sexism in ER topic titles.
Figure 13. ChatGPT analysis of non-gendered ER topic titles.
Figure 14. Implicit linguistic sexism highlighted by ChatGPT.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
