Article

Creativeable: Leveraging AI for Personalized Creativity Enhancement

by Ariel Kreisberg-Nitzav and Yoed N. Kenett *
Faculty of Data and Decision Sciences, Technion-Israel Institute of Technology, Haifa 3200003, Israel
* Author to whom correspondence should be addressed.
AI 2025, 6(10), 247; https://doi.org/10.3390/ai6100247
Submission received: 20 July 2025 / Revised: 30 August 2025 / Accepted: 16 September 2025 / Published: 1 October 2025

Abstract

Creativity is central to innovation and problem-solving, yet scalable training solutions remain limited. This study evaluates Creativeable, an AI-powered creativity training program that provides automated feedback and adjusts creative story writing task difficulty without human intervention. A total of 385 participants completed five rounds of creative story writing using semantically distant word prompts across four conditions: (1) feedback with adaptive difficulty (F/VL); (2) feedback with constant difficulty (F/CL); (3) no feedback with adaptive difficulty (NF/VL); (4) no feedback with constant difficulty (NF/CL). Before and after using Creativeable, participants' creativity was assessed via the alternative uses task, alongside a control semantic fluency task. While creativity improvements were evident across conditions, the degree of effectiveness varied. The F/CL condition led to the most notable gains, followed by the NF/CL and NF/VL conditions, while the F/VL condition exhibited comparatively smaller improvements. These findings highlight the potential of AI to democratize creativity training by offering scalable, personalized interventions, while also emphasizing the importance of balancing structured feedback with increasing task complexity to support sustained creative growth.

1. Introduction

Creativity is a highly valuable capacity, particularly in a world that is constantly developing and evolving. Often defined as the ability to produce something new and useful [1,2], creativity is an expansive and multifaceted concept; it encompasses various cognitive processes, including associative thinking, the restructuring of ideas and concepts, and the activation of knowledge and divergent thinking [3,4,5,6]. These qualities are indispensable in facilitating innovative and unconventional approaches to problem-solving [7], commonly referred to as thinking "outside the box". Indeed, the capacity to generate new and useful ideas plays a crucial role in innovation [8] and various fields such as the arts, academia, education, politics, and the economy. As such, the fostering and nurturing of creativity has been of interest to organizations, companies, and educational policy makers [9]. This rests on the largely agreed-upon assumption that creativity can be nurtured and enhanced [10,11,12]. The capacity to cultivate an individual's creativity—known as creativity enhancement—and the means to do so are gaining increased scientific and applied attention [10,11,12,13,14].
In their comprehensive review of creativity enhancement methods, Haase et al. [11] distinguish between creativity training and creativity manipulation. Creativity training is defined as a systematic process, or program, through which individuals acquire specific skills while being aware of the learning process and the desired objectives, which may differ from one training program to another. Moreover, training is typically intended to yield long-lasting effects. Creativity manipulation, in comparison, is simpler and shorter in duration; participants are often unaware of its specific impact on their creativity, and the intended effect is typically not meant to be long-lasting (see also Ref. [12]). For example, Müller et al. [15] evaluated participants' associative ability after guiding them through a 20-min meditation session and found an increase in their creative performance.
To date, numerous creativity enhancement programs and methods have been proposed and studied, each offering unique strategies and techniques [11,12]. However, most of these programs involve some form of human intervention, either via guidance and feedback or the assessment of participants’ performance during and after the training. This reliance on human involvement leaves a significant research gap: can AI effectively replace human guidance in creativity training? Recent advancements in AI, particularly through large language models (LLMs), offer new opportunities to explore this.
These advancements in generative artificial intelligence, particularly through LLMs such as GPT-4 [16], have opened up new opportunities for the automation of tasks that are largely performed by humans [17]. LLMs’ ability to generate fluent and contextually relevant natural language makes them valuable in various domains. These include personalized educational tools that adapt to individual teaching and learning [18,19,20], or tools that generate content for marketing purposes [21]. In creative domains, LLMs have demonstrated significant potential in tasks such as drafting research-paper blogs [22], composing poetry [23], creating humorous memes [24], creating content, story generation [25,26,27], composing music [28,29], and generating code for programmers [30].
The integration of LLMs within various workflows raises compelling questions about their potential for enhancing human creativity [31]. While LLMs have shown the ability to assist in ideation, their true promise may lie in their ability to augment and enhance creativity rather than simply substitute human input [32,33,34]. This potential becomes particularly significant in the context of creativity training programs, where AI could potentially minimize or even eliminate the need for human intervention [35]. By automating creativity training, LLMs could address the common barriers associated with traditional programs, such as time, financial costs, and geographical limitations. The result would be a more accessible and democratized approach to creativity training, enabling a wider range of individuals to develop their creativity without the constraints imposed by traditional methods.

1.1. Existing Creativity Training Programs

Creativity training programs traditionally rely on human intervention to teach, mentor, or guide creative exercises, and they are often conducted in academic or professional settings [11,12]. These programs typically run for 3–8 sessions, with each session ranging from 0.5–4 h, and are spread over a period of 1–3 months [14]. Critically, such programs try to provide participants with practical approaches and techniques to foster creativity, usually targeting divergent thinking as its facilitator [10]. Divergent thinking (DT) is the ability to generate multiple solutions to a given problem and is considered a critical aspect of creativity [36,37].
A common approach involves teaching creative ideation techniques, where participants practice generating multiple, diverse ideas to solve a problem, often within group settings that allow for collaborative creative challenges [7,38,39,40,41,42]. Another approach is through metacognitive strategies, which involve teaching participants how to regulate their own thinking during creative tasks. For example, students are taught how to monitor and adjust their thinking to generate creative ideas, using techniques such as perspective-shifting and problem reframing [43]. Other programs emphasize vicarious learning and verbal persuasion to build creative self-efficacy (the belief one has the ability to produce creative outcomes; [44,45]). For example, participants practice creative techniques and engage in real-world creative processes, such as brainstorming, to develop confidence in their ability to think creatively [46]. Brainstorming sessions, a widely used method, involve participants generating ideas either individually or in groups, with the aim of producing constraint-free, novel, and diverse ideas [47,48,49]. Some programs adopt a more contextual approach, using role-playing within story modules to encourage perspective-taking, divergent thinking, and problem-solving [50]. These techniques aim to stimulate participants’ imagination, which plays a central role in creative problem-solving and ideation [51].
While human-led programs have been found effective in fostering creativity [52], they often require skilled instructors and structured group environments. In contrast, computerized creativity enhancement programs have emerged as scalable alternatives, potentially offering more flexibility in fostering divergent thinking and problem-solving. Several studies have explored these approaches, some of which focused on verbal tasks like generating original uses for everyday objects or completing words and creating slogans, activities designed to stimulate DT through computer-based exercises [53,54]. Similarly, mind mapping techniques were used to encourage creative strategies like association, decomposition, and combination, further supporting DT [55]. Other efforts targeted specific aspects of creative cognition. For example, one online program used the Remote Associates Test (RAT) and Functional RAT [56] to help participants make connections between seemingly unrelated ideas, a key mechanism that drives creativity [3,57]. Another study used a computerized game aimed at improving conceptual combination skills by having players form associations between unrelated visual concepts, with creativity assessed through fluency, flexibility, and novelty [58]. Kim et al. [59] used computer programming to train creative problem-solving, with mixed results depending on student ability and training duration; the authors found that gifted students showed significant improvement in overall problem-solving ability, while typical students improved mainly in originality of their creative problem-solving.
Despite the effectiveness of existing creativity training programs, their reliance on human feedback and structured group environments may limit their scalability. They are often confined to academic settings, making them less accessible to the general public. This underscores the gap that exists for fully automated training programs that operate without human intervention in both the training and assessment. AI-driven creativity training programs offer a promising alternative in this regard, enabling automation and personalized adaptability without human guidance, paving the way for more accessible creativity enhancement methods.

1.2. AI and Creativity Enhancement

LLMs represent a promising step forward in enhancing human creativity. These models, exemplified by GPT-4 [16] and Llama 3 [60], are defined by their massive scale, often containing billions of parameters. They are pre-trained on extensive corpora of text, enabling them to generate human-like text across a wide array of tasks [61,62,63]. This immense scale grants these models significant capacities for pattern recognition, semantic understanding, and the ability to produce coherent, contextually relevant responses [60]. Recent research emphasizes their role as tools that can support human creativity in various ways, from idea generation to problem-solving [31].
LLMs have emerged as valuable tools for studying human cognition and creativity [64,65], problem-solving [66,67], and language acquisition [68]. Studies have demonstrated how LLMs can enhance individual creativity by offering new ideas to writers, thereby improving both the novelty and quality of their output [69,70]. However, generative AI may also reduce diversity in creative outputs due to the homogenization of AI-generated content. Such AI-based homogenization tends to make narratives or ideas more similar—i.e., consistent or less diverse—when using AI assistance [69,71,72,73,74].
Furthermore, research highlights the different ways AI can impact creativity, such as AI-human co-creation, where AI tools function as collaborative partners in the creative process [75,76]. These findings suggest that while AI can enhance creative outcomes, understanding its limitations and potential impacts on creativity, both individual and collective, is essential for future research and application. The balance between fostering individual creative output and maintaining diversity remains a key challenge in integrating LLMs into creative work [69,72,73,74,75,76,77].

1.3. Creativity Support Tools

An adjacent field involves Creativity Support Tools (CSTs), which are designed to assist individuals throughout the creative process, often within a given context or specific need. CSTs aim to enhance creative thinking, problem-solving, and innovation by promoting ideation, collaboration, and the development of creative solutions [78]. These tools range from simple applications with a single functionality to comprehensive systems that integrate multiple features. They support creative professionals in fields such as design, music, and art by helping them generate, refine, and critique ideas [79].
Research into CSTs has demonstrated the potential of computers to augment human creativity, sparking significant exploration within the Human–Computer Interaction domain. For instance, IdeaExpander facilitates group brainstorming by automatically selecting pictorial prompts based on the dynamics of the conversation [80], while skWiki provides a collaborative web-based framework for digital multimedia projects, allowing users to co-create and refine visual content [81]. Another notable system, CritiqueKit, delivers specific feedback on creative projects, aiding users to refine their work [82]. Furthermore, the Associative Creativity Sparker [83] aids in breaking mental fixation by ‘sparking’ the creative ideation process. Additionally, AI-assisted visualization tools have been implemented to assist in problem identification and construction phases during creative problem solving [84,85].
Several tools also leverage LLMs to support creative processes in specific domains. For example, Story Centaur facilitates story writing by providing writers with AI-generated prompts and feedback [86]. In the design field, Jamplate offers structured guidance and interactive prompts within a digital whiteboard environment to help novice designers develop critical and reflective thinking [87]. Other works explore how LLMs can assist by collaborating in creative writing and ideation processes [88,89].
However, while these CSTs are effective in supporting creativity within specific domains and collaborative environments, they are often limited in their ability to foster general creativity. Most CSTs focus on providing external stimuli or feedback rather than actively developing the user’s creative skills in a broader sense. Moreover, these tools tend to operate within constrained contexts, such as design or planning, and rely on collaboration with AI rather than promoting independent creative growth. We seek to address these limitations by exploring a creativity training program that develops the creative potential of its users and is not necessarily confined to specific contexts or disciplines.

1.4. Story Writing as a Creativity Training Task

Associative thinking plays a central role in creative cognition, with creative individuals often possessing richer and more flexible semantic memory structure [6,90]. This richer memory structure facilitates their associative thinking, enhancing their ability to combine existing concepts into novel and useful ideas more effectively [3,57,90,91]. To explore this notion, Mednick [57] introduced the RAT, which requires participants to find a word linking three seemingly unrelated words. This task, later revised [92,93], stimulates creative thinking by prompting connections between distant ideas and filtering them [94,95] and has also been used in creativity training [56].
Building on the associative principles of the RAT, our study employs a Story Writing Task (SWT) that requires participants to integrate three distinct words into a coherent narrative [96,97]. This approach emphasizes both divergent and convergent thinking, requiring participants to generate novel connections while maintaining narrative coherence. Similar versions of the SWT have been implemented in previous research [97,98]. This shift encourages participants to engage not only in associative thinking but also in creative problem-solving by constructing a narrative that ties these seemingly unrelated elements together in a meaningful and coherent way. By constructing narratives, they engage in DT while still exercising convergent thinking through the integration of the three words into a coherent story [99]. The absence of a clear connection between the words challenges participants to think flexibly and creatively, mirroring real-world problem-solving through constrained retrieval and evaluation processes [100]. Additionally, the task’s playful, open-ended nature can add an element of motivation and engagement, providing a structured yet flexible approach to creativity training that stimulates both idea generation and problem-solving skills.

1.5. The Current Study

The aim of this study is to evaluate the effectiveness of an AI-driven, adaptive, and personalized creativity training program without human intervention. We developed Creativeable, a novel tool designed to foster creativity independently, providing both guidance and assessment without human involvement. Creativeable leverages the capabilities of AI to create an individualized training experience by offering task-specific feedback and automatically adjusting difficulty levels based on participant performance and self-reported fatigue. A between-subject design was implemented to test the effectiveness of Creativeable along its two dimensions: with/out feedback and with/out difficulty adaptation. The effectiveness of these different conditions was examined by comparing performance before and after training on a standard creativity task (the alternative uses task; AUT) and a control task (semantic fluency; SFT). We first report analyses we conducted to validate the effectiveness of Creativeable and then the analyses conducted to examine the effect of our tool on creativity and general performance. Finally, we discuss our results and the feasibility of our tool, highlighting future directions needed for the advancement of similar tools.
The study design is grounded in theoretical models of creativity and learning. Creativity is strongly linked to expanding semantic and conceptual space, the ability to form novel connections between distantly related concepts [2,3,6,57,101]. By requiring participants to incorporate semantically distant words into a single narrative, the SWT aims to push them to bridge disparate ideas, fostering the remote associations necessary for creative thinking. Expanding this semantic search space is a core mechanism of creativity [3,102,103], and our training system was designed to enhance this process through adaptive difficulty and structured feedback.
The inclusion of difficulty level adjustment is based on Vygotsky’s zone of proximal development theory [104] and the Goldilocks principle of learning engagement [105], which propose that optimal learning occurs when challenges are neither too easy nor too difficult. By dynamically adjusting the complexity of the task, Creativeable aims to keep participants engaged at an optimal challenge level, potentially preventing both stagnation and cognitive overload.
Additionally, research on feedback and creativity suggests that task-relevant, specific feedback can significantly enhance creative performance [106,107]. Deliberate practice theory [108] also emphasizes that reflection and iterative refinement through feedback are essential for skill improvement. Creativeable incorporates AI-generated general performance feedback alongside specific, actionable suggestions, allowing participants to refine their creative strategies over multiple rounds.
The primary research questions of this study are: (1) Can an AI-driven tool like Creativeable significantly enhance creativity in the absence of human intervention? (2) How do participants experience and respond to this kind of training? (3) What insights can be drawn to inform the development of future AI-driven creativity training programs? We predict that participants who train with Creativeable will demonstrate a significant improvement in creative performance post-training compared to the pre-training baseline. In addition, we predict that Creativeable will only affect creative thinking and not impact a similar, non-creative control task in post- compared to pre-training. Finally, we predict that participant engagement and satisfaction will be positively associated with Creativeable's adaptive difficulty and AI-generated feedback mechanisms.

2. Materials and Methods

2.1. Participants

A total of 385 participants were included in the study, recruited via the crowdsourcing platform Prolific. They were compensated £9 per hour and provided informed consent. Participants were native English speakers aged 18–45 years with at least 12 years of formal education (see Table 1). Recruitment targeted ~100 participants per experimental condition to ensure sufficient power for statistical analysis, based on previous studies (i.e., [83]). Recruitment was conducted in four separate batches, each corresponding to one of the study’s experimental conditions, to ensure smooth operation of the study platform. While batch recruitment presents potential concerns for temporal differences, demographic analyses confirmed no significant differences across conditions for key variables, minimizing the likelihood of systematic bias due to batch collection. Chi-squared tests for independence did not reveal any differences across the groups for age, sex, or education. Participants were excluded if they did not complete the study, provided overly brief responses, or completed the tasks unusually quickly. The study protocol was reviewed and approved by the Institutional Review Board of the Technion—Israel Institute of Technology.

2.2. Creativeable

2.2.1. Creativeable Website

A proprietary website was developed using the oTree platform to host and conduct the tool and all aspects of the study [109]. This website includes Creativeable and pre- and post- cognitive assessments. The Creativeable tool includes the SWT with the AI trainer based on OpenAI’s GPT-4 API (see below and Figure 1). Creativeable includes five rounds of the SWT with/out the difficulty adaptation and feedback. The pre- and post-cognitive assessments include creativity assessments (Section 2.3) and a baseline fluency assessment (Section 2.4). Such a design allows comparing the effect of Creativeable—across its different manipulation conditions—on the creativity assessment compared to the baseline assessment of creativity and compared to the fluency assessment task (which serves as a control task).
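As a rough illustration of how such a study can be hosted on oTree, the sketch below defines a single app with five SWT rounds and the per-round self-report fields described in the following subsections. The class names, field names, and scale bounds are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal oTree sketch of the SWT app: five rounds, one story page per round,
# plus per-round self-report fields. Class names, field names, and scale
# bounds are illustrative assumptions, not the study's actual implementation.
from otree.api import *


class C(BaseConstants):
    NAME_IN_URL = 'creativeable_swt'
    PLAYERS_PER_GROUP = None
    NUM_ROUNDS = 5  # five story-writing rounds


class Subsession(BaseSubsession):
    pass


class Group(BaseGroup):
    pass


class Player(BasePlayer):
    story = models.LongStringField(
        label="Write a short story (60-80 words) using all three words."
    )
    self_creativity = models.IntegerField(min=1, max=5)      # assumed 1-5 scale
    perceived_difficulty = models.IntegerField(min=1, max=10)
    fatigue = models.IntegerField(min=1, max=5)              # assumed 1-5 scale


class WriteStory(Page):
    form_model = 'player'
    form_fields = ['story', 'self_creativity', 'perceived_difficulty', 'fatigue']


page_sequence = [WriteStory]
```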

2.2.2. Story Writing Task (SWT)

Participants completed a creative SWT over five rounds (Figure 1). In each round, they were presented with a set of three words, each representing a distinct semantic concept (e.g., "Coffee", "Democracy", "Galaxy"). Their task was to compose a short story of approximately three to four sentences (60–80 words) that incorporated all three words or their derivatives (e.g., plural forms). The instructions emphasized the importance of creativity and originality while maintaining coherence in their stories. Participants were informed about the total number of story writing rounds both before and during the training. After submitting each story, participants were asked to provide subjective ratings of their story's creativity, how challenging they found that round, and how fatigued they felt during it (see Text S4).

2.2.3. SWT Stimuli

The word triplets used in the SWT were manually selected prior to the study. To achieve this, we prompted GPT to generate 32 triplets of words with increasing semantic (conceptual) distance within each triplet, ensuring that each word was progressively less semantically related to the others within the same triplet (see Text S1). To validate these semantic distances, we used spaCy [110] word embeddings (en_core_web_md), calculating 1 − cosine similarity scores between words to confirm increasing semantic distance within each triplet.
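For illustration, a minimal sketch of this validation step, assuming the en_core_web_md model is installed; here semantic distance is taken as 1 − cosine similarity between spaCy word vectors.

```python
# Sketch: check that semantic distance (1 - cosine similarity) grows within a
# triplet, using spaCy's en_core_web_md vectors (model assumed to be installed).
import itertools
import spacy

nlp = spacy.load("en_core_web_md")


def pairwise_distances(triplet):
    """Return 1 - cosine similarity for every word pair in the triplet."""
    docs = [nlp(word) for word in triplet]
    return {
        (a.text, b.text): round(1 - a.similarity(b), 3)
        for a, b in itertools.combinations(docs, 2)
    }


print(pairwise_distances(("detective", "mystery", "clue")))          # smaller distances
print(pairwise_distances(("penguin", "pyramid", "microprocessor")))  # larger distances
```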
To further categorize the difficulty of these triplets, a pool of 17 crowd-sourced raters was recruited via Prolific (8 males, 9 females; ages: M = 31.0 years, SD = 6.35; formal education: M = 15.88 years, SD = 2.13). The raters, all fluent English speakers, were asked to write short stories incorporating these word triplets and to rate the difficulty of the task on a scale from 1 to 10 (1—not difficult at all, 10—very difficult). Based on these ratings, we calculated the average difficulty of each triplet and categorized them into four levels: easy (E), medium–easy (ME), medium–hard (MH), and hard (H). The 8 triplets with the lowest difficulty ratings according to the raters were classified as “easy,” the next 8 as “medium–easy,” and so on. For example, easy-level triplets consisted mostly of semantically related words (“detective”, “mystery”, “clue”), whereas hard-level triplets included more semantically distant words (“penguin”, “pyramid”, “microprocessor”).
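A short sketch of the binning step, under the assumption that a mean rater difficulty is available per triplet (the values below are placeholders): the 32 triplets are sorted by mean difficulty and split into four bins of 8.

```python
# Sketch: rank the 32 triplets by mean rater difficulty and split them into
# four equal bins of 8 (easy, medium-easy, medium-hard, hard).
# The ratings shown are placeholders, not the study's data.
mean_difficulty = {
    ("detective", "mystery", "clue"): 2.1,
    ("penguin", "pyramid", "microprocessor"): 7.9,
    # ... remaining triplets and their mean ratings ...
}

labels = ["easy", "medium-easy", "medium-hard", "hard"]
ranked = sorted(mean_difficulty, key=mean_difficulty.get)
difficulty_level = {
    triplet: labels[i * len(labels) // len(ranked)]
    for i, triplet in enumerate(ranked)
}
```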
All participants, regardless of condition, began the training program at the medium–easy (ME) difficulty level. The full list of word triplets, along with the rating prompts, semantic distance calculations, and difficulty scores, is provided in the Supplementary Information (Table S1).

2.2.4. Experimental Creativeable Conditions

We implemented four experimental conditions to examine the effects of feedback and difficulty adjustment on creativity training. These conditions were designed to test whether adaptive challenge levels and structured feedback enhance creative performance by encouraging participants to expand their semantic search space and refine their creative strategies (see Text S2 for AI prompt that realizes these four experimental conditions).
Feedback with Varying Difficulty Level (F/VL)
Participants received general performance feedback and specific creativity-enhancing suggestions after each round from a GPT-4-based “creativity trainer” via its API. The trainer also dynamically adjusted task difficulty at each round based on participants’ performance and self-reported fatigue by either increasing, maintaining, or decreasing the task’s difficulty level.
Feedback with Constant Difficulty Level (F/CL)
Participants received the same feedback and suggestions as in F/VL, but task difficulty remained fixed across the five rounds. While it is known that feedback can aid creative refinement [111], the lack of increasing challenge may limit further semantic exploration. Without progressive difficulty, we postulated that participants would show moderate creativity gains but potentially plateau over time.
No Feedback, Varying Difficulty Level (NF/VL)
Participants did not receive feedback but experienced adaptive difficulty adjustments based on performance across the five rounds. While increasing challenge can promote creative breakthroughs, the absence of guidance may slow improvement [107]. Creativity gains were expected but at a slower rate than conditions with feedback.
No Feedback, Constant Difficulty Level (NF/CL)
In this control condition, participants received neither feedback nor difficulty adjustments, as the difficulty level of the tasks remained constant. Without structured guidance or progressive challenge, improvements would rely solely on self-directed learning. Although prior studies suggest that repetition alone may foster significant creative growth [108], we expected this condition to show the least amount of improvement.
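Taken together, the four conditions above form a 2x2 between-subjects design; a minimal sketch of this configuration (the flag names are illustrative):

```python
# Sketch of the 2x2 between-subjects design: feedback (yes/no) crossed with
# difficulty adaptation (varying/constant). Flag names are illustrative.
CONDITIONS = {
    "F/VL":  {"feedback": True,  "adaptive_difficulty": True},
    "F/CL":  {"feedback": True,  "adaptive_difficulty": False},
    "NF/VL": {"feedback": False, "adaptive_difficulty": True},
    "NF/CL": {"feedback": False, "adaptive_difficulty": False},
}
```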

2.2.5. Evaluating the Creativity of the Stories

When processing the data collected from participant responses across all four conditions, we used GPT-4 to rate the creativity of the stories, as it has been established as a reliable evaluator in various textual tasks [112,113,114,115,116,117]. For each story, GPT-4 was prompted as a creative writing expert and was asked to assess creativity based on the three prompt words given to the participant. It rated the story’s creativity on a scale from 1 to 5, considering factors like how imaginative, surprising, and original the characters and plot were, as well as how effectively and creatively the three words were integrated into the narrative. Each story was evaluated five times to address potential variability in GPT’s scores, and the final creativity score was the average of these five ratings (see Text S3 for AI prompt that scores the creativity of the stories).
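A minimal sketch of this repeated-scoring procedure using the OpenAI Python client; the model string, prompt wording, and response parsing are simplified placeholders for the actual prompt in Text S3.

```python
# Sketch: rate a story's creativity five times with GPT-4 and average the
# ratings. Prompt wording is a simplified stand-in for the prompt in Text S3.
from statistics import mean
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def score_creativity(story, words, n_ratings=5):
    prompt = (
        "You are a creative writing expert. Rate the creativity of the story "
        "below on a scale from 1 to 5, considering how imaginative, surprising, "
        f"and original it is and how creatively the words {words} are integrated. "
        "Reply with a single number.\n\n" + story
    )
    ratings = []
    for _ in range(n_ratings):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        ratings.append(float(response.choices[0].message.content.strip()))
    return mean(ratings)
```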

2.2.6. Adaptive Difficulty Level Policy

The difficulty of the SWT was determined by the difficulty category assigned to the word triplet for each round. In the F/VL and NF/VL conditions, the AI trainer adjusted the task difficulty based on participants’ performance and reported fatigue from the previous round (see Text S2 for AI prompt used to adjust the task’s difficulty level). The trainer was prompted to decide whether to increase, decrease, or maintain the difficulty level for the subsequent round. If a participant demonstrated reasonable performance and a high level of creativity, the trainer would increase the difficulty level. Conversely, if the participant’s performance indicated lower creativity or if they reported high fatigue, the trainer would decrease task difficulty. For example, all participants began the first round with a word triplet from the medium–easy level. In the following round, the difficulty level could be adjusted by the trainer to easy, maintained at medium–easy, or increased to medium–hard, depending on the participant’s prior performance. The adjustments were made incrementally, meaning participants could only move to the adjacent lower or higher difficulty level, i.e., it was not possible to skip directly from the easy level to the medium–hard level in a single round.
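The one-step adjustment rule can be sketched as follows; the numeric thresholds here are illustrative, since in Creativeable the increase/maintain/decrease decision was made by the GPT-4 trainer rather than by fixed cutoffs.

```python
# Sketch of the one-step adjustment rule: the difficulty may only move to an
# adjacent level. Numeric thresholds are illustrative; in Creativeable the
# increase/maintain/decrease decision was made by the GPT-4 trainer itself.
LEVELS = ["easy", "medium-easy", "medium-hard", "hard"]


def next_level(current, creativity_score, fatigue):
    idx = LEVELS.index(current)
    if creativity_score >= 3.5 and fatigue <= 3:   # strong, non-fatigued performance
        idx = min(idx + 1, len(LEVELS) - 1)
    elif creativity_score < 2.5 or fatigue >= 4:   # weak performance or high fatigue
        idx = max(idx - 1, 0)
    return LEVELS[idx]                             # otherwise maintain the level


print(next_level("medium-easy", creativity_score=4.0, fatigue=2))  # -> "medium-hard"
```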

2.2.7. Personalized Feedback and Suggestions for Improvement

In the F/VL and F/CL conditions, after each SWT round, participants received two types of feedback from the trainer. The first type was general feedback on the story. To ensure the feedback was broad and adaptable, the prompt provided to the trainer was intentionally not too specific. The AI trainer was instructed to “evaluate a short story [written by the participant] that incorporates three specific words [the participant] was asked to seamlessly and creatively integrate into the story (…) and provide concise feedback (limited to 2 sentences) on the story”. The feedback typically addressed the participant’s performance on the task, the plot’s interest level, the creative use of the words in the given triplet, and other aspects of writing style and structure.
The second type of feedback was a focused suggestion on how to enhance creativity in the next round of story writing. The trainer was instructed to "suggest one primary way for [the participant] to improve [their] creativity or brainstorming process for such a task, not necessarily for that specific story. Try to avoid repeating the same suggestions". To provide context, each time GPT was invoked, it was also presented with the participant's stories from earlier rounds, along with the feedback and suggestions already provided (see Text S2 for the AI prompt used by the AI trainer to provide feedback).
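A minimal sketch of how such a context-aware feedback call might be assembled with the OpenAI Python client; the message wording abbreviates the prompts in Text S2, and the history format is an assumption.

```python
# Sketch: request concise feedback plus one improvement suggestion, passing the
# participant's earlier stories and earlier feedback as conversational context.
# Message wording abbreviates the prompts in Text S2; the history format is assumed.
from openai import OpenAI

client = OpenAI()


def get_feedback(story, words, history):
    system = (
        "You are a creativity trainer. Evaluate a short story that incorporates "
        f"the words {words}. Provide concise feedback (2 sentences) and suggest "
        "one primary way to improve creativity or brainstorming next round; "
        "avoid repeating earlier suggestions."
    )
    messages = [{"role": "system", "content": system}]
    messages += history  # prior stories and the feedback already provided
    messages.append({"role": "user", "content": story})
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content
```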
After receiving the two types of feedback, the general comments and the focused suggestion for improvement, participants were asked to rate how useful they found the improvement suggestions on a scale from 1 to 4: 1—“Not helpful—I didn’t find the suggestion relevant to my writing style or creativity needs”; 2—“Slightly helpful—The suggestion was somewhat relevant, but I’m not sure how it will impact my creativity”; 3—“Moderately helpful—The suggestion seems relevant, and I’m curious to see if it improves my creativity”; 4—“Very helpful—The suggestion was insightful and seems like it could really enhance my creative writing”.

2.3. Creativity Assessment

For the main creativity assessment tool, we used the Alternative Uses Task (AUT), a widely used psychological assessment designed to measure DT, which is a key component of creativity [36]. In this task, participants are presented with a common object (e.g., a brick) and are asked to generate as many different and unusual uses for that object as they can think of within a specific time frame [118]. Two random items from a predefined list were consecutively presented to the participant, before and after Creativeable. For each of these objects, participants had 2 min to provide all the unusual and creative uses they could think of. Performance in the task was evaluated using the online automated scoring platform OCSAI, which uses a prompted GPT instance to score the originality of each use, given the item, on a scale from 1 to 5 (1—not creative, 5—very creative) [119]. We averaged the scores for each item, then averaged the scores of the two items presented before Creativeable to calculate the "pre-training" AUT score, and finally averaged the scores of the two post-training items for the "post-training" AUT score. To assess task fluency, we counted the number of creative uses provided by each participant and followed the same averaging procedure [120]. The items used in the task were 'toothbrush', 'shoelace', 'pencil', 'tire', 'bucket', 'chair', 'newspaper', 'pillow', 'paperclip', and 'plastic bottle' [121].
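A small sketch of this aggregation, assuming per-response originality ratings are already available from OCSAI (the ratings below are placeholders):

```python
# Sketch of the AUT aggregation: average originality within each item, then
# average the two pre-training items and the two post-training items.
# Per-response ratings below are placeholders for OCSAI output.
from statistics import mean


def aut_score(scores, items):
    return mean(mean(scores[item]) for item in items)


scores = {
    "toothbrush": [2.0, 3.5, 4.0],
    "shoelace": [1.5, 2.5],
    "pencil": [3.0, 3.0, 2.0],
    "tire": [4.5, 2.0],
}
pre_aut = aut_score(scores, ["toothbrush", "shoelace"])   # pre-training items
post_aut = aut_score(scores, ["pencil", "tire"])          # post-training items
pre_fluency = mean(len(scores[i]) for i in ["toothbrush", "shoelace"])
```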

2.4. Fluency Assessment

The Semantic Fluency Task (SFT) is commonly used to assess executive functions, particularly in areas related to semantic memory [122,123,124,125]. During the task, participants are given a specific time limit, usually two minutes as in this study, and are asked to come up with as many different members of a given category as they can within that time frame. The task measures a person’s ability to rapidly produce words within a certain category and is used to evaluate fluency and cognitive flexibility [126]. In this study, the category was animals, and participants were given 2 min to perform the task. We measured fluency by counting the number of responses [127].

2.5. Procedure

Participants began the study by providing informed consent. They then completed initial assessment tasks: two rounds of the AUT and one round of the animal category SFT. Following this, they underwent five rounds of creativity training via Creativeable with the SWT, based on their assigned study condition. Afterward, participants completed another two rounds of the AUT and one round of the SFT to measure changes in their creativity and concluded by reflecting on their experience and providing demographic details. The whole study took 50 min on average.

3. Results

3.1. Validation of Experimental Design

First, we examine whether Creativeable effectively elicited the intended variations in participant experiences across conditions. We conducted analyses to verify that task difficulty, the perceived task difficulty by the participants, and their fatigue levels aligned with the experimental manipulations and varied across rounds and conditions. All subsequent inter-round comparison validation analyses were adjusted for multiple comparisons using the Holm-Bonferroni procedure [128], unless stated otherwise.
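For reference, the Holm-Bonferroni adjustment can be applied with statsmodels as in the sketch below; the p-values shown are placeholders, not the study's results.

```python
# Sketch: Holm-Bonferroni adjustment of a set of pairwise p-values using
# statsmodels (the p-values shown are placeholders, not the study's results).
from statsmodels.stats.multitest import multipletests

p_values = [0.004, 0.021, 0.046, 0.300]
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")
print(list(zip(p_adjusted.round(3), reject)))
```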

3.1.1. Perceived Task Difficulty Through Training Rounds

We conducted a mixed-design ANOVA to explore changes in self-perceived task difficulty across the five training rounds and among the four conditions. The analysis revealed a significant main effect of condition, F(3, 381) = 3.23, p = 0.022, η2 = 0.02, indicating that participants' perceived task difficulty differed significantly between conditions. A significant main effect of training round was also found, F(4, 1524) = 7.78, p < 0.001, η2 = 0.02, indicating that participants' ratings fluctuated over the course of the five rounds, independent of condition. Additionally, a significant interaction between condition and training round was found, F(12, 1524) = 1.87, p = 0.033, η2 = 0.014, indicating that the pattern of change in self-perceived difficulty over time varied significantly across the four conditions.
Tukey HSD corrected post hoc comparisons revealed a significant difference in self-perceived task difficulty between several condition pairs. Participants in the F/VL condition (M = 2.75) reported significantly higher task difficulty compared to those in the NF/CL condition (M = 2.51), SE = 0.071, p = 0.004, Cohen’s d = 0.21. Similarly, the F/CL condition (M = 2.82) had significantly higher task difficulty ratings than the NF/CL condition, SE = 0.071, p < 0.001, Cohen’s d = 0.28. The NF/VL condition (M = 2.83) also reported significantly higher task difficulty scores compared to the NF/CL condition, SE = 0.072, p < 0.001, Cohen’s d = 0.30. However, no significant differences were found between F/VL and NF/VL, F/VL and F/CL, or NF/VL and F/CL conditions. These findings indicate that without feedback, varying difficulty levels increase perceived task difficulty. However, when difficulty levels remain constant, the presence of feedback similarly increases perceived difficulty. When either feedback or varying difficulty is present, adding the complementary factor (feedback or varying difficulty) does not result in a further increase in perceived task difficulty.
In both VL conditions, perceived task difficulty increased over the training rounds, peaking around rounds 3 and 4, with notable variability in responses. A repeated-measures ANOVA revealed a significant overall increase in perceived difficulty in both VL conditions, F/VL: F(4, 384) = 5.21, p < 0.001, η2 = 0.021; NF/VL: F(4, 368) = 4.93, p < 0.001, η2 = 0.026. In the F/VL condition, the average perceived difficulty during rounds 4 and 5 was significantly higher than in round 1 (t(96) = 4.29, p < 0.001, Cohen's d = 0.32; t(96) = 2.86, p = 0.046, Cohen's d = 0.43, respectively). Similarly, in the NF/VL condition, rounds 2 through 4 were significantly higher than round 1, t(92) = 2.84, p = 0.044, Cohen's d = 0.34; t(92) = 3.82, p = 0.002, Cohen's d = 0.47; t(92) = 3.73, p = 0.003, Cohen's d = 0.43, respectively. In the F/CL condition, there was a significant change in perceived task difficulty across rounds, F(4, 388) = 2.94, p = 0.02, η2 = 0.01. Participants in this condition reported relatively stable difficulty levels, with the only significant change occurring between rounds 1 and 2, t(97) = 3.53, p = 0.01, Cohen's d = 0.33. In the NF/CL condition, participants reported stable difficulty levels with minimal variation across rounds. A repeated-measures ANOVA confirmed this stability, F(4, 384) = 0.29, p = 0.88, η2 = 0.001, indicating no significant shifts in perceived task difficulty, consistent with the study design.

3.1.2. Self-Reported Fatigue Through Training Rounds

We conducted a mixed-design ANOVA to examine the effect of experimental condition and training round on self-reported fatigue levels. A significant main effect of training round was observed, F(4, 1524) = 104.7, p < 0.001, η2 = 0.21, indicating that participants' fatigue levels significantly increased over the course of the five rounds, regardless of condition. However, there was no significant main effect of condition, F(3, 381) = 0.45, p = 0.71, η2 = 0.004, suggesting that fatigue levels did not differ significantly between the four conditions. Additionally, no significant interaction effect between condition and training round was found, F(12, 1524) = 1.37, p = 0.17, η2 = 0.01, indicating that the pattern of fatigue increase over time was consistent across all conditions. These findings indicate that self-reported fatigue increased as participants progressed through the rounds, while fatigue levels remained comparable across the different conditions.

3.1.3. Task Difficulty Levels Through Training Rounds

Across all four study conditions, participants started the creativity training with a word triplet of medium–easy (ME) difficulty, which remained constant in the two CL conditions. In the F/VL and NF/VL conditions, however, task difficulty was adjusted based on participants’ performance and reported fatigue, according to the AI trainer (Figure 2). In both conditions, participants progressively moved to higher difficulty levels (medium–hard and hard), with many reaching these by round 3. By round 4, most participants had stabilized at the higher levels, with only a few returning to easier tasks. This pattern aligns with participants’ reported task difficulty, where perceived difficulty increased up to rounds 3 and 4 followed by a slight decrease. This is possibly due to participants adapting to the tasks, or adjustments made by the AI trainer to ensure an appropriate difficulty.
Statistical analysis using Chi-square tests for independence revealed no significant differences in difficulty level progression between the F/VL and NF/VL conditions across the five training rounds. Specifically, Chi-square tests across rounds showed p-values ranging from 0.146 to 1.0 (all p’s > 0.05), indicating that the distribution of difficulty levels was comparable between the two conditions. For example, in round 2, χ2(2) = 2.76, p = 0.252; in round 3, χ2(3) = 5.39, p = 0.146; and in round 5, χ2(3) = 5.37, p = 0.147. These results suggest that the adaptive difficulty adjustments were similar in both conditions, with minimal impact from the feedback provided in the F/VL condition. Although there was some variation, particularly a slight return to easier levels in the F/VL condition, feedback’s overall influence on difficulty transitions appears limited.

3.2. Creativity and Performance Results

Next, we examined the impact of Creativeable on participants’ creative performance. Changes in creativity were assessed using a range of measures, including the AUT, fluency task, GPT-assigned creativity scores, and self-reported creativity ratings. The analyses examine whether the training led to significant improvement in creativity, both across conditions and over time, and whether factors such as feedback and adaptive task difficulty contributed to these changes.

3.2.1. Creativity Improvements Across Conditions

Participants’ creativity scores, as measured by the AUT, showed moderate but statistically significant increases across all four experimental conditions (Figure 3). We conducted a mixed-design ANOVA to examine the effect of the training on creativity scores, with time (pre-training vs. post-training) as a within-subjects factor and condition as a between-subjects factor. This analysis revealed a significant main effect of time, F(1, 381) = 108.76, p < 0.001, η2 = 0.22, indicating that participants’ creativity scores increased significantly from pre-training (M = 2.6, SD = 0.36) to post-training (M = 2.81, SD = 0.4), regardless of condition. This analysis also revealed a marginally significant main effect of condition, F(3, 381) = 2.5, p = 0.06, η2 = 0.019, suggesting a potential trend toward differences in creativity scores between the four conditions. Finally, this analysis revealed a significant condition x time interaction effect, F(3, 381) = 2.74, p = 0.043, η2 = 0.021, indicating that the change in creativity scores from pre- to post-training differed significantly across conditions. To further explore this interaction effect, we conducted Tukey’s HSD post hoc comparisons separately for pre-training and post-training creativity scores, and within conditions.
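A sketch of this pre/post x condition mixed-design ANOVA using the pingouin package on long-format data; the column names and toy values are illustrative assumptions, not the study's data.

```python
# Sketch: mixed-design ANOVA with time (pre/post) as the within-subjects factor
# and condition as the between-subjects factor, on long-format data.
# Column names and the toy values are illustrative.
import pandas as pd
import pingouin as pg

df = pd.DataFrame({
    "id":        [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8],
    "condition": ["F/VL"] * 4 + ["F/CL"] * 4 + ["NF/VL"] * 4 + ["NF/CL"] * 4,
    "time":      ["pre", "post"] * 8,
    "aut":       [2.6, 2.7, 2.5, 2.6, 2.5, 2.8, 2.6, 2.9,
                  2.7, 2.9, 2.6, 2.8, 2.6, 2.8, 2.7, 2.9],
})

anova = pg.mixed_anova(data=df, dv="aut", within="time",
                       between="condition", subject="id")
print(anova)
```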
A one-way ANOVA examined the effect of condition on pre-training creativity scores. This analysis revealed no significant differences in AUT scores between conditions before the training, F(3, 381) = 1.6, p = 0.18, η2 = 0.012, indicating that participants across conditions began with similar baseline levels of creativity. A one-way ANOVA examined the effect of condition on post-training creativity scores. This analysis revealed significant differences in AUT scores between conditions after the training, F(3, 381) = 3.34, p = 0.002, η2 = 0.025, indicating that the different training conditions had varying levels of effectiveness in improving creativity. We conducted a Tukey HSD corrected post hoc analysis to explore these differences further, finding a significant mean difference between the NF/VL and F/VL conditions (ΔM = 0.175, SE = 0.06, p = 0.014, Cohen’s d = 0.41). Other paired comparisons between conditions for post-training scores were not statistically significant.
As noted above, significant increases in AUT scores were observed across all conditions from pre-training to post-training. In the F/VL condition, the mean AUT score increased from M = 2.60 (SD = 0.34) before training to M = 2.71 (SD = 0.44), t(96) = 2.75, p < 0.001, Cohen’s d = 0.28. For the F/CL condition, the mean AUT scores increased from M = 2.56 (SD = 0.40) to M = 2.81 (SD = 0.38), t(97) = 6.38, p < 0.001, Cohen’s d = 0.65; in the NF/VL condition, the mean AUT scores increased from M = 2.67 (SD = 0.33) to M = 2.89 (SD = 0.41), t(92) = 5.94, p < 0.001, Cohen’s d = 0.58. Finally, in the NF/CL condition, the mean AUT scores increased from M = 2.63 (SD = 0.37) to M = 2.84 (SD = 0.35), t(96) = 5.97, p < 0.001, Cohen’s d = 0.59.
To further explore these differences, we calculated the percentage rate of improvement in AUT scores from pre- to post-training. The F/VL condition showed an average improvement of M = 4.82%, SD = 16.2%, 95% CI = [1.57, 8.09]; the F/CL condition registered an improvement of M = 11.6%, SD = 17.42%, 95% CI = [8.11, 15.09]; the NF/VL condition improved by M = 8.61%, SD = 14.14%, 95% CI = [5.7, 11.52]; and the NF/CL condition by M = 9.42%, SD = 15.0%, 95% CI = [6.41, 12.44]. A one-way ANOVA confirmed that improvement rates differed significantly between conditions, F(3, 381) = 3.12, p = 0.025, η2 = 0.024. Tukey HSD corrected post hoc comparisons revealed that the F/CL condition was significantly more effective than the F/VL condition, with a mean difference of 6.77%, SE = 2.25, p = 0.015, Cohen's d = 0.4. No significant differences in improvement rates were found between the other conditions.
While creativity training led to improved creativity across all four conditions, it did not enhance participants’ ability to generate more ideas in the AUT or in the SFT. We conducted a mixed-design ANOVA to assess changes in the average number of responses (uses) generated in the AUT, with time (pre-training vs. post-training) as a within-subjects factor and condition as a between-subjects factor (Figure 4). The analysis revealed no significant main effect of time, F(1, 381) = 0.04, p = 0.90, η2 = 0.0, nor interaction between time and condition, F(3, 381) = 1.53, p = 0.2, η2 = 0.01.
We conducted a similar mixed-design ANOVA for the SFT, with time and condition as factors (Figure 5). This analysis revealed no significant main effect of time, F(1, 381) = 0.40, p = 0.52, η2 = 0.001, nor significant interaction between time and condition, F(3, 381) = 0.29, p = 0.83, η2 = 0.002. However, there was a significant main effect of condition, F(3, 381) = 5.0, p = 0.01, η2 = 0.04, indicating group-level differences across the conditions. Thus, while the training encouraged participants to generate more original and creative ideas, it did not lead to an increase in the overall number of ideas produced, indicating that fluency was not affected by the training.

3.2.2. Creativity Scores Across Training Rounds

Next, we examined how creativity scores changed overall across the training rounds and how this change differed across conditions. To do so, we examined two round-specific scores: the GPT-evaluated creativity score and participants' subjective creativity score for every story across each of the rounds.
The analysis of creativity scores assigned by GPT revealed patterns of change across the five training rounds, with significant differences in specific conditions. The average creativity scores across all conditions fluctuated between approximately 2.5 and 3.6, on a scale from 1 to 5. Participants generally experienced a slight increase in their creativity scores from the first to the second round, likely due to increased familiarity with the task. In the NF conditions (NF/VL and NF/CL), creativity scores did not exhibit significant changes across rounds. However, in the F/CL condition, statistically significant changes were observed across the five rounds.
To examine the effects of condition and training round on GPT-evaluated creativity scores, we conducted a condition x training round mixed-design ANOVA. This revealed a significant main effect of condition, F(3, 381) = 29.48, p < 0.001, η2 = 0.155, indicating that creativity scores differed across the four conditions; a significant main effect of training round, F(4, 1524) = 10.21, p < 0.001, η2 = 0.026, indicating that creativity scores changed significantly across the five rounds, regardless of condition; and a significant interaction between condition and training round, F(12, 1524) = 2.01, p = 0.02, η2 = 0.016, indicating that changes in creativity scores across rounds varied depending on the assigned condition.
Further analysis using repeated-measures ANOVA within each condition revealed a significant change in creativity scores in the F/VL condition, F(4, 384) = 3.32, p = 0.01, η2 = 0.018 (Figure 4). Specifically, the mean creativity score in the F/VL group increased from M = 2.7 (SD = 0.6) in the first round to M = 2.98 (SD = 0.79) in the third round, t(96) = 3.12, p = 0.02, Cohen's d = 0.38. Furthermore, this analysis revealed a significant change in creativity scores in the F/CL condition, F(4, 388) = 12.14, p < 0.001, η2 = 0.058. Specifically, the mean creativity score in the F/CL group increased from M = 3.03 (SD = 0.66) in the first round to M = 3.49 (SD = 0.69) in the third round, t(97) = 5.52, p < 0.001, Cohen's d = 0.43, and remained elevated at M = 3.44 (SD = 0.75) in the fifth round, t(97) = 4.97, p < 0.001, Cohen's d = 0.57. Comparisons between intermediate rounds (e.g., Rounds 3 and 4 or 4 and 5) were not statistically significant. In contrast, in the NF conditions, GPT-evaluated creativity scores did not change significantly across rounds (all p's > 0.05), indicating that the training round had minimal impact on scores in these groups.
Next, we conducted a similar analysis on participants' subjective creativity evaluation for each of their stories across the training rounds. The pattern of self-reported creativity scores somewhat mirrored the GPT-assigned creativity scores, with most scores ranging between 2.5 and 3.3. Across all conditions, there was a noticeable increase in self-reported creativity from the first to the second round, followed by a decline between the second and third rounds and between the third and fourth rounds. Like the GPT-evaluated scores, there was a slight-to-moderate increase in self-reported creativity between the fourth and fifth rounds. A mixed-design ANOVA revealed a significant effect of training round, F(4, 1524) = 6.34, p < 0.001, η2 = 0.16, and a significant effect of condition, F(3, 381) = 3.38, p = 0.018, η2 = 0.03, but no significant interaction between the two, F(12, 1524) = 0.16, p = 0.99, η2 = 0.001.

3.3. Exploratory Insights on Feedback

Finally, we conducted exploratory and qualitative analysis of the feedback provided to participants during the study. Specifically, we examined the content, themes, and perceived utility of feedback offered by the AI trainer, including general narrative critiques and personalized creativity suggestions. We explored participants’ responses to the feedback, their strategies for incorporating it into their writing, and their reflections on its impact on creativity and engagement. These insights provide a more nuanced understanding of the role of feedback in fostering creativity through Creativeable and highlight the potential challenges and benefits of using AI-driven feedback systems in creative tasks.

3.3.1. Transfer Feedback

Through GPT-4’s API, Creativeable provided concise feedback on participants’ stories, consisting of a positive remark to acknowledge effort and constructive criticism to guide improvement. Feedback focused on story structure, integration of prompt words, character development, and thematic coherence, with recurring themes emphasizing vivid scenes, well-rounded characters, and coherent narratives (examples in Table 2). While participants were often commended for originality and creative use of prompt words, common critiques noted issues like stories resembling summaries rather than dynamic narratives. Feedback also addressed writing conventions, including sentence structure, punctuation, and scene transitions, positioning the AI trainer more as a writing coach than a creativity facilitator. Adjusting the GPT instance’s prompt to prioritize creativity-focused feedback may better align with the program’s objectives.
Analysis of participants’ suggestions revealed that they were generally divided into six key themes: DT techniques, writing techniques, perspective shifts, sensory and emotional immersion, creative constraints and challenges, and rest and rejuvenation (Figure 6). Examples of suggestions for each theme are provided in Table S2.

3.3.2. Participant Experiences with Improvement Suggestions in the Feedback Conditions

During the reflection phase, participants in the F/VL and F/CL conditions shared mixed perspectives on the feedback they received and its applicability to their stories (Figure 7). Many participants found the feedback helpful and attempted to implement it, noting improvements in areas such as using metaphors, enhancing cohesion, and refining punctuation. For example, one participant remarked, “I enjoyed the feedback and tried to add what I learned into the following stories if I could think of ways to.” However, some participants faced challenges, citing the feedback’s complexity, perceived irrelevance, or impracticality within the constraints of the task. One participant expressed confusion over undefined terms, while another noted, “I tried to incorporate the feedback, but then I ended up going over my word limit.” A few participants were skeptical about the value of AI-generated feedback altogether, viewing it as misaligned with their creative goals.
Despite these challenges, several participants reported that the feedback enhanced their creativity and motivation, as one participant reflected, “The feedback was helpful and a key motivator that enhanced my creativity.” These findings highlight the importance of providing clear, actionable, and context-appropriate feedback, particularly in creative tasks with specific constraints, as poorly tailored feedback may hinder rather than support creative outcomes.

3.3.3. Changes in Motivation and Engagement During the Training

To better understand the design considerations for developing a creativity training program using semantic and textual stimulation, participants were asked reflection questions about their experiences during the final part of the study. These questions explored their perceptions of the repeated short story writing task, the strategies they employed, and their thoughts on whether and how their creativity improved. While responses provided valuable insights, only findings related to motivation and engagement are detailed here. Other aspects were beyond the scope of this paper and were, therefore, not included in the final analysis. We qualitatively explored whether the different conditions influenced participants’ motivation and engagement during the approximately 50-min study and the factors driving these changes.
Across all conditions, many participants reported consistent levels of motivation and engagement throughout the study. For instance, one NF/VL participant shared, “My level of motivation and engagement remained constant throughout the study. If anything, it increased as the study progressed.” Similarly, an NF/CL participant noted, “My motivation was the same since the words for each story were very different and allowed for varied stories to be written”.
Participants who enjoyed the task reported sustained motivation across conditions. One NF/VL participant remarked, “I really enjoyed the challenge of three random words and trying to make them make sense and fit in a story,” while an NF/CL participant wrote “I enjoyed writing stories and word games, so the challenge kept me engaged.” In the F/VL condition, feedback was a significant motivator for some participants. For example, one participant shared, “The feedback suggestions made me want to do better each time,” while another noted, “After each feedback, my motivation grew—I wanted to try harder and possibly beat the feedback system.” However, feedback occasionally had the opposite effect, with some participants feeling discouraged or frustrated, which negatively impacted their engagement.
Finally, fatigue and the repetitive nature of the tasks were common factors contributing to decreased motivation across conditions. One F/VL participant stated, “My motivation was high initially but dropped as I got tired,” while an NF/VL participant remarked, “By the 4th and 5th stories, I was too fatigued to think creatively”.

4. Discussion

In this study, we developed an automated, personalized, and adaptable creativity enhancement program (Creativeable) to evaluate the effectiveness of AI-driven feedback and adaptive task difficulty in enhancing creativity without human intervention. Creativeable leverages an LLM to deliver personalized creativity training through a story-writing task, where participants integrate three words of varying semantic distance into a coherent narrative [96]. We examined the impact of AI-generated feedback, adaptive difficulty, and participant fatigue on creative performance, providing insights into how AI can support scalable, fully automated creativity training. Specifically, we measured participants’ performance on a creativity task (AUT) and a control task (SFT) before and after engaging with Creativeable.

4.1. Comparing with Existing Training Programs

In contrast to traditional creativity training programs that rely on human-led instruction [11,12], our study introduces an automated, AI-driven approach that addresses key limitations of existing methods, such as limited scalability, time demands, and lack of personalization [10,14]. While previous programs have focused on collaborative settings and exercises guided by a human instructor [7,38,39,40,41,42,43,46,47,48,49,50,51], Creativeable offers a novel approach that eliminates the need for human intervention in both guidance and assessment [79]. This could allow for a more accessible and adaptable training experience, tailored to individual participant performance.
Similarly to computerized creativity programs, such as those suggested by Fink et al. [54] and Huang [56], our study leverages technology to enhance creativity. Importantly, we go even further by incorporating AI-generated feedback and difficulty adjustments, enabling a fully autonomous training process. Unlike prior computerized efforts, which still require human feedback or assessment [35], Creativeable provides real-time, personalized input without the need for human involvement.
By offering a scalable and adaptive approach, our study extends the potential for AI in creativity enhancement beyond the limitations of resource-intensive, human-led programs [11,12], providing a more democratized and accessible platform for creativity training. This shift provides a step forward from previous research [32,33,34] in addressing the constraints of traditional methods while opening new pathways for creativity research and practice.

4.2. Effectiveness of Creativity Enhancement

The results demonstrated that Creativeable led to significant improvements in creativity scores across all four conditions, as measured by the AUT. While the improvements were evident across conditions, the degree of effectiveness varied, with the F/CL condition showing the most notable gains, followed by the NF/CL and NF/VL conditions, while the F/VL condition exhibited comparatively smaller improvements.
This variation suggests that feedback, when combined with consistent lexical stimulation (F/CL), may be particularly effective in fostering creativity, whereas adding variable difficulty (F/VL) appears to have introduced complexities that reduced its overall impact. Thus, feedback and varying difficulty levels may not act synergistically on creativity improvement; the inherent challenge and intrinsic motivation of the task itself may be sufficient to drive improvements, while combining the two interventions may have introduced cognitive or motivational demands that diminished their joint benefit.
The finding that creativity scores improved significantly in the F/CL condition aligns with the idea that structured, targeted feedback can enhance creative processes, particularly when paired with stable external stimuli [129]. This condition appears to have supported participants in refining their creative strategies over time, as reflected in the significant increases observed across the five training rounds.
The variability in effectiveness across conditions underscores the importance of understanding the interactions between feedback, task difficulty, and lexical stimulation in creativity training. While feedback can be a powerful motivator and guide, as evidenced in the F/CL condition, its utility depends on its alignment with task design and participant needs. The relatively smaller gains in the F/VL condition suggest that combining feedback with adaptive difficulty may inadvertently hinder creativity by introducing cognitive or motivational challenges.
One possible reason for this hindering effect is increased cognitive load, which has been shown to impact creativity [130]. Previous work has shown that cognitive load impacts AUT performance (e.g., Refs. [131,132]). For example, Rodet [131] conducted a dual-task study in which cognitive load was increased in parallel to the AUT, and found that as cognitive load increased, the quality and quantity of AUT responses decreased. Redifer et al. [132] examined the relationship between performance feedback, creative self-efficacy, and cognitive load during creativity tasks (such as the AUT). The authors showed that positive feedback coupled with higher creative self-efficacy was associated with lower cognitive load during the creativity tasks, whereas negative feedback was associated with higher cognitive load that negatively impacted AUT performance. Given that we found increased fatigue across the training rounds in the variable-level conditions, such fatigue may interact with the feedback condition and thus interfere with the separate positive training effects of feedback and adaptive difficulty. Future research is therefore needed to further elucidate such possible dual-task effects and their impact on creativity training.
While the training program enhanced creativity scores, it did not lead to a significant increase in fluency, as measured by the number of ideas generated in the AUT or in the fluency task. This distinction reinforces the notion that the SWT primarily targeted DT and idea originality, which are both critical for creative idea restructuring [36,37], rather than fluency, which has been shown to be a confound of idea originality [133,134]. This result highlights the success of our tool: while participants generated a similar number of AUT responses post-training, these responses were more original. Furthermore, the lack of change in fluency on the SFT control task lends further support to the conclusion that our tool facilitated originality in participants’ responses. The absence of fluency gains across conditions highlights the importance of aligning training tasks with specific cognitive outcomes. If future programs aim to enhance both fluency and DT, they may need to incorporate a broader range of tasks tailored to these dimensions.

4.3. AI Generated Feedback in Creativity Enhancement

Exploring the role of AI-generated feedback in enhancing creativity reveals a nuanced picture. Feedback was generally rated as useful by participants, but individual experiences varied. Some participants in the F/VL condition reported that the feedback helped guide their creative process and motivated them to improve, while others found it too generic or not directly applicable. Interestingly, the types of suggestions participants found most beneficial were sensory and emotional immersion and writing techniques, while DT techniques were deemed less useful. This shows that participants valued a more CST-like behavior from the AI trainer, i.e., when it gave them very specific and contextualized creative writing tips, and that immediate implementation of DT techniques was either cognitively demanding or unclear. This variability suggests that AI trainer feedback, while valuable, requires further refinement to match the contextual awareness and task-specific relevance typically provided by human feedback. Recent studies [69,135] noted that while feedback can enhance creativity, it may also constrain originality by encouraging participants to fit their responses into perceived expectations. Critically, these studies highlight the dual nature of feedback: while it can be a motivator and provide valuable insights, it might also limit DT if participants focus too much on “satisfying” the AI trainer rather than exploring creative possibilities.
Importantly, our AI trainer gave participants both general and specific feedback: the general feedback addressed overall aspects of the story, while the specific feedback offered suggestions on how to enhance creativity in the next story. As such, the AI trainer acts both as a writing coach and as a creativity facilitator. Because we did not differentiate between these two aspects of feedback, we cannot determine which aspect participants adopted; disentangling a “writing enhancement effect” from a “creativity enhancement effect” therefore requires future research. Nevertheless, richer use of language is linked to heightened creativity [3,136], and thus such a dissociation may not be critical.
The comparison between AI and human feedback raises important considerations for the future of AI in creativity training [32,33,34]. While AI can provide scalable and accessible feedback, particularly in large-scale or remote training programs, there may be a need to integrate human oversight in the design and monitoring of such programs [75]. This hybrid approach could help mitigate the limitations observed in this study, offering more personalized and contextually relevant guidance to participants. Possible ways forward include few-shot prompting, reinforcement learning from human feedback, or LLM fine-tuning, as sketched below. Several of these methods have already been examined with regard to improving objective originality assessment [137], and could thus be integrated into AI creativity training programs in the future.
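To illustrate the first of these directions, below is a minimal sketch of few-shot prompting for creativity feedback. It assumes the OpenAI Python client, and the system prompt, example stories, and function names are illustrative assumptions rather than the actual prompts used in Creativeable (those are reported in Supplementary Texts S1–S3).

```python
# Minimal sketch: few-shot prompting an LLM for creativity feedback.
# Assumes the OpenAI Python client (openai >= 1.0); prompts, examples,
# and function names are illustrative, not Creativeable's implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

FEW_SHOT_EXAMPLES = [
    {"story": "The lighthouse keeper taught algebra to seagulls...",
     "feedback": "Strong imagery; try grounding the premise in one concrete scene."},
    {"story": "A spreadsheet fell in love with a thunderstorm...",
     "feedback": "Playful metaphor; consider building an emotional arc across the three prompt words."},
]

def build_messages(story: str) -> list[dict]:
    """Build a chat prompt with worked feedback examples before the target story."""
    messages = [{"role": "system",
                 "content": ("You are a creative-writing coach. Give one short, "
                             "specific, actionable suggestion per story.")}]
    for ex in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": ex["story"]})
        messages.append({"role": "assistant", "content": ex["feedback"]})
    messages.append({"role": "user", "content": story})
    return messages

def get_feedback(story: str, model: str = "gpt-4") -> str:
    """Return a single feedback suggestion for the given story."""
    response = client.chat.completions.create(
        model=model, messages=build_messages(story), temperature=0.7)
    return response.choices[0].message.content

# Example call:
# print(get_feedback("A violin, a passport, and an umbrella met at midnight..."))
```

In a similar spirit, the few-shot examples could be drawn from human-rated, high-quality feedback, which is one way to nudge the model towards more contextually aware suggestions.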

4.4. Implications for Future Creativity Training Programs

The findings from this study offer several insights into the design of future AI-driven creativity training programs. A key takeaway is the importance of adaptive feedback that is both task-specific and personalized. While AI-generated feedback has the potential to provide consistent and scalable guidance in writing [138,139,140], it must be carefully tailored to the specific needs of the participant and the context of the task to be truly effective. Furthermore, it should be practical, such that it can be implemented immediately within a training session, as well as relevant for activities outside of the training itself, like the recreational activities suggested by the trainer in this study. This can be achieved by allowing participants to ask the trainer clarifying questions, or to chat with it to better understand its suggestions and the reasoning behind them.
A balanced approach to creativity training is essential, incorporating tasks that target different aspects of creativity, such as fluency and divergent thinking. A well-rounded program might include tasks that encourage quick brainstorming alongside more reflective, elaborative tasks like story writing. This approach could help participants develop a broader range of creative skills and ensure that improvements in one area of creativity do not come at the expense of others.
Another important consideration is the role of intrinsic motivation in sustaining engagement. Creativity training programs should be designed to be intrinsically rewarding, offering tasks that are not only cognitively engaging but also enjoyable. This could help mitigate the effects of mental fatigue and task repetition [141], which were significant challenges in this study. To enhance intrinsic motivation, programs could introduce competitive elements, encouraging participants to strive for more creative or original solutions to tasks. Additionally, incorporating collaborative elements, similar to those in existing human-led creativity training programs, could be beneficial—where an AI, rather than a human mentor, mediates and leads the program.
The potential of AI creativity training programs to democratize access to creativity training by providing scalable, accessible, and consistent guidance is significant. Yet, some caution is needed when developing AI-based interventions for psychological or educational settings, especially given the largely unknown AI-specific biases embedded within LLMs [142] and the overly homogenized responses such models generate [72]. As such AI-based tools are increasingly being developed, it is critical to ensure that they truly enhance human creativity instead of hindering it. The limitations observed in this study highlight the need for continued research and development to ensure that these programs can effectively replace or complement traditional, human-guided creativity training methods.

4.5. Limitations and Future Research

While this study provides insights into the potential of AI-generated feedback and adaptive difficulty levels in creativity training, it is important to acknowledge its limitations. One primary limitation is the reliance on a single creative task—the SWT [96]—to train creativity. While this method encouraged structured creative expression, it may have constrained spontaneous idea generation by emphasizing coherence and narrative structure over divergent thinking. Future studies should explore a broader range of tasks to assess different dimensions of creativity, such as fluency, flexibility, and originality, and expand investigations into other creative domains like visual arts or music. Such future studies are needed to replicate and generalize the feasibility of creativity AI training programs such as Creativeable.
Data collection in separate batches raises potential concerns about temporal differences in participant recruitment, particularly regarding demographics that might influence the results. To address this, key demographic variables (age, gender, and education level) were analyzed across conditions. The results confirmed that gender and education levels were consistent across conditions, with no significant differences observed (χ2 test and ANOVA, respectively). Although a marginally significant difference in age was observed (p = 0.05), the associated effect size was negligible (η2 = 0.02), suggesting this is unlikely to meaningfully impact the study’s findings.
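For transparency, a minimal sketch of such balance checks is shown below using SciPy. The gender counts follow Table 1, but the age vectors are simulated stand-ins for the raw per-participant data (available in the OSF repository), so the resulting statistics are illustrative rather than the reported values.

```python
# Hedged sketch of the demographic balance checks described above:
# chi-square for gender by condition, one-way ANOVA (with eta-squared) for age.
import numpy as np
from scipy import stats

# Gender counts (M, F) per condition, following Table 1
gender_table = np.array([[41, 55],   # F/VL
                         [57, 41],   # F/CL
                         [42, 51],   # NF/VL
                         [39, 57]])  # NF/CL
chi2, p_gender, dof, _ = stats.chi2_contingency(gender_table)

# Simulated stand-ins for per-participant ages (mean, SD, n per condition)
rng = np.random.default_rng(0)
ages = [rng.normal(m, s, n) for m, s, n in
        [(31.7, 8.2, 97), (28.8, 8.43, 98), (29.12, 8.0, 93), (29.0, 8.06, 97)]]
f_stat, p_age = stats.f_oneway(*ages)

# Eta-squared = SS_between / SS_total
all_ages = np.concatenate(ages)
grand_mean = all_ages.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in ages)
ss_total = ((all_ages - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total

print(f"Gender: chi2 = {chi2:.2f}, p = {p_gender:.3f}")
print(f"Age: F = {f_stat:.2f}, p = {p_age:.3f}, eta^2 = {eta_squared:.3f}")
```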
Another limitation lies in the controlled, artificial environment in which the tasks were conducted. Real-world creative scenarios involve more complex constraints, influences, and motivations [103], which the study design may not fully capture [143]. Thus, the ecological validity of the findings could be limited, as participants’ behaviors in these structured tasks may not directly translate to real-life creativity challenges. Future research should explore whether AI-generated feedback and adaptive difficulty settings are similarly effective in more authentic, real-life creative contexts.
Additionally, the study’s focus on short-term creativity improvement leaves open the question of long-term effects. While significant improvements in creativity were observed after the training sessions, it is unclear whether these gains are sustained over time. Longitudinal studies would be essential to assess whether the impact of AI-driven creativity training persists and whether participants continue to apply the learned techniques in their everyday creative activities. Thus, longitudinal studies with Creativeable or similar AI creativity training programs are needed to examine the possible long-lasting effects of such training.
A potential limitation of using GPT to determine task difficulty adjustments is the lack of transparency in its decision-making process compared to a predefined, rule-based algorithm. While GPT’s approach allows for context-sensitive adjustments, it could introduce variability that may not align with the expectations of a consistent, predefined policy. Future research should compare these two approaches to determine the advantages and trade-offs between explicit algorithmic control and GPT’s dynamic, context-driven adaptability. For example, a predefined algorithm could use scores derived from performance metrics to adjust difficulty levels incrementally, providing greater transparency and reproducibility in task adjustments, as sketched below.
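As a concrete illustration, the sketch below implements such a rule-based policy: the level moves one step up on strong performance and one step down on weak performance or reported fatigue. The thresholds and update rule are assumptions chosen for illustration, not Creativeable's actual adjustment logic.

```python
# Illustrative rule-based alternative to GPT-driven difficulty adjustment.
# Thresholds and update rule are assumptions, not Creativeable's policy.
LEVELS = ["E", "ME", "MH", "H"]  # easy .. hard, as in Figure 2

def adjust_level(current: str, creativity_score: float, fatigued: bool,
                 up_threshold: float = 4.0, down_threshold: float = 2.5) -> str:
    """Move one level up on a strong score, one level down on a weak score
    or reported fatigue; otherwise keep the current level."""
    idx = LEVELS.index(current)
    if fatigued or creativity_score < down_threshold:
        idx = max(idx - 1, 0)
    elif creativity_score >= up_threshold:
        idx = min(idx + 1, len(LEVELS) - 1)
    return LEVELS[idx]

# Example: a non-fatigued participant scoring 4.3 at medium-easy moves to medium-hard.
assert adjust_level("ME", 4.3, fatigued=False) == "MH"
```

Such a policy is fully transparent and reproducible, at the cost of the context sensitivity that a GPT-based adjustment can offer.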
A related limitation is our use of the GPT-4 model for adaptive difficulty, feedback, and originality scoring. Reliance on a single AI model for all of these stages may lead to circularity as well as homogenization of evaluation [72]. To address potential variability in GPT’s scores, each participant-generated story in each of Creativeable’s training rounds was evaluated five times, and the final creativity score was the average of these five ratings. Furthermore, our analysis of GPT-evaluated versus subjectively evaluated originality of stories (Figure 4) found a strong correspondence between the two types of evaluation, in accordance with previous research [119,137]. Nevertheless, future studies should replicate our approach with multiple LLMs involved in each of the stages, to increase LLM variability and minimize potential homogenization effects.
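The repeated-rating procedure can be summarized in a few lines. In the sketch below, rate_story_once is a hypothetical stand-in for a single GPT-4 scoring call (e.g., with the prompt in Supplementary Text S3); averaging over five calls yields the final originality score, and the spread can flag unstable ratings.

```python
# Sketch of stabilizing LLM originality scores by repeated rating and averaging.
# `rate_story_once` is a hypothetical stand-in for one GPT-4 scoring call.
from statistics import mean, stdev
from typing import Callable

def rate_story(story: str, rate_story_once: Callable[[str], float],
               n_ratings: int = 5) -> dict:
    """Query the rater n_ratings times; return the mean score plus its spread."""
    scores = [rate_story_once(story) for _ in range(n_ratings)]
    return {"mean": mean(scores),
            "sd": stdev(scores) if len(scores) > 1 else 0.0,
            "scores": scores}
```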
The study also highlights the challenge of ensuring that GPT-based feedback remains meaningful and effective. While such feedback was valuable in many cases, it may lack the nuanced understanding of human creativity that a human evaluator might provide, particularly in assessing more subtle aspects of creativity, such as emotional depth or imaginative expression. Additionally, participants may experience “feedback fatigue,” where too much feedback or repeated suggestions lead to diminishing returns in motivation and creativity. Future research should explore ways to optimize the frequency and specificity of feedback to avoid overloading participants while maintaining its effectiveness.
Finally, the improvements in creativity observed in this study, though statistically significant, were modest and varied across participants. This variability underscores the importance of considering individual differences in response to AI-driven creativity training and highlights the need for more sophisticated AI algorithms that can offer even greater personalization and adaptivity to individual needs.

5. Conclusions

This study explored the potential of an AI-powered creativity training program, Creativeable, to enhance creativity without human intervention. Through personalized feedback and adaptive difficulty, our training tool demonstrated the capability to moderately improve participants’ creativity across diverse conditions. However, the varying responses to AI-generated feedback suggest that while AI has the potential to democratize creativity training, there remains room for refining its approach to better meet individual needs. The findings highlight the delicate balance between feedback and creative freedom, suggesting that AI-driven guidance must remain flexible and contextually relevant to avoid suppressing originality. Future research should build on these insights, exploring more dynamic, multi-faceted approaches to creativity training that target a broader range of creative skills and offer lasting impact.
The promise of AI to scale and personalize creativity training is immense, but achieving its full potential requires continued refinement. With more adaptive, nuanced feedback that accounts for cognitive load, and with expanded task designs, AI can truly revolutionize how we nurture creativity across diverse fields and individuals, offering a glimpse of a future where creativity is both more accessible and more deeply cultivated. Creativeable is one step forward on this frontier, highlighting the promise and opportunity of AI training programs, as well as the complexity that must still be addressed on the path towards democratizing creativity for all.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ai6100247/s1, Text S1: The prompt that was used to generate SWT word triplets; Text S2: Prompt used for feedback and task difficulty level adjustment; Text S3: Prompt used to rate the creativity scores of the stories; Text S4: The reflection questions participants were asked regarding their stories after writing them at each round; Table S1: Word triplets used in the SWT with semantic distance and difficulty ratings from human raters; Table S2: Examples of creativity improvement suggestions generated by Creativeable in the feedback conditions.

Author Contributions

Conceptualization, A.K.-N. and Y.N.K.; methodology, A.K.-N. and Y.N.K.; software, A.K.-N.; formal analysis, A.K.-N.; investigation, A.K.-N.; data curation, A.K.-N.; writing—original draft preparation, A.K.-N.; writing—review and editing, Y.N.K.; supervision, Y.N.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Technion—Israel Institute of Technology (Approval number 2024-013, approved on 28 February 2024).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data and code of this study can be found at https://osf.io/5kxv8/ (accessed on 15 September 2025).

Acknowledgments

We thank Talia Wise and Amit Gerstein for their help in revising this paper. During the preparation of this study, the authors used GPT-4 for the purposes of developing the tool described in the study, analyzing responses, and providing real-time feedback. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AUT     Alternative Uses Task
CST     Creativity Support Tool
DT      Divergent Thinking
F/CL    Feedback/Constant Level condition
F/VL    Feedback/Varying Level condition
LLM     Large Language Model
NF/CL   No Feedback/Constant Level condition
NF/VL   No Feedback/Varying Level condition
RAT     Remote Associates Task
SFT     Semantic Fluency Task
SWT     Story Writing Task

References

  1. Runco, M.A.; Jaeger, G.J. The standard definition of creativity. Creat. Res. J. 2012, 24, 92–96. [Google Scholar] [CrossRef]
  2. Green, A.E.; Beaty, R.E.; Kenett, Y.N.; Kaufman, J.C. The process definition of creativity. Creat. Res. J. 2024, 36, 544–572. [Google Scholar] [CrossRef]
  3. Beaty, R.E.; Kenett, Y.N. Associative thinking at the core of creativity. Trends Cogn. Sci. 2023, 27, 671–683. [Google Scholar] [CrossRef] [PubMed]
  4. Chiu, F.-C. Improving your creative potential without awareness: Overinclusive thinking training. Think. Ski. Creat. 2015, 15, 1–12. [Google Scholar] [CrossRef]
  5. Dacey, J.S.; Lennon, K.H. Understanding Creativity: The Interplay of Biological, Psychological, and Social Factors; Jossey-Bass: Hoboken, NJ, USA, 1998. [Google Scholar]
  6. Benedek, M.; Beaty, R.E.; Schacter, D.L.; Kenett, Y.N. The role of memory in creative ideation. Nat. Rev. Psychol. 2023, 2, 246–257. [Google Scholar] [CrossRef]
  7. Basadur, M.; Runco, M.A.; Vegaxy, L.A. Understanding how creative thinking skills, attitudes and behaviors work together: A causal process model. J. Creat. Behav. 2000, 34, 77–100. [Google Scholar] [CrossRef]
  8. Cropley, D.H. The role of creativity as a driver of innovation. In Proceedings of the 2006 IEEE International Conference on Management of Innovation and Technology, Singapore, 21–23 June 2006. [Google Scholar]
  9. Henriksen, D.; Henderson, M.; Creely, E.; Ceretkova, S.; Černochová, M.; Sendova, E.; Sointu, E.T.; Tienken, C.H. Creativity and technology in education: An international perspective. Technol. Knowl. Learn. 2018, 23, 409–424. [Google Scholar] [CrossRef]
  10. Scott, G.; Leritz, L.E.; Mumford, M.D. The effectiveness of creativity training: A quantitative review. Creat. Res. J. 2004, 16, 361–388. [Google Scholar] [CrossRef]
  11. Haase, J.; Hanel, P.H.P.; Gronau, N. Creativity enhancement methods for adults: A meta-analysis. Psychol. Aesthet. Creat. Arts 2025, 19, 708–736. [Google Scholar] [CrossRef]
  12. Ding, K.; Kenett, Y.N. Creativity enhancement: A primer. In The Oxford Handbook of Cognitive Enhancement and Brain Plasticity; Barbey, A.K., Ed.; Oxford University Press: Oxford, UK, 2024. [Google Scholar]
  13. Ma, H.-H. A synthetic analysis of the effectiveness of single components and packages in creativity training programs. Creat. Res. J. 2006, 18, 435–446. [Google Scholar] [CrossRef]
  14. Valgeirsdottir, D.; Onarheim, B. Studying creativity training programs: A methodological analysis. Creat. Innov. Manag. 2017, 26, 430–439. [Google Scholar] [CrossRef]
  15. Müller, B.C.N.; Gerasimova, A.; Ritter, S.M. Concentrative meditation influences creativity by increasing cognitive flexibility. Psychol. Aesthet. Creat. Arts 2016, 10, 278. [Google Scholar] [CrossRef]
  16. Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S. Gpt-4 technical report. arXiv 2023. [Google Scholar] [CrossRef]
  17. Gai, Y.; Zhou, L.; Qin, K.; Song, D.; Gervais, A. Blockchain large language models. arXiv 2023. [Google Scholar] [CrossRef] [PubMed]
  18. Goslen, A.; Kim, Y.J.; Rowe, J.; Lester, J. Llm-based student plan generation for adaptive scaffolding in game-based learning environments. Int. J. Artif. Intell. Educ. 2025, 35, 533–558. [Google Scholar] [CrossRef]
  19. Hou, X.; Wu, Z.; Wang, X.; Ericson, B.J. Codetailor: Llm-powered personalized parsons puzzles for engaging support while learning programming. In Proceedings of the Eleventh ACM Conference on Learning at Scale, Atlanta, GA, USA, 18–20 July 2024. [Google Scholar]
  20. Xiao, C.; Xu, S.X.; Zhang, K.; Wang, Y.; Xia, L. Evaluating reading comprehension exercises generated by llms: A showcase of chatgpt in education applications. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), Toronto, ON, Canada, 13 July 2023. [Google Scholar]
  21. Pandya, K.; Holia, M. Automating customer service using langchain: Building custom open-source gpt chatbot for organizations. arXiv 2023. [Google Scholar] [CrossRef]
  22. Radensky, M.; Weld, D.S.; Chee Chang, J.; Siangliulue, P.; Bragg, J. Let’s get to the point: Llm-supported planning, drafting, and revising of research-paper blog posts. arXiv 2024. [Google Scholar] [CrossRef]
  23. Yu, C.; Zang, L.; Wang, J.; Zhuang, C.; Gu, J. Charpoet: A chinese classical poetry generation system based on token-free llm. arXiv 2024. [Google Scholar] [CrossRef]
  24. Wu, Z.; Weber, T.; Müller, F. One does not simply meme alone: Evaluating co-creativity between llms and humans in the generation of humor. In Proceedings of the 30th International Conference on Intelligent User Interfaces, Cagliari, Italy, 24–27 March 2025. [Google Scholar]
  25. Gómez-Rodríguez, C.; Williams, P. A confederacy of models: A comprehensive evaluation of llms on creative writing. arXiv 2023. [Google Scholar] [CrossRef]
  26. Qin, H.X.; Jin, S.; Gao, Z.; Fan, M.; Hui, P. Charactermeet: Supporting creative writers’ entire story character construction processes through conversation with llm-powered chatbot avatars. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11–16 May 2024. [Google Scholar]
  27. Wang, T.; Chen, J.; Jia, Q.; Wang, S.; Fang, R.; Wang, H.; Gao, Z.; Xie, C.; Xu, C.; Dai, J. Weaver: Foundation models for creative writing. arXiv 2024. [Google Scholar] [CrossRef]
  28. Ding, S.; Liu, Z.; Dong, X.; Zhang, P.; Qian, R.; He, C.; Lin, D.; Wang, J. Songcomposer: A large language model for lyric and melody composition in song generation. arXiv 2024. [Google Scholar] [CrossRef]
  29. Yuan, R.; Lin, H.; Wang, Y.; Tian, Z.; Wu, S.; Shen, T.; Zhang, G.; Wu, Y.; Liu, C.; Zhou, Z. Chatmusician: Understanding and generating music intrinsically with llm. arXiv 2024. [Google Scholar] [CrossRef]
  30. Poldrack, R.A.; Lu, T.; Beguš, G. Ai-assisted coding: Experiments with gpt-4. arXiv 2023. [Google Scholar] [CrossRef]
  31. Ismayilzada, M.; Paul, D.; Bosselut, A.; Plas, L.v.d. Creativity in ai: Progresses and challenges. arXiv 2024. [Google Scholar] [CrossRef]
  32. de Chantal, P.-L.; Beaty, R.E.; Laverghetta, A.; Pronchick, J.; Patterson, J.D.; Organisciak, P.; Potega vel Zabik, K.; Barbot, B.; Karwowski, M. Artificial intelligence enhances human creativity through real-time evaluative feedback. PsyarXiv 2025. [Google Scholar] [CrossRef]
  33. de Chantal, P.-L.; Houde-Labrecque, C.; Leblanc, M.-C.; Organisciak, P. Investigating lasting effects of real-time feedback on originality and evaluation accuracy. Creat. Res. J. 2024, 1–18. [Google Scholar] [CrossRef]
  34. de Chantal, P.-L.; Organisciak, P. Automated feedback and creativity: On the role of metacognitive monitoring in divergent thinking. Psychol. Aesthet. Creat. Arts 2023. [Google Scholar] [CrossRef]
  35. Chung, N.C. Human in the loop for machine creativity. arXiv 2021. [Google Scholar] [CrossRef]
  36. Acar, S.; Runco, M.A. Divergent thinking: New methods, recent research, and extended theory. Psychol. Aesthet. Creat. Arts 2019, 13, 153–158. [Google Scholar] [CrossRef]
  37. Runco, M.A.; Acar, S. Divergent thinking as an indicator of creative potential. Creat. Res. J. 2012, 24, 66–75. [Google Scholar] [CrossRef]
  38. Karkockiene, D. Creativity: Can It Be Trained? A Scientific Educology of Creativity. Online Submission. 2005, pp. 51–58. Available online: https://www.researchgate.net/publication/280304250_Creativity_Can_it_be_Trained_A_Scientific_Educology_of_Creativity (accessed on 15 September 2025).
  39. Perry, A.; Karpova, E. Efficacy of teaching creative thinking skills: A comparison of multiple creativity assessments. Think. Ski. Creat. 2017, 24, 118–126. [Google Scholar] [CrossRef]
  40. Samašonok, K.; Leškienė-Hussey, B. Creativity development: Theoretical and practical aspects. J. Creat. Bus. Innov. 2015, 1, 19–34. [Google Scholar]
  41. Vally, Z.; Salloum, L.; AlQedra, D.; El Shazly, S.; Albloshi, M.; Alsheraifi, S.; Alkaabi, A. Examining the effects of creativity training on creative production, creative self-efficacy, and neuro-executive functioning. Think. Ski. Creat. 2019, 31, 70–78. [Google Scholar] [CrossRef]
  42. West, R.E.; Tateishi, I.; Wright, G.A.; Fonoimoana, M. Innovation 101: Promoting undergraduate innovation through a two-day boot camp. Creat. Res. J. 2012, 24, 243–251. [Google Scholar] [CrossRef]
  43. Hargrove, R.A.; Nietfeld, J.L. The impact of metacognitive instruction on creative problem solving. J. Exp. Educ. 2015, 83, 291–318. [Google Scholar] [CrossRef]
  44. Haase, J.; Hoff, E.V.; Hanel, P.H.; Innes-Ker, Å. A meta-analysis of the relation between creative self-efficacy and different creativity measurements. Creat. Res. J. 2018, 30, 1–16. [Google Scholar] [CrossRef]
  45. Tierney, P.; Farmer, S.M. Creative self-efficacy: Its potential antecedents and relationship to creative performance. Acad. Manag. 2002, 45, 1137–1148. [Google Scholar] [CrossRef]
  46. Mathisen, G.E.; Bronnick, K.S. Creative self-efficacy: An intervention study. Int. J. Educ. Res. 2009, 48, 21–29. [Google Scholar] [CrossRef]
  47. Baruah, J.; Paulus, P.B. Effects of training on idea generation in groups. Small Group Res. 2008, 39, 523–541. [Google Scholar] [CrossRef]
  48. Bonnardel, N.; Didier, J. Enhancing creativity in the educational design context: An exploration of the effects of design project-oriented methods on students’ evocation processes and creative output. J. Cogn. Educ. Psychol. 2016, 15, 80–101. [Google Scholar] [CrossRef]
  49. Ulger, K. The creative training in the visual arts education. Think. Ski. Creat. 2016, 19, 73–87. [Google Scholar] [CrossRef]
  50. Dyson, S.B.; Chang, Y.-L.; Chen, H.-C.; Hsiung, H.-Y.; Tseng, C.-C.; Chang, J.-H. The effect of tabletop role-playing games on the creative potential and emotional creativity of taiwanese college students. Think. Ski. Creat. 2016, 19, 88–96. [Google Scholar] [CrossRef]
  51. Karwowski, M.; Soszynski, M. How to develop creative imagination?: Assumptions, aims and effectiveness of role play training in creativity. Think. Ski. Creat. 2008, 3, 163–171. [Google Scholar] [CrossRef]
  52. Mansfield, R.S.; Busse, T.V.; Krepelka, E.J. The effectiveness of creativity training. Rev. Educ. Res. 1978, 48, 517–536. [Google Scholar] [CrossRef]
  53. Benedek, M.; Fink, A.; Neubauer, A.C. Enhancement of ideational fluency by means of computer-based training. Creat. Res. J. 2006, 18, 317–328. [Google Scholar] [CrossRef]
  54. Fink, A.; Benedek, M.; Koschutnig, K.; Pirker, E.; Berger, E.; Meister, S.; Neubauer, A.C.; Papousek, I.; Weiss, E.M. Training of verbal creativity modulates brain activity in regions associated with language- and memory-related demands. Hum. Brain Mapp. 2015, 36, 4104–4115. [Google Scholar] [CrossRef]
  55. Sun, M.; Wang, M.; Wegerif, R. Using computer-based cognitive mapping to improve students’ divergent thinking for creativity development. Br. J. Educ. Technol. 2019, 50, 2217–2233. [Google Scholar] [CrossRef]
  56. Huang, T.-C. Do different learning styles make a difference when it comes to creativity? An empirical study. Comput. Hum. Behav. 2019, 100, 252–257. [Google Scholar] [CrossRef]
  57. Mednick, S.A. The associative basis of the creative process. Psychol. Rev. 1962, 69, 220–232. [Google Scholar] [CrossRef]
  58. Phượng, V.B. An educational computerized game to train creativity: First development and evidence of its creativity correlates. In New Issues in Educational Sciences: Inter-Disciplinary and Cross-Disciplinary Approaches; Zun.vn: Ha Noi, Vietnam, 2018; Available online: https://www.zun.vn/tai-lieu/an-educational-computerized-game-to-train-creativity-first-development-and-evidence-of-its-creativity-correlates-59604/ (accessed on 15 September 2025).
  59. Kim, S.; Chung, K.; Yu, H. Enhancing digital fluency through a training program for creative problem solving using computer programming. J. Creat. Behav. 2013, 47, 171–199. [Google Scholar] [CrossRef]
  60. Grattafiori, A.; Dubey, A.; Jauhri, A.; Pandey, A.; Kadian, A.; Al-Dahle, A.; Letman, A.; Mathur, A.; Schelten, A.; Vaughan, A.; et al. The llama 3 herd of models. arXiv 2024. [Google Scholar] [CrossRef]
  61. Dasgupta, I.; Lampinen, A.K.; Chan, S.C.Y.; Creswell, A.; Kumaran, D.; McClelland, J.L.; Hill, F. Language models show human-like content effects on reasoning. arXiv 2022. [Google Scholar] [CrossRef]
  62. Orrù, G.; Piarulli, A.; Conversano, C.; Gemignani, A. Human-like problem-solving abilities in large language models using chatgpt [Original Research]. Front. Artif. Intell. 2023, 6, 1199350. [Google Scholar] [CrossRef] [PubMed]
  63. Shahriar, S.; Lund, B.D.; Mannuru, N.R.; Arshad, M.A.; Hayawi, K.; Bevara, R.V.K.; Mannuru, A.; Batool, L. Putting GPT-4o to the sword: A comprehensive evaluation of language, vision, speech, and multimodal proficiency. Appl. Sci. 2024, 14, 7782. [Google Scholar] [CrossRef]
  64. Franceschelli, G.; Musolesi, M. On the creativity of large language models. AI Soc. 2025, 40, 3785–3795. [Google Scholar] [CrossRef]
  65. Stevenson, C.E.; Smal, I.; Baas, M.; Grasman, R.; Maas, H.L.J.v.d. Putting gpt-3’s creativity to the (alternative uses) test. in International Conference on Innovative Computing and Cloud Computing. arXiv 2022. [Google Scholar] [CrossRef]
  66. Wang, X.; Hu, Z.; Lu, P.; Zhu, Y.; Zhang, J.; Subramaniam, S.; Loomba, A.R.; Zhang, S.; Sun, Y.; Wang, W. Scibench: Evaluating college-level scientific problem-solving abilities of large language models. arXiv 2023. [Google Scholar] [CrossRef]
  67. Yao, S.; Yu, D.; Zhao, J.; Shafran, I.; Griffiths, T.L.; Cao, Y.; Narasimhan, K. Tree of thoughts: Deliberate problem solving with large language models. arXiv 2023. [Google Scholar] [CrossRef]
  68. Evanson, L.; Lakretz, Y.; King, J.-R. Language acquisition: Do children and language models follow similar learning stages? In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Toronto, ON, Canada, 9–14 July 2023. [Google Scholar]
  69. Doshi, A.R.; Hauser, O.P. Generative ai enhances individual creativity but reduces the collective diversity of novel content. Sci. Adv. 2024, 10, eadn5290. [Google Scholar] [CrossRef]
  70. Chandrasekera, T.; Hosseini, Z.; Perera, U. Can artificial intelligence support creativity in early design processes? Int. J. Archit. Comput. 2025, 23, 122–136. [Google Scholar] [CrossRef]
  71. Dell’Acqua, F.; McFowland, E.; Mollick, E.R.; Lifshitz-Assaf, H.; Kellogg, K.C.; Rajendran, S.; Krayer, L.; Candelon, F.; Lakhani, K.R. Navigating the jagged technological frontier: Field experimental evidence of the effects of ai on knowledge worker productivity and quality. SSRN Electron. J. 2023. [Google Scholar] [CrossRef]
  72. Wenger, E.; Kenett, Y.N. We’re different, we’re the same: Creative homogeneity across llms. arXiv 2025. [Google Scholar] [CrossRef]
  73. Moon, K.; Green, A.; Kushlev, K. Homogenizing effect of large language model (llm) on creative diversity: An empirical comparison of human and chatgpt writing. PsyarXiv 2024. [Google Scholar] [CrossRef]
  74. Moon, K.; Kushlev, K.; Bank, A.; Green, A. Impersonal statements: Llm-era college admissions essays exhibit deep homogenization despite lexical diversity. PsyarXiv 2025. [Google Scholar] [CrossRef]
  75. Rafner, J.; Beaty, R.E.; Kaufman, J.C.; Lubart, T.; Sherson, J. Creativity in the age of generative ai. Nat. Hum. Behav. 2023, 7, 1836–1838. [Google Scholar] [CrossRef]
  76. Vinchon, F.; Lubart, T.; Bartolotta, S.; Gironnay, V.; Botella, M.; Bourgeois-Bougrine, S.; Burkhardt, J.-M.; Bonnardel, N.; Corazza, G.E.; Glăveanu, V.; et al. Artificial intelligence & creativity: A manifesto for collaboration. J. Creat. Behav. 2023, 57, 472–484. [Google Scholar] [CrossRef]
  77. Cropley, D. Is artificial intelligence more creative than humans?: Chatgpt and the divergent association task. Learn. Lett. 2023, 2, 13. [Google Scholar] [CrossRef]
  78. Candy, L. Evaluating creativity. In Creativity and Rationale: Enhancing Human Experience by Design; Carroll, J.M., Ed.; Springer: London, UK, 2013; pp. 57–84. [Google Scholar] [CrossRef]
  79. Frich, J.; MacDonald Vermeulen, L.; Remy, C.; Mose Biskjaer, M.; Dalsgaard, P. Mapping the landscape of creativity support tools in hci. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019. [Google Scholar]
  80. Wang, H.-C.; Cosley, D.; Fussell, S.R. Idea expander: Supporting group brainstorming with conversationally triggered visual thinking stimuli. In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, Savannah, GA, USA, 6–10 February 2010; pp. 103–106. [Google Scholar]
  81. Zhao, Z.; Badam, S.K.; Chandrasegaran, S.; Park, D.G.; Elmqvist, N.L.E.; Kisselburgh, L.; Ramani, K. Skwiki: A multimedia sketching system for collaborative creativity. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Toronto, ON, Canada, 26 April–1 May 2014; pp. 1235–1244. [Google Scholar]
  82. Ngoon, T.J.; Fraser, C.A.; Weingarten, A.S.; Dontcheva, M.; Klemmer, S. Interactive guidance techniques for improving creative feedback. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018. [Google Scholar]
  83. Wise, T.A.; Kenett, Y.N. Sparking creativity: Encouraging creative idea generation through automatically generated word recommendations. Behav. Res. Methods 2024, 56, 7939–7962. [Google Scholar] [CrossRef]
  84. Rafner, J.; Biskjær, M.M.; Zana, B.; Langsford, S.; Bergenholtz, C.; Rahimi, S.; Carugati, A.; Noy, L.; Sherson, J. Digital games for creativity assessment: Strengths, weaknesses and opportunities. Creat. Res. J. 2022, 34, 28–54. [Google Scholar] [CrossRef]
  85. Rafner, J.; Wang, Q.J.; Gadjacz, M.; Badts, T.; Baker, B.; Bergenholtz, C.; Biskjaer, M.M.; Bui, T.; Carugati, A.; de Cibeins, M.; et al. Towards game-based assessment of creative thinking. Creat. Res. J. 2023, 35, 763–782. [Google Scholar] [CrossRef]
  86. Swanson, B.; Mathewson, K.W.; Pietrzak, B.; Chen, S.; Dinalescu, M. Story centaur: Large language model few shot learning as a creative writing tool. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics, Online, 19–23 April 2021. [Google Scholar]
  87. Xu, X.; Yin, J.; Gu, C.; Mar, J.; Zhang, S.; E, J.L.; Dow, S.P. Jamplate: Exploring llm-enhanced templates for idea reflection. In Proceedings of the 29th International Conference on Intelligent User Interfaces, Greenville, SC, USA, 18–21 March 2024. [Google Scholar]
  88. Chakrabarty, T.; Padmakumar, V.; Brahman, F.; Muresan, S. Creativity support in the age of large language models: An empirical study involving professional writers. In Proceedings of the 16th Conference on Creativity & Cognition, Chicago, IL, USA, 23–26 June 2024. [Google Scholar]
  89. Hwang, A.H.-C.; Won, A.S. Ideabot: Investigating social facilitation in human-machine team creativity. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, Online, 8–13 May 2021. [Google Scholar]
  90. Kenett, Y.N. The role of knowledge in creative thinking. Creat. Res. J. 2025, 37, 242–249. [Google Scholar] [CrossRef]
  91. Hills, T.T.; Kenett, Y.N. An entropy modulation theory of creative exploration. Psychol. Rev. 2025, 132, 239–251. [Google Scholar] [CrossRef]
  92. Bowden, E.M.; Jung-Beeman, M. Aha! Insight experience correlates with solution activation in the right hemisphere. Psychon. Bull. Rev. 2003, 10, 730–737. [Google Scholar] [CrossRef]
  93. Worthen, B.R.; Clark, P.M. Toward an improved measure of remote associational ability. J. Educ. Meas. 1971, 8, 113–123. [Google Scholar] [CrossRef]
  94. Kajić, I.; Gosmann, J.; Stewart, T.C.; Wennekers, T.; Eliasmith, C. A spiking neuron model of word associations for the remote associates test [Original Research]. Front. Psychol. 2017, 8, 99. [Google Scholar] [CrossRef] [PubMed]
  95. Smith, K.A.; Huber, D.E.; Vul, E. Multiply-constrained semantic search in the remote associates test. Cognition 2013, 128, 64–75. [Google Scholar] [CrossRef] [PubMed]
  96. Howard-Jones, P.A.; Blakemore, S.-J.; Samuel, E.A.; Summers, I.R.; Claxton, G. Semantic divergence and creative story generation: An fmri investigation. Cogn. Brain Res. 2005, 25, 240–250. [Google Scholar] [CrossRef]
  97. Luchini, S.; Moosa, M.; Patterson, J.D.; Johnson, D.; Baas, M.; Barbot, B.; Bashmakova, I.; Benedek, M.; Chen, Q.; Corazza, G.; et al. Automated assessment of creativity in multilingual narratives. Psychol. Aesthet. Creat. Arts 2025. [Google Scholar] [CrossRef]
  98. Prabhakaran, R.; Green, A.E.; Gray, J.R. Thin slices of creativity: Using single-word utterances to assess creative cognition. Behav. Res. Methods 2014, 46, 641–659. [Google Scholar] [CrossRef]
  99. Wenzel, W.G.; Gerrig, R.J. Convergent and divergent thinking in the context of narrative mysteries. Discourse Process. 2015, 52, 489–516. [Google Scholar] [CrossRef]
  100. Benedek, M.; Jurisch, J.; Koschutnig, K.; Fink, A.; Beaty, R.E. Elements of creative thought: Investigating the cognitive and neural correlates of association and bi-association processes. NeuroImage 2020, 210, 116586. [Google Scholar] [CrossRef] [PubMed]
  101. Volle, E. Associative and controlled cognition in divergent thinking: Theoretical, experimental, neuroimaging evidence, and new directions. In The Cambridge Handbook of the Neuroscience of Creativity; Jung, R.E., Vartanian, O., Eds.; Cambridge University Press: Cambridge, UK, 2018; pp. 333–362. [Google Scholar]
  102. Becker, M.; Cabeza, R. The neural basis of the insight memory advantage. Trends Cogn. Sci. 2025, 29, 255–268. [Google Scholar] [CrossRef] [PubMed]
  103. Yang, W.; Green, A.E.; Chen, Q.; Kenett, Y.N.; Sun, J.; Wei, D.; Qiu, J. Creative problem solving in knowledge-rich contexts. Trends Cogn. Sci. 2022, 26, 849–859. [Google Scholar] [CrossRef] [PubMed]
  104. Vygotskiĭ, L.S. Mind in Society: The Development of Higher Psychological Processes; Harvard University Press: Cambridge, MA, USA, 1978. [Google Scholar]
  105. Kidd, C.; Piantadosi, S.T.; Aslin, R.N. The goldilocks effect: Human infants allocate attention to visual sequences that are neither too simple nor too complex. PLoS ONE 2012, 7, e36399. [Google Scholar] [CrossRef]
  106. Fong, C.J.; Patall, E.A.; Vasquez, A.C.; Stautberg, S. A meta-analysis of negative feedback on intrinsic motivation. Educ. Psychol. Rev. 2019, 31, 121–162. [Google Scholar] [CrossRef]
  107. Hattie, J.A.C.; Timperley, H.S. The power of feedback. Rev. Educ. Res. 2007, 77, 112–181. [Google Scholar] [CrossRef]
  108. Ericsson, K.A.; Krampe, R.T.; Tesch-Römer, C. The role of deliberate practice in the acquisition of expert performance. Psychol. Rev. 1993, 100, 363. [Google Scholar] [CrossRef]
  109. Chen, D.L.; Schonger, M.; Wickens, C. Otree—An open-source platform for laboratory, online, and field experiments. J. Behav. Exp. Financ. 2016, 9, 88–97. [Google Scholar] [CrossRef]
  110. Vasiliev, Y. Natural Language Processing with Python and Spacy: A Practical Introduction; No Starch Press: San Francisco, CA, USA, 2020. [Google Scholar]
  111. Shen, S.; Wang, S.; Qi, Y.; Wang, Y.; Yan, X. Teacher suggestion feedback facilitates creativity of students in steam education. Front. Psychol. 2021, 12, 723171. [Google Scholar] [CrossRef]
  112. Gilardi, F.; Alizadeh, M.; Kubli, M. Chatgpt outperforms crowd workers for text-annotation tasks. Proc. Natl. Acad. Sci. USA 2023, 120, e2305016120. [Google Scholar] [CrossRef]
  113. Gray, M.A.; Šavelka, J.; Oliver, W.M.; Ashley, K.D. Can gpt alleviate the burden of annotation? In Proceedings of the International Conference on Legal Knowledge and Information Systems, Maastricht, The Netherlands, 18–20 December 2023. [Google Scholar]
  114. Hackl, V.; Müller, A.E.; Granitzer, M.; Sailer, M. Is gpt-4 a reliable rater? Evaluating consistency in gpt-4’s text ratings [Original Research]. Front. Educ. 2023, 8, 1272229. [Google Scholar] [CrossRef]
  115. Kim, S.; Jo, M. Is gpt-4 alone sufficient for automated essay scoring?: A comparative judgment approach based on rater cognition. In Proceedings of the Eleventh ACM Conference on Learning at Scale, Atlanta, GA, USA, 18–20 July 2024. [Google Scholar]
  116. Lundgren, M. Large language models in student assessment: Comparing chatgpt and human graders. arXiv 2024. [Google Scholar] [CrossRef]
  117. Yadav, S.; Choppa, T.; Schlechtweg, D. Towards automating text annotation: A case study on semantic proximity annotation using gpt-4. arXiv 2024. [Google Scholar] [CrossRef]
  118. Guilford, J.P. The Nature of Human Intelligence; McGraw-Hill: Columbus, OH, USA, 1967. [Google Scholar]
  119. Organisciak, P.; Acar, S.; Dumas, D.; Berthiaume, K. Beyond semantic distance: Automated scoring of divergent thinking greatly improves with large language models. Think. Ski. Creat. 2023, 49, 101356. [Google Scholar] [CrossRef]
  120. Alhashim, G.; Alhashim, A.G.; Marshall, M.; Marshall, M.; Hartog, T.; Jonczyk, D.R.; Jo’nczyk, R.; Hell, P.J.v.; Siddique, P.Z.; Siddique, Z. Wip: Assessing creativity of alternative uses task responses: A detailed procedure. ASEE Annu. Conf. Expo. Conf. Proc. 2020, 2020, 1656. [Google Scholar]
  121. Forthmann, B.; Gerwig, A.; Holling, H.; Çelik, P.; Storme, M.; Lubart, T. The be-creative effect in divergent thinking: The interplay of instruction and object frequency. Intelligence 2016, 57, 25–32. [Google Scholar] [CrossRef]
  122. Henry, J.; Crawford, J. A meta-analytic review of verbal fluency deficits in schizophrenia relative to other neurocognitive deficits. Cogn. Neuropsychiatry 2005, 10, 1–33. [Google Scholar] [CrossRef]
  123. Henry, J.D.; Crawford, J.R. Verbal fluency deficits in parkinson’s disease: A meta-analysis. J. Int. Neuropsychol. Soc. 2004, 10, 608–622. [Google Scholar] [CrossRef]
  124. Laws, K.R.; Duncan, A.; Gale, T.M. ‘Normal’ semantic–phonemic fluency discrepancy in alzheimer’s disease? A meta-analytic study. Cortex 2010, 46, 595–601. [Google Scholar] [CrossRef]
  125. Ardila, A.; Ostrosky-Solís, F.; Bernal, B. Cognitive testing toward the future: The example of semantic verbal fluency (animals). Int. J. Psychol. 2006, 41, 324–332. [Google Scholar] [CrossRef]
  126. Rofes, A.; de Aguiar, V.; Jonkers, R.; Oh, S.J.; DeDe, G.; Sung, J.E. What drives task performance during animal fluency in people with alzheimer’s disease? Front. Psychol. 2020, 11, 1485. [Google Scholar] [CrossRef]
  127. Shao, Z.; Janse, E.; Visser, K.; Meyer, A. What do verbal fluency tasks measure? Predictors of verbal fluency performance in older adults [Original Research]. Front. Psychol. 2014, 5, 772. [Google Scholar] [CrossRef]
  128. Abdi, H. Holm’s sequential bonferroni procedure. Encycl. Res. Des. 2010, 1, 1–8. [Google Scholar]
  129. Holinger, M.; Kaufman, J.C. The relationship between creativity and feedback. In The Cambridge Handbook of Instructional Feedback; Cambridge University Press: Cambridge, UK, 2018; pp. 575–587. [Google Scholar] [CrossRef]
  130. Hernandez Sibo, I.P.; Gomez Celis, D.A.; Liou, S. Exploring the landscape of cognitive load in creative thinking: A systematic literature review. Educ. Psychol. Rev. 2024, 36, 24. [Google Scholar] [CrossRef]
  131. Rodet, C.S. Does cognitive load affect creativity? An experiment using a divergent thinking task. Econ. Lett. 2022, 220, 110849. [Google Scholar] [CrossRef]
  132. Redifer, J.L.; Bae, C.L.; Zhao, Q. Self-efficacy and performance feedback: Impacts on cognitive load during creative thinking. Learn. Instr. 2021, 71, 101395. [Google Scholar] [CrossRef]
  133. Beaty, R.E.; Silvia, P.J. Why do ideas get more creative over time? An executive interpretation of the serial order effect in divergent thinking tasks. Psychol. Aesthet. Creat. Arts 2012, 6, 309–319. [Google Scholar] [CrossRef]
  134. Silvia, P.J. Intelligence and creativity are pretty similar after all. Educ. Psychol. Rev. 2015, 27, 599–606. [Google Scholar] [CrossRef]
  135. Anderson, B.R.; Shah, J.H.; Kreminski, M. Evaluating creativity support tools via homogenization analysis. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11–16 May 2024. [Google Scholar]
  136. Merseal, H.M.; Luchini, S.; Kenett, Y.N.; Knudsen, K.; Bilder, R.M.; Beaty, R.E. Free association ability distinguishes highly creative artists from scientists: Findings from the big-c project. Psychol. Aesthet. Creat. Arts 2025, 19, 495–504. [Google Scholar] [CrossRef]
  137. Saretzki, J.; Andrae, R.; Forthmann, B.; Benedek, M. Investigation of response aggregation methods in divergent thinking assessments. J. Creat. Behav. 2025, 59, e1527. [Google Scholar] [CrossRef]
  138. Chen, W.-Y. Intelligent tutor: Leveraging chatgpt and microsoft copilot studio to deliver a generative ai student support and feedback system within teams. arXiv 2024. [Google Scholar] [CrossRef]
  139. Jacobsen, L.J.; Weber, K.E. The promises and pitfalls of large language models as feedback providers: A study of prompt engineering and the quality of ai-driven feedback. AI 2025, 6, 35. [Google Scholar] [CrossRef]
  140. Steiss, J.; Tate, T.; Graham, S.; Cruz, J.; Hebert, M.; Wang, J.; Moon, Y.; Tseng, W.; Warschauer, M.; Olson, C.B. Comparing the quality of human and chatgpt feedback of students’ writing. Learn. Instr. 2024, 91, 101894. [Google Scholar] [CrossRef]
  141. Herlambang, M.B.; Taatgen, N.A.; Cnossen, F. The role of motivation as a factor in mental fatigue. Hum. Factors 2019, 61, 1171–1185. [Google Scholar] [CrossRef]
  142. Stella, M.; Hills, T.T.; Kenett, Y.N. Using cognitive psychology to understand gpt-like models needs to extend beyond human biases. Proc. Natl. Acad. Sci. USA 2023, 120, e2312911120. [Google Scholar] [CrossRef]
  143. Zeng, L.; Proctor, R.W.; Salvendy, G. Can traditional divergent thinking tests be trusted in measuring and predicting real-world creativity? Creat. Res. J. 2011, 23, 24–37. [Google Scholar] [CrossRef]
Figure 1. Creativeable GUI. Left: Screenshot of the page where participants are asked to write a creative story with the provided word triplet. Right: Screenshot of the summary and feedback page, shown to participants after they submit their creative short story and answer a few reflection questions about their creative process.
Figure 2. Validation of experimental design. Sankey diagrams showing how the AI trainer dynamically adjusted task difficulty levels for participants across five rounds based on their performance and reported fatigue. Thicker flows represent a greater number of participants transitioning between difficulty levels. Difficulty levels (as defined in Section 2.2.3): E—easy; ME—medium–easy; MH—medium–hard; H—hard. The X-axis represents the five training rounds.
Figure 3. Creativity improvements across conditions. (left; AUT originality) Average AUT originality scores (Y-axis) before and after the training for each of the condition groups (X-axis), with 95% confidence intervals.
Figure 4. Fluency improvements in the AUT across conditions. Average number of AUT responses (Y-axis) before and after the training for each of the condition groups (X-axis), with 95% confidence intervals.
Figure 5. Fluency improvements in the SFT across conditions. Average number of fluency responses (Y-axis) before and after the training for each of the condition groups (X-axis), with 95% confidence intervals.
Figure 6. Distribution of improvement suggestion themes.
Figure 7. Average usefulness scores of suggestion themes with 95% confidence intervals. Responses were generated during the reflection phase in the F/VL and F/CL conditions.
Table 1. Demographic information of the four experimental conditions.
                F/VL           F/CL           NF/VL          NF/CL
N               97             98             93             97
M/F             41/55          57/41          42/51          39/57
Age             31.7 (8.2)     28.8 (8.43)    29.12 (8.0)    29.0 (8.06)
Education       14.21 (2.76)   13.9 (1.85)    13.87 (2.0)    14.27 (2.29)
Note—N—number of participants. M/F—number of male and female participants. Age—average age in years. Education—average number of education years (standard deviation in parentheses).
Table 2. Examples of general feedback given by the AI trainer to participants in the feedback conditions (F/VL and F/CL).
Examples of General Feedback Provided by the AI Trainer
The story is a thoughtful commentary on societal norms surrounding poetry, effectively using the prompt words to highlight feelings of disconnection. It could be enriched by a vivid scene or character to ground these abstract concepts.
The story presents a cynical, introspective tone that effectively incorporates the given words but could benefit from a clearer narrative or emotional arc.
The story cleverly integrates the three required words, painting an image of escapism and the juxtaposition of mundane life with grandiose dreams. However, the connection between the man’s transformation and the context (bowling and astronomy) could be more cohesive and purposeful.
The story features an unexpected twist on the concept of romance and incorporates the prompt words with a touch of humor. Be cautious with sensitive topics to ensure they’re handled appropriately and with the consideration of potential readers.
The story paints a vivid image of an evocative and historically rich setting, successfully weaving the prompt words into a tapestry of jazz culture. The tale provides a fanciful origin story for jazz that invites readers to visualize the unconventional logo and the atmosphere within the club.
