Article

Analyzing the Impact of a Structured LLM Workshop in Different Education Levels

Department of Computer Science, University of Ruse, 7017 Ruse, Bulgaria
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(14), 6280; https://doi.org/10.3390/app14146280
Submission received: 10 June 2024 / Revised: 16 July 2024 / Accepted: 17 July 2024 / Published: 18 July 2024
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract
An observation is made on the current state of teaching large language models (LLMs) in education, and the problem of a lacking structured approach is defined. A methodology is created to serve as the basis of a workshop teaching students with different backgrounds the correct use of LLMs and their capabilities. A workshop plan is created; instructions and materials are presented. A practical experiment has been conducted by dividing students into teams and guiding them to create a small project. Different LLMs are used to create a fictional story, images relating to the story, and very simple HTML, JS, and CSS code. Participants are given requirements that consider the limitations of LLMs, and different approaches to creatively solving the issues that arise from those requirements are observed. The students’ projects are hosted on the web so that they can see the results of their work and use them as motivation for their future development. A survey is created and distributed to all the participating students. The results are analyzed from different angles, and conclusions are drawn on the effectiveness of the workshop in achieving its goal of solving the defined problem.

1. Introduction

With the fast-paced development of large language models (LLMs), their inclusion in the education system has become a necessity, as they give a competitive advantage to students. Teaching students how to enhance their work by using such tools is a double-edged sword, as it can lead to dulling their minds and developing an over-reliance on external thinking. With this problem in mind, a structured approach has been used to create a workshop that takes the use of LLMs into account when evaluating tasks.
It must be mentioned that in the near future, it will probably be necessary to create most task requirements with the students’ use of LLMs in mind. This will gradually increase the difficulty of tasks tremendously, further developing reliance on LLMs, but at the same time giving students challenges of a different kind: assembling solutions and stitching together complex projects. The results of such tasks will raise the overall quality of products that can be created with minimal effort, while at the same time sharpening future generations’ minds in a different way: recognizing which tools are the most appropriate for various tasks and how to utilize them successfully and efficiently.
In order to solve the problem of lacking a structured approach to teaching students in schools and higher education the correct way to include LLMs in their own workflows, as well as show them their limitations, a workshop methodology has been created and then applied in an experiment. To obtain more varied results, several groups were targeted: 120 students in high schools with different profiles and 30 university students in Software Engineering. During their project presentations, all participants were asked questions about their experience and the results were satisfactory across all the different groups that were participating in the workshop. To ensure the impartiality of opinions, an anonymous survey was conducted, and its results were analyzed, helping with identifying flaws in the workshop and the methodology, as well as its impact on the learning process.
Applied artificial intelligence (AI) and large language models (LLMs) are currently transforming many industries and aspects of life, particularly in human–robot environments and their interactions [1], in solving complex industrial and automation issues [2], as well as helping complete sustainable agriculture and development goals [3]. Classifications and reviews of different types of LLMs, AI, and their architectural peculiarities have been made [4], noting the complexity of analyzing the impact of a currently developing field due to the enormity of the research and data overall. As such, generalized overviews are difficult to create and verify, but are extremely useful to bring a researcher up to date with most of the recent changes on the topic. It is impossible to keep track of the progress of all branches of AI in all fields, as monitoring all branches of science requires a tremendous amount of time. Perhaps an AI summarization of its own applications and progress is the future of research reviews going forward, but it is not yet trustworthy enough to provide accurate and extensive information that can be peer-reviewed and verified [5]. Currently, there are difficulties with analyzing research content appropriately, and the success rate is of varying degrees [6].
The problem defined in the current paper requires a sufficient understanding of the different approaches to applying LLMs in practice. Interesting research has been conducted to analyze the behavioral patterns of GPT-4 using personality tests [7]. The generative pre-trained transformer has developed so much that it can be difficult to distinguish from a human being, and the variety of personalities it can exhibit on demand makes it an interesting case study. Owing to this capability, research has been conducted on how it can help autistic individuals develop communication skills [8]. Agency is given to the individuals themselves on whether to use it as a conversational agent, as a search tool, or as a type of “buffer” between them and other people. It must be noted that, just as with conversational robots, the lack of human-to-human communication may negatively affect such individuals’ communication skills. It is an extraordinarily complex field of study because of the variety of types of autism, and in the future, conversational LLMs will be able to help positively because of their capabilities and flexibility.
Using custom GPTs for assessing results in particular fields is becoming increasingly popular as the degree of automation helps alleviate the burden of repetitive manual work from reviewers. The field of topography benefits from this due to the recently expanded GPT repertoire [9]. In engineering education, the impact of using AI chatbots has been analyzed and the influence of such tools is extreme [10]. The general use of GPT among learners (high school and university level) has been noted and polled [11], leading to interesting results, as some fairly young people decided against the use of such tools and stated opinions that can perhaps form the view of the next generation’s will and goals.
Models used to help with software development processes are the most advanced, as the field is particularly eager to implement them at all stages of their methodologies. A particularly important step that can be enhanced by using AI is the writing of software requirements [12]. Many relatively recent problems can find solutions in personalized LLMs, and research topics on specific problems, such as trajectory and movement algorithms using cameras, can be expanded on [13], as well as putting mathematical simulation models into practice and testing and integrating them in a more robust way [14].
Trending topics of Scopus and WoS—ChatGPT and AI—are intensively discussed from different points of view: pedagogical, technological, educational, psychological, ethical, and legal, with all the benefits for stakeholders in education. Their detriments are also present, along with all the risks and negative effects on educational practice [15]. There have been studies that showcase different effects of using AI in education in the long-term [16] and for very specialized cases [17]. Analyses have been made for the uses of chat LLMs in higher education and their effectiveness [18], while some evaluate the impact when including those tools in mathematics in engineering education [19].
Efforts have been made to create multidisciplinary methodologies that help with explaining sustainability [20]. Those approaches are applicable to the current problem; multidisciplinary knowledge is required for improving results and applying a structured approach to informing and teaching everyone about the practical applications of AI. People with technical backgrounds and relevant qualifications who understand how to use LLMs effectively are too few. Raising awareness and education levels helps with achieving UN goals 10.1 and 10.3: reducing inequality and increasing opportunity. Instilling a useful skill package in workshop participants is of particular importance to achieving actual positive results, not just ticking a box in a requirements document stating that “something positive has happened”. Practical skills, industrious minds, and financial knowledge are what solve issues and lift people out of their disadvantaged position.
Economic inequality is recognized as a barrier to economic growth and access to quality education. Employers seek and reward individuals who possess complex problem-solving skills. As technology advances, such skills are likely to continue to be valued and may even increase in value, especially when combined with communication skills. It is important that schools at all levels—K-12, community college, career and technical education, and college and university—recognize the importance of general, complex problem-solving skills for students as part of a strategy to prepare students for the world of work, thereby reducing the insidious effects of wealth concentration on the future opportunities of all [21].
European sustainable development goals incentivize activities that reduce inequality of opportunity, and companies that comply with SDGs have a positive effect on overall sustainable development and social welfare [22]. In the country where the workshop was conducted (Bulgaria), there is enormous potential for such sustainable development [23], and steps must be taken to raise the qualifications of everyone in education in a sustainable way. Using AI and LLM tools is the way forward for people to complete projects and raise their qualifications and abilities quickly, so they can have a buffer of time for long-term studying and permanent lifestyle change.
Khan Academy was one of the first large companies working in the field of education to include an LLM assistant (GPT) [24]. Their version is heavily moderated and does not solve the currently defined problem, as students are using a restricted version of an LLM that is tailored to helping them. Students must see the limitations and constraints they will face when using the most popular LLM tools like ChatGPT, as well as recognize the value in tools that use models with deeper prompt and model-setting complexity levels, like Stable Diffusion.
The lack of a structured approach to teaching the use of GPTs and LLMs to many people in many different fields is a difficult problem to solve [25], and as such, a creative approach is necessary. It is inevitable for motivation to decrease as the automation of current jobs becomes commonplace; thus, teaching everyone the creative uses of LLMs is necessary for their improvement.
Currently, the formal education system does not introduce large language models. There are several initiatives for students at the university level to delve into the topic. These courses and workshops aim to showcase the various technologies rather than teach the actual skill of using LLMs, explain their limitations, and show their connections to other fields of AI: machine learning, natural language processing, computer vision, and neural networks. An approach with stringent requirements for practical project results is necessary to let people experience the different aspects of LLMs.
The structured approaches that currently are used for teaching in formal education are lectures, practical exercises, and discussions. The workshop applies all these methods. It starts with a short theory introduction in the form of a lecture, then discussions are held about the topics of the participants’ work and the requirements. The practical exercise is prevalent as it takes the most time and effort, and a short discussion is held with each team after they present their work, in the form of an interview. Critical issues and questions are, of course, discussed during the practical work.

2. Materials and Methods

The research methodology (Figure 1) used to create the idea for the workshop and conduct the experiment is similar to that of the operational action research. It consists of several steps: defining the problem, brainstorming for ideas on how to solve it, creating an outline for the workshop, creating Supplementary Materials, testing the workshop personally in a closed environment, improving the workshop plan and materials iteratively, ensuring equipment is available and robust enough to handle the tasks, conducting the workshop with participants in the experiment, helping participants finish their work, performing short interviews and questionnaires with the participants, and collating and analyzing data to develop conclusions.
In order to solve the defined problem—lack of a structured approach to teaching working with LLMs to students—a methodology has been defined. It consists of several different steps, ensuring each different technology has been used to successfully complete the assigned project. The graphic representation of the methodology can be seen in Figure 2.
Step 1 (done only with high school students and omitted for university students) consists of separating students based on their study goals. A list is made of the people who have chosen a technical specialty to study, such as anything in IT or computer engineering. They are then grouped with one or two other students who have chosen a non-technical specialty. Depending on the number of people, groups of two or three are made, each representing a team that will complete a small project.
Their task is defined as follows: create a website with at least several pages, each containing a part of a story, in any genre, that is at least 3000 words long in total. The pages must feature the story’s characters (a minimum of three) in at least three distinct locations, with the generated images used as backgrounds for each page. All the text and images must be generated by AI.

2.1. Start of the Iterative Process

Step 2: Use prompts in GPT to create the text of the story. The difficulty in this step comes from the requirement—“story must be at least 3000 words”. GPT cannot write consistent messages above 1000 words. Students must grasp the intricacies of how to solve that issue. Several interesting approaches have been used that will be described in the next part.
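The summarize-and-continue workaround that participants discovered in this step can be sketched in a few lines of Python. The `generate` callback is a hypothetical stand-in for any chat-model API call; the loop itself is illustrative and was not part of the workshop materials.

```python
def continue_story(generate, opening_prompt, min_words=3000, max_rounds=20):
    """Extend a story past the model's per-response length limit by feeding
    a summary of the text so far back into each continuation prompt.

    generate: a callable that takes a prompt string and returns model text
    (hypothetical; plug in any chat-model API here).
    """
    story = generate(opening_prompt)
    rounds = 0
    while len(story.split()) < min_words and rounds < max_rounds:
        # Summarize what exists so far, so the continuation prompt stays
        # short while preserving characters and plot.
        summary = generate("Summarize the story so far:\n" + story)
        story += "\n" + generate(
            "Continue this story. Summary of the earlier parts:\n" + summary)
        rounds += 1
    return story
```

The same pattern, done by hand in the chat interface, is what several teams converged on during the experiment.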
Step 3: Use prompts in GPT to generate prompts for visual character definitions. In turn, use those definitions in an AI tool to generate images for each character. Students are given the opportunity to remotely use Stable Diffusion deployed locally on computers at the University of Ruse. After a satisfactory image has been generated for each of the characters that the story will be about, the process is repeated for the locations to supplement the story. The difficulty in this step is in fine-tuning the prompts and settings for the image generation to reach satisfactory results. Here, students gain hands-on experience with a lower-level image-generation model.
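The main lesson of this step, that concrete visual descriptors steer image models far better than abstract character qualities, can be illustrated with a small prompt-assembly helper. The function, the trait values, and the default style string are all hypothetical, for illustration only.

```python
def build_character_prompt(name, visual_traits,
                           style="digital painting, highly detailed"):
    """Assemble a descriptive image-generation prompt from visual traits.

    Concrete visual cues (hair, clothing, surroundings) work far better
    for diffusion models than abstract qualities such as "kind" or
    "brave", which have no direct visual form.
    """
    parts = [f"portrait of {name}"] + list(visual_traits) + [style]
    return ", ".join(parts)

# illustrative usage with made-up character details
prompt = build_character_prompt(
    "Elira", ["silver hair", "green cloak", "misty forest background"])
```

Iterating on the `visual_traits` list mirrors the iterative prompt refinement the participants performed by hand.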
Step 4: Use prompts in GPT to create the code for web pages that will contain the story and the images. Fine-tune the prompts in a way that will present the story better.
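A minimal sketch of the kind of static page participants asked GPT to produce: one act of the story, with a generated image as the page background. The template is an assumption about a typical output shape, not code from the workshop itself.

```python
from string import Template

# One act of the story per page, with the AI-generated image as background.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
<title>$title</title>
<style>
  body { background-image: url('$background'); background-size: cover; }
  .story { max-width: 40em; margin: 2em auto; padding: 1em;
           background: rgba(255, 255, 255, 0.85); }
</style>
</head>
<body>
<div class="story"><h1>$title</h1><p>$text</p></div>
</body>
</html>""")

def render_page(title, text, background):
    """Render one act of the story as a standalone HTML page."""
    return PAGE.substitute(title=title, text=text, background=background)
```

Three such pages, one per act, satisfy the minimum structure of the assigned project.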

2.2. End of Iterative Process

Step 5: Follow the provided instructions to complete the procedure for hosting a website for free on github.io. Host the webpages.
Step 6: Create and rehearse presentations. A template is provided that can be followed. Rehearsal helps with learning how to bond in teams, as well as preparing to do a better final job.
Step 7: Teams present the result of their efforts. A necessary step—many students currently lack the required skills to present their work and themselves, and this step helps them gain valuable experience. Questions are asked after the presentations, in an interview fashion, mainly for feedback purposes and to help identify difficult parts of the workshop.
Step 8: Evaluation based on the achieved results. A feedback survey is distributed in order to gain data and improve the workshop further.

3. Results

3.1. Practical Outcome and Preliminary Data Analysis of the Workshop Results

The experiment was conducted by four university researchers working in the field of computer science, three of whom took the role of action researchers for the experiment, closely overseeing and guiding the research process.
The workshop experience for each step during the experiment is detailed here:
Participants are provided with the entire plan for the workshop, as well as instructions for each step. They are given an explanation for their task: they have to create a story that they will present in front of their peers in the form of a website, containing three web pages, each page a different act in their story, having at least three different characters and locations.
Step 1: When given the right to choose teammates, participants did not form groups based on their interests, qualifications, or skills, but on previous acquaintance and friendship. There were very few cases where participants had a huge disparity in general IT knowledge. For the specialized participants (high school students studying mathematics and informatics), there were gaps in knowledge depending on their previous courses, but they were grouped based on predefined criteria (explained in Section 2).
Step 2: Choosing a genre was difficult for some participants, but most chose the default one—fantasy. More than half of the students (about 60%) chose to place their story in “standard” environments, and the rest chose atypical and creative locations. The same rough estimate can be made about the characters; about 40% of the teams took an interesting approach to their creations and made them non-standard (e.g., a mushroom as a main character, a sponge, anti-hero characters, characters with criminal or sociopathic behavior, etc.). The limit of 3000 words led to creative solutions; some participants chose to ask GPT to summarize the first part of the story and fed that summary back as a prompt for it to continue. One person chose to make many short stories with the same tone and characters, leading to interesting results. Another chose to format their story as a fictional advertisement with a story embedded in each of its products. The results of this step prove that many people thrive and exert creativity when given “game-breaking” requirements. Another thing to note is that the character names proposed by GPT almost exclusively began with E, L, and N. There were many cases where the same name (or a very similar one) appeared in multiple stories, which prompted many participants to name their characters themselves. Faced with these difficulties, participants had to iteratively solve their issues and move on.
Step 3: Most participants used the suggested tool and methods for generating images: Stable Diffusion. Others used a variety of free tools on the internet: Dall-e 3, Bard/Gemini, Dream (Wombo), Midjourney, Leonardo.ai, etc. Prompt generation proved somewhat challenging for many of the participants; they first tried to describe the feelings and spiritual qualities of the characters, and after several results they started being more descriptive of their visual qualities. They generally understood the value of descriptive prompts and how visual cues are more helpful for the generation process. Most participants had to do multiple iterations of their prompts to reach satisfactory results. Several prompt modifications included changing physical descriptions of characters, including terms for realism and style, adding location descriptions to get a better-looking character, character clothes descriptions, and others.
The displayed results prove that LLMs can create sufficiently beautiful images based on user specifications that can resonate with their aesthetic tastes.
Several teams used different tools to generate the same images and some of their results can be seen in Figure 3. Stable Diffusion online and Leonardo.ai were among the popular choices of the various available online options (which all have limited tokens and usage).
Step 4: Most participants at the high school level (those in grade 11) had difficulty managing code, as they had never coded anything, even in HTML and CSS. Their difficulties came in the form of the novelty of coding and file management: where to put their code, how to visualize the result and the images, etc. Those difficulties made the students think and try to solve the issues themselves. It must be noted that some teams excelled and solved these challenges without any prior knowledge of scripting (or any other) programming languages, thus developing problem-solving skills and perhaps a deeper understanding of how technology works. Other participants (those in grade 12 and, of course, university students) had no issues using the code generated through GPT to create their own static web pages. Some of them learned new applications of CSS to make their websites appear more beautiful and personalized.
Step 5: Publishing their projects in GitHub gave all participants the opportunity to work on their projects on their own time, not only during the workshop duration. Most were interested in working on the projects in their own time and were a part of the increased result success rate (compared to their usual levels of contributions, results, and achievements). One of the challenges in this step was creating GitHub accounts for those without any IT background, so for those groups that needed more time, they were given the task of creating and setting up their personal accounts in their own time.
Step 6: Creating the presentations was an interesting process, as students were impressed by their own creations. In order to assure consistency among the project presentations, a template was supplied to everyone, but the rules of the workshop allowed them full freedom of expression if they wanted to use other tools or ways of presenting—as long as they fulfilled the minimum requirements. The obligatory parts of the presentations were described: tools used for text and image generation, the prompts that were used, some of the generated images (for both locations and characters), and interesting moments during their work (these would also help the researchers identify patterns as well as resonate with listeners who had similar issues or positive moments).
Step 7: Presenting the results led to interesting discussions based on participant experience and interesting moments during the workshop. All participants (it is extremely rare for the entirety of a participating group to be unanimous on any topic) were satisfied with the workshop and had at least a bit of fun during it. The task provided them with the opportunity to see a different aspect of using AI.
Step 8: During the evaluation step, participants were graded based on the scales and points for project completion and presentation. Results were more than satisfactory. Participants were motivated to use such tools for creative purposes in their own time, as helping them complete a framework for a successfully completed project gave them a good starting point. The survey was conducted during this phase, and most participants answered the posed questions.
The general feedback received during this step by different groups was as follows:
Those with an Art background (specialized subjects focused on art) shared that they would use LLM tools for many of their personal projects, as it gives them a new direction to go in. Their stories contained the most detailed descriptions and the best images that closely resembled what they wanted the AI to create.
Those with a managing background (specialized in entrepreneurship and IT) shared that they would use AI in a small number of their projects. Their projects also had detailed descriptions, but the resulting images lacked vibrant colors, and the stories lacked good culmination. Most of their projects had a dark theme and muted colors.
Those with a mathematics background (specialized in mathematics and IT) had detailed websites, a good variety of images, and used interesting approaches to showcasing their characters in different ways on their webpages, using medium and advanced CSS styling. They were aware of the use of AI to help with code generation but were surprised to learn of its limitations when given tasks that push it to the maximum. They were most excited about hosting their website for free without advertisements and learning about github.io.
The youngest participants shared that they made their first steps in HTML and CSS and liked it, while the older participants liked using cutting-edge programming tools and powerful AI as a part of the workshop.
Several important points can be taken away from the various opinions of the teams:
-
AI needs constant reminders about who the characters are, as it often hallucinates new companions.
-
The 3000-word minimum required constant new prompts that contained explanations and previous data.
-
The image results were imperfect; for most people, at least several tries were necessary to achieve results that had all fingers attached and had no missing limbs or broken arm shapes.
-
Generated images differed from what was expected—revealing the necessity of improving prompt skills.
-
The most important: it is not enough to simply copy the information from an AI chatbot, because without sufficient understanding, it cannot be used effectively. There needs to be a strong link between information, technology, and application.
Taking into account the different viewpoints of participants with slightly different backgrounds in a similar age group, we can conclude that limit-pushing requirements are what is necessary to push students to develop their critical thinking and broaden their understanding in a particular field. To obtain better results, it is not enough to just formulate their problem and wait for external help, but prerequisite thinking is required. Of course, the general information from ChatGPT and similar tools must always be verified or there is a chance for students to end up with wrong data, which is why the defined tasks when working with LLMs should not be reliant on facts, thus going in the direction of creative solutions.
In order to analyze the variety of prompts used by students and their diverse results, the YOLOv5 image recognition model was used to identify the types of objects in the images. Considering the accuracy of the model, it recognized the following objects in the images: out of 784 total images, 704 contained a character/person. The recognized object type distribution can be seen in Table 1.
Objects that were found fewer than five times (a total of 48) were omitted for brevity. A total of 784 images were analyzed, and 1544 objects were classified in those generated images. It should be noted that more optimized models can be used for image and facial recognition in very precise tasks, such as driver drowsiness recognition [26]. The chosen model was used mostly for illustrative purposes, as object recognition is not the goal of the current study. Behavioral patterns in people and object classification when given image-generation tasks can be a topic of future research.
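The per-image tallies reported above can be reproduced from raw detector output with a short aggregation routine. The data shape here (image name mapped to the list of class labels the detector returned) is an assumption, and the sample data is illustrative, not the study’s actual output.

```python
from collections import Counter

def summarize_detections(detections):
    """Aggregate per-image detection labels.

    detections: dict mapping image name -> list of detected class labels.
    Returns (total images, images containing a person, overall label counts).
    """
    label_counts = Counter(
        label for labels in detections.values() for label in labels)
    with_person = sum(
        1 for labels in detections.values() if "person" in labels)
    return len(detections), with_person, label_counts

# illustrative data, not the study's actual detections
demo = {
    "img1.png": ["person", "horse"],
    "img2.png": ["person"],
    "img3.png": ["dog"],
}
total, persons, counts = summarize_detections(demo)
```

Applied to all 784 images, the same routine yields the distribution shown in Table 1.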
It can be noted that the recognition level is high enough based on sampled manual validation, as the images are vastly different, and some have defects that can make them difficult to categorize.
A part of the output of Stable Diffusion during the workshops can be seen in Figure 4. A wide variety of styles and prompts are used to achieve visual results to a varying degree of quality.

3.2. Surveys and Interviews

Using the Google Forms platform, a survey was created and conducted with the participants anonymously to ensure their unbiased opinions were presented. It contains 33 questions that have been grouped into categories, as shown in Table 2.
The reliability of the questionnaire was assessed using Cronbach’s alpha, a measure of internal consistency. The survey was deemed moderate and acceptable with the value of 0.616 for Cronbach’s alpha. The value is not the highest due to the broadness of the asked questions. Keeping in mind the wide range of criterion types that were chosen to be measured, it was deemed that the statistical data were acceptable and appropriate enough for validating the initial steps to the solution to the problem [27], particularly in the field of science education [28].
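For reference, Cronbach’s alpha can be computed directly from the per-question response columns. This is a minimal sketch of the standard formula (using population variances), not the SPSS procedure the study actually used.

```python
def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.

    items: one list per question, each holding one response per
    participant (all lists the same length).
    """
    k = len(items)       # number of questions
    n = len(items[0])    # number of respondents

    def var(xs):         # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Per-respondent total scores across all questions.
    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))
```

Values near 1 indicate highly consistent items; the study’s 0.616 reflects the deliberately broad mix of question types.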
The questions in the survey have been divided into five categories:
-
Story text—questions related to the creation of the story part of the project: Q1, Q2, Q6, Q9, Q10, Q12, Q15, and Q16.
-
Images—questions related to the generation of the images for the characters and the locations: Q3, Q4, Q5, Q7, Q8, Q11, and Q13.
-
Code—questions related to the generation of the code (HTML, CSS, JavaScript, and the complementary libraries): Q14, Q17, Q18, and Q19.
-
Results—questions related to the difficulties encountered on the path to successful results, as well as whether the participants’ expectations were met: Q24, Q25, Q26, Q27, Q30, Q31, and Q32.
-
General—questions related to the general experience, feedback, and future thoughts on the workshop, the methodology, and the process itself: Q20.1, Q20.2, Q21, Q22, Q23, Q28, and Q29.
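The grouping above can be captured as a simple lookup table (question IDs taken verbatim from the survey), e.g. for routing responses during analysis; the variable names are illustrative.

```python
# Category -> question IDs, as defined for the survey.
SURVEY_CATEGORIES = {
    "Story text": ["Q1", "Q2", "Q6", "Q9", "Q10", "Q12", "Q15", "Q16"],
    "Images":     ["Q3", "Q4", "Q5", "Q7", "Q8", "Q11", "Q13"],
    "Code":       ["Q14", "Q17", "Q18", "Q19"],
    "Results":    ["Q24", "Q25", "Q26", "Q27", "Q30", "Q31", "Q32"],
    "General":    ["Q20.1", "Q20.2", "Q21", "Q22", "Q23", "Q28", "Q29"],
}

# Inverse mapping: question ID -> category.
QUESTION_CATEGORY = {
    q: cat for cat, qs in SURVEY_CATEGORIES.items() for q in qs
}
```

The five lists together cover all 33 questions of the survey.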
The short interviews conducted after each participant presented their work consisted of questions pertaining to the topic of difficulties during the workshop. The overwhelming majority of participants’ answers coincided with the researchers’ expectations; the requirements imposed by the workshop rules were going beyond the limits of the LLMs. Most participants had to come up with their own way to solve those issues.
The validity and reliability of the questionnaire items was assured (as much as possible) by:
(1)
Pruning questions whose answer formats were not compliant with the Likert scale.
(2)
Random errors were eliminated as much as possible, due to the nature of collecting the responses—the survey was online, so no errors due to wrong reporting were possible.
(3)
Systematic errors were prevented due to choosing only the most important questions throughout the questionnaire. All questions that had the possibility of including such errors were pruned.
(4)
The possibility of misunderstanding the questions was mitigated as much as possible by introducing equalized scaling answers—“1 = bad/less, 5 = good/more”.
By using the guidelines above, the influence of errors in the survey has been mitigated as much as possible.

3.2.1. Sample Size and Description

The sample consists of students in grade 11 (from an English Language School), grade 12 (again from an English Language School), grade 12 (from a High School of Mathematics), and university students in their third year (in the Software Engineering bachelor’s degree). The total number of survey participants was 130. The detailed distribution of the participants can be seen in Table 3. Groups 11 ELS and 12 ELS specialize in a two-year informatics and information systems profile.
The information on the precise number of students and their gender was acquired from the teachers and professors who conducted the workshop experiment.
The choice of different participating institutions was made with variety and robustness of results in mind. To test whether the workshop was successful with people who have a background in logical thought and STEM, the High School of Mathematics was chosen; to verify the performance of specialized individuals, the third-year Software Engineering students were selected (they also worked alone and had to apply their knowledge in a shorter timeframe). Two different age groups were selected from a school specialized in teaching English, so that a comparison could be made between two groups with a similar background, as well as to compare them as non-specialists to the other participants.

3.2.2. Data Analysis Methods

Questions Q24, Q26, Q28, Q29, Q31, and Q32 were analyzed to verify both the overall validity of the gathered data and the group distribution dependent on participant type. All of them are part of the previously defined Results category, with the exception of Q28 and Q29, which belong to the General category. These questions were chosen because their 1–5 scale answer type is the best fit for this analysis.
The normality of each chosen question's distribution was checked using the SPSS version 25 software [29]. A null hypothesis H0 was formed, stating that the sample data are normally distributed. The data were analyzed using the Kolmogorov–Smirnov test and H0 was rejected: the asymptotic significance values are below 0.05. The exact analysis result values for the category "Results" can be seen in Table 4.
The Kolmogorov–Smirnov test for normality shows a p-value (reported as 0.000) below 0.05; therefore, the null hypothesis must be rejected. In other words, the data are not normally distributed, and nonparametric tests must be used.
The descriptive statistics show means above 3 for five of the variables. Q24 has a mean of 3.33 (SD = 0.989), indicating that ChatGPT's results deviated relatively little from participants' expectations. Q26 has a mean of 3.73 (SD = 0.900), indicating increased prompt effectiveness. Q28 has a mean of 3.46 (SD = 1.019), indicating a positive attitude toward the future use of LLM tools for code and content generation. Q29 has a mean of 3.48 (SD = 0.955), indicating increased interest in the subject of the workshop. Q31 has a mean of 3.88 (SD = 0.896), indicating generally successful task completion while working in tandem with LLMs. Q32, in contrast, has a mean of 2.81 (SD = 0.929), below the midpoint of 3; since Q32 asks about difficulty, it can be concluded that most participants did not find it difficult to work with LLM tools.
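The normality check in Table 4 can be reproduced in outline. Below is a minimal pure-Python sketch of the Kolmogorov–Smirnov statistic against a normal distribution fitted to the sample; SPSS additionally applies the Lilliefors correction when computing the p-value, which is omitted here, and the example data are invented for illustration:

```python
import math
from statistics import mean, stdev

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def ks_statistic(sample):
    """Kolmogorov-Smirnov D against a normal fitted to the sample."""
    xs = sorted(sample)
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mu, sigma)
        # Distance of the empirical CDF step from the fitted normal CDF.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Discrete 1-5 Likert answers pile up on a few values, so D is large
# and the normality hypothesis is rejected, as in Table 4.
example = [3] * 50 + [4] * 40 + [2] * 20 + [5] * 12 + [1] * 6
print(round(ks_statistic(example), 3))
```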

3.2.3. Group Comparison Analysis

The four distinct groups that were analyzed were 11 ELS, 12 ELS, 12 Math, and 3 SE, as described in Table 3.
The Q24 result analysis (seen in Table 5 and Figure 5) shows that groups 12 ELS and 3 SE encountered fewer difficulties with tone or style changes while creating their stories. An interesting conclusion is that, owing to their level of language skills, these groups could express their intent more effectively than the other participants.
The results for Q26 (seen in Table 6 and Figure 6) show that the visual descriptive skills of group 12 Math were slightly lower overall, but in general the distribution of results is similar for every group; the gap is too small to be statistically significant. It can be concluded that all groups have similar skill levels.
Question Q28 (seen in Table 7 and Figure 7) evaluates the desire of the students to continue using LLMs in the future. The answer distribution shows that the more technical and tech-oriented participants (groups 12 Math and 3 SE) were most influenced by the usage of such tools and are more likely to continue using them in the future.
Question Q29 (seen in Table 8 and Figure 8) evaluates the increase in interest in information systems and technologies in general. The answer distribution displays the overall small increase in interest of the various groups. Some participants were thoroughly impressed and motivated, while others were not.
Q31 (seen in Table 9 and Figure 9) evaluates the ability of participants to successfully communicate and work together with the LLM tools. In essence, their communication skills are instrumental in the resulting conversations and image prompts. Older and more advanced students (3 SE), as well as math-focused students (12 Math), were more successful in their symbiotic work on the project using LLMs.
Q32 (seen in Table 10 and Figure 10) evaluates the overall difficulty of achieving the desired technical result; it can be noted that 3 SE, as more experienced users and future software engineers, had almost no difficulty achieving the desired result, followed by 12 Math. The remaining groups show average difficulty levels.
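Because normality was rejected, group comparisons of this kind call for a nonparametric test. As an illustration (not the authors' exact procedure), the Kruskal–Wallis H statistic for comparing the Likert answers of the four groups can be sketched in pure Python:

```python
from itertools import chain

def average_ranks(values):
    """Assign 1-based ranks, sharing the average rank among ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j + 2) / 2  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic (without tie correction) for comparing k groups,
    e.g. the answers of 11 ELS, 12 ELS, 12 Math, and 3 SE."""
    data = list(chain.from_iterable(groups))
    n = len(data)
    ranks = average_ranks(data)
    h, start = 0.0, 0
    for g in groups:
        r_sum = sum(ranks[start:start + len(g)])
        h += r_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
```

Identical groups yield H near zero, while well-separated groups yield a large H; in practice H is compared against a chi-squared distribution with k − 1 degrees of freedom.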
A preliminary conclusion can be drawn from the answer distributions for the chosen questions: the age group and previous qualification and/or specialization matter for the statistical answer distribution and participant attitude. The participants' background influences their expectations of, and behavior toward, the LLM tools they were provided and had to work with.
Students with an additional specialization in art were the most motivated to complete the task compared to their peers. They formed their prompts carefully, using an iterative approach until they were happy with the results, and displayed the most positive attitude of all the participating groups.
Students with additional specialization in mathematics were concise and less descriptive with their prompts, needing more iterations to reach a satisfactory result. They needed more individual discussions in order to overcome the frustrations created by their brevity.
Students with additional specialization in entrepreneurship also had to modify their prompts, but were unhappy in the beginning, as most of them had already decided beforehand on their prompts and were not as malleable in their wording as their peers.
The software engineering students recognized the value of the workshop and were happy with the overall experience, understanding how to use LLMs as tools. They behaved in the most calm and professional manner, compared to the younger participants.
Based on the gathered behavioral data, it can be concluded that the workshop was positively received. Younger participants experienced more frustration while trying to get the LLMs to finish their tasks correctly.
Several screenshots from the completed participant projects can be viewed in Figure 11. A platform for preserving the results of future workshops is being considered, as the current host for the participants' projects is their personal GitHub accounts, which makes long-term hosting an issue.

4. Discussion

One limitation of this study is the number of participants; more data would help to further substantiate the suggested points. The workshop itself has prerequisites from various fields: it requires reading, writing, visual presentation of art, speaking (while presenting), prompt engineering, programming, an understanding of LLM limits, and creativity. This makes it challenging, but also beneficial, for all participants. The survey will be further refined for future workshops in order to expand the criteria for analysis. The data gathered in this study are a solid foundation for future research.
This workshop is the first of, hopefully, many to follow on similar topics. Specialized LLMs are important for each field, and disseminating the knowledge on how to train and use them is imperative for the future success of specialists in the fields of medicine, healthcare, engineering (all kinds), business, transportation, etc. The process of creating art can also benefit from these tools; it can help artists refine their processes and achieve better results. The findings of this workshop show that the format used is successful; it is simple enough so that even high school participants can complete it while also being engaging enough for university students.
All authors of this paper believe that it is an absolute necessity for LLMs to be included at all education levels. Current curricula almost everywhere evolve too slowly to incorporate AI efficiently. The lack of properly structured guides is problematic, and we believe this work is a step in the right direction. It is not enough to simply showcase some tools that make use of AI and call it "education". Students (at both high school and university level) need practical experience and limit testing in order to build a knowledge base that will keep them relevant in the coming times. In software engineering, the subject of AI has been taught for decades, but it has never been so easy to achieve useful results using such technologies. All fields of science, industry, business, agriculture, healthcare, and engineering in general are currently using and evolving LLMs. It is imperative that educators (in general, not only in schools and universities) receive the necessary tools and ideas to create workshops that will raise qualifications and everyone's level of understanding.
The workshop that has been created can be used as a base for extended courses on the topic and can be taught at any level because of the low complexity of the tasks. The "creative" approach was chosen because it is field agnostic, and every participant can later build on the achieved results in their own way: they can transfer the gained experience to creating a great CV, teaching themselves to draw with AI help, creating their dream game, or making and publishing a small web project in any field they are interested in.
It is the authors' belief that this workshop, and this paper along with it, will be useful to anyone in the field of education, whether they are just starting to explore LLM tools or have been using GPT and similar models to create and grade their tests since such tools came out. Everyone can use, modify, and apply the approaches, the methodologies, and the results, hopefully leading to a beneficial experience for all participants. It will also be useful as a starting point for people who want concrete project ideas and need a small nudge to begin with something achievable.
All good changes start from people who have a vision and the persistence to follow it through. We need to educate more people who are like that, and we need it done not for today, but for “yesterday”.

5. Conclusions

It is inevitable that LLMs will become even more widely used in all aspects of life. Currently, severe hardware limitations are slowing the more widespread adoption of LLMs for personal use. Within several years of the broad release of AI and LLM technology, many new jobs will be created in place of others that will become obsolete. People who know how to use personal LLMs will be at an advantage. Preparing everyone is imperative for an adequate transition to a new way of thinking about work. All areas of learning, education, industry, and business will require sufficient knowledge on the matter to increase motivation and lead to satisfactory achievements.
This research paper and the related workshop are among many that will try to analyze, discuss, and implement a way to navigate the successful integration of AI into everyone's lives. The limitations of the tools, the difficulty of expressing thoughts through prompts, the memory issues when keeping a continuous conversation with AI, as well as the standard hallucination and fact-checking problems, are only a few of the points that have been elaborated on during the course of this experiment.
The requirements for how people complete their tasks should take the use of LLM tools into account, at least to a reasonable degree. AI is remarkably good at mundane tasks, as long as the person using it understands what they are doing. Teaching everyone to reach that level of understanding is a duty that we, as researchers working in the field of computer science, have, and we must fulfill it however difficult it is. Completing project-based work with the help of AI in the creative process has been a successful first step toward solving the problem defined in the beginning. The lack of a structured approach to teaching the correct use of LLMs is being addressed by such initiatives. This approach has merit and should be propagated, refined, and analyzed further, in more fields and with more people from different backgrounds.
Many people start using AI expecting either an omniscient tutor or an incompetent machine. Promoting works such as this workshop helps them recognize the truth: LLMs are tools that can be used appropriately only when a person has the prerequisite knowledge, a concrete idea, and a plan. While AI models can help with applying that knowledge to realize ideas, they are limited by their own design: memory issues, hallucinations, confidence in wrong answers, artifacts in imagery, and lack of understanding.
It is imperative for LLMs to be viewed as tools with an understanding of their limitations and characteristics and not used as an easy way to outsource thinking. The long-term impact will be detrimental to all areas of human society if people are not educated that their own logical thoughts are a requirement for the successful use of LLMs. Research on the topic is currently ongoing and a multitude of experiments in all fields must be performed to verify the effectiveness of different approaches. This experimental workshop aims at recognizing the difficulties of implementing a structured approach to using LLMs in the field of education and creativity.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/app14146280/s1.

Author Contributions

V.K. created the workshop (the idea and the realization), contributed to the workshop methodology creation, conducted the workshop experiment, contributed to the literature review, supervised and supported the other professors and teachers with workshop participation, issues, and problems, ensured the availability of LLMs for the purposes of the workshop, promoted the experiment to participants, created instructions, manuals, and examples, participated in the survey creation, supervised the experiment and the statistical data collection and analysis, formatted and collated all other author data, conclusions, and discussions into the paper, collated participant results and reached conclusions based on the data gathered, and coordinated and organized the other participants. B.I. conducted the workshop, participated in the survey creation, contributed to the literature review, supervised the workshop integration with SDG and UN goals, helped with gathering statistical information, participated in the writing and description of the practical results of applying the methodology, and collated participant results and reached conclusions based on the data gathered. K.S. conducted the workshop, participated in the survey creation, participated in the literature review and methodology refinement, helped with gathering statistical information, participated in the writing and description of the practical results of applying the methodology, and collated participant results and reached conclusions based on the data gathered. M.A. performed the statistical analysis of the survey results and data collation, experimented with different statistical methods for data analysis, helped with formatting, question choice, and answer distribution analysis, contributed to creating the question categories, and collated participant results and reached conclusions based on the data gathered. All authors have read and agreed to the published version of the manuscript.

Funding

This study is financed by the European Union-NextGenerationEU, through the National Recovery and Resilience Plan of the Republic of Bulgaria, project No. BG-RRP-2.013-0001-C01.

Data Availability Statement

All the necessary materials for recreating the workshop, including instructions, are available at https://drive.google.com/drive/folders/15AH6bhjo2rqDtYS7Lvc92U2tfpTEahKU?usp=sharing (accessed on 16 July 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhu, Y.; Wang, T.T.; Wang, C.; Quan, W.; Tang, M.W. Complexity-Driven Trust Dynamics in Human–Robot Interactions: Insights from AI-Enhanced Collaborative Engagements. Appl. Sci. 2023, 13, 12989. [Google Scholar] [CrossRef]
  2. Yang, C.; Kim, J.; Kang, D.; Eom, D.S. Vision AI System Development for Improved Productivity in Challenging Industrial Environments: A Sustainable and Efficient Approach. Appl. Sci. 2024, 14, 2750. [Google Scholar] [CrossRef]
  3. Kevin, M.; Baeza-Yates, R. Responsible AI in Farming: A Multi-Criteria Framework for Sustainable Technology Design. Appl. Sci. 2024, 14, 437. [Google Scholar] [CrossRef]
  4. Raiaan, M.A.K.; Mukta, M.S.H.; Fatema, K.; Fahad, N.M.; Sakib, S.; Mim, M.M.J.; Ahmad, J.; Ali, M.E.; Azam, S. A review on large Language Models: Architectures, applications, taxonomies, open issues and challenges. IEEE Access 2024, 12, 26839–26874. [Google Scholar] [CrossRef]
  5. Sufi, F. Generative Pre-Trained Transformer (GPT) in Research: A Systematic Review on Data Augmentation. Information 2024, 15, 99. [Google Scholar] [CrossRef]
  6. De Silva, A.; Wijekoon, J.L.; Liyanarachchi, R.; Panchendrarajan, R.; Rajapaksha, W. AI insights: A case study on utilizing ChatGPT intelligence for research paper analysis. arXiv 2024, arXiv:2403.03293. [Google Scholar]
  7. Stöckli, L.; Joho, L.; Lehner, F.; Hanne, T. The Personification of ChatGPT (GPT-4)—Understanding Its Personality and Adaptability. Information 2024, 15, 300. [Google Scholar] [CrossRef]
  8. Choi, D.; Lee, S.; Kim, S.I.; Lee, K.; Yoo, H.J.; Lee, S.; Hong, H. Unlock Life with a Chat (GPT): Integrating Conversational AI with Large Language Models into Everyday Lives of Autistic Individuals. In Proceedings of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11–16 May 2024; pp. 1–17. Available online: https://dl.acm.org/doi/full/10.1145/3613904.3641989 (accessed on 20 June 2024).
  9. Almasre, M. Development and Evaluation of a Custom GPT for the Assessment of Students’ Designs in a Typography Course. Educ. Sci. 2024, 14, 148. [Google Scholar] [CrossRef]
  10. Bravo, F.A.; Cruz-Bohorquez, J.M. Engineering Education in the Age of AI: Analysis of the Impact of Chatbots on Learning in Engineering. Educ. Sci. 2024, 14, 484. [Google Scholar] [CrossRef]
  11. Valova, I.; Mladenova, T.; Kanev, G. Students’ Perception of ChatGPT Usage in Education. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 466–473. [Google Scholar] [CrossRef]
  12. Marques, N.; Silva, R.R.; Bernardino, J. Using ChatGPT in Software Requirements Engineering: A Comprehensive Review. Future Internet 2024, 16, 180. [Google Scholar] [CrossRef]
  13. Anchev, A.; Georgieva, T.; Ivanov, A. Experimental Comparison of Standard and Trajectories Movement Algorithm using Intelligent Camera. In Proceedings of the 2023 4th International Conference on Communications, Information, Electronic and Energy Systems (CIEES), Plovdiv, Bulgaria, 23–25 November 2023; IEEE: Piscataway, NJ, USA; pp. 1–6. Available online: https://ieeexplore.ieee.org/abstract/document/10378783/ (accessed on 20 June 2024).
  14. Ivanov, A.; Ivanova, M.; Anchev, A. Mathematical Models Incorporated in a Digital Workflow for Designing an Anthropomorphous Robot. In Computer Science On-line Conference; Springer International Publishing: Cham, Switzerland, 2019; pp. 306–314. Available online: https://link.springer.com/chapter/10.1007/978-3-030-19813-8_31 (accessed on 20 June 2024).
  15. Ivanova, M.; Grosseck, G.; Holotescu, C. Unveiling Insights: A Bibliometric Analysis of Artificial Intelligence in Teaching. Informatics 2024, 11, 10. [Google Scholar] [CrossRef]
  16. Guan, C.; Mou, J.; Jiang, Z. Artificial intelligence innovation in education: A twenty-year data-driven historical analysis. Int. J. Innov. Stud. 2020, 4, 134–147. [Google Scholar] [CrossRef]
  17. Hou, C.; Hua, L.; Lin, Y.; Zhang, J.; Liu, G.; Xiao, Y. Application and exploration of artificial intelligence and edge computing in long-distance education on mobile network. Mob. Netw. Appl. 2021, 26, 2164–2175. [Google Scholar] [CrossRef]
  18. Markos, A.; Prentzas, J.; Sidiropoulou, M. Pre-Service Teachers’ Assessment of ChatGPT’s Utility in Higher Education: SWOT and Content Analysis. Electronics 2024, 13, 1985. [Google Scholar] [CrossRef]
  19. Sánchez-Ruiz, L.M.; Moll-López, S.; Nuñez-Pérez, A.; Moraño-Fernández, J.A.; Vega-Fleitas, E. ChatGPT challenges blended learning methodologies in engineering education: A case study in mathematics. Appl. Sci. 2023, 13, 6039. [Google Scholar] [CrossRef]
  20. Calvo, I.; Carrascal, E.; González, J.M.; Armentia, A.; Gil-García, J.M.; Barambones, O.; Basogain, X.; Tazo-Herran, I.; Apiñaniz, E. A Methodology to Introduce Sustainable Development Goals in Engineering Degrees by Means of Multidisciplinary Projects. Educ. Sci. 2024, 14, 583. [Google Scholar] [CrossRef]
  21. Kyllonen, P.C. Inequality, education, workforce preparedness, and complex problem solving. J. Intell. 2018, 6, 33. [Google Scholar] [CrossRef] [PubMed]
  22. Perevoznic, F.M.; Dragomir, V.D. Achieving the 2030 Agenda: Mapping the Landscape of Corporate Sustainability Goals and Policies in the European Union. Sustainability 2024, 16, 2971. [Google Scholar] [CrossRef]
  23. Ionescu, G.H.; Jianu, E.; Patrichi, I.C.; Ghiocel, F.; Țenea, L.; Iancu, D. Assessment of sustainable development goals (SDG) implementation in Bulgaria and future developments. Sustainability 2021, 13, 12000. [Google Scholar] [CrossRef]
  24. Saif, N.; Khan, S.U.; Shaheen, I.; ALotaibi, F.A.; Alnfiai, M.M.; Arif, M. Chat-GPT; validating Technology Acceptance Model (TAM) in education sector via ubiquitous learning mechanism. Comput. Hum. Behav. 2024, 154, 108097. [Google Scholar] [CrossRef]
  25. Albadarin, Y.; Saqr, M.; Pope, N.; Tukiainen, M. A systematic literature review of empirical research on ChatGPT in education. Discov. Educ. 2024, 3, 60. [Google Scholar] [CrossRef]
  26. Díaz-Santos, S.; Cigala-Álvarez, Ó.; Gonzalez-Sosa, E.; Caballero-Gil, P.; Caballero-Gil, C. Driver identification and detection of drowsiness while driving. Appl. Sci. 2024, 14, 2603. [Google Scholar] [CrossRef]
  27. Raharjanti, N.W.; Wiguna, T.; Purwadianto, A.; Soemantri, D.; Indriatmi, W.; Poerwandari, E.K.; Mahajudin, M.S.; Nugrahadi, N.R.; Roekman, A.E.; Saroso, O.J.D.A.; et al. Translation, validity and reliability of decision style scale in forensic psychiatric setting in Indonesia. Heliyon 2022, 8, e09810. [Google Scholar] [CrossRef] [PubMed]
  28. Taber, K.S. The use of Cronbach’s alpha when developing and reporting research instruments in science education. Res. Sci. Educ. 2018, 48, 1273–1296. [Google Scholar] [CrossRef]
  29. Field, A. Discovering Statistics Using IBM SPSS Statistics; Sage: London, UK, 2013. [Google Scholar]
Figure 1. Research methodology (left) and the research design (right).
Figure 2. Workshop methodology.
Figure 3. Using different tools to generate the same picture.
Figure 4. A partial view of some of the generated images during the workshop.
Figure 5. Q24 result analysis result distribution.
Figure 6. Q26 result analysis result distribution.
Figure 7. Q28 result analysis result distribution.
Figure 8. Q29 result analysis result distribution.
Figure 9. Q31 result analysis result distribution.
Figure 10. Q32 result analysis result distribution.
Figure 11. Screenshots from several of the completed projects.
Table 1. Image results classified by the YOLOv5 image recognition model.

Type | Count | Type | Count
person | 704 | truck | 19
chair | 139 | handbag | 15
car | 70 | teddy bear | 14
bird | 63 | dog | 14
bottle | 62 | umbrella | 12
tie | 51 | suitcase | 11
kite | 43 | cell phone | 10
potted plant | 31 | horse | 9
vase | 29 | cake | 9
book | 29 | bench | 9
motorcycle | 28 | tv | 8
backpack | 27 | bowl | 7
dining table | 24 | sports ball | 6
sheep | 21 | traffic light | 6
cup | 20 | scissors | 5
boat | 20 | fire hydrant | 5
clock | 19 | cat | 5
Table 2. All 33 questions from the different categories, along with their answer types.

Category | Q# | Question Text | Answer Type
Story text | Q1 | How did ChatGPT help you in crafting the introduction of your short story? | Text
Story text | Q2 | Were you able to effectively communicate the setting and atmosphere of the story's beginning? | No/Yes
Story text | Q6 | Did you encounter any challenges in maintaining coherence and flow between the introduction and the first scene? | No/Yes
Story text | Q9 | How many challenges did you face in maintaining consistency in the story's tone and style during the second scene? | Scale 1–5
Story text | Q10 | Did ChatGPT aid you in resolving the conflict or climax of the story in the third scene? | No/Yes
Story text | Q12 | Did you face challenges in ensuring a satisfying and coherent conclusion to the story's narrative? | No/Yes
Story text | Q15 | Did you face any challenges in maintaining visual coherence and consistency across different parts of the story? | No/Yes
Story text | Q16 | How did ChatGPT assist you in crafting smooth transitions between different scenes of the short story? | Text
Images | Q3 | Did ChatGPT assist in generating descriptive visuals for the initial scene and characters? | No/Yes
Images | Q4 | How did you use ChatGPT to develop the interactions between the characters in the first scene? | Text
Images | Q5 | Were you able to create a visually descriptive scene for the first part of your story using ChatGPT? | No/Yes
Images | Q7 | How much did ChatGPT contribute to the development of the second scene and the introduction of new characters? | Scale 1–5
Images | Q8 | Were you able to use ChatGPT to visually describe the second scene and characters effectively? | No/Yes
Images | Q11 | Were you successful in creating visually descriptive elements for the resolution using ChatGPT? | No/Yes
Images | Q13 | How much did LLM applications assist in generating visual descriptions for characters and scenes throughout the story? | Scale 1–5
Code | Q14 | Were you able to seamlessly incorporate the generated visual elements into the HTML and CSS design of the website? | No/Yes
Code | Q17 | Were you able to effectively visualize and represent the transitions between scenes on the website using ChatGPT? | No/Yes
Code | Q18 | Did you encounter challenges in creating seamless transitions, and how did ChatGPT contribute to overcoming them? | Text
Code | Q19 | How satisfied are you with the overall assistance provided by ChatGPT in creating your short story and website? | Scale 1–5
Results | Q24 | Were there many instances where ChatGPT's responses did not align with the tone or style you wanted for your short story? | Scale 1–5
Results | Q25 | How did you handle situations where ChatGPT generated content that deviated from your intended plot or character development? | Text
Results | Q26 | How close was the image generated to what you wanted to create? How effective were your prompts? | Scale 1–5
Results | Q27 | What was the genre of your story? (Or genres, if there were multiple.) | Text
Results | Q30 | Did you use the default character names, or did you create your own? | Own/Default
Results | Q31 | How helpful were the LLM tools to you in general, when you had to complete this task? Did you manage to work together successfully? | Scale 1–5
Results | Q32 | How difficult did you find it overall to get the correct responses while using LLM tools to complete this task? | Scale 1–5
General | Q20.1 | Were there specific areas where you felt ChatGPT's assistance was particularly helpful? | Text
General | Q20.2 | Were there specific areas where you felt ChatGPT's assistance was particularly lacking? | Text
General | Q21 | In retrospect, would you consider using ChatGPT for a similar creative writing and web design task in the future? | No/Yes/Maybe
General | Q22 | Do you need any improvements or additional features you would like to see in ChatGPT to enhance its support for creative tasks? | No/Yes
General | Q23 | Were there any specific difficulties or limitations you faced in using ChatGPT for this task that you'd like addressed? (such as moderated content or ineffective prompts) | Text
General | Q28 | After working together with LLMs to generate content and code, how likely are you to use them in the future in your education? | Scale 1–5
General | Q29 | How much did working on this project increase your interest in Information systems and technologies? | Scale 1–5
Table 3. Survey participant property distribution.

Education Type | Age (in Years) | Group Name | Number of Participants | Study Level | Gender (M / F)
High School of Mathematics | 18–19 | 12 Math | 21 | High school grade 12 | 45% / 55%
English Language School | 17–18 | 11 ELS | 60 | High school grade 11 | 42% / 58%
English Language School | 18–19 | 12 ELS | 34 | High school grade 12 | 46% / 54%
Ruse University, Software Engineering, 3rd year | 21–22 | 3 SE | 15 | University 3rd year | 92% / 8%
Table 4. One-sample Kolmogorov–Smirnov test for the questions that were chosen for analysis: descriptive, statistic, and analytic data.

Descriptive statistics:

Variable | N | Mean | Std. Deviation | Minimum | Maximum
Q24 | 128 | 3.33 | 0.989 | 1 | 5
Q26 | 128 | 3.73 | 0.900 | 1 | 5
Q28 | 128 | 3.46 | 1.019 | 1 | 5
Q29 | 128 | 3.48 | 0.955 | 1 | 5
Q31 | 128 | 3.88 | 0.896 | 1 | 5
Q32 | 128 | 2.81 | 0.929 | 1 | 5

One-sample Kolmogorov–Smirnov test:

 | Q32 | Q31 | Q29 | Q28 | Q26 | Q24
N | 128 | 128 | 128 | 128 | 128 | 128
Normal Parameters a,b: Mean | 2.81 | 3.88 | 3.48 | 3.46 | 3.73 | 3.33
Normal Parameters a,b: Std. Deviation | 0.929 | 0.896 | 0.955 | 1.019 | 0.900 | 0.989
Most Extreme Differences: Absolute | 0.236 | 0.212 | 0.254 | 0.221 | 0.249 | 0.216
Most Extreme Differences: Positive | 0.225 | 0.179 | 0.254 | 0.221 | 0.189 | 0.216
Most Extreme Differences: Negative | −0.236 | −0.212 | −0.200 | −0.193 | −0.249 | −0.198
Test Statistic | 0.236 | 0.212 | 0.254 | 0.221 | 0.249 | 0.216
Asymp. Sig. (2-tailed) | 0.000 c | 0.000 c | 0.000 c | 0.000 c | 0.000 c | 0.000 c

a Test distribution is normal. b Calculated from data. c Lilliefors significance correction.
Table 5. Q24 result analysis result percentage distribution.

Q24. Were There Many Instances Where ChatGPT's Responses Did Not Align with the Tone or Style You Wanted for Your Short Story? (Many 1–None 5)

Group | n | 1—Many (Freq / %) | 2—Rather a Lot (Freq / %) | 3—Some (Freq / %) | 4—Rather None (Freq / %) | 5—None (Freq / %) | Blank, %
11 ELS | 60 | 3 / 5.00% | 5 / 8.33% | 31 / 51.67% | 15 / 25.00% | 6 / 10.00% | 0.00%
12 ELS | 34 | 0 / 0.00% | 4 / 11.76% | 9 / 26.47% | 13 / 38.24% | 7 / 20.59% | 2.94%
12 Math | 21 | 2 / 9.52% | 5 / 23.81% | 9 / 42.86% | 4 / 19.05% | 1 / 4.76% | 0.00%
3 SE | 15 | 0 / 0.00% | 3 / 20.00% | 4 / 26.67% | 5 / 33.33% | 2 / 13.33% | 6.67%
Table 6. Q26 result analysis result percentage distribution.

Q26. How Close Was the Image Generated to What You Wanted to Create? How Effective Were Your Prompts? (Not at All 1—Perfectly Alike 5)

Group | n | 1—Not at All (Freq / %) | 2—Rather Not (Freq / %) | 3—Almost (Freq / %) | 4—Alike (Freq / %) | 5—Perfectly Alike (Freq / %) | Blank, %
11 ELS | 60 | 1 / 1.67% | 4 / 6.67% | 15 / 25.00% | 29 / 48.33% | 11 / 18.33% | 0.00%
12 ELS | 34 | 0 / 0.00% | 2 / 5.88% | 11 / 32.35% | 15 / 44.12% | 5 / 14.71% | 2.94%
12 Math | 21 | 0 / 0.00% | 2 / 9.52% | 8 / 38.10% | 4 / 19.05% | 6 / 28.57% | 0.00%
3 SE | 15 | 0 / 0.00% | 0 / 0.00% | 3 / 20.00% | 8 / 53.33% | 3 / 20.00% | 6.67%
Table 7. Q28 result analysis: percentage distribution.

Q28. After Working Together with LLMs to Generate Content and Code, How Likely Are You to Use Them in the Future in Your Education? (No 1–Yes 5). Cells show Freq (Freq %); Blank is the share of unanswered items.

| Group | n | 1—Not | 2—Rather Not | 3—Maybe | 4—Closer to Yes | 5—Yes | Blank |
|---|---|---|---|---|---|---|---|
| 11 ELS | 60 | 1 (1.67%) | 7 (11.67%) | 33 (55.00%) | 14 (23.33%) | 4 (6.67%) | 0.00% |
| 12 ELS | 34 | 2 (5.88%) | 4 (11.76%) | 11 (32.35%) | 12 (35.29%) | 4 (11.76%) | 2.94% |
| 12 Math | 21 | 2 (9.52%) | 1 (4.76%) | 5 (23.81%) | 6 (28.57%) | 7 (33.33%) | 0.00% |
| 3 SE | 15 | 0 (0.00%) | 0 (0.00%) | 3 (20.00%) | 3 (20.00%) | 8 (53.33%) | 6.67% |
Table 8. Q29 result analysis: percentage distribution.

Q29. How Much Did Working on This Project Increase Your Interest in Information Systems and Technologies? (Decrease 1–Increase 5). Cells show Freq (Freq %); Blank is the share of unanswered items.

| Group | n | 1—Decreased | 2—Rather Decreased | 3—Unchanged | 4—Rather Increased | 5—Increased | Blank |
|---|---|---|---|---|---|---|---|
| 11 ELS | 60 | 0 (0.00%) | 6 (10.00%) | 31 (51.67%) | 15 (25.00%) | 8 (13.33%) | 0.00% |
| 12 ELS | 34 | 1 (2.94%) | 2 (5.88%) | 14 (41.18%) | 13 (38.24%) | 3 (8.82%) | 2.94% |
| 12 Math | 21 | 2 (9.52%) | 3 (14.29%) | 8 (38.10%) | 1 (4.76%) | 7 (33.33%) | 0.00% |
| 3 SE | 15 | 0 (0.00%) | 0 (0.00%) | 5 (33.33%) | 5 (33.33%) | 4 (26.67%) | 6.67% |
Table 9. Q31 result analysis: percentage distribution.

Q31. How Helpful Were the LLM Tools to You in General, When You Had to Complete This Task? Did You Manage to Work Together Successfully? (Not at All 1–Very Helpful 5). Cells show Freq (Freq %); Blank is the share of unanswered items.

| Group | n | 1—Not at All | 2—Rather Not Helpful | 3—Rather Helpful | 4—Helpful | 5—Very Helpful | Blank |
|---|---|---|---|---|---|---|---|
| 11 ELS | 60 | 0 (0.00%) | 0 (0.00%) | 25 (41.67%) | 23 (38.33%) | 12 (20.00%) | 0.00% |
| 12 ELS | 34 | 0 (0.00%) | 3 (8.82%) | 6 (17.65%) | 16 (47.06%) | 8 (23.53%) | 2.94% |
| 12 Math | 21 | 2 (9.52%) | 0 (0.00%) | 4 (19.05%) | 7 (33.33%) | 8 (38.10%) | 0.00% |
| 3 SE | 15 | 0 (0.00%) | 0 (0.00%) | 4 (26.67%) | 3 (20.00%) | 7 (46.67%) | 6.67% |
Table 10. Q32 result analysis: percentage distribution.

Q32. How Difficult Did You Find It Overall to Get the Correct Responses While Using LLM Tools to Complete This Task? (Not 1–Very 5). Cells show Freq (Freq %); Blank is the share of unanswered items.

| Group | n | 1—Not Difficult | 2—Closer to Not Difficult | 3—Neutral | 4—Closer to Very Difficult | 5—Very Difficult | Blank |
|---|---|---|---|---|---|---|---|
| 11 ELS | 60 | 3 (5.00%) | 13 (21.67%) | 35 (58.33%) | 7 (11.67%) | 2 (3.33%) | 0.00% |
| 12 ELS | 34 | 2 (5.88%) | 13 (38.24%) | 10 (29.41%) | 6 (17.65%) | 2 (5.88%) | 2.94% |
| 12 Math | 21 | 2 (9.52%) | 5 (23.81%) | 8 (38.10%) | 5 (23.81%) | 1 (4.76%) | 0.00% |
| 3 SE | 15 | 3 (20.00%) | 3 (20.00%) | 6 (33.33%) | 2 (16.67%) | 0 (0.00%) | 6.67% |
Share and Cite

MDPI and ACS Style

Kozov, V.; Ivanova, B.; Shoylekova, K.; Andreeva, M. Analyzing the Impact of a Structured LLM Workshop in Different Education Levels. Appl. Sci. 2024, 14, 6280. https://doi.org/10.3390/app14146280
