Article

Higher Mathematics Education and AI Prompt Patterns: Examples from Selected University Classes

1 Department of Mathematics, West University of Timisoara, 300223 Timişoara, Romania
2 Faculty of Computer Science, Bialystok University of Technology, 15-351 Białystok, Poland
3 Department of Computer Science, West University of Timisoara, 300223 Timişoara, Romania
4 School of Electrical and Computer Engineering, University of Crete, 73100 Chania, Greece
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(1), 339; https://doi.org/10.3390/app16010339
Submission received: 21 November 2025 / Revised: 18 December 2025 / Accepted: 23 December 2025 / Published: 29 December 2025
(This article belongs to the Special Issue Artificial Intelligence for Learning and Education)

Abstract

The rapid integration of large language models into higher education creates opportunities for mathematics instruction, but also raises the need for structured interaction strategies that support reflective learning rather than passive answer consumption. This study, conducted within the Erasmus+ MAESTRO-AI project, examines how selected AI prompt patterns can be implemented in concrete university mathematics activities and how students evaluate these AI-supported experiences. Two experimental modules were compared: complex numbers for first-semester Applied Mathematics students in Poland (n = 100) and conditional probability for second-year Computer Science students in Romania (n = 213). After completing AI-assisted learning activities with ChatGPT and/or Gemini, students completed a common evaluation questionnaire assessing engagement, perceived usefulness, and reflections on AI as a tutor. Group comparisons and experience-based analyses were performed using the Mann–Whitney test. Results indicate that students who reported regular prior use of AI tools evaluated AI-supported learning significantly more positively than those with occasional or no prior experience. They gave higher ratings across most questionnaire items as well as for the overall score. The findings suggest that prompt-pattern-based designs can support engaging AI-assisted mathematics activities. They also indicate that such designs can provide a structured learning experience, while introductory guidance may be important to ensure comparable benefits for less experienced students.

1. Introduction and Literature Review

The rapid development of artificial intelligence (AI) in recent years has profoundly influenced how mathematics is taught and learned in higher education. Large language models (LLMs) such as ChatGPT (https://chat.openai.com, accessed 17 December 2025) and Gemini (https://gemini.google.com, accessed 17 December 2025) have become increasingly popular among both students and educators as tools for problem solving, conceptual explanation, and the generation of didactic materials [1,2,3]. Recent studies among engineering students in Poland show that over one-third of first-year students regularly use AI tools in their learning process, with ChatGPT being the most frequently chosen platform [1,4].
However, this growing integration of AI in education raises important challenges related to students’ critical evaluation of AI-generated content. Research indicates that students sometimes rate chatbot-generated solutions higher than those provided by teachers, even when the AI responses contain factual or conceptual errors [2,3,4]. Such overreliance on AI may reduce analytical and verification skills that are fundamental in mathematics and other exact sciences. This highlights the need for structured didactic strategies that enhance human–AI interaction and promote reflection, rather than passive acceptance of generated content. Recent studies indicate that structured AI-prompt patterns can effectively support both conceptual understanding and procedural fluency by guiding students through stages of explanation, reasoning, and reflection [5,6]. In higher mathematics, such structured prompts can function as cognitive scaffolding, helping students connect problem-solving tasks with theoretical reasoning—an important component of mathematical thinking in engineering and computer science programs. In this context, our aim is not to develop a comprehensive pedagogical framework. Instead, we focus on exploring how carefully designed prompt patterns can be implemented in specific mathematics learning activities. We also investigate how students experience and perceive these structured AI-assisted tasks, with particular attention to the ways such designs may support engagement, understanding, and reflective thinking.
It should be emphasized that the research presented in this paper is carried out within the framework of the international MAESTRO-AI project (Math and Programming Education with Smart Teaching and Robust Outreach using AI at Universities). Funded under the Erasmus+ Programme (Key Action 2–Higher Education Cooperation Partnerships), the project runs from September 2024 to January 2027 and is coordinated by Bialystok University of Technology (Poland). The consortium also includes the Technical University of Crete (Greece), the West University of Timișoara (Romania), and the Instituto Politécnico de Viseu (Portugal). Its main goal is to design and implement innovative methods for teaching mathematics and programming through the use of AI technologies.
Within this initiative, experimental classes have been conducted in partner institutions. At the Bialystok University of Technology, first-semester Applied Mathematics students worked with AI-assisted modules on complex numbers, while at the West University of Timișoara, second-year Computer Science students engaged with a module on conditional probability. Both groups interacted with ChatGPT and Gemini—the latter chosen for its unrestricted prompt input and broader contextual understanding. Students from both universities completed the same evaluation questionnaire, allowing for a comparative analysis of engagement, perceived usefulness, and reflections on AI-assisted learning.
While the broader MAESTRO-AI project aims to explore how AI-supported methods may foster more reflective forms of learning, the scope of the present study is narrower. Here, we focus specifically on the implementation of selected prompt patterns in two mathematics modules and on analyzing how students with different levels of AI experience evaluated these AI-assisted activities.
In this paper, an AI prompt is understood as any natural-language instruction, question, or task provided to an AI model (cf. widely used definitions in prompt engineering). A prompt pattern is a structured, repeatable framework for designing sequences of prompts that guide human–AI interaction toward specific pedagogical goals. This distinction is consistent with recent work in educational prompt design, where patterns function as didactic scaffolds rather than isolated queries [7,8,9]. Throughout Section 2, the term “prompt” refers to single inputs, whereas “prompt pattern” describes the larger instructional structure managed by the teacher or student.
Preliminary results indicated that unstructured use of chatbots did not yield the desired educational outcomes. Structured prompt patterns can elicit cognitive behaviors—such as explaining reasoning, diagnosing errors, or justifying solution steps—that are unlikely to arise during free prompting. Free-form interactions often lead students to request ready-made solutions, whereas selected patterns require active participation and stepwise reasoning. Thus, even though our focus is on student perceptions, the pedagogical rationale for using prompt patterns lies in their ability to support reflective and metacognitive processes. According to the recent literature, prompt engineering methods can be divided into general prompting patterns (e.g., Zero-Shot [10], Few-Shot [11], Chain-of-Thought [12], Constraint-Based [12], Negative Prompting [13], Instruction-Based [14]) and educational prompt patterns designed for learning environments (e.g., Socratic Reversal, Show-Then-Do, Adaptive Quiz, Cognitive Verifier) [7,8,9].
In the context of higher mathematics, such patterns open new opportunities to support both conceptual understanding and procedural fluency. They help integrate problem-solving tasks with theoretical reasoning—an essential aspect of teaching mathematics to engineering and computer science students.
The present study analyzes selected AI prompt patterns and their impact on student learning outcomes in two experimental implementations. To examine how students’ prior experience with AI tools influenced their evaluation of AI-assisted classes, the Mann–Whitney test for two independent groups was applied. The results revealed significant differences in engagement, satisfaction, and perception of the AI tutor, varying according to the mathematical topic and students’ academic background. These findings provide insight into effective strategies for integrating AI models into higher mathematics education and highlight the importance of tailoring AI-assisted methods to students’ experience levels.
We stress that the results presented in this paper focus on how students perceived the AI-supported activities, not on measuring their skills or academic progress. Because of this, the findings should be read as an indication of how students experienced the lessons rather than of how much they learned. To avoid ambiguity, we added a short clarification of the teaching setup so that the context of these perceptions is clear. In future work, we plan to include performance-based measures to examine learning gains more directly.
This study examines the implementation of selected prompt patterns in two AI-supported university mathematics modules and students’ evaluations of these structured activities. We address the following research questions:
RQ1:
How do students evaluate AI-supported mathematics activities designed using structured prompt patterns (engagement, perceived usefulness, clarity of interaction, and reflection)?
RQ2:
How is students’ evaluation associated with prior experience using AI tools (regular vs. occasional/none)?
RQ3:
How do students’ perceptions differ between the two implementations (complex numbers vs. conditional probability; Applied Mathematics vs. Computer Science cohorts) when the same questionnaire is applied?
The remainder of the paper is structured as follows. Section 2 introduces the concept of AI prompt patterns and provides an overview of general, educational, and teacher-oriented patterns relevant to mathematics instruction. It also describes the operationalization of the AI-assisted teaching experience. Section 3 and Section 4 present two experimental implementations: a module on complex numbers for Applied Mathematics students and a module on conditional probability for Computer Science students. Section 5 reports the results of the student evaluation questionnaire and the statistical analysis. Finally, Section 6 and Section 7 discuss the main findings, limitations of the study, and directions for future research.

2. General Prompt Patterns

To support the development of AI-assisted lesson models, the internal project document “Prompt Design Patterns for AI in Education” was prepared within the MAESTRO-AI initiative. The document can be found at https://www.music.tuc.gr/en/maestro-ai-reportonpromptpatternsforaiineducation (accessed on 17 December 2025). It describes various strategies for formulating prompts—questions, tasks, or instructions—that enable more effective use of artificial intelligence systems in educational settings. The document includes both general interaction patterns and education-oriented prompting frameworks. Applying these strategies not only improves the accuracy and coherence of AI responses but also fosters active, reflective, and critical approaches to learning.
As noted in the recent literature on prompt engineering [7,8], general prompt patterns are foundational strategies applicable across domains. While not designed exclusively for education, they demonstrate how to direct AI systems to produce more structured, accurate, or creative responses. The most widely recognized general patterns include the following:
  • Zero-Shot Prompting—AI responds without prior examples, testing its general knowledge and reasoning skills [10].
  • Few-Shot Prompting—AI is provided with a few examples, allowing it to adapt the tone, structure, and reasoning of its response [11].
  • Chain-of-Thought Prompting—AI reveals intermediate reasoning steps rather than only the final result, which enhances transparency in problem-solving [12].
  • Persona-Based Prompting—AI assumes a specific role (e.g., teacher, student, historical figure, or coach) to shape its responses accordingly [7].
  • Constraint-Based Prompting—The response is restricted by format or structure (e.g., bullet list, specific word limit) [12].
  • Negative Prompting—Specifies what AI should avoid, such as jargon or overly technical language [13].
  • Data-Driven Prompting—AI analyzes numerical data, tables, or code fragments to generate responses based on evidence [15].
  • Game-Play Prompting—The interaction takes the form of a game, quiz, or challenge to increase engagement.
  • Template Prompting—AI responses follow predefined templates for consistency and evaluation.
  • Cognitive Verifier—AI not only generates an answer but also verifies the correctness of the reasoning.
  • Adaptive Quiz Prompting—Questions dynamically adjust to the learner’s level of proficiency [16].
In mathematics education, these general prompt patterns can be interpreted as instructional scaffolds aligned with specific learning outcomes. Zero-shot prompts serve a diagnostic function by eliciting students’ initial reasoning and revealing misconceptions. Few-shot prompts support the development of procedural fluency and solution modeling by providing worked examples that students can adapt. Chain-of-thought or stepwise prompting facilitates structured reasoning and explicit justification of solution steps. Persona-based prompts promote mathematical communication by encouraging explanations in accessible and audience-appropriate language. Constraint-based prompts foster precision and structural clarity, which are essential components of mathematical writing. Negative prompting helps limit unproductive or overly complex output, thereby supporting accessibility and managing cognitive load. Data-driven prompting enhances students’ ability to interpret tables, diagrams, and numerical evidence, a core competency in probability and statistics. Finally, template-based and verifier-style prompts promote self-checking and error detection through explicit consistency checks, supporting the development of critical evaluation skills when working with AI.
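To illustrate how these general patterns translate into concrete interactions, the sketch below composes a Persona-Based, Chain-of-Thought, and Constraint-Based prompt programmatically. It is a minimal illustration only: it assumes the OpenAI Python client (openai >= 1.0), and the model name, helper function, and prompt wording are our own examples rather than MAESTRO-AI materials.

```python
# Illustrative sketch: turning general prompt patterns into a reusable template.
# Assumes the OpenAI Python client (pip install openai); the model name and
# pattern wording are examples, not the project's actual materials.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def build_pattern_prompt(task: str, persona: str, constraints: list[str]) -> list[dict]:
    """Combine Persona-Based, Chain-of-Thought, and Constraint-Based patterns."""
    system = (
        f"You are {persona}. "                               # Persona-Based Prompting
        "Reason step by step and show intermediate steps "   # Chain-of-Thought
        "before stating the final answer."
    )
    rules = "\n".join(f"- {c}" for c in constraints)         # Constraint-Based Prompting
    user = f"{task}\n\nFollow these constraints:\n{rules}"
    return [{"role": "system", "content": system},
            {"role": "user", "content": user}]

messages = build_pattern_prompt(
    task="Solve x^2 + 1 = 0 and explain why real numbers are not enough.",
    persona="a patient university mathematics tutor",
    constraints=["Use at most 150 words", "End with one self-check question"],
)
response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```

The same template function can be reused across topics, which is one practical advantage of treating prompts as patterns rather than ad hoc queries.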

2.1. Educational Prompt Patterns

Educational patterns can be understood as adaptations of general teaching methods, intentionally tailored to specific instructional goals. They constitute repeatable frameworks for working with students that support the development of knowledge, skills, and reflection on one’s own learning. For convenience of the reader, the patterns are grouped below into four main categories: knowledge building, active practice, creative thinking, and reflection and metacognition. Each category consists of specific patterns that include characteristic actions and corresponding educational goals [7,8].
  • Knowledge Building
The first category includes patterns aimed at developing and organizing knowledge. Socratic Reversal involves the student taking on the role of the teacher and explaining the material to the AI. This reversal of roles promotes deeper understanding through verbalization and justification.
    Feynman Prompt encourages the student to explain content in the simplest possible way, as if teaching a complete beginner. This approach helps verify the student’s level of understanding and identify gaps in knowledge.
    In the Gap Finder pattern, the AI analyzes the student’s responses and points out areas that require further work. This serves a diagnostic function and supports individualized learning.
    The final pattern in this category, Concept Contrast, consists of comparing concepts with similar meanings or functions, which helps develop analytical skills and supports more structured knowledge acquisition.
  • Active Practice
    The second category includes patterns designed to support practical application of knowledge.
    In Show-Then-Do, the AI initially presents a correct example of a solution, after which the student completes a similar task independently. This facilitates the transfer of knowledge into practice.
    Error Injection involves the AI deliberately introducing an error that the student must identify and correct, fostering self-correction skills and critical thinking.
    Test Me refers to the AI generating quizzes that assess student progress, reinforcing knowledge and supporting ongoing monitoring of learning.
    The Rewrite Challenge pattern asks the student to transform a correct solution into an equivalent alternative form. This develops cognitive flexibility and deepens understanding of the content.
  • Creative Thinking
    The third category focuses on patterns that stimulate creativity.
    In Many Ways, the student looks for multiple possible approaches to solving a problem, encouraging divergent thinking.
    Analogy Builder involves creating analogies to explain complex or abstract concepts, thus supporting knowledge transfer and deeper comprehension.
    Debate Simulation presents two opposing viewpoints in the form of a simulated debate, allowing students to explore different perspectives and develop argumentation skills.
  • Reflection and Metacognition
    The final category encompasses patterns that support self-reflection and conscious engagement with the learning process.
    Confidence Check asks the student to assess their confidence in an answer, after which the AI provides commentary. This helps develop cognitive awareness and self-regulation.
    In AI-as-Coach, the AI acts as a mentor, helping the student with planning their learning and developing strategies to improve their study habits.
    Reverse Role Play involves the student acting as the teacher while the AI takes the role of the student. This promotes reflection and enhances the student’s ability to explain material clearly.
    Finally, Learning Diary encourages students to keep a reflective journal with AI support, fostering systematic self-reflection and the development of metacognitive competencies.
Several prompt patterns are intentionally close in spirit because they operationalize similar pedagogical moves (e.g., explanation, diagnosis, or reflection) through slightly different interaction designs. In this paper, we treat each pattern as defined by its primary function in the learning dialogue. For example, Socratic Reversal emphasizes role switching to elicit reasoning and justification, whereas the Feynman Prompt emphasizes simplification for a novice audience as a check of conceptual clarity. Similarly, Gap Finder is used for diagnosing missing elements in the learner’s response, while Error Injection targets misconceptions through deliberate incorrect examples that must be identified and corrected. Finally, Confidence Check focuses on metacognitive calibration, whereas Learning Diary supports longer-term reflection and regulation over time.

2.2. Teacher-Oriented Patterns

The use of artificial intelligence in education can significantly reduce teacher workload by supporting lesson planning, the adaptation of instructional materials, and the assessment process. In practice, recurring patterns of its use can be distinguished, referred to as didactic patterns. Below are selected examples from two key areas: instructional design and assessment with feedback. In the context of instructional design, the following patterns are particularly useful:
  • Curriculum Generator, in which the AI supports the teacher in planning a course or lesson cycle by taking into account learning objectives, curriculum standards, and available time, and then proposing the sequence of topics and appropriate instructional materials.
  • Misconception Map, which enables the identification of common misconceptions and typical errors made by students regarding a given topic. This allows the teacher to anticipate potential difficulties and prepare strategies for addressing them, thereby increasing instructional effectiveness.
  • Scaffold Builder, which generates a sequence of tasks or examples with increasing levels of difficulty. Students can develop new skills smoothly, avoiding overly abrupt cognitive jumps that might reduce motivation.
  • Differentiation Designer, which adapts the same content to different student ability levels. The AI modifies, for example, text length, vocabulary range, or the number of hints, thus supporting instructional differentiation.
Assessment and feedback constitute an equally important area, where artificial intelligence supports teachers in providing personalized support to students. Examples of such patterns include the following:
  • Rubric-Based Feedback, in which the AI evaluates the student’s work based on predefined criteria, generating descriptive feedback that highlights strengths and areas requiring improvement. The teacher retains control over the process by approving or modifying the AI’s suggestions.
  • Adaptive Test Generation, which makes it possible to create tests that automatically adjust difficulty to the student’s abilities. This enables more precise assessment of knowledge and supports personalized educational development.
  • AI as Peer Reviewer, in which the system acts as a “peer”. Instead of formal grading, it focuses on offering support, suggesting improvements, and encouraging revision, making the evaluation process less stressful and more constructive.
To illustrate how these teacher-oriented patterns can be applied in practice, we provide brief examples drawn from typical university mathematics topics (including the modules used in this study):
  • Curriculum Generator (example: complex numbers, 90 min session). Prompt example: “Propose a lesson plan that introduces the imaginary unit i, operations on a + bi, and the trigonometric form. Include two diagnostic questions, three worked examples, five student exercises with increasing difficulty, and a five-minute exit quiz. Ensure common misconceptions are addressed”.
  • Misconception Map (example: conditional probability). Prompt example: “List frequent student errors when applying Bayes’ rule and the law of total probability (e.g., confusing P(A∩B) with P(A|B), forgetting normalization, mixing priors and likelihoods). For each misconception, propose a short counterexample and a corrective prompt”.
  • Scaffold Builder (example: Bayes’ rule progression). Prompt example: “Create a sequence of six tasks: start with reading probabilities from a contingency table, then computing P(A|B), then total probability, then Bayes’ rule in a medical-test setting, ending with a student-designed scenario. Add one hint per task that can be revealed progressively”.
  • Adaptive Test Generation (example: complex numbers operations). Prompt example: “Generate a short quiz where difficulty adapts based on student answers: start with addition/multiplication, then division and conjugates, then argument/modulus, then one task involving geometric interpretation. After each response, ask a one-sentence justification to discourage guessing”.
  • AI as Peer Reviewer (example: solution/explanation quality). Prompt example: “Act as a peer reviewer of my solution: check for missing assumptions, unclear steps, and notation errors; suggest a cleaner mathematical explanation without providing a completely new solution”.
In all cases, the instructor reviews AI-generated materials and ensures alignment with the intended learning outcomes.
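As a complementary, tool-agnostic illustration, the core logic behind Adaptive Test Generation can be sketched as a simple difficulty-adjustment loop. The question bank and the one-level-up/one-level-down rule below are our own illustrative assumptions; in the classroom, the items themselves would be generated by the AI.

```python
# Minimal sketch of the difficulty-adjustment logic behind Adaptive Test
# Generation. The tiny question bank and the +/-1 level rule are illustrative
# assumptions; a real deployment would let the AI generate the items.
QUESTIONS = {
    1: [("Compute (2 + 3i) + (1 - i).", "3+2i")],
    2: [("Compute (1 + i) / (1 - i).", "i")],
    3: [("Give the modulus and argument of 1 + i.", "sqrt(2), pi/4")],
}

def run_adaptive_quiz(answers: list[str]) -> list[int]:
    """Replay a quiz: move up a level after a correct answer, down after a miss."""
    level, trace = 1, []
    for given in answers:
        question, expected = QUESTIONS[level][0]
        trace.append(level)
        correct = given.replace(" ", "") == expected.replace(" ", "")
        level = min(3, level + 1) if correct else max(1, level - 1)
    return trace

# A student who answers correctly twice, then misses, visits levels 1 -> 2 -> 3.
print(run_adaptive_quiz(["3+2i", "i", "1"]))  # [1, 2, 3]
```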

2.3. Recommendations and Best Practices

AI can act not only as a source of information but also as a learning partner that enhances understanding and reflection. Proper use of prompt patterns helps structure the human–AI dialogue and transform it into a pedagogically meaningful process. Best practices include the following:
  • Combining multiple prompt types (e.g., quiz + reflection + coaching).
  • Adapting prompts to the learner’s experience and cognitive level.
  • Maintaining flexibility—AI should encourage creativity rather than rigid responses.
  • Teaching students to critically evaluate AI-generated answers.
As emphasized by Sahoo et al. [8] and White et al. [7], the systematic design of prompt structures allows educators to move from ad hoc chatbot use toward intentional, evidence-based AI integration in education.

2.4. Operationalization of the Teaching Experience

To increase the transparency of how the teaching activities were implemented, we provide a concise overview of the operational structure applied in both modules. Each lesson followed a consistent sequence of actions designed to translate the selected prompt patterns into classroom practice. The sessions began with short diagnostic prompts activating prior knowledge, followed by guided practice in which students interacted with the AI through Show-Then-Do, Prompt-Me-First, and Step-by-Step Debugger patterns. After this phase, students worked independently on analogous tasks while receiving immediate AI feedback. The modules concluded with short AI-generated quizzes and a brief reflective task in which students explained their reasoning or analyzed errors. Throughout the activities, the instructor’s role was to supervise the flow of interaction, ensure the correctness of AI-generated explanations, and intervene in cases of conceptual difficulty. This structure made the teaching experience replicable and clarified how the prompt patterns shaped students’ engagement and task progression.

3. Description of Lesson Scenarios Within the Module “Fundamentals of Operations on Complex Numbers”

As part of the experiment, three interrelated teaching scenarios were developed to form a coherent module introducing the first-semester students of Applied Mathematics at Bialystok University of Technology (Poland) to working with complex numbers supported by AI systems (ChatGPT and Google Gemini). Each scenario addresses a distinct cognitive objective and employs a different set of prompt patterns consistent with the principles outlined in [7,8]. The following sections present each of the three scenarios that make up the proposed module.

3.1. Part 1—Introduction to Complex Numbers

The first scenario is conceptual and introductory in nature. Its goal is to help students understand the necessity of extending the set of real numbers to include imaginary numbers, and to comprehend the definition of the imaginary unit i, for which i² = −1.
The class takes the form of a short conceptual dialogue and a set of practical exercises with immediate AI feedback. The structure of the lesson includes the following:
  • Conceptual Warm-up: Analysis of the equation x² + 1 = 0 and introduction of the idea of the imaginary number through historical analogies (“just as negative numbers were once discovered”).
  • Computational Exercises: Addition, subtraction, multiplication, and division of complex numbers in algebraic form z = a + bi, implemented using the Show-Then-Do, Prompt Me First, and Step-by-Step Debugger patterns.
  • Quiz and Reflection: Short tasks based on the Test Me prompt, paraphrasing of answers (Explain-Back), and analysis of gaps (What Did I Miss).
Students gradually progress from confronting the “impossible equation” to formulating the basic rules of operations on complex numbers independently.
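For instructors who wish to verify the computational exercises outside the chat interface, Python’s built-in complex type covers all four operations practiced in this scenario; the operands in the following sketch are illustrative.

```python
# Python's native complex type mirrors the algebraic-form exercises in Part 1.
# The specific operands are illustrative.
z1, z2 = 2 + 3j, 1 - 1j          # Python writes the imaginary unit as j

print(z1 + z2)        # (3+2j)      addition
print(z1 * z2)        # (5+1j)      multiplication: (2+3j)(1-1j) = 2+j-3j^2 = 5+j
print(z1 / z2)        # (-0.5+2.5j) division via the conjugate of z2
print((1j) ** 2)      # (-1+0j)     the defining property i^2 = -1
```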

3.2. Part 2—Trigonometric Form of a Complex Number

The second scenario focuses on the geometric interpretation of complex numbers and on transitions between the algebraic and trigonometric forms.
The lesson begins with a narrative introduction about Gauss and the discovery of the complex plane (Storytelling Prompt), followed by a sequence of activities using the Predict–Observe–Explain method. Students predict the location of a number, observe the graphical result, and explain the relationship between algebraic and geometric representations.
In the practical part, participants:
  • Calculate the modulus and argument of a complex number;
  • Express complex numbers in the form z = r(cos φ + i sin φ);
  • Perform multiplication and division of complex numbers in trigonometric form, applying the rules of angle addition and subtraction.
The patterns used include Show-Then-Do, Prompt Me First, and Step-by-Step Debugger (error correction in angle calculation), as well as Test Me and What Did I Miss during the summarizing phase. The lesson concludes with self-assessment based on a scoring rubric and short reflection on the meaning and role of the argument of a complex number.
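The conversions and the angle-addition rule practiced in this scenario can be checked with the standard library, as in the following illustrative sketch (the specific numbers are our own).

```python
# Conversion between algebraic and trigonometric form with the standard
# library, and a check of the angle-addition rule used in Part 2.
import cmath, math

z = 1 + 1j
r, phi = cmath.polar(z)                 # modulus and argument
print(r, math.degrees(phi))             # 1.4142... 45.0

# Rebuild z = r(cos(phi) + i sin(phi)) from its trigonometric form.
print(cmath.rect(r, phi))               # (1.0000...+1j)

# Multiplication in trigonometric form: moduli multiply, arguments add.
w = cmath.rect(2, math.pi / 6)          # 2(cos 30deg + i sin 30deg)
r2, phi2 = cmath.polar(z * w)
print(r2, math.degrees(phi2))           # 2.828... 75.0  (= 45deg + 30deg)
```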

3.3. Part 3—Sets of Complex Numbers

The third scenario emphasizes geometric and visual applications. Students explore different types of sets in the complex plane, defined by conditions on the modulus and argument of the number z.
AI guides the activity in the Flipped Interaction mode—asking questions, proposing region descriptions, and inviting students to identify appropriate sets and justify their choices.
The class consists of three stages:
  • Flipped Diagnosis: AI assesses students’ prior understanding through diagnostic questions such as “What does the expression |z| = 2 mean?”.
  • Modeling and Practice: Recognition and description of geometric sets such as circles, annuli, half-planes, and sectors; work with incorrect examples (Error Injection) and exploratory questions (What-If, e.g., “How does the set change if we add the condition |z| < 4?”).
  • Reflection and Design: Creation and explanation of an original example of a complex-number set, supported by the AI-as-Coach and Reflection Prompt patterns.
Students learn to interpret geometric conditions, justify their reasoning, and correct misconceptions.
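A membership test of the kind students formulate in this scenario can be expressed in a few lines; the region below, a first-quadrant annulus sector combining conditions on |z| and arg z, is an illustrative example rather than one of the module’s tasks.

```python
# Checking whether points belong to a region of the complex plane defined by
# conditions on |z| and arg(z), in the spirit of the Part 3 exercises.
# The region below (an annulus sector) is an illustrative example.
import cmath, math

def in_region(z: complex) -> bool:
    """2 <= |z| < 4 and 0 <= arg(z) <= pi/2 (first-quadrant annulus sector)."""
    r, phi = cmath.polar(z)
    return 2 <= r < 4 and 0 <= phi <= math.pi / 2

for z in [2 + 2j, 1 + 1j, -3 + 0.5j]:
    print(z, in_region(z))   # True; False (|z| too small); False (arg too large)
```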

3.4. Summary of the Module

The three scenarios together form a coherent didactic path—from introducing the concept of complex numbers, through understanding their geometric representation, to analyzing sets in the complex plane. Each stage employs different prompt patterns designed to support active learning, reflection, and the development of independent mathematical reasoning with the assistance of artificial intelligence.

3.5. Analysis of the Evaluation Survey Results for the Module on “Fundamentals of Operations on Complex Numbers”

After completing the module activities, students evaluated the proposed instructional solutions by completing a detailed survey. The survey consisted of five parts (A–E), whose content is provided in the Appendix A. The following sections summarize the main findings.
Part A asked students about their previous experience with AI tools such as ChatGPT (https://chat.openai.com, accessed 17 December 2025), Copilot (https://copilot.microsoft.com, accessed 17 December 2025), or Gemini (https://gemini.google.com, accessed 17 December 2025). Participants demonstrated a wide range of familiarity:
  • Regular users: 56 students;
  • Occasional users: 40 students;
  • Never used AI before: 4 students.
Most respondents had at least some previous exposure to AI tools, and only a small minority interacted with them for the first time during this activity. Experience level strongly influenced perceptions of the module:
  • Regular users provided the highest overall evaluation (mean overall score: 7.5/10).
  • Occasional users were more cautious (mean: 6.6/10).
  • Students without prior AI experience rated the module positively (mean: 7.0/10), although the group is very small.
Students predominantly used Gemini during the exercise (59%), followed by ChatGPT (26%) or a combination of tools. Digital competence and familiarity with AI appear to shape students’ views on the didactic potential of the tools.
Parts B and C contained statements rated on 5-point Likert scales. In Part B, 1 indicated “strongly disagree” and 5 “strongly agree”. In Part C, 1 indicated “not useful” and 5 “very useful”.
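For transparency, the item-level statistics reported below (means and shares of high and low ratings) follow from straightforward descriptive computations of the kind sketched here; the response vector in this sketch is synthetic, for illustration only.

```python
# How the per-item statistics reported below are obtained from raw 5-point
# Likert responses. The response vector here is synthetic, for illustration.
ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]   # one item, ten hypothetical students

mean = sum(ratings) / len(ratings)
high = sum(r >= 4 for r in ratings) / len(ratings)   # share selecting 4 or 5
low = sum(r <= 2 for r in ratings) / len(ratings)    # share selecting 1 or 2

print(f"Mean: {mean:.2f}; high (4-5): {high:.0%}; low (1-2): {low:.0%}")
# Mean: 3.90; high (4-5): 70%; low (1-2): 10%
```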

3.5.1. Part B: Impact of AI on Engagement and Learning (B1–B8)

The highest-rated statements were the following:
  • B5: AI encouraged me to think independently and explain my solutions. Mean: 3.93; 70% of students selected 4 or 5.
  • B1: Learning with the use of AI was engaging for me. Mean: 3.92; 69% of students selected 4 or 5; only 5% gave a low rating (1 or 2).
  • B2: AI helped me better understand the concept of complex numbers. Mean: 3.84; 68% of students selected 4 or 5.
Moderately positive results were observed for the following:
  • B3: Working with AI helped me notice my own calculation mistakes faster. Mean: 3.72; 63% selected 4 or 5.
  • B8: The classes developed my ability to reflect on my mathematical thinking. Mean: 3.70; 61% selected 4 or 5.
  • B6: Interaction with AI was clear and logical. Mean: 3.58; 53% selected 4 or 5.
The lowest-rated statements were the following:
  • B4: Work patterns supported my learning. Mean: 3.53; 52% selected 4 or 5; 15% gave a low rating (1 or 2).
  • B7: Working with AI was a better experience compared to traditional exercises without AI. Mean: 3.02; 37% selected 4 or 5; 38% gave a low rating.
These results suggest that while AI was generally perceived as helpful and engaging, students were divided on whether it offered a better learning experience than traditional exercises.

3.5.2. Part C: Evaluation of Didactic Modules (C1–C5)

All five components of the module received overall positive evaluations. The highest-rated elements were the following:
  • C1: Introduction to complex numbers: Mean 4.24; 86% of students selected 4 or 5, only 7% gave 1 or 2.
  • C4: Quizzes and interactive AI-based tests: Mean 3.99; 68% selected 4 or 5, 6% selected 1 or 2.
  • C2: Trigonometric form: Mean 3.95; 72% selected 4 or 5, 5% selected 1 or 2.
  • C5: Reflection and discussion of errors: Mean 3.91; 70% selected 4 or 5, 10% selected 1 or 2.
The lowest-rated, though still positively evaluated, module was the following:
  • C3: Sets of complex numbers: Mean 3.72; 60% selected 4 or 5, 11% selected 1 or 2.
Overall, students appreciated structured introductions, quizzes, and reflective tasks.

3.5.3. Most Frequent Themes in Open-Ended Responses (Part D)

Part D consisted of four open-ended questions about students’ experiences with AI. Thematic analysis revealed several recurring patterns.

Positive Themes

  • Learning from errors and feedback: Students valued AI’s support in identifying mistakes, correcting them, and providing explanatory feedback.
  • Quizzes and interactive tests: Many respondents described quizzes and “check-yourself” tasks as the most beneficial form of work.
  • Dialogue and interaction with AI: Students appreciated being able to ask follow-up questions, explore different explanations, and engage in conversation.
  • Clarity and structure: AI-generated notes and explanations were often described as clear, logical, and easy to follow.
  • Examples and analogies: Concrete examples and conceptual analogies were reported as especially supportive in building intuition.
Despite the advantages, several challenges were reported:
  • Errors, “hallucinations”, and limited trust: Some students encountered incorrect or misleading answers and expressed reduced trust in AI-generated explanations.
  • Communication difficulties: Certain participants struggled with prompt formulation or interpreting AI’s responses.
  • Preference for traditional classes: A noticeable group emphasized that AI cannot replace a human instructor, citing better interaction, motivation, and clarity.
These insights suggest that AI is most effective when integrated into interactive, reflective learning activities rather than used solely as an answer generator.

3.5.4. Overall Evaluation and Future Perspectives (Part E)

Students evaluated their overall impression of AI-supported learning on a 1–10 scale:
  • Mean overall score: 7.12/10.
  • Range: 2–10.
  • Most common ratings: 8 (31 students), 7 (23 students), 6 (15 students).
Regular AI users rated the experience significantly higher (mean 7.5) than occasional users (mean 6.6).
Students were also asked whether they would like similar AI-supported activities in future mathematics courses:
  • Yes: 59 students.
  • Undecided: 27 students.
  • No: 14 students.
Despite trust issues and errors, a clear majority expressed interest in participating in similar AI-based classes.

4. Description of Lesson Scenario Within the Module “Conditional Probabilities”

The second experimental module was implemented at the West University of Timişoara (Romania) and was designed for second-year Computer Science students. The module focused on the use of artificial intelligence tools (ChatGPT and Gemini) to support the learning of important concepts in conditional probability (definition of posterior probabilities, Bayes’ rule, law of total probability, and independence of events). All experimental sessions followed the same structure: an introduction to the topic presented in the lecture preceding the AI-assisted laboratory, guided AI-assisted exercises using structured prompt patterns, and individual AI-supported problem-solving practice. This design allowed us to implement the “teaching experience” by providing a clear, replicable sequence of tasks while monitoring student engagement.
Similarly to the module on complex numbers, this sequence of activities aimed to develop both conceptual understanding and computational fluency. The scenario used a specific set of prompt patterns to foster reflection, self-correction, and collaboration with AI. The lesson scenario consists of three parts, which are discussed below.

4.1. Part A—Flipped Diagnosis

Part A serves as a warm-up section designed to activate prior knowledge and establish foundational understanding before moving into more complex applications. It uses a quick, interactive question-and-answer format to ensure that students grasp the basics of conditional probability. The structure of part A includes the following:
  • Conceptual Warm-up: A guided discussion in which students are asked to state the definition and meaning of the conditional probability of an event by revisiting familiar ideas. They restate the definition of P(A|B) in their own words and contrast joint vs. marginal probabilities.
  • Computational Exercises: Short, targeted tasks in which students select an appropriate representation (tree, table, or Venn diagram) for a two-step process and compute a basic conditional probability. These are implemented through quick Show-Then-Do interactions and Prompt-Me-First nudges.
  • Quiz and Reflection: A brief closing exchange where students explain the domain condition for conditional probability and restate the concept in their own language (Explain-Back). The AI highlights any misconceptions (“What Did I Miss?”), preparing students for the more involved applications of total probability and Bayes’ rule in later sections.
Rationale: This warm-up activates prerequisite knowledge (set operations, joint vs. marginal probabilities, notation for conditioning) and reduces cognitive overload in later multi-step tasks.
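The table-based computation at the heart of this warm-up reduces to the definition P(A|B) = P(A∩B)/P(B); the following sketch, with illustrative counts, shows the calculation students carry out.

```python
# Computing a conditional probability from a two-way table, as practiced in
# Part A. The counts are illustrative: rows = event A (yes/no), columns = B.
table = {("A", "B"): 30, ("A", "not B"): 10,
         ("not A", "B"): 20, ("not A", "not B"): 40}

total = sum(table.values())                                  # 100
p_joint = table[("A", "B")] / total                          # P(A and B) = 0.30
p_B = (table[("A", "B")] + table[("not A", "B")]) / total    # P(B) = 0.50

p_A_given_B = p_joint / p_B                                  # definition of P(A|B)
print(p_A_given_B)                                           # 0.6
```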

4.2. Part B—Modeling and Practice (Show-Then-Do and Error Injection)

This part guides students from basic conditioning to the applications of the law of total probability and Bayes’ rule. The emphasis is on choosing appropriate diagrams and carrying out structured computations with immediate feedback.
  • Modeling Tasks: Students analyze short scenarios (population groups, email filtering, diagnostic tests) and select a suitable representation (tree, table, or Venn diagram).
  • Guided computation: The AI uses Show-Then-Do prompts to demonstrate a structure and then asks students to replicate it. Small corrections, reminders about normalization, and clarifications about joint vs. conditional probabilities support accurate execution.
  • Error Awareness: Students examine typical mistakes (e.g., confusing P(A∩B) with P(A|B)) and briefly explain the correct reasoning. This reinforces conceptual clarity.
Rationale: Worked templates and stepwise prompts help limit extraneous load in Bayes/total-probability computations by externalizing intermediate steps and stabilizing the representation.
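A worked instance of the medical-test scenario mentioned above may clarify how the two rules combine; the prevalence, sensitivity, and specificity values in this sketch are illustrative, not taken from the module.

```python
# Law of total probability and Bayes' rule in the medical-test setting used
# in Part B. Prevalence, sensitivity, and specificity are illustrative values.
prior = 0.01          # P(disease)
sensitivity = 0.95    # P(positive | disease)
specificity = 0.90    # P(negative | no disease)

# Total probability: P(positive) over both branches of the tree.
p_positive = sensitivity * prior + (1 - specificity) * (1 - prior)

# Bayes' rule: P(disease | positive) = P(positive | disease) P(disease) / P(positive)
posterior = sensitivity * prior / p_positive
print(f"P(positive) = {p_positive:.4f}, P(disease | positive) = {posterior:.3f}")
# P(positive) = 0.1085, P(disease | positive) = 0.088
```

The small posterior (about 8.8% despite a 95% sensitive test) is exactly the kind of counterintuitive result that the Error Awareness step targets.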

4.3. Part C—Application and Reflection

In this part, students are asked to apply conditional probability and Bayes’ rule to practical contexts. The AI prompts the students to interpret and extend what they learned.
  • Applied Computation: Students work through scenarios such as quality control, computing both overall event probabilities via total probability and posteriors via Bayes’ rule.
  • Interpretation: Students explain their results in plain language, identify where Bayes’ rule is used in the solution, and justify their diagram choice.
  • Reflection and Transfer: Learners design a simple scenario of their own (priors, likelihoods, posterior query) and sketch an appropriate representation, reinforcing the ability to transfer the method to new problems.
Rationale: Students move from guided to more open tasks to support transfer, focusing attention on meaning rather than only computation.

4.4. Summary of the Module

The scenario presents a coherent instructional sequence, beginning with the introduction of conditional probability and progressing through the law of total probability and Bayes’ rule, ultimately guiding students toward their effective application.
Tailored prompt patterns are used to encourage active learning, promote reflection, and support the development of independent mathematical reasoning with the aid of artificial intelligence.

4.5. Analysis of the Evaluation Survey Results for the Module on “Conditional Probabilities”

All participants had at least some prior exposure to AI tools, most of them being regular users:
  • Regular users: 155 students;
  • Occasional users: 58 students;
  • Never used AI before: 0 students.
Students’ experience levels were reflected in their evaluations of the module:
  • Regular users provided the highest overall evaluation (mean overall score: 8.44/10).
  • Occasional users were more cautious (mean: 7.69/10).
During the exercise, most students worked with ChatGPT (83.57%), followed by Gemini (0.94%) or a combination of tools (15.50%). Digital competence and AI familiarity seem to significantly influence how students perceive the didactic value of these tools.

4.5.1. Part B: Impact of AI on Engagement and Learning (B1–B8)

The following statements received the highest mean scores:
  • Learning with the use of AI was engaging for me. Mean: 4.15, high ratings 4–5: 75.59%, low ratings 1–2: 5.63%.
  • Work patterns (e.g., “Show first, then do”, “Explain in your own words”) supported my learning. Mean: 4.11, high ratings 4–5: 78.40%, low ratings 1–2: 4.69%.
  • Interaction with AI was clear and logical. Mean: 4.11, high ratings 4–5: 75.59%, low ratings 1–2: 8.45%.
Moderately positive results were obtained for the following:
  • AI helped me better understand the concept of conditional probabilities. Mean: 4.00, high ratings 4–5: 71.36%, low ratings 1–2: 7.51%.
  • Working with AI helped me notice my own calculation mistakes faster. Mean: 3.95, high ratings 4–5: 69.48%, low ratings 1–2: 11.74%.
  • The classes developed my ability to reflect on my mathematical thinking. Mean: 3.70, high ratings 4–5: 62.91%, low ratings 1–2: 12.21%.
The lowest-rated statements were the following:
  • AI encouraged me to think independently and explain solutions. Mean: 3.67, high ratings 4–5: 60.56%, low ratings 1–2: 16.43%.
  • Working with AI was a better experience compared to traditional exercises without AI. Mean: 3.56, high ratings 4–5: 50.23%, low ratings 1–2: 16.90%.
Students rated AI-based learning as engaging, clear, and supported by effective work patterns. They also reported moderate benefits for understanding concepts, identifying mistakes, and reflecting on their mathematical thinking. However, they were less convinced that AI fostered independent reasoning or offered a better experience than traditional exercises.

4.5.2. Part C: Evaluation of Didactic Modules (C1, C4, C5)

We note that in the questionnaire given to the Romanian students, there are only three items in part C: C1, C4, and C5. This is because the lesson contained only one scenario (Part 1). The students evaluated the three components of this module in an overall positive manner. The three items were rated as follows:
  • C1—Conditional probabilities: mean 4.15 (4–5: 80.75%, 1–2: 1.41%).
  • C4—Quizzes and interactive tests: mean 4.14 (4–5: 76.53%, 1–2: 5.16%).
  • C5—Reflection and error analysis: mean 4.14 (4–5: 75.59%, 1–2: 6.10%).
The large proportion of scores in the 4–5 range and the very small share of low ratings indicate that students valued the module’s focus on conditional probabilities, its use of quizzes and interactive tests, and the inclusion of reflection and error analysis activities. These results suggest that the instructional design elements incorporated in this lesson were well received and effectively supported student engagement.

4.5.3. Most Frequent Themes in Open-Ended Responses (Part D)

The thematic analysis of the responses to the four open-ended questions in part D revealed several recurring ideas across participants’ reflections.
Positive Themes
  • Immediate feedback and error correction: Many students highlighted the usefulness of AI in correcting mistakes, checking intermediate steps, and offering clarifying feedback. Error correction was frequently described as the most valuable activity for consolidating understanding.
  • Dialogue and iterative questioning: Students appreciated being able to ask follow-up questions, request clarifications, or explore alternative explanations. The conversational aspect helped them unpack difficult concepts and verify their reasoning.
  • Quizzes and structured tasks: Interactive tasks such as quizzes and “check-yourself” items were viewed as effective for practice and self-evaluation. Several respondents emphasized that these activities helped them monitor progress and reinforce the material.
  • Fast and accessible explanations: Participants noted the advantage of receiving immediate responses and step-by-step explanations. The speed and availability of AI were repeatedly mentioned as beneficial for understanding probabilities and related concepts.
Reported Challenges
  • Message limits and interaction constraints: A commonly mentioned frustration was reaching the platform’s message limit (ChatGPT), which interrupted the flow of problem-solving or explanation.
  • Communication and formatting difficulties: Several students struggled with articulating prompts clearly or using the correct notation (especially LaTeX or formula formatting). These issues sometimes made the interaction slower or less effective.
  • Preference for traditional guidance: A portion of the students expressed that, although helpful, AI cannot replace a human instructor. They emphasized that traditional teachers provide motivation, clearer explanations, and more personalized feedback.
  • Occasional inconsistencies: Some students reported that AI explanations were sometimes incomplete, too general, or not well adapted to the specific problem. These inconsistencies required additional verification or repeated prompts, reducing their confidence in certain answers.

4.5.4. Overall Evaluation and Future Perspectives (Part E)

Students rated their overall experience with AI-supported learning on a 1–10 scale:
  • Mean overall score: 8.23.
  • Range: 1–10.
  • Most common ratings: 8 (58 students), 9 (52 students), 10 (47 students).
Regular AI users provided noticeably higher ratings (mean 8.44) compared to occasional users (mean 7.69). Students were also asked whether they would like similar AI-supported activities in future mathematics courses:
  • Yes: 175 students.
  • No: 14 students.
  • Undecided: 24 students.
The analysis above shows that, although some concerns were raised regarding trust and accuracy, most students expressed a strong willingness to engage in comparable AI-supported classes in the future.

5. Comparative Discussion Between Modules

In interpreting the results, it is important to note that the questionnaire captures students’ perceptions of AI-supported learning rather than their actual learning outcomes, mathematical performance, or proficiency in using AI tools. The reported results, therefore, reflect subjective learning experiences. To clarify the instructional context in which these perceptions were formed, we outlined how the teaching activities were operationalized through a structured sequence of diagnostic prompts, guided AI-supported practice, and reflective tasks. Future stages of the project will complement these self-reported measures with direct assessments, including pre- and post-testing and analysis of student solutions, to evaluate actual learning gains.

5.1. Categorization of Responses

We propose the following categorization of responses to the questionnaire question:
“How do you assess the role of AI as a tutor compared to a traditional instructor?” The analysis organizes the feedback into thematic and sentiment-based categories: positive, neutral/ambivalent, and negative, with additional notes on short or unclear responses. Analyzing responses to this exact question is important because it reveals students’ perceptions of AI-based tutoring and their readiness to engage with such tools. The categorization of answers into positive, neutral/ambivalent, and negative sentiment allows for a clear assessment of acceptance, concerns, and expectations. It also identifies specific strengths and limitations attributed to AI in comparison with human instructors, providing valuable insight into how AI can complement traditional teaching and where its current boundaries lie.
We provide a concise comparison between the responses collected from the Polish group of students and those from the Romanian group regarding the role of AI as a tutor. Across groups, respondents share similar sentiments: AI is perceived as a highly useful tool for accessibility, repetition, and individual pacing, but it is not viewed as a replacement for a traditional instructor. A combined profile for both groups highlights strong alignment in strengths and limitations attributed to AI-based tutoring; see Table 1.
Positive responses: Respondents appreciate AI’s accessibility, patience, clear explanations, and individual approach. Some describe AI tutoring as highly effective or even better than traditional teaching.
Examples:
  • “It was much easier to understand the topic thanks to unlimited time and the ability to ask even the simplest questions”.
  • “Very good—I could get extensive feedback and felt that the tutor focused only on me”.
  • “AI as a tutor is like a personal teacher who answers every question”.
  • “In my opinion, AI as a tutor is a very good solution. It’s like a private tutor who is infinitely patient”.
  • “10/10”.
Neutral/ambivalent responses: Respondents recognize both strengths and weaknesses. They consider AI a useful or interesting supplement to traditional learning but not a replacement.
Examples:
  • “AI can serve as a supplement to traditional instructors”.
  • “Quite good”.
  • “I wouldn’t compare AI directly to a teacher—it’s a unique and interesting way to explore topics”.
  • “Positive—it won’t replace the teacher, but it’s a good addition”.
  • “It’s fine and patient, but human instructors bring more energy and passion”.
Negative responses: Respondents emphasize AI’s limitations, such as the lack of human connection, intuition, and emotional engagement, as well as potential errors and overly rigid explanations. Many express a clear preference for traditional instruction.
Examples:
  • “A traditional instructor is better”.
  • “I prefer traditional classes. Human contact is important to me”.
  • “AI doesn’t work well long-term”.
  • “Nothing can replace a human”.
  • “Traditional teachers are definitely better”.
Despite linguistic and cultural differences, the Polish and Romanian groups show strikingly similar perceptions. AI is widely appreciated as an individual learning aid but is consistently viewed as unable to replace the depth, intuition, and human connection provided by a traditional instructor. The most preferred model is a combined, complementary approach integrating AI with human teaching.

5.2. Group Differences in Perceptions of AI-Based Learning

Table 2 presents the mean values for Romanian (RO Group) and Polish (PL Group) students on the eight items (B1–B8) of Part B (see Appendix A), together with the p-values of the Mann–Whitney test.
The results of Table 2 indicate several statistically significant differences between the two groups.
Statistically significant effects (p < 0.05) were observed for the following items:
  • B1 (Learning with AI was engaging for me).
    Romanian students rated the engaging nature of learning with AI significantly higher than Polish students (Z = 2.39, p = 0.017). Rank sums (RO: 35,123.5; PL: 14,017.5) further confirm the higher evaluations of the Romanian group.
  • B3 (Thanks to working with AI, I noticed my own errors in calculations more quickly).
    In this case as well, Romanian students scored higher (Z = 2.11, p = 0.035), indicating that they perceived AI more often as a tool that facilitated the detection of their own errors.
  • B4 (Working patterns (e.g., “Show first, then do”, “Explain in your own words”) made learning easier for me).
    This item showed one of the strongest differences: Romanian students clearly rated the impact of AI work structures on learning effectiveness higher (Z = 4.63, p < 0.001).
  • B6 (Interaction with the AI was understandable and logical).
    Here, also, Romanian students gave significantly higher scores (Z = 4.41, p < 0.001), suggesting that their perception of the clarity and coherence of AI interaction was more positive.
  • B7 (Working with AI was a better experience than traditional exercises without AI).
    Romanian students more frequently considered AI-based experiences superior to traditional exercises (Z = 3.59, p < 0.001).
In most categories related to practical experiences with AI interactions—such as the engaging nature of sessions, ease of learning through work patterns, clarity of interaction, and preference for AI over traditional exercises—Romanian students rated AI higher than Polish students. In contrast, in areas related to understanding substantive content (e.g., conditional probability) and supporting mathematical reflection, no differences between the groups were observed.
Table 3 compares participants’ responses to questions C4 and C5 from part C of the questionnaire, along with their overall impressions of working with AI as captured in part E (see Appendix A).
The analysis of the questionnaire responses, as summarized in Table 3, reveals nuanced differences in the way the two groups experienced working with AI. Although both groups reported similar experiences with respect to specific activities, such as interactive quizzes (C4) and reflection and discussion of errors (C5), the overall impression of working with AI (Final reflection, part E) differed significantly.
Specifically, the RO Group rated interactive quizzes (C4) at 4.136 compared to 3.990 in the PL Group, and reflection and discussion of errors (C5) at 4.141 versus 3.910 in the PL Group. However, these differences were not statistically significant (Mann–Whitney test: p = 0.147 for C4 and p = 0.054 for C5), suggesting that participants in both groups generally found these activities equally engaging and useful.
In contrast, the overall assessment of working with AI (Final reflection, part E) showed a pronounced difference: the RO Group rated it 8.235, while the PL Group rated it 7.120, with a highly significant p-value of 0.001. This indicates that, despite similar experiences with individual tasks, the RO Group had a markedly more positive overall impression of using AI.
These results suggest that general attitudes toward AI may not always align perfectly with task-level evaluations. While both groups engaged similarly with quizzes and error-reflection exercises, broader factors—such as prior experience, expectations, or group-specific dynamics—may have influenced their overall satisfaction.
The findings underscore the importance of examining both detailed and holistic measures when evaluating AI-assisted learning interventions. Task-level feedback captures the usability and perceived usefulness of specific activities, while overall ratings provide insight into general acceptance and learner satisfaction, which may differ even when individual task experiences are similar.
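To make the comparison procedure concrete, the following Python sketch shows how a single Likert item can be compared between the two cohorts with the Mann–Whitney test, including the normal-approximation Z statistic reported above. It is a minimal illustration with simulated placeholder ratings, not the study data, and it omits the tie correction that statistical packages apply to the variance of U.

    import numpy as np
    from scipy.stats import mannwhitneyu

    def compare_groups(ro_scores, pl_scores):
        # Two-sided Mann-Whitney U test for one questionnaire item.
        u, p = mannwhitneyu(ro_scores, pl_scores, alternative="two-sided")
        n1, n2 = len(ro_scores), len(pl_scores)
        mu = n1 * n2 / 2.0                               # mean of U under H0
        sigma = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # SD of U, ties ignored
        z = (u - mu) / sigma                             # normal approximation
        return u, z, p

    # Simulated 1-5 ratings as placeholders (sizes match the two cohorts).
    rng = np.random.default_rng(0)
    ro = rng.integers(1, 6, size=213)
    pl = rng.integers(1, 6, size=100)
    print(compare_groups(ro, pl))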

5.3. Influence of Experience on the Obtained Results: Statistical Analysis

The quantitative analysis examined the relationship between students’ level of experience with artificial intelligence (AI) tools and their evaluation of the instructional activities involving interaction with a language model. The variable “declared experience” was defined on the basis of participants’ responses to the question included in Part A of the questionnaire (see Appendix A): “Have you previously used AI tools (e.g., ChatGPT, Copilot, Gemini)?”.
Respondents could choose one of three options:
  • Yes, regularly.
  • Occasionally.
  • Never.
All participants (N = 313) were then divided into two groups:
  • EX1—students with high experience (response: Yes, regularly);
  • EX2—students with low or moderate experience (responses: Occasionally or Never).
The EX1 group consisted of 211 students ( 67.41 % ), whereas the EX2 group included 102 students ( 32.59 % ). To evaluate different aspects of the AI-enhanced classes, statements B1–B8 and C4–C5 from Parts B and C of the questionnaire (rated on a 1–5 scale) were analyzed, along with the final overall rating from Part E (1–10 scale). The mean values for both groups are presented in Table 4. The significance of differences between the groups was tested using the Mann–Whitney test.
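As a hedged sketch of this grouping-and-testing step, the code below recodes the declared-experience answer into the EX1/EX2 split and runs the Mann–Whitney test for each analyzed item, mirroring the structure of Table 4. The DataFrame layout and column names (one row per respondent, an “experience” column, one column per item) are assumptions made for illustration, not the project’s actual analysis script.

    import pandas as pd
    from scipy.stats import mannwhitneyu

    ITEMS = [f"B{i}" for i in range(1, 9)] + ["C4", "C5", "overall"]

    def experience_comparison(df: pd.DataFrame) -> pd.DataFrame:
        # EX1: regular users; EX2: occasional users and non-users pooled.
        ex1 = df[df["experience"] == "regularly"]
        ex2 = df[df["experience"].isin(["occasionally", "never"])]
        rows = []
        for item in ITEMS:
            _, p = mannwhitneyu(ex1[item], ex2[item], alternative="two-sided")
            rows.append({"item": item,
                         "EX1_mean": round(ex1[item].mean(), 3),
                         "EX2_mean": round(ex2[item].mean(), 3),
                         "p_value": p})
        return pd.DataFrame(rows)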
As shown in Table 4, there are clear differences in the evaluation of AI-enhanced instructional activities between students with high prior experience using AI tools (EX1) and those with low experience (EX2). Students in EX1 consistently rated most aspects of the activities higher, indicating a more positive perception of AI-supported learning.
The largest and most statistically significant differences ( p < 0.001 , Mann–Whitney test) were observed for items related to engagement (B1), comprehension support (B2), faster detection of calculation errors (B3), facilitation of learning through working patterns (B4), clarity and logic of AI interaction (B6), and overall preference for AI-enhanced exercises over traditional methods (B7).
No significant differences were found for items related to independent thinking and solution explanation (B5, Z = 0.098 , p = 0.922 ) or reflection on mathematical thinking (B8, Z = 1.544 , p = 0.122 ). This suggests that prior experience with AI did not strongly influence students’ perception of these aspects. Differences regarding interactive quizzes and error reflection were smaller but still significant (C4: p = 0.006; C5: p = 0.001). The overall final rating (Part E) also favored the experienced group (EX1: 8.19 vs. EX2: 7.24, Z = 4.50 , p < 0.001 ), confirming that prior experience shaped general impressions of working with AI.
These results indicate that prior experience with AI tools significantly affects students’ perceptions of engagement, comprehension support, and overall satisfaction with AI-enhanced learning activities. In contrast, competencies related to independent thinking and reflection appear to be shaped more by the instructional design than by prior experience. In practical terms, these findings suggest that providing guidance or introductory training for less experienced students may help ensure equitable benefits from AI-enhanced instruction.

5.4. Contextual Factors and Implementation Differences

While the questionnaire results indicate several statistically significant differences between the RO and PL groups (Table 2), these differences should be interpreted in light of contextual and implementation factors. Importantly, the present study did not directly measure variables such as digital literacy, prior educational exposure to AI-supported learning, or instructor teaching style. Therefore, the points below are discussed as plausible contributors that may help contextualize the findings rather than as causal explanations.
First, the two cohorts differed in academic profile and stage of study (first-semester Applied Mathematics vs. second-year Computer Science), which may influence both comfort with digital tools and expectations regarding learning support. Second, the modules covered different mathematical topics (complex numbers vs. conditional probability), which differ in representational demands (algebraic–geometric transformations vs. diagram choice and interpretation) and may shape how students perceive the clarity and usefulness of AI interaction. Third, the practical implementation differed in the dominant platform used (Gemini was used more frequently in the PL cohort, whereas ChatGPT dominated in the RO cohort), and students reported platform-related constraints (e.g., message limits), which can affect the smoothness of interaction and overall satisfaction.
Finally, local classroom norms and teaching practices (e.g., the degree of emphasis on guided practice vs. independent exploration) may also influence how students interpret AI as a tutor and how they evaluate structured prompt patterns.

6. Conclusions

The results presented in this paper support the following conclusions:
  • AI is generally viewed as engaging and supportive, particularly in fostering independent thinking and clarifying complex concepts.
  • Students highly value quizzes, structured explanations, and opportunities to correct mistakes.
  • AI is not considered a replacement for traditional teaching, but, rather, a useful complementary tool.
  • Prior experience with AI significantly shapes evaluations.
  • Students are open to further integration of AI in mathematics education, provided its limitations are acknowledged and it is used in ways that stimulate reflection and active learning.
This study has several limitations that should be acknowledged. First, the evaluation relied on self-reported perceptions rather than objective measures of learning. Second, the two modules address different mathematical topics and were implemented in different institutional contexts, which may influence students’ responses. Third, the use of AI systems cannot yet guarantee full accuracy of explanations, and instructor oversight was required to mitigate occasional errors.

7. Future Research Directions

Future stages of the MAESTRO-AI project will address the limitations of perception-based evaluation by incorporating direct performance measures, including pre- and post-tests, as well as a systematic analysis of students’ written solutions, to assess learning gains and the accuracy and progression of mathematical reasoning. Subsequent modules will also examine how different prompt patterns and pattern-focused instructional designs influence learning processes. In addition, we will investigate the role of onboarding support for less experienced AI users to identify which prompt designs and implementation choices most effectively promote equitable and measurable educational outcomes.

Author Contributions

Conceptualization, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; methodology, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; software, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; validation, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; formal analysis, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; investigation, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; resources, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; data curation, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; writing—original draft preparation, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; writing—review and editing, O.B., M.F.-C., E.G., E.K., D.M., R.M., N.P., A.L.T., and C.Z.; supervision, D.M.; project administration, D.M. All authors have read and agreed to the published version of the manuscript.

Funding

For Oana Brandibur, Ewa Girejko, Raluca Mureșan, Nikos Pappas, Adriana Loredana Tănasie, and Claudia Zaharia, this research was co-funded by the European Union. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or Fundacja Rozwoju Systemu Edukacji; neither the European Union nor the granting authority can be held responsible for them. The work of Dorota Mozyrska was supported by Bialystok University of Technology under grant WZ/WI-IIT/2/2023, and the work of Marzena Filipowicz-Chomko under grant WZ/WI-IIT/2/2025, funded by resources for research provided by the Ministry of Education and Science.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. Evaluation Survey After Completing the Module

Part A. General Information
Have you previously used AI tools (e.g., ChatGPT, Copilot, Gemini)?
□ Yes, regularly     □ Occasionally     □ Never
Part B. Evaluation of working with AI during the module
Indicate the extent to which you agree with the following statements (1–strongly disagree, 5–strongly agree).
No. | Statement
B1 | Learning with the use of AI was engaging for me.
B2 | AI helped me better understand the concept of a complex number/conditional probabilities.
B3 | Working with AI helped me notice my own calculation mistakes faster.
B4 | Work patterns (e.g., “Show first, then do”, “Explain in your own words”) supported my learning.
B5 | AI encouraged me to think independently and explain solutions.
B6 | Interaction with AI was clear and logical.
B7 | Working with AI was a better experience compared to traditional exercises without AI.
B8 | The classes developed my ability to reflect on my mathematical thinking.
Part C. Evaluation of module components
Rate the individual parts of the classes on a scale of 1–5 (1–not useful, 5–very useful).
No. | Module Component
C1 | Part 1 – Introduction to complex numbers/Conditional probabilities
C2 | Part 2 – Trigonometric form (PL module only)
C3 | Part 3 – Sets of complex numbers (PL module only)
C4 | Quizzes and interactive AI-based tests
C5 | Reflection and discussion of errors
Part D. Open questions
  • What helped you the most in understanding the topic of complex numbers/conditional probability while working with AI?
  • What was the most difficult aspect of working with AI for you?
  • How do you evaluate the role of AI as a “tutor” compared with a traditional instructor?
  • Which patterns or forms of work (e.g., quizzes, correcting mistakes, dialogue) were the most valuable for you?
  • Would you like similar AI-enhanced classes to appear in other mathematics courses?
    □ Yes     □ No     □ I have no opinion
Please justify briefly:
Part E. Final reflection
Finally, rate your overall experience of working with AI on a scale of 1–10 (1–very negative, 10–very positive).
Your rating: ____________/10
Table A1. Categories of educational patterns with action descriptions and educational goals, based on [7,8].
Knowledge Building
  • Socratic Reversal: The student assumes the role of the teacher and explains the material to the AI. Goal: deepening understanding through verbalization and justification.
  • Feynman Prompt: The student explains content in the simplest possible manner, as for a beginner. Goal: verifying the level of understanding and identifying knowledge gaps.
  • Gap Finder: The AI analyzes the student’s responses and identifies areas requiring improvement. Goal: diagnosing difficulties and individualizing support.
  • Concept Contrast: Comparing concepts with similar meanings or functions. Goal: developing analytical skills and structuring knowledge.
Active Practice
  • Show-Then-Do: The AI presents a correct example, and the student completes an analogous task. Goal: transferring knowledge into independent practice.
  • Error Injection: The AI intentionally introduces an error, and the student must identify and correct it. Goal: developing self-correction skills and critical thinking.
  • Test Me: The AI generates quizzes to assess student progress. Goal: reinforcing knowledge and ongoing progress monitoring.
  • Rewrite Challenge: The student transforms a correct solution into an equivalent form. Goal: enhancing cognitive flexibility and deepening understanding.
Creative Thinking
  • Many Ways: Searching for various methods of solving a single problem. Goal: stimulating creativity and divergent thinking.
  • Analogy Builder: Creating analogies to explain difficult concepts. Goal: knowledge transfer and deepened understanding of abstract content.
  • Debate Simulation: The AI presents two opposing viewpoints in debate form. Goal: developing argumentation and evaluating multiple perspectives.
Reflection and Metacognition
  • Confidence Check: The student evaluates their confidence in an answer, and the AI provides commentary. Goal: developing cognitive awareness and self-regulation.
  • AI-as-Coach: The AI acts as a mentor supporting learning planning. Goal: strengthening autonomy and learning strategies.
  • Reverse Role Play: The student plays the role of the teacher, and the AI plays the role of the student. Goal: deepening reflection and improving explanation skills.
  • Learning Diary: The student keeps a reflection journal with AI support. Goal: systematic self-reflection and developing metacognitive competencies.
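To illustrate how such patterns can be phrased in practice, the short sketch below gives possible prompt templates for two of the patterns in Table A1, stored as Python strings. The wording is a hypothetical paraphrase for illustration only, not the exact prompts used in the modules.

    # Illustrative prompt templates for two patterns from Table A1.
    # The wording is hypothetical; actual classroom prompts may differ.
    PROMPT_TEMPLATES = {
        "Show-Then-Do": (
            "First, show me one fully worked example of converting a complex "
            "number to trigonometric form. Then give me a similar exercise to "
            "solve on my own, and only check my answer after I submit it."
        ),
        "Error Injection": (
            "Solve this conditional probability problem, but deliberately "
            "include one subtle error in the solution. I will try to find and "
            "correct it; do not reveal the error until I ask."
        ),
    }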

References

  1. Stańdo, J.; Fechner, Ż.; Dąbrowicz-Tlałka, A.; Kujawska, K.; Musielak, M.M. Exploring AI Chatbots for Learning Mathematics: Students’ Perspectives on Accuracy and Educational Value. In Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium, Blue Sky, and WideAIED, Proceedings of the 26th International Conference, AIED 2025, Palermo, Italy, 22–26 July 2025; Springer: Cham, Switzerland, 2025; pp. 282–288.
  2. Mohamed, M.Z.B.; Hidayat, R.; Suhaizi, N.N.B.; Sabri, N.B.M.; Mahmud, M.K.H.B.; Baharuddin, S.N.B. Artificial Intelligence in Mathematics Education: A Systematic Literature Review. Int. Electron. J. Math. Educ. 2022, 17, em0694.
  3. Yi, L.; Liu, D.; Jiang, T.; Xian, Y. The Effectiveness of AI on K-12 Students’ Mathematics Learning: A Systematic Review and Meta-Analysis. Int. J. Sci. Math. Educ. 2024, 23, 1105–1126.
  4. Łupińska Dubicka, A.; Mozyrska, D. ChatGPT w nauczaniu programowania: Doświadczenia studentów i wyzwania dydaktyczne. In Wybrane Zagadnienia Informatyki Technicznej; Oficyna Wydawnicza Politechniki Białostockiej: Bialystok, Poland, 2025; ISBN 978-83-68673-08-1.
  5. Kasneci, E.; Seßler, K.; Küchemann, S.; Bannert, M.; Dementieva, D.; Fischer, F.; Gasser, U.; Groh, G.; Günnemann, S.; Hüllermeier, E.; et al. ChatGPT for Good? On Opportunities and Challenges of Large Language Models for Education. Learn. Individ. Differ. 2023, 103, 102274.
  6. Lee, D.; Palmer, E. Prompt engineering in higher education: A systematic review to help inform curricula. Int. J. Educ. Technol. High. Educ. 2025, 22, 7.
  7. White, J.; Fu, Q.; Hays, S.; Sandborn, M.; Olea, C.; Gilbert, H.; Elnashar, A.; Spencer-Smith, J.; Schmidt, D.C. A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT. arXiv 2023, arXiv:2302.11382.
  8. Sahoo, P.; Singh, A.K.; Saha, S.; Jain, V.; Mondal, S.S.; Chadha, A. A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications. arXiv 2024, arXiv:2402.07927.
  9. Naskręcki, B.; Ono, K. Mathematical discovery in the age of artificial intelligence. Nat. Phys. 2025, 21, 1504–1506.
  10. Kojima, T.; Gu, S.S.; Reid, M.; Matsuo, Y.; Iwasawa, Y. Large Language Models are Zero-Shot Reasoners. arXiv 2022, arXiv:2205.11916.
  11. Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models are Few-Shot Learners. arXiv 2020, arXiv:2005.14165.
  12. Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Ichter, B.; Xia, F.; Chi, E.; Le, Q.; Zhou, D. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv 2022, arXiv:2201.11903.
  13. Ban, Y.; Wang, R.; Zhou, T.; Cheng, M.; Gong, B.; Hsieh, C. Understanding the Impact of Negative Prompts: When and How Do They Take Effect? In Proceedings of the European Conference on Computer Vision, Milan, Italy, 29 September–4 October 2024.
  14. Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.L.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training Language Models to Follow Instructions with Human Feedback. arXiv 2022, arXiv:2203.02155.
  15. Salesforce. What Is Prompt Grounding?—A Generative AI Tutorial, 2024. Available online: https://www.salesforce.com/blog/what-is-grounding/ (accessed on 17 December 2025).
  16. Kabir, M.R.; Lin, F.O. An LLM-Powered Adaptive Practicing System. In Proceedings of the LLM@AIED, Tokyo, Japan, 3–7 July 2023.
Table 1. Sentiment distribution (PL vs. RO).
Category | PL Group | RO Group
Positive | ∼35–40% | ∼35–40%
Neutral | ∼25–30% | ∼30–35%
Negative | ∼30–35% | ∼35–40%
Unclear | ∼10% | ∼5%
Table 2. Mean values for RO Group and PL Group on items B1–B8, and Mann–Whitney test p-values.
Group | B1 | B2 | B3 | B4 | B5 | B6 | B7 | B8
RO Group | 4.150 | 3.995 | 3.953 | 4.113 | 3.671 | 4.113 | 3.563 | 3.704
PL Group | 3.920 | 3.840 | 3.720 | 3.530 | 3.930 | 3.580 | 3.020 | 3.700
Total | 4.077 | 3.946 | 3.879 | 3.927 | 3.754 | 3.942 | 3.390 | 3.703
M–W test (p) | 0.017 | 0.265 | 0.035 | 0.001 | 0.067 | 0.001 | 0.001 | 0.958
Table 3. Mean values for RO Group and PL Group on items C4–C5, the overall score, and Mann–Whitney test p-values.
Group | C4 | C5 | Final Reflection
RO Group | 4.136 | 4.141 | 8.235
PL Group | 3.990 | 3.910 | 7.120
Total | 4.089 | 4.067 | 7.879
M–W test (p) | 0.147 | 0.054 | 0.001
Table 4. Mean values for items B1–B8, C4–C5, and the overall rating in groups EX1 and EX2, with Mann–Whitney test p-values.
Item | EX1 | EX2 | p-Value
B1 | 4.313 | 3.588 | 0.001
B2 | 4.161 | 3.500 | 0.001
B3 | 4.047 | 3.529 | 0.001
B4 | 4.066 | 3.637 | 0.001
B5 | 3.754 | 3.755 | 0.922
B6 | 4.123 | 3.569 | 0.001
B7 | 3.592 | 2.971 | 0.001
B8 | 3.773 | 3.559 | 0.122
C4 | 4.194 | 3.873 | 0.006
C5 | 4.180 | 3.833 | 0.001
Overall rating | 8.190 | 7.235 | 0.001