1. Introduction
The fundamental objective of educational institutions is to facilitate the transmission of knowledge and information through structured learning processes, thereby enabling learners to translate this knowledge into practical decisions, skills, and actions within real-world contexts. Despite its central role in education and cognitive development, the process of learning remains insufficiently understood. It is well established, however, that effective learning necessitates sustained concentration, adequate time, focused cognitive engagement, and psychological well-being.
The courses in each educational institution are typically constrained by fixed daily schedules and a set number of class sessions. Students who begin with lower ability levels rarely have enough time to reach the passing standard. Meanwhile, students who start with higher ability levels are often able to demonstrate sufficient proficiency to earn high grades.
Prevailing learning theories imply markedly different dependencies of the learning rate on a student’s prior knowledge, that is, what the learner does or does not already know upon encountering new material. One foundational theory is the tabula rasa (i.e., the Aristotelian concept) [1], which conceptualizes the mind as a blank slate that acquires knowledge first through sensory experience and subsequently through reasoning. An alternative framework, constructivism [2], posits that new knowledge is actively constructed through the integration of existing cognitive structures. According to this theory, a learner’s prior knowledge positively influences the efficiency and rate of acquiring new information. A third approach, tutoring theory [3], involves individualized instruction by an expert, often exemplified by the Socratic method, and is adapted to the learner’s specific needs and current level of understanding.
In response to this complexity, various mathematical models have been developed to represent the learning process. Many models employ Ordinary Differential Equations (ODEs) or Initial Value Problems (IVPs). Hicklin [4] developed a theoretical model that provides a quantitative explanation of why individuals exposed to the same total amount of material will ultimately achieve the same level of mastery. A mathematical model showing the relationship among prior learning, motivation, and time on task was presented by Johnston and Aldridge [5]. Anderson [6] developed a mathematical model of learning and tested it with empirical data on science learning, while Pritchard et al. [7] presented mathematical models of students’ knowledge that express various theories of learning. Aldridge’s mathematical model for mastery learning [8] includes measures of motivation and specific learner ability with respect to the content being learned, taking into account the parallel phenomenon of forgetting, which was also considered in [9].
Other areas of mathematics are also employed in model development. Some discrete or stochastic mathematical models of human learning are discussed in Chapter 4 of [10]. Atkinson and Shiffrin presented a detailed study using stochastic models [11], while Käser et al. applied a dynamic Bayesian network and non-linear control mechanisms to model math skill acquisition [12]. Foss et al. [13] presented six different learning process models based on a form of reaction kinetics, studying equilibrium conditions and stability. Dassios et al. [14] presented a mathematical model using fractional calculus, and Chornyi et al. [15] developed a cybernetic model based on partial differential equations with fractional degrees. Lu et al. [16] presented a mathematical model that unifies nested timescales of learning, showing how short-term dynamics of motivation give rise to well-established patterns of skill acquisition.
Gafriani et al. [17] investigated the effect of the mastery learning model on mathematics achievement among elementary school students. Differences in mathematics performance between secondary students taught with conventional versus mastery learning strategies were examined in [18]. In another study, 157 Swiss secondary school students participated in an intervention that assessed the immediate and long-term effects of teaching modeling and its influence on self-efficacy [19]. An online Q study conducted in the UK [20] demonstrated its utility for identifying and explaining teachers’ values. Vlǎdescu [21] further analyzed the relationship between mastery learning models and mathematics achievement in secondary education. Coupland et al. [22] developed an assessment model for teaching first-year mathematics, which they implemented at a metropolitan university.
Many researchers have also studied the loss of memory. Georgiou et al. [23] proposed a model in which weaker memories are supplanted by stronger ones. Additionally, Murre et al. [24] developed an alternative framework that decomposes memory into a set of distinct processes. An extensive review of memory models was published by Finotelli and Eustache [25].
In this paper, we develop and solve a theoretical model using an IVP that represents the learning theory of constructivism, according to which individuals construct knowledge through experience and prior knowledge. The model takes into account the amount of knowledge the student already possesses, the amount of knowledge that must be acquired, the student’s intelligence, the amount of knowledge forgotten during the process, and the amount of knowledge available at the beginning of the study. The obtained IVP is solved analytically, and a parametric study of the solution is conducted, highlighting the fact that the higher the student’s intelligence and the lower the loss factor, the faster knowledge is acquired. As in previous models, full knowledge cannot be reached in finite time [5,8,14], and therefore, in order to achieve a perfect score, the student will need to read and learn more than the necessary amount of knowledge. By correlating data from [8,26], we demonstrate that the model is adequate for describing these data, indicating that the proposed model can be used to characterize mastery learning with great accuracy. This conclusion is verified using the Wilcoxon signed-rank test [27]. Therefore, the model can provide the function expressing the amount of knowledge that the individual knows at any time, as well as the time at which half of the total knowledge will be achieved. This prediction will help teachers guide students regarding the knowledge they need to acquire to excel in the exams within the available time, by appropriately raising the amount of knowledge that the student must learn.
The structure of this paper is as follows: In Section 2, the mathematical model is developed, while in Section 3, a parametric study of the obtained model is conducted. In Section 4, the parameters of the proposed model are calculated to fit existing data, and a comparison with previous models is presented. Finally, in Section 5, the discussion is provided, and in Section 6, the conclusions are presented.
2. The Mathematical Model
A student has to take an exam, and M is the total amount of knowledge that the student has to learn. Let
be the function of knowledge gained at time
so
In our model, the rate at which the human brain learns the total amount of knowledge M is proportional to both the amount of knowledge already learned and the amount yet to be learned, providing the ODE
where A is a constant representing the student’s intelligence, the so-called intelligence factor, reflecting the processing efficiency of the student. This is the well-known logistic differential equation.
For reasons of completeness and realism, a loss factor,
is also introduced; it is a constant expressing the loss of knowledge during studying, and therefore the ODE assumes the form
Moreover, the initial condition (IC)
is applied, where
is a constant providing the percentage of the material M that the student knows at the beginning of the study. Therefore, the problem at hand is
or equivalently, if
is the percentage of M that the student knows at time
the problem is translated to
The equations in (4) or (5) are non-linear first-order ODEs (specifically, they are Bernoulli ODEs).
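For orientation, the model described above can be written out as a short sketch. The symbols M and A are those used in the text; K(t) for the knowledge function, L for the loss factor, p for the initial fraction, and k(t) for the fraction of M are labels assumed here. The loss term is taken to be proportional to K(t), which is consistent with the Bernoulli form noted above but remains an assumption about the exact expression.

```latex
% Assumed notation: K(t) knowledge at time t, L loss factor, p initial fraction.
\begin{align*}
  \frac{dK}{dt} &= A\,K\,(M-K)
      && \text{(logistic growth, no loss)}\\
  \frac{dK}{dt} &= A\,K\,(M-K) - L\,K, \qquad K(0) = pM
      && \text{(with a loss term, assumed form)}\\
  \frac{dk}{dt} &= AM\,k\,(1-k) - L\,k, \qquad k(0) = p, \qquad k(t) = \frac{K(t)}{M}
      && \text{(fraction of } M \text{ known at time } t\text{)}
\end{align*}
```

Written as dK/dt - (AM - L)K = -AK^2, the second equation has the Bernoulli structure with exponent 2.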
Functions
are solutions of the ODEs of (4), (5), respectively, but they satisfy the corresponding IC only if
If
the solution of the ODE of (4) is
and applying the IC
while
If
the solution of the ODE of (4) is
while taking into account the IC, the solution of (4) is
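As a consistency check, the two cases can be worked out with the standard Bernoulli substitution u = 1/K. This is a sketch under assumed notation: K(t), L, and p are labels chosen here for the knowledge function, the loss factor, and the initial fraction (with p > 0), and the loss term L K is an assumption. The resulting expressions may differ in form from the numbered equations of the paper.

```latex
% Assumed model: dK/dt = A K (M - K) - L K,  K(0) = p M,  p > 0.
\begin{align*}
  &\text{Let } r = AM - L, \text{ so that } \frac{dK}{dt} = rK - AK^{2},
    \text{ and set } u = \frac{1}{K}: \qquad \frac{du}{dt} = -ru + A.\\[4pt]
  &r = 0: \qquad u = At + \frac{1}{pM}
    \;\Longrightarrow\; K(t) = \frac{pM}{1 + ApM\,t},
    \qquad k(t) = \frac{p}{1 + ApM\,t}.\\[4pt]
  &r \neq 0: \qquad u = \frac{A}{r} + \Bigl(\frac{1}{pM} - \frac{A}{r}\Bigr)e^{-rt}
    \;\Longrightarrow\; K(t) = \frac{pM\,r}{ApM + (r - ApM)\,e^{-rt}}.\\[4pt]
  &r > 0,\; L > 0: \qquad \lim_{t\to\infty} K(t) = \frac{r}{A} = M - \frac{L}{A} < M.
\end{align*}
```

Under these assumptions the plateau lies strictly below M whenever L > 0, which matches the statement in the Introduction that full knowledge cannot be reached in finite time.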
4. Fitting Data
Aldridge [8] used data from three students who took four tests on a specific body of knowledge within 12 h (Table 1).
Using these data, we fitted the function
given in (11), with specific parameter values, to each of the students A, B, and C. For comparison, we set
In order to calculate the other parameters
that correspond to student A, the solution of the system
must be derived. Using the method of least squares, we derive that
(Figure 6). Therefore, the function
for student A is obtained; the half-learning time was before time 0 and
Similarly, student B’s data are fitted with
, and
(Figure 7); the half-learning time is approximately
h and
If
, and
(Figure 8), student C’s data are fitted, and the half-learning time can never be reached since
This indicates a significant learning difficulty for student C.
Moreover, the derived model for each student can be used to predict the grades of these students at various times (Table 2).
Finally, our model can predict when a student started the effort to obtain the necessary knowledge, by solving the equation Students B and C started their effort at the beginning (time equal to zero), while for student A it stands meaning that the student’s effort started before time zero. Therefore, if and taking the limit as we derive that indicating that student A has established a stable knowledge base prior to time zero with respect to the material that must be learned in order to excel in the exams.
Along this line, we compare our model with the data presented in [26] (Table 3). Using the least-squares method, we calculate the parameters
M, and
Group 1 was derived with
group 2 with
, and group 3 with
From these data we derive that for group 1, the absolute relative error, i.e., the quantity
for Anderson’s model is
for Preece’s model is
while for our model it is
(Figure 9). These results indicate that the interval for our model gives the best approximation. Moreover, for group 2 the absolute relative errors are
respectively (Figure 10), while for group 3 they are
respectively (Figure 11), leading to the same conclusions as for group 1.
Calculating the percentages that express the distance from the midpoint to the edge of each interval (Table 4), i.e., the quantity
we conclude that our model has the smallest mean value of these deviations.
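The error measure used in this comparison can be sketched in a few lines of Python. The sketch assumes the usual definition of the absolute relative error, |observed - predicted| / |observed|, which may differ from the exact quantity used in the paper. The function names and sample scores are hypothetical and are not the data of Table 3.

```python
def absolute_relative_errors(observed, predicted):
    """Return |observed - predicted| / |observed| for each pair of values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]


def error_interval(errors):
    """Return the smallest and largest error, i.e. the interval they span."""
    return min(errors), max(errors)


if __name__ == "__main__":
    # Hypothetical scores for illustration only, NOT the data of Table 3.
    observed = [50.0, 80.0, 90.0]
    predicted = [45.0, 84.0, 90.9]
    errors = absolute_relative_errors(observed, predicted)
    print("errors:", errors)
    print("interval:", error_interval(errors))
```

A narrower interval located closer to zero indicates that a model tracks the observed scores more closely across the whole group.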
5. Discussion
In this paper, a model for mastery learning is presented. No student can excel in the exam when
so the cases of significant mathematical and practical interest are those with
The amount of knowledge is constant when
increasing at an increasing rate from the beginning of the study up to time
and increasing at a decreasing rate from
to infinity (Figure 3) when
and finally decreasing at an increasing rate over the whole domain when
(Figure 4).
The graph of is similar to a sigmoid. Thus, to acquire the necessary knowledge as soon as possible, the half-learning time has to be as close as possible to the knowledge axis (), which means that and if tends to time tends to infinity. The same conclusions apply when dealing with the inflection point
In Section 4, we examine the validity of the model. First, using data from [8], we calculated the four unknown parameters of our model
by solving a non-linear system with four equations and four unknowns for each student. This can be done using any numerical scheme of the researcher’s choice; in our case, we used the method of least squares. To demonstrate the model’s efficiency, we plotted the graphs of the function
derived from the proposed model for each student, which demonstrated high accuracy. These data were also used to predict student scores at various times. Subsequently, we compared our model with those of Anderson and Preece using real data from [26]. First, we calculated the unknown parameters with the method of least squares, and our results are presented in Table 3, which illustrates the model’s performance. Our model achieved the lowest standard deviation for each group in terms of absolute relative error. Furthermore, using the obtained parameters, we calculated for each model and group the mean deviation percentages, which represent the distance from the midpoint to the edge of each interval (Table 4).
In order to further validate the model, a non-parametric rank test for statistical hypothesis testing, the well-known Wilcoxon signed-rank test [27], was applied using the data from Table 3. The T-value of the Wilcoxon signed-rank test is 28 for Anderson’s model, 30 for Preece’s model, and 34 for the proposed model. All models are efficient since the T-value is greater than 14, but our model better describes mastery learning, as it has the highest value.
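For readers who want to reproduce this kind of check, the T statistic of the Wilcoxon signed-rank test can be computed as sketched below. This is a generic textbook implementation (zero differences dropped, tied absolute differences given average ranks), not necessarily the exact procedure followed in the paper. The function name and the sample numbers are hypothetical.

```python
def wilcoxon_t(observed, predicted):
    """Wilcoxon signed-rank statistic T = min(W+, W-) for paired samples."""
    # Paired differences; pairs with zero difference are discarded.
    diffs = [o - p for o, p in zip(observed, predicted) if o != p]
    # Indices of the differences, ordered by absolute value.
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of tied absolute differences.
        while (j + 1 < len(ordered)
               and abs(diffs[ordered[j + 1]]) == abs(diffs[ordered[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for idx in ordered[i:j + 1]:
            ranks[idx] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)


if __name__ == "__main__":
    # Hypothetical paired values for illustration only, NOT the data of Table 3.
    observed = [10, 12, 9, 14, 11]
    predicted = [8, 13, 12, 10, 10.5]
    print("T =", wilcoxon_t(observed, predicted))
```

The resulting T is then compared with the tabulated critical value for the given number of non-zero differences and significance level.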
Since the total amount of knowledge, M, is constant, it is useful to study the behavior of function with respect to the intelligence factor A and the loss factor This can be done by plotting the graph of with constant values of , and several values of Therefore, we plot function with and
where in each case we have drawn the point that corresponds to the half-learning time (named respectively).
From these figures it is clear that the smaller the fraction the faster the student learns half of the total amount of knowledge M and consequently the faster they will acquire the necessary knowledge to excel in the exams.
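The parametric behavior described above can be explored numerically with a short Python sketch. It assumes the model dK/dt = A K (M - K) - L K with K(0) = p M, where the loss term proportional to K and the names `loss` and `p` are assumptions made here, and it uses the closed-form solution that follows from that assumed ODE. The half-learning time is taken as the time at which half of M is known. All parameter values are hypothetical.

```python
import math


def knowledge_fraction(t, a, m, loss, p):
    """Fraction k(t) = K(t)/M for the assumed ODE dK/dt = a*K*(m - K) - loss*K."""
    r = a * m - loss                      # net growth rate
    if r == 0.0:
        return p / (1.0 + a * m * p * t)  # degenerate case: slow decay
    c = r / (a * m)                       # long-run fraction, 1 - loss/(a*m)
    return p * c / (p + (c - p) * math.exp(-r * t))


def half_learning_time(a, m, loss, p):
    """Time at which k(t) = 1/2 while knowledge is growing.

    Returns None if the long-run fraction does not exceed 1/2 or if the
    starting fraction is not below the long-run fraction. A negative value
    means the student already knew more than half of the material at t = 0.
    """
    r = a * m - loss
    if r <= 0.0:
        return None
    c = r / (a * m)
    if c <= 0.5 or p >= c:
        return None
    return math.log((c - p) / (p * (2.0 * c - 1.0))) / r


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to illustrate the trend.
    a, m, p = 0.01, 100.0, 0.1
    for loss in (0.1, 0.2, 0.3, 0.6):
        print("loss =", loss, "half-learning time =",
              half_learning_time(a, m, loss, p))
```

With A and M held fixed in this sketch, increasing the loss factor delays the half-learning time, and once the long-run fraction falls to one half or below the half-learning time no longer exists.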
6. Conclusions
The process of learning is of great interest to scientists. The functioning of the human brain is widely studied, but much remains to be understood. The rate at which the human brain learns has been approximated by various mathematical models. In this work, we develop a mathematical model based on an ordinary differential equation that takes into account the material that a student has to learn, the individual’s intelligence factor, and the fact that during learning, a student forgets some of the material that has already been studied.
Our model takes into account the fact that when someone studies to gain new knowledge, they face difficulties with the new material, so the rate of knowledge acquisition is lower than in previous models, and when these issues are overcome, the rate of knowledge acquisition increases rapidly. This result reflects all three groups of students [15]:
- I. Those who quickly reach the maximum level of assimilation of information;
- II. Those who quickly reach the intermediate level and then slowly improve it;
- III. Those who slowly master the teaching material.
Students of group I are described in our model with high values of the parameters and low values of the parameter students of group II with intermediate values of the parameters and students of group III with low values of the parameters and high values of the parameter Comparing our model with previous ones and taking into account actual data, the validity of the model is confirmed. Finally, our study demonstrates that the proposed model effectively captures the principles of mastery learning and can be directly utilized by educators as a practical tool to guide instructional planning, thereby enhancing students’ preparedness for examinations scheduled on specific dates.
The proposed model can be used by any teacher to help their students. This can be achieved as follows: Give the student a specific subset of the total material to study and give them a test after 1, 2, 3, or 4 h of study. The scores from these tests create a
non-linear system (similar to system (17)), whose solution gives the parameters
and therefore the function
Consequently, the teacher can predict the student’s performance at any time, thus effectively guiding them to success. In order to excel in the exams, the student has to read more than the total amount of knowledge required for the exams. The percentage of this variation can be calculated through the obtained function, taking into account the available time for reading.
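The classroom procedure described above can be illustrated with a small Python sketch. It is not the non-linear system solved in the paper: as a simpler stand-in, it assumes the closed form k(t) = p c / (p + (c - p) e^{-r t}) for the fraction of material known, where r (net learning rate), c (long-run fraction), and p (initial fraction) are names introduced here. It then uses the fact that log(k / (c - k)) is linear in t for that form, combining a grid search over c with ordinary linear least squares. The test scores are synthetic.

```python
import math


def model_fraction(t, r, c, p):
    """Assumed closed form k(t) = p*c / (p + (c - p)*exp(-r*t)), with 0 < p < c."""
    return p * c / (p + (c - p) * math.exp(-r * t))


def fit_scores(times, scores, steps=100):
    """Estimate (r, c, p) from test scores given as fractions in (0, 1).

    For a trial plateau c, log(k / (c - k)) is a straight line in t under the
    assumed model, so slope and intercept follow from linear least squares.
    The plateau is chosen from a grid as the value with the smallest residual.
    """
    n = len(times)
    t_mean = sum(times) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    best = None
    for i in range(1, steps + 1):
        c = i / steps
        if c <= max(scores):
            continue  # the plateau must lie above every observed score
        y = [math.log(k / (c - k)) for k in scores]
        y_mean = sum(y) / n
        slope = sum((t - t_mean) * (v - y_mean) for t, v in zip(times, y)) / s_tt
        intercept = y_mean - slope * t_mean
        sse = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, y))
        if best is None or sse < best[0]:
            # Invert intercept = log(p / (c - p)) to recover p.
            p = c / (1.0 + math.exp(-intercept))
            best = (sse, slope, c, p)
    if best is None:
        raise ValueError("all scores must be below 1")
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Synthetic scores after 1, 2, 3 and 4 h of study (hypothetical student).
    hours = [1.0, 2.0, 3.0, 4.0]
    scores = [model_fraction(t, 0.8, 0.9, 0.1) for t in hours]
    r, c, p = fit_scores(hours, scores)
    print("r =", r, "c =", c, "p =", p)
    print("predicted fraction after 6 h:", model_fraction(6.0, r, c, p))
```

If the loss term is assumed proportional to the current knowledge, the fitted values translate back as A M = r / c and loss factor = r / c - r. Real scores are noisy, so a finer grid or a proper non-linear least-squares routine would be preferable in practice.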
In conclusion, this study may serve as a stepping stone for future research, either by analyzing the model’s ODE with non-constant parameters or by incorporating additional parameters such as a distraction factor or an environmental factor. Moreover, the author plans to present an extension of this model using fractional calculus in a forthcoming publication, in order to further improve the obtained results.