Implementation of Fuzzy Functions Aimed at Fairer Grading of Students’ Tests

The main goal of this research is to achieve fairer assessment of students' knowledge and skills in tests, in view of the fact that learners with similar abilities are often assigned different test grades if their achievements are borderline cases between two grades. As an example, a student who has scored 29 points out of 60 in a test will fail, while another one with 30 or 31 points will pass, which can be viewed as unjust. This problem is addressed by using fuzzy logic and different fuzzy functions to determine whether students with borderline test scores are assigned the higher or the lower of the two grades. Furthermore, although the grades of individual students may change, the authors endeavor to keep the overall score of all test takers statistically unchanged.


Introduction
The education process at the Faculty of Mathematics and Informatics (FMI) at the University of Plovdiv is divided into three trimesters of 10 weeks each per academic year. The students from most majors study English only for two trimesters during their first year at university with a total of 100 seminars. In the course of the English language education students at FMI undergo a number of written examinations. First of all, they sit for a placement test, which determines the language level of each individual learner and, based on their results, students are redistributed from academic groups into language ones. The placement test scores are essential because they ensure the homogeneity of the language groups of learners with similar abilities. Then, students at FMI have classes once a week during which they use web-based materials and work on various projects. For homework, they do online tests in accordance with the exact teaching material covered during the seminars. These are self-study tests, created by the teacher, which are founded on criteria ranging from a lower to a higher level of difficulty [1]. Students take the self-study tests in their own place and time within the week before their next seminars in English at FMI. The grading of each test contributes towards the final grade that students are assigned at the end of the English course; for this reason, it is crucial that the students' tests are evaluated fairly and impartially. Besides the placement test and the self-study tests, students also sit for Midterm and Final exams during their course in English.
Fair and impartial assessment of the students' knowledge and skills is essential not only when high-stakes evaluations are concerned, for instance tests that determine whether a learner is eligible to enroll in or graduate from a school, but also for low-stakes ones. Assessments can help students to gain insight into their own achievements as well as to build confidence in their ability to learn and remain positive and motivated to continue studying. Since the global work culture is changing, students will have to become life-long learners to remain competitive on the job market. Assessment plays a key role in developing the learners' confidence in their ability to learn and skills to do it throughout their lives.
The main reason to create and administer test examinations to students is to measure their understanding of specific content or the effective application of critical thinking skills. As a consequence, tests evaluate student learning and the level of their academic achievements at a specific stage of the course of studies. Online tests provide a number of benefits; select-response tests such as multiple-choice or true/false ones immediately provide students with their final scores, while open questions allow students to improve their writing by drafting, editing, and rewriting their texts, if necessary. Moreover, the data that tests provide, gathered over time, can help instructors to identify trends and improve their own teaching. In conclusion, assessment allows teachers to collect, record, score, and interpret information about learning. Some of the advantages of self-study tests are that they support learners of different ages and different abilities; they help them to memorize the content for a longer time and increase student achievement [2,3]. Testing not only registers the student's level of knowledge but also changes it, thus improving the long-term retention of information, which is known as the testing effect [4]. Self-study tests provide feedback to both the student and the teacher and in this way, they allow for adjustments to the teaching process in order to achieve better final results.
One of the most important indicators of the quality of education is the result from it presented as a grade based on some assessment of a learner's knowledge and skills. In addition to fulfilling a specific function, the assessment grades must be objective and unbiased; they should indicate the achievement of certain significant results and be understandable for the learners, who have to be informed in advance about the evaluation method and its criteria. Moreover, the grades themselves must not be turned into a goal; they must inform about the student's progress, and the methods for calculating the grade have to be objective and statistically valid.
Nevertheless, students with similar abilities may be assigned different grades if their achievements are borderline cases, for example, a student who has scored 29 points out of 60 will probably fail, while another one with 30 or 31 points will pass the test, which can be viewed as unfair. In an attempt to find a reasonable and impartial solution to this problem, we have used fuzzy logic to review the results of a specific test taken by 78 first-year students of Informatics at FMI. The test assessed the students' reading and writing skills in English based on the learning materials and it comprised 61 items, 60 of which were closed-ended questions and one open-ended. The closed-ended items were multiple choice questions with four response options, each one awarding one point for a correct answer, while the open question required students to write a long-text answer to a question and it granted a maximum of 20 points. All the open-ended questions were evaluated by the same teacher following explicit predetermined criteria.
Fuzzy logic has a number of applications in the analysis of studies from various areas of education, as can be seen from Reference [5]. Previous research has endeavored to incorporate fuzzy logic into education [6] or to use fuzzy logic as a tool for assessing students' knowledge and skills collectively [7]. However, instead of assessing the knowledge and skills of groups of students by means of fuzzy logic, we have used fuzzy functions to reconsider students' individual assessments in cases where there may be doubts of unfair grading.

Fuzzy Sets
Uncertainty in some test-related conclusions may appear due to vagueness or imprecision rather than to randomness, and it is frequently encountered in the evaluation of student results. The idea of fuzzy sets, which was introduced by Lotfi Zadeh [8], can solve this problem to some extent. A typical example may be presented by the following two sets: set A = {if a man is taller than or equal to 1.85 m, we consider him a tall person}; set B = {if a man is less than 1.85 m tall, we consider him not to be tall}. The characteristic functions of sets A and B can be defined as µ_tall(x) = 1 for x ≥ 1.85 and µ_tall(x) = 0 for x < 1.85, with µ_not tall(x) = 1 − µ_tall(x). What can we say about a person who is 1.71 m tall? We obtain that µ_tall(1.71) = 0 and µ_not tall(1.71) = 1. Therefore, we can definitely conclude that he or she is not tall.
On the other hand, what can we say about two people who are 1.84 m and 1.86 m tall, respectively? We get that µ_tall(1.84) = 0, µ_not tall(1.84) = 1, µ_tall(1.86) = 1 and µ_not tall(1.86) = 0. Therefore, the first person should be regarded as not tall and the second one as tall. However, that would be a mistake, since both of them are approximately the same height. In cases like these, it is reasonable to apply fuzzy sets. Fuzzy sets are considered with respect to a nonempty base set X of elements of interest (in our case X will be the set of all people). The essential idea is that each element x ∈ X is assigned a membership grade to set A, taking values in [0, 1], with µ(x) = 0 corresponding to non-membership, 0 < µ(x) < 1 to partial membership, and µ(x) = 1 to full membership. One possible definition that provides an example of fuzzy sets of the tall and not tall individuals is µ_tall(x) = 0 for x ≤ 1.8, µ_tall(x) = (x − 1.8)/0.1 for 1.8 < x < 1.9, and µ_tall(x) = 1 for x ≥ 1.9, with µ_not tall(x) = 1 − µ_tall(x). We plot the function µ_tall in blue and µ_not tall in red (Figure 1). Now a person who is 1.87 m tall can be viewed as partly tall and partly not tall: he/she belongs to the set of tall people with a degree µ_tall(1.87) = 0.7 and to the set of not tall people with a degree µ_not tall(1.87) = 0.3. If we consider a 1.82 m tall person, then he/she belongs to the set of tall individuals with a degree µ_tall(1.82) = 0.2 and to the set of not tall people with a degree µ_not tall(1.82) = 0.8. If we consider a person who is 1.95 m tall, then he/she belongs to the set of tall people with a degree µ_tall(1.95) = 1 and to the set of not tall people with a degree µ_not tall(1.95) = 0, so he/she is definitely a tall individual.
There are different functions that can be used to assign a membership grade. The previous example illustrates the so-called trapezium membership function. A bell-shaped membership function can be used instead; a sample of such a function is displayed in Figure 2. Owing to the use of the bell-shaped function, when the height of a person gets closer and closer to the limit of 1.9 m, his/her membership grade to the set of tall people approaches 1 more rapidly, as is evident from µ_tall(1.87) = 0.7 (Figure 1) versus ν_tall(1.87) = 0.79 (Figure 2, blue curve). Under the bell-shaped function, a person who is 1.87 m tall belongs to the set of tall people with a degree 0.79 and to the set of not tall people with a degree 0.21, while a 1.82 m tall person belongs to the set of tall people with a degree 0.09 and to the set of not tall people with a degree 0.9. If a person's height is close to the boundary value of 1.9 for tall persons, his/her membership grade, calculated with the help of the bell-shaped function, is closer to one than if it is calculated with the trapezium membership function.
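To make the two membership-function shapes concrete, here is a small sketch in Python. The linear ramp between 1.8 m and 1.9 m is inferred from the quoted values µ_tall(1.82) = 0.2 and µ_tall(1.87) = 0.7; the cosine-smoothed variant is our assumption, chosen because it reproduces the bell-shaped values quoted in the text (e.g. 0.79 at 1.87 m).

```python
import math

def mu_tall_trapezium(x, lo=1.8, hi=1.9):
    """Trapezium (linear-ramp) membership grade for the set of tall people:
    0 below lo, 1 above hi, linear in between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def mu_tall_bell(x, lo=1.8, hi=1.9):
    """Cosine-smoothed (bell-like) ramp; rises faster near the upper boundary."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (1 - math.cos(math.pi * (x - lo) / (hi - lo))) / 2

def mu_not_tall(x):
    """The complement set: not tall = 1 - tall."""
    return 1 - mu_tall_trapezium(x)
```

With these definitions a 1.87 m person is tall to degree 0.7 under the trapezium function and about 0.79 under the smoothed one, matching the values read off Figures 1 and 2.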

Fuzzification of Score Metrics
As previously stated, the score metric represented the number of correct answers to 60 questions with a closed answer, each one awarding one point for a correct answer and zero points for an incorrect one, and one question with an open answer, which was graded by the teacher on a scale from zero to 20.
The criteria for constructing the e-tests that we used are based on Bloom's taxonomy. The closed-ended questions were of similar difficulty and length; they were scored automatically and aimed at the lower levels of cognitive thinking by testing the students' knowledge and comprehension of the learning material. For this reason, the same weight of one point for a correct answer was assigned to each of them. The open question targeted the higher levels of cognitive thinking; it required students to compose a text in which they provided an opinion and justified it, and this question was evaluated manually by the teacher. Besides the difference in the level of difficulty of the two types of questions, there was a chance that the closed-ended ones might not reflect a student's knowledge and skills objectively for a variety of reasons, for example guessing, technical errors (automatic change of the selection of an answer while scrolling up or down the webpage), etc. On that account we assumed that the test score metrics relative to the open question were the critical ones, wherefore that item was assigned a higher weight.
The criteria for determining the weight of the open question were based on a scoring guide that we developed, which outlined what information was important in each response and how much credit was given for each part of it. These criteria reviewed the content of the responses-compliance with the topic and logical sequence of the text, adherence to the set format of the text (for example, an opinion essay, letter, etc.) and number of words, adherence to the grammatical norms and rules, correct and accurate use of the vocabulary, wealth of means of expression, and correct spelling.
When the items of a test measure the same kind of ability or knowledge, they yield a high internal consistency reliability. If a test is made up of different kinds of items assessing different types of abilities and knowledge, Cronbach's alpha coefficient tends to be low as a result of the heterogeneity of the items in terms of format and content. That is why we first decided what weight to place on the open question, to ensure that the test was reliable; a wrong decision can sometimes be unfair. Cronbach's alpha coefficient can easily be calculated by the formula α = (k/(k − 1)) · (1 − Σσ_i²/σ_X²), where k is the number of questions, σ_i² are the variances of the individual items, and σ_X² is the variance of the total test scores. The number of questions was 61, the sum of the item variances was 50.91, and the test variance was 179.45. Applying these numbers in the formula for Cronbach's alpha coefficient, we got α = 0.72, which could be interpreted as just acceptable reliability. The greatest variance appeared in the open question; therefore, we searched for a coefficient γ by which to multiply the results of the open question so that α became bigger. We chose γ = 0.2; then the number of questions was 61, the sum of the item variances was 12.92, and the test variance was 114.99. Applying these numbers in the formula for Cronbach's alpha coefficient, we obtained α = 0.902, which could be interpreted as excellent reliability.
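The computation above can be checked directly. A minimal sketch of the formula in Python (the three arguments are the number of items, the sum of the item variances, and the variance of the total scores; the function name is ours):

```python
def cronbach_alpha(k, sum_item_variances, total_variance):
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / total variance)."""
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# The two cases reported in the text:
alpha_raw = cronbach_alpha(61, 50.91, 179.45)       # unweighted open question
alpha_weighted = cronbach_alpha(61, 12.92, 114.99)  # open question scaled by gamma = 0.2
```

The weighted variant yields roughly 0.902, i.e. excellent reliability, against roughly 0.72 to 0.73 for the unweighted test.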
Now the maximum number of points that a student could get was 64 (60 + 20 × 0.2). We defined five categories of grades depending on the number of correct answers obtained by each student: Fail (from zero to 31 points), Satisfactory (from 32 to 39), Good (from 40 to 47), Very good (from 48 to 55), and Excellent (from 56 to 64).
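The weighted score and the crisp (non-fuzzy) grade categories can be expressed as a short helper; the function names are ours:

```python
def weighted_score(closed_correct, open_points, gamma=0.2):
    """Total score: one point per correct closed question plus the open
    question scaled by gamma (maximum 60 + 20 * 0.2 = 64 points)."""
    return closed_correct + gamma * open_points

def crisp_grade(points):
    """The five grade categories defined in the text."""
    if points <= 31:
        return "Fail"
    if points <= 39:
        return "Satisfactory"
    if points <= 47:
        return "Good"
    if points <= 55:
        return "Very good"
    return "Excellent"
```

For example, a student with 50 correct closed answers and 19 open-question points gets 53.8 weighted points, i.e. the grade Very good.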
As pointed out by the authors of References [6,9,10,11], it is difficult and unfair to assign a Good grade to a student with 47 points and to give a Very good grade to another one with 48 points; therefore, the use of fuzzy sets is justified. We assume, for example, that someone whose test score belongs to the interval from 37 to 43 can be assigned either the grade Satisfactory or the grade Good. Our goal was to apply fuzzy sets and fuzzy logic in the estimation of students' tests in order to assign them a fairer mark.
We defined five functions that represent the trapezium fuzzy membership functions for the sets of grades; their plots are given in Figure 3. Figure 3. Plots of µ_Fail, µ_Satis., µ_Good, µ_VeryGood, and µ_Excellent.
In cases when the points that a student had obtained in a test did not belong definitely to a given set, we needed to select another criterion, dependent on the student's score, in order to decide which grade to assign him/her. Consequently, following the ideas from Reference [1], the second assessment applied to estimate a student's knowledge was his/her result on the open question. We again divided the students' scores into five groups: Fail (from zero to eight points), Satisfactory (from nine to 11), Good (from 12 to 14), Very good (from 15 to 17), and Excellent (from 18 to 20), and we also defined their membership functions ν (Figure 4); near the group boundaries they are smoothed with cosine transitions of the form (cos(1.42x − 9.85) + 1)/2.
The fuzzy associative matrix (Table 1) provided a convenient way to combine the two input relations directly in order to obtain the fuzzified output results [9]. The input values for the result of the open question run across the top of the matrix, and the input values for the total score of the test run down its left side. We used the conventional Bulgarian grading scale. The rules for performing the fuzzy set operations of intersection (AND), union (OR), and complement (NOT), which were of most interest to us, are stated below [9].
For intersection, we looked at the degree of membership in each set and picked the lower of the two, that is, µ_A∩B = min(µ_A, µ_B) (Figure 5). For union, we looked at the degree of membership in each set and picked the higher of the two, that is, µ_A∪B = max(µ_A, µ_B) (Figure 6). For complement, we subtracted the degree of membership from 1, that is, µ_NOT A = 1 − µ_A.
Let us consider a student with an overall score of 49 points and a score of 19 points on the open question. He/she belonged to the set of Very good grades with a degree µ_VeryGood(49) = 0.79 and to the set of Good grades with a degree µ_Good(49) = 0.21. Normally, he/she would get the grade Very good (5). However, intersecting the two inputs, the overall points and the points on the open question, gives: he/she belongs to the set µ_VeryGood ∩ ν_Excellent with a degree 0.79, to the set µ_Good ∩ ν_Excellent with a degree 0.21, and to the sets µ_VeryGood ∩ ν_VeryGood and µ_Good ∩ ν_VeryGood with a degree 0 of the associative matrix. Therefore, we could assign him/her the grade Excellent (6).
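The inference for the worked example can be sketched in Python. The membership degrees below are the ones quoted in the text for a student with 49 overall points and 19 open-question points; the associative-matrix fragment reproduces the one cell stated in the text (a Very good total combined with an Excellent open answer gives Excellent), while the second cell is a hypothetical placeholder:

```python
def fuzzified_grade(total_degrees, open_degrees, matrix):
    """Combine the two fuzzy inputs with min (fuzzy AND, i.e. intersection)
    and return the matrix output with the highest resulting degree."""
    best = (None, -1.0)
    for g_total, mu in total_degrees.items():
        for g_open, nu in open_degrees.items():
            degree = min(mu, nu)  # intersection = min
            out = matrix.get((g_total, g_open))
            if out is not None and degree > best[1]:
                best = (out, degree)
    return best

# Degrees for the example student (from the text).
total_degrees = {"Very good": 0.79, "Good": 0.21}
open_degrees = {"Excellent": 1.0}

# Fragment of the associative matrix (Table 1); the second cell is assumed.
matrix = {
    ("Very good", "Excellent"): "Excellent",
    ("Good", "Excellent"): "Very good",
}
```

Calling fuzzified_grade(total_degrees, open_degrees, matrix) returns the grade Excellent with degree 0.79, as in the worked example.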
In accordance with Reference [9], we needed to calculate the grade for each student whose test score did not belong definitely to a given set. To do so, we referred to Table 2, where the function F registered the minimums of µ and ν, i.e., F = min(µ, ν). The maximum membership grade acquired from the table represented the corrected grade from the associative matrix. Table 2. The various combinations of the minimums of µ and ν, calculated by use of F.
Let us now take an example of a student whose overall test score is 57 points and whose score on the open question is 18 points (p = 57 and q = 18). As illustrated in Table 3, the student's grade after the process of fuzzification was Excellent 6, which coincided with the classical grading (56 to 64 points correspond to an Excellent grade). If we review another student, who obtained 42 points overall (which corresponds to Good 4 under the conventional grading system) and 17 points on the open question, it is evident from Table 4 that after the fuzzification that student should have been assigned a higher grade, Very good 5. To reevaluate the test scores that did not belong definitely to a given set, we only needed to input the test results, first the overall number of points obtained and then the number of points on the open question, for all 78 students who took the test. Next, the software (MapleSoft 2016.0) automatically selected the scores that needed to be fuzzified and calculated the fuzzified grades.
With the help of the F-test we checked whether the variance S²_X of the grades after using the fuzzy technique and the variance S²_Y before using it were equal. We got the following result: Two sample F-test (X, Y, 1, confidence = 0.95, equal variances = true); F-ratio test on two samples.
Null Hypothesis: The two samples are drawn from populations with a ratio of variances equal to 1.

Alt. Hypothesis: The two samples are drawn from populations with a ratio of variances not equal to 1.

Result: [Accepted].
This statistical test did not provide enough evidence to conclude that the null hypothesis was false. Consequently, we could accept that the standard deviations were equal.
We also checked the distributions of the grades before and after using the fuzzy technique for equal mean values.
Null Hypothesis: The two samples are drawn from populations with a difference of means equal to 0.

Alt. Hypothesis: The two samples are drawn from populations with a difference of means not equal to 0.

Result: [Accepted].
This statistical test did not provide enough evidence to conclude that the null hypothesis was false. Consequently, we could accept that the mean values were equal.
With the help of the paired t-test we checked whether the distributions of the grades before and after the change by means of the fuzzy technique were the same.
Null Hypothesis: The paired samples are drawn from populations with a difference of means equal to 0.
Alt. Hypothesis: The paired samples are drawn from populations with a difference of means not equal to 0.
Result: [Accepted]. This statistical test did not provide enough evidence to conclude that the null hypothesis was false. Consequently, we could accept that the two distributions Y and X were equal.
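These checks can be reproduced outside Maple. The sketch below computes the two sample statistics with the Python standard library; the grade lists in the usage note are hypothetical placeholders, and in practice the p-values would be taken from the F and t distributions (e.g. via SciPy's scipy.stats module).

```python
from math import sqrt
from statistics import mean, stdev, variance

def f_statistic(x, y):
    """F-ratio of the two sample variances; under H0 (equal variances) it
    follows an F(len(x) - 1, len(y) - 1) distribution."""
    return variance(x) / variance(y)

def paired_t_statistic(x, y):
    """t statistic of the paired differences; under H0 (equal means) it
    follows a t(n - 1) distribution."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d)))
```

For instance, two samples with equal variances, such as [1, 2, 3, 4, 5] and [2, 3, 4, 5, 6], give an F-ratio of exactly 1, so the null hypothesis of equal variances would not be rejected.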

Summary
To summarize, we could say that the means of the grades of the group as a whole before and after the fuzzification did not change statistically: the mean of the grades before the process of fuzzification was 4.09, and after the fuzzification 4.04.
The number of fuzzified grades from the test was 42 out of 78. More than half of the grades were fuzzified, namely 53.84%, and the number of corrected grades was 31. The corrected grades after the fuzzification amounted to 39.74% of all grades and to 73.81% of the fuzzified grades, which indicated that the grading system needed some adjustments and that fuzzy logic was applicable to a large number of students in pursuit of more reasonable grading.
To implement the process of fuzzification, by virtue of the sequence of procedures that we wrote in MapleSoft 2016.0, we only needed to enter the test results. The software automatically calculated the fuzzified grades and checked the three hypotheses mentioned above. We could change the boundary values of the membership functions to ensure that the two distributions were equal. If we reduced the intervals in which the grades were not fuzzified and increased the intervals in which we used fuzzification, we obtained results whose distribution was substantially different from that of the results before the process of fuzzification. That is why we would like to emphasize that the teacher's experience in using fuzzification to correct the final grades is of great importance. We illustrate this below by changing the set of points which will be fuzzified. We defined five functions that represent bell-like fuzzy membership functions for the sets of grades (Figure 7), and we used the same membership functions ν as in the first case.
We obtained that the set of scores Y that would be fuzzified consisted of 46 items. Performing the same procedure in MapleSoft 2016.0, we got: Two sample F-test (X, Y, 1, confidence = 0.95, equal variances = true); F-ratio test on two samples.
Null Hypothesis: The two samples are drawn from populations with a ratio of variances equal to 1.

Alt. Hypothesis: The two samples are drawn from populations with a ratio of variances not equal to 1.

Result: [Accepted].
This statistical test did not provide enough evidence to conclude that the null hypothesis was false. Consequently, we accepted that the standard deviations were equal.
Null Hypothesis: The two samples are drawn from populations with a difference of means equal to 0.

Alt. Hypothesis: The two samples are drawn from populations with a difference of means not equal to 0.

Result: [Accepted].
With the paired t-test we again checked whether the changed grades followed the same distribution.
Null Hypothesis: The paired samples are drawn from populations with a difference of means equal to 0.
Alt. Hypothesis: The paired samples are drawn from populations with a difference of means not equal to 0.

Result: [Rejected].
In short, the number of fuzzified grades from the test was 46 out of 78. Over half of the grades were fuzzified, namely 63.88%, and the number of corrected grades was 35. The corrected grades after the fuzzification amounted to 44.87% of all grades and to 76.08% of the fuzzified grades. The mean and the standard deviation of the group's grades before and after the fuzzification did not change statistically: the mean of the grades before the process of fuzzification was 4.39, and after the fuzzification 4.04. However, the paired test showed that there was a significant difference in the distributions of the fuzzified marks. Therefore, in this particular case, the procedure cannot be interpreted as yielding a fairer grading of students' tests.

Conclusions
Although previous research has endeavored to incorporate fuzzy logic into education [6] or to use fuzzy logic as a tool for assessing students' knowledge and skills collectively [7], the focus of our work was the individual student. Our main concern was to verify that the grades each learner obtains are as fair as possible, because test scores are important for their final grading and hence for their motivation to study [12].
One of our main contributions is that we have taken into account the reliability of the test when using fuzzy logic and have confirmed that the distributions of the changed grades by means of the fuzzy technique are the same as the distributions before that process.
In our work we wish to add more than two points to the borderline interval (for instance, to change the interval for assigning the grade Good 4 from 51-60 to 50-61 points) in order to make assessments fairer and compensate for possible technical errors made by students. Thus, our purpose is to make the interval of borderline scores as large as possible while at the same time guaranteeing that the test is reliable.
For this reason, calculations were performed automatically to check at which values the test was reliable. In our test this interval included five points: for example, according to the usual grading, the grade Good 4 was assigned to students who obtained from 51 to 60 points on the test, while by using fuzzy logic this interval added five points and changed it to 47.5-62.5.
Just as importantly, the test results of all the students as a whole did not differ statistically from those obtained by using the traditional scale.
We can conclude that, when assessing students' achievements in tests, the use of the fuzzy technique needs to be handled carefully. We can only implement fuzzification if the mean value, the variance, and the distribution function do not differ statistically when the grades are considered as a set. In this case, we neither increase nor decrease the overall score of the examined group. Taking everything into account, we can deduce that we have found a way of getting a fairer evaluation of students, without changing the overall score of the group and their distribution function.
We are planning to incorporate fuzzy logic in the Distributed e-Learning Center (DeLC) at Plovdiv University during the next academic year. DeLC has been used at the Faculty of Mathematics and Informatics for teaching and testing Bachelor and Master-degree students for years. The tests, created and delivered online, evaluate the students' achievements in a variety of subjects among which English, Software Engineering, Databases, Intelligent Systems, Mobile Applications, elective courses such as Application of Block Chains, Contemporary JavaScript frames for Building Backend Applications, etc. Thus, our goal is to use fuzzy logic to review borderline test scores of a wide range of ability tests on a regular basis.

Author Contributions:
The authors worked on this piece in close collaboration and have contributed equally to its development.
Funding: This research received no external funding.