Moodle Quizzes as a Continuous Assessment in Higher Education: An Exploratory Approach in Physical Chemistry

Abstract: The use of Moodle quizzes as a continuous assessment tool and an integral part of the educational methodology in higher education has been analyzed in a case study of a physical chemistry subject. Two types of quiz with different item types and settings, called basic quiz (BQ) and thematic block quiz (TBQ), were built from a question bank of more than 450 items. BQ has true/false items, while TBQ has randomly mixed items (multiple choice, numerical and matching). The effect of the type of quiz on student scores is analyzed using statistical and psychometric data such as the degree of participation, the facility index and the discrimination index of each item, and the average score, calculated according to classical test theory. This allows us to discern which type of quiz has sufficient quality to be used as an assessment tool. Moreover, the effect of this educational activity, carried out during six academic years from 2014 to 2020, just before the pandemic, is evaluated by considering the students' scores in the Ordinary Calls of exams and comparing them with those of previous courses taught with a traditional methodology based on master classes. The statistical results indicate that TBQs are more discriminative than BQs and could be used as an assessment tool, while BQs are useful only as a formative activity. Moodle quizzes turn out to be a reliable strategy for learning scientific content, with high participation in the knowledge tests, good average scores and a greater number of passes in the Ordinary Calls.


Introduction
The adaptation of the subjects of any degree to the new European Space for Higher Education, where new capacities and abilities are evaluated in students, implies a significant change in the traditional teaching methodology, which has relied mainly on master classes. This change in the pedagogical model of teaching-learning is in turn conditioned and reinforced by the current digital society and by the new information and communication technologies available in any space-time framework [1,2]. The availability of information on everyday electronic devices such as smartphones or tablets, together with the internet connectivity of user groups, makes it possible to establish more active work dynamics, giving the student greater participation in the teaching-learning process of a subject [3].
Studies evaluating the global impact of technology use on student performance are not conclusive and yield different results [4,5], given that such educational research may depend on factors not identified in the analysis itself, such as the educational method or strategy [6] carried out through the electronic medium, or the student's commitment to the learning methodology [7]. However, these multimedia and interactive technologies can be of great help in offering quality comprehensive education [8] based on current computer tools that facilitate cognitive learning processes and reinforce the capacities for abstract reasoning and study of a specific subject, in addition to complementing the traditional forms of learning [9]. Different teaching methodologies integrate technological devices into the educational environment, such as blended learning [10–12] and gamification [13–15]. Blended learning, or b-learning, combines face-to-face lessons in the classroom, required for any subject in university study plans, with virtual teaching activities through learning platforms, while gamification techniques try to create experiences similar to those of playing games in order to motivate and engage users. Both methodologies benefit from the presence of the teacher as a transmitter of knowledge and guide of educational activities, and from communication technology that facilitates independent and collaborative learning. In particular, b-learning has been applied in subjects from different areas of knowledge such as education sciences [12], natural sciences [16], economics [9] and engineering [7] as a proposal for European convergence, given that it allows the student's non-contact work hours to be completed with virtual activities, as established in the new university teaching guides for the convergence of the European Space for Higher Education.
The Moodle platform [17] is a virtual learning environment that offers very attractive functionalities from the pedagogical point of view by promoting the philosophy of social constructivist education [18,19], and in which subjects can be hosted and easily handled, at the editing level by teachers and at the user level by students. In this virtual environment, teaching resources of different kinds can be included, such as links to web pages, chats, forums, messages, and other specific documents like notes, tutorials and question sets prepared by the teacher. Moreover, it offers the possibility of carrying out online activities through quizzes, which could allow the continuous assessment of students' learning. A great variety of quizzes can be designed with different item types and settings, but not all quizzes can differentiate the skills and competences of students, and thus not all of them can be used as assessment tools. The quality of these quizzes can be analyzed using the statistical and psychometric data reported by the Moodle platform [20,21].
Evaluation methods based on online quizzes have been studied in diverse disciplines such as engineering, biology, medicine and the social sciences [22–24]. Although there are some objections to the implementation of such systems, related to the confidentiality of the student's identity, the use of the information and its possible impact on the educational process [25,26], they offer advantages such as the efficient management of results for a very large group of students, the speed with which the evaluation can be performed, and paper savings [27]. However, quizzes must be carefully designed in order to be used as an assessment tool. Two important points must be considered in the design: the writing of different questions using different item types, and the quiz settings themselves. The statistical and psychometric data derived from a particular quiz can be a great help in judging its quality. There are some studies on the analysis of information generated from test-type quiz evaluations in other scientific subjects [20,21,28–30], showing how such results could be useful for professors and students. No statistical and psychometric studies on physical chemistry quizzes have been found in the literature.
In this work, two types of Moodle quizzes are designed for a physical chemistry subject. The main objective is to establish which type of quiz can be used as an assessment tool on the basis of statistical and psychometric data. The work highlights how Moodle statistics can be used to measure the effectiveness and reliability of a quiz. In addition, the effect of these online activities on the students' final scores is assessed by comparing them with those obtained under a traditional methodology.

Materials and Methods
The research is designed in three stages: first, the student population was surveyed with a brief poll about their entry into the university; second, the students answered the quizzes during the teaching semester; and finally, the statistical and psychometric parameters of the quizzes were analyzed on the basis of classical test theory [31–33] (see Supplementary Materials). A brief survey is carried out at the end of the teaching period to gather the students' opinion about this experience. The scores obtained in the two Ordinary Calls of exams are compared with those obtained in previous courses where the teaching methodology was a traditional one based exclusively on master classes.
This research was carried out in the general physical chemistry subject over six years, from the 2014-2015 to the 2019-2020 academic years, just before the pandemic. This subject is included in the Basic Module of the Degree in Chemistry at the University of Málaga. It consists of six theoretical credits and is taught during the first semester of the first year of the degree.
This subject was chosen because it is difficult for new students in the Degree in Chemistry. It includes topics like thermodynamics, electrochemistry and kinetics, which are the starting point of other physical chemistry subjects in higher courses, in which a significant dropout of students has been detected. Thus, it seems advisable to apply a new educational methodology, or new activities using technological devices, in the first course in order to consolidate and strengthen the basic concepts of this subject.

Sample
The average number of students in general physical chemistry was about 80 during the last academic years, with an equal proportion of men and women in the last four years. All students can freely participate in the quizzes as a single experimental group. No specific sampling method and no control group were established, so that all students were evaluated in a homogeneous way and there were no discrepancies in the final evaluation.
It was not possible to perform a similar study in other courses or scientific areas, or in other degrees, because no other teachers involved in the project were using a similar educational strategy with Moodle quizzes. Although this sample is not representative of the higher education context, the similar results obtained over different years with different student populations suggest that significant changes would not be expected in another similar scientific setting, which would probably show a similar trend.
At the beginning of the course, a brief survey is carried out to explore the students' admission to the university, including their academic background in chemistry and their enrollment in the degree. Averaged over the last six academic years, most of the students, 86-88%, are 18 years old, and the rest, 11-12%, are in the range of 21 to 25 years old, which is probably due to students who repeated a year in the secondary or baccalaureate cycle, or who come from other degrees. Most of the students, 86-95%, studied a chemistry subject during high school, but it should be noted that about 4-5% of students had not studied any chemistry subject in any official program before their admission to the university, although they indicate that they have basic knowledge of chemistry. Only a small proportion, 1-2%, have no knowledge of chemistry. Moreover, a high proportion, around 70-85%, enrolled in the Chemistry degree because it is their vocation, and it was their first option in university pre-registration. Only 12-25% of students acknowledge that it is not their vocation and that it was not their first option in the university pre-registration. In addition, this degree was not the first choice of about 1-2% of students, but it was the only option for their admission to the university.

Development of the Experience: Didactic Strategy
Within the Moodle platform, a question bank has been created and divided into five thematic blocks that cover all topics of the teaching program (Table 1). Each block has more than 50 questions or items, and over 100 items in the case of the Matter and Thermodynamics blocks. The question bank has over 450 items belonging to four types of Moodle questions: true/false, multiple choice (with multiple responses and single response), matching and numerical. All these items were written according to the scientific competencies required for passing this subject. The set of items is classified, in turn, into two categories: one with questions that cover the basic knowledge of the subject, and the other with more elaborate questions that check the students' skills and abilities in practical reasoning about physical chemistry. In this way, two types of quiz are developed. First, a "basic" quiz (BQ) is proposed for each of the eleven topics. It consists of ten true/false items, with a time limit of one hour. The BQ contains the same questions for all students and is active for a period of one week after the topic is finished in class. Second, a "thematic block" quiz (TBQ) is proposed for each of the five thematic blocks, which are made up of several topics in the teaching program, except those dedicated to chemical kinetics (see Table 1). It has ten items of different types (multiple choice, numerical, matching) chosen at random from a category of the question bank, so it is practically an individual and different test for each student. The multiple choice items have a particular feature: correct and incorrect answers score positively and negatively, respectively, with a value proportional to the number of item options. These quizzes are held on a scheduled day.
In both types of quiz, each item has the same weight of 10% in the final mark. All quizzes are taken outside the classroom and have delayed feedback; that is, the correct answers can only be checked once the test is over for all students. All these activities are carried out continuously throughout the semester according to the physical chemistry program.
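The role of random guessing in the two quiz designs described above can be illustrated with a short calculation. The following Python sketch is not part of the original study. It assumes pure random guessing, a pass mark of 5 correct answers out of 10, and, for single-response multiple choice items, a penalty of 1/(n − 1) per wrong answer on an item with n options. That penalty is only one possible reading of "proportional to the number of item options" and is an assumption here; all function names are illustrative.

```python
from math import comb


def pass_probability_by_guessing(n_items=10, p_correct=0.5, pass_mark=5):
    """Probability of getting at least `pass_mark` of `n_items` right by pure guessing."""
    return sum(
        comb(n_items, k) * p_correct**k * (1 - p_correct) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )


def formula_penalty(n_options):
    """ASSUMPTION: penalty of 1/(n - 1) per wrong answer (not confirmed by the source)."""
    return 1 / (n_options - 1)


def expected_guess_score(n_options, penalty):
    """Expected score on one item (1 if right, -penalty if wrong) when guessing uniformly."""
    p = 1 / n_options
    return p * 1.0 - (1 - p) * penalty


# Ten true/false items without penalty, as in a BQ.
print(pass_probability_by_guessing())   # 638/1024, about 0.623
print(expected_guess_score(2, 0.0))     # 0.5 points per item on average

# Four-option multiple choice item with the assumed penalty, as might appear in a TBQ.
print(expected_guess_score(4, formula_penalty(4)))
```

Under these assumptions, a student guessing every true/false item would still be expected to score about 5 out of 10, whereas the assumed penalty brings the expected score of a guessed multiple choice item to approximately zero.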
All students were informed about the characteristics of Moodle quizzes and how the platform works before doing the activities. In this way, any bias due to students' attitudes towards technology over time would be reduced.

Participation in the Quizzes
Participation was high during the last six academic years (Figure 1), above 50% in every BQ and TBQ, with the exception of the last BQ of the 2017-2018 academic year, which had a participation of 40%.
A detailed analysis by academic year reveals the dynamics and evolution of the participation. Participation falls in the last quizzes, being always slightly lower than in the first ones. This decrease is more striking in the 2016-2017 and 2017-2018 academic years, in which it goes from approximately 85% and 75% in the first BQ to 60% and 40% in the last one, respectively, and from 85% and 70% to 60% and 55% in the TBQs, respectively.
The general trend is a progressive decrease in participation throughout the semester in every academic year. This is due to several factors. First, enrolment in the subject may change, given that some students are waiting for a possible transfer to another degree at the beginning of the course; this process does not materialize until about a month later, and in the meantime they have been taking the quizzes. Moreover, mid-semester partial exams of other subjects are held, so students are immersed in studying those subjects and end up not doing the quizzes, either because the time to do them has passed or because they have not studied. Additionally, by the end of the semester, a large number of students have decided to drop out of the degree in chemistry and are no longer involved in the training activities. The dropout rate in this first-year course of the degree is approximately 20-25%. In the initial survey of the class, 25% of the students consider that the degree in chemistry is not their vocation and that it was not their first option in the university pre-registration.

Statistical and Psychometric Data of the Quizzes and Each Item
The results provided directly by the Moodle platform (https://docs.moodle.org/dev/Quiz_statistics_calculations (accessed on 22 July 2021)) [20,21] have been analyzed and calculated according to classical test theory [32,33]. The Supplementary Materials summarize the definitions of the psychometric parameters. Tables 2 and 3 collect, for each quiz, statistical data such as the average score, the standard deviation (SD), the range of correct answers (maximum and minimum percentage), also called the facility index (FI), and the asymmetry in the distribution of the scores, also called bias, together with the internal consistency coefficient (ICC), or Cronbach's alpha, which gives an idea of the quality of the tests and indicates whether the whole exam is homogeneous. The average score of any BQ in any academic year is high, between remarkable and outstanding (6.51 for BQ-10 and 9.83 for BQ-7 in the 2015-2016 and 2018-2019 academic years, respectively), with a high percentage of correct answers in each quiz that ranges from 70% to 100%, except in the 2015-2016 academic year, where minimum success rates of 15%, 47% and 37% were obtained in BQ-6, BQ-7 and BQ-10, respectively, which correspond to the two most difficult topics to assimilate: thermodynamics and electrochemistry. This large range in the correct answers yields an asymmetric distribution with a negative bias larger than 1 in magnitude in all academic years. This indicates a lack of discrimination among those students who do better than average, and it is due to the fact that most of the items are classified as basic knowledge, and also to the type of question (true/false), which has a 50% chance of a correct random response. The standard deviation is around 20%, except in some cases with a slightly higher value, between 22 and 28%, in the quizzes corresponding to the topics of thermodynamics and electrochemistry.
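As an illustration of the quiz-level quantities discussed above, the following Python sketch computes a facility index, the bias (skewness) of the score distribution and Cronbach's alpha from a small response matrix. It uses textbook classical test theory definitions with population moments; Moodle's own estimators may differ in detail, the data are hypothetical, and the function names are illustrative rather than taken from Moodle.

```python
from statistics import mean, pstdev, pvariance


def facility_index(scores, max_score):
    """Average score expressed as a percentage of the maximum score."""
    return 100 * mean(scores) / max_score


def skewness(scores):
    """Third standardized moment; negative when low scores form a long tail."""
    m, s = mean(scores), pstdev(scores)
    return sum((x - m) ** 3 for x in scores) / (len(scores) * s**3)


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of per-item scores (one row per student)."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 students x 4 dichotomous items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in responses]

print(facility_index(totals, max_score=4))
print(skewness(totals))
print(cronbach_alpha(responses))
```

For this toy matrix the total scores are symmetric around the mean, so the bias is zero, and alpha is about 0.8, which on the percentage scale used for the ICC would correspond to about 80%. A set of scores crowded near the maximum with a few low values, as described for the BQs, would instead give a negative bias.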
The ICC in most quizzes in any academic year is higher than 65%, the minimum value proposed as an indicator of the overall homogeneity of a quiz [34]. However, in some cases, values lower than 65% have been obtained, for example in the BQ- [...] Moreover, the dispersion of the FI and the discrimination index (DI) for each item of any quiz has been analyzed in order to assess how effectively each item distinguishes between students with different cognitive ability (Figure S1, left). Most of the questions have adequate discrimination, with a DI above 30%. A more detailed analysis of the discriminative efficiency (DE) for the items of the different BQs (Figure S2) shows that the effectiveness of the items depends on the academic year and, therefore, on the student population. For example, in the 2015-2016 academic year, none of the items of BQ-1 reaches a DE of 30%, while in the 2018-2019 academic year they are all above 30%. The same behavior has been found in items of other quizzes. Therefore, the same item may or may not be discriminatory for a population of students depending on their level of knowledge, and so questions that have a low DI should not be discarded. It is concluded that quizzes made only with true/false items serve as continuous training activities in the teaching-learning process of a subject, but are not suitable as assessment activities because they do not discriminate between students.
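The item-level indices can be sketched in the same way. The Python example below assumes, following our reading of the Moodle statistics documentation cited above, that the DI is the correlation (as a percentage) between the item score and the score on the rest of the quiz, and that the DE is the same covariance divided by the maximum covariance obtainable when both score lists are sorted. It also assumes items scored 0 or 1. These definitions, the data and the function names are assumptions for illustration and should be checked against Moodle's own output.

```python
from statistics import mean


def covariance(x, y):
    """Population covariance of two equally long score lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def item_statistics(responses, item):
    """FI, DI and DE (all in %) for one item of a students-by-items matrix."""
    scores = [row[item] for row in responses]
    rest = [sum(row) - row[item] for row in responses]  # rest-of-quiz score
    cov = covariance(scores, rest)
    denom = (covariance(scores, scores) * covariance(rest, rest)) ** 0.5
    cov_max = covariance(sorted(scores), sorted(rest))
    return {
        "FI": 100 * mean(scores),  # assumes a maximum item score of 1
        "DI": 100 * cov / denom if denom else None,      # undefined without variance
        "DE": 100 * cov / cov_max if cov_max else None,
    }


# Hypothetical data: 5 students x 4 dichotomous items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

for i in range(4):
    print(i, item_statistics(responses, i))
```

An item that every student answers correctly has no variance, so its DI and DE are returned as `None` here; this is one way in which very easy true/false items can fail to discriminate in a given cohort while still being informative in another.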
Different results are obtained for the TBQs (Table 3). The average score drops significantly with respect to the BQs, from 5.16 [...] Moreover, the FI drops significantly and ranges from 28% in TBQ-5 of the 2014-2015 academic year to 81% in TBQ-1 of the 2016-2017 academic year, and in no case does it reach 100% in any of the items in any quiz. The dispersion in the average scores oscillates around 20%, being slightly higher in the last three TBQs of certain academic years. The asymmetry of the score distribution, the so-called bias, is still negative, but now smaller than 1 in magnitude, becoming slightly positive and giving an almost symmetric distribution, with a bias close to zero, in the last three quizzes of the 2018-2019 academic year, with values between 0.07 and 0.02.
As a general trend, the bias in the first two quizzes, TBQ-1 and TBQ-2, is somewhat higher than in TBQ-3, TBQ-4 and TBQ-5, which have a bias close to zero. This indicates that the topics of the first two blocks are better assimilated than those of the thermodynamics, electrochemistry and kinetics blocks. This effect is probably due to the fact that the first topics have already been studied at the baccalaureate level, while the topics of the last blocks are totally new and demand a greater learning effort.
Therefore, these TBQs are more discriminative between students than BQs; that is, here the cognitive abilities of each student are tested. The FI-DI scatter diagrams are shown in Figure S1, right. Although a low DI value is obtained due to the random nature of the quiz, the detailed analysis of the FI-DI diagrams for all items in any TBQ (Figure S3) reveals that most of the items are discriminative, with a DI value above 30% and a wide range in the FI. However, the ICC is always lower than the reference value of 65% [34]. This is because the items of the quiz are randomly selected by the Moodle platform, so each student sees different questions.

Students' Opinion on the Educational Activities
A survey was conducted among the students to find out their opinion on these two types of virtual quiz. It is a short survey with only eight statements, four for each type of quiz, to which students answer YES or NO. Although this questionnaire has not been validated by the scientific community and its experimental validity has not been demonstrated, it is only intended to capture the students' opinion on Moodle quiz activities in a few ideas. Table 4 shows the average values obtained for the BQ and TBQ, respectively, during the last six academic years. These average values practically coincide with those obtained for each academic year; therefore, the students' perception of the virtual quizzes practically does not change over the years. Regarding the BQ, most of the students, about 90%, state that the difficulty of the items is adequate for a basic level and that the time of one hour is more than enough. These BQs help a lower percentage, 60%, to study the subject continuously, while 30% think that they do not study continuously even though they have to do the online quizzes. However, the BQs serve as a self-assessment of the knowledge acquired in class for almost 70%. Only 5-8% of students do not know or do not answer (NK/NA) for each of the proposed statements. The results obtained for the TBQs are slightly different.
Regarding the TBQ, practically all students, 92%, say that the level of difficulty of these quizzes is greater than that of the BQ. This was to be expected, because one of the goals of the TBQ was to measure the level of abstract reasoning and assimilation of theoretical concepts. Only 50% of the students say that the time limit of one hour is sufficient. On the other hand, these quizzes, scheduled on fixed days, lead more than 60% of students to study in the days before the test, which consolidates the knowledge acquired in the classroom. Only 3% of the students answer NK/NA to the proposed statements.
In short, both types of quiz fulfill the goal of favoring continuous study of the subject throughout the semester, avoiding the general tendency to study only in the days before the written exam. Moreover, carrying out these educational activities in an electronic environment gives students an online-accessible medium that serves as a self-assessment of the level of knowledge acquired.

Final Scores in General Physical Chemistry
Figure 2 shows the final scores in general physical chemistry for the two Ordinary Calls of exams (February and September), for the 2011 to 2014 courses taught with the traditional methodology and for the 2014 to 2020 courses taught with the new methodology based on quizzes.
In the first Ordinary Call (February), a slight decrease is observed under the new methodology in the percentage of students who did not sit the exam (not presented) and also in the percentage of fails. More notably, the number of passes increases, not only with the minimum score but also with remarkable, outstanding and even honors (H) in the last five academic years, particularly in the 2016-2017 academic year, when the percentage of outstanding is higher than in the rest of the courses.
In the second Ordinary Call (September), the percentage of students not presented remains almost constant in all the courses taught with the quiz-based methodology and is practically the same as in the earlier courses taught with the traditional methodology. There is not much variation in the percentages of fails and passes between one methodology and the other, except in the last academic year, 2019-2020, in which the number of students not presented decreases while the number of passes increases. In this September call, under the new methodology with quizzes, there is a low percentage of outstanding students and an absence of scores above remarkable.
These online educational activities help not only students with high cognitive capacity to attain good scores but also those at a medium level to pass the exam in the first call in February. The students who remain for the second call in September are really those who have not assimilated the physical chemistry knowledge and those who find scientific reasoning or deduction difficult.

Conclusions
The use of Moodle quizzes, as online activities, favors the implementation of a different educational methodology in the subject of general physical chemistry of the Degree in Chemistry. However, not all quizzes can be used as assessment tools given that the item type and the quiz settings play an important role. In this work, two different types of quiz are proposed, basic (BQ) and thematic block (TBQ). BQ has only true/false items and these are the same for all students, while TBQ has multiple choice, matching and numerical items that are randomly selected from a category of the question bank.
Statistical and psychometric data provided by the Moodle platform were analyzed. Most of the items show discrimination (DI) and facility (FI) indices within the reference range proposed by Moodle. The FI-DI dispersion graphs, together with the average scores and the bias values, show the different quality of the two types of quiz. The BQs can be used as formative teaching activities, but they do not have enough evaluative quality to distinguish the different capacities and abilities of students, yielding in every quiz high average scores and FI values, and also a large negative bias. TBQs are more discriminative, showing lower values of both FI and average scores, with a bias near zero, and thus they can discern competencies and skills among students, so they could be used as assessment tools. In conclusion, true/false items should not be used as evaluative items. In the future, it will be necessary to study, in a similar way, how each type of item (multiple choice, numerical and matching) contributes individually to the final score of the quiz. This would allow us to select the best type of item for a particular assessment quiz.
Although the methodology applied is weak, the experience accumulated over the years indicates that TBQs work quite well to evaluate physical chemistry knowledge and the different capacities of students, independently of the student population, and the approach could even be extrapolated to other similar scientific settings.
The analysis of statistical data in TBQs, such as the FI, allows the teacher to identify the topics that were not well understood by the students and gives feedback on the students' learning process. Moreover, these quizzes work as a self-assessment for the students, helping them prepare better for the written exam. Taking these quizzes has not entailed an excessive workload for the students, and has led to improved scores in the first Ordinary Call in February compared with those obtained under the traditional methodology in previous courses.
In short, the Moodle statistics indicate that a particular quiz has assessment quality if the score distribution is symmetric, with a bias around zero, and the FI and DI lie in the range of about 40-70%. This can be achieved by using a very large question bank in order to generate a random quiz. Mixed items (multiple choice, numerical and matching) also contribute to achieving a quiz that probes different skills and abilities in physical chemistry. In addition, the similar results obtained in this experience over the years, with different student populations, would also support the validity and reliability of the designed quizzes. This study shows how the analysis of statistical and psychometric parameters makes it possible to check whether an educational activity based on Moodle quizzes can be used as an assessment tool to evaluate the skills and competences of a particular subject throughout the semester.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/educsci11090500/s1, Definition of psychometric parameters, Figure S1: FI-DI diagrams corresponding to the two quiz types, Figure S2: DE diagrams corresponding to the ten items (true/false) of each BQ, Figure S3: FI-DI diagrams of the different items in each TBQ.
Funding: This research and the APC were funded by University of Málaga, through Educational Innovation Projects, PIE15-027 and PIE19-051.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: All data reported in this work are based on the present study and were compiled by the authors.