Article

External Constraints on the Development of Quality Assessment of Students’ Learning in Higher Education

by Juan C. Manrique-Arribas 1, Víctor M. López-Pastor 1,* and Andrés Palacios-Picos 2,*
1 Departamento de Didáctica de la Expresión Musical, Plástica y Corporal, Facultad de Educación de Segovia, Universidad de Valladolid, 40005 Segovia, Spain
2 Departamento de Psicología, Facultad de Educación de Segovia, Universidad de Valladolid, 40005 Segovia, Spain
* Authors to whom correspondence should be addressed.
Educ. Sci. 2025, 15(1), 20; https://doi.org/10.3390/educsci15010020
Submission received: 20 November 2024 / Revised: 18 December 2024 / Accepted: 25 December 2024 / Published: 28 December 2024

Abstract: Learning-oriented assessment models, which tend to generate better learning processes, better academic performance and greater student involvement in teaching–learning processes, are increasingly being used in European universities; they are also more suitable for competence-based learning. However, a number of constraints hinder their implementation, either internal (lack of teacher training and of the attitude or motivation to change teaching and assessment methodology) or external (the lecturers’ teaching load and group size, i.e., the number of students per group). Focusing on the latter, the aim of this study was to determine to what extent the teaching load and the number of students per group condition whether quality assessment systems are used at university. A questionnaire on assessment systems and instruments with a high level of reliability and validity was administered to a large sample of teaching staff from numerous Spanish universities. Multivariate analyses were carried out to address the research objective. The results, based on an ex post facto design, show that quality assessment systems are advancing. Furthermore, the conclusion was reached that the teaching load and the number of students to be assessed correlate significantly and positively with the use of assessment systems based on closed question exams. Both factors correlate negatively with the use of continuous and formative assessment (FA) systems, although there is considerable variability in the latter case and the results are not clearly conclusive, which suggests the need for further research on this topic and on the sustainability of FA.

1. Introduction

In recent years, we have witnessed a growing interest in everything to do with higher education (HE). Undoubtedly, the great challenge is to increase the level of student learning through teacher training that connects educational research and knowledge with teaching practice (García-Rodríguez et al., 2023). Meeting this commitment requires reconsidering the teaching framework and reformulating methodologies so that they are based more on producing learning than on teaching techniques. Improving the teaching and learning process in HE is a multifaceted challenge that involves a variety of aspects, ranging from pedagogical innovation, with the application of a diversity of teaching styles, to the application of new technologies. This requires a shift from a teacher-centred orientation to empowering students to be the true protagonists of their learning, providing them with timely and meaningful feedback, as well as creating learning environments that foster their autonomy and critical thinking (Mendioroz-Lacambra et al., 2022). In this context, assessment methods also need to be updated and adapted with new approaches. Lecturers should move towards assessment for learning (AfL) systems that are more relevant to students and give them the opportunity to become involved in the process so that, through reflection, they become more aware of their progress and difficulties. However, in HE there is still a confluence of assessment techniques and strategies, ranging from traditional written tests with developmental, short or multiple-choice questions to more alternative approaches such as formative assessment (FA) (López-Pastor, 2009), competence-based assessment (Menzala-Peralta et al., 2023) or project-based assessment (Lara-Navarra et al., 2024), which can provide greater knowledge and motivation in the student learning process.
If we refer to the assessment of student learning, the term most used in the international literature is “assessment”. It is also possible to find documents that use the term “evaluation” as a synonym, although this term is more commonly used to refer to the evaluation of programmes, schools, universities, etc.
In line with these challenges, a number of new terms have emerged in university teaching over the last decades, such as “Learning Oriented Assessment” (an assessment system focused on students’ learning); “Integrated Assessment” (assessment that is integrated within the teaching–learning process and forms part of it); “Assessment for Learning” (educational assessment clearly directed at enhancing students’ learning rather than merely checking and grading their performance); or “Formative Assessment” (assessing in order to improve (1) student learning, (2) the teacher’s teaching competence and (3) the development of the course) (Brown, 2015; Carless, 2015; López-Pastor, 2009). In the literature, a distinction is often made between “assessment of learning” and “assessment for learning” (AfL), with the latter having a formative function, while the former assumes a summative function. There seems to be some consensus that both types of assessment are often used simultaneously in higher education institutions. A question that is often raised when both formative and summative assessment practices are used in continuous assessment is the extent to which student learning can be facilitated through feedback. FA emphasises providing feedback and feedforward during the learning process to enhance understanding and self-regulation, generating positive effects on learning and academic performance (Molina-Soria et al., 2020). Sortwell et al. (2024) reviewed meta-analyses on the impact of FA on students and its sustainability and found that its positive effects on learning range from small to large (depending on the type of FA), that negative effects are never identified, and that it generates numerous benefits.
However, these new models of AfL do not seem to be very widespread in HE. San Martín et al. (2016) analysed 552 teaching guides from different Spanish universities and found that the final exam continues to be a key instrument in assessment and that the continuous assessment system is not the most widely used. By branches of knowledge, the social science disciplines are the ones that use the different types of exams as assessment systems most. Science subjects are the ones that use the final exam the least and work more with practical laboratory tests. The humanities branch is more inclined to use oral tests, practical tests and class participation. Lastly, the engineering branches preferably use midterm exams and electronic resources.
However, we are witnessing a slow transition from assessment models that are exclusively oriented towards the final grade to more novel approaches that are learning-oriented. In fact, different styles of teaching and assessment currently coexist in universities. For example, Palacios and López-Pastor (2013) found three main typologies of university teaching staff in teacher education faculties: “innovative teaching staff” (using formative and continuous assessment systems); “traditional teaching staff” (still using traditional final examination models); and “eclectic teaching staff” (introducing changes in assessment systems but not in a clear and comprehensive way). The percentages of teaching staff are similar in the three groups, and the variable that seems to be most influential for belonging to the “innovative teaching staff” group is participation in continuous training activities on university teaching, especially seminars and teaching innovation projects. Crespí and García-Ramos (2021) noted the majority use of a mixed model by teaching staff in the area of social sciences and law, with competence-based assessment coexisting with traditional assessment. The same result is found in degrees in the area of sciences (Quevedo-Blasco et al., 2015). In this regard, Gozalo-Delgado et al. (2022) considered that the incorporation of Spanish universities into the European Higher Education Area (EHEA) has led to a decrease in the use of final exams as the main assessment system, despite the fact that those who use this mode of assessment consider the exam to be a “supposedly” objective, valid and reliable instrument for verifying what has been learnt. This reflects the erroneous belief that assessment is to be identified with marking as a means of accrediting results and selecting students (Vain, 2016).
However, in order to move towards new AfL models in HE, it is necessary to analyse what conditioning factors may exist. Some of these may be internal (the lecturer’s education and conceptions, attitude towards the use of learning-oriented assessment systems, experience, etc.) and others external (teaching hours, number of students per group, professional situation, etc.). In this context, only a limited amount of research has attempted to determine the factors involved in this transition from apparently obsolete assessment models to AfL ones. Along these lines, Palacios and López-Pastor (2013) and Ibarra-Saiz and Rodríguez-Gómez (2014) concluded that the most decisive element in the use of learning-oriented assessment systems is the continuous training of university teaching staff and their participation in teaching innovation projects. Margalef (2014) pointed out that the main obstacle to the implementation of quality assessment is the teaching staff’s beliefs and conceptions about teaching–learning processes and more effective assessment. Pozuelos-Estrada et al. (2021) considered that most university teaching staff do not feel equipped to encourage student participation in this process. It is precisely this perceived inability and lack of teaching experience that leads to the greatest resistance to the development of formative assessment (Hidalgo-Apunte, 2021).
In any case, some lecturers consider that there are a number of factors influencing why they prefer to use more traditional models of evaluation in HE: (a) lack of time, as they perceive they have heavy workloads due to class preparation, research tasks and dedication to other academic and management responsibilities (Otero-Saborido et al., 2023); (b) poor training, which makes them feel insecure, as they are unfamiliar with formative and shared assessment techniques and methodologies (Margalef, 2014); (c) the traditional academic culture more oriented towards expository teaching and summative final exam assessment (Legarda-López, 2021); (d) the lack of technological resources to be able to implement FA (Fuentes-Agustí, 2019); (e) resistance to change on the part of teaching staff, due to comfort with the use of traditional methods, the perception that change will require too much effort or the lack of incentives to motivate lecturers to implement innovative assessment models (Mayorga-Fernández et al., 2023); and (f) the type of subject, where FA is more commonly applied in practical subjects than in theoretical ones (Quevedo-Blasco et al., 2015).
But undoubtedly, one of the factors pointed out as most relevant to the reluctance to use formative and continuous assessment systems is the lecturers’ belief that they involve a high level of dedication in terms of time and resources (Carless, 2015). Quevedo-Blasco et al. (2015) and Quevedo-Blasco and Buela-Casal (2017) point out that one of the concerns of the teaching staff is the greater workload that these new ways of teaching in HE entail, especially with regard to assessment. They report not being sufficiently prepared to design and construct assessment instruments for learning, with differences depending on experience: those with fewer years of practice have the fewest resources for carrying it out, although they are the ones who attach greater importance to the assessment task. On the other hand, Pantoja-Vallejo et al. (2020) point out that the challenge of improving the assessment system and carrying it out in a formative and continuous way does not represent an excessive workload for lecturers. The feeling conveyed by the teaching staff is that, when these FA systems are used, the workload is greater than it was with traditional assessment systems (limited to a single final exam), which is true in absolute terms; however, studies show that the use of formative and continuous assessment systems does not generate an excessive workload and is viable within the working hours stipulated in the contract (Pantoja-Vallejo et al., 2020; Vera-Cazorla, 2021).
These studies seem to indicate that the use of formative and continuous assessment systems can be sustainable over time, that is, they do not imply an overload of work for students and lecturers. It is precisely this sustainability that is one of the seven procedural principles of the “formative and shared assessment” model (López-Pastor, 2009). However, other concepts related to the sustainability of assessment can be found. The works of Boud (2000) and Boud and Soler (2015) introduce the concept of “Sustainable Assessment”, referring to assessment that encompasses the skills needed to carry out the activities that necessarily accompany lifelong learning in formal and informal settings. It has a very strong connection with the French and Spanish concept of “formative assessment”, which has a dual function: (1) to assess student learning and (2) to help students develop the competence of learning to learn and thus develop their capacity for self-assessment and self-regulation in learning (López-Pastor, 2009; Sanmartín, 2007). According to Boud (2000), assessment events should both respond to the specific and immediate objectives of a course and lay the groundwork for students to carry out their own assessment activities in the future. In a later work, Boud and Soler (2015) explain that the concept of “sustainable assessment” refers to assessment that meets the needs of the present in terms of formative and summative assessment requirements but also prepares students to meet their own learning needs in the future. For their part, Rodríguez-Gómez and Ibarra-Sáiz (2015) analyse the relationship between assessment and learning in order to move towards “Sustainable Learning in Higher Education”. Following this analysis, they make a proposal that they call “assessment as learning and empowerment”, which is based on three key challenges: (1) the involvement of students in the assessment of their own learning; (2) feedforward, which focuses on providing feedback on assessment results that can be used proactively; and (3) the production of high-quality assessment tasks.
The number of teaching hours for lecturers can also have an impact on the implementation of quality formative and continuous assessment. Thus, Palacios et al. (2013) show significant differences in relation to the number of teaching hours per week and the assessment model used. For example, lecturers who use written final examinations as the main assessment system tend to have a higher weekly teaching load than other members of the teaching staff. In this respect, Asare and Afriyie (2023) analysed the significant challenges for lecturers in the implementation of formative assessment in the classroom, finding that they include workload, number of students and teaching load. Similar results can also be found in Rahman et al. (2021).
As we have just pointed out, interest in assessment in higher education has clearly increased in recent years. New forms of assessment, more focused on the student and their capacity to improve learning, have emerged strongly and are struggling to find their place in learning processes as complex as those that characterise university education. But these new strategies have not become as widespread as some would like, and they coexist in the classroom with others which, without entering into value judgements, could well be called classic because of the length of time they have been present in the university. Despite the interest in these new proposals, little research has analysed the barriers or impediments to their generalisation. Regardless of the complexity of the subject, two types of factors that may facilitate or inhibit their use are currently considered: internal factors, related to continuing training or attitudes towards assessment, and external factors, such as the lecturer’s teaching load or the number of students to be assessed.
In this paper, we analyse the correlation between these two external factors (the number of students and the teaching load) and the use of one assessment strategy or another, which, as we have pointed out, is currently a major gap in the field of assessment in higher education. Our research questions can be summarised as follows:
RQ1—Which are the most and least frequently used assessment and marking strategies in university classrooms?
RQ2—What is the place of new assessment strategies focused on learning, such as formative assessment, among the strategies currently found in university classrooms?
RQ3—Is the number of teaching hours delivered by a university lecturer a factor associated with the use of assessment strategies and instruments?
RQ4—Is the number of students to be assessed by a university lecturer a factor associated with the use of assessment strategies and instruments?

2. Materials and Methods

2.1. The Participants

The data for this study were obtained from a convenience sample of 469 lecturers from 21 universities throughout Spain participating in a national research and innovation project. These universities accounted for 24.4% of the total possible (50 public and 36 private universities). Of the participating centres, the surveys and data collected represented 15% of this total. The lecturers, predominantly men (63%), had an average age of 44 years. By area of knowledge, social and legal sciences predominated (63%), followed by science (17%), arts and humanities (12%) and health sciences (8%).

2.2. Materials and Design

The research data were obtained through an ex post facto research design. The data were collected by means of a questionnaire that was completed anonymously by the lecturers, who were duly informed of the objectives of the research and the purely statistical treatment of their answers. The questionnaires were answered at the end of the second semester and collected by the research team and collaborating lecturers. One part of the questionnaire consisted of identification data and questions relating to the weekly workload and the total number of students to be assessed per year; the second consisted of a scale on the use of marking and assessment procedures. Specifically, it was an adaptation of the proposal by Palacios et al. (2013) consisting of 13 Likert-type questions with 5 alternatives (0 = no use; 4 = very frequent use) and which, according to the authors, presents adequate reliability (α = 0.86) and validity indices measured by means of a confirmatory factor analysis (chi-squared 57.88; p = 0.064). The authors’ final version presented a 4-factor structure: (1) “Assessment based on closed question exams”; (2) “Continuous and formative assessment”; (3) “Assessment with assignments and essays”; (4) “Assessment with open question exams with or without notes”.
Prior to the calculations with the sample data, an exploratory factor analysis carried out by the research team was performed on the 13 questions of the original questionnaire proposed by Palacios et al. (2013). The final values of the factor loadings and their Omega coefficients are presented in Table 1. Both the aforementioned loadings and the values of the reliability of the scales measured by McDonald’s Omega coefficient are within the acceptable range for this type of measure. However, one of the original questions was removed from the final version of the questionnaire, due to inadequate values.
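As a check on the reliability figures in Table 1, McDonald's Omega for each subscale can be reproduced directly from the published factor loadings. The sketch below is not the authors' code; it simply applies the standard one-factor (congeneric) omega formula to the loadings reported in Table 1, so small deviations from the published values are expected because the loadings are rounded to two decimals.

```python
# Minimal sketch, not the authors' code: McDonald's omega per subscale, computed
# from the loadings published in Table 1 under a congeneric one-factor model.
import numpy as np

def mcdonald_omega(loadings):
    """omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    lam = np.asarray(loadings, dtype=float)
    common = lam.sum() ** 2          # variance attributable to the common factor
    error = (1.0 - lam ** 2).sum()   # sum of item error variances
    return common / (common + error)

factors = {
    "Closed question exams (P66, P63, P65)": [0.88, 0.81, 0.74],
    "Continuous and formative assessment (P69, P62, P610, P611)": [0.88, 0.85, 0.76, 0.69],
    "Assignments and essays (P612, P613)": [0.87, 0.87],
    "Open question exams (P67, P68, P64)": [0.75, 0.66, 0.60],
}

for name, lam in factors.items():
    print(f"{name}: omega = {mcdonald_omega(lam):.2f}")
```

Run as is, this yields approximately 0.85, 0.87, 0.86 and 0.71, close to the 0.85, 0.87, 0.88 and 0.71 reported in Table 1; the small gap for the third factor is consistent with rounding of the published loadings.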

2.3. Data Analysis

The SPSS 29 statistical package was used for data analysis. Pearson correlations were calculated between the scores of the four subscales of the assessment-grading systems and the variables number of students and weekly teaching hours. As a complement to these calculations, correspondence analysis was carried out in those cases where the correlations were statistically significant. The latter is an exploratory technique aimed at reducing a large amount of information to a small number of dimensions, generally two, which are represented in diagrams. In our case, the value of this technique is that it allows us to visualise, in a simple way, on a pair of coordinate axes (or factors), the closeness of the different assessment systems to both the number of students to be assessed and the lecturer’s teaching load. A significance level of 5% was used for all statistical inferences.
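The correlational part of this analysis is straightforward to reproduce outside SPSS. The following sketch is purely illustrative (the study itself used SPSS 29): it computes the Table 6 correlations between the four subscale scores and the two external variables. The file name and column names are hypothetical placeholders, not the actual variable names in the study's data file.

```python
# Illustrative reanalysis sketch (the study used SPSS 29); file and column names
# are hypothetical placeholders, with one row per lecturer.
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("lecturer_survey.csv")

subscales = [
    "closed_question_exams",            # mean of P66, P63, P65
    "continuous_formative_assessment",  # mean of P69, P62, P610, P611
    "written_assignments_essays",       # mean of P612, P613
    "open_question_exams",              # mean of P67, P68, P64
]
external = ["n_students_category", "teaching_hours_week"]

for sub in subscales:
    for ext in external:
        valid = df[[sub, ext]].dropna()
        r, p = pearsonr(valid[sub], valid[ext])
        print(f"{sub} vs {ext}: r = {r:.2f}, p = {p:.3f}")
```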

3. Results

The variable number of teaching hours per week presented a distribution whose most relevant data are summarised in Table 2.
Data on the total number of students to be assessed were collected using an ordinal scale with four categories (1 = 80 or fewer; 2 = 81 to 120; 3 = 121 to 160; and 4 = 161 or more students). In addition, for the calculations based on correspondence analysis, the original values of the variable weekly teaching load were categorised, which, crossed with the categories of the variable number of students, resulted in the contingency table summarised in Table 3.
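For readers who want to see how coordinates of the kind plotted in Figures 1 to 4 are obtained, the sketch below runs a standard SVD-based correspondence analysis, using the Table 3 contingency (teaching-hour bands crossed with student-number bands) as illustrative input. This is not the SPSS routine used in the study, and the paper's own analyses cross assessment-use categories with these variables, but the mechanics of computing row and column coordinates are the same.

```python
# Sketch of correspondence analysis via SVD on the Table 3 counts
# (rows: weekly teaching-hour bands; columns: student-number bands);
# not the SPSS procedure used by the authors.
import numpy as np

counts = np.array([
    [88, 11,  7,  0],   # 5 or fewer hours/week
    [ 9, 54, 58, 14],   # 6 to 7 hours/week
    [ 5, 44, 43, 40],   # 8 to 9 hours/week
    [ 0,  8, 24, 64],   # 10 or more hours/week
], dtype=float)

P = counts / counts.sum()                      # correspondence matrix
r = P.sum(axis=1)                              # row masses
c = P.sum(axis=0)                              # column masses
S = np.diag(r ** -0.5) @ (P - np.outer(r, c)) @ np.diag(c ** -0.5)  # standardised residuals
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

row_coords = (np.diag(r ** -0.5) @ U) * sv     # principal coordinates of the rows
col_coords = (np.diag(c ** -0.5) @ Vt.T) * sv  # principal coordinates of the columns

print("Singular values:", np.round(sv, 3))
print("Row coordinates (first two dimensions):\n", np.round(row_coords[:, :2], 3))
print("Column coordinates (first two dimensions):\n", np.round(col_coords[:, :2], 3))
```

Plotting the first two columns of row_coords and col_coords on the same axes produces the kind of row-and-column diagram used in the figures, in which mutually attracting categories appear close together.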
Continuing with this first descriptive approach, we analysed the greater or lesser use of the different assessment and marking strategies. As can be seen in Table 4, the use of reports and written assignments was the most common technique in the universities in the sample, with an average of 3.10, which is equivalent to very frequent use. On the other hand, oral exams barely reached one point as an average value, equivalent to little or no use.
The mean values of the four subscales complement the previous data, so that the use of written assignments is high, while continuous and formative assessment has low values and the use of examinations with open questions is very low (Table 5).
We now turn to the analysis of the relationships between the different assessment-grading systems and the teaching load and the number of students to be assessed. As shown in Table 6, assessment with closed question exams has positive and significant correlations with the number of students (r = 0.19; sig. = 0.000) and with the number of teaching hours per week (r = 0.16; sig. = 0.001); given that the correlation in both cases is positive, it can be concluded that the use of these assessment systems based on closed question exams increases as the number of students to be assessed increases and as the lecturer’s weekly teaching load increases.
Continuous and formative assessment correlates statistically significantly with both the number of students (r = −0.10; sig. = 0.043) and the total teaching hours per week (r = −0.10; sig. = 0.047); in both cases, the correlation is negative; therefore, the use of this assessment strategy is present to a greater extent among lecturers with a smaller number of students and a smaller teaching load.
No significant correlations were found between assessment with written assignments or assessment with open-ended questions with the number of hours or the number of students to be assessed (Table 6).
As a complement to the previous calculations, a “correspondence analysis” was carried out between the two assessment strategies with significant correlations (exams with closed questions and formative assessment) and the number of students and the lecturer’s teaching load. In the case of the relationship between the number of students and the use of exams with closed questions, we found four groups of possible topographical affinities (Figure 1). Two of them show a close relationship between lecturers with a small number of students to assess and little or no use of this assessment system (G11 and G12); the other two groups, made up of lecturers with a higher number of students to assess (G13 and G14), are associated with frequent use of this assessment with closed questions. These results ratify the correlation obtained between the number of students and the use of assessments with closed question exams, which is more common among lecturers with a higher number of students to assess.
The significant correlation between assessment with closed questions and the number of teaching hours per week can be represented, as in the previous case, by means of the row and column graph of the correspondence analysis (Figure 2). We can consider three groups in this graph. Scant or low use of closed-question assessment (G21) is associated with few teaching hours (five or fewer hours per week). In a second grouping (G22), the highest use of closed-question assessment corresponds to lecturers with an average number of teaching hours per week (6 or 7 h per week). In the third grouping (G23), we found a close association between those who use this technique frequently and teaching loads of eight or more hours per week. These results clarify the low but significant correlation between closed question tests and the teaching load mentioned above, driven especially by the closeness between a low teaching load and little or no use of this type of assessment.
As we have noted, continuous and formative assessment correlates inversely with the number of students to be assessed; the representation of this possible topographical closeness is summarised in Figure 3. Three affinity groups can be established in this two-dimensional space: the first (G31) topographically represents the closeness between frequent use of continuous and formative assessment and the smallest numbers of students to be assessed in the sample; the second (G32) shows topographic closeness between lecturers with average student loads and little use of this assessment strategy; the third (G33) represents the closeness between lecturers with a high number of students and little or no use of this continuous and formative assessment, ratifying the inverse correlation between both variables. It is noteworthy that very frequent use of continuous assessment is not associated with any particular number of students.
Figure 4 shows the spatial distribution of the teaching load of lecturers and the use of continuous assessment using the row and column diagram. We recall that the relationship is significant and inverse. Again, we find a common space between a frequent use of continuous assessment and a small number of teaching hours per week (G41). Lecturers with some use (G42) would be close to those with medium teaching loads (6 or 7 h per week); lecturers with higher teaching loads (8 or more hours) share space with little or no use of continuous assessment (G43). As in the previous analysis, the relationship between the use of continuous assessment and a small teaching load is well represented, and the relationship between other teaching loads and the use of this strategy is blurred, although the sign of the concomitance is clear: the higher the number of teaching hours, the lower the use of continuous and formative strategies.
Looking further into the correlations between the different assessment strategies and the teaching load and the number of students, we now analyse these possible examples of concomitance using each strategy separately for each of the four factors (Table 7).
The assessment strategies that form part of the subscale we have called “assessment with closed question exams” have significant correlations with both the number of students and the teaching load; in all cases, the number of teaching hours per week and the number of students to be assessed correlate positively with multiple-choice tests, short-question exams and closed question exams (Table 7).
With regard to the four strategies that form part of “continuous and formative assessment”, we found significant correlations between these assessment systems and the number of students for practical work with feedback (r = −0.12; sig. = 0.011), classroom participation (r = −0.10; sig. = 0.036) and the use of portfolios (r = −0.10; sig. = 0.042). However, the correlation between formative assessment and teaching load is significant only for the strategy based on practical work with feedback (r = −0.10; sig. = 0.034). Field notebooks do not correlate significantly with either the number of students or the teaching load. In all the significant cases, the correlations are inverse, so that greater use of this assessment strategy is related to a smaller number of students and fewer teaching hours per week.
Exams with open questions correlate significantly and directly with both the teaching load (r = 0.14; sig. = 0.003) and the number of students to be assessed (r = 0.13; sig. = 0.006). This correlation is also significant but negative in the case of oral exams and the weekly teaching load (r = −0.13; sig. = 0.006), so that it is more present as the weekly teaching load decreases (Table 7).
The two assessment strategies related to the submission of written assignments and essays do not correlate significantly with either the number of teaching hours or the number of students to be assessed.

4. Discussion

With regard to which assessment and marking strategies are the most and least used in university classrooms, the results show that the most widespread technique is the preparation of reports and assignments, with high usage ratings also for essays based on texts and multiple-choice tests. The least used are oral examinations and written examinations with documents available. In a complementary way, the second research question sought to find out what place the new assessment strategies focused on learning (such as formative assessment) occupy among the strategies currently found in university classrooms. If we take into account the four subscales obtained, we see that the specific subscale of “continuous and formative assessment” is in third place, with average use; the subscale that makes indirect reference to these alternative systems (“assessment with written assignments”) is by far the most widespread, with very high values; and the subscale that refers to the more traditional model (“assessment with closed question exams”) is the second most used, with average values.
As we can see, there seems to be an evolution in universities towards the increased use of new assessment systems, although the traditional model of final exams is still quite common, in line with Granero-Gallegos et al. (2021) and San Martín et al. (2016). This progressive evolution towards more learning-focused assessment models can also be found in the studies of Crespí and García-Ramos (2021) and Gozalo-Delgado et al. (2022). A very important challenge for our universities is to continue to evolve from a culture of examination to a culture of assessment, the latter being understood as formative assessment, aimed at improving the students’ learning processes and encouraging their participation in their assessment (Asiú-Corrales et al., 2021; Dochy et al., 2002; Turra-Marín et al., 2022).
With regard to research questions 3 and 4, the results show a significant correlation between the number of students and the lecturer’s teaching load, on the one hand, and the use of assessment-marking systems based on closed question exams, on the other. On the contrary, continuous and formative assessment systems correlate negatively with both the teaching load and the number of students; therefore, this formative assessment strategy is used to a greater extent by lecturers with a smaller number of students and a smaller teaching load. The results of the correspondence analysis reinforce this finding: external variables such as teaching load and number of students help to explain the greater use of traditional assessment systems, and the lesser use of formative and continuous assessment, as both variables increase.
These results coincide with those found by Palacios et al. (2013), who conclude that lecturers with a more traditional profile (marking and exam-oriented) have a significantly higher weekly load of hours than the innovative and eclectic profiles. This greater “time and demographic pressure” would lead them to make more frequent use of traditional assessment systems, which require less time and dedication. In this regard, Gibbs and Simpson (2005) state that, when written feedback is given to students, there is a direct relationship between increased teaching workload and increased student numbers. This makes some lecturers wonder whether it is worth so much effort on their part. These results seem to partly justify a widespread belief among university lecturers that continuous and formative assessment is incompatible with a high weekly teaching load (Fraile-Aranda et al., 2018; Margalef, 2014).
However, it cannot be overlooked that, despite the presence of significant correlations, both the use of assessment systems based on final exams and that of the more innovative systems based on continuous and formative assessment are only weakly associated with these external factors, because of the low magnitude of the correlations and, at least in some cases, their diffuse graphic representation, so our conclusions should be qualified.
Our results strengthen the hypothesis that the external variables of teaching load and number of students explain, in part, the use of one or the other system to a greater or lesser extent. However, for the reasons we have just mentioned, it can be assumed that other variables, such as initial or continuous training or attitude towards assessment, may be influencing this same process of using one system or the other, as shown in other studies (Ibarra-Saiz & Rodríguez-Gómez, 2014; Palacios & López-Pastor, 2013). These studies indicate that the most decisive element in the use of learning-oriented assessment systems is the in-service training of university teaching staff and their participation in teaching innovation projects.

5. Conclusions

The improvement of university teaching entails a series of changes aimed at improving students’ learning processes and, consequently, the assessment systems applied. However, there is still a long tradition among teaching staff of using final or summative assessment systems based on closed question exams.
The changes taking place in HE indicate that continuous and formative assessment improves students’ learning and academic performance, usually generating greater motivation as they receive constant feedback that helps them to progress in the acquisition of their competences. However, our results show that this assessment model has not yet been implemented with adequate generalisation among university teaching staff. These same results also indicate that the teaching load and the number of students to be assessed are factors that seem to condition some assessment-grading systems; specifically, the greater the teaching load and the greater the number of students, the greater the use of exams with closed questions and the lesser the use of continuous and formative assessment systems, although there is much variability in the latter case, and the results do not seem clearly conclusive.
However, given that in both cases the correlation values are significant but not high, we must be cautious in their interpretation. As we have indicated on several occasions, as assessment is a complex process, other variables may be influencing the use of one strategy or another. We point out, among others, what we have called internal conditioning factors, such as in-service teacher education or their beliefs and attitudes towards assessment.
This aspect can be considered a limitation of our study. Another limitation may be the influence of the degree programme in which the teaching is delivered on the use of one or another assessment system. As a prospective line of research, it would be interesting to investigate in greater detail the influence of internal conditioning factors on the use or non-use of FA systems, as well as to explore how other tasks that a university lecturer must undertake (related to research, management and/or university extension) may influence the use of the different assessment systems. The use of semi-structured interviews with a small sample could help to better understand why lecturers enact their assessments as they do.
Some studies seem to indicate that a solution would be better continuous training in FA and AfL.

Author Contributions

Conceptualisation, all; methodology, A.P.-P.; software, A.P.-P.; validation, all; formal analysis, all; investigation, all; resources, all; data curation, all; writing—original draft preparation, all; writing—review and editing, all; visualisation, all. All authors have read and agreed to the published version of the manuscript.

Funding

This paper has been made possible thanks to participation in the project RTI2018-093292-B100, within the State Programme for Research, Development and Innovation Oriented to the Challenges of Society, in the framework of the State Plan for Scientific and Technical Research and Innovation, 2018–2022, of the Ministry of Education, Science and Innovation.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board (Ethics Committee of Research of Comunidad Autónoma de Aragón; protocol code C.P.-CIPI.21/377, and date of approval: 2021).

Informed Consent Statement

Informed consent was obtained from all subjects involved in this study.

Data Availability Statement

The data have been presented in the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Asare, E., & Afriyie, E. (2023). Barriers to basic school teachers’ implementation of formative assessment in the Cape Coast Metropolis of Ghana. Open Education Studies, 5(1), 20220193.
2. Asiú-Corrales, L. E., Asiú-Corrales, A. M., & Barboza-Díaz, Ó. A. (2021). Evaluación formativa en la práctica pedagógica: Una revisión bibliográfica. Conrado, 17(78), 134–139.
3. Boud, D. (2000). Sustainable assessment: Rethinking assessment for the learning society. Studies in Continuing Education, 22(2), 151–167.
4. Boud, D., & Soler, R. (2015). Sustainable assessment revisited. Assessment & Evaluation in Higher Education, 41(3), 400–413.
5. Brown, S. (2015). Learning, teaching and assessment in higher education: Global perspectives. Palgrave.
6. Carless, D. (2015). Exploring learning-oriented assessment processes. Higher Education, 69(6), 963–976.
7. Crespí, P., & García-Ramos, J. M. (2021). Competencias genéricas en la universidad. Evaluación de un programa formativo. Educación XX1, 24(1), 297–327.
8. Dochy, F., Segers, M., & Dierick, S. (2002). Nuevas vías de aprendizaje y enseñanza y sus consecuencias: Una era de evaluación. Revista de Docencia Universitaria REDU, 2(2), 13–30.
9. Fraile-Aranda, A., Aparicio-Herguedas, J. L., Asún-Dieste, S., & Romero-Martín, R. (2018). La evaluación formativa de las competencias genéricas en la formación del profesorado de educación física. Estudios Pedagógicos (Valdivia), 44(2), 39–53.
10. Fuentes-Agustí, M. (2019). En la universidad, ¿cómo mejorar la evaluación formativa mediante el uso de las tecnologías de la información y la comunicación? Revista Infancia, Educación y Aprendizaje, 5(2), 521–529.
11. García-Rodríguez, M. P., Coronel, J. M., Gómez-Hurtado, I., & González-Falcón, I. (2023). 20 años sobre el impacto de la investigación educativa en la práctica. Algunas recomendaciones y propuestas de mejora. REICE. Revista Iberoamericana sobre Calidad, Eficacia y Cambio en Educación, 22(1), 121–140.
12. Gibbs, G., & Simpson, C. (2005). Conditions under which assessment supports students’ learning. Learning and Teaching in Higher Education, 1, 3–31.
13. Gozalo-Delgado, M., León-del Barco, B., & Romero-Moncayo, M. (2022). Buenas prácticas del estudiante universitario que predicen su rendimiento académico. Educación XX1, 25(1), 171–195.
14. Granero-Gallegos, A., Hortigüela-Alcalá, D., Hernando-Garijo, A., & Carrasco-Poyatos, M. (2021). Estilo docente y competencia en Educación Superior: Mediación del clima motivacional. Educación XX1, 24(2), 43–64.
15. Hidalgo-Apunte, M. E. (2021). Reflexiones acerca de la evaluación formativa en el contexto universitario. Revista Internacional de Pedagogía e Innovación Educativa, 1(1), 189–210.
16. Ibarra-Saiz, M. S., & Rodríguez-Gómez, G. (2014). Modalidades participativas de evaluación: Un análisis de la percepción del profesorado y de los estudiantes universitarios. Revista de Investigación Educativa, 32(2), 339–362.
17. Lara-Navarra, P., Sánchez-Navarro, J., Fitó-Bertran, Á., López-Ruiz, J., & Girona, C. (2024). Explorando la singularidad en la educación superior: Innovar para adaptarse a un futuro incierto. RIED-Revista Iberoamericana de Educación a Distancia, 27(1), 115–137.
18. Legarda-López, N. C. (2021). Didácticas funcionales vs. enseñanza tradicional con clase expositiva en el ámbito universitario. Revista UNIMAR, 39(2), 268–285.
19. López-Pastor, V. M. (2009). Evaluación formativa y compartida en educación superior: Propuestas, técnicas, instrumentos y experiencias (Vol. 21). Narcea Ediciones.
20. Margalef, L. (2014). Evaluación formativa de los aprendizajes en el contexto universitario: Resistencias y paradojas del profesorado. Educación XX1, 17(2), 35–55.
21. Mayorga-Fernández, M. J., Sepúlveda-Ruiz, M. P., & García-Vila, E. (2023). La evaluación formativa: Una actividad clave para tutorizar, acompañar y personalizar el proceso de aprendizaje. ENSAYOS. Revista de la Facultad de Educación de Albacete, 38(1), 80–97.
22. Mendioroz-Lacambra, A., Napal-Fraile, M., & Peñalva-Vélez, A. (2022). La competencia investigativa del profesorado en formación: Percepciones y desempeño. Revista Electrónica de Investigación Educativa (REDIE), 24(28), 1–14.
23. Menzala-Peralta, C. C., Ortega-Menzala, E., Menzala Peralta, R. M., & Solís Trujillo, B. P. (2023). Evaluación basada en competencias en educación superior. Horizontes Revista de Investigación en Ciencias de la Educación, 7(28), 836–851.
24. Molina-Soria, M., Pascual, C., & López-Pastor, V. M. (2020). El rendimiento académico y la evaluación formativa y compartida en formación del profesorado. ALTERIDAD. Revista de Educación, 15(2), 204–215.
25. Otero-Saborido, F. M., Rodríguez-Bies, E., Gallardo-López, J. A., & López-Noguero, F. (2023). Percepción del alumnado de Educación Física sobre la carga de trabajo y Evaluación Formativa en Flipped Learning. Retos: Nuevas Tendencias en Educación Física, Deporte y Recreación, 50, 298–305.
26. Palacios, A., & López-Pastor, V. M. (2013). Haz lo que yo digo, pero no lo que yo hago: Sistemas de evaluación del alumnado en la formación inicial del profesorado. Revista de Educación, 361, 279–305.
27. Palacios, A., López-Pastor, V. M., & Barba, J. J. (2013). Análisis de las diferentes tipologías de profesorado universitario en función de la evaluación aplicada a los futuros docentes. Estudios sobre Educación, 24, 173–195.
28. Pantoja-Vallejo, A., Molero, D., Molina-Jaén, M. D., & Colmenero-Ruiz, M. J. (2020). Valoración de la práctica orientadora y tutorial en la universidad: Validación de una escala para el alumnado. Educación XX1, 23(2), 119–143.
29. Pozuelos-Estrada, F. J., García-Prieto, F. J., & Conde-Vélez, S. (2021). Evaluar prácticas innovadoras en la enseñanza universitaria. Validación de instrumento. Educación XX1, 24(1), 69–91.
30. Quevedo-Blasco, R., Ariza, T., & Buela-Casal, G. (2015). Evaluación de la satisfacción del profesorado de Ciencias con la adaptación al Espacio Europeo de Educación Superior. Educación XX1, 18(1), 45–70.
31. Quevedo-Blasco, R., & Buela-Casal, G. (2017). Influence of the implementation of the European Higher Education Area on Engineering and Architecture university teachers. DYNA-Ingeniería e Industria, 92(3), 333–338.
32. Rahman, K. A., Hasan, M. K., & Namaziandost, E. (2021). Implementing a formative assessment model at the secondary schools: Attitudes and challenges. Lang Test Asia, 11, 18.
33. Rodríguez-Gómez, G., & Ibarra-Sáiz, M. S. (2015). Assessment as learning and empowerment: Towards sustainable learning in higher education. In M. Peris-Ortiz, & J. Merigó Lindahl (Eds.), Sustainable learning in higher education. Innovation, technology, and knowledge management (pp. 1–20). Springer.
34. San Martín, S., Jerónimo, N., & Sánchez-Beato, E. (2016). La evaluación del alumnado universitario en el Espacio Europeo de Educación Superior. Aula Abierta, 44(1), 7–14.
35. Sanmartín, N. (2007). 10 Ideas clave: Evaluar para aprender. Graó.
36. Sortwell, A., Trimble, K., Ferraz, R., Geelan, D. R., Hine, G., Ramirez-Campillo, R., Carter-Thuiller, B., Gkintoni, E., & Xuan, Q. A. (2024). Systematic review of meta-analyses on the impact of formative assessment on K-12 students’ learning: Toward sustainable quality education. Sustainability, 16, 7826.
37. Turra-Marín, Y., Villagra-Bravo, C., Mellado-Hernández, M., & Aravena-Kenigs, O. (2022). Diseño y validación de una escala de percepción de los estudiantes sobre la cultura de evaluación como aprendizaje. RELIEVE—Revista Electrónica de Investigación y Evaluación Educativa, 28(2), 1–25.
38. Vain, P. D. (2016). Perspectiva socio-histórica de las prácticas de evaluación de los aprendizajes en la universidad. Trayectorias Universitarias, 2(2), 20–27.
39. Vera-Cazorla, M. J. (2021). La elección de medios de evaluación por el alumnado universitario. LFE. Revista de Lenguas para Fines Específicos, 27(2), 9–24. Available online: http://hdl.handle.net/10553/112951 (accessed on 12 January 2024).
Figure 1. Relationship between the number of students and the use of exams with closed questions.
Figure 2. Relationship between number of teaching hours and the use of exams with closed questions.
Figure 3. Relationship between the formative assessment and number of students.
Figure 4. Relationship between the formative assessment and teaching hours.
Table 1. Exploratory factor analysis and reliability indices of the factors.

Factor | Question | Factorial loading | Omega coefficient
Assessment with closed question tests | P66. Examination with closed questions | 0.88 | 0.85
 | P63. Multiple-choice test | 0.81 |
 | P65. Short-question exam | 0.74 |
Continuous and formative assessment | P69. Practical work with feedback | 0.88 | 0.87
 | P62. Classroom participation | 0.85 |
 | P610. Portfolios | 0.76 |
 | P611. Field notebooks | 0.69 |
Assessment with assignments and essays | P612. Written reports or papers | 0.87 | 0.88
 | P613. Essays based on written texts or audio-visual material | 0.87 |
Assessment with open-ended examinations with or without notes | P67. Written exams with documents available | 0.75 | 0.71
 | P68. Oral exams | 0.66 |
 | P64. Written examination with open questions | 0.60 |
Table 2. Descriptive values of the teaching load of the teaching staff.

 | Mean | Median | Standard deviation
Number of teaching hours per week | 7.38 | 7.00 | 2.675
Table 3. Number of students vs. teaching hours.

Hours per week | Number of students: 80 or fewer | From 81 to 120 | From 121 to 160 | 161 or more | Total
5 or less hours/week | 88 | 11 | 7 | 0 | 106
6 to 7 h/week | 9 | 54 | 58 | 14 | 135
8 to 9 h/week | 5 | 44 | 43 | 40 | 132
10 or more hours/week | 0 | 8 | 24 | 64 | 96
Total | 102 | 117 | 132 | 118 | 469
Table 4. Use of assessment-grading procedures.

Assessment and/or marking strategy | Mean | Standard deviation
P612. Written reports or assignments | 3.10 * | 0.93
P613. Essays based on written texts or audio-visual material | 2.88 | 0.86
P63. Multiple-choice test | 2.34 | 1.31
P65. Short-question test | 2.20 | 1.31
P64. Open question exam | 2.09 | 1.39
P610. Portfolios | 1.74 | 1.97
P62. Classroom participation | 1.69 | 1.27
P69. Practical work with feedback | 1.65 | 1.17
P66. Closed question exam | 1.56 | 1.52
P611. Field workbooks | 1.53 | 1.41
P67. Written exams with documents available | 1.03 | 1.21
P68. Oral exams | 0.92 | 1.18
(*) Average values obtained from an ordinal scale from 0 (no use) to 4 (very frequent use).
Table 5. Mean values of the four assessment-grading scales.

Scale | Mean | Standard deviation
Assessment with written assignments | 3.00 | 0.78
Assessment with closed question exams | 2.02 | 1.11
Continuous and formative assessment | 1.64 | 1.05
Assessment with open question exams with or without notes | 1.32 | 0.85
Table 6. Correlations between types of assessment vs. number of students and correlations between types of assessment vs. teaching hours.

Scale | Number of students: Pearson’s correlation | Sig. | Hours of teaching/week: Pearson’s correlation | Sig.
Assessment with closed question exams | 0.19 ** | 0.000 | 0.16 ** | 0.001
Continuous and formative assessment | −0.10 * | 0.043 | −0.10 * | 0.047
Assessment with written assignments and essays | −0.01 | 0.766 | 0.05 | 0.311
Assessment with open question exams | 0.01 | 0.117 | 0.04 | 0.467
The correlation is significant at the 0.05 level (bilateral) *; the correlation is significant at the 0.01 level (bilateral) **.
Table 7. Correlations between questions of the assessment systems vs. number of students and correlations between questions of the assessment systems vs. teaching hours.

Scale | Question | Number of students: Pearson’s correlation | Sig. | Hours of teaching/week: Pearson’s correlation | Sig.
Assessment with closed question exams | P63. Multiple-choice test | 0.19 ** | 0.000 | 0.19 ** | 0.000
 | P65. Short-question exam | 0.14 ** | 0.002 | 0.11 * | 0.022
 | P66. Closed question exam | 0.15 ** | 0.001 | 0.11 * | 0.025
Continuous and formative assessment | P69. Practical assignments with feedback | −0.12 * | 0.011 | −0.10 * | 0.034
 | P62. Participation in the classroom | −0.10 * | 0.036 | −0.08 | 0.091
 | P610. Portfolios | −0.10 * | 0.042 | −0.07 | 0.144
 | P611. Field notebooks | 0.007 | 0.883 | −0.05 | 0.323
Assessment with written work | P612. Reports or written work | 0.05 | 0.315 | 0.00 | 0.467
 | P613. Essays from written texts | −0.07 | 0.133 | 0.04 | 0.093
Assessment with open question exams | P64. Open question exam | 0.14 ** | 0.003 | 0.13 ** | 0.006
 | P67. Written examinations with documents available | 0.08 | 0.075 | 0.047 | 0.314
 | P68. Oral exams | 0.02 | 0.668 | −0.13 ** | 0.006
Correlation is significant at the 0.05 level (bilateral) *; correlation is significant at the 0.01 level (bilateral) **.
