Article

Evaluative Judgment: A Validation Process to Measure Teachers’ Professional Competencies in Learning Assessments

by José Miguel Olave Astorga * and Félix González-Carrasco

Facultad de Filosofía y Humanidades, Pontificia Universidad Católica de Valparaíso, Valparaíso 2340025, Chile

* Author to whom correspondence should be addressed.
Educ. Sci. 2025, 15(5), 624; https://doi.org/10.3390/educsci15050624
Submission received: 13 April 2025 / Revised: 5 May 2025 / Accepted: 13 May 2025 / Published: 20 May 2025

Abstract

This article deals with teachers’ professional development, focusing specifically on their competencies to assess learning. Research in this field has shown a lack of instruments for measuring such competencies in practicing teachers. In this context, we present the validation process of such an instrument, called Classroom Evaluative Judgment, which is designed to assess teachers’ competencies in evaluating their students’ schoolwork. We adopt a quantitative approach, with a non-experimental and sequential design. First, the instrument was subjected to content validation through expert judgment. Subsequently, a pilot test was carried out with a non-probabilistic sample, applying statistical reliability analysis and confirmatory factor analysis to ensure the internal consistency of the instrument with respect to its theoretical basis. Finally, we validated the instrument with 266 participants, obtaining high levels of internal consistency and statistical reliability. The results support the soundness of the proposed model and its usefulness for measuring professional teaching competencies in the field of learning assessment. Its application in real contexts of professional practice could open new lines of research on the evaluative judgment of teachers and the strengthening of their evaluative identity.

1. Introduction

The international literature widely recognizes that assessment influences teaching and learning (Wylie, 2020; Baird et al., 2017); therefore, it is important to link the increase in professional competencies in learning assessment with their eventual transfer to the classroom. In addition, it is affirmed that formative assessment practices carried out by teachers not only improve their students’ learning, but are also effective for professional development as they favor inclusion and strengthen participation among colleagues (Ainscow et al., 2024; DeLuca et al., 2023; Hoefflin & Allal, 2007).
The need to implement training processes for teachers in terms of learning assessment parallels the emergence of professional standards, thus making it necessary to implement a situated and progressive professional development approach in educational communities to improve and support teachers’ assessment practices (DeLuca et al., 2016a, 2016b). In Chile, Law 20.903 of 2016 establishes a regulatory framework for teachers’ professional development, emphasizing collaboration and situated learning as fundamental pillars to improve educational quality (Organisation for Economic Co-operation and Development [OECD], 2015, 2017). Complementarily, in order to promote specific competencies in learning assessment among teachers, the Chilean Ministry of Education has recently enacted new Professional Teaching Standards (CPEIP, 2022). The standards outline the necessary professional expertise regarding key knowledge, skills, and attitudes for educational processes (Wyatt-Smith et al., 2017). A criticism voiced in the literature on standards is that these professional norms are not sufficient to represent the complexity of teachers’ evaluative work (Wyatt-Smith & Looney, 2016). From this perspective, this article seeks to describe the validation process of an instrument employed to measure professional competencies in learning assessment in the context of teachers’ professional practice, allowing us to obtain valid and reliable information that will enable favorable decision making to support their professional development.
Coinciding with the current understanding of assessment as a sociocultural activity (Broadfoot, 2021) shaped by policies that affect the professional practice of teachers and, in particular, their ability to evaluate their students’ work in the classroom (Looney et al., 2018), we interpret the assessment of learning as a complex process in which teachers and students actively participate. Therefore, the design of an assessment tool should not only respond to the intentions of teachers but should also consider students’ differing trajectories and ways of learning. Specifically, the assessment perspective assumed by this study coincides with Heritage’s (2007) definition of formative assessment as “a systematic process for gathering evidence about learning, in which students participate actively with their teachers, sharing learning objectives, understanding how their learning progresses, what steps they should take and how to do it” (p. 142). This approach to assessment emphasizes that teachers and students develop evaluative judgments about different school tasks—as evidence of learning—in order to make pedagogical decisions relevant to the overall process and not merely about isolated outcomes (Black & Wiliam, 2018). According to Sadler (1989), evaluative judgment is a central element in formative assessment processes as it helps students and teachers to make estimates about how much material has been learned and what remains to be taught. An accepted definition in the specialized literature is the one proposed by Tai et al. (2018), which points out that evaluative judgment is the process in which one assesses the quality of one’s own and others’ work. Recent research on evaluative judgment has mainly been conducted with higher education students (Sun et al., 2024); as a result, there are few studies on the process of evaluative judgment among teachers.
In light of the identified research gap on the development of professional competencies in the area of learning assessment, the specialized literature highlights assessment literacy (AL) as the knowledge and skills required by teachers to implement learning assessment processes (Brookhart, 2023). Moving away from theoretical approaches that reduce the assessment process to a technical and decontextualized event, we seek to reconceptualize the concept of AL by integrating “knowledge, beliefs, feelings and skills of teachers in their roles as evaluators of student learning” (Adie et al., 2020, p. 4). In this reconceptualization, the concept of teachers’ evaluative identity is included as a fundamental aspect, underscoring that when a teacher constructs evaluative judgments about their students’ work, they not only incorporate a series of evaluative knowledge and skills, but also deploy their evaluative expertise in which the emotional and contextual dimensions of their professional experience participate (Wyatt-Smith et al., 2024; Adie et al., 2020; Looney et al., 2018).
The concept of assessment literacy among practicing teachers has become increasingly important for both initial and continued teacher education (Xu & Brown, 2016), as it can be used as the basis for their professional development (DeLuca et al., 2023). Consequently, constructing instruments that can measure these competencies is an important challenge for researchers. To date, research in this area has been developed based on instruments that provide weak psychometric evidence (Gotch & French, 2014; DeLuca et al., 2016a). In particular, these instruments focus on measuring general knowledge about learning assessment, leaving ample research space to explore what items are important when assessing learning and how assessment literacy impacts students’ learning outcomes (Yan & Pastore, 2022a; Wylie, 2020).
In response to this need, instruments that have been developed from the AL perspective were reviewed. DeLuca et al. (2016a) presented an instrument called Approaches to Classroom Assessment Instrument (ACAI), which is based on the analysis of professional standards in the field of learning assessment in fifteen countries. The results of their study propose a sort of agenda for research in this field, highlighting the need to investigate teachers within classroom assessment spaces. In this sense, in Chile, Meckes (2018) validated an online instrument for measuring evaluative competencies in primary education teachers based on four theoretical dimensions; namely, collecting evidence about learning from their students; analyzing and interpreting evidence of learning; formative feedback; and certifying and grading student learning. Although this instrument is focused on measuring professional competencies to develop the evaluation process, the theoretical link with evaluation practices is still weak. Seeking to integrate conceptual, practical, and socioemotional aspects present in the evaluation process, Yan and Pastore (2022a, 2022b) presented the Teacher Formative Assessment Literacy Scale (TFALS) and the Teacher Formative Assessment Practice Scale (TFAPS), with the aim of measuring teaching practices in formative assessments. Both scales have been validated through statistical analysis, confirming their structure and psychometric quality. These instruments align with the perspective of the present study, as they integrate the theoretical, emotional, and attitudinal knowledge present in the process of learning assessments in the classroom. In the same sense, as we have mentioned, the introduction of the concept of evaluative identity has gained strength in studies with teachers, as the perceptions that teachers have of themselves as evaluators and how these perceptions affect their evaluative judgments about their students’ work have become relevant in current research (Olave & Orrego, 2025). Thus, Estaji and Ghiasvand (2021) and Jan-nesar Moqaddam et al. (2021) have presented scales for the measurement of competencies in learning assessment that incorporate dimensions of teachers’ evaluative identity. Both works are based on the model proposed by Looney et al. (2018), who suggested five key dimensions in the construction of evaluative identity. The results of these studies suggest the importance of professional practice and students’ learning trajectories as key elements when assessing students’ work.
Following this line of research, the present study focused on validating an instrument used to measure teachers’ professional competencies in the field of learning assessment, which is based on the framework of assessment literacy in the context of teachers’ professional practice. Thus, the present study aims to describe the validation process of this instrument, discuss its usefulness for measuring professional competencies in the field of learning assessment, and propose its application in real contexts of pedagogical practice oriented toward teachers’ professional development.

2. Methods

The instrument was designed to be applied in a pilot study and was subsequently implemented to answer the following research question: How can an instrument that measures the evaluative competencies of teachers in the context of their professional practice be validated? The research followed a quantitative approach, with a non-experimental and sequential design (Arévalo-Chávez et al., 2020). In the first stage of the study, the instrument was subjected to expert judgment evaluation. After this evaluation, a pilot test was administered to a non-probabilistic sample of 66 teachers, whose inclusion criterion was experience in classroom assessment at the primary and secondary education levels. The results of the pilot application were analyzed by means of statistical reliability criteria and confirmatory factor analysis, after which internal coherence adjustments were made to the initial theoretical proposal.
The second stage of the study consisted of the final application of the instrument, taking into consideration the adjustments made after the first stage. In this final application, 266 elementary and middle school teachers participated, working in 19 educational establishments that report to a local public education service (SLEP) belonging to the Chilean public education system.

3. Results

In the first stage, the Classroom Evaluative Judgment instrument was constructed based on the study by Wyatt-Smith et al. (2024), who consider evaluative practices to be associated with three dimensions; namely, (a) evaluating the quality of work; (b) considering the trajectory of students during evaluation; and (c) making professional decisions in favor of future learning. Based on these dimensions, 21 items were developed, each accompanied by five response alternatives scored from 1 to 5, with the highest score corresponding most closely to the characteristics expected of formative assessment in schools. The alternatives were scored as follows: Never (1), Almost never (2), Occasionally (3), Almost always (4), and Always (5).
Regarding the content validity of the pilot instrument, it was validated by five expert judges in the area of learning assessment; specifically, holders of doctoral degrees in education with experience in teacher training. They were asked to analyze the instrument based on the criteria of relevance and clarity; that is, whether each item is appropriate to its dimension, whether it fulfills the objective or purpose of the instrument, and whether each item is formulated and written clearly. The judges evaluated each of the items and added qualitative observations for the items that caused disagreement regarding their clarity and/or relevance. This information was analyzed through thematic analysis and discussed with the research team, allowing adjustments to be made to the items for the final application of the pilot instrument.
After content validation, a pilot test was applied to a non-probabilistic sample of 66 teachers, of whom 22.7% taught exclusively at the elementary school level, 27.3% taught exclusively at the middle school level, and 50% taught in both educational cycles. The objective of the analysis of the pilot application was to make adjustments to the proposed theoretical model, contrasting it with an analysis of statistical reliability and the standard estimates of each of the items in relation to their respective dimensions. For this purpose, a confirmatory factor analysis was carried out. We opted for this type of analysis because the structure (dimensions, sub-dimensions, and items) of the instrument was defined according to the hypotheses and theoretical assumptions of the researchers. The analytical work aimed to control for a number of previously established factors and variables between which relationships are observed (Ferrando & Anguiano-Carrasco, 2010). As a result of the pilot phase, three items were eliminated due to overlap with other items, and restructuring was carried out according to the statistical relationships between items and dimensions. This process helped to consolidate the instrument’s theoretical affinity and allowed us to review the standard estimators and the significance tests of the factor loadings, which led to a model based on six dimensions with three items each (Table 1).
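The article does not specify the software used for these analyses. As a minimal sketch under that caveat, a six-factor confirmatory model with the item labels from Table 1 could be specified in Python with the semopy package; the CSV file name and its item columns are assumptions for illustration:

```python
# Sketch of a six-factor CFA matching the structure in Table 1 (semopy).
# "pilot_responses.csv" and its column names are hypothetical.
import pandas as pd
import semopy

MODEL_DESC = """
D1 =~ D1_P2 + D1_P3 + D1_P4
D2 =~ D2_P5 + D2_P6 + D2_P7
D3 =~ D3_P12 + D3_P13 + D3_P14
D4 =~ D4_P15 + D4_P16 + D4_P20
D5 =~ D2_P9 + D2_P10 + D2_P11
D6 =~ D4_P17 + D4_P18 + D4_P19
"""

data = pd.read_csv("pilot_responses.csv")  # 66 respondents x 18 items (1-5)
model = semopy.Model(MODEL_DESC)
model.fit(data)

print(model.inspect(std_est=True))   # estimates, SEs, z, p, standardized loadings
print(semopy.calc_stats(model).T)    # chi-square, RMSEA, CFI, TLI, etc.
```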
The table describes the model fit and the selected items. The factor loadings are statistically significant (p < 0.05), indicating that each item is associated with its proposed dimension, with standardized values above 0.4 taken as the reference threshold; the square of a standardized loading represents the proportion of item variance explained by the factor (Ventura-León, 2019). For example, in Dimension 6, item D4_P19 presents one of the highest standardized estimators (0.837), reinforcing its relevance in the measurement of that factor. Similarly, in Dimension 2, item D2_P5 exhibits a standardized loading of 0.419, lower than the others but still statistically significant (Z = 3.165; p = 0.00155). This range of standardized factor loadings (approximately 0.4 to 0.84) suggests good explanatory power of the items for their respective factors, reflecting the consistency and robustness of the proposed factor model.
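For reference, applying Ventura-León’s (2019) interpretation to the two items just mentioned gives the share of item variance explained by each factor:

```latex
% Squared standardized loadings = proportion of explained item variance
\lambda_{\mathrm{D4\_P19}}^{2} = 0.837^{2} \approx 0.70
\qquad
\lambda_{\mathrm{D2\_P5}}^{2} = 0.419^{2} \approx 0.18
```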
The proposed model based on six dimensions has statistically significant factor loadings (see Table 1), in addition to satisfactory fit indices (see Table 2; RMSEA = 0.078; TLI = 0.789; CFI = 0.834). Thus, the soundness and relevance of the proposed theoretical model are supported. In light of these results, we decided to extend the model proposed by Wyatt-Smith et al. (2024) by disaggregating the dimensions related to task quality and resolution (D1 and D2); incorporating the recognition of individual and group progress among students (D3) and the assessment of attitudinal elements (D5); and, finally, disaggregating the dimension corresponding to professional decision making into two dimensions: decision making for teaching and decision making for learning (D4 and D6). The final instrument is presented in Table 3.
As a result of this first stage, it was found that the instrument offers internal consistency and construct validity adequate for its application as a definitive instrument, which reinforces its usefulness for measuring professional competencies in the context of student learning assessment. The process of content validation, application, and analysis of the pilot phase allowed us to advance the factor loadings analysis and define the definitive dimensions and items of the second stage. In this way, the final instrument was constructed using six dimensions with 18 items.
As in the pilot instrument, each item of the final instrument presents five alternatives scored from 1 to 5: Never (1), Almost never (2), Occasionally (3), Almost always (4), and Always (5). The questionnaire was distributed through a digital platform and disseminated with the support of a local public education service belonging to the Chilean public education system. The participants in this stage were 266 teachers working in 19 schools belonging to this service. Purposive sampling was used to select the participants following contact with each educational institution. The statistical power of the sample was corroborated using the G*Power 3.1 software, obtaining a power (1 − β) of 0.964. Of the participants, 60% were elementary school teachers and 35% were middle school teachers, while 5% did not indicate the level at which they work. The participants gave their consent to participate in the research, which was formalized via the informed consent form prepared within the framework of the research and validated by the ethics committee of the university sponsoring the study under code BIOEPUCV-H 628-2023.
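G*Power is a GUI tool, and the paper does not report the test family or effect size behind the reported power. Purely as an illustration of scripting a comparable post hoc computation, with both the test (one-sample t) and the effect size chosen arbitrarily for the sketch:

```python
# Illustrative post hoc power check in the spirit of the reported G*Power
# calculation; the test family and effect size are assumptions, not the study's.
from statsmodels.stats.power import TTestPower

power = TTestPower().power(effect_size=0.25, nobs=266, alpha=0.05)
print(f"power (1 - beta) = {power:.3f}")
```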
The second stage yielded a Cronbach’s alpha of 0.896, which is interpreted as a good coefficient (Frias-Navarro & Pascual-Soler, 2022). This corroborates the internal consistency of the construct as well as its applicability to the proposed case. Based on the data obtained in this final application, we proceeded to perform a confirmatory factor analysis with the purpose of validating the instrument’s theoretical affinity and reviewing the standard estimators, as well as the significance tests of the factor loadings (see Table 4).
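A minimal sketch of how the reported alpha of 0.896 is computed, written directly from the standard definition of Cronbach’s alpha; the response matrix below is a random placeholder, not the study’s data:

```python
# Cronbach's alpha for an (n_respondents x k_items) matrix of 1-5 responses.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
demo = rng.integers(1, 6, size=(266, 18))  # placeholder data; random responses
# Uncorrelated random items give alpha near zero; real scale data would not.
print(f"alpha = {cronbach_alpha(demo):.3f}")
```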
The analysis of the factor loadings of the scale allowed us to confirm the model’s high validity and reliability for its intended use. All of the factor loadings are high (0.723 to 0.935), and the Cronbach’s alpha values vary between 0.886 and 0.895. The measures of fit (see Table 5)—particularly the RMSEA (Root Mean Square Error of Approximation)—indicate that the model has a “good” fit to the data (RMSEA = 0.078), while the TLI (Tucker–Lewis Index) indicates that the proposed model is “acceptable” with respect to the independence model (TLI = 0.789). These values, specifically the fit of the factor loadings, support the use of the general instrument, according to the sampling criteria employed for its validation, and the application of the proposed theoretical model.
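For reference, the two indices emphasized here have the following conventional definitions, where the subscript 0 denotes the independence (null) model, m the fitted model, and N the sample size:

```latex
% Conventional (sample) definitions of the reported fit indices
\mathrm{RMSEA} = \sqrt{\frac{\max\!\left(\chi^{2}_{m} - df_{m},\, 0\right)}{df_{m}\,(N - 1)}}
\qquad
\mathrm{TLI} = \frac{\chi^{2}_{0}/df_{0} \;-\; \chi^{2}_{m}/df_{m}}{\chi^{2}_{0}/df_{0} \;-\; 1}
```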
In summary, the application of the final version of the instrument, supported by the results of the confirmatory factor analysis and the high internal consistency (Cronbach’s alpha = 0.896), confirms its soundness and viability. The factor loadings, which range from 0.723 to 0.935, together with the fit values (RMSEA = 0.078; TLI = 0.789), demonstrate the coherence of the proposed structure with the theoretical model. Likewise, the high internal reliability in each dimension (alpha between 0.886 and 0.895) reinforces the instrument’s capacity to accurately measure the defined factors. Taken together, these findings support the relevance of the tool and its usefulness as a valid and reliable resource in the context of assessing the professional practice of teachers who make evaluative judgments about their students’ work.

4. Discussion

Firstly, the instrument was constructed on the basis of the scales reviewed above in order to link theoretical approaches with evaluative practices. In this case, the instrument adds practical dimensions such as the assessment of students’ individual and group trajectories, together with the assessment of attitudinal dimensions derived from the construction of evaluative judgments by practicing teachers. In this way, the Classroom Evaluative Judgment instrument goes beyond other tools intended only to measure knowledge related to learning assessment (Yan & Pastore, 2022a, 2022b; Meckes, 2018; DeLuca et al., 2016a). In addition, the Classroom Evaluative Judgment instrument strongly incorporates dimensions of assessment context and evaluative identity development (Estaji & Ghiasvand, 2021; Jan-nesar Moqaddam et al., 2021; Looney et al., 2018).
Secondly, the instrument provides a structured framework that allows teachers to reflect on their own evaluative practices, especially in the area of formative assessment. This aspect is important in a context where learning assessment, in many cases, focuses on measuring performance rather than promoting learning. In summary, the validated instrument can help to support peer professional development (DeLuca et al., 2023) by using a situated (DeLuca et al., 2016b) and inclusive (Ainscow et al., 2024; DeLuca et al., 2023) approach that can contribute to the consolidation of a community in which teachers share and discuss their evaluative experiences.
Thirdly, the instrument responds to the demands for reliable data on how teachers understand the evaluation process of their students, providing timely data on their evaluative identity and their role as expert evaluators (Adie et al., 2020; Looney et al., 2018). The instrument includes not only items that identify the quality of school tasks but also items assessing teachers’ ability to recognize the educational trajectories of their students, along with assessing attitudinal dimensions that their students develop when performing such tasks. In addition, the instrument incorporates dimensions concerning the decisions derived from evaluative judgment, such as professional decision making in favor of improving teaching and learning. In summary, it can be said that the instrument advances the evaluative judgment model proposed by Wyatt-Smith et al. (2024).
Finally, as professional standards have not been sufficiently clear in representing the complexity of the learning assessment process (Wyatt-Smith & Looney, 2016), the proposed instrument was validated as a useful tool for evaluating teachers’ assessment of student learning for accountability purposes (DeLuca et al., 2023), along with offering timely evidence from a model of classroom evaluative judgment that enables a complex understanding of the process.

5. Conclusions

The validated instrument represents a significant contribution to the line of research in educational evaluation, specifically in the context of teachers who are implementing formative evaluation strategies. Its contribution can be analyzed in three main dimensions: the strengthening of validity and reliability in the measurement of professional competencies in learning assessment; the generation of empirical evidence to strengthen the evaluative identity of teachers based on analysis of their evaluative judgments regarding their students’ work; and the possibility of replication in other educational contexts in order to support teachers’ professional development in the field of learning assessment.
In addition, the instrument opens a space for future research linking teaching practice and student learning. Its systematic application can provide empirical data on specific competencies of the evaluation process and consequently improve teaching and student learning.
Regarding the limitations of this study, we acknowledge that, given the Chilean regulations on learning assessment, this study was conducted only with teachers from the public sector; therefore, future research should address this challenge and seek to adapt the instrument for application with teachers at different educational levels, as well as with teachers from charter and private schools. Addressing these challenges could open a new research agenda that considers substantive elements in the evaluation of student learning, as well as the effects of teacher training in the context of professional practice.

Author Contributions

J.M.O.A. contributed to the development of the article through conceptualization, methodology, investigation, data curation, writing—original draft preparation, writing—review and editing, visualization, supervision, and project administration. F.G.-C. contributed through methodology, software, validation and formal analysis, investigation, resources, data curation, and writing—original draft preparation. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financed by the National Agency for Research and Development (ANID) Chile. The project is registered as No. 11230410.

Institutional Review Board Statement

The study was approved by the Ethics Committee of the sponsoring institution, Pontificia Universidad Católica de Valparaíso, Chile. The ethics committee approved it under code BIOEPUCV-H 628-2023 dated 11 May 2023.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data used as a basis for the analyses developed in this article are available at https://data.mendeley.com/preview/r7j8jfm7jf?a=e0d67456-110c-4170-ba93-4f37eb07b37f (accessed 1 May 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Adie, L., Stobart, G., & Cumming, J. (2020). The construction of the teacher as expert assessor. Asia-Pacific Journal of Teacher Education, 48(4), 436–453.
2. Ainscow, M., Calderón-Almendros, I., Duk, C., & Viola, M. (2024). Using professional development to promote inclusive education in Latin America: Possibilities and challenges. Professional Development in Education, 51, 149–166.
3. Arévalo-Chávez, P., Cruz-Cárdenas, J., Guevara Maldonado, C., Palacio Fierro, A., Bonilla Bedoya, S., Estrella Bastidas, A., Guadalupe Lanas, J., Zapata Rodríguez, M., Jadán Guerrero, J., Arias Flores, H., & Ramos Galarza, C. (2020). Actualización en metodología de la investigación científica [Update on scientific research methodology]. Universidad Tecnológica Indoamérica.
4. Baird, J. A., Andrich, D., Hopfenbeck, T. N., & Stobart, G. (2017). Assessment and learning: Fields apart? Assessment in Education: Principles, Policy & Practice, 24(3), 317–350.
5. Black, P., & Wiliam, D. (2018). Classroom assessment and pedagogy. Assessment in Education: Principles, Policy & Practice, 25(6), 551–575.
6. Broadfoot, P. (2021). The sociology of assessment: Comparative and policy perspectives. The selected works of Patricia Broadfoot. Routledge.
7. Brookhart, S. M. (2023). Assessment literacy in a better assessment future. Chinese Journal of Applied Linguistics, 46(2), 162–179.
8. CPEIP. (2022). Pedagogical and disciplinary standards for pedagogy careers. CPEIP.
9. DeLuca, C., LaPointe-McEwan, D., & Luhanga, U. (2016a). Approaches to classroom assessment inventory: A new instrument to support teacher assessment literacy. Educational Assessment, 21(4), 248–266.
10. DeLuca, C., LaPointe-McEwan, D., & Luhanga, U. (2016b). Teacher assessment literacy: A review of international standards and measures. Educational Assessment, Evaluation and Accountability, 28(3), 251–272.
11. DeLuca, C., Willis, J., Cowie, B., Harrison, C., & Coombs, A. (2023). Cultivating teacher evaluation skills. In Learning to assess: Teacher education, learning innovation and accountability. Springer.
12. Estaji, M., & Ghiasvand, F. (2021). Assessment perceptions and practices in academic domain: The design and validation of an assessment identity questionnaire (TAIQ) for EFL teachers. International Journal of Language Testing, 11(1), 103–131.
13. Ferrando, P. J., & Anguiano-Carrasco, C. (2010). Factor analysis as a research technique in psychology. Papeles del Psicólogo, 31(1), 18–33.
14. Frias-Navarro, D., & Pascual-Soler, M. (2022). Research design, analysis and writing of results. Palmero Ediciones.
15. Gotch, C. M., & French, B. F. (2014). A systematic review of assessment literacy measures. Educational Measurement: Issues and Practice, 33(2), 14–18.
16. Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan, 89(2), 140–145.
17. Hoefflin, G., & Allal, L. (2007). Assessment in the context of professional development: The implementation of a portfolio project. In S. Frankland (Ed.), Enhancing teaching and learning through assessment. Springer.
18. Jan-nesar Moqaddam, Q., Khodabakhshzadeh, H., Motallebzadeh, K., & Khajavy, G. H. (2021). Measuring EFL teachers’ assessment identity: Construction and validation of a teacher assessment identity questionnaire [in Persian]. Journal of Language and Translation, 1(1), 29.
19. Looney, A., Cumming, J., Van Der Kleij, F., & Harris, K. (2018). Reconceptualising the role of teachers as assessors: Teacher assessment identity. Assessment in Education: Principles, Policy & Practice, 25(5), 442–467.
20. Meckes, L. G. (2018). An online instrument to assess evaluative competencies of basic education teachers. Final report FONIDE: FX11668. FONIDE Technical Secretariat. Available online: https://centroestudios.mineduc.cl/wp-content/uploads/sites/100/2018/10/Informe-final-FONIDE-FX11668-Meckes_ap-convertedDU.pdf (accessed on 1 April 2024).
21. Olave, J. M., & Orrego, R. (2025). Formative assessment strategies for elementary and middle school teachers: Decisions to improve teaching and learning. Pages of Education, 18(1), 1.
22. Organisation for Economic Co-operation and Development [OECD]. (2015). Education at a glance 2015: OECD indicators. OECD Publishing.
23. Organisation for Economic Co-operation and Development [OECD]. (2017). Education in Chile. Reviews of national policies for education. OECD.
24. Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18(2), 119–144.
25. Sun, W., Ding, Y., Wang, R., Liu, Y., Wang, Y., Zhu, B., & Liu, Q. (2024). Bibliometric analysis of assessment and evaluation in higher education: 2012–2023. Assessment & Evaluation in Higher Education, 49(8), 1121–1135.
26. Tai, J. M., Ajjawi, R., Boud, D., Dawson, P., & Panadero, E. (2018). Developing evaluative judgement: Enabling students to make decisions about the quality of work. Higher Education, 76(3), 467–481.
27. Ventura-León, J. (2019). Two easy ways to interpret the famous factor loadings. Gaceta Sanitaria, 33(6), 599.
28. Wyatt-Smith, C., Adie, L., & Harris, L. (2024). Supporting teacher judgement and decision-making: Using focused analysis to help teachers see students, learning, and quality in assessment data. British Educational Research Journal, 50, 1420–1448.
29. Wyatt-Smith, C., Alexander, C., Fishburn, D., & McMahon, P. (2017). Standards of practice to standards of evidence: Developing assessment capable teachers. Assessment in Education: Principles, Policy & Practice, 24(2), 250–270.
30. Wyatt-Smith, C., & Looney, A. (2016). Professional standards and the assessment work of teachers. In L. Hayward & D. Wyse (Eds.), Handbook on curriculum, pedagogy and assessment (pp. 805–820). Routledge.
31. Wylie, E. C. (2020). Observing formative assessment practice: Learning lessons through validation. Educational Assessment, 25(4), 251–258.
32. Xu, Y., & Brown, G. T. L. (2016). Teacher assessment literacy in practice: A reconceptualization. Teaching and Teacher Education, 58, 149–162.
33. Yan, Z., & Pastore, S. (2022a). Are teachers literate in formative assessment? The development and validation of the teacher formative assessment literacy scale. Studies in Educational Evaluation, 74, 101183.
34. Yan, Z., & Pastore, S. (2022b). Assessing teachers’ strategies in formative assessment: The teacher formative assessment practice scale. Journal of Psychoeducational Assessment, 40(5), 592–604.
Table 1. Factor dimensions and loadings.

| Factor | Indicator ¹ | Estimator | SE | Z | p | Standardized Estimator |
|---|---|---|---|---|---|---|
| Dimension 1 | D1_P2 | 0.316 | 0.091 | 3.491 | <0.001 | 0.527 |
| | D1_P3 | 0.360 | 0.089 | 4.057 | <0.001 | 0.655 |
| | D1_P4 | 0.262 | 0.069 | 3.800 | <0.001 | 0.560 |
| Dimension 2 | D2_P5 | 0.265 | 0.084 | 3.165 | <0.001 | 0.419 |
| | D2_P6 | 0.546 | 0.089 | 6.147 | <0.001 | 0.779 |
| | D2_P7 | 0.583 | 0.088 | 6.629 | <0.001 | 0.834 |
| Dimension 3 | D3_P12 | 0.257 | 0.061 | 4.242 | <0.001 | 0.556 |
| | D3_P13 | 0.470 | 0.123 | 3.818 | <0.001 | 0.511 |
| | D3_P14 | 0.739 | 0.137 | 5.411 | <0.001 | 0.682 |
| Dimension 4 | D4_P15 | 0.357 | 0.100 | 3.553 | <0.001 | 0.478 |
| | D4_P16 | 0.590 | 0.104 | 5.692 | <0.001 | 0.713 |
| | D4_P20 | 0.479 | 0.118 | 4.078 | <0.001 | 0.525 |
| Dimension 5 | D2_P9 | 0.234 | 0.051 | 4.626 | <0.001 | 0.586 |
| | D2_P10 | 0.473 | 0.100 | 4.736 | <0.001 | 0.639 |
| | D2_P11 | 0.212 | 0.069 | 3.087 | <0.001 | 0.432 |
| Dimension 6 | D4_P17 | 0.572 | 0.117 | 4.906 | <0.001 | 0.605 |
| | D4_P18 | 0.578 | 0.106 | 5.441 | <0.001 | 0.653 |
| | D4_P19 | 0.797 | 0.108 | 7.360 | <0.001 | 0.837 |

¹ According to the statistical analysis, items Q1, Q8, and Q21 were eliminated.
Table 2. Fit measures.

| CFI | TLI | RMSEA | RMSEA 90% CI Lower | RMSEA 90% CI Upper |
|---|---|---|---|---|
| 0.834 | 0.789 | 0.078 | 0.047 | 0.104 |
Table 3. General description of the final instrument.

| Dimension | Description |
|---|---|
| D1. Task resolution | Recognize in the students’ work the resolution of the task according to the established criteria. |
| D2. Qualities in the performance of tasks | Identify the qualities observed when the proposed tasks are solved. |
| D3. Learning path | Identify individual progress stages according to the learning trajectory of each student, in relation both to himself/herself and to his/her course group. |
| D4. Decision making for teaching | Determine decisions that involve next steps to improve student learning and to improve the teacher’s own tools or strategies. |
| D5. Task implications | Assess attitudinal aspects of the students involved in the performance of the tasks. |
| D6. Decision making for learning | Identify pedagogical decisions derived from evaluative judgment to improve learning support for the students. |
Table 4. Factor loadings and instrument validation.

| Factor | Indicator | Factor Loading | p | Alpha |
|---|---|---|---|---|
| D1. Task resolution | D1.P1 As I review students’ schoolwork, I recognize what is expected of the learning objective. | 0.864 | <0.001 | 0.889 |
| | D1.P2 As I review students’ schoolwork, I identify the skills employed for its resolution. | 0.862 | <0.001 | 0.889 |
| | D1.P3 While reviewing students’ schoolwork, I identify the knowledge involved in the assignment. | 0.768 | <0.001 | 0.892 |
| D2. Qualities in the performance of tasks | D2.P4 As I review students’ schoolwork, I assess how they integrate cross-cutting skills in their resolution. | 0.810 | <0.001 | 0.889 |
| | D2.P5 While reviewing schoolwork, I value the integration of knowledge related to other contexts. | 0.934 | <0.001 | 0.886 |
| | D2.P6 As I review the task, I assess the application of content and skills in different contexts. | 0.855 | <0.001 | 0.888 |
| D3. Learning path | D3.P7 As I review the assignment, I recognize the student’s progress in achieving the assignment. | 0.723 | <0.001 | 0.891 |
| | D3.P8 When reviewing schoolwork, I compare individual work with the group’s progress. | 0.756 | <0.001 | 0.890 |
| | D3.P9 When I review the assignment, I compare similar assignments (from other years or from the same year) that my students have solved. | 0.764 | <0.001 | 0.892 |
| D4. Decision making for teaching | D4.P10 After reviewing the assignment, I make adjustments to the instruments (e.g., clarify instructions, adjust scores, etc.). | 0.779 | <0.001 | 0.890 |
| | D4.P11 After reviewing schoolwork, I propose or create new instruments that reflect new learning. | 0.825 | <0.001 | 0.888 |
| | D4.P12 After reviewing the task, I adjust my planning according to the results obtained. | 0.806 | <0.001 | 0.890 |
| D5. Task implications | D5.P13 While reviewing the task, I value the responsibility shown in its completion. | 0.935 | <0.001 | 0.892 |
| | D5.P14 When reviewing schoolwork, I assess the order and clarity of the task. | 0.764 | <0.001 | 0.895 |
| | D5.P15 While reviewing the task, I value creativity (or originality) in solving the task. | 0.843 | <0.001 | 0.893 |
| D6. Decision making for learning | D6.P16 After reviewing the assignment, I develop individual recommendations for each student. | 0.798 | <0.001 | 0.890 |
| | D6.P17 After reviewing the assignment, I make recommendations to the course group for the development of future assignments on the subject. | 0.724 | <0.001 | 0.891 |
| | D6.P18 After reviewing the assignment, I propose new challenges based on each student’s achievement. | 0.852 | <0.001 | 0.889 |
Table 5. Measures of instrument fit.

| CFI | TLI | RMSEA | RMSEA 90% CI Lower | RMSEA 90% CI Upper |
|---|---|---|---|---|
| 0.834 | 0.789 | 0.078 | 0.047 | 0.104 |