1. Introduction
In response to the competitive nature of university rankings, the focus of university teaching has undergone significant changes since the 1990s. The emergence of the Scholarship of Teaching and Learning (SoTL) has been identified as a solution to the challenges faced by universities on a global scale. In China, there is an analogous policy orientation that emphasizes universities’ involvement in creating social benefits, in line with the contemporary purpose of higher education. However, universities, as nonprofit institutions, may jeopardize their academic goals if they become overly concerned with market- or profit-based objectives [1,2].
Therefore, over the past three decades, the SoTL has received a great deal of attention worldwide [2,3,4,5,6,7]. However, a collective dilemma in SoTL research and practice is that, when evaluating teaching, universities are more inclined to apply unified official criteria than to consider student expectations, so that they can achieve better results in university rankings or national assessment systems. These tensions reflect a broader institutional paradox: while policies rhetorically endorse SoTL principles, their operationalization remains constrained by a legacy system that prioritizes measurable outputs over transformative educational processes, especially in developing countries. In this context, student-centered assessment is critical as a complementary and reflective perspective for understanding the SoTL, itself a “supercomplexity” [8] concept.
1.1. The Traditional “Research-First” Ecosystem and the Global SoTL Movement
Amidst the intensifying competition in global higher education, university ranking mechanisms and market-driven logic have profoundly reshaped institutional teaching priorities [9]. Since the 1990s, teaching has evolved from a “static process” of knowledge transmission to a “dynamic practice” of competency cultivation and intellectual inspiration. This paradigm shift catalyzed the emergence of the SoTL framework. The traditional “research-over-teaching” model prioritizes institutional prestige through research productivity, relegating teaching to a secondary role characterized by passive knowledge transmission and standardized instruction.
As a reflexive departure from the traditional “research-over-teaching” academic ecosystem, the SoTL advocates for reconceptualizing pedagogical practices as a systematized and disseminable form of scholarly inquiry. Rooted in evidence-based continuous improvement [10], the SoTL positions teaching not merely as an instructional activity but as an intellectual endeavor requiring systematic investigation, pedagogical innovation, and rigorous validation through empirical methodologies. This paradigm shift redefines faculty roles by integrating the dual identities of researcher and educator, and emphasizes the institutionalization of robust evaluation frameworks that equate teaching excellence with research achievements. Concurrently, it transitions students from passive knowledge recipients to active participants in inquiry-based learning environments, fostering critical thinking and adaptability. Ultimately, the SoTL seeks to harmonize academic rigor with educational values, cultivating learners who are capable of addressing complex societal challenges.
In Table 1, we further explain the difference between the two paradigms through the six dimensions of Core Philosophy, Core Objective, Faculty Role, Student Role, Teaching Methodology, and Evaluation System.
1.2. The Characteristics of SoTL Practice in China and Student Assessment as a Reflective Perspective
The global academic movement of the SoTL aligns with China’s ongoing higher education reforms, particularly the “Double First-Class” initiative, which prioritizes universities’ societal contributions through knowledge innovation while warning against the risks of marketization eroding academic integrity [1,2]. This policy framework creates a dual tension for SoTL implementation: it demands that universities simultaneously uphold scholarly rigor and maximize societal impact through pedagogical excellence. However, SoTL practices in developing nations are confronted with multiple challenges. Existing studies suggest that a teaching-evaluative deficit is particularly pronounced in these nations, where unequal resource distribution, insufficient pedagogical innovation, and inadequate responses to diverse student needs impede substantive quality breakthroughs [11].
In China, a representative developing country, SoTL practice remains in its nascent stage of development, characterized by top–down institutional exploration primarily driven by educational administrative authorities and universities. This governance model has led to an evaluation system heavily influenced by externally imposed standards (such as policy mandates from the Ministry of Education and institutional performance metrics) rather than organic, faculty-led pedagogical reflection and innovation. For instance, reforms like the “Guidelines for Deepening Education Evaluation Reform in the New Era” emphasize quantifying teaching achievements but often overlook the core SoTL tenets of critical self-inquiry, evidence-based iterative refinement, and discipline-specific pedagogical creativity.
Consequently, while administrative frameworks provide structural legitimacy to teaching scholarship, they risk reducing the SoTL to a compliance-driven exercise, diluting its transformative potential to foster authentic, localized teaching praxis. Concurrently, the pressure to align teaching with market-driven skill demands risks reducing curricula to instrumentalist training, marginalizing the SoTL’s emphasis on fostering critical thinking and lifelong learning. These tensions reflect a broader institutional paradox—while policies rhetorically endorse the SoTL principles, their operationalization remains constrained by a legacy system prioritizing measurable outputs over transformative educational processes.
It is within this contested terrain of practice that this study proposes student evaluations as a reflexive lens to interrogate China’s SoTL implementation. By foregrounding bottom–up student perspectives (such as feedback on pedagogical practices, self-reported learning experiences, and critical reflections on curriculum relevance), this approach seeks to expose the disjuncture between policy rhetoric and classroom realities. Student voices, often marginalized in top–down policy frameworks, can reveal how administrative compliance and market-driven instrumentalism inadvertently constrain pedagogical innovation. Such insights underscore the SoTL’s imperative to prioritize reflexive teaching praxis, in which educators iteratively refine instruction based on evidence of student learning, rather than merely fulfilling externally imposed benchmarks. More importantly, centering student agency aligns with the SoTL’s foundational commitment to fostering lifelong learning competencies, which transcend narrow skill acquisition to cultivate adaptive, ethically grounded learners capable of addressing societal challenges.
Consequently, integrating student perspectives into the SoTL evaluation frameworks not only enhances educational sustainability by bridging policy and practice but also recenters teaching scholarship on its core mission: nurturing humanistic and intellectual growth in alignment with both global scholarly ideals and localized educational contexts.
1.3. Consistency Between SDG4 and Student-Centered SoTL Assessment
Embedded within the United Nations’ 2030 Agenda for Sustainable Development, Sustainable Development Goal 4 (SDG4), Quality Education, establishes a transformative mandate for reimagining higher education systems globally. SDG4 calls for the establishment of “inclusive, equitable, and quality education systems that promote lifelong learning opportunities for all” [12], a vision that resonates intrinsically with the SoTL, a paradigm dedicated to advancing pedagogical excellence through evidence-based, reflective practices.
SDG4 emphasizes multidimensional interventions to dismantle educational inequities and elevate teaching quality, while the SoTL operationalizes this vision by reorienting pedagogy toward learner-centered methodologies and context-sensitive innovation. Together, they construct a cohesive framework that bridges global sustainability aspirations with localized pedagogical reforms, ensuring education systems are both universally equitable and adaptively responsive to diverse learner needs.
The SoTL’s pedagogical framework directly addresses SDG4.3 (equitable access to tertiary education) by dismantling systemic barriers through adaptive curriculum design and hybrid learning models. Simultaneously, the SoTL advances SDG4.4 (skills for employment) by prioritizing competency cultivation over content transmission. Through inquiry-based learning strategies, industry-aligned collaborative projects, and real-world problem-solving initiatives, SoTL effectively translates the aspirational framework of SDG4 into quantifiable educational outcomes in the classroom.
2. Pedagogical Innovation for Sustainability (SDG4.7)
By prioritizing metacognitive skills and self-directed learning, the SoTL directly supports SDG4.7 on fostering sustainability competencies. Through deliberate reflection on learning processes and student-led knowledge construction, learners develop the agency to apply knowledge across disciplinary boundaries. Interdisciplinary program design further strengthens this focus, cultivating interrelated competencies such as systems analysis (mapping cause-and-effect relationships in sustainability issues), value-based decision-making (weighing ethical implications of actions), and responsive innovation (adapting solutions to evolving contexts). This integrated approach links educational theory with societal needs and equips learners to address interrelated social, environmental, and economic challenges.
3. Teacher Professionalization as a Lever (SDG4.c)
The transition from instructor-led didacticism to facilitative pedagogy—central to both the SoTL and SDG4.c (teacher training)—redefines educators as co-learners and reflective practitioners. SoTL-driven professional development programs, informed by student evaluations and peer-reviewed teaching research, enhance pedagogical effectiveness while aligning faculty growth with institutional sustainability goals. For example, workshops on inclusive assessment design or technology-enhanced instruction operationalize SDG4.c’s call for qualified, adaptable educators.
4. Excluded Subgoals
Subgoals such as SDG4.1 (universal primary/secondary education) and SDG4.2 (early childhood development) primarily target foundational education stages, rendering them peripheral to the higher education-focused SoTL. Similarly, SDG4.5 (gender disparities) and SDG4.6 (literacy/numeracy) are excluded due to their stronger alignment with basic education equity and foundational skill acquisition, respectively. Infrastructure- and resource-focused targets such as SDG4.a (inclusive facilities) and SDG4.b (global scholarships) belong to administrative or policy domains, as they address resource allocation and international partnerships rather than classroom-level pedagogical dynamics.
In summary, by institutionalizing reflexive student evaluations, the SoTL translates SDG4’s aspirational targets into context-driven pedagogical reforms. This integration enables education systems to embody global sustainability commitments while fostering localized responsiveness to learner diversity, with teacher development catalyzed through student-informed reflective practices.
3. Materials and Methods
This study attempts to develop and validate a student-centered SoTL assessment scale, based on two existing student assessment scales for the SoTL, and considering SoTL practices and the national assessment system in China.
3.1. Stage 1—Initial Item Generation
Based on two established scales and incorporating China’s Official Teaching Evaluation Index System, this study constructed a student evaluation scale for the SoTL.
The foundational scale for this study is derived from Abrami, d’Apollonia, and Rosenfield’s pioneering work, which initially developed a “student assessment scale” in 1993. Through iterative revisions over decades, this scale was formally published in the book The Scholarship of Teaching and Learning in Higher Education: An Evidence-Based Perspective [10]. The refined version identifies 31 instructional characteristics critical to effective teaching, independently prioritized by both students and educators.
The second scale is the “Course and Teaching Evaluation Questionnaires” (CTEQ) presented in the book Evaluating Teaching and Learning [32]. This book, a practical guide to the SoTL for colleges and universities, was compiled through interviews with esteemed educators, and the CTEQ consists of nine dimensions of teaching skills. To fully consider the higher education context in China, the scale was also constructed with reference to some of the evaluation criteria and contents in the Undergraduate Education and Teaching Evaluation Indicator System (2021–2025) issued by the Ministry of Education of China.
Moreover, this study conducted semi-structured interviews with five students to collect more comprehensive information about the SoTL from a student assessment perspective. In addition, 10 students were asked to read the initial scale and to report, face to face, any confusion about its wording and categorization, to ensure that the statements in the initial questionnaire were easy to understand and free of ambiguity. As a result, the initial scale comprises 48 items that are categorized into five hypothetical constructs based on inter-item relationships.
3.2. Stage 2—Pilot Test
3.2.1. Pilot Test Procedure
Scale development includes conducting a pilot test on relevant populations to remove items that do not meet a certain threshold of reliability and validity in the initial item pool [33,34]. The pilot test was conducted on the same teaching day in a randomly selected classroom, chosen from four universities situated in Guangzhou, Guangdong Province, China. A total of 237 surveys were distributed, with 202 valid responses returned, resulting in a response rate of 85.2%.
The pilot test is divided into two parts. Scale A asks students which factors they consider more important for the SoTL, and exploratory factor analysis (EFA) is applied to the results to confirm the factor structure. Scale B asks students to evaluate their teachers directly, and multiple regression analysis is applied to the results to confirm the explanatory power of the factors.
3.2.2. Results of Exploratory Factor Analysis (EFA) of Scale A
Prior to conducting exploratory factor analysis (EFA), the suitability of the scale was assessed using Bartlett’s test of sphericity and the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy. The initial KMO value (0.667) indicated suboptimal structural validity, necessitating the further refinement of the scale. To enhance structural validity, 11 items with individual KMO values below 0.5 and item reliability coefficients lower than 0.6 were sequentially removed. After these adjustments, the finalized scale retained 37 items, achieving an overall KMO value of 0.721 (Table 2), which meets the recommended threshold for factor analysis (KMO > 0.70). This result confirms the dataset’s appropriateness for subsequent EFA.
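As an illustration of the two suitability checks described above, the following minimal sketch computes the overall and per-item KMO values and Bartlett's chi-square statistic from a correlation matrix, using the standard textbook formulas. It is not the authors' analysis script: the 3-item correlation matrix is made up for illustration, the function names are hypothetical, and the 0.5 cut-off is taken from the item-removal rule stated in the text.

```python
import math


def inverse_and_det(matrix):
    """Gauss-Jordan inverse and determinant with partial pivoting."""
    p = len(matrix)
    # Augment with the identity: [M | I] is reduced to [I | M^-1].
    aug = [list(row) + [float(i == j) for j in range(p)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        scale = aug[col][col]
        det *= scale
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[p:] for row in aug], det


def kmo(corr):
    """Return (overall KMO, per-item KMO list) for a correlation matrix."""
    p = len(corr)
    inv, _ = inverse_and_det(corr)
    # Partial (anti-image) correlations from the inverse correlation matrix.
    partial = [[-inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                for j in range(p)] for i in range(p)]
    r2 = [sum(corr[i][j] ** 2 for j in range(p) if j != i) for i in range(p)]
    q2 = [sum(partial[i][j] ** 2 for j in range(p) if j != i) for i in range(p)]
    per_item = [r2[i] / (r2[i] + q2[i]) for i in range(p)]
    return sum(r2) / (sum(r2) + sum(q2)), per_item


def bartlett_sphericity(corr, n_respondents):
    """Return (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    _, det = inverse_and_det(corr)
    chi2 = -(n_respondents - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


# Made-up 3-item correlation matrix, NOT data from the study.
toy_corr = [[1.0, 0.5, 0.4],
            [0.5, 1.0, 0.3],
            [0.4, 0.3, 1.0]]
overall_kmo, item_kmo = kmo(toy_corr)
chi2, df = bartlett_sphericity(toy_corr, n_respondents=202)
# Items whose individual KMO falls below 0.5 would be candidates for removal.
flagged = [i for i, value in enumerate(item_kmo) if value < 0.5]
print(round(overall_kmo, 3), [round(v, 3) for v in item_kmo])
print(round(chi2, 1), df, flagged)
```

In practice, the per-item values would be recomputed after each removal, since dropping one item changes the partial correlations of the remaining items.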
The result of the EFA indicates that the participants’ evaluations of the items’ significance were coherent and stable. Principal component analysis extracted 14 common factors with eigenvalues exceeding 1. The first five common factors account for 51.345% of the total variance, demonstrating their significance in the construct, while the contribution from the sixth common factor onwards steadily diminishes. The scree plot was then inspected: factors on the steep part of the curve are retained, while those on the flat part are removed. The scree plot of this scale shows an inflection point at the fifth eigenvalue, beyond which the curve flattens. Therefore, combining the conclusions derived from these two techniques, the preliminary number of factors extracted was five (Table 3).
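The two retention criteria discussed above can be sketched as follows. The eigenvalues here are invented for a hypothetical 8-item scale and are not the values behind Table 3; the function names are likewise illustrative. The sketch counts components with eigenvalues above 1 and reports the cumulative share of variance, which is the quantity reported as 51.345% for the first five factors in this study.

```python
def kaiser_retained(eigenvalues, threshold=1.0):
    """Number of components whose eigenvalue exceeds the threshold."""
    return sum(1 for value in eigenvalues if value > threshold)


def cumulative_variance(eigenvalues):
    """Cumulative share of total variance, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    shares, running = [], 0.0
    for value in ordered:
        running += value
        shares.append(running / total)
    return shares


# Made-up eigenvalues for an 8-item scale, NOT values from the study.
toy_eigenvalues = [3.2, 1.6, 1.1, 0.7, 0.5, 0.4, 0.3, 0.2]
n_kaiser = kaiser_retained(toy_eigenvalues)
shares = cumulative_variance(toy_eigenvalues)
for rank, (value, share) in enumerate(zip(toy_eigenvalues, shares), start=1):
    print(f"component {rank}: eigenvalue {value:.1f}, cumulative {share:.1%}")
print("components with eigenvalue > 1:", n_kaiser)
```

The eigenvalue-greater-than-1 rule often retains more components than are interpretable, which is why it is usually read alongside the scree plot, as is done here.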
3.2.3. Results of Stepwise Regression Analysis of Scale B
To validate the significance of these five dimensions, the relationship between students’ overall evaluation of a teacher and their ratings of the teacher’s attributes was investigated. In Scale B, regression analysis was used to establish whether the students’ ratings on the five dimensions (interpersonal relationships, communicative reflection, knowledge base, classroom practice, and teaching effectiveness) are associated with their overall judgment of the teacher. Such an association may indicate that these five dimensions carry significant weight for students, even if they are not consciously aware of it.
In this study, stepwise regression analysis was used to reduce the interference of multicollinearity, and the results showed that all five predictor variables entered the model and significantly predicted the criterion variable (Table 4). Together, the five independent variables explained 99.3% of the variance in the dependent variable (R² = 0.993). All five variables had considerable power in predicting students’ overall satisfaction with their teachers’ teaching. The individual variance explained by each independent variable in the regression model was significant at p < 0.05.
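To make the reported R² concrete, the sketch below fits ordinary least squares by hand and runs a simplified forward selection. It rests on several assumptions: the ratings are toy numbers rather than study data, the predictor names (dim_a, dim_b, noise) are hypothetical, and predictors are entered by their gain in R² rather than by the F-test entry and removal criteria that statistical packages typically use for stepwise regression.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(predictors, y, min_gain=0.01):
    """Greedy forward selection: add the predictor with the largest R^2 gain."""
    selected, best_r2 = [], 0.0
    remaining = list(predictors)
    while remaining:
        scores = {name: r_squared([predictors[s] for s in selected + [name]], y)
                  for name in remaining}
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break  # no remaining predictor adds enough explained variance
        selected.append(name)
        best_r2 = scores[name]
        remaining.remove(name)
    return selected, best_r2


# Toy ratings, NOT study data: overall = 2 * dim_a + dim_b exactly.
toy_predictors = {
    "dim_a": [1, 2, 3, 4, 5, 6],
    "dim_b": [2, 1, 4, 3, 6, 5],
    "noise": [1, 0, 0, 1, 0, 1],
}
toy_overall = [4, 5, 10, 11, 16, 17]
selected, r2 = forward_stepwise(toy_predictors, toy_overall)
print(selected, round(r2, 3))
```

Because the toy outcome is an exact linear combination of two predictors, the selection stops once both have entered and the irrelevant column is left out.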
The results of the two parts of the pilot test are consistent for these five factors. The multiple regression analysis reveals that the mean and total scores of the five dimensions obtained from the EFA are all aligned with the average teacher evaluation score as the dependent variable. This consistency led to the final identification of five dimensions (37 items) of the SoTL based on student perspectives in the context of Chinese higher education.
3.3. Stage 3—Main Survey
The main survey was conducted with 505 students in China. The students were asked to respond to statements using a 5-point Likert scale ranging from 1 (strongly unimportant) to 5 (strongly important). Lastly, confirmatory factor analysis (CFA) and convergent and discriminant validity analysis were conducted.
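The text does not state which indices were used for the convergent and discriminant validity analysis. One common choice, assumed here purely for illustration, is to compute the average variance extracted (AVE) and composite reliability (CR) from standardized CFA loadings and to apply the Fornell-Larcker comparison between constructs. The loadings and the inter-construct correlation below are invented, and the function names are hypothetical.

```python
import math


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(value ** 2 for value in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR from standardized loadings, with error variance 1 - loading^2."""
    total = sum(loadings)
    error = sum(1 - value ** 2 for value in loadings)
    return total ** 2 / (total ** 2 + error)


def fornell_larcker_ok(ave_a, ave_b, correlation):
    """True if sqrt(AVE) of both constructs exceeds their correlation."""
    return math.sqrt(min(ave_a, ave_b)) > abs(correlation)


# Invented standardized loadings for two constructs, NOT study results.
loadings_a = [0.7, 0.8, 0.6]
loadings_b = [0.8, 0.75, 0.7]
ave_a = average_variance_extracted(loadings_a)
ave_b = average_variance_extracted(loadings_b)
cr_a = composite_reliability(loadings_a)
distinct = fornell_larcker_ok(ave_a, ave_b, correlation=0.55)
print(round(ave_a, 3), round(ave_b, 3), round(cr_a, 3), distinct)
```

Commonly cited rules of thumb treat an AVE of at least 0.5 and a CR of at least 0.7 as evidence of convergent validity, although these thresholds are conventions rather than strict tests.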