Trustworthy Assessment of University Competitiveness Using a Neural Network Model

Grzeszczyk, Tadeusz A.

doi:10.3390/info17060536

Tadeusz A. Grzeszczyk

Faculty of Management, Warsaw University of Technology, Narbutta 85, 02-524 Warsaw, Poland

Information 2026, 17(6), 536; https://doi.org/10.3390/info17060536

Submission received: 29 March 2026 / Revised: 16 May 2026 / Accepted: 21 May 2026 / Published: 1 June 2026

(This article belongs to the Section Artificial Intelligence)

Abstract

Universities compete for funding, and their positions depend on the results of national assessments and rankings, which are expensive to produce and based on difficult-to-predict expert opinions. Assessment results have a significant impact on a university’s reputation, funding levels, attractiveness to faculty and staff, and success in recruiting top-tier students. Expert assessments and forecasts are widely used, but additional support from trusted AI tools is desirable. Several attempts have been made to use various machine learning methods, but confidence in such solutions is limited due to perceived difficulties in clearly and reliably justifying the resulting predictions. This research aims to present a proposal for using neural network models, accompanied by explanations of their predictions, to support trustworthy and sustainable assessment of university competitiveness. This methodological contribution enhances the transparency and interpretability of the assessment process and is further supported by empirical studies based on data from selected universities. A Fully Connected Neural Network (FCNN) is used for the calculations, and the local interpretable model-agnostic explanations (LIME) method is applied to explain the prediction results. The results confirm the usefulness of the proposed model and provide a solid foundation for improving evaluation systems and building trust in AI applications for assessing universities’ competitive position and the benefits of scientific research for society.

Keywords:

explainable AI; trustworthy assessment; sustainable AI; university rankings; research prestige; explainable prediction; neural networks; local interpretable model-agnostic explanations

1. Introduction

Substantial public investment in academic research is crucial for contemporary sustainable growth, fostering both the advancement of knowledge and innovations that deliver tangible benefits to society and the marketplace. Reliable and effective research assessment tools are essential for policymakers, regulatory bodies, and research funding agencies to precisely target funding streams towards R&D projects with the greatest potential for application, especially in the context of responsible research in the social sciences and management. Therefore, research on developing simple and cost-effective models for assessing research impacts is of significant interest [1]. Evaluating research impact and other university quality metrics allows funding institutions and policymakers to gauge their competitiveness in securing external funding. There is a significant need to develop tools and models for building rankings and conducting comparative analyses of various higher education institutions, considering the multiplicity and complexity of competition in higher education [2].

The research presented in this article stems from a recognized need and opportunity for interdisciplinary methodological research that improves multifaceted research impact assessment by supporting experts and, to some extent, even replacing them with AI systems, since human decisions are typically accompanied by uncertainty and limited objectivity. The key challenge is to improve, along multiple dimensions, the assessment of a university’s competitive position based on research impact by using neural networks, which have significant potential for solving predictive problems. Neural network models may provide a practical, scalable support tool for university assessment, complementing expert-based evaluation procedures and offering opportunities to enhance confidence in the results.

The application of neural networks in this field is already a subject of research, e.g., in assessing the interdisciplinary importance of scientists, institutions, and countries using a method based on the analysis of interconnected multi-layer networks of disciplines and citations [3]. A significant limitation in the large-scale application of neural network models is the lack of a clear and reliable justification for the results. In this study, which examines the use of neural network models with explanations in assessing the competitive position of universities related to research impact, the first step was to build a model based on the Fully Connected Neural Network (FCNN). The calculations used empirical datasets for selected universities assessed using the UK’s Research Excellence Framework (REF) system. This framework supports the assessment of research quality in higher education institutions, their ranking, and decisions regarding the allocation of research funds. The assessment process takes into account research outputs and outcomes, the impact of that research beyond academia, and the research environment. Non-academic impact is playing a growing role in the assessment and distribution of research revenues, to the extent that it contributes to increasing income inequality among higher education institutions and to the allocation of the majority of funds to a smaller number of universities [4].

This research aims to propose the use of neural network models, accompanied by explanations of their predictions, for trustworthy and sustainable assessments of university competitiveness. The methodological contribution lies in combining the FCNN with the Local Interpretable Model-Agnostic Explanations (LIME) method to explain the prediction results.

The following research questions were formulated:

(1): To what extent can neural network models accurately predict the competitiveness of universities based on empirical data?
(2): How can explainable AI techniques contribute to trustworthy and sustainable neural network–based assessments of university competitiveness by improving transparency and interpretability, and which factors emerge as the most influential?

National research evaluation systems are not uniform. Some countries use citation metrics, while others rely primarily on peer review. Opinions vary regarding the effectiveness of these two distinct concepts, and some countries even switch from one approach to the other every few years. There is credible research evidence that, in the context of the UK REF, it is justified to conduct analyses at the institutional level rather than the publication level. For many areas of scientific research, metrics reflect peer review relatively well and can be used as an alternative [5].

The institution’s Grade Point Average (GPA), as reported in the UK REF, is closely tied to research impact (including societal impact) and citation and bibliometric indicators. Previous research has shown that for many Units of Assessment (UoAs), there is a significant correlation between research impact measured by citation data and institutions’ GPA rankings, reflecting the results of analyses of their outputs conducted within the UK REF [6].

Compared to previous work, undertaking this research made it possible to introduce an effective and reliable method of predicting the value of the GPA indicators using neural networks with explanations. Explaining prediction results may contribute to increasing confidence in predictive neural network models and expanding their applications in research impact assessment. The proposed prediction method may also support universities in monitoring accumulated achievements subject to subsequent evaluation, while the prediction results may help academics better target their work with regard to GPA, ranking positions, and public research funding.

The structure of this article is designed to achieve the research objective and obtain key benefits. The following sections present the evolution of research excellence assessment, which plays a key role in the competitiveness of higher education institutions, details of the methodological approach, and a synthetic discussion of the empirical research results. Finally, conclusions, study limitations, and directions for further research are presented.

2. National Research Assessment Systems and University Competitive Position

2.1. Research Evaluation as a Basis for Public Fund Allocation

The allocation of limited public funds for research is typically based on a multifaceted assessment of universities’ achievements in research excellence and the benefits they bring to the economy and society beyond the academic world. Universities compete for funding, and their competitive position depends on the results of evaluations within national systems and rankings. These evaluation systems are expensive and rely heavily on expert opinions, which are not always easy to predict. Universities are interested in evaluation results because they determine not only their reputation but also the amount of funding they receive, the level of interest from potential employees in the labor market, and the recruitment of quality students.

The results of research evaluation systems play a key role in shaping universities’ image, as they directly influence their position in international and national rankings. These rankings are closely monitored by the public and decision-makers, and therefore influence decisions regarding the participation of qualified candidates in the recruitment process and the allocation of state funding. Universities, therefore, strive to continually improve their research standards, as this improves their competitive position. Research evaluation systems serve as a tool that motivates universities to invest in innovative projects, which translates into increased prestige and attractiveness to industry partners and funders. The results of these evaluations depend not only on the number of publications but also on their impact on the development of science and the economy, which is taken into account in the evaluation processes.

The impact of scientific work should be visible in the immediate and wider environments of universities in many respects, both economic and social, and should shape research development strategies. Research results should not only be assessed in terms of scientific excellence, as specifically understood by university staff, but above all, they should be relevant to stakeholders from business and public administration institutions. Therefore, it is essential to develop responsible research in social sciences, business, and management, which requires not only rigorous methodologies but also transparent evaluation mechanisms that consider both substantive quality and the real impact of research on addressing contemporary socioeconomic challenges.

Decision-makers should have reliable evaluation indicators and effective tools for identifying research projects and programs that spend public funds effectively and produce measurable effects, such as increased competitiveness of regions and national economies and improved quality of life in line with the concept of sustainable development [7]. Research projects and programs in the social sciences and in management studies should have a multifaceted impact (applied, theoretical, scientific, social, educational, and political), and researchers should systematically expand the scope of their research so that its impact matches key social problems that go beyond the immediate context of organizational management [8].

Financing research in the form of research projects and programmes should be considered in the context of higher education policy, which is based on quasi-market competition between scientific institutions, as well as on estimating results in terms of public good, social impact and comparative analyses of performance [9]. The use of performance-based research funding systems usually leads to improved efficiency of public research funding instruments and improved management of the higher education system [10].

2.2. Competitive Position of Universities

Public funding for research is allocated through special financial instruments, taking into account the results of national evaluation systems. To optimize this process, policymakers and regulatory bodies continuously refine mechanisms for distributing public funds, ensuring greater efficiency and impact. In practice, funding is often directed toward top-performing universities, which receive additional resources to further advance their research capabilities. Sometimes, these forms of support for selected universities are awarded in the form of special initiatives (programs and projects) for research excellence, which support the improvement of research quality and the achievement of a significant competitive position by universities, not only at the national level but, above all, in the context of a globalizing research environment [11].

The amount of public funding allocated to scientific research is therefore primarily dependent on performance-based analysis, which assesses scientific excellence and social impact. Conducting such analysis and assessment processes is a complex and multifaceted problem, the solution to which is difficult to achieve, and the solutions employed are often subject to criticism within the scientific community. This criticism concerns, e.g., the substantial resources required to implement university performance assessment processes, accusations of limited objectivity among expert teams, and sometimes unclear and quasi-arbitrary judgments made based on a wide variety of criteria related to, e.g., the natural environment, social impact, and scientific products of projects [12]. The objectivity of the analysis and evaluation processes can be improved by using quantitative bibliometric indicators, but they are not sufficient, and it is necessary to take into account qualitative aspects of social impact, which are usually investigated and described by means of case studies [13].

The development of tools that support experts in decision-making can facilitate assessment processes and increase objectivity. Potentially useful solutions include various AI methods, such as neural network models, deep learning, and integrated AI systems [14]. Such solutions can be used to enhance multifaceted research evaluation systems and the competitive standing of universities that conduct them. This direction of method and tool development is particularly important in the case of assessing qualitative outcomes, which are difficult to quantify and measure. AI methods and systems offer an opportunity to mitigate the potential bias of experts, who often hold differing beliefs and attitudes, and whose decisions can be influenced by various personal circumstances that are sometimes difficult to predict and properly incorporate into analyses.

Research on the application of new AI models in institutional evaluation addresses an important gap, as many previous AI-based solutions have focused mainly on educational and teaching processes [15]. Less attention has been paid to using AI methods to assess institutions themselves and to support the analysis of their competitive position. Efforts are being made to develop performance evaluation indicator systems, for example, within the framework of an evaluation model based on selected elements, such as Context–Input–Process–Product, which allow for comparing inputs and outputs to assess competitiveness, increase the motivation of university staff, improve resource allocation, and enhance various benefits for universities [16].

2.3. The Increasing Role of Evaluation and Prediction Methods

The growing importance of evaluation methods and tools is particularly evident amidst the increasing competition for research funding. To some extent, this competition can be assumed to resemble the market strategies employed by commercial organizations. However, assessing the competitive position of scientific organizations is significantly more difficult than assessing that of companies. The results of competition among commercial organizations follow from the financial performance of individual competitors, whereas the scientific excellence and social utility of knowledge derived from research are hard to capture in a simple, quantitative way, particularly in the social sciences and humanities. Furthermore, evaluation indicators are subject to frequent changes, resulting from numerous attempts to improve the system, not all of which are successful.

Despite significant challenges in assessing the competitive position of universities, new national performance-based evaluation systems are being developed or existing ones improved, e.g., the UK REF [17], Excellence in Research for Australia (ERA), Performance-Based Research Fund (PBRF) in New Zealand, Research Council of Norway (RCN), and Research Assessment Exercise (RAE) in Hong Kong [18].

National systems are developing independently of each other, but certain similarities can be observed, e.g., in terms of emphasizing the great importance of the utilitarian values of the knowledge created and assessing the social and economic impact of research, which was initiated in the British REF system [19]. National university evaluation and ranking systems are sometimes directly modeled on the British model, e.g., in Poland and Hong Kong. There are signs that evaluation criteria related to research impact will continue to gain importance [18].

A review of the evaluation methods in use indicates that development has proceeded mainly by adapting methods known from the social sciences and, to a lesser extent, methods based on bibliometric approaches. Classic methods include expert panels, multi-criteria analyses, case studies, mixed methods, observation, interviews, document analysis, bibliometric analyses of citations and impact factors, as well as emerging text mining methods and alternative metrics related to social media [20].

The growing importance of national research assessment systems, such as the REF in the UK, necessitates the use of robust methods for assessing the competitiveness of universities. The results of the assessment are largely based on contextual and qualitative observations and analyses by experts. The practices of expert review, which have been known for many years, remain largely unchanged, even when government and public research funding institutions attempt to control and streamline the review processes in expert panels by introducing more and more rules on structuring expert debates and improving their accountability [21].

Expert assessments are often supplemented with multi-criteria decision-making (MCDM) and hybrid MCDM models in order to integrate different dimensions of results [22]. The case study method further enriches the evaluation processes by offering in-depth insight into institutional strategies and results based on qualitative analyses. This method is commonly used to demonstrate and justify public expenditure on research, based on documented evidence of its impact on society, the economy, the environment, culture, and other areas [23].

Observation techniques are also used to understand organizational behavior and research culture at universities [24]. However, bibliometric indicators remain the basis for research evaluation, measuring the number of publications and the impact of citations. Nevertheless, the development of digital technologies and evaluative bibliometrics is contributing to the ongoing transformation of indicator-based research evaluation towards the increasingly widespread use of data-driven systems [25]. Indicators such as citation counts, the h-index, and journal impact factors are widely used to gauge research quality. The selection of indicators and their application are crucial, as they significantly impact the assessment of higher education’s functioning and the quality of scientific research [26]. The indicators are also used to assess universities’ competitive position by estimating their social impact, using methods based on indicators of the impact of scientific research outside the academic community [27]. These indicators reflect contributions to politics, the economy, and society, broadening the scope of competitiveness assessment [28]. Methods and indicators of scientometric evaluation concerning information processes in scientific research support the evaluation of research and publication activity of individual employees, research teams, and scientific institutions at the level of national evaluation systems [29]. Performance indicators at the individual researcher and team levels influence the results of institutional evaluation, which requires integrating various indicators into complex, multidimensional evaluation systems. The complexity and intricacy of evaluation systems create an atmosphere of uncertainty, anxiety, and concern about the subjectivity of assessments, despite the often quantitative nature of the indicators used [30]. Therefore, the literature on the subject highlights the challenges associated with assessment methods and techniques and emphasizes the need to develop a new generation of rankings that can integrate new information-processing methods and remain open to new ranking techniques for determining the competitive position of scientific institutions [31].

Recently, there has been an unprecedented increase in interest in artificial intelligence technologies in higher education institutions, where they play an increasingly important role in the strategic management of universities and influence academic staff’s attitudes and students’ skills [32]. The use of AI models in evaluating scientific research and the competitive position of universities can also bring many benefits. These include automating the analysis of large data sets, which significantly speeds up the evaluation process; identifying complex and nonlinear relationships that are difficult to capture using traditional statistical methods; increasing the objectivity of assessments by limiting subjective decisions by experts; building predictive models that support forecasting the future impact of research and the competitive position of scientific institutions; and making better use of data in decision-making processes related to research management and science policy.

Due to challenges in the methods and techniques for assessing and ranking universities’ competitive positions, there is a need to develop research on the application of advanced computational techniques in this field to support predictive and evaluative processes. Among the various methods, FCNN deserves special attention, as it offers great possibilities for modeling nonlinear relationships in evaluation data. FCNN models enable the integration of various indicators and, based on them, support the analysis of the competitiveness of scientific institutions. However, the nature of neural networks, which are difficult to analyze due to their “black box” functioning, raises concerns about the clarity and reliability of evaluation results. To address this, explainability techniques such as LIME are being developed to provide local interpretations of model predictions.

The combined use of FCNN and LIME enables stakeholders to understand the impact of individual characteristics on predicted competitiveness outcomes; therefore, this neural network model, when used with LIME, is a promising approach to trustworthy assessment of university competitiveness. The development of this method can be based on previous experience with various applications of Explainable AI (XAI), e.g., for transparent decision-making in production management [33], security systems [34], load forecasting systems [35], and other applications [36].

Table 1 presents a qualitative comparison of selected prognostic and decision-support methods using a three-level assessment scale across predictive performance, interpretability, data efficiency, robustness, scalability, and prognostic utility. The comparison was made based on the author’s previous experience [37] and indicates that neural network-based approaches may offer higher predictive performance and scalability but require more data and provide limited interpretability unless supplemented with post hoc explanation methods. In contrast, expert-based, fuzzy, and case-based methods are generally more interpretable and data-efficient, although their predictive performance and robustness may be more limited and should be confirmed through empirical validation.

The future development of assessment methods and systems is linked to interdisciplinary research into the potential use of AI technologies. This research is noteworthy because it leads to an expansion of the set of available methods. The most well-known are neural network models, which are modeled on the functioning of the nervous system and the human brain. These relatively simple mathematical models prove to be highly useful for a variety of applications because they generalize knowledge acquired from the analysis of empirical data. Their significant utility in constructing assessment systems stems from their adaptability and flexibility in handling various data types, as well as their robustness to noisy, distorted, and even incomplete data. Among the drawbacks of classical neural networks is the difficulty in reliably justifying the obtained results.

Due to the shortcomings of neural network models, research into the applications of explainable neural network models is gaining increasing importance. These models are attracting significant attention due to their unique features. Among the features of systems based on such models is greater clarity and traceability in the decision-making process, as they identify the factors and input data that significantly influenced the final prediction. This reduces the ‘black box’ effect and helps build trust in AI-assisted decision-making systems.

It is justified to undertake various actions aimed at ensuring the sustainable development of the higher education sector, including those that lead to a continuous increase in stakeholder engagement, awareness, and trust [38]. New AI methods can support this by helping us detect judgment errors and reducing expert bias, thereby enabling the improvement of trustworthy assessment models. Furthermore, the ability to explain model decisions increases the acceptability of obtained results and, in some cases, compliance with selected regulations, such as the General Data Protection Regulation (GDPR) or AI regulations. The explanations generated by such models help experts from various fields better understand and verify results, more effectively integrate AI into human decision-making processes and increase trust in automated recommendations in human–machine collaboration. Ultimately, such systems can significantly contribute to building public trust in AI based on clear explanations of suggested decisions, which is fundamental to the widespread implementation and social acceptance of AI solutions.

3. Methodology

In this study, the FCNN-based model was used to predict GPA values because, compared to other neural network architectures, this architecture is computationally less expensive. It does not require complex mathematical operations, is based on a simple layered structure, and is easy to implement. Moreover, compared with more advanced machine learning approaches such as Random Forest, Extreme Gradient Boosting (XGBoost), Bayesian Additive Regression Trees (BART), or more complex neural network architectures, the FCNN provides a suitable and interpretable baseline for initial computational experiments. Although these alternative models may offer strong predictive performance, their use often involves additional tuning, higher model complexity, or more difficult interpretation of the prediction process. Therefore, the FCNN was selected as an appropriate starting point for forecasting GPA values, particularly because the main methodological focus of this study was not an exhaustive comparison of predictive algorithms, but the integration of a standard neural network model with the LIME explainability method.

To perform the calculations for the neural model, code was written in Python 3.13.7, a flexible, high-level, object-oriented programming language [39]. To explain the prediction results, the LIME method was employed; its software implementation was developed on the basis of a selected existing implementation [40].

Figure 1 presents a two-stage architecture for predicting and interpreting GPA values using an FCNN and the LIME method. In the first stage, the FCNN model is trained on standardized input data to generate GPA predictions for higher education institutions. In the second stage, LIME is applied to individual predictions to provide local, interpretable explanations by identifying the most influential features and their contributions.

The LIME methodology for explaining neural network predictions can be summarized through the following steps [35,41,42]:

(1): An individual instance is first selected as the target of the explanation and justification process.
(2): A synthetic dataset is then created by introducing random perturbations in the vicinity of the selected instance.
(3): The trained neural network, treated as a black box model, is applied to the perturbed instances to obtain corresponding predictions.
(4): Each perturbed sample is assigned a proximity-based weight, reflecting its similarity to the original instance and determining its relative influence in the explanation process.
(5): The most relevant features contributing to the neural network’s prediction are identified based on the perturbed data.
(6): A simplified and interpretable surrogate model is subsequently trained using the weighted perturbed samples.
(7): Finally, the local behavior of the neural network is explained by analyzing the feature contributions derived from the surrogate model.
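
The steps listed above can be illustrated with a minimal sketch written from scratch using only the Python standard library. It is not the implementation used in the study [40]. The function names (`solve`, `explain`), the Gaussian perturbations, the exponential kernel, the kernel width of 0.75 times the square root of the number of features, and the small ridge term are all assumptions made for illustration. Ranking features by the size of the fitted coefficients is a simplification of the feature-identification step, and a toy linear function stands in for the trained neural network.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def explain(predict, instance, n_samples=500, ridge=1e-6, seed=0):
    """Return one local linear coefficient per feature around `instance`."""
    rng = random.Random(seed)
    d = len(instance)
    width2 = (0.75 * math.sqrt(d)) ** 2            # assumed kernel width, squared
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, 1.0) for _ in range(d)]      # step 2: perturb
        sample = [xi + di for xi, di in zip(instance, delta)]
        targets.append(predict(sample))                      # step 3: query black box
        dist2 = sum(di * di for di in delta)
        weights.append(math.exp(-dist2 / width2))            # step 4: proximity weight
        rows.append(delta + [1.0])                           # offsets plus intercept
    # Step 6: weighted ridge regression solved through the normal equations.
    p = d + 1
    a = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
          + (ridge if i == j else 0.0) for j in range(p)] for i in range(p)]
    b = [sum(w * r[i] * t for w, r, t in zip(weights, rows, targets))
         for i in range(p)]
    return solve(a, b)[:d]                                   # step 7: contributions

# Step 1: choose the instance to explain; the lambda is a hypothetical black box.
black_box = lambda x: 3.0 * x[0] - 2.0 * x[1] + 0.5
coefs = explain(black_box, [0.2, -1.0, 0.7])
# Step 5 (simplified): rank features by absolute surrogate coefficient.
ranked = sorted(range(len(coefs)), key=lambda j: abs(coefs[j]), reverse=True)
print(ranked[:2], [round(c, 3) for c in coefs])
```

Because the stand-in black box is itself linear, the surrogate coefficients should closely match its true slopes, which makes the sketch easy to check before pointing `predict` at a real model.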

The calculations utilized empirical training and test data concerning selected universities in the UK, previously introduced for analyses using the tree-based machine learning model—Bayesian Additive Regression Trees (BART) [43]. Using this dataset for research provides a solid foundation for future studies with various AI models and for conducting comparative analyses of the resulting forecast outcomes.

The dataset consisted of 109 entries (one per university), each with 18 features and a GPA value. Each feature was assigned to one of three categories: Research Productivity Metrics (e.g., Average h-index), Financial and Staffing Metrics (e.g., University income), and Student Profile (e.g., Percentage of disabled students). Following previous research by Balbuena [43,44], the following variables were used: University GPA (dependent variable); Research Productivity Metrics (4): Average h-index, Citation impact, Percentage of PhD-holding staff, and Web of Science indexed publications; Financial and Staffing Metrics (4): University income, Spending per student, REF-submitted full-time equivalent researchers, and Student-staff ratio; Student Profile (10): Percentage of non-EU postgraduates, Entry tariff average, Percentage of disabled students, Socially disadvantaged student share, Percentage of state school students, ADHD student share, Graduate employment rate (in the UK, after 6 months), Average graduate salary, Satisfaction scores of students, and Career prospects rating. Around one-third of the entries were held out as the testing set, 10% of the dataset was used as the validation set, and the remaining samples formed the training set. Before training, the features and GPA were normalized to a mean of 0 and a variance of 1.
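
The data preparation described above can be sketched as follows. This is an illustrative sketch, not the study's code: the helper names `split_indices` and `standardize` are hypothetical, the random seed is arbitrary, the validation share is assumed to be 10% of the full dataset rather than of the remaining entries, and the text does not state whether the normalization statistics were computed on the training portion only.

```python
import random
import statistics

def split_indices(n, test_frac=1 / 3, val_frac=0.10, seed=0):
    """Shuffle row indices and split them into train, validation, and test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)     # about one-third of the entries
    n_val = round(n * val_frac)       # assumption: 10% of the full dataset
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]      # the remaining samples
    return train, val, test

def standardize(column):
    """Rescale a list of numbers to mean 0 and (population) variance 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(x - mean) / std for x in column]

train, val, test = split_indices(109)
print(len(train), len(val), len(test))     # 62 11 36

z = standardize([1.0, 2.0, 3.0, 4.0])      # toy column, not study data
print(round(statistics.fmean(z), 6), round(statistics.pvariance(z), 6))
```

Under these assumptions, the 109 universities split into 62 training, 11 validation, and 36 testing entries; other rounding conventions would shift these counts by one.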

The FCNN model used for the experiments contained four FC layers: three hidden layers with 64, 32, and 16 neurons, respectively, each followed by Rectified Linear Unit (ReLU) activation, and a final layer with one neuron and linear activation for regression of arbitrary real values. The popular ReLU function was chosen because it performs well in deep learning, is simple and computationally inexpensive, mitigates the vanishing gradient problem during backpropagation, and allows nonlinear and complex dependencies in datasets to be modeled effectively (Figure 2).
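The layer structure can be illustrated with a short forward-pass sketch in NumPy. The layer sizes and activations follow the description above; the weight initialization, the function names, and the random input batch are assumptions introduced only for illustration and are not taken from the study.

```python
import numpy as np

# 18 inputs -> 64 -> 32 -> 16 hidden units (ReLU) -> 1 linear output.
LAYER_SIZES = [18, 64, 32, 16, 1]

def init_params(sizes, rng):
    """Create (weights, bias) pairs; He-style scaling is an assumption."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, x):
    """Forward pass for a batch x of shape (batch, 18)."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)   # hidden layers use ReLU
    W, b = params[-1]
    return (h @ W + b).squeeze(-1)       # final layer is linear (regression)

def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(W.size + b.size for W, b in params)

rng = np.random.default_rng(0)
params = init_params(LAYER_SIZES, rng)
batch = rng.normal(size=(4, 18))         # hypothetical batch of 4 universities
preds = forward(params, batch)
n_params = count_params(params)
print(preds.shape, n_params)
```

With these layer sizes the network has 1216 + 2080 + 528 + 17 = 3841 trainable parameters, which is large relative to the number of training samples and is worth keeping in mind when reading the overfitting remarks in the Results section.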

The model was trained to predict GPA values based on 18 university-related features. Training was conducted for 50 epochs with an Adam optimizer (learning rate 0.0002), Mean Squared Error (MSE) loss, and a batch size of 4. After each epoch, the training set was randomly shuffled. The model state retained as the final result was the one that achieved the lowest validation loss across all training epochs.
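The mechanics of this training procedure (reshuffling once per epoch, mini-batches of 4, Adam updates on the MSE loss, and keeping the epoch with the lowest validation loss) can be sketched as below. To keep the example short, a linear regressor stands in for the FCNN, so this is not the model used in the study. The Adam constants other than the learning rate, the function name, and the synthetic data are likewise assumptions.

```python
import numpy as np

def train_with_checkpointing(X_tr, y_tr, X_val, y_val,
                             epochs=50, lr=2e-4, batch_size=4, seed=0):
    """Adam + MSE on a linear model (a stand-in for the FCNN)."""
    rng = np.random.default_rng(seed)
    add_bias = lambda X: np.hstack([X, np.ones((len(X), 1))])
    A_tr, A_val = add_bias(X_tr), add_bias(X_val)

    w = np.zeros(A_tr.shape[1])               # weights, bias in the last slot
    m, v = np.zeros_like(w), np.zeros_like(w)
    beta1, beta2, eps = 0.9, 0.999, 1e-8      # common Adam defaults (assumed)
    step = 0
    best_loss, best_w, best_epoch = np.inf, w.copy(), -1

    for epoch in range(epochs):
        order = rng.permutation(len(A_tr))    # reshuffle once per epoch
        for start in range(0, len(order), batch_size):
            b = order[start:start + batch_size]
            err = A_tr[b] @ w - y_tr[b]
            grad = 2.0 * A_tr[b].T @ err / len(b)   # gradient of the batch MSE
            step += 1
            m = beta1 * m + (1 - beta1) * grad
            v = beta2 * v + (1 - beta2) * grad ** 2
            m_hat = m / (1 - beta1 ** step)         # bias-corrected moments
            v_hat = v / (1 - beta2 ** step)
            w = w - lr * m_hat / (np.sqrt(v_hat) + eps)

        # Checkpoint selection uses the validation set only.
        val_loss = float(np.mean((A_val @ w - y_val) ** 2))
        if val_loss < best_loss:
            best_loss, best_w, best_epoch = val_loss, w.copy(), epoch
    return best_w, best_loss, best_epoch

# Synthetic data with the same shapes as a 62/11 train/validation split.
rng = np.random.default_rng(1)
true_w = rng.normal(size=18)
X_tr, X_val = rng.normal(size=(62, 18)), rng.normal(size=(11, 18))
y_tr = X_tr @ true_w + 0.1 * rng.normal(size=62)
y_val = X_val @ true_w + 0.1 * rng.normal(size=11)

best_w, best_loss, best_epoch = train_with_checkpointing(X_tr, y_tr, X_val, y_val)
print(best_epoch, best_loss)
```

The point of the sketch is that the test set never appears inside the loop: only `X_val` and `y_val` decide which epoch is kept.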

Having completed the training, the LIME method was applied to the chosen model for each university in the testing set. LIME is a technique for explaining the predictions of a black box model for a given sample by locally approximating the model’s decision function [45]. Given the model and the sample, LIME generates synthetic samples around the given one and uses the model to obtain predictions for those samples. Then, by fitting a linear model to these predictions, LIME produces a local explanation of the black box model’s decision.
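The procedure can be illustrated with a from-scratch sketch of the core idea: perturb the sample, query the black box, weight the perturbed samples by proximity, and fit a weighted linear surrogate. This is a simplified illustration and not the implementation in the lime package [41], which additionally discretizes tabular features and regularizes the surrogate. The Gaussian perturbations, the kernel width heuristic, and all names below are assumptions.

```python
import numpy as np

def lime_explain(predict_fn, x, n_samples=1000, scale=1.0,
                 kernel_width=None, seed=0):
    """Return (coefficients, intercept) of a local linear surrogate around x."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    if kernel_width is None:
        kernel_width = 0.75 * np.sqrt(d)      # heuristic choice (assumed)

    # 1. Generate synthetic samples in the neighbourhood of x.
    Z = x + rng.normal(0.0, scale, size=(n_samples, d))
    # 2. Ask the black box model for predictions on those samples.
    y = predict_fn(Z)
    # 3. Weight each sample by its proximity to x (exponential kernel).
    dist = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit a weighted least-squares linear model centred on x.
    A = np.hstack([Z - x, np.ones((n_samples, 1))])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1], coef[-1]

# Hypothetical black box: a known linear function, so the surrogate
# should recover its slopes and reproduce its value at x.
def black_box(Z):
    return 2.0 * Z[:, 0] - 1.0 * Z[:, 1] + 0.5

x0 = np.array([1.0, 2.0, 3.0])
contrib, intercept = lime_explain(black_box, x0)
print(contrib, intercept)
```

For a nonlinear model such as the FCNN, the coefficients depend on the neighbourhood size and on the random perturbations, which is one reason why the stability of such explanations deserves attention.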

4. Results and Discussion

The dataset contains information on 109 UK universities and includes 18 explanatory variables describing institutional, faculty, and student characteristics, as well as the target variable (GPA), which reflects research quality and performance. Each observation in the dataset corresponds to a single university and is represented as a structured tabular record. Table 2 presents an example of a single data instance used as input to the model.

The dataset is stored in tabular format (CSV/DTA), with rows corresponding to universities and columns representing numerical features. Prior to model training, all input variables were standardized to ensure comparability and stable convergence during optimization.

To enhance interpretability, the LIME method is applied to the trained FCNN model. For each selected university in the test set, LIME generates synthetic data points in the local neighborhood of the instance and obtains predictions from the neural network. Based on these locally sampled data, a simple linear model is fitted to approximate the behavior of the complex model. As a result, LIME provides local explanations of predictions by identifying the most influential features and their positive or negative contributions to the predicted GPA value.

The constructed neural model achieved an MSE of 0.13 and an R-squared value of 0.79. The obtained prediction results, shown in Figure 3, indicate that the model relatively accurately reflects the relationship between the input variables and the output GPA. The results are slightly worse than those obtained with the BART model [43], probably because FCNN models are prone to overfitting small datasets. The use of small tabular datasets entails significant methodological risks that must be considered when refining the solutions employed. It may be useful, for example, to draw on advances in the field of small-sample uncertainty assessment—specifically, its theoretical foundations, application scenarios, and implementation strategies [46].
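For reference, the two reported metrics can be computed as in the sketch below. The four target and prediction values are hypothetical numbers chosen so that the arithmetic is easy to follow; they are not data from the study.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical values, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]

example_mse = mse(y_true, y_pred)        # (0.25 + 0 + 0 + 0.25) / 4 = 0.125
example_r2 = r_squared(y_true, y_pred)   # 1 - 0.5 / 5.0 = 0.9
print(example_mse, example_r2)
```

Because R-squared is defined relative to the variance of the targets in the evaluated subset, the two metrics are linked through the variance of the test-set GPA values.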

Figure 4 depicts exemplary explanations of GPA predictions for Coventry University (GPA = 2.67) and University of Westminster (GPA = 2.72). The explanations provide insight into the possible reasons for the GPA value of each university. For example, a percentage of faculty with a PhD below 28% lowers the predicted GPA by approximately 0.3. According to the explanation, raising this value would be expected to increase the predicted REF assessment considerably. On the other hand, a small student-to-staff ratio (below 15.64) was marked as a positive factor, adding approximately 0.1 to the predicted GPA. For both universities, low citation impact was noted as a significant factor in lowering the results. The explanations offer valuable insight into the potential impact of various university features on their REF assessment.

The obtained results demonstrate the practical usefulness of such models in supporting analysis and evaluation processes, as well as in predicting and determining a university’s competitive position. The results can be considered a good introduction and encouragement to continue research into comparative analyses using other types of neural predictive models with explanations of the results. There is a general consensus among researchers regarding the feasibility of using explainability as a basis for designing and implementing strategies to build, maintain, and restore trust [47]. The introduction of such solutions is generally desirable given the limited explainability and interpretability of typical neural network models, the erosion of digital trust, the occurrence of confidentiality issues, and low trust in institutions that use AI, as well as the related serious social consequences and ethical problems accompanying decision-making processes [48]. Improving trustworthy prediction and assessment systems is consistent with the concept of sustainability because reliable, transparent, and accountable decision-support tools enable long-term, equitable, and well-informed management of environmental, social, and economic resources.

The presented solution, however, requires an additional critical perspective, recognizing that interpreting and justifying results cannot be limited to the use of explanatory models tied closely to neural network models alone. Research on AI models requires a broader scope of analysis, greater clarity, and socio-structural explanations that go beyond simple interpretation and understanding of the machine learning model itself [49]. It is also necessary to take into account the view that the growing interest in XAI and the number of proposed solutions may negatively affect the perceived usefulness of AI models and the perceived accuracy of AI-generated predictions and, consequently, reduce interest in modifying organizational processes to implement intelligent systems [50].

The experimental design adopted in this study should be interpreted in light of well-documented challenges in reliably evaluating machine learning models. In particular, selecting the epoch with the lowest loss on the test set may introduce optimistic bias, as the test data are no longer strictly independent of the model selection process. This issue has been extensively discussed in the literature on model assessment, where insufficient separation between training, validation, and testing stages has been shown to compromise the reliability and reproducibility of reported results [51,52].

Best practices in machine learning recommend reserving the test set exclusively for final performance estimation, while model selection and hyperparameter tuning should rely on validation data or nested cross-validation strategies [53,54]. Prior studies emphasize that even indirect adaptation to the test set can lead to systematically overestimated performance, particularly in data-limited and application-oriented studies [55,56].
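A nested cross-validation scheme of the kind recommended here can be sketched as follows. The inner loop selects a hyperparameter using only the development folds, and the outer loop estimates performance on folds that played no role in that selection. To keep the example short, closed-form ridge regression stands in for the neural network, so this is not the model used in the study. The candidate regularization values, fold counts, and synthetic data are assumptions made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression (no intercept; data assumed centred)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

def k_folds(n, k, rng):
    """Split indices 0..n-1 into k shuffled, disjoint folds."""
    return np.array_split(rng.permutation(n), k)

def select_lambda(X, y, lams, k, rng):
    """Inner loop: pick the lambda with the lowest mean validation MSE."""
    folds = k_folds(len(X), k, rng)
    scores = []
    for lam in lams:
        errs = []
        for i, val in enumerate(folds):
            tr = np.concatenate([f for j, f in enumerate(folds) if j != i])
            errs.append(mse(X[val], y[val], ridge_fit(X[tr], y[tr], lam)))
        scores.append(np.mean(errs))
    return lams[int(np.argmin(scores))]

def nested_cv(X, y, lams, outer_k=5, inner_k=3, seed=0):
    """Outer loop: each test fold is used only for the final error estimate."""
    rng = np.random.default_rng(seed)
    outer = k_folds(len(X), outer_k, rng)
    fold_errors = []
    for i, test in enumerate(outer):
        dev = np.concatenate([f for j, f in enumerate(outer) if j != i])
        lam = select_lambda(X[dev], y[dev], lams, inner_k, rng)
        w = ridge_fit(X[dev], y[dev], lam)
        fold_errors.append(mse(X[test], y[test], w))
    return float(np.mean(fold_errors)), fold_errors

# Synthetic data with the same shape as the dataset (109 rows, 18 features).
rng = np.random.default_rng(2)
X = rng.normal(size=(109, 18))
y = X @ rng.normal(size=18) + 0.1 * rng.normal(size=109)

estimate, fold_errors = nested_cv(X, y, lams=[0.01, 0.1, 1.0, 10.0])
print(estimate)
```

With only 109 observations, such a scheme reuses every row for both development and testing across folds while keeping the two roles separate within each fold.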

From the perspective of trustworthy AI, methodological soundness is a fundamental prerequisite for model clarity and reliability. Recent frameworks for trustworthy AI stress that trustworthiness cannot be reduced to predictive accuracy alone, but must also encompass robustness and reproducibility of the experimental protocol [57,58]. In this regard, evaluation choices directly affect the credibility of downstream explainability and interpretation analyses.

XAI methods, such as feature attribution and surrogate modeling, are increasingly used to enhance transparency and user trust in AI systems [59,60]. Popular approaches, including LIME [45,61,62] and SHapley Additive exPlanations (SHAP) [63,64], provide post hoc explanations of model predictions, but their reliability critically depends on the stability and generalization of the underlying predictive models. Recent studies have shown that explanation methods may be sensitive to data perturbations, model overfitting, and evaluation bias, thereby reinforcing the need for rigorous, unbiased model assessment [65]. Incorporating uncertainty estimates alongside explainable predictions has been shown to enhance the interpretability and robustness of model outputs, particularly in safety-critical or high-stakes application domains [66].

Against this backdrop, the present study should be viewed primarily as an application-oriented investigation of established AI methods on a domain-specific dataset, with an emphasis on interpretability and trustworthiness rather than methodological novelty. The reported results are therefore intended to provide indicative insights into model behavior and explanation mechanisms, serving as a foundation for more rigorous, transparent, and trustworthy evaluation protocols in future research.

At the end of the discussion of the research results, it is worth briefly addressing the research questions posed. Regarding the first research question, it is evident that neural network models can capture complex, nonlinear relationships in empirical data on the competitiveness of public universities. However, when considered in isolation, these models exhibit significant limitations in terms of trustworthiness. Their black box nature limits insight into the underlying decision-making mechanisms, reducing confidence in the predictions and limiting their direct applicability for strategic assessment in the public higher education sector. Consequently, while neural networks may offer predictive potential, their outputs alone should be interpreted with caution.

Addressing the second research question highlights the added value of explainable AI techniques in overcoming these limitations. By providing transparent, interpretable explanations of neural network predictions, XAI methods enable the identification of the most influential factors shaping public university competitiveness and a better understanding of model behavior. This enhanced transparency supports more trustworthy and sustainable assessments, enabling stakeholders to critically evaluate results, increase confidence in AI-assisted analyses, and use the findings as an informed input to decision-making rather than as opaque predictions.

5. Conclusions

The usefulness of classic data analysis and prediction methods is often limited, and therefore, approaches based on AI technologies, particularly neural network models, are sought. However, such models should be developed with two needs in mind: explaining the obtained results and building trust in the applications of intelligent systems. AI systems supporting the prediction and determination of university competitiveness should focus not only on solving complex and multifaceted problems but also on creating conditions for good relations between various stakeholder groups, including regulatory institutions, research funding organizations, and representatives of the scientific community. Building trust in intelligent assessment models is crucial for the functioning of higher education institutions and for creating friendly working conditions for researchers. Researchers face increasing pressure to have their work evaluated not only in terms of bibliometric indicators but, above all, in terms of parameters related to the university environment and research quality measures that should be useful in the non-academic world.

Current evaluation systems are not only expensive but also often raise questions about their objectivity and the lack of clear justification for the results obtained. Existing approaches are largely insufficient, and opportunities are being explored to expand available solutions through the use of AI technologies, which often operate as a black box. This research highlights methods that could improve the quality of predictive models and increase trust in evaluation systems.

5.1. Methodological Contributions

This paper presents a proposal for utilizing a neural network model, accompanied by explanations of its predictions, in trustworthy and sustainable assessments of university competitiveness. The importance of this type of evaluation increases each year, and it is worth exploring alternative solutions to evaluation systems based primarily on expert judgment, which may be subject to bias.

Supporting evaluation decisions using methods based on machine learning and neural networks is a sensible idea. However, the introduction of this type of solution is hindered mainly by the lack of transparent and credible principles for justifying prediction results. The research demonstrated the feasibility and practical usefulness of the FCNN-based model for predicting the GPA index. It has also been demonstrated that the LIME method can be used to efficiently explain GPA predictions that are useful in determining the competitive position of universities. Predictive models using neural networks with explanations can be utilized by research impact assessors and rated institutions, enabling them to predict GPA values and adjust their research plans accordingly. Efficient explanations of the prediction results can increase confidence in the models used and the obtained predictions. The research is preliminary and requires further development using larger datasets and more advanced neural network models and methods to explain the results.

It is also worth conducting an analysis aimed at dimensionality reduction and effective feature selection to select a small number of the most important independent variables that significantly influence the dependent variable, while maintaining the greatest possible predictive power of the selected machine learning models. Consideration should also be given to additionally involving experts in the process of justifying results and improving human–computer interactions.

In reference to the research questions posed and based on the conducted empirical analyses, neural network models demonstrate potential in predicting the competitiveness of public universities, but their black box nature significantly limits transparency and confidence in the results when used in isolation. The findings further show that integrating explainable AI techniques substantially enhances the interpretability and trustworthiness of these assessments, enabling a more transparent and reliable understanding of the factors driving public university competitiveness.

5.2. Limitations and Directions for Future Research

Several limitations of this study should be acknowledged. Most notably, the evaluation protocol does not fully conform to the most rigorous standards for model selection and testing, as the test set was partially involved in the epoch selection process. While this choice was driven by practical constraints, it may have led to optimistic performance estimates.

Future work will address this limitation by adopting more robust validation strategies, including explicit separation of training, validation, and holdout test sets or nested cross-validation schemes. From a trustworthy AI perspective, particular emphasis will be placed on methodological transparency, robustness, and reproducibility, alongside predictive accuracy.

Further research will also extend the current analysis by incorporating more advanced explainability and uncertainty-estimation techniques, enabling a deeper assessment of model reliability and decision-making behavior. These directions are expected to strengthen the contribution of AI-based methods as trustworthy and explainable tools within the considered application domain.

Funding

Scientific work and publication financed from the state budget under the program of the Minister of Science and Higher Education in Poland, called ‘Science for Society II’ project no. NdS-II/SP/0409/2023/01; the amount of cofinancing and total value of the project PLN 280 500,00.

Institutional Review Board Statement

The data analyzed in this study are publicly available and contain no identifiable human information. Ethical review and approval were waived for this study according to the U.S. Common Rule (45 CFR 46).

Informed Consent Statement

Informed consent for participation is not required as per U.S. Common Rule (45 CFR 46).

Data Availability Statement

The author used data from [44].

Acknowledgments

While preparing this article, the author used Grammarly software for Windows, version 1.2.128.1582 to edit and proofread the manuscript. After using this program, the author reviewed and edited the content as required and takes full responsibility for the content of the publication.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

FCNN	Fully Connected Neural Network
LIME	Local Interpretable Model-agnostic Explanations
REF	Research Excellence Framework
GPA	Grade Point Average
UoAs	Units of Assessment
ERA	Excellence in Research for Australia
PBRF	Performance-Based Research Fund
RCN	Research Council of Norway
RAE	Research Assessment Exercise
GDPR	General Data Protection Regulation
BART	Bayesian Additive Regression Trees
ReLU	Rectified Linear Unit
MSE	Mean Squared Error
MCDA	Multi-Criteria Decision Analysis
XAI	Explainable AI

References

1. Wood, T.; Wilner, A. Research Impact Assessment: Developing and Applying a Viable Model for the Social Sciences. Res. Eval. 2024, 35, rvae022.
2. Kosmützky, A.; Meier, F. Competing: An Analytical Framework and Application in Higher Education. Stud. High. Educ. 2025, 51, 18–38.
3. Omodei, E.; De Domenico, M.; Arenas, A. Evaluating the Impact of Interdisciplinary Research: A Multilayer Network Approach. Netw. Sci. 2017, 5, 235–246.
4. Pinar, M.; Unlu, E. Evaluating the Potential Effect of the Increased Importance of the Impact Component in the Research Excellence Framework of the UK. Br. Educ. Res. J. 2020, 46, 140–160.
5. Traag, V.A.; Waltman, L. Systematic Analysis of Agreement between Metrics and Peer Review in the UK REF. Palgrave Commun. 2019, 5, 29.
6. Pride, D.; Knoth, P. Peer Review and Citation Data in Predicting University Rankings, a Large-Scale Analysis. In Digital Libraries for Open Knowledge; Méndez, E., Crestani, F., Ribeiro, C., David, G., Lopes, J.C., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 11057, pp. 195–207.
7. Bendickson, J. Building Entrepreneurship Research for Impact: Scope, Phenomenon, and Translation. J. Small Bus. Manag. 2021, 59, 535–543.
8. Wickert, C.; Post, C.; Doh, J.P.; Prescott, J.E.; Prencipe, A. Management Research That Makes a Difference: Broadening the Meaning of Impact. J. Manag. Stud. 2021, 58, 297–320.
9. Marginson, S.; Yang, L. Higher Education and Public Good in England. High Educ. 2025, 89, 183–203.
10. Zacharewicz, T.; Lepori, B.; Reale, E.; Jonkers, K. Performance-Based Research Funding in EU Member States—A Comparative Assessment. Sci. Public Policy 2019, 46, 105–115.
11. Menter, M.; Lehmann, E.E.; Klarl, T. In Search of Excellence: A Case Study of the First Excellence Initiative of Germany. J. Bus. Econ. 2018, 88, 1105–1132.
12. Pinar, M.; Horne, T.J. Assessing Research Excellence: Evaluating the Research Excellence Framework. Res. Eval. 2022, 31, 173–187.
13. Dotti, N.F.; Walczyk, J. What Is the Societal Impact of University Research? A Policy-Oriented Review to Map Approaches, Identify Monitoring Methods and Success Factors. Eval. Program Plan. 2022, 95, 102157.
14. Liu, P.; Lai, Y.; Liu, D. Artificial Intelligence Research in Organizations: A Bibliometric Approach. Cogent Bus. Manag. 2024, 11, 2408439.
15. Forero-Corba, W.; Negre Bennasar, F. Técnicas y Aplicaciones Del Machine Learning e Inteligencia Artificial En Educación: Una Revisión Sistemática. RIED-Rev. Iberoam. Educ. Distancia 2023, 27, 209–253.
16. Cheng, Y.; Weng, S.; Cui, Z. An Exploration of the Path for Artificial Intelligence to Assist in the Competitive Performance Evaluation in University. In Advances in Transdisciplinary Engineering; Hu, Z., Zhang, Q., He, M., Wang, C., Yanovsky, F., Eds.; IOS Press: Amsterdam, The Netherlands, 2026.
17. Blackburn, R.; Dibb, S.; Tonks, I. Business and Management Studies in the United Kingdom’s 2021 Research Excellence Framework: Implications for Research Quality Assessment. Br. J. Manag. 2024, 35, 434–448.
18. Li, D.; Lo, W.Y.W.; Yang, R. Unpacking the Discourse Surrounding the Impact Agenda in the Hong Kong Research Assessment Exercise 2020. Res. Eval. 2024, 33, rvae034.
19. Gunn, A.; Mintrom, M. Higher Education Policy Change in Europe: Academic Research Funding and the Impact Agenda. Eur. Educ. 2016, 48, 241–257.
20. Grzeszczyk, T.A. Developing Methods for Assessing the Social Impact of Scientific Study. In European Conference on Research Methodology for Business and Management Studies; Academic Conferences International Limited Curtis Farm: Reading, UK, 2024; Volume 23, pp. 121–127.
21. Reale, E.; Zinilli, A. Evaluation for the Allocation of University Research Project Funding: Can Rules Improve the Peer Review? Res. Eval. 2017, 26, 190–198.
22. Radovanović, M.; Jovčić, S.; Petrovski, A.; Cirkin, E. Evaluation of University Professors Using the Spherical Fuzzy AHP and Grey MARCOS Multi-Criteria Decision-Making Model: A Case Study. Spectr. Decis. Mak. Appl. 2025, 2, 197–217.
23. Penfield, T.; Baker, M.J.; Scoble, R.; Wykes, M.C. Assessment, Evaluations, and Definitions of Research Impact: A Review. Res. Eval. 2014, 23, 21–32.
24. Crano, W.D.; Brewer, M.B.; Lac, A. Principles and Methods of Social Research, 3rd ed.; Routledge, Taylor & Francis Group: New York, NY, USA; London, UK, 2015.
25. Krüger, A.K.; Petersohn, S. From Research Evaluation to Research Analytics. The Digitization of Academic Performance Measurement. Valuat. Stud. 2022, 9, 11–46.
26. Wilsdon, J. The Metric Tide: Independent Review of the Role of Metrics in Research Assessment and Management; Online ed.; Sage Publications: Los Angeles, CA, USA, 2016.
27. De Silva, P.U.K.; Vance, C.K. Assessing the Societal Impact of Scientific Research. In Scientific Scholarly Communication; Fascinating Life Sciences; Springer International Publishing: Cham, Switzerland, 2017; pp. 117–132.
28. Fenby-Hulse, K.; Heywood, E.; Walker, K. (Eds.) Research Impact and the Early-Career Researcher; Routledge: Abingdon, UK; New York, NY, USA, 2019.
29. Vinkler, P. The Evaluation of Research by Scientometric Indicators; Chandos Pub: Oxford, UK, 2010.
30. Ter Bogt, H.J.; Scapens, R.W. Performance Management in Universities: Effects of the Transition to More Quantitative Measurement Systems. Eur. Account. Rev. 2012, 21, 451–497.
31. Daraio, C.; Bonaccorsi, A.; Simar, L. Rankings and University Performance: A Conditional Multidimensional Approach. Eur. J. Oper. Res. 2015, 244, 918–930.
32. Abulibdeh, A.; Baya Chatti, C.; Alkhereibi, A.; El Menshawy, S. A Scoping Review of the Strategic Integration of Artificial Intelligence in Higher Education: Transforming University Excellence Themes and Strategic Planning in the Digital Era. Eur. J. Educ. 2025, 60, e12908.
33. Alshkeili, H.M.H.A.; Almheiri, S.J.; Khan, M.A. Privacy-Preserving Interpretability: An Explainable Federated Learning Model for Predictive Maintenance in Sustainable Manufacturing and Industry 4.0. AI 2025, 6, 117.
34. Lee, J.; Rew, J. Vision-Language Model-Based Local Interpretable Model-Agnostic Explanations Analysis for Explainable In-Vehicle Controller Area Network Intrusion Detection. Sensors 2025, 25, 3020.
35. Grzeszczyk, T.A.; Grzeszczyk, M.K. Justifying Short-Term Load Forecasts Obtained with the Use of Neural Models. Energies 2022, 15, 1852.
36. Saarela, M.; Podgorelec, V. Recent Applications of Explainable AI (XAI): A Systematic Literature Review. Appl. Sci. 2024, 14, 8884.
37. Grzeszczyk, T.A. Neural Classification of Research Scientific Excellence in Universities. Procedia Comput. Sci. 2025, 270, 5138–5146.
38. Raji, A.; Hassan, A. Sustainability and Stakeholder Awareness: A Case Study of a Scottish University. Sustainability 2021, 13, 4186.
39. Python Software Foundation 2025. Available online: http://python.org (accessed on 22 September 2025).
40. Ribeiro, M.T. Lime: Explaining the Predictions of Any Machine Learning Classifier. 2025. Available online: https://github.com/marcotcr/lime (accessed on 25 September 2025).
41. Ribeiro, M.T.; Singh, S.; Guestrin, C. Anchors: High-Precision Model-Agnostic Explanations. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
42. Shams Amiri, S.; Mottahedi, S.; Lee, E.R.; Hoque, S. Peeking inside the Black-Box: Explainable Machine Learning Applied to Household Transportation Energy Consumption. Comput. Environ. Urban Syst. 2021, 88, 101647.
43. Balbuena, L.D. The UK Research Excellence Framework and the Matthew Effect: Insights from Machine Learning. PLoS ONE 2018, 13, e0207919.
44. Balbuena, L.D. UK REF 2014 Analysis Data and R Script V.2. 2025. Available online: https://www.protocols.io/view/uk-ref-2014-analysis-data-and-r-script-j8nlk55y6l5r/v2 (accessed on 5 September 2025).
45. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; ACM: New York, NY, USA, 2016; pp. 1135–1144.
46. Xia, H.; Ding, Y.; Chen, W.; Zhang, J.; Li, H.; Zhao, H.; Song, R. Measurement Uncertainty Evaluation with Small Samples: A Review and Prospect. Measurement 2026, 258, 119031.
47. Papagni, G.; De Pagter, J.; Zafari, S.; Filzmoser, M.; Koeszegi, S.T. Artificial Agents’ Explainability to Support Trust: Considerations on Timing and Context. AI Soc. 2023, 38, 947–960.
48. Toderas, M. Artificial Intelligence for Sustainability: A Systematic Review and Critical Analysis of AI Applications, Challenges, and Future Directions. Sustainability 2025, 17, 8049.
49. Smart, A.; Kasirzadeh, A. Beyond Model Interpretability: Socio-Structural Explanations in Machine Learning. AI Soc. 2025, 40, 2045–2053.
50. Wulff, K.; Finnestrand, H. Creating Meaningful Work in the Age of AI: Explainable AI, Explainability, and Why It Matters to Organizational Designers. AI Soc. 2024, 39, 1843–1856.
51. Varma, S.; Simon, R. Bias in Error Estimation When Using Cross-Validation for Model Selection. BMC Bioinform. 2006, 7, 91.
52. Raschka, S.; Liu, Y.; Mirjalili, V. Machine Learning with PyTorch and Scikit-Learn: Develop Machine Learning and Deep Learning Models with Python; Packt Publishing: Birmingham, UK, 2022.
53. Heaton, J. Ian Goodfellow, Yoshua Bengio, and Aaron Courville: Deep Learning. Genet. Program. Evolvable Mach. 2018, 19, 305–307.
54. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer Series in Statistics; Springer: New York, NY, USA, 2009.
55. Roberts, M.; Driggs, D.; Thorpe, M.; Gilbey, J.; Yeung, M.; Ursprung, S.; Aviles-Rivero, A.I.; Etmann, C.; McCague, C.; Beer, L.; et al. Common Pitfalls and Recommendations for Using Machine Learning to Detect and Prognosticate for COVID-19 Using Chest Radiographs and CT Scans. Nat. Mach. Intell. 2021, 3, 199–217.
56. Kapoor, S.; Narayanan, A. Leakage and the Reproducibility Crisis in Machine-Learning-Based Science. Patterns 2023, 4, 100804.
57. Floridi, L.; Cowls, J.; Beltrametti, M.; Chatila, R.; Chazerand, P.; Dignum, V.; Luetge, C.; Madelin, R.; Pagallo, U.; Rossi, F.; et al. AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Minds Mach. 2018, 28, 689–707.
58. Ethics Guidelines for Trustworthy AI|Shaping Europe’s Digital Future. Available online: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai (accessed on 30 January 2026).
59. Doshi-Velez, F.; Kim, B. Towards A Rigorous Science of Interpretable Machine Learning. arXiv 2017, arXiv:1702.08608.
60. Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Pedreschi, D.; Giannotti, F. A Survey Of Methods For Explaining Black Box Models. ACM Comput. Surv. 2018, 51, 1–42.
61. Wang, Y.; Fang, X.; Xu, Z.; Li, J.; Wang, L. Exploring the Impact of Explainability in Large Language Model (LLM) Applications on User Experience. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 26 April 2025; ACM: New York, NY, USA, 2025; pp. 1–8.
62. Spannaus, A.; Hanson, H.A.; Tourassi, G.; Penberthy, L. Topological Interpretability for Deep Learning. In Proceedings of the Platform for Advanced Scientific Computing Conference, Zurich, Switzerland, 3 June 2024; ACM: New York, NY, USA, 2024; pp. 1–11.
63. Hwang, H.; Bell, A.; Fonseca, J.; Pliatsika, V.; Stoyanovich, J.; Whang, S.E. SHAP-Based Explanations Are Sensitive to Feature Representation. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency, Athens, Greece, 23 June 2025; ACM: New York, NY, USA, 2025; pp. 1588–1601.
64. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 4768–4777.
65. Adebayo, J.; Gilmer, J.; Muelly, M.; Goodfellow, I.; Hardt, M.; Kim, B. Sanity Checks for Saliency Maps. In Proceedings of the 32nd International Conference on Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2018; pp. 9525–9536.
66. Begoli, E.; Bhattacharya, T.; Kusnezov, D. The Need for Uncertainty Quantification in Machine-Assisted Medical Decision Making. Nat. Mach. Intell. 2019, 1, 20–23.

Figure 1. Two-stage architecture for GPA prediction and interpretation (FCNN and LIME) (source: own study).

Figure 2. GPA prediction using the FCNN model with LIME-based explanations (source: own study).

Figure 3. Predicted and actual GPA values for the test set (source: own study).

Figure 4. Exemplary explanations of the GPA prediction by the FCNN model on two universities from the testing set (source: own study).

Table 1. Comparison of selected evaluation and prediction methods.

Method	Expected Predictive Performance	Interpretability/Explainability	Data Efficiency	Robustness/Low Risk of Misleading Results	Scalability	Prognostic/Decision Utility
Expert-based MCDA	Low-Moderate	High	High	Low-Moderate	Low-Moderate	Moderate-High
FCNN, MLP	Moderate-High	Low	Low	Moderate	High	Moderate
Fuzzy Inference Systems	Moderate	High	Moderate-High	Moderate	Moderate	Moderate
Case-Based Reasoning	Moderate	High	Moderate	Moderate	Low-Moderate	Low-Moderate
FCNN with LIME	High	Moderate-High	Low	Moderate	Moderate	High, if validated and calibrated

Source: own study.

Table 2. Sample input data for the FCNN model for selected variables.

Variable	Description	Example Value
GPA	Target variable: overall REF GPA	3.22
entry_tariff	Average entry tariff indicating the academic qualification of students	520
websci_docs	Total number of Web of Science publications	75,168
pctStateSchools	Percentage of students from state schools	65.7
univ_income	Total institutional income	940,019
stud_staff_ratio	Student-to-staff ratio	10.10
pct_phd_faculty	Percentage of academic staff holding a PhD	55.24
cite_impact	Average citation impact	1.70

Source: based on [44].
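
To make the second stage of the two-stage architecture (Figure 1) more concrete, the following is a minimal sketch of a LIME-style local explanation for one row of Table 2. It is not the author's implementation and does not use the `lime` package: it perturbs the input, queries a black-box predictor, weights the samples with an exponential kernel, and fits a weighted linear surrogate. The predictor `toy_predict`, the perturbation scales in `SCALES`, and all function names are hypothetical stand-ins chosen for illustration; in practice the trained FCNN's prediction function would take the place of `toy_predict`.

```python
import math
import random

# Feature names follow Table 2 (GPA is the target and is excluded).
FEATURES = ["entry_tariff", "websci_docs", "pctStateSchools", "univ_income",
            "stud_staff_ratio", "pct_phd_faculty", "cite_impact"]
ROW = [520.0, 75168.0, 65.7, 940019.0, 10.10, 55.24, 1.70]  # example values, Table 2
SCALES = [50.0, 5000.0, 5.0, 50000.0, 1.0, 5.0, 0.2]        # assumed perturbation sizes


def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def explain_locally(predict, x, scales, n_samples=500, kernel_width=0.75, seed=0):
    """Fit a weighted linear surrogate around x (simplified, LIME-style).

    Returns (intercept, weights); weights[i] approximates the local change in
    the prediction per one scales[i]-sized step in feature i.
    """
    rng = random.Random(seed)
    d = len(x)
    rows, targets, sample_weights = [], [], []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]          # noise in scaled units
        sample = [xi + zi * si for xi, zi, si in zip(x, z, scales)]
        rows.append([1.0] + z)                               # leading 1.0 = intercept
        targets.append(predict(sample))
        dist_sq = sum(zi * zi for zi in z)
        # Samples closer to x get a larger weight in the surrogate fit.
        sample_weights.append(math.exp(-dist_sq / (kernel_width ** 2 * d)))
    k = d + 1
    # Weighted normal equations: (A^T W A) coef = A^T W y.
    ata = [[sum(w * r[i] * r[j] for w, r in zip(sample_weights, rows))
            for j in range(k)] for i in range(k)]
    aty = [sum(w * r[i] * t for w, r, t in zip(sample_weights, rows, targets))
           for i in range(k)]
    coef = solve(ata, aty)
    return coef[0], coef[1:]


def toy_predict(s):
    """Hypothetical black box standing in for the trained FCNN."""
    return (2.0 + 0.4 * math.tanh((s[0] - 400.0) / 100.0)
            + 0.3 * math.tanh(s[6] - 1.0) - 0.02 * (s[4] - 10.0))


intercept, local_weights = explain_locally(toy_predict, ROW, SCALES)
# Rank features by the magnitude of their local contribution.
ranking = sorted(zip(FEATURES, local_weights), key=lambda p: abs(p[1]), reverse=True)
```

Because the surrogate is fitted in scaled units, the weights of features with very different magnitudes (for example `univ_income` and `cite_impact`) are directly comparable. The result depends on the assumed scales and kernel width, so a ranking of this kind describes only the neighbourhood of the explained instance, not the model's global behaviour.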

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Grzeszczyk, T.A. Trustworthy Assessment of University Competitiveness Using a Neural Network Model. Information 2026, 17, 536. https://doi.org/10.3390/info17060536


