A Novel Stacked Generalization Ensemble-Based Hybrid SGM-BRR Model for ESG Score Prediction

Wang, Zhie; Wang, Xiaoyong; Liu, Xuexin; Zhang, Jun; Xu, Jingde; Ma, Jun

doi:10.3390/su16166979


by Zhie Wang ¹, Xiaoyong Wang ¹,*, Xuexin Liu ², Jun Zhang ¹, Jingde Xu ³ and Jun Ma ¹

¹ Management Engineering School, Capital University of Economics and Business, Beijing 100070, China

² College of Business Administration, Capital University of Economics and Business, Beijing 100070, China

³ Institute of Higher Education, North China Institute of Science and Technology, Langfang 065201, China

* Author to whom correspondence should be addressed.

Sustainability 2024, 16(16), 6979; https://doi.org/10.3390/su16166979

Submission received: 21 June 2024 / Revised: 19 July 2024 / Accepted: 23 July 2024 / Published: 14 August 2024


Abstract

Recently, financial institutions and investors have placed increasing emphasis on ESG (environmental, social, and governance) as a principal indicator for the evaluation of companies. However, current ESG scoring systems lack uniformity and are often subjective, so it is of great importance to be able to predict the ESG scores of corporations accurately. This paper proposes a Stacked Generalization Model that employs Random Forest (RF), Gradient Boosting Decision Tree (GBDT), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) as base learners, with Bayesian Ridge Regression (BRR) as the meta-model that integrates the predictions of these diverse models. The goal is to develop an ESG score prediction model for Chinese companies. The experimental data set encompasses Chinese A-share listed companies from 2012 to 2020. The Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and coefficient of determination (R²) are employed for model evaluation, and the model is compared with seven benchmark models. The results demonstrate that SGM-BRR reduces the RMSE by 18.4%, 17.3%, 13.7%, and 76.1%, the MAE by 15.4%, 18.4%, 15.8%, and 68.4%, and increases the R² by 2%, 1.4%, 2%, and 6% for ESG, E, S, and G scores, respectively. Furthermore, the model’s performance is validated across different industries, with SGM-BRR achieving the best RMSE, MAE, and R² in 27, 25, and 27 groups, respectively. Consequently, the model demonstrates broad applicability and stable performance in ESG score prediction.

Keywords:

ESG score; prediction; ensemble learning algorithms; Stacked Generalization Model; Bayesian Ridge Regression

1. Introduction

Over the past few decades, ESG standards have evolved from a marginal concern to a central issue in global finance and investment [1]. In 2006, the United Nations launched the Principles for Responsible Investment (PRI), which was followed by the adoption of the 2030 Agenda for Sustainable Development in 2015 [2]. This marks the entry of ESG investing into the mainstream, becoming an important tool for corporate long-term value and sustainability [3,4]. ESG scores not only help investors make more informed investment decisions, but companies can analyze their ESG performance to identify potential avenues for improvement, thereby enhancing their market competitiveness and public image. In China, the Securities Regulatory Commission issued the “Guiding Opinions on Establishing a Green Financial System” in 2016, which was the first proposal to establish a green financial system and promote green investment [5]. Furthermore, the Chinese government has underscored the significance of ESG principles in the 14th Five-Year Plan for Ecological Environment Protection. Among them, the low-carbon city pilot policy is a crucial instrument for advancing ESG practices in China. Research indicates that the pilot policy of low-carbon cities can markedly enhance enterprises’ green innovation capacity and resource utilization efficiency by establishing rigorous carbon emission standards, disseminating clean energy technologies, and supporting green buildings [6,7]. These policies not only facilitate the advancement of enterprises in environmental governance, but also reduce the costs of enterprises in R&D and the application of green technologies through fiscal and tax incentives and financial support, thereby stimulating the enthusiasm of enterprises to participate in the low-carbon transformation [8,9]. As Chinese companies continue to grow in the global marketplace, it is critical to accurately measure a company’s ESG score. 
Currently, there is a plethora of ESG standards in existence, including the GRI, SASB, and TCFD standards, among others. These standards exhibit notable discrepancies in their index selection, scoring methodologies, and data requirements. This presents a challenge to the development and application of the ESG model. Secondly, it can be argued that China’s ESG scoring system is not without flaws, and that the scoring process is open to subjective interpretation. The majority of companies fail to provide comprehensive data disclosure, which presents a significant obstacle to data acquisition [10,11]. Furthermore, the rating process of rating agencies is markedly disparate [12]. If the ESG information of enterprises can be evaluated accurately through the objective data disclosed by the enterprises themselves, then it will assist financial institutions and investors in better assessing the risks associated with these enterprises and making more effective investment decisions.

The accelerated evolution and extensive deployment of artificial intelligence (AI) technologies have positioned them as a pivotal instrument in the domain of ESG investment. Researchers have developed numerous models for extracting ESG information from unstructured text materials, including natural language processing, topic analysis, and bag-of-words models [13,14,15]. However, the quality of ESG rating predictions extracted from text data remains uncertain. Moreover, there is a paucity of literature examining ESG disclosure by Chinese enterprises. Consequently, an investigation into structured data pertinent to existing environmental, social, and corporate governance ratings offers novel insights into the assessment of rating accuracy [16].

In terms of indicator analysis, previous studies have demonstrated that a higher ESG level has a positive impact on enterprise performance [17]. For example, the return on assets (ROA) and return on equity (ROE) in financial performance have been demonstrated to exhibit a positive correlation with ESG [18]. Conversely, some indicators, such as low debt cost, have been shown to display an inverse relationship with ESG [19]. In addition to financial indicators, the size of the company is also a factor that affects the ESG score. Larger companies are better positioned to provide ESG data, which results in higher scores [20]. Moreover, the alignment between a company’s diversified board and executive compensation with its ESG performance can markedly enhance its ESG score [21,22].

In a study employing the rough set method, Garcia et al. [23] sought to predict ESG ratings through the analysis of financial indicators, including earnings per share and return on assets. However, the efficacy of this method is contingent upon the quality of the data sets employed. If the data are of an inferior quality or the sample size is insufficient, then the accuracy of the predictions will be adversely affected. In addition, more and more machine learning models are used for ESG prediction. D’Amato et al. [24] compared the performance of six different machine learning models using balance sheet data and discussed the effectiveness of the Random Forest algorithm in predicting ESG scores. Raza et al. [25] discovered that the artificial neural network exhibited superior performance when compared to other machine learning algorithms on balance sheet and income statement data. Del Vitto [26] also demonstrated the advantages of artificial neural networks in predicting ESG and compared network models of different structures. These studies relied primarily on financial data, thereby overlooking the potential influence of non-financial factors. Consequently, their explanatory power is constrained. Furthermore, a single model is inadequate for effectively coping with the random fluctuations and noise inherent in the data, and its adaptability to enterprises in different industries is insufficient. In a related study, Krappel et al. [27] proposed a heterogeneous ensemble model, which combined two ensemble learning algorithms to predict ESG ratings. Although the model considers the characteristics of the company’s financial situation, industry classification, geographical location, and stock market performance, its explanatory power only covers 54% of the information in the ESG rating, indicating that the model is unable to fully capture the comprehensiveness of ESG rating.

While existing research has made notable strides in predicting ESG ratings through financial data, several challenges remain. One such challenge is the limited generalizability of the models, and another is the insufficient consideration of non-financial factors. This paper proposes a new stacked generalization model (SGM-BRR), which combines four ensemble models as base learners and employs Bayesian Ridge Regression as the meta-model. To validate the model, we utilize data from Chinese A-share listed companies from 2012 to 2020, select 21 features, and divide the data into a training set and a test set. The random search algorithm and ten-fold cross-validation are employed to optimize the model parameters. The models are evaluated using RMSE, MAE, and R², and their performance is compared with that of seven benchmark models. Moreover, to assess the model’s capacity for generalization, this paper further analyzes the model’s predictions by industry. The results demonstrate that SGM-BRR exhibits superior performance compared to the benchmark models in terms of RMSE, MAE, and R². Furthermore, the model demonstrates robust predictive capabilities across a range of industries, particularly in the energy and industrial sectors. The overarching framework of this methodology is illustrated in Figure 1. In comparison to existing literature, this study not only considers the combination and optimization of the model, but also pays particular attention to the adaptability and universality of the model in the context of different data distributions and complexities.

The main contributions of this work can be summarized as follows:

(1): It proposes a new ESG score prediction framework consisting of the Stacked Generalization Model with Bayesian Ridge Regression (SGM-BRR) that combines RF, GBDT, XGBoost, and LightGBM. It overcomes the inherent shortcomings of a single model and can effectively improve the accuracy and robustness of prediction.
(2): It uses Bayesian Ridge Regression as a meta-model to adjust the weights of the base models according to their performance on the actual data set, thereby combining the prediction results of the different models, and uses a random search algorithm to find the optimal parameters of the model.
(3): The proposed SGM-BRR is evaluated using real data sets. A comparative study is conducted on seven benchmark models, and the test results show that this model achieves the optimal performance in the prediction of the total score and the other three dimensions.
(4): Previous research has mainly focused on predicting the overall score of an enterprise. An in-depth comparison exploring the score prediction of SGM-BRR in different industries is conducted, further proving that the model has strong generalization ability.

The following is the proposed structure for the study. Section 2 outlines the methodology employed for data collection and organization. Section 3 presents the research methodology. In Section 4, the experiments and analysis are presented, along with the predicted results. The conclusions of this paper and the outlines of future research plans are presented in Section 5.

2. Research Data Analysis

This section primarily presents the methodology employed for the aggregation of data. The initial step is the selection and collection of the indicators. Subsequently, the data are cleaned and screened, and finally, a visual display is created.

This empirical study is primarily focused on the Chinese A-share market, the second largest equity market globally. Its diverse industrial and corporate representation, encompassing significant segments of the Chinese economy, offers a comprehensive and representative insight into the country’s economic landscape. To address the lack of uniformity of ESG standards, the text employs enterprise financial statements and publicly available non-financial index data as independent variables to predict ESG scores of enterprises. Ultimately, 21 features are selected, as illustrated in Table 1. These indicators encompass the financial status, operational scale, governance structure, and environmental performance of the enterprise, thereby reflecting the enterprise’s comprehensive performance from a multitude of perspectives. Financial indicators, including financial expenses, debt-to-asset ratio, and earnings per share (EPS), can comprehensively illustrate the financial standing of enterprises, elucidate market valuation, and reflect the efficacy of resource allocation within the enterprise. The provision of stable data input for the model serves to reduce the dependence on specific ESG standards. Secondly, the indicators of governance structure, such as employees, directors, the percentage of female directors, and the number of independent directors, demonstrate the governance and management levels of enterprises, which is beneficial for enhancing the universality and adaptability of the model. Environmental indicators, such as carbon emissions, per capita carbon emissions, and carbon emission intensity, are becoming increasingly significant in ESG standards. The integration of these data sets enhances the precision of the model in forecasting environmental performance. In terms of historical ESG performance, a lagging ESG score can serve as a point of reference for the model. This index facilitates the correction and optimization of the model’s prediction results. 
Consequently, the incorporation of comprehensive, multi-dimensional information enhances the precision and comprehensiveness of ESG score prediction. This approach ensures consistency across different ESG standards, enhancing the applicability and reliability of the method.

In the process of data collection and collation, the independent variable data of this study are primarily sourced from the China Economic Net, the CSMAR database, and the China Statistical Yearbook. The enterprise ESG score data are obtained from Bloomberg. Samples with missing values are deleted directly to guarantee the integrity and stability of the data set. Furthermore, the consistency of the data is verified, and any duplicate or unreasonable values are removed. Ultimately, the study yields 5049 samples from 968 A-share listed companies, encompassing 106,029 observed values (5049 samples × 21 features). To ensure that features contribute information fairly during model training, particularly when their units and value ranges differ significantly, Z-score standardization is used. This method rescales each feature to zero mean and unit variance, thereby facilitating the convergence of the learning algorithm and improving model accuracy [28]. Because it is a linear rescaling, it does not change the shape of each feature’s distribution, so outliers in the data are preserved.
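The Z-score standardization mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors’ preprocessing code; the function name `zscore_columns` and the toy data are assumptions introduced here.

```python
from statistics import mean, pstdev

def zscore_columns(rows):
    """Standardize each column of a list-of-rows table: (x - mean) / std."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population standard deviation
    return [
        # A constant column (std == 0) carries no information; map it to 0.0.
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical toy data: two features on very different scales.
toy = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
scaled = zscore_columns(toy)
```

After scaling, both columns have the same spread, so neither feature dominates merely because of its units.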

Figure 2 illustrates the sample distribution, where the horizontal coordinate is the number of samples, and the vertical coordinate indicates the score. From the figure, the ESG distribution is relatively uniform, mainly concentrated between 20 and 40. The overall E score is low, and the S score is higher compared to the E score, mainly distributed around 10 points. The G score distribution is more dispersed, with a large gap between the scores. Table 2 shows the statistical indicators of each dimension score. Subsequently, statistical analysis is performed on the collected raw data, with the calculation of key statistical indicators, including the mean, median, and standard deviation of the data set, as shown in Table 3.

The environmental, social, and governance (ESG) scores of enterprises in disparate industries in China are subject to varying scoring criteria. To further verify the extensive adaptability of the model, this paper classifies enterprises according to the Global Industry Classification Standard (GICS). Industries with an insufficient number of samples were excluded from the classification process. Ultimately, nine industry classifications were retained: energy, materials, industrial, consumer discretionary, consumer staples, health care, communication services, utilities, and real estate. The distribution of industries is illustrated in Figure 3. The industrial and materials industries account for a relatively high proportion, representing approximately 24.9% and 21% of the total sample, respectively. In contrast, the real estate and energy industries account for a relatively low proportion, representing 4.4% and 4.1% of the total sample, respectively.

3. Methodology

This section presents the fundamental theory and framework of SGM-BRR, followed by the optimization process of SGM-BRR and the evaluation metrics used to assess the model.

3.1. Basic Learning

3.1.1. Random Forest (RF)

The RF is a popular ensemble learning approach that is widely acclaimed for its good predictive power, especially in complex tasks such as predicting a firm’s ESG score. The principle behind RF lies in combining the predictions of multiple decision tree models to produce predictions that are more accurate and robust than those of any single decision tree [29,30]. A key calculation formula in RF is the Gini impurity, represented by Equation (1), which is used to quantify the purity of nodes in the decision tree:

$$\mathrm{Gini} = 1 - \sum_{i=1}^{n} p_{i}^{2}, \tag{1}$$

where $p_{i}$ is the proportion of samples that belong to class $i$ in a specific node. The Gini impurity helps determine the optimal split for each node of the decision tree, ensuring the model can accurately classify and predict the data [31]. Another important formula is to calculate feature importance, as expressed in Equation (2), which helps to understand which features contribute the most to predictions.

$$\mathrm{importance} = \frac{\Delta \mathrm{Accuracy}}{\Delta \mathrm{Nodes}}, \tag{2}$$

where $\Delta \mathrm{Accuracy}$ is the change in model accuracy with and without the feature, and $\Delta \mathrm{Nodes}$ represents the change in the number of nodes.
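To make Equation (1) concrete, the following minimal sketch computes the Gini impurity of a node and the size-weighted impurity of a candidate split. The function names and the class labels are illustrative assumptions. Note also that for a continuous target such as an ESG score, tree implementations commonly use a variance (squared-error) criterion instead, so the labels here serve only to illustrate the formula.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini = 1 - sum_i p_i^2, where p_i is the share of class i in the node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_split_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)

# Hypothetical node with two equally frequent classes.
node = ["high", "high", "low", "low"]
parent_gini = gini_impurity(node)                           # mixed node
split_gini = weighted_split_gini(node[:2], node[2:])        # perfectly separating split
```

A split is preferred when it lowers the weighted impurity relative to the parent node, as the perfectly separating split does here.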

3.1.2. Gradient Boosting Decision Tree (GBDT)

GBDT operates by building predictive models in the form of an ensemble of weak predictive models. The core principle behind GBDT is to iteratively add decision trees to the model, where each successive tree aims to correct the errors made by the previous decision tree. This process can be formalized as follows: given a data set $D = \{(x_{i}, y_{i})\}_{i=1}^{N}$, where $x_{i}$ represents the features and $y_{i}$ the target for each sample $i$, the model starts with a constant predictor $F_{0}(x)$. At each step $m$, a new tree $h_{m}(x)$ is fitted to the negative gradient of the loss function $L(y, F)$ with respect to the model predictions $F_{m-1}(x)$. The model update equation is $F_{m}(x) = F_{m-1}(x) + v \cdot h_{m}(x)$, where $v$ is the learning rate, controlling the contribution of each tree. This process continues for $M$ iterations or until a stopping criterion is met [32].

GBDT can capture non-linear relationships between features, reducing bias and variance in predictions. It avoids overfitting by using techniques such as learning rate and feature subsampling [33].
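The update rule $F_{m}(x) = F_{m-1}(x) + v \cdot h_{m}(x)$ can be illustrated with a small from-scratch sketch. It assumes a squared-error loss (so the negative gradient is simply the residual), a single input feature, and one-split stumps as the weak learners; the function names and toy data are hypothetical and this is not the implementation used in the paper.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Return a predictor built as F_0 plus a sum of shrunken stumps."""
    f0 = sum(ys) / len(ys)                  # constant predictor F_0
    stumps = []
    preds = [f0] * len(xs)
    for _ in range(n_rounds):
        # For L = (y - F)^2 / 2 the negative gradient is the residual y - F.
        residuals = [y - p for y, p in zip(ys, preds)]
        h = fit_stump(xs, residuals)
        stumps.append(h)
        # F_m(x) = F_{m-1}(x) + v * h_m(x)
        preds = [p + learning_rate * h(x) for p, x in zip(preds, xs)]

    def predict(x):
        return f0 + learning_rate * sum(h(x) for h in stumps)

    return predict

# Hypothetical toy data with a step between x = 3 and x = 4.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9]
model = gradient_boost(xs, ys)
```

A smaller learning rate makes each tree contribute less, which typically requires more rounds but reduces the risk of overfitting.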

3.1.3. eXtreme Gradient Boosting (XGBoost)

XGBoost, proposed by Chen and Guestrin [34], performs well on sparse data and on various types of prediction problems, and is scalable. XGBoost builds an ensemble of weak learners so that each subsequent model attempts to correct the mistakes made by the previous model [35]. The objective function of XGBoost is expressed as Equation (3):

$$\mathrm{obj}(\Theta) = \sum_{i=1}^{n} l(y_{i}, \hat{y}_{i}) + \sum_{k=1}^{K} \Omega(f_{k}), \tag{3}$$

where $l$ is a differentiable convex loss function that measures the difference between the predicted value $\hat{y}_{i}$ and the actual label $y_{i}$ for each instance $i$, $f_{k}$ is the function corresponding to the $k$-th tree in the model, $\Omega$ represents the regularization term, and $\Theta$ denotes the parameters of the model. Regularization terms usually include parameters that control the depth of the tree and the minimum weight required for further partitioning on the leaf nodes of the tree. They penalize the complexity of the model to avoid overfitting [36].
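For reference, the regularization term $\Omega$ in Equation (3) is commonly written in the following form, which is the one given in the original XGBoost formulation by Chen and Guestrin [34]. The symbols $T$ (number of leaves), $w_{j}$ (weight of leaf $j$), and $\gamma$ are not defined elsewhere in this paper and are introduced here only for illustration; the $\lambda$ below is the leaf-weight penalty of XGBoost, not the ridge parameter used later.

```latex
\Omega(f_{k}) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_{j}^{2}
```

The first term penalizes trees with many leaves, and the second shrinks the leaf weights toward zero.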

3.1.4. Light Gradient Boosting Machine (LightGBM)

LightGBM is a highly efficient and scalable gradient boosting framework introduced to address various predictive modeling tasks. LightGBM employs a novel tree-growing algorithm that selects the leaf with maximum delta loss to grow, leading to faster training speed and higher efficiency. Unlike traditional gradient boosting methods that grow trees level-wise, LightGBM grows trees leaf-wise, which significantly reduces computational costs and increases the model’s speed [37]. Mathematically, the process of selecting the optimal leaf for growth can be represented as Equation (4):

$$\Delta_{\mathrm{loss}} = \sum_{i \in I_{\mathrm{leaf}}} g_{i} \cdot x_{i,\mathrm{feature}} + \frac{1}{2} \sum_{i \in I_{\mathrm{leaf}}} h_{i} \cdot x_{i,\mathrm{feature}}^{2} + \lambda, \tag{4}$$

where $I_{\mathrm{leaf}}$ is the set of instances in the leaf, $g_{i}$ and $h_{i}$ are the first- and second-order gradients of the loss function with respect to the prediction for instance $i$, $x_{i,\mathrm{feature}}$ is the value of the feature used for splitting, and $\lambda$ is the regularization term. The efficiency of LightGBM also comes from its ability to handle large data sets with a reduced memory footprint, thanks to its gradient-based one-sided sampling and exclusive feature bundling. These techniques minimize data scanning and memory usage while maintaining model accuracy [38].

3.2. The Meta-Learner of the Bayesian Ridge Regression (BRR)

BRR is a method that employs Bayesian theorem in conjunction with ridge regression to estimate regression parameters, aiming to enhance the stability and accuracy of regression estimates. Unlike traditional ridge regression, which estimates parameters by minimizing a linear combination of the sum of squared residuals and a regularization term, Bayesian Ridge Regression considers the parameters as random variables and utilizes prior distributions to represent uncertainty in parameters. Specifically, the posterior distribution of the model parameters is determined by both the observed data and the prior distribution, expressed as $p(\theta \mid X, y) \propto p(y \mid X, \theta)\, p(\theta)$, where $p(y \mid X, \theta)$ represents the likelihood function given the parameters and input data, and $p(\theta)$ is the prior distribution of the parameters. By maximizing the posterior distribution or computing its expected value, the Bayesian estimate of the parameters can be obtained. This approach offers a systematic way to consider uncertainty in parameter estimation, making the model more robust [39].

3.3. Stacked Framework of SGM-BRR

The stacking algorithm is an integration technique. By combining and leveraging the strengths of each base model, more accurate and robust predictions are produced [40]. The structure of SGM-BRR consists of two levels: a base model that makes initial predictions, and a meta-model that learns to combine these predictions into a final output. The specific implementation framework is shown in Figure 4.

The data set is subdivided into training and test sets. The training set $\{(X_{i}, y_{i})\}_{i=1}^{N}$ serves as the basis for model training, where $N$ represents the number of samples in the training set. In contrast, the test set $\{(X_{i}, y_{i})\}_{i=N+1}^{N+M}$ serves as the basis for evaluating the final model’s performance, where $M$ represents the number of samples in the test set. The model performance is evaluated using 10-fold cross-validation: the training set is divided into 10 subsets and, in the $k$-th round, the $k$-th fold serves as the validation set while the remaining 9 folds serve as the training set.

The $k$-th fold is denoted by $V_{k}$, while the combination of the other nine folds is denoted by $T_{k}$. The base learners in the first layer are RF, XGBoost, GBDT, and LightGBM. In K-fold cross-validation, each base learner is trained on $T_{k}$ and then used to predict on $V_{k}$. The prediction result is denoted by $P_{k}^{(j)}$, where $j$ represents the $j$-th base learner. The prediction results of the $j$-th base learner on the $k$-th fold are as follows:

$$P_{k}^{(j)} = f_{j}(T_{k}, V_{k}), \tag{5}$$

In each iteration of cross-validation, the predictions of the base learners are used to generate a new feature matrix, which serves as the input for the meta-learner. With $K$ folds and $J$ base learners, the prediction of the $j$-th base learner on the $k$-th fold is $P_{k}^{(j)}$, and the feature matrix $Z$ is assembled as follows:

$$Z = \begin{bmatrix} P_{1}^{(1)} & P_{1}^{(2)} & \cdots & P_{1}^{(J)} \\ P_{2}^{(1)} & P_{2}^{(2)} & \cdots & P_{2}^{(J)} \\ \vdots & \vdots & \ddots & \vdots \\ P_{K}^{(1)} & P_{K}^{(2)} & \cdots & P_{K}^{(J)} \end{bmatrix}, \tag{6}$$

The matrix is two-dimensional: each row holds the prediction results for one validation fold, and each column holds the prediction results of one base learner.
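The construction of the feature matrix $Z$ from out-of-fold predictions can be sketched as follows. To keep the example self-contained, two trivial stand-in learners (a mean predictor and a one-feature least-squares line) replace RF, GBDT, XGBoost, and LightGBM, and 5 folds are used on toy data rather than the 10 folds described above; all names are hypothetical.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, valid_idx); fold k holds out samples with i % n_folds == k."""
    for k in range(n_folds):
        valid = [i for i in range(n_samples) if i % n_folds == k]
        train = [i for i in range(n_samples) if i % n_folds != k]
        yield train, valid

def fit_mean(xs, ys):
    """Stand-in learner 1: always predicts the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Stand-in learner 2: one-feature least-squares line."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def out_of_fold_matrix(xs, ys, learners, n_folds):
    """Build Z: one row per training sample, one column per base learner."""
    z = [[None] * len(learners) for _ in xs]
    for train, valid in kfold_indices(len(xs), n_folds):
        x_tr = [xs[i] for i in train]
        y_tr = [ys[i] for i in train]
        for j, fit in enumerate(learners):
            predictor = fit(x_tr, y_tr)        # train learner j on T_k
            for i in valid:
                z[i][j] = predictor(xs[i])     # predict on the held-out fold V_k
    return z

# Hypothetical toy data: y = 2x + 1.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
Z = out_of_fold_matrix(xs, ys, [fit_mean, fit_line], n_folds=5)
```

Because every entry of `Z` is predicted by a model that never saw that sample during training, the meta-learner is fitted on predictions that resemble what the base learners produce on unseen data.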

Bayesian Ridge Regression is a linear regression method that employs L2 regularization. The objective of this method is to reduce the complexity of the model and prevent overfitting by introducing a regularization term. The objective function of Bayesian Ridge Regression is defined as:

$$\min_{w} \left( \sum_{i=1}^{N} (y_{i} - w^{T} z_{i})^{2} + \lambda \|w\|^{2} \right), \tag{7}$$

where $w$ is the regression coefficient vector, $\|w\|^{2} = \sum_{j=1}^{d} w_{j}^{2}$ is its squared L2 (Euclidean) norm, $d$ is the dimension of $w$, $z_{i}$ is the feature vector of the $i$-th sample, $y_{i}$ is the true value of the $i$-th sample, and $\lambda$ is a regularization parameter. Consequently, the solution to Bayesian Ridge Regression can be derived from the following formula:

$$w = (Z^{T} Z + \lambda I)^{-1} Z^{T} y, \tag{8}$$

where $Z$ is the feature matrix, $y$ is the target value vector, and $I$ represents the identity matrix.
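The closed-form solution in Equation (8) can be checked numerically with a short sketch. It solves the linear system $(Z^{T} Z + \lambda I) w = Z^{T} y$ by Gauss-Jordan elimination using only the standard library; the function names and the toy matrix are assumptions for illustration, and a production implementation would rely on a numerical linear algebra library instead.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ridge_weights(z, y, lam):
    """w = (Z^T Z + lam * I)^{-1} Z^T y, computed by solving the normal equations."""
    d = len(z[0])
    ztz = [[sum(row[a] * row[b] for row in z) + (lam if a == b else 0.0)
            for b in range(d)] for a in range(d)]
    zty = [sum(row[a] * yi for row, yi in zip(z, y)) for a in range(d)]
    return solve_linear(ztz, zty)

# Hypothetical toy problem whose exact least-squares solution is w = [1, 2].
Z_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_toy = [1.0, 2.0, 3.0]
w_ols = ridge_weights(Z_toy, y_toy, lam=0.0)    # no regularization
w_ridge = ridge_weights(Z_toy, y_toy, lam=1.0)  # coefficients shrink toward zero
```

Increasing `lam` shrinks the coefficient vector, which is the effect the regularization term in Equation (7) is meant to have.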

Bayesian Ridge Regression estimates model parameters by maximizing the posterior probability, which combines the prior distribution and the likelihood function. The prior distribution is usually assumed to be Gaussian, $p(w \mid \alpha) \sim N(0, \alpha^{-1} I)$, where $\alpha$ is the precision (inverse variance) of the prior on the weights. The likelihood function assumes that the difference between the observed value and the predicted value follows a Gaussian distribution, $p(y \mid w, \beta) \sim N(Zw, \beta^{-1} I)$, where $\beta$ is the precision (inverse variance) of the noise. By Bayes’ theorem, the posterior distribution is $p(w \mid Z, y, \alpha, \beta) \propto p(y \mid w, \beta)\, p(w \mid \alpha)$. Finally, the predicted value of Bayesian Ridge Regression can be derived as $\hat{y} = Zw$. Once the meta-learner has been trained, it can be employed to make predictions on the test set. Each base learner in turn takes the test set $X_{test}$ as input, producing the prediction result matrix $\hat{p}_{test} = \begin{bmatrix} \hat{p}^{(1)}_{test} & \hat{p}^{(2)}_{test} & \cdots & \hat{p}^{(J)}_{test} \end{bmatrix}$. This matrix is then input into the meta-learner, resulting in the final prediction shown below.

$$\hat{y}_{test} = g(\hat{p}_{test}) = \hat{p}_{test}\, w, \tag{9}$$
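The link between the Gaussian prior and likelihood above and the ridge solution in Equation (8) can be made explicit with a short derivation. It is a standard result rather than something stated in the paper, and it assumes that $\alpha$ and $\beta$ are held fixed; in practice they are themselves estimated from the data.

```latex
\begin{aligned}
\ln p(w \mid Z, y, \alpha, \beta)
  &= \ln p(y \mid w, \beta) + \ln p(w \mid \alpha) + \text{const} \\
  &= -\frac{\beta}{2} \lVert y - Z w \rVert^{2}
     - \frac{\alpha}{2} \lVert w \rVert^{2} + \text{const}, \\
\hat{w}_{\mathrm{MAP}}
  &= \arg\min_{w} \left( \lVert y - Z w \rVert^{2}
     + \frac{\alpha}{\beta} \lVert w \rVert^{2} \right)
   = \left( Z^{T} Z + \frac{\alpha}{\beta} I \right)^{-1} Z^{T} y .
\end{aligned}
```

Comparing with Equations (7) and (8), the maximum a posteriori estimate is the ridge solution with $\lambda = \alpha / \beta$, so the regularization strength is determined by the ratio of the two precisions.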

The base learners selected are RF, GBDT, XGBoost, and LightGBM, chosen for their respective performance advantages and complementary characteristics. RF mitigates the risk of overfitting by constructing multiple decision trees and averaging their results, thereby exhibiting robust performance and the capacity to process high-dimensional data. GBDT enhances the model’s performance by incrementally fitting the residuals of preceding trees, thereby facilitating the identification of non-linear data relationships. XGBoost is optimized based on GBDT, and the model’s generalization ability and efficiency are enhanced through the implementation of weighted voting and regularization. Furthermore, LightGBM enhances the training speed and accuracy through its leaf-wise growth algorithm and histogram-based acceleration. The complementary advantages of these base models are as follows: the strong robustness and noise resistance of RF can compensate for the possible overfitting problems of GBDT and XGBoost; meanwhile, the advantages of GBDT and XGBoost in dealing with complex non-linear relationships can compensate for the shortcomings of RF in capturing fine data structures. The efficiency and accuracy of LightGBM can effectively enhance the training speed and prediction performance of the entire model.

The combination of multiple models allows for the full utilization of the advantages inherent to each model. In assessing the significance of features, RF determines the importance of each feature by calculating the mean reduced impurity (such as the Gini index or information gain) across all trees. Similarly, GBDT and XGBoost assess the significance of features based on the number of splits in all trees and their impact on error reduction. The LightGBM algorithm employs a feature selection method based on the square root to calculate the gain and splitting times of each feature, thereby determining its relative importance. These methods concentrate on the assessment of feature importance, thereby facilitating the provision of comprehensive information regarding the relative importance of individual features. The Bayesian probability framework is introduced into ridge regression, and the model parameters are estimated using the prior distribution and the posterior distribution. By adding the L2 regularization term to the loss function to penalize large coefficients and make them approach zero, the variance inflation caused by multicollinearity is reduced. This method enhances the robustness of model parameter estimation and ensures satisfactory prediction performance even in the presence of complex feature interactions.

3.4. Statistical Measures for Model Evaluation

The choice of evaluation metrics is crucial for quantifying a model’s performance and accuracy. Different metrics evaluate a model’s strengths and weaknesses from multiple dimensions. For tasks such as predicting corporate ESG scores, commonly used evaluation metrics include RMSE, MAE, and R² [41,42]. Each metric reflects the difference between model predictions and actual values from a different perspective. The expressions of these indicators are shown in Equations (10)–(12):

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}},    (10)

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_{i} - \hat{y}_{i}|,    (11)

R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}},    (12)

where y_{i} is the observed value, \hat{y}_{i} is the predicted value, \bar{y} is the mean of the observed values, and n is the number of samples. These evaluation metrics assess model performance in terms of the accuracy of the model’s predictions, the magnitude of the bias, and the model’s ability to account for variability.
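As a small worked example of Equations (10)–(12), the three metrics can be computed directly with NumPy. The four observed and predicted values below are made up for illustration only; note that the denominator of R² uses the mean of the observed values.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error, Equation (10)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error, Equation (11)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """Coefficient of determination, Equation (12)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Hypothetical values: errors are 1, 0, -1, 0.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

rmse_value = rmse(y_true, y_pred)  # sqrt(2/4), about 0.7071
mae_value = mae(y_true, y_pred)    # 2/4 = 0.5
r2_value = r2(y_true, y_pred)      # 1 - 2/20 = 0.9
```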

4. Experimental Results and Analysis

In this section, the RMSE, MAE, and R² values are employed to analyze the experimental results of SGM-BRR and seven additional models in predicting scores. Subsequently, the applicability of the model is verified by examining data from a variety of industries and dimensions. The results demonstrate that SGM-BRR exhibits enhanced performance in forecasting ESG, E, S, and G scores relative to a single model. It displays robust generalization ability and high prediction accuracy on cross-industry data sets. The model code was developed based on the Scikit-Learn and TensorFlow frameworks and subsequently verified in the PyCharm environment.

4.1. Prediction Performance

To ensure a reliable evaluation, the data are divided into a training set and a test set at a ratio of 8:2. Besides BRR, several other meta-models are compared, including linear regression (LR), an artificial neural network (ANN), and ridge regression. Experiments are conducted on the ESG, E, S, and G scores. The results show that the relative performance of the meta-models varies across scoring dimensions, but overall SGM-BRR performs well on most indicators, especially R². The results are shown in Table 4.

Subsequently, random samples are drawn from the specified hyperparameter space to identify the optimal parameter combination for SGM-BRR and the remaining seven models. Compared with the conventional grid search approach, random search can investigate a greater number of parameter combinations within the same computational budget, which increases the likelihood of identifying the optimal parameters. For RF, XGBoost, GBDT, and LightGBM, the number of estimators and the maximum depth are tuned. For SGM-BRR, the main parameters tuned are the number of iterations and the BRR regularization parameters alpha_1, alpha_2, lambda_1, and lambda_2. The search range is set to the interval (100, 500). For the neural network, two parameters are tuned: the number of network layers and the number of neurons in each layer. To assess each parameter combination, we employ cross-validation, in which the training data are repeatedly divided into training and validation sets and the model’s average score on the validation sets is calculated [43]. Finally, the parameter combination with the highest validation score is selected as optimal. The parameter settings are presented in Table 5. The optimized model is then validated on the test set, and the results are presented in Table 6.
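The random search with cross-validation described above could look roughly as follows for the BRR meta-model. This is a hedged sketch: the candidate values are hypothetical (the actual settings are those in Table 5), the synthetic four-column input merely stands in for the predictions of four base learners, and the 10-fold setting follows the cross-validation scheme stated in the Conclusions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in: four columns playing the role of base-learner predictions.
X, y = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)

# Hypothetical candidate values for the Gamma-prior hyperparameters of
# scikit-learn's BayesianRidge; they are not taken from the paper.
candidates = [1e-7, 1e-6, 1e-5, 1e-4]
param_distributions = {
    "alpha_1": candidates,
    "alpha_2": candidates,
    "lambda_1": candidates,
    "lambda_2": candidates,
}

search = RandomizedSearchCV(
    BayesianRidge(),
    param_distributions=param_distributions,
    n_iter=10,                              # number of random combinations tried
    cv=10,                                  # 10-fold cross-validation
    scoring="neg_root_mean_squared_error",  # higher (less negative) is better
    random_state=0,
)
search.fit(X, y)

best_params = search.best_params_  # combination with the best mean validation score
best_rmse = -search.best_score_    # mean validation RMSE of that combination
```

Only 10 of the 256 possible combinations are evaluated here, which illustrates how random search trades exhaustive coverage for a fixed computational budget.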

As illustrated in Table 6, SGM-BRR has an RMSE of 3.4706, an MAE of 2.5081, and an R² of 0.8193 in ESG score prediction. Compared with the other benchmark models, the RMSE and MAE are reduced by 18.4% and 15.4%, respectively, and the R² increases by 2% on average. For the environmental score, the RMSE and MAE of SGM-BRR are reduced by 17.3% and 18.4%, respectively, and the R² increases by 1.4% on average. For the social score, the RMSE and MAE decrease by approximately 13.7% and 15.8%, respectively, and the R² increases by approximately 2% on average. For the governance score, the RMSE and MAE are reduced by 76.1% and 68.4%, respectively, and the R² increases by approximately 6%. These findings demonstrate that integrating multiple models and making comprehensive use of their respective advantages can effectively address non-linear relations and capture feature interactions. The penalty term in the optimization objective of Bayesian Ridge Regression prevents the prediction error of a single base model from exerting an excessive influence on the overall result. This not only enhances the stability of the predictive model, but also improves its adaptability to different data distributions.

In practical application, enterprises may utilize SGM-BRR to enhance the efficacy of their investment decision-making processes and ESG reporting frameworks. SGM-BRR assists enterprises in more accurately evaluating the environmental awareness, social responsibility performance, and governance ability of potential investment targets. This is achieved by providing more precise score predictions. To comprehensively evaluate the sustainability of enterprises, reduce investment risks, and improve the rate of return, it is necessary to employ a methodology that can provide accurate and reliable data. In the context of ESG reporting, the use of more accurate ESG score prediction enables enterprises to more effectively disclose their environmental, social, and governance performance, enhance the accuracy and consistency of the report, and strengthen the confidence of investors. Consequently, incorporating SGM-BRR into the investment decision-making process and ESG reporting framework can not only enhance the scientific and rational basis of investment decisions, but also reinforce the credibility of ESG reporting, thereby enhancing the sustainable development capabilities of enterprises.

Scatter plots were generated for the predicted ESG, E, S, and G scores, as illustrated in Figure 5. Each point represents one sample. The abscissa represents the true value, and the ordinate represents the predicted value. The solid red line represents the case in which the predicted result is identical to the true value. The more tightly the points cluster around the red line, the closer the predicted values are to the true values, that is, the closer R² is to 1. As illustrated in Figure 5, the RMSE of SGM-BRR for the ESG score is 3.4706, the MAE is 2.5081, and the R² is 0.8193. These values show that the model is highly accurate, with a small discrepancy between predicted and actual values and points concentrated around the red line, indicating a good fit. In the environmental dimension, the RMSE of SGM-BRR is 5.4704, the MAE is 3.4329, and the R² is 0.7771. Although the error is slightly larger than for the ESG score, SGM-BRR captures the variation in environmental data well and produces more stable and accurate predictions than the other models. In the social dimension, the RMSE of SGM-BRR is 3.1883, the MAE is 2.0388, and the R² is 0.7697, indicating a high degree of accuracy. Fluctuations are evident in the sample data of the social dimension; however, the model fits these data well. In the governance dimension, the RMSE of SGM-BRR is 7.1270, the MAE is 4.6450, and the R² is 0.7447. Due to the significant disparity among governance samples, the predictive performance is not as strong as in the other dimensions. Nevertheless, SGM-BRR shows a clear advantage over the other models in handling the complexity and ambiguity inherent in governance data.

4.2. Further Analyses

This study analyzes the performance of different models in predicting a company’s overall ESG score and its E, S, and G dimension scores. The results demonstrate that SGM-BRR outperforms other ensemble models and deep learning models on multiple evaluation indicators. Because industries face different environmental, social, and governance concerns, the variables influencing ESG scores also vary across industries. To further verify the effectiveness of the model and provide a more in-depth analysis, the performance of the models in forecasting corporate ESG scores across industries is therefore investigated. Enterprises were grouped according to the Global Industry Classification Standard (GICS), and industries with fewer than 100 samples were excluded. Ultimately, nine industry classifications were retained, and the data set was divided at a ratio of 8:2 for model fitting. The performance of the models on the ESG, E, S, and G scores is then compared across industries, showing the correspondence between the predicted values of all models and the actual values observed in the industry test sets. As illustrated in Figures 6–9, each subgraph depicts the results for the ESG, E, S, and G scores in a given industry.
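The industry-level preparation described above (dropping industries with fewer than 100 samples, then splitting at 8:2) might be sketched with pandas as follows. The industry names, sample counts, and column names are hypothetical, and splitting each industry separately is an assumption about how the per-industry test sets are formed.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data: hypothetical industries and sample counts.
sizes = {"Energy": 150, "Materials": 300, "Utilities": 120, "Tiny sector": 40}
frames = []
for name, n in sizes.items():
    frames.append(pd.DataFrame({
        "industry": name,                                   # scalar is broadcast
        "feature": rng.normal(size=n),
        "esg_score": rng.normal(loc=50, scale=10, size=n),
    }))
df = pd.concat(frames, ignore_index=True)

# Keep only industries with at least 100 samples.
MIN_SAMPLES = 100
counts = df["industry"].value_counts()
kept = counts[counts >= MIN_SAMPLES].index
df_kept = df[df["industry"].isin(kept)]

# Split each retained industry into training and test sets at 8:2.
splits = {}
for name, group in df_kept.groupby("industry"):
    train, test = train_test_split(group, test_size=0.2, random_state=0)
    splits[name] = (train, test)
```

Each entry of `splits` can then be used to fit and evaluate the models on one industry at a time.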

As can be seen from Table 7, there are significant differences in the predictive effectiveness of the different models in each industry under the four dimensions of ESG, E, S, and G.

In the ESG dimension, SGM-BRR performs best in all industries, with the best RMSE, MAE, and R². The benchmark ensemble models do not perform well in most industries, while SGM-BRR is more stable across industries. In the energy sector in particular, SGM-BRR has an RMSE of 3.2937, an MAE of 2.4409, and an R² of 0.8971, which reduces the RMSE and MAE by about 1.2366 and 1.062 on average and improves R² by an average of 10.89% compared with the other models. Although the improvement in R² is not as high as in the real estate and health care sectors, the energy sector has the highest R² of any sector, indicating that the model adapts well to the data characteristics of the energy sector. In contrast, the consumer staples industry does not show a significant improvement, with an RMSE of 4.3355, an MAE of 2.5240, and an R² of 0.6669, although SGM-BRR is still better than the other models. This may be due to the higher volatility and non-linear characteristics of data in the consumer staples industry, which pose a greater challenge to the model.

In the E dimension, SGM-BRR performs best on all metrics in the materials, industrial, energy, information technology, and health care industries. In the energy industry in particular, SGM-BRR has an RMSE of 5.9769 and an R² of 0.8246, the largest improvements over the other models among all industries, with an average reduction of 1.2366 in RMSE and an average improvement of 6.98% in R². The MAE is 4.3561, an average reduction of 0.5326 compared with the other models, second only to the 0.5573 reduction in health care. This indicates that SGM-BRR captures energy-consumption-related features better in the ESG and E dimensions.

In the S dimension, SGM-BRR performs best on all indicators in the real estate, industrial, utilities, and consumer staples industries. Its RMSE in the utilities industry is 3.8377, the largest RMSE improvement among all industries (0.5169); its MAE in the energy industry is 2.1004, the largest MAE improvement among all industries (0.3599); and its R² in the real estate industry is 0.6258, the largest R² improvement among all industries (11.15%). These results indicate that SGM-BRR excels in capturing the social characteristics of these industries.

In the G dimension, SGM-BRR performs best in the materials, real estate, utilities, energy, information technology, and consumer staples industries. In the real estate industry, SGM-BRR has an RMSE of 7.2681 and an R² of 0.7229, the largest improvements among all industries, with an average reduction of about 3.343 in RMSE and an average improvement of 37.20% in R² compared with the other models. The MAE is 5.2091, an average reduction of 2.4640 compared with the other models, second only to the 2.6271 reduction in the energy industry. In contrast, other models such as LSTM and RNN show more fluctuating performance in the S and G dimensions, failing to show a consistent advantage and in some cases yielding a negative R².

Nevertheless, in the consumer staples industry, the high volatility and non-linear characteristics of the data mean that the improvement in prediction accuracy is not readily apparent. This indicates that caution should be exercised when dealing with data from high-volatility industries. Overall, SGM-BRR achieves excellent prediction results across multiple industries and dimensions. Of the 36 experimental results in total, SGM-BRR achieves the best result in 27 cases for both RMSE and R², and in 25 cases for MAE. SGM-BRR is best on all three indicators simultaneously in 24 cases. This further indicates that the proposed model has stronger stability and generalization ability. These analyses provide a useful reference for enterprises choosing a forecasting model and can help them achieve more accurate forecasts in different dimensions.

Figure 6 illustrates the fitting results of ESG scores for all models on the industry test sets. Compared with the benchmark models, SGM-BRR shows the best fit across industries, indicating robust generalization and stability in ESG score prediction. In the energy, materials, and industrial sectors in particular, the sample points lie closer to the diagonal than in other industries, which indicates that SGM-BRR extracts features effectively in these sectors. The R² value for the energy industry reaches 0.8971. Furthermore, the RMSE value for the SGM-BRR model is the lowest across all industries. Additionally, the MAE value for the real estate industry is also the lowest among all models.

Figure 7 shows the fitting results for the E score on each industry test set. Although SGM-BRR maintains the best E score prediction performance in most industries, GRU and RNN show better results in the real estate, utilities, and consumer goods industries. This may be due to the strengths of these models in processing time series data and capturing long-term dependencies, which make them more effective in industry-specific forecasts. In the energy industry, SGM-BRR still performs best in the environmental dimension, with an RMSE of 5.9769, an MAE of 4.3561, and an R² of 0.8246. A discrepancy remains between the models’ predicted values and the actual values of ESG scores and E scores. Compared with the other models, the predictions of the deep learning models fluctuate considerably across industries, because deep models require a substantial quantity of data. SGM-BRR achieves the most favorable outcomes in most industries.

Figure 8 and Figure 9 show the fitting results for the S score and G score on each industry test set, including the gap between the predicted and true values for the different models. For the S score, SGM-BRR gives the best predictions in the materials, real estate, industrial, utilities, and consumer staples industries. In the consumer discretionary and health care industries, GRU has the best overall performance; in the information technology industry, the best model is LSTM; and in the energy industry, XGBoost performs best. These differences reflect the adaptability of each model to specific data structures and industry characteristics. For the G score, SGM-BRR gives the best predictions in all industries except the industrial and health care industries, highlighting its ability to analyze multi-dimensional data.

Figure 10 shows the experimental results of SGM-BRR. It shows the result histogram of MAE, RMSE, and R² in different industries with different scoring dimensions. From the figure, we find that the value of RMSE is the largest in the environmental score and has lower values in the ESG and governance dimensions. MAE has lower values in the ESG and social dimensions, and higher values in the governance dimension. RMSE has the lowest value among social scores, especially in the real estate industry. There is no obvious difference in R², indicating that the model has no obvious difference in scoring fitting in different dimensions in different industries and has strong generalization ability.

As shown in Figure 10, the RMSE, R², and MAE of SGM-BRR in predicting ESG and E scores in the nine industries are better than those of the other models, highlighting the superiority of its overall performance. In the energy sector in particular, SGM-BRR has an RMSE of 3.2937, an MAE of 2.4409, and an R² of 0.8971. Compared with the other models, the RMSE and MAE are reduced by about 1.2366 and 1.062 on average, respectively, and the R² is increased by 1.9%, indicating that SGM-BRR has a strong ability to capture and learn the characteristics of the industry. Its performance in the industrial sector is also outstanding, with an RMSE of 3.2426, an MAE of 2.3452, and an R² of 0.8467. Compared with the other models, the RMSE and MAE are reduced by about 0.3426 and 0.2964 on average, respectively, and the R² is increased by about 0.0321. This reflects the good adaptability of the model to the characteristics of data in the industrial sector. In contrast, the forecast for the consumer staples industry is weaker, with an RMSE of 4.3355, an MAE of 2.5240, and an R² of 0.6669, but it is still better than that of the other models. This may be because the consumer staples industry data have strong volatility and non-linear characteristics, which pose a greater challenge to the model.

In summary, the integration of multiple base models, including RF, XGBoost, LightGBM, and GBDT, together with the use of Bayesian Ridge Regression as the meta-model in SGM-BRR, effectively combines the advantages of each base model, thereby improving prediction accuracy and generalization ability. Upon the examination of the experimental results, it can be observed that the prediction performance of SGM-BRR is superior to that of the baseline model in most industries. Regarding the comprehensive prediction of ESG scores, the performance of SGM-BRR is demonstrably superior in all industries. This not only corroborates the efficacy and dependability of SGM-BRR, but also underscores its remarkable adaptability in dealing with intricate and fluctuating data attributes and industry-specific nuances.

5. Conclusions

SGM-BRR was constructed to accurately predict the ESG scores of companies in China. The model integrates RF, GBDT, XGBoost, and LightGBM as base models. Bayesian Ridge Regression is used as a meta-model to combine the base models and regularize them, thereby offsetting the deviations and shortcomings of a single model. The data set employed in this study comprises the financial records of Chinese A-share listed companies from 2012 to 2020, encompassing 21 features. The model is optimized through 10-fold cross-validation and a random search algorithm. The RMSE, MAE, and R² are employed to assess the models. The following conclusions are drawn:

(1): SGM-BRR demonstrates superior predictive performance compared with both the single ensemble learning models and the neural network models in predicting the ESG, E, S, and G scores. For the ESG score, the RMSE decreases by 18.4%, the MAE decreases by 15.4%, and the R² increases by 2%. For the E score, the RMSE decreases by 17.3%, the MAE decreases by 18.4%, and the R² increases by 1.4%. For the S score, the RMSE decreases by 13.7%, the MAE decreases by 15.8%, and the R² increases by approximately 2%. For the G score, the RMSE decreases by 76.1%, the MAE decreases by 68.4%, and the R² increases by approximately 6%. Among the three dimensions of the ESG score, the R² value for the E score is the highest, followed by that for the S score.
(2): To further study the prediction performance of the model in different industries, this paper classifies enterprises according to the Global Industry Classification Standard and removes industries with too few samples. The scoring data for each dimension are then divided into 9 industries, resulting in a total of 36 experimental results. SGM-BRR achieves the best RMSE in 27 groups, the best MAE in 25 groups, and the best R² in 27 groups.
(3): The findings indicate that SGM-BRR exhibits enhanced generalization and adaptability across diverse data and feature types. The industries with the most favorable R² performance in ESG score prediction by SGM-BRR are, in order, energy, materials, and industrial. For the E score, the industries with the highest performance are energy, real estate, and industrial. For the S score, the industries with the highest performance are energy, health care, and materials. For the G score, the industries with the highest performance are health care, industrial, and information technology.

The SGM-BRR model proposed in this study explains more than 80% of the variance in ESG scores (R² = 0.8193). With such predictions, investors can better understand the tangible outcomes of enterprises in environmental stewardship, social responsibility, and corporate governance, as well as the sustainable growth trajectory of target enterprises. This will facilitate prudent investment decisions. Furthermore, enterprises may use SGM-BRR to forecast and assess their own ESG performance, thereby enhancing the precision and reliability of ESG reports. This is of significant consequence for the transparency and market reputation of enterprises. While SGM-BRR demonstrates overall efficacy, it has potential limitations in specific contexts. External factors, such as extreme weather events, changes in social policies, and the influence of public opinion, may cause deviations and stability problems in forecasting. Accordingly, additional variables should be introduced and dynamically adjusted to enhance the precision and reliability of the predictions.

In addition, data sets from other countries and regions, such as Europe and the Asia–Pacific region, can be used for further verification. Such heterogeneous data sets will allow a comprehensive evaluation of the universality of SGM-BRR. The transnational application of such models will encounter challenges, given the difficulties in obtaining data, the discrepancies in data quality, and the differing regulatory and policy environments across countries. Accordingly, future verification of the model will be conducted on a transnational basis to assess its applicability and accuracy on a global scale.

Author Contributions

Conceptualization, X.L., Z.W. and X.W.; methodology, X.W., Z.W. and J.M.; software, X.W.; validation, J.Z.; resources, J.Z. and X.L.; data curation, X.W. and Z.W.; writing—original draft preparation, X.W.; writing—review and editing, J.X., Z.W. and X.L.; project administration, J.Z. and J.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under the project “Experimental study on the evolution process of flow field structure of dusty gas explosion propagation based on accident prevention” (Grant No. 51874134).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no competing interest.

Abbreviations

Abbreviation	Full Term
ESG	Environmental, Social, and Governance
SGM-BRR	Stacked Generalization Model with Bayesian Ridge Regression
RF	Random Forest
GBDT	Gradient Boosting Decision Tree
XGBoost	eXtreme Gradient Boosting
LightGBM	Light Gradient Boosting Machine
LR	Linear Regression
ANN	Artificial Neural Network
BRR	Bayesian Ridge Regression
SGM	Stacked Generalization Model
PRI	Principles for Responsible Investment
ROA	Return on Assets
ROE	Return on Equity
RMSE	Root Mean Squared Error
MAE	Mean Absolute Error
R²	R-squared coefficient
RNN	Recurrent Neural Network
LSTM	Long Short-Term Memory
GRU	Gated Recurrent Unit
E score	Environment Score
S score	Social Score
G score	Governance Score

References

1. Sonko, K.N.; Sonko, M. Demystifying Environmental, Social and Governance (ESG). In Palgrave Studies in Impact Finance; Palgrave Macmillan: Cham, Switzerland, 2023.
2. Friede, G. Why don’t we see more action? A metasynthesis of the investor impediments to integrate environmental, social, and governance factors. Bus. Strategy Environ. 2019, 28, 1260–1282.
3. Read, C. Understanding Sustainability Principles and ESG Policies: A Multidisciplinary Approach to Public and Corporate Responses to Climate Change; Springer Nature: Berlin, Germany, 2023.
4. Tsalis, T.A.; Terzaki, M.; Koulouriotis, D.; Tsagarakis, K.P.; Nikolaou, I.E. The nexus of United Nations’ 2030 Agenda and corporate sustainability reports. Sustain. Dev. 2023, 31, 784–796.
5. Zhang, Z.; Li, J.; Guan, D. Value chain carbon footprints of Chinese listed companies. Nat. Commun. 2023, 14, 2794.
6. Wang, J.; Liu, Z.; Shi, L.; Tan, J. The impact of low-carbon pilot city policy on corporate green technology innovation in a sustainable development context—Evidence from Chinese listed companies. Sustainability 2022, 14, 10953.
7. Zhou, B.; Huang, Y.; Zhao, Y. Research on the incentive effect of the policy combination of carbon-reduction pilot cities. Int. Rev. Econ. Financ. 2024, 91, 456–475.
8. Xiao, X.; He, G.; Zhang, S.; Zhang, S. Impact of China’s Low-Carbon City Pilot Policies on Enterprise Energy Efficiency. Sustainability 2023, 15, 10440.
9. Zeng, S.; Li, T.; Wu, S.; Gao, W.; Li, G. Does green technology progress have a significant impact on carbon dioxide emissions? Energy Econ. 2024, 133, 107524.
10. Nigam, S.; Behera, A.P.; Gogoi, M.; Verma, S.; Nagabhushan, P. Strike off removal in Indic scripts with transfer learning. Neural Comput. Appl. 2023, 35, 12927–12943.
11. Joubrel, M.; Maksimovich, E. ESG Data and Scores. In Valuation and Sustainability: A Guide to Include Environmental, Social, and Governance Data in Business Valuation; Springer: Berlin/Heidelberg, Germany, 2023; pp. 67–98.
12. Chatterji, A.K.; Durand, R.; Levine, D.I.; Touboul, S. Do ratings of firms converge? Implications for managers, investors and strategy researchers. Strateg. Manag. J. 2016, 37, 1597–1614.
13. Ramani, R.S.; Aguinis, H. Using field and quasi experiments and text-based analysis to advance international business theory. J. World Bus. 2023, 58, 101463.
14. Ranta, M.; Ylinen, M.; Jarvenpaa, M. Machine Learning in Management Accounting Research: Literature Review and Pathways for the Future. Eur. Account. Rev. 2023, 32, 607–636.
15. Lee, J.; Kim, M. ESG information extraction with cross-sectoral and multi-source adaptation based on domain-tuned language models. Expert Syst. Appl. 2023, 221, 119726.
16. D’Amato, V.; D’Ecclesia, R.; Levantesi, S. Firms’ profitability and ESG score: A machine learning approach. Appl. Stoch. Models Bus. Ind. 2023, 40, 243–261.
17. Henrique, B.M.; Sobreiro, V.A.; Kimura, H. Literature review: Machine learning techniques applied to financial market prediction. Expert Syst. Appl. 2019, 124, 226–251.
18. Sandberg, H.; Alnoor, A.; Tiberius, V. Environmental, social, and governance ratings and financial performance: Evidence from the European food industry. Bus. Strategy Environ. 2023, 32, 2471–2489.
19. Apergis, N.; Poufinas, T.; Antonopoulos, A. ESG scores and cost of debt. Energy Econ. 2022, 112, 106186.
20. Drempetic, S.; Klein, C.; Zwergel, B. The Influence of Firm Size on the ESG Score: Corporate Sustainability Ratings Under Review. J. Bus. Ethics 2020, 167, 333–360.
21. Aliani, K.; Hamza, F.; Alessa, N.; Borgi, H.; Albitar, K. ESG disclosure in G7 countries: Do board cultural diversity and structure policy matter? Corp. Soc. Responsib. Environ. Manag. 2024, 31, 3031–3042.
22. Cohen, S.; Kadach, I.; Ormazabal, G.; Reichelstein, S. Executive Compensation Tied to ESG Performance: International Evidence. J. Account. Res. 2023, 61, 805–853.
23. Garcia, F.; Gonzalez-Bueno, J.; Guijarro, F.; Oliver, J. Forecasting the Environmental, Social, and Governance Rating of Firms by Using Corporate Financial Performance Variables: A Rough Set Approach. Sustainability 2020, 12, 3324.
24. D’Amato, V.; D’Ecclesia, R.; Levantesi, S. ESG score prediction through random forest algorithm. Comput. Manag. Sci. 2022, 19, 347–373.
25. Raza, H.; Khan, M.A.; Mazliham, M.S.; Alam, M.M.; Aman, N.; Abbas, K. Applying artificial intelligence techniques for predicting the environment, social, and governance (ESG) pillar score based on balance sheet and income statement data: A case of non-financial companies of USA, UK, and Germany. Front. Environ. Sci. 2022, 10, 975487.
26. Del Vitto, A.; Marazzina, D.; Stocco, D. ESG ratings explainability through machine learning techniques. Ann. Oper. Res. 2023, 1–30.
27. Krappel, T.; Bogun, A.; Borth, D. Heterogeneous ensemble for ESG ratings prediction. arXiv 2021, arXiv:2109.10085.
28. Singh, D.; Singh, B. Investigating the impact of data normalization on classification performance. Appl. Soft. Comput. 2020, 97, 105524.
29. Parvin, H.; MirnabiBaboli, M.; Alinejad-Rokny, H. Proposing a classifier ensemble framework based on classifier selection and decision tree. Eng. Appl. Artif. Intell. 2015, 37, 34–42.
30. Sagi, O.; Rokach, L. Explainable decision forest: Transforming a decision forest into an interpretable tree. Inf. Fusion 2020, 61, 124–138.
31. Azar, A.T.; El-Metwally, S.M. Decision tree classifiers for automated medical diagnosis. Neural Comput. Appl. 2013, 23, 2387–2403.
32. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
33. Liu, W.; Fan, H.; Xia, M. Credit scoring based on tree-enhanced gradient boosting decision trees. Expert Syst. Appl. 2022, 189, 116034.
34. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794.
35. Dong, J.; Chen, Y.; Yao, B.; Zhang, X.; Zeng, N. A neural network boosting regression model based on XGBoost. Appl. Soft. Comput. 2022, 125, 109067.
36. Zhao, C.; Wu, D.; Huang, J.; Yuan, Y.; Zhang, H.-T.; Peng, R.; Shi, Z. BoostTree and BoostForest for Ensemble Learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 8110–8126.
37. Bentejac, C.; Csorgo, A.; Martinez-Munoz, G. A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967.
38. Sun, X.; Liu, M.; Sima, Z. A novel cryptocurrency price trend forecasting model based on LightGBM. Financ. Res. Lett. 2020, 32, 101084.
39. Tipping, M.E. Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 2001, 1, 211–244.
40. Ganaie, M.A.; Hu, M.; Malik, A.K.; Tanveer, M.; Suganthan, P.N. Ensemble deep learning: A review. Eng. Appl. Artif. Intell. 2022, 115, 105151.
41. Tsionas, M.G. Multi-criteria optimization in regression. Ann. Oper. Res. 2021, 306, 7–25.
42. Bilokha, A.; Cheng, M.; Fu, M.; Hasan, I. Understanding CSR champions: A machine learning approach. Ann. Oper. Res. 2024, 1–14.
43. Dai, Q. A competitive ensemble pruning approach based on cross-validation technique. Knowl.-Based Syst. 2013, 37, 394–414.

Figure 1. The research framework.

Figure 2. Corporate ESG score distribution chart.

Figure 3. Company industry distribution.

Figure 4. The framework of SGM-BRR.

Figure 5. Fitting diagram of score prediction results of different models. (A) Fitting diagram of ESG score model prediction results. (B) Fitting diagram of E score model prediction results. (C) Fitting diagram of S score model prediction results. (D) Fitting diagram of G score model prediction results.

Figure 6. Fit of predicted vs. true ESG scores across industries.

Figure 7. Fit of predicted vs. true E scores across industries.

Figure 8. Fit of predicted vs. true S scores across industries.

Figure 9. Fit of predicted vs. true G scores across industries.

Figure 10. Metric results for the SGM-BRR models.

Table 1. Characteristic indicators and introduction.

No.	Feature	Feature Description and Quantification
1	Financial Expenses	The costs a company incurs in financing its operations.
2	Company Size	A measure of the scope of a company’s operations.
3	Debt-to-Asset Ratio	A financial ratio that expresses the percentage of a company’s assets financed by debt.
4	Earnings Per Share (EPS)	A company’s net income divided by its number of outstanding shares, indicating per-share profitability.
5	Price-to-Earnings Ratio (P/E Ratio)	The valuation ratio of a company’s current stock price to its earnings per share.
6	Total Assets	The total of all assets owned by a company, including current and non-current assets.
7	Total Liabilities	The total amount of debt a company owes to outside parties.
8	Total Share Capital	The total value of all of a company’s outstanding shares.
9	Net Profit	A company’s total earnings, after subtracting all expenses and taxes from its total revenue.
10	Total Current Assets	The total of all assets of a business that are expected to be converted into cash within a year or one operating cycle.
11	Return on Assets (ROA)	A financial ratio that measures how profitably a company uses its assets.
12	Beta Value	A measure of a stock’s volatility relative to the overall market.
13	Industry	A category or division of a company’s operations that is characterized by common business activities.
14	Employees	The total number of people employed by the company.
15	Directors	The total number of persons serving on a company’s board of directors.
16	Percentage of Female Directors	Female board members as a percentage of total directors.
17	Number of Independent Directors	The number of directors who have no material or pecuniary relationship with the company or its associates; their presence helps ensure the impartiality of decisions.
18	Carbon Emissions	The total amount of carbon dioxide emissions produced where the company is located.
19	Per Capita Carbon Emissions	Per capita carbon dioxide emissions in the province where the company is located.
20	Carbon Emission Intensity	Carbon emissions per unit of GDP where the company is located.
21	Lagging ESG Score	The company’s ESG score for the previous year.

Table 2. Descriptive statistics of ESG scores and their components.

Statistic	ESG Score	E Score	S Score	G Score
Mean	29.39	10.58	13.82	65.27
Median	28.60	6.98	11.64	69.30
Std	8.40	12.15	6.88	13.93

Table 3. Statistical analysis of feature data.

Feature	Mean	Std	Min	Median	Max
Financial Expenses (Billion)	0.424	1.413	4.846	0.073	0.301
Company Size (Billion)	43.94	153.07	0.22	11.59	28.60
Debt-to-Assets Ratio	0.48	0.19	0.01	0.49	0.63
EPS	0.68	1.27	0	0.42	0.84
P/E Ratio	66.47	203.03	0.3	26.43	51.94
Total Assets (Billion)	48.64	165.11	0.333	13.08	31.82
Total Liabilities (Billion)	29.93	104.72	0.009	6.04	17.14
Total Share Capital (Billion)	2.99	9.54	0.039	1.24	2.42
Net Profit (Billion)	1.89	5.91	0.001	0.54	1.34
Total Current Assets (Billion)	24.02	88.10	0.082	5.83	14.40
ROA	5.74	5.38	0.01	4.23	7.91
Beta Value	1.1	0.28	0.06	1.1	1.28
Industry	4.36	2.69	1	3	7
Employees	15,576	37,988	33	5616	13,597
Directors	9.17	1.95	4	9	10
Percentage of Female Directors	0.13	0.12	0	0.11	0.22
Independent Directors	3.39	0.71	2	3	4
Carbon Emissions	49,023.02	33,573.15	5063.4	42,779.28	69,203.56
Per Capita Carbon Emissions	8.29	5.68	1.31	6.3	8.38
Carbon Emission Intensity	1.39	1.28	0.25	0.99	1.6
Lagging ESG Score	27.65	7.94	11.16	27.16	31.59

Table 4. Prediction results of different meta-model combinations.

Score	Model	RMSE	MAE	R²	Score	Model	RMSE	MAE	R²
ESG	SGM-ANN	3.4852	2.4870	0.8178	E	SGM-ANN	5.4831	3.4329	0.7760
	SGM-Bridge	3.4716	2.5096	0.8192		SGM-Bridge	5.4852	3.4365	0.7759
	SGM-LR	3.4721	2.5113	0.8192		SGM-LR	5.4839	3.4413	0.7760
	SGM-BRR	3.4706	2.5081	0.8193		SGM-BRR	5.4704	3.4328	0.7771
S	SGM-ANN	3.1900	2.0814	0.7695	G	SGM-ANN	7.1337	4.7550	0.7442
	SGM-Bridge	3.1924	2.0469	0.7691		SGM-Bridge	7.1345	4.6574	0.7442
	SGM-LR	3.1979	2.0526	0.7683		SGM-LR	7.129	4.647	0.7446
	SGM-BRR	3.1883	2.0388	0.7697		SGM-BRR	7.127	4.645	0.7447
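
Tables 4, 6, and 7 all report RMSE, MAE, and R². As a reminder of how these three quantities are computed, the following is a minimal sketch in plain Python using the standard textbook definitions. The function names and the toy numbers are illustrative assumptions; they are not the authors' code or data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy scores (made-up values, for illustration only).
y_true = [30.0, 25.0, 40.0, 35.0]
y_pred = [28.0, 27.0, 39.0, 36.0]

print("RMSE:", rmse(y_true, y_pred))
print("MAE: ", mae(y_true, y_pred))
print("R2:  ", r2(y_true, y_pred))
```

Lower RMSE and MAE and higher R² indicate a better fit, which is the sense in which the rows of the tables are compared.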

Table 5. Model parameters.

Method	Hyperparameter	Search Space	Default Value
RF	Number of estimators	(10, 200)	160
RF	Max depth	(3, 20)	12
XGBoost	Number of estimators	(10, 200)	36
XGBoost	Max depth	(3, 20)	5
GBDT	Number of estimators	(10, 200)	62
GBDT	Max depth	(3, 20)	6
LightGBM	Number of estimators	(10, 200)	63
LightGBM	Max depth	(3, 20)	12
SGM-BRR	n_iter	(100, 500)	330
	alpha_1	(1 × 10⁻⁶, 1 × 10⁻⁴)	0.00006
	alpha_2	(1 × 10⁻⁶, 1 × 10⁻⁴)	0.00005
	lambda_1	(1 × 10⁻⁶, 1 × 10⁻⁴)	0.00009
	lambda_2	(1 × 10⁻⁶, 1 × 10⁻⁴)	0.00001
RNN	Number of layers	1, 2, 3	2
RNN	Number of neurons	(50, 200)	141
LSTM	Number of layers	1, 2, 3	2
LSTM	Number of neurons	(50, 200)	66
GRU	Number of layers	1, 2, 3	2
GRU	Number of neurons	(50, 200)	104
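
To make the role of the Table 5 settings concrete, the following is a rough sketch of a stacked regressor with a Bayesian Ridge meta-model, written with scikit-learn. It is an illustration under several assumptions and not the authors' implementation: the data are synthetic, the 5-fold setting is assumed, and only the RF and GBDT base learners are included so that the sketch depends on scikit-learn alone. XGBoost and LightGBM regressors could be appended to the same list of base learners.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import train_test_split

# Synthetic stand-in with 21 features (the count in Table 1); NOT the paper's data.
X, y = make_regression(n_samples=300, n_features=21, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Base learners with the estimator counts and depths listed in Table 5.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=160, max_depth=12, random_state=0)),
    ("gbdt", GradientBoostingRegressor(n_estimators=62, max_depth=6, random_state=0)),
]

# Meta-model with the alpha/lambda values listed in Table 5. The iteration
# count (330 in Table 5) is left at the library default here because the
# name of that argument differs between scikit-learn versions.
meta_model = BayesianRidge(alpha_1=6e-5, alpha_2=5e-5, lambda_1=9e-5, lambda_2=1e-5)

# cv=5 is an assumption: the meta-model is fitted on out-of-fold predictions
# of the base learners.
stack = StackingRegressor(estimators=base_learners, final_estimator=meta_model, cv=5)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
print("test R^2 on synthetic data:", stack.score(X_test, y_test))
```

Fitting the meta-model on out-of-fold predictions, rather than on in-sample predictions, keeps it from simply rewarding the base learner that overfits the training set most.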

Table 6. Prediction model evaluation performance index results.

Score	Model	RMSE	MAE	R²	Score	Model	RMSE	MAE	R²
ESG	RF	3.5881	2.5834	0.8069	E	RF	5.5451	3.5022	0.7710
	XGBoost	3.6953	2.6722	0.7952		XGBoost	5.8718	3.8057	0.7432
	GBDT	3.6006	2.6409	0.8055		GBDT	5.5206	3.4671	0.7730
	LightGBM	3.4948	2.5157	0.8168		LightGBM	5.7268	3.6697	0.7557
	RNN	3.7515	2.7169	0.7889		RNN	5.5638	3.5127	0.7694
	GRU	3.7472	2.7316	0.7894		GRU	5.6905	3.8163	0.7588
	LSTM	3.7911	2.8204	0.7844		LSTM	5.6775	3.5457	0.7599
	SGM-BRR	3.4706	2.5081	0.8193		SGM-BRR	5.4704	3.4328	0.7771
S	RF	3.2787	2.1142	0.7565	G	RF	7.3916	4.7959	0.7254
	XGBoost	3.4626	2.2862	0.7284		XGBoost	7.4989	4.9933	0.7174
	GBDT	3.2110	2.0694	0.7664		GBDT	7.4740	5.0120	0.7192
	LightGBM	3.2867	2.1168	0.7553		LightGBM	7.1882	4.6512	0.7403
	RNN	3.2991	2.2300	0.7534		RNN	8.5494	6.0684	0.6326
	GRU	3.3552	2.2376	0.7450		GRU	8.5436	5.8398	0.6331
	LSTM	3.3829	2.3231	0.7407		LSTM	8.6188	6.0078	0.6266
	SGM-BRR	3.1883	2.0388	0.7697		SGM-BRR	7.127	4.645	0.7447
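
One way to read Table 6 is to express the error of SGM-BRR relative to each benchmark. The sketch below does this for the overall ESG score using the RMSE column of Table 6. The per-baseline formula (baseline minus proposed, divided by baseline) is an assumption chosen for illustration and is not necessarily how the aggregate percentages quoted in the abstract were obtained.

```python
# RMSE values for the overall ESG score, copied from Table 6.
esg_rmse = {
    "RF": 3.5881,
    "XGBoost": 3.6953,
    "GBDT": 3.6006,
    "LightGBM": 3.4948,
    "RNN": 3.7515,
    "GRU": 3.7472,
    "LSTM": 3.7911,
    "SGM-BRR": 3.4706,
}

def relative_reduction(baseline, proposed):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - proposed) / baseline

proposed = esg_rmse["SGM-BRR"]
reductions = {
    model: relative_reduction(value, proposed)
    for model, value in esg_rmse.items()
    if model != "SGM-BRR"
}

# Print benchmarks from the smallest to the largest reduction.
for model, reduction in sorted(reductions.items(), key=lambda kv: kv[1]):
    print(f"{model:9s} {100 * reduction:5.2f}%")
```

The same calculation applies to the MAE column; for R², where higher is better, the sign of the difference would be reversed.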

Table 7. Test set results of ESG, E, S, and G score predictions for each industry for all models.

		ESG			E			S			G
Industry	Model	RMSE	MAE	R²	RMSE	MAE	R²	RMSE	MAE	R²	RMSE	MAE	R²
Materials	RF	3.7872	2.7778	0.8220	6.1682	4.3768	0.7393	3.7238	2.4635	0.6746	7.8395	5.1524	0.6904
	XGBoost	4.0626	3.0337	0.7952	6.6257	4.5565	0.6992	3.8630	2.6170	0.6499	7.8186	5.3703	0.6921
	GBDT	3.8592	2.9120	0.8152	6.3076	4.5774	0.7274	3.6337	2.3671	0.6902	8.0613	5.3759	0.6727
	LightGBM	3.9071	2.9544	0.8106	6.4640	4.6283	0.7137	3.7362	2.5857	0.6725	7.9070	5.3308	0.6851
	RNN	4.4311	3.3102	0.7563	6.0662	4.4089	0.7479	3.5313	2.4813	0.7074	10.0101	7.0190	0.4953
	GRU	4.3299	3.2903	0.7673	6.0217	4.4168	0.7516	3.5088	2.4406	0.7111	8.9261	6.3957	0.5987
	LSTM	4.5218	3.3713	0.7463	6.0687	4.4245	0.7477	3.7069	2.5807	0.6776	10.1669	7.0988	0.4794
	SGM-BRR	3.7451	2.7686	0.8259	5.9637	4.3435	0.7563	3.4955	2.3704	0.7133	7.5984	5.0706	0.7092
Real Estate	RF	3.1944	2.2696	0.7307	4.3256	2.3706	0.7676	1.9133	1.2754	0.6108	7.5041	5.4745	0.7046
	XGBoost	3.6192	2.5896	0.6544	5.1716	2.5580	0.6677	2.2077	1.4490	0.4818	8.3029	5.6024	0.6383
	GBDT	3.1503	2.2574	0.7381	4.3636	2.2398	0.7635	1.8861	1.1933	0.6218	8.1493	6.0876	0.6516
	LightGBM	3.5905	2.5338	0.6598	5.4573	2.8828	0.6300	2.2169	1.5481	0.4775	7.2868	5.2281	0.7214
	RNN	4.8480	3.8952	0.3798	3.9928	2.4272	0.8020	2.0725	1.6176	0.5433	14.0381	10.5154	−0.0339
	GRU	4.3600	3.6182	0.4984	3.9941	2.4607	0.8018	2.1931	1.6250	0.4886	13.1401	9.6377	0.0941
	LSTM	5.3384	4.1574	0.2480	4.0007	2.4568	0.8012	2.4218	1.9101	0.3765	15.8596	11.1663	−0.3196
	SGM-BRR	3.1407	2.4304	0.7397	4.2845	2.2954	0.7720	1.8762	1.1915	0.6258	7.2681	5.2091	0.7229
Industrials	RF	3.3452	2.4319	0.8369	5.5818	3.6269	0.7270	4.1661	2.4649	0.6541	7.1968	4.5750	0.7543
	XGBoost	3.5962	2.5673	0.8115	5.6956	3.6862	0.7158	4.2333	2.5578	0.6429	7.7503	5.0780	0.7150
	GBDT	3.2829	2.3736	0.8429	5.5849	3.6181	0.7267	3.9369	2.2702	0.6911	7.0888	4.6312	0.7616
	LightGBM	3.3255	2.3770	0.8388	5.5277	3.7649	0.7323	4.1107	2.4920	0.6633	6.7783	4.4727	0.7820
	RNN	3.6212	2.7046	0.8089	5.2737	3.5168	0.7563	3.9633	2.4603	0.6870	8.7054	6.2797	0.6404
	GRU	3.5610	2.6261	0.8152	5.2827	3.4801	0.7555	3.9556	2.3906	0.6882	8.1296	5.5879	0.6864
	LSTM	3.8695	2.8806	0.7818	5.4854	3.7075	0.7364	4.0111	2.4585	0.6794	8.7298	6.2156	0.6384
	SGM-BRR	3.2426	2.3452	0.8467	5.1477	3.3322	0.7678	3.8710	2.2700	0.7014	7.0803	4.6285	0.7622
Utilities	RF	4.4214	3.1076	0.7396	7.6870	4.9174	0.6584	3.9239	2.5707	0.6558	9.7872	5.9077	0.4225
	XGBoost	5.0138	3.7164	0.6651	7.6794	4.9670	0.6591	4.4100	2.7775	0.5653	8.8618	5.6680	0.5266
	GBDT	4.5398	3.2644	0.7254	7.7645	5.0363	0.6515	3.9807	2.5198	0.6458	9.2150	5.5647	0.4881
	LightGBM	4.5323	3.3088	0.7264	7.8866	5.1593	0.6405	4.1035	2.7445	0.6236	9.1865	5.8168	0.4912
	RNN	5.3620	4.2234	0.6170	6.4838	4.4756	0.7570	4.3756	2.6675	0.5720	14.0621	11.5513	−0.1921
	GRU	4.9894	3.9008	0.6684	6.3190	4.3228	0.7692	4.7467	2.9280	0.4964	12.4408	9.7042	0.0669
	LSTM	6.1670	4.7693	0.4934	6.4852	4.4050	0.7569	4.9415	3.1854	0.4542	14.8024	11.6163	−0.3209
	SGM-BRR	4.3018	3.0814	0.7535	7.2800	4.9488	0.6936	3.8377	2.4668	0.6708	8.8035	5.4197	0.5328
Consumer Discretionary	RF	3.7860	2.7938	0.7795	6.4493	3.7286	0.6892	4.1376	2.6888	0.6063	8.9295	5.8814	0.6228
	XGBoost	3.8257	2.8226	0.7749	6.7330	4.0992	0.6613	4.3990	2.9001	0.5550	9.3715	6.5634	0.5845
	GBDT	3.7287	2.6734	0.7861	6.7398	3.8531	0.6606	3.9970	2.6707	0.6326	9.0981	6.2601	0.6084
	LightGBM	3.8465	2.8179	0.7724	6.7655	4.0841	0.6580	3.7939	2.5924	0.6690	8.9758	6.4452	0.6189
	RNN	5.4380	3.9778	0.5451	6.0300	3.6582	0.7283	3.8893	2.5292	0.6522	12.5662	9.1403	0.2529
	GRU	4.6634	3.3591	0.6655	5.8212	3.4984	0.7468	3.7217	2.4052	0.6815	11.7947	8.1540	0.3419
	LSTM	5.4425	3.9140	0.5443	6.0312	3.7105	0.7282	4.0074	2.6111	0.6307	14.2228	10.4401	0.0430
	SGM-BRR	3.5964	2.6571	0.8010	6.2978	3.7174	0.7036	3.7666	2.5230	0.6738	8.6985	6.1441	0.6420
Energy	RF	3.3934	2.5434	0.8908	6.9874	4.4760	0.7603	3.7404	2.1417	0.8038	7.6428	4.9445	0.6772
	XGBoost	3.3537	2.5222	0.8933	7.4172	4.9974	0.7299	3.3307	1.9983	0.8445	8.0757	5.4204	0.6396
	GBDT	3.3718	2.5419	0.8922	6.7374	4.4753	0.7772	3.3638	2.1663	0.8413	7.7644	5.1588	0.6669
	LightGBM	3.5847	2.8424	0.8781	7.7894	5.1649	0.7021	4.0030	2.5250	0.7753	7.7497	5.3279	0.6681
	RNN	5.8967	4.6262	0.6702	6.5496	4.8268	0.7894	4.0683	2.6874	0.7679	12.7587	9.9368	0.1005
	GRU	5.2668	4.1466	0.7369	6.7962	4.9930	0.7732	3.8409	2.6327	0.7931	12.2141	9.4747	0.1756
	LSTM	6.8453	5.2976	0.5556	7.1133	5.2877	0.7516	4.3542	3.0709	0.7342	15.3944	12.1787	−0.3095
	SGM-BRR	3.2937	2.4409	0.8971	5.9769	4.3561	0.8246	3.6002	2.1004	0.8183	7.2748	4.8646	0.7076
Consumer Staples	RF	4.3940	2.5242	0.6579	9.1019	4.3961	0.4570	3.3152	2.2583	0.6378	7.9816	5.1443	0.5864
	XGBoost	4.4135	2.7189	0.6549	9.0238	4.0308	0.4663	3.3111	2.2732	0.6387	8.9137	5.9431	0.4841
	GBDT	4.5425	2.6587	0.6344	9.0864	4.3792	0.4589	3.5500	2.4319	0.5847	7.8985	5.4975	0.5949
	LightGBM	4.4245	2.5735	0.6531	9.4357	4.5892	0.4165	3.2352	2.1914	0.6550	8.5083	5.8153	0.5300
	RNN	5.1629	3.5999	0.5277	9.0091	4.3728	0.4680	3.6105	2.6375	0.5704	11.2444	8.9251	0.1790
	GRU	4.9666	3.3203	0.5629	9.0429	4.4958	0.4640	3.4998	2.4480	0.5963	9.2961	7.4391	0.4389
	LSTM	5.5365	3.9158	0.4569	9.1647	4.6909	0.4495	3.6762	2.5421	0.5546	11.2117	8.8937	0.1838
	SGM-BRR	4.3355	2.5240	0.6669	8.9748	4.4357	0.4721	3.0503	2.1827	0.6933	7.8222	5.0830	0.6027
Information Technology	RF	4.4809	3.0602	0.6961	7.4494	4.2624	0.6111	4.5331	2.9713	0.5012	7.0643	4.7082	0.7280
	XGBoost	4.6719	3.1327	0.6696	7.2991	4.1079	0.6267	4.7453	3.1057	0.4534	7.6423	5.1979	0.6817
	GBDT	4.4008	2.9167	0.7069	7.5119	4.2165	0.6046	4.5530	3.0111	0.4968	7.4620	4.9131	0.6965
	LightGBM	4.5194	2.9599	0.6909	7.2006	4.2330	0.6367	4.4065	3.0537	0.5287	7.4602	5.0746	0.6966
	RNN	5.4383	4.1607	0.5523	7.4606	4.4994	0.6100	4.5705	2.8923	0.4929	12.6251	9.3261	0.1312
	GRU	4.7544	3.4588	0.6579	7.4421	4.4482	0.6119	4.5001	2.8645	0.5084	10.1070	7.2145	0.4432
	LSTM	5.0483	3.7534	0.6143	7.3204	4.5032	0.6245	4.3667	2.8728	0.5372	12.7067	9.4440	0.1199
	SGM-BRR	4.3828	2.8865	0.7093	7.1873	4.0341	0.6380	4.5088	2.9848	0.5065	7.0377	4.6616	0.7300
Health Care	RF	3.9040	3.0197	0.7861	7.1252	4.7813	0.6707	4.3914	2.9310	0.7143	6.3547	3.9459	0.7882
	XGBoost	4.0025	3.0784	0.7751	7.4449	5.0022	0.6405	4.8779	3.1178	0.6475	5.7023	3.9050	0.8295
	GBDT	4.0045	3.0557	0.7749	7.1400	5.0098	0.6693	4.4216	3.0129	0.7104	5.9270	4.0165	0.8158
	LightGBM	4.1070	2.9943	0.7632	7.6646	5.2523	0.6189	4.0488	2.9174	0.7571	6.3053	4.3245	0.7915
	RNN	5.6340	4.4005	0.5544	7.8746	5.7202	0.5978	4.3218	3.0624	0.7233	12.5473	9.3867	0.1743
	GRU	5.1253	3.8563	0.6313	7.8477	5.6277	0.6005	3.9214	2.7804	0.7722	9.7894	7.1158	0.4974
	LSTM	6.0974	4.6782	0.4781	8.2165	5.8949	0.5621	4.4079	3.1783	0.7122	13.4421	10.2538	0.0523
	SGM-BRR	3.8263	2.9345	0.7945	7.0645	4.7696	0.6763	4.2129	2.8744	0.7371	5.7413	3.8929	0.8271
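
The abstract summarizes Table 7 by counting the industry and score groups in which SGM-BRR attains the best RMSE, MAE, and R². The sketch below shows the per-group comparison behind such a count for a single group, the S score in the Materials industry, with values copied from Table 7. The helper name `best_model` is a hypothetical choice for this illustration.

```python
# (RMSE, MAE, R2) for the S score in the Materials industry, from Table 7.
materials_s = {
    "RF": (3.7238, 2.4635, 0.6746),
    "XGBoost": (3.8630, 2.6170, 0.6499),
    "GBDT": (3.6337, 2.3671, 0.6902),
    "LightGBM": (3.7362, 2.5857, 0.6725),
    "RNN": (3.5313, 2.4813, 0.7074),
    "GRU": (3.5088, 2.4406, 0.7111),
    "LSTM": (3.7069, 2.5807, 0.6776),
    "SGM-BRR": (3.4955, 2.3704, 0.7133),
}

def best_model(rows, metric_index, higher_is_better=False):
    """Return the model name with the best value in the given metric column."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda name: rows[name][metric_index])

best = {
    "RMSE": best_model(materials_s, 0),
    "MAE": best_model(materials_s, 1),
    "R2": best_model(materials_s, 2, higher_is_better=True),
}
print(best)
```

In this group SGM-BRR has the lowest RMSE and highest R², while GBDT has a slightly lower MAE, which illustrates why the count of best results can differ between metrics.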

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

