Article

Hybrid Metaheuristic Optimized Extreme Learning Machine for Sustainability Focused CO2 Emission Prediction Using Globalization-Driven Indicators

by Mahmoud Almsallti, Ahmad Bassam Alzubi and Oluwatayomi Rereloluwa Adegboye *
Business Administration Department, Institute of Graduate Research and Studies, University of Mediterranean Karpasia, Mersin-10, TR-10 Mersin, Northern Cyprus, Lefkosa 99010, Turkey
* Author to whom correspondence should be addressed.
Sustainability 2025, 17(15), 6783; https://doi.org/10.3390/su17156783
Submission received: 10 June 2025 / Revised: 23 July 2025 / Accepted: 23 July 2025 / Published: 25 July 2025

Abstract

The escalating threat of climate change has intensified the global urgency to accurately predict carbon dioxide (CO2) emissions for sustainable development, particularly in developing economies experiencing rapid industrialization and globalization. Traditional Extreme Learning Machines (ELMs) offer rapid learning but often yield unstable performance due to random parameter initialization. This study introduces a novel hybrid model, the Red-Billed Blue Magpie Optimizer-tuned ELM (RBMO-ELM), which harnesses the intelligent foraging behavior of red-billed blue magpies to optimize input-to-hidden layer weights and biases. The RBMO algorithm is first benchmarked on 15 functions from the CEC2015 test suite to validate its optimization effectiveness. Subsequently, RBMO-ELM is applied to predict Indonesia’s CO2 emissions using a multidimensional dataset that combines economic, technological, environmental, and globalization-driven indicators. Empirical results show that the RBMO-ELM significantly surpasses several state-of-the-art hybrid models in accuracy (higher R2) and convergence efficiency (lower error). A permutation-based feature importance analysis identifies social globalization, GDP, and ecological footprint as the strongest predictors, underscoring the socio-economic influences on emission patterns. These findings offer both theoretical and practical implications that inform data-driven Artificial Intelligence (AI) and Machine Learning (ML) applications in environmental policy and support sustainable governance models.

1. Introduction

The accumulation of greenhouse gases in the atmosphere resulting from massive emissions has inflicted significant and potentially irreversible harm on the Earth’s natural ecosystem [1]. Between 2011 and 2020, the average global surface temperature rose by approximately 1.1 °C compared to pre-industrial levels (1850–1900). This ongoing environmental degradation has made climate change a major concern shared by nations. Despite increasing global awareness, greenhouse gas emissions in general, and CO2 emissions in particular, have continued to rise [2]. These emissions are driven by unsustainable practices in energy production, land use, and consumption [3]. According to the Intergovernmental Panel on Climate Change (IPCC), global warming has led to more frequent and intense heatwaves, droughts, floods, and sea-level rise [4].
These changes threaten ecosystems, food and water security, human health, and economic stability, particularly in regions with limited capacity to adapt. Without significant reductions in greenhouse gas emissions, many of these effects risk becoming irreversible, further widening social inequalities and obstructing sustainable development [5]. In response to the escalating threat, the global community has undertaken over the years a series of international initiatives and agreements. One of the earliest milestones was the 1992 United Nations Conference on Environment and Development, which laid the foundation for future climate action [6]. In 1994, the United Nations Framework Convention on Climate Change (UNFCCC) came into effect as a non-binding treaty aimed at encouraging voluntary efforts to address greenhouse gas emissions. The first legally binding international accord was the Kyoto Protocol, adopted in 1997 and entered into force in 2005 [7]. The 2015 Paris Agreement marked another significant global commitment, aiming to limit global temperature rise to well below 2 °C above pre-industrial levels, and preferably to 1.5 °C [8]. Recently, at the United Nations’ COP26 conference on climate change held in 2021, strategies for mitigating greenhouse gas emissions were explored across different thematic areas. Among these areas, particular attention was given to the application of machine learning in forecasting greenhouse gas emissions, emphasizing its potential to support informed decision-making in efforts to reduce and manage emissions effectively [9]. This highlights a critical implication: the success of climate action initiatives does not rely only on political actions but also on effective scientific support. Mitigating emissions requires accurate models to predict greenhouse gas levels and identify key contributing factors at the individual, community, and national levels.
The literature on CO2 emissions forecasting generally categorizes existing methodologies into three main groups: conventional statistical models, standalone machine learning models, and hybrid approaches. Conventional statistical models comprise techniques such as System Dynamics (SD), Vector Auto-Regression (VAR), Auto-Regressive Integrated Moving Average (ARIMA), and Gray Models (GM), which have long been applied. Feng et al., using an SD model, forecasted Beijing’s CO2 emissions, projecting an increase to 169.67 million tonnes of CO2 equivalent by 2030 [10]. Similarly, Bao et al. used an SD model to simulate carbon emissions in China’s thermal power sector, predicting a peak of 4.228 billion tonnes by 2026 under current development trends [11]. Hernandez et al. applied VAR and ARIMA models to forecast energy consumption and CO2 emissions in a smart industry context, concluding that the ARIMA model demonstrated superior predictive accuracy [12]. Hamzacebi and Karakurt applied a GM to estimate Turkey’s energy-related CO2 emissions from 1965 to 2025 [13]. While these studies show the use of statistical methods in CO2 emission forecasting, they also reveal notable limitations, as highlighted by Hu et al. [14]. A major shortcoming is their inability to capture complex, nonlinear relationships among influencing variables. Another key drawback lies in their reliance on predefined functional forms and assumptions about data distribution, which can oversimplify real-world dynamics and lead to reduced predictive accuracy.
The second category, standalone machine learning models, including Artificial Neural Networks (ANN) [15], Support Vector Regression (SVR) [16], and Long Short-Term Memory Networks (LSTM) [17], responds to the increasing availability of high-dimensional, real-time environmental data for emission forecasting [18]. Machine learning algorithms are particularly effective at detecting complex patterns and correlations within large datasets. As a result, they hold substantial promise for enhancing the accuracy of CO2 emission predictions. However, they often require intensive tuning and can suffer from overfitting, leading to the rise in hybrid approaches.
Hybrid approaches integrate machine learning algorithms with metaheuristic optimization techniques or combine multiple models to improve predictive accuracy. In this category, Saqr et al. proposed a hybrid model combining Greylag Goose Optimization (GGO) with a Multi-Layer Perceptron (MLP) to improve the accuracy of electric vehicle (EV) emissions forecasting [19]. Khajavi and Rastgoo developed a hybrid model using Random Forest, Support Vector Regression, and optimization algorithms to predict CO2 emissions in 30 Chinese cities [20]. The Random Forest model optimized with the Slime Mold Algorithm achieved the best test accuracy, making it the most effective predictor. Alhussan et al. introduced a CO2 forecasting framework using Bidirectional Gated Recurrent Unit (BIGRU) networks optimized with GGBERO, a novel algorithm that fuses Greylag Goose Optimization and Al-Biruni Earth Radius [21]. The model achieved a very low MSE. Phatai and Luangrungruang developed a predictive model combining a Backpropagation Neural Network (BPNN) with Particle Swarm Optimization (PSO) to estimate CO2 emissions based on energy consumption data [22]. The results showed that the PSO-optimized machine learning model achieved high prediction accuracy.
The ELM, a single-hidden-layer feedforward neural network, has gained recognition for addressing certain limitations in traditional machine learning models, particularly in terms of computational speed and efficiency [23]. ELM is distinguished by its rapid training process and strong generalization ability, achieved by randomly assigning weights in the hidden layer and analytically determining the output weights. This streamlined architecture allows ELM to handle large datasets with low computational cost. However, like many machine learning models, ELM’s performance is highly sensitive to the random initialization of input weights and biases, which can result in variability and reduced prediction accuracy [24]. Furthermore, its generalization performance tends to degrade when dealing with high-dimensional or noisy datasets. These challenges have motivated ongoing research into optimization strategies aimed at improving ELM’s reliability and predictive power [25]. Pradhan et al. proposed a software defect prediction model combining ELM with Improved JAYA (IMJAYA) to optimize weights and biases [26]. Van Thieu et al. proposed hybrid models combining ELM with Pareto-like Sequential Sampling (PSS), Weighted Mean of Vectors (INFO), and Runge–Kutta Optimizer (RUN) to improve streamflow prediction [27]. Among the models tested, PSS-ELM achieved the best performance with high accuracy and robust convergence. Abba et al. applied ELM models optimized with Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Biogeography-Based Optimization (BBO), and a BBO-PSO hybrid to predict treated water quality parameters in Nigeria [28]. The BBO-PSO-ELM model achieved superior accuracy for pH, total dissolved solids, and hardness, while BBO-ELM performed best for turbidity. Saleh et al. addressed the prediction of Suspended Sediment Load (SSL) using ELM coupled with four metaheuristic algorithms: PSO, Henry Gas Solubility Optimization (HGSO), Electromagnetic Field Optimization (EFO), and Nuclear Reaction Optimization (NRO) [29]. Experiments demonstrated that the ELM-HGSO model outperformed other models. Sun et al. addressed the challenge of identifying key drivers of CO2 emissions in Hebei, China. They proposed a PSO-optimized ELM (PSO-ELM) model that incorporates factor analysis to reduce the number of influencing variables. The outcome showed that PSO-ELM outperformed standard ELM and BPNN, enhancing prediction accuracy and offering policy insights for low-carbon strategies [30]. Wei et al. tackled the issue of accurate CO2 emission prediction in Hebei, China. They proposed a hybrid MFO-RF-ELM model, where Random Forest was used for feature analysis and ELM, optimized by Moth-Flame Optimization (MFO), handled prediction. The outcome showed that the proposed model surpassed other parallel models, achieving higher accuracy in forecasting CO2 emissions [31]. Algwil and Khalifa addressed the need for accurate CO2 prediction. They proposed a GMSMFO-ELM hybrid model, where Gaussian Mutation and Shrink Mechanism-based Moth Flame Optimization enhanced ELM training. The model achieved 96.5% R2 and outperformed other hybrids across multiple error metrics, with key predictors identified as economic growth, FDI, and renewable energy [24]. Wang et al. focused on improving carbon emission forecasting in public buildings. They proposed the MIV-IHHO-DELM hybrid model, integrating Mean Impact Value (MIV) for feature selection, an Improved Harris Hawk Optimization (IHHO) for optimization, and a Deep ELM (DELM) for prediction. The model achieved a MAPE of 0.704% and RMSE of 18.01, significantly outperforming existing models and showing strong generalization, making it a powerful tool for emission reduction planning in the construction sector [32].
While these reviewed hybrid approaches have improved the ELM’s performance, challenges remain in achieving consistent accuracy, robustness in noisy datasets, and adaptability to the high-dimensional, nonlinear nature of CO2 emissions data. These limitations are especially pressing in the context of recent international climate action. Notably, the COP26 United Nations Climate Change Conference emphasized the urgent need for improved strategies in monitoring, reporting, and forecasting greenhouse gas emissions, particularly in developing economies where data gaps and limited technical infrastructure remain key challenges. To address these limitations, this study proposes a novel hybrid model, the Red-Billed Blue Magpie Optimizer-tuned Extreme Learning Machine (RBMO-ELM). The RBMO algorithm, inspired by the adaptive foraging behavior of red-billed blue magpies [33], efficiently optimizes ELM’s input to hidden layer weights and hidden layer biases. This not only reduces the impact of random initialization but also enhances stability and predictive power across varied environmental conditions. The RBMO-ELM model is thus designed to offer a more robust, adaptive, and computationally efficient solution for CO2 emission forecasting.
The remainder of the paper is structured as follows. Section 2 outlines the methodology, including the Red-Billed Blue Magpie Optimizer (RBMO), the Extreme Learning Machine (ELM), and the proposed RBMO-ELM hybrid prediction model. This is followed by the Experiment and Discussion section, which presents an evaluation of the RBMO-ELM model’s performance in Section 3. The paper concludes with a summary of the key findings and suggestions for future research in Section 4.

2. Methodology

2.1. The Red-Billed Blue Magpie Optimizer (RBMO)

The RBMO, developed by Fu et al., is inspired by the Red-billed Blue Magpie (RBM), an avian species indigenous to parts of Asia. The optimization process of RBMO commences with the random generation of a population of candidate solutions within the specified bounds of the search space [33]. The initial population matrix $X$ is expressed in Equation (1).
$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,j} & \cdots & x_{1,dim} \\ x_{2,1} & \cdots & x_{2,j} & \cdots & x_{2,dim} \\ \vdots & & \vdots & & \vdots \\ x_{n,1} & \cdots & x_{n,j} & \cdots & x_{n,dim} \end{bmatrix}$$
where $X \in \mathbb{R}^{n \times dim}$ represents the positions of all $n$ search agents across $dim$ dimensions, and $x_{i,j}$ indicates the value of the $j$th variable for the $i$th agent. Each element $x_{i,j}$ is initialized as expressed in Equation (2).
$$x_{i,j} = (ub - lb) \cdot Rand_1 + lb$$
Here, $ub$ and $lb$ are the upper and lower boundaries of the decision space, and $Rand_1 \sim U(0,1)$ is a uniformly distributed random number. RBMs forage either in small collaborative groups or larger clusters, adapting their strategy according to group size. This dynamic is modeled in RBMO using two separate update equations, as given in Equations (3) and (4). Small-group behavior (at each iteration, a group of $p$ randomly chosen individuals is formed, with $p$ selected within $[2, 5]$) is given in Equation (3).
$$X_i^{t+1} = X_i^{t} + \left( \frac{1}{p} \sum_{m=1}^{p} X_m^{t} - X_{rs}^{t} \right) \cdot Rand_2$$
Cluster-based exploration, with group size $q$ randomly chosen within $[10, n]$, is modeled as expressed in Equation (4). The values of the two group sizes $p$ and $q$ are stochastically determined at each iteration to maintain diversity and dynamic role distribution among agents. This randomized group partitioning enhances the global and local search trade-off throughout the optimization process.
$$X_i^{t+1} = X_i^{t} + \left( \frac{1}{q} \sum_{m=1}^{q} X_m^{t} - X_{rs}^{t} \right) \cdot Rand_3$$
where $X_i^{t}$ is the current position of the $i$th agent at iteration $t$, $X_m^{t}$ denotes a randomly selected individual from the population, $X_{rs}^{t}$ is a randomly sampled solution from the current population, and $Rand_2$ and $Rand_3$ denote random numbers within $[0, 1]$. These equations emulate the adaptive and cooperative food-searching behavior of the RBMs. Once a food source is identified, the magpies coordinate their attack. The RBMO mirrors this through targeted movements toward promising solutions (the food location), moderated by cooperation and randomness. In small groups, the position is updated with Equation (5).
$$X_i^{t+1} = X_{food}^{t} + CF \cdot \left( \frac{1}{p} \sum_{m=1}^{p} X_m^{t} - X_i^{t} \right) \cdot Randn_1$$
In cluster attacks, the update rule is expressed in Equation (6).
$$X_i^{t+1} = X_{food}^{t} + CF \cdot \left( \frac{1}{q} \sum_{m=1}^{q} X_m^{t} - X_i^{t} \right) \cdot Randn_2$$
where $X_{food}^{t}$ is the position of the best solution, and $Randn_1$ and $Randn_2$ denote random numbers drawn from a standard normal distribution. $CF$ (Control Factor) is defined in Equation (7).
$$CF = \left( 1 - \frac{t}{T} \right)^{2 \cdot \frac{t}{T}}$$
with $T$ being the maximum number of iterations and $t$ the current iteration. This term gradually shifts the search emphasis from exploration to exploitation as the algorithm progresses. To simulate the food-storing behavior of RBMs, the algorithm retains promising solutions for future exploitation. This mechanism ensures that only improvements in fitness are accepted, as expressed in Equation (8).
$$X_i^{t+1} = \begin{cases} X_i^{t}, & \text{if } fitness_{old}(i) < fitness_{new}(i) \\ X_i^{t+1}, & \text{otherwise} \end{cases}$$
This elitism-based mechanism reinforces convergence stability and guards against performance degradation.
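To make the search cycle concrete, the listing below gives a minimal NumPy sketch of the RBMO steps described by Equations (1)–(8). It is an illustrative simplification rather than the authors’ implementation: the choice between food searching and prey attacking is made with a simple coin flip, only one variant of each move (Equations (3) and (6)) is shown, and all names (`rbmo_minimize`, `fitness`, `n_agents`) are hypothetical.

```python
import numpy as np

def rbmo_minimize(fitness, dim, lb, ub, n_agents=30, max_iter=4000, seed=None):
    """Illustrative sketch of the RBMO search cycle (Equations (1)-(8))."""
    rng = np.random.default_rng(seed)
    X = lb + (ub - lb) * rng.random((n_agents, dim))          # Eq. (2): random initialization
    fit = np.array([fitness(x) for x in X])
    best = fit.argmin()
    x_food, f_food = X[best].copy(), fit[best]                # best solution acts as the food location

    for t in range(max_iter):
        cf = (1.0 - t / max_iter) ** (2.0 * t / max_iter)     # Eq. (7): control factor
        for i in range(n_agents):
            if rng.random() < 0.5:
                # food searching in a small group of size p in [2, 5] (Eq. (3))
                p = rng.integers(2, 6)
                group = rng.choice(n_agents, size=p, replace=False)
                x_rs = X[rng.integers(n_agents)]
                x_new = X[i] + (X[group].sum(axis=0) / p - x_rs) * rng.random(dim)
            else:
                # prey attacking with a cluster of size q in [10, n] (Eq. (6))
                q = rng.integers(10, n_agents + 1)
                group = rng.choice(n_agents, size=q, replace=False)
                x_new = x_food + cf * (X[group].mean(axis=0) - X[i]) * rng.standard_normal(dim)
            x_new = np.clip(x_new, lb, ub)
            f_new = fitness(x_new)
            if f_new < fit[i]:                                # Eq. (8): accept improvements only
                X[i], fit[i] = x_new, f_new
                if f_new < f_food:
                    x_food, f_food = x_new.copy(), f_new
    return x_food, f_food
```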

2.2. Extreme Learning Machine (ELM)

The ELM, introduced by Huang et al. [34], is a type of single-hidden-layer feedforward neural network (SLFN). ELM is recognized for its extremely fast training speed, minimal parameter tuning, and strong generalization capability. Unlike deep neural networks, which require large-scale datasets, extensive hyperparameter tuning, and high computational cost, ELM offers an efficient alternative for achieving high predictive accuracy. ELM addresses the limitations associated with traditional learning algorithms, such as slow convergence and entrapment in local minima, particularly by circumventing the need for iterative gradient-based optimization. This makes it well-suited for modeling highly nonlinear systems with complex activation functions [35]. The ELM architecture operates under two primary principles. Firstly, the input weights $w_i$ and biases $b_i$ of the hidden layer neurons are assigned randomly and remain fixed throughout training. The only adjustable parameter is the number of hidden nodes $N$. Secondly, instead of using iterative updates, the output weights $\beta$ are computed analytically by transforming the learning task into a least squares problem, where the hidden layer output matrix is pseudoinverted using the Moore–Penrose method. Given an input $x$, the output function of ELM can be expressed as in Equation (9).
$$f(x) = \sum_{i=1}^{N} \beta_i \, G(w_i, b_i, x) = \beta \cdot h(x)$$
where $\beta_i$ denotes the output weight connecting the $i$th hidden node to the output, $w_i$ and $b_i$ are the input weight and bias of the $i$th hidden neuron, $G(w_i, b_i, x)$ is the activation function applied to the input, and $h(x) = \left[ G(w_1, b_1, x), \ldots, G(w_N, b_N, x) \right]$ represents the hidden layer output vector with respect to input $x$. The training objective is to minimize the discrepancy between the predicted and target outputs over the training set, which leads to the following optimization problem in Equation (10).
$$\min_{\beta} \sum_{i=1}^{N} \left\| \beta \cdot h(x_i) - y_i \right\|^2$$
The optimal output weight vector β is obtained through the Moore–Penrose generalized inverse of the hidden layer output matrix H , as follows in Equation (11).
$$\beta = H^{+} Y$$
$H^{+}$ is the Moore–Penrose generalized inverse of $H$, and $Y$ is the target output matrix. Despite its computational efficiency, the random initialization of $w_i$ and $b_i$ in ELM sometimes leads to suboptimal solutions, impairing predictive accuracy and stability, especially in complex, chaotic systems such as CO2 prediction. To mitigate this shortcoming, optimization techniques like the RBMO are often employed. The RBMO enhances the ELM by adaptively tuning the input weights and hidden layer biases, thereby leveraging the exploratory strengths of swarm intelligence to augment the learning capacity and predictive robustness of the ELM framework. The illustration of the ELM model is given in Figure 1.
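As a concrete illustration of Equations (9)–(11), the short sketch below trains and applies an ELM with NumPy. A sigmoid activation is assumed for $G(w_i, b_i, x)$, since the specific activation is not restated in this section; the function names are illustrative, not from the original implementation.

```python
import numpy as np

def elm_fit(X, y, w, b):
    """Analytic ELM training (Eqs. (9)-(11)): solve for beta via the pseudoinverse."""
    H = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # hidden layer output matrix, sigmoid G assumed
    return np.linalg.pinv(H) @ y             # beta = H^+ Y (Moore-Penrose generalized inverse)

def elm_predict(X, w, b, beta):
    """ELM output f(x) = h(x) . beta for a batch of inputs."""
    H = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return H @ beta
```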

2.3. Proposed Prediction Model

The flowchart in Figure 2 illustrates the operational framework of the Red-billed Blue Magpie Optimizer-Extreme Learning Machine (RBMO-ELM) model, a hybrid approach designed to enhance predictive accuracy and robustness by combining the global search capabilities of the RBMO with the rapid learning efficiency of the ELM. The process begins with the initialization of a population of candidate solutions, each representing a set of input weights and biases for the ELM model. These candidates are evaluated using the ELM, which computes the fitness of each solution based on a predefined objective function, which in this research is the Root Mean Squared Error (RMSE). The main optimization loop proceeds iteratively: in each iteration, a stochastic choice determines whether the small-group (Equations (3) and (5)) or cluster-based (Equations (4) and (6)) update strategies are applied, simulating the cooperative foraging and hunting behavior of red-billed blue magpies. After updating the positions, the new solutions are again evaluated using ELM. If a new solution yields better fitness than its predecessor, it is retained; otherwise, the previous position is preserved through an elitist strategy. The global best solution is continuously tracked and updated until the maximum number of iterations is reached, after which the optimal set of ELM parameters is returned. Despite its efficiency and generalization strength, the ELM suffers from critical limitations. These include its sensitivity to the random initialization of input weights and biases, which often leads to suboptimal solutions. The RBMO component effectively mitigates these issues by optimizing the ELM’s parameters through adaptive, population-based search, enhancing both the stability and accuracy of the overall model. Thus, the RBMO-ELM hybrid provides a powerful tool for tackling nonlinear, high-dimensional prediction tasks that require both computational speed and adaptive learning capability.
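The sketch below shows how a single RBMO candidate can be decoded into ELM parameters and scored with the RMSE objective described above, reusing the `rbmo_minimize`, `elm_fit`, and `elm_predict` sketches from the previous subsections. The 10 hidden neurons and the [−10, 10] search range follow the configuration reported later in Section 3.2.3; all other names are illustrative assumptions.

```python
import numpy as np

N_HIDDEN = 10      # hidden neurons, as fixed in Section 3.2.3
BOUND = 10.0       # each weight and bias searched within [-10, 10]

def decode(candidate, n_features):
    """Split a flat RBMO candidate into ELM input weights and hidden biases."""
    w = candidate[: n_features * N_HIDDEN].reshape(n_features, N_HIDDEN)
    b = candidate[n_features * N_HIDDEN:]
    return w, b

def rmse_objective(candidate, X_train, y_train):
    """Fitness evaluated by RBMO: training RMSE of the ELM built from the candidate."""
    w, b = decode(candidate, X_train.shape[1])
    beta = elm_fit(X_train, y_train, w, b)
    y_hat = elm_predict(X_train, w, b, beta)
    return float(np.sqrt(np.mean((y_train - y_hat) ** 2)))

# Tuning call (search dimension = (n_features + 1) * N_HIDDEN):
# best, _ = rbmo_minimize(lambda c: rmse_objective(c, X_tr, y_tr),
#                         dim=(X_tr.shape[1] + 1) * N_HIDDEN, lb=-BOUND, ub=BOUND)
```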

3. Experiment and Discussion

In this section, two experimental frameworks are devised to comprehensively assess the efficacy of the proposed Red-billed Blue Magpie Optimizer (RBMO) and its hybridization with the Extreme Learning Machine (ELM). The first experimental phase focuses on the standalone performance of the RBMO, which is benchmarked against 15 standardized test functions from the CEC 2015 competition suite [36]. These functions encompass a broad spectrum of optimization scenarios, including unimodal, multimodal, hybrid, and composite landscapes designed to rigorously evaluate the algorithm’s search ability, convergence behavior, and adaptability across diverse and challenging environments. Each test function provides a medium for examining the RBMO’s capacity to balance exploration and exploitation effectively. In the second phase, the RBMO is integrated with the ELM to construct a hybrid predictive framework aimed at modeling carbon dioxide (CO2) emissions. This task poses significant challenges for traditional predictive models due to its nonlinear, dynamic, and chaotic nature. The performance of the RBMO-ELM model is systematically validated using a suite of evaluation metrics. These metrics offer comprehensive insights into the model’s predictive accuracy, generalization capacity, and robustness.

3.1. Evaluation of the Proposed RBMO on CEC2015 Functions

To comprehensively evaluate the optimization proficiency of the proposed RBMO algorithm, a suite of 15 benchmark functions from the CEC 2015 competition test suite is employed. These benchmark problems are deliberately developed to emulate a broad spectrum of real-world optimization challenges and span various categories, including unimodal, multimodal, hybrid, and composite functions. Each category presents distinct topological complexities, such as separability and modality, that rigorously test the algorithm’s global search behavior, convergence stability, and exploitation capability. In order to ensure a fair and unbiased comparative evaluation, RBMO is benchmarked against several state-of-the-art metaheuristic algorithms, namely the original Aquila Optimizer (AO) [37], Exponential Distribution Optimizer (EDO) [38], Harris Hawks Optimization (HHO) [39], the JAYA algorithm [40], Polar Lights Optimization (PLO) [41], Sine Cosine Algorithm (SCA) [42], and the Red-billed Blue Magpie Optimizer (RBMO). This focuses on recently developed metaheuristic optimizers that demonstrate strong performance in complex search scenarios and have seen successful integration into hybrid machine learning models. Collectively, these algorithms form a modern, competitive benchmark that enables robust assessment of standalone RBMO and RBMO-ELM’s search dynamics and convergence behavior relative to other leading metaheuristic approaches. Each algorithm is subjected to a uniform experimental procedure: the maximum number of iterations is fixed at 4000, the population size is maintained at 30, and all experiments are executed in a 30-dimensional search space. Moreover, to account for the stochastic nature of these algorithms, each test is independently repeated 30 times. The specific parameter configurations for all compared algorithms are systematically outlined in Table 1. The primary objective of this experimental framework is to quantitatively and qualitatively assess the effectiveness of RBMO in solving high-dimensional and complex optimization problems. The parameter values presented in Table 1 are adopted from the original study, where these configurations were empirically tuned and validated across multiple benchmark functions. These settings have been retained to ensure consistency and comparability with the foundational algorithm.
The comparative analysis of the RBMO against several state-of-the-art metaheuristic algorithms reveals the consistent superiority of the RBMO across the CEC 2015 benchmark suite as seen in Table 2. In the single-modal category (F1–F2), RBMO achieves significantly better performance than all other compared algorithms, demonstrating strong global convergence capabilities and high precision in optimizing functions with a single optimum. It outperforms traditional approaches such as AO and HHO, which exhibit a higher average error and variability. For multimodal functions (F3–F5), the RBMO maintains a clear advantage by avoiding premature convergence and locating better optima in F3 and F5. Compared to competitors, especially the EDO, SCA, and JAYA, the RBMO demonstrates a more consistent ability to escape local minima and achieve lower error values, which is critical in multimodal landscapes with numerous deceptive basins of attraction. The PLO demonstrated enhanced convergence in F4 and F5.
In the case of hybrid functions (F6–F8), which integrate multiple search space properties and present a higher level of complexity, the RBMO consistently delivers superior results. It records lower average objective values and reduced standard deviations compared to all other algorithms, indicating both efficiency and stability in F6 and F8. While some algorithms, such as the PLO and SCA, show competitive performance on selected functions, the RBMO leads across the board in this category. The composite functions (F9–F15) further affirm RBMO’s robustness—more specifically F10, F12, F13, F14, and F15. These functions, designed to simulate real-world optimization challenges with diverse structural traits, are effectively navigated by RBMO, which secures the top performance on nearly all of them. In contrast, algorithms like HHO and EDO struggle with convergence or exhibit higher variability, while RBMO reliably produces better and more consistent results. Its advantage is particularly pronounced in functions with complex, rugged landscapes, where other methods demonstrate increased instability. Statistical validation using the Friedman test confirms the dominance of the RBMO, which obtains the best rank across all compared algorithms. The test also indicates statistically significant differences in performance compared to the other algorithms according to the Wilcoxon Sign Rank Sum p-value, reinforcing the reliability of the observed outcomes.
The convergence plots for functions F1 through F15 in Figure 3 offer valuable insights into the dynamic behavior and optimization efficiency of the RBMO in comparison to other metaheuristic algorithms. Across the plots in Figure 3, the fitness mean values of 30 independent runs over 4000 iterations provide an illustration of convergence speed, solution quality, and algorithmic stability.
For F1 and F2 (unimodal functions), the RBMO exhibits a faster convergence rate and significantly lower final fitness values than all competing algorithms. This demonstrates its high precision in locating the global optimum in smooth search landscapes and reflects strong exploitation ability. In particular, while other algorithms exhibit slower or stagnated convergence trends, the RBMO continues to improve steadily and consistently. In F3, F4, and F5, which are multimodal functions characterized by numerous local optima, the RBMO maintains a notable advantage in convergence behavior. It not only reaches better final solutions but also demonstrates smoother and more stable convergence trajectories, F3 and F5, though slightly less than the PLO in F4. Competing algorithms such as the EDO and SCA show fluctuating or plateaued performance, suggesting premature convergence or ineffective exploration.
The convergence behavior on F6, F7, and F8, which belong to the hybrid function category, further reinforces the RBMO’s effectiveness in F6 and F8. These functions are complex due to their combined modality and landscape heterogeneity. Despite these challenges, the RBMO attains the lowest fitness values with minimal oscillations, indicating a superior balance between global exploration and local exploitation. Notably, the AO and JAYA display inconsistent convergence patterns and fail to reach comparable solution quality. For F9–F15, representing composite functions with intricate structural properties, RBMO once again demonstrates the leading performance in most functions. It converges more rapidly and achieves superior fitness outcomes, whereas most competitors converge prematurely. The gradual but decisive descent in the RBMO’s fitness curve highlights its robustness in handling non-separable, rugged, and multimodal landscapes.
The box plots presented in Figure 4 for benchmark functions F1 through F15 provide a robust statistical visualization of the distribution, stability, and robustness of the optimization results obtained during the benchmarking experiment of the RBMO against compared optimizers. These box plots graphically illustrate the median, interquartile range (IQR), and outlier behavior, offering deeper insights beyond average convergence behavior. For F1 and F2, which are unimodal functions, the RBMO demonstrates extremely compact box plots with narrow interquartile ranges and no outliers. This reflects a high degree of consistency and precision across multiple independent runs. In contrast, algorithms such as the HHO and JAYA show high variance and frequent outliers, indicating less reliable convergence and sensitivity to initialization. On multimodal functions (F3–F5), the advantage of the RBMO becomes even more evident. For instance, in F3 and F5, the RBMO exhibits lower medians and tighter distributions compared to all other optimizers. Competing algorithms, particularly the EDO and HHO, not only have wider box plots but also demonstrate multiple outliers, revealing frequent suboptimal convergence. While the PLO shows a relatively compact distribution in some cases, the RBMO still consistently records lower fitness values and greater robustness.
In the case of hybrid functions (F6–F8), which integrate diverse modalities and impose more complex search space dynamics, the RBMO continues to exhibit excellent performance. Its box plots are notably compact with consistently lower median values, suggesting strong adaptability and efficient search behavior in irregular and composite landscapes. Conversely, algorithms like the EDO and HHO reveal substantial variance and outliers, implying the less effective handling of complex landscapes. For composite functions such as F9 and F15, the RBMO maintains superior distributional characteristics. The box plots for these functions show that the RBMO not only achieves lower central tendency but also exhibits reduced dispersion compared to the other methods. Particularly in F10, while the HHO and SCA show high variance and numerous extreme values, the RBMO remains stable with a tight clustering of results, affirming its robustness and convergence reliability under challenging conditions. Finally, the box plot analysis corroborates the RBMO’s exceptional stability, reliability, and resistance to outliers across a wide range of benchmark functions. Its ability to produce consistently high-quality solutions with minimal performance variation highlights its potential as a dependable optimizer for both theoretical and practical applications.

3.2. Evaluation of RBMO-ELM Model

3.2.1. Data

The dataset utilized in this study comprises quarterly observations from the first quarter of 1990 (1990 Q1) to the fourth quarter of 2022 (2022 Q4), representing a comprehensive time span of 33 years for Indonesia. The data encompasses a range of environmental, technological, and socio-economic indicators relevant to the analysis of carbon emissions. Specifically, the variables include the following: carbon emissions (CO2) as the dependent variable, and six independent features—ecological footprint (EF), economic growth (GDP), green innovation (GINO), medium and high-tech industries (MHT), renewable energy (REN), and social globalization (SG). These variables were sourced from reputable international databases, including the World Bank [38], OECD Green Growth Indicators [39], Global Footprint Network [40], the British Petroleum Database (BP), and the KOF Swiss Economic Institute Database, to ensure data reliability. Prior to analysis, the dataset underwent a series of preprocessing steps. All features were normalized using the Min-Max scaling technique to bring them into the [0, 1] range, using the minimum and maximum values of each feature, thereby ensuring comparability and improving model convergence. The dataset was subsequently divided into training and testing subsets using a 70:30 split with randomized selection to avoid order-based bias. Additionally, the correlation heatmap and distribution plots of variables are illustrated in Figure 5 and Figure 6. These preprocessing procedures ensured the dataset was clean, well-structured, and analytically robust for the subsequent application of machine learning and optimization algorithms. For data completeness, only observations with full availability across all selected indicators were used, resulting in a clean and consistent dataset. Stationarity testing, typically required for statistical models like ARIMA, was not applied here, as ELM and related machine learning models can handle non-stationary inputs due to their flexible nonlinear approximation capabilities.
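The preprocessing pipeline described above can be reproduced with a few lines of scikit-learn, as sketched below. The file name and column labels are placeholders mirroring the indicators listed here; they are not artifacts released with this study.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

FEATURES = ["EF", "GDP", "GINO", "MHT", "REN", "SG"]   # independent indicators
TARGET = "CO2"                                          # dependent variable

df = pd.read_csv("indonesia_quarterly_1990Q1_2022Q4.csv")      # placeholder path
df = df.dropna(subset=FEATURES + [TARGET])                      # keep complete observations only

scaler = MinMaxScaler()                                         # Min-Max scaling into [0, 1]
scaled = scaler.fit_transform(df[FEATURES + [TARGET]].values)
X, y = scaled[:, :-1], scaled[:, -1]

# 70:30 randomized split to avoid order-based bias
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, shuffle=True, random_state=42)
```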

3.2.2. Evaluation Metrics

Evaluation metrics are fundamental to machine learning, providing a consistent and quantitative basis for evaluating the predictive performance of models. These metrics not only measure the reliability of model outputs in approximating actual observations but also serve as critical tools in guiding model refinement, comparative assessment, and the selection of optimal predictive frameworks. In this study, a comprehensive suite of standard performance indicators is utilized to evaluate the accuracy and reliability of the proposed model. The selected metrics include the coefficient of determination ($R^2$), root mean squared error (RMSE), mean squared error (MSE), maximum error (ME), and relative absolute error (RAE) [43]. The respective mathematical definitions of these metrics are provided below in Equations (12)–(16).
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$
$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$
$$RAE = \frac{\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|}{\sum_{i=1}^{n} \left| y_i - \bar{y} \right|}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$
$$ME = \max_{i} \left| y_i - \hat{y}_i \right|$$
In the equations above, $n$ denotes the number of data instances, $y_i$ represents the actual observed values, $\hat{y}_i$ denotes the predicted values, and $\bar{y}$ is the mean of the observed data. Collectively, these metrics enable a rigorous evaluation of the model’s predictive accuracy, robustness, and generalization capability, thereby supporting a reliable comparison among the compared models.
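For reference, the five metrics of Equations (12)–(16) can be computed directly with NumPy, as in the hypothetical helper below.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute RMSE, MSE, RAE, R^2, and ME (Equations (12)-(16))."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "RMSE": float(np.sqrt(mse)),
        "MSE": float(mse),
        "RAE": float(np.sum(np.abs(err)) / np.sum(np.abs(y_true - y_true.mean()))),
        "R2": float(1.0 - np.sum(err ** 2) / ss_tot),
        "ME": float(np.max(np.abs(err))),
    }
```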

3.2.3. Evaluation of Models and Discussion

In this study, the number of hidden neurons $N$ in the ELM was fixed at 10 for all optimizer-tuned models and the baseline ELM. This design choice ensures that performance differences across models stem solely from the quality of the optimized weights and biases, rather than variation in network architecture. The range of optimization for each weight and bias was set to [−10, 10], consistent with common practice in ELM-based hybrid models. The experimental evaluation, conducted using 5-fold cross-validation, rigorously compares the performance of the proposed RBMO-ELM against several state-of-the-art hybrid models, including the AO-ELM, EDO-ELM, HHO-ELM, JAYA-ELM, PLO-ELM, SCA-ELM, and the baseline ELM model. This study utilized recent optimizers due to their enhanced exploration–exploitation balance, dynamic control mechanisms, and adaptive strategies, making them more suitable for benchmarking the proposed RBMO-ELM. This focus ensures a fair and modern comparative framework aligned with current algorithmic advancements in optimization research. The results span both training and testing phases, evaluated using five regression metrics as expressed in Equations (12)–(16).
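The evaluation protocol described above can be outlined as the following 5-fold cross-validation loop, which ties together the sketches from Section 2 and Section 3.2. The population size and iteration budget of the tuning runs are assumptions for illustration (the convergence curves in Figure 9 span 50 iterations); they are not taken from the original configuration.

```python
from sklearn.model_selection import KFold

kf = KFold(n_splits=5, shuffle=True, random_state=0)
fold_results = []
for fold, (tr_idx, te_idx) in enumerate(kf.split(X), start=1):
    X_tr, y_tr = X[tr_idx], y[tr_idx]
    X_te, y_te = X[te_idx], y[te_idx]
    # RBMO tunes the ELM input weights and hidden biases on the training fold
    best, _ = rbmo_minimize(lambda c: rmse_objective(c, X_tr, y_tr),
                            dim=(X_tr.shape[1] + 1) * N_HIDDEN,
                            lb=-BOUND, ub=BOUND, n_agents=30, max_iter=50)
    w, b = decode(best, X_tr.shape[1])
    beta = elm_fit(X_tr, y_tr, w, b)
    fold_results.append(regression_metrics(y_te, elm_predict(X_te, w, b, beta)))
```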
The RBMO-ELM consistently outperforms all benchmarked models in the training phase across all five folds. Its R2 values range between 0.9957 and 0.9979, indicating that more than 99.5% of the variance in the training data is accounted for by the model predictions, as seen in Table 3. In contrast, most other hybrid models display R2 values ranging between 0.88 and 0.95, suggesting higher bias and less predictive precision. The RBMO-ELM also exhibits substantially lower RMSE and MSE values across all folds. For instance, in Fold 1, the model achieves an RMSE of 5.7070 × 10−3 and an MSE of 3.3000 × 10−5, in contrast to the AO-ELM and EDO-ELM, which yield higher RMSE and MSE values. These reductions in error imply a more accurate model with minimal residual variance. The maximum error (ME) achieved by the RBMO-ELM is also the lowest in every fold, ranging from 1.6825 × 10−2 to 2.3126 × 10−2, which highlights the model’s ability to maintain prediction reliability even in the worst-case scenario. In comparison, other methods such as the HHO-ELM and AO-ELM often produce higher ME values, indicating higher sensitivity to outlier predictions. Furthermore, the RBMO-ELM demonstrates remarkable performance in terms of the RAE, while the PLO-ELM and SCA-ELM though competitive still exhibit higher RAE in most folds. Models like the AO-ELM and HHO-ELM again perform the worst, with RAE values reflecting a poor normalization of absolute errors.
The robustness and generalization capability of the RBMO-ELM are most evident during the testing phase in Table 4. Across all five folds, the RBMO-ELM maintains the highest generalization accuracy, with R2 values ranging from 0.9878 to 0.9981. In contrast, traditional hybrid models like the AO-ELM and HHO-ELM exhibit performance degradation during testing, with R2 dropping as low as 0.84, reflecting a tendency to overfit during training. In terms of the RMSE and MSE, the RBMO-ELM continues to yield the most accurate predictions. This significant reduction in prediction error confirms that the RBMO-ELM not only achieves low training error but also generalizes well on unseen data. Moreover, the ME for the RBMO-ELM remains the lowest across the majority of the test folds, ranging from 1.2794 × 10−2 to 4.3513 × 10−2, again substantially outperforming other models. The RAE values of the RBMO-ELM during testing remain impressively low, ranging from 2.7561 × 10−2 to 6.5896 × 10−2, further reinforcing the model’s strong predictive generalization and consistent reduction in absolute errors relative to baseline predictions. Competing optimizers like the JAYA-ELM and EDO-ELM frequently record high RAE values, indicating relative inefficiency.
The statistical analysis of the average, standard deviation, and best results of the training and testing outcomes across 20 independent runs confirms the superior performance of the RBMO-ELM model relative to its competitors. In the training phase, as detailed in Table 5, the RBMO-ELM achieved the highest average R2 of 0.997847, significantly surpassing all peer optimizers, including other strong performers like the PLO-ELM (0.984792) and SCA-ELM (0.984936). This high R2 value indicates that the RBMO-ELM model captures nearly all the variance in the data, thus exhibiting exceptional predictive power. Furthermore, the RBMO-ELM attained the lowest average MSE of 3.5750 × 10−5 and RMSE of 5.9545 × 10−3, reflecting the model’s high precision in approximating true values. The minimal ME of 1.9722 × 10−2 and RAE of 2.7684 × 10−2 further substantiate its accuracy and robustness. These performance metrics are accompanied by very low standard deviations across all indicators, underscoring the model’s stability and consistent convergence across runs.
In the testing phase from Table 6, the RBMO-ELM continued to outperform other hybrid models. It achieved the highest average R2 value of 0.989096, confirming its superior generalization ability on unseen data. In comparison, the PLO-ELM and SCA-ELM, though competitive, attained lower average R2 values of 0.980525 and 0.979919, respectively. The RBMO-ELM also yielded the lowest average RMSE of 1.0178 × 10−2, MSE of 1.0515 × 10−4, and ME of 2.5465 × 10−2, alongside the smallest RAE of 4.8434 × 10−2. Notably, the RBMO-ELM model exhibited the lowest variability, evidenced by the smallest standard deviations, which further confirms its robustness and optimization stability. Traditional optimizers such as the AO-ELM, HHO-ELM, and JAYA-ELM showed inferior results, with notably higher error magnitudes and standard deviations, indicating less reliable performance across runs. The proposed RBMO-ELM model demonstrates substantial improvements in predictive accuracy, reliability, and consistency due to the enhanced exploitation–exploration balance introduced by the RBMO strategy. These findings validate the algorithm’s efficacy for the CO2 prediction task and underscore its superiority in optimizing the ELM model when compared to both classical and recently developed metaheuristic-based alternatives. To address the risk of overfitting, this study adopted a 5-fold cross-validation strategy, ensuring that the RBMO-ELM model generalizes well across different data subsets. Consistently high R2 values and low error metrics (RMSE, MSE, RAE) across both training and testing phases, as shown in Table 3, Table 4, Table 5 and Table 6, indicate that the model maintains robust predictive performance without overfitting. Additionally, the low standard deviations across 20 independent runs, along with minimal error dispersion in Figure 7, Figure 8 and Figure 9, further validate the model’s generalization capability and stability.
The scatter plots shown in Figure 7 illustrate the predictive performance of various hybrid ELM models using their best result from 20 independent runs. Each plot represents the relationship between predicted and actual CO2 emission values for both training and testing sets, with a dashed red line indicating the ideal line (perfect prediction). Among all models, the RBMO-ELM plot exhibits the most alignment of both training and testing data points along the ideal line, reflecting its superior predictive accuracy and generalization capability. The points are tightly clustered with minimal dispersion, indicating that the model captures the underlying mapping between the features and target with high fidelity. This strong fit correlates well with the model’s lowest recorded RMSE and MSE values, as well as the highest R2 from both training and testing evaluations, as given in Table 5 and Table 6.
In contrast, the AO-ELM, HHO-ELM, and JAYA-ELM plots demonstrate noticeable deviation from the ideal line, particularly in the test set points. These deviations highlight their relatively weaker generalization. The HHO-ELM and JAYA-ELM exhibit larger vertical dispersion, which is consistent with their higher MSE and ME values from the test phase. Models such as PLO-ELM and SCA-ELM show reasonably good clustering, particularly during training, but the spread of testing data points indicates mild overfitting and reduced generalization. Though they perform better than the earlier methods, their scatter plots do not exhibit the tight fit achieved by the RBMO-ELM. This is corroborated by their slightly lower R2 and higher RMSE on the test data compared to the RBMO-ELM. The baseline ELM model, lacking any optimization, displays a more dispersed pattern around the ideal line, especially in the test data. This illustrates the limitations of ELM when used without fine-tuned input weights and biases, leading to reduced prediction precision and increased error margins.
The error plot in Figure 8 provides a comparative analysis of the predictive performance and robustness of each hybrid ELM model, showcasing actual versus predicted values across data samples and the corresponding absolute error (in dark red) for both training and testing sets. Among the models, the RBMO-ELM distinctly demonstrates the most consistent and minimal prediction error throughout the entire dataset. The error curve remains tightly centered around zero, with negligible oscillations, particularly during the testing phase. This directly aligns with the model’s superior performance in Table 5 and Table 6, where it achieved the lowest RMSE, MSE, ME, and RAE. The minimal discrepancy between predicted and actual values illustrates the RBMO-ELM’s exceptional generalization capability and its ability to maintain accuracy on unseen data. In contrast, models such as the AO-ELM, HHO-ELM, and JAYA-ELM exhibit significant volatility in their error curves. The amplitude of error deviations increases notably in the testing phase, suggesting reduced stability and a tendency toward overfitting or insufficient adaptation to unseen data. Specifically, the HHO-ELM and JAYA-ELM show several spikes, indicating multiple samples with substantial prediction errors. These fluctuations are consistent with their relatively higher ME and RMSE values reported in the quantitative evaluations. The EDO-ELM performs moderately better, with a comparatively smoother error than the AO-ELM and HHO-ELM. However, it still exhibits noticeable deviations during testing, indicating moderate instability. Similarly, the SCA-ELM and PLO-ELM show improved predictive alignment in the training phase, but the test phase reveals modest error dispersion, reflective of slightly diminished generalization compared to the RBMO-ELM. The baseline ELM model, which lacks any optimization of input weights and biases, presents the most erratic error behavior. Its error plot reveals frequent and high-magnitude deviations across both training and testing samples, underlining its limited modeling capacity and susceptibility to poor convergence. This observation is consistent with its inferior statistical metrics across all performance indicators.
The convergence curve of optimizer-based ELM models during training using the RMSE as the objective function is depicted in Figure 9. The plot clearly demonstrates the optimization efficiency and learning dynamics of the RBMO-ELM in comparison to other hybrid models. The RMSE serves as the fitness function, providing insight into how effectively each optimizer minimizes the training error over 50 iterations. The RBMO-ELM curve exhibits the fastest and most stable convergence, achieving the lowest RMSE throughout the training process. This smooth and consistent descent reflects the robust exploitation and refined search balance achieved through the RBMO mechanism. By contrast, the SCA-ELM and PLO-ELM show moderately strong convergence behaviors, but with slower descent rates and higher final RMSE values than RBMO-ELM. While both models exhibit steady error reduction, they plateau at higher fitness levels, indicating premature convergence or suboptimal search capabilities relative to the RBMO-ELM.
The HHO-ELM optimizer shows early promise with a sharp initial drop in RMSE, but its progression slows significantly and stabilizes at a higher error level than the RBMO. This behavior suggests that while the HHO-ELM can initially exploit promising regions, it lacks the fine-tuned convergence behavior needed for precise optimization. On the other hand, the AO-ELM and EDO-ELM demonstrate noticeably slower and less efficient convergence. The AO-ELM exhibits a gradual descent with significant stagnation, while the EDO-ELM remains largely flat with minimal error reduction, reflecting its limited effectiveness in tuning the ELM model for this particular task. The JAYA-ELM, meanwhile, shows one of the least responsive curves, with minimal variation across iterations. Its near-horizontal trajectory suggests weak search adaptability and underperformance in minimizing the RMSE compared to the other strategies.
Permutation-based feature importance was adopted in this study due to its model-agnostic simplicity, interpretability, and effectiveness in quantifying the relative contribution of each feature to prediction accuracy. Permutation importance remains a widely accepted method for understanding model behavior, particularly in black-box settings such as hybrid metaheuristic-optimized models [44,45]. Though widely accepted, the limitation of permutation importance is that it assumes feature independence. As observed in the correlation matrix (Figure 5), certain moderate correlations exist. This study adopted the permutation importance score as a method to explore the impact of input variables on the model prediction accuracy. The permutation feature importance analysis of the RBMO-ELM model, as presented in Figure 10, provides a comprehensive ranking of input variables based on their influence on CO2 emission predictions. The Social Globalization (SG) indicator emerged as the most critical predictor, contributing the highest importance score. This underscores the significant impact of global social integration such as international communication, information exchange, and migration on national emission levels, due to changes in consumption behavior, industrial demand, and policy diffusion. Social Globalization (SG), as measured by the KOF index, significantly shapes CO2 emissions through increased interpersonal contact, information exchange, and cultural diffusion. Research by Dreher et al. [46] established that higher globalization generally corresponds with elevated CO2 output, attributable to rising consumption and industrial demand [47]. A recent ARDL study by Rasool et al. [48] found that in Indonesia, SG tends to mitigate environmental degradation, largely through the adoption of green technologies and the spread of global environmental norms. Conversely, studies from the South Asia region [49] reveal that SG drastically reduces CO2 emissions. This reflects Indonesia’s policy trajectory to harness globalization to advance green innovation while managing rising consumption and energy pressures. Consequently, policymakers should design SG-linked strategies that simultaneously promote environmental standards, green technologies, and sustainable consumption behaviors. In conclusion, the identification of SG as a significant determinant in CO2 emissions in Indonesia is consistent with findings reported in prior studies. Following SG, Economic Growth (GDP) ranks second in importance, reinforcing the well-established link between economic expansion and increased energy consumption, which often results in elevated carbon emissions. The third most influential feature is Ecological Footprint (EF), which captures environmental degradation and resource utilization efficiency. Its relatively high score signifies its role in explaining variations in emissions by reflecting unsustainable consumption and land use patterns. On the other hand, Medium and High-Tech Industry (MHT), Green Innovation (GINVO), and Renewable Energy (REN) exhibit lower importance scores.
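A minimal, model-agnostic sketch of the permutation importance procedure used here is given below: each feature column is shuffled in turn and the resulting increase in prediction error is recorded. Scoring by the rise in RMSE is an assumption for illustration; what Figure 10 reports is the relative ranking of the indicators.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=30, seed=0):
    """Mean increase in RMSE when each feature column is independently shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.sqrt(np.mean((y - predict(X)) ** 2))
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])          # break the link between feature j and the target
            increases.append(np.sqrt(np.mean((y - predict(Xp)) ** 2)) - baseline)
        importance[j] = np.mean(increases)
    return importance                       # higher value = more influential feature

# Example: scores = permutation_importance(lambda Z: elm_predict(Z, w, b, beta), X_te, y_te)
```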

3.2.4. Preservation of System Properties and Model Contributions

Artificial Neural Networks (ANNs), including Extreme Learning Machines (ELMs), have long been recognized for their exceptional capability in modeling nonlinear, high-dimensional, and complex systems. Their strength lies in their data-driven learning mechanism, which enables them to capture patterns and interactions that are often challenging for traditional models analytically. In environmental modeling domains such as CO2 emission prediction, these characteristics make neural networks particularly attractive due to their ability to generalize well across diverse scenarios.
However, despite their versatility, neural networks such as the RBMO-ELM face notable limitations in preserving the inherent structural properties of the systems they model. Neural networks often operate under the assumption of independent and identically distributed data and do not inherently incorporate domain-specific constraints such as temporal dependencies, physical laws, or inter-variable causal relationships. This limitation affects model interpretability and realism in systems governed by complex physical or socio-economic interactions, where preserving the system structure is crucial.
Recent advancements in structure-preserving neural architectures have aimed to bridge this gap. For instance, structure-preserving neural networks, as proposed by Hernández et al., enable the model to maintain specific properties or structures of the data while learning in a manner consistent with the laws of thermodynamics [50]. Additionally, Xiao et al. proposed a machine learning approach that employs recurrent neural networks (RNNs) and is grounded in a mathematical framework that preserves the Birkhoffian structure [51]. Physics-informed neural networks (PINNs) developed by Raissi et al. integrate partial differential equations into the loss function to enforce given laws [52]. These innovations highlight a growing emphasis on embedding domain structure preservation into the learning process of machine learning models.
The RBMO-ELM framework proposed in this study does not incorporate structure-preserving mechanisms: it lacks the integration of domain-specific physical constraints, temporal dependencies, or causal relationships that characterize structure-preserving models. The inherent properties of the CO2 emission system in this study encompass several key structural characteristics. First, nonlinear relationships exist among variables. Emissions are influenced by complex, nonlinear dependencies on predictors such as GDP, social globalization, and ecological footprint, reflecting intertwined socio-economic and environmental mechanisms. Second, the system exhibits temporal dynamics, as emission levels evolve over time in response to industrialization trajectories, policy interventions, and globalization trends. Third, the system is characterized by strong interdependencies among predictors; for instance, GDP affects energy consumption, which in turn is closely linked to the ecological footprint, forming a network of causal and correlative interactions. This limitation arises because the ELM component primarily relies on randomized weights and biases, without incorporating prior knowledge of the system’s physical laws or socio-economic dynamics. The RBMO, on the other hand, focuses solely on minimizing prediction error, without enforcing structural coherence in the learned representation. Nevertheless, the model effectively captures the underlying relationships through optimized learning, and it contributes significantly to performance and pattern learning through the following three key strengths:
  • Optimized Parameter Tuning: The RBMO enhances ELM training by fine-tuning weights and biases, thereby improving model stability and convergence.
  • Flexible Model Architecture: The inherent simplicity and modularity of ELM allow rapid training and adaptation, enabling the model to approximate complex mappings.
  • Interpretability: Feature importance analysis, especially highlighting social globalization, provides valuable insights for policy framing.
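To illustrate how the first of these strengths operates in practice, the sketch below shows a population-based metaheuristic tuning the input weights and hidden biases of an ELM while the output weights are solved in closed form, as in Huang et al. [34]. The simple position-update rule is a generic stand-in for the RBMO's foraging operators described by Fu et al. [33], not the actual algorithm, and the function names, population size, and synthetic data are assumptions introduced for illustration only.

```python
# Minimal sketch of metaheuristic-tuned ELM training (generic update, not the true RBMO).
import numpy as np

rng = np.random.default_rng(42)

def elm_fitness(params, X, y, n_hidden):
    """Decode a candidate into ELM input weights/biases, solve the output layer
    by least squares, and return the training RMSE as the fitness value."""
    n_features = X.shape[1]
    W = params[: n_hidden * n_features].reshape(n_hidden, n_features)   # input weights
    b = params[n_hidden * n_features:]                                   # hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))                             # sigmoid hidden layer
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)                         # analytic output weights
    return np.sqrt(np.mean((H @ beta - y) ** 2))

def tune_elm(X, y, n_hidden=10, pop_size=20, iters=100):
    dim = n_hidden * X.shape[1] + n_hidden
    pop = rng.uniform(-1.0, 1.0, (pop_size, dim))
    fit = np.array([elm_fitness(p, X, y, n_hidden) for p in pop])
    for _ in range(iters):
        best = pop[fit.argmin()].copy()                  # current best candidate
        for i in range(pop_size):
            r1, r2 = pop[rng.integers(pop_size, size=2)]
            # Move toward the best solution with a random perturbation (generic stand-in
            # for the RBMO search/attack phases).
            cand = pop[i] + rng.random(dim) * (best - pop[i]) + rng.random(dim) * (r1 - r2)
            cand_fit = elm_fitness(cand, X, y, n_hidden)
            if cand_fit < fit[i]:                        # greedy selection
                pop[i], fit[i] = cand, cand_fit
    return pop[fit.argmin()], fit.min()

# Toy usage with synthetic data (sizes are arbitrary placeholders).
X = rng.random((80, 8)); y = rng.random(80)
best_params, best_rmse = tune_elm(X, y)
print(f"best training RMSE: {best_rmse:.4f}")
```

Because the output weights are obtained analytically for every candidate, each fitness evaluation remains cheap, which is what makes ELMs attractive targets for metaheuristic tuning.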
These strengths are evident in the empirical results. As illustrated by the predicted vs. actual plots (Figure 7), the error plots (Figure 8), and the permutation-based sensitivity analysis (Figure 10; a minimal sketch of this procedure is given after this paragraph), the RBMO-ELM learns consistent patterns within the dataset. The close alignment between predicted and observed CO2 levels suggests effective internalization of the underlying relationships, even in the absence of formal structure-preserving constraints. Fully preserving the inherent properties of complex systems, however, requires model architectures that extend beyond conventional feedforward approaches such as the ELM. Advanced frameworks, such as recurrent neural networks designed to capture temporal dependencies and manifold regularization techniques that preserve topological or relational structures, offer promising avenues in this regard. Future experiments will explore the integration of these structure-preserving methodologies to enhance model fidelity and interpretability.
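For completeness, the permutation-based importance procedure referenced above follows the standard scheme of [44,45]: each predictor is shuffled in turn and the resulting increase in prediction error is taken as its importance. The sketch below is a generic illustration assuming a fitted regressor with a scikit-learn-style predict method and illustrative feature names; it is not the exact implementation used in this study.

```python
# Minimal sketch of permutation feature importance [44,45] for a fitted regressor.
import numpy as np

def permutation_importance(model, X, y, feature_names, n_repeats=20, seed=0):
    rng = np.random.default_rng(seed)
    baseline = np.mean((model.predict(X) - y) ** 2)            # baseline MSE
    scores = {}
    for j, name in enumerate(feature_names):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])       # break the feature-target link
            increases.append(np.mean((model.predict(X_perm) - y) ** 2) - baseline)
        scores[name] = float(np.mean(increases))               # mean error increase = importance
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```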

4. Conclusions

This study proposed a novel hybrid predictive framework, the RBMO-ELM, which integrates the RBMO with the ELM to improve carbon emission prediction accuracy in Indonesia. The RBMO algorithm, inspired by the cooperative and adaptive foraging behaviors of red-billed blue magpies, exhibits a strong exploration–exploitation balance and was employed to optimally tune the weights and biases of the ELM model. Its optimization capability was first assessed on the CEC2015 benchmark functions, after which a quarterly dataset of economic, technological, environmental, and globalization indicators was used to predict CO2 emissions with the RBMO-ELM model. Empirical results from multiple experiments, including 5-fold cross-validation and 20 independent runs, consistently show that the RBMO-ELM outperforms its hybrid counterparts in terms of the R2, RMSE, MSE, ME, and RAE metrics, while also achieving the fastest and most stable convergence profile, highlighting its superior learning and generalization capabilities. Permutation feature importance analysis further identified Social Globalization, Economic Growth, and Ecological Footprint as the most influential drivers of CO2 emissions in Indonesia, offering valuable policy insights into the socio-economic dimensions of environmental sustainability.

This work has limitations. The model was trained on a single-country dataset, which restricts its generalizability to other countries, and although the ELM offers rapid learning, it remains sensitive to initial parameter selection, motivating the use of metaheuristics but also introducing algorithmic complexity. Future research will therefore apply the RBMO-ELM framework to multi-country or sector-specific datasets and incorporate dynamic or adaptive control strategies within the RBMO to enhance scalability. The benchmarking framework will also be expanded with additional models, including advanced deep learning architectures such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, as well as ensemble learning methods such as XGBoost and Random Forest. Finally, future work will embed structure-preserving techniques within the RBMO-ELM framework, for example by incorporating domain constraints via regularization, exploring socio-economically informed ELM variants, or leveraging graph-based feature encoders to maintain the structural properties of the system.

Author Contributions

M.A.: Conceptualization, Methodology, Formal Analysis, Original Draft. A.B.A.: Supervision, Resources, Editing. O.R.A.: Supervision, Resources, Editing, Methodology, Original Draft. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data obtained through the experiments are available upon request from the corresponding author.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

References

  1. Sun, W.; Huang, C. Predictions of carbon emission intensity based on factor analysis and an improved extreme learning machine from the perspective of carbon emission efficiency. J. Clean. Prod. 2022, 338, 130414. [Google Scholar] [CrossRef]
  2. Khaliq, M.A.; Alsudays, I.M.; Alhaithloul, H.A.S.; Rizwan, M.; Yong, J.W.H.; Ur Rahman, S.; Sagir, M.; Bashir, S.; Ali, H.; Hongchao, Z. Biochar impacts on carbon dioxide, methane emission, and cadmium accumulation in rice from Cd-contaminated soils; A meta-analysis. Ecotoxicol. Environ. Saf. 2024, 274, 116204. [Google Scholar] [CrossRef]
  3. Calvin, K.; Dasgupta, D.; Krinner, G.; Mukherji, A.; Thorne, P.W.; Trisos, C.; Romero, J.; Aldunce, P.; Barrett, K.; Blanco, G.; et al. IPCC, 2023: Climate Change 2023: Synthesis Report. Contribution of Working Groups I, II and III to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Core Writing Team, Lee, H., Romero, J., Eds.; Intergovernmental Panel on Climate Change (IPCC): Geneva, Switzerland, 2023. [Google Scholar] [CrossRef]
  4. Saleh, C.; Dzakiyullah, N.R.; Nugroho, J.B. Carbon dioxide emission prediction using support vector machine. IOP Conf. Ser. Mater. Sci. Eng. 2016, 114, 012148. [Google Scholar] [CrossRef]
  5. Jha, M.K.; Dev, M. Impacts of Climate Change. In Smart Internet of Things for Environment and Healthcare; Azrour, M., Mabrouki, J., Alabdulatif, A., Guezzaz, A., Amounas, F., Eds.; Springer Nature: Cham, Switzerland, 2024; pp. 139–159. [Google Scholar] [CrossRef]
  6. Thomas, C. The United Nations Conference on Environment and Development (UNCED) of 1992 in Context. Available online: https://www.tandfonline.com/doi/abs/10.1080/09644019208414053 (accessed on 9 June 2025).
  7. Mor, S.; Aneja, R.; Madan, S.; Ghimire, M. Kyoto Protocol and Paris Agreement: Transition from Bindings to Pledges—A Review. Millenn. Asia 2024, 15, 690–711. [Google Scholar] [CrossRef]
  8. Dwivedi, Y.K.; Hughes, L.; Kar, A.K.; Baabdullah, A.M.; Grover, P.; Abbas, R.; Andreini, D.; Abumoghli, I.; Barlette, Y.; Bunker, D.; et al. Climate change and COP26: Are digital technologies and information management part of the problem or the solution? An editorial reflection and call to action. Int. J. Inf. Manag. 2022, 63, 102456. [Google Scholar] [CrossRef]
  9. Emami Javanmard, M.; Ghaderi, S.F. A Hybrid Model with Applying Machine Learning Algorithms and Optimization Model to Forecast Greenhouse Gas Emissions with Energy Market Data. Sustain. Cities Soc. 2022, 82, 103886. [Google Scholar] [CrossRef]
  10. Feng, Y.Y.; Chen, S.Q.; Zhang, L.X. System dynamics modeling for urban energy consumption and CO2 emissions: A case study of Beijing, China. Ecol. Model. 2013, 252, 44–52. [Google Scholar] [CrossRef]
  11. Bao, X.; Xie, T.; Huang, H. Prediction and Control Model of Carbon Emissions from Thermal Power Based on System Dynamics. Pol. J. Environ. Stud. 2021, 30, 5465–5477. [Google Scholar] [CrossRef]
  12. Hernandez, R.M.; Castillo, J.J.; Ragasa, A.A.; Mendez, C.; Agdon, F. Energy Consumption Forecasting for Smart Industry Using Auto-Regressive Integrated Moving Average (ARIMA) and Vector Auto-Regression (VAR) Model. In SIET ’23, Proceedings of the 8th International Conference on Sustainable Information Engineering and Technology, Bali, Indonesia, 24–25 October 2023; Association for Computing Machinery: New York, NY, USA, 2023; pp. 129–136. [Google Scholar] [CrossRef]
  13. Hamzacebi, C.; Karakurt, I. Forecasting the Energy-related CO2 Emissions of Turkey Using a Grey Prediction Model. Energy Sources Part Recovery Util. Environ. Eff. 2015, 37, 1023–1031. [Google Scholar] [CrossRef]
  14. Hu, Y.; Wang, B.; Yang, Y.; Yang, L. An Enhanced Particle Swarm Optimization Long Short-Term Memory Network Hybrid Model for Predicting Residential Daily CO2 Emissions. Sustainability 2024, 16, 8790. [Google Scholar] [CrossRef]
  15. Altikat, S. Prediction of CO2 emission from greenhouse to atmosphere with artificial neural networks and deep learning neural networks. Int. J. Environ. Sci. Technol. 2021, 18, 3169–3178. [Google Scholar] [CrossRef]
  16. Wen, L.; Cao, Y. Influencing factors analysis and forecasting of residential energy-related CO2 emissions utilizing optimized support vector machine. J. Clean. Prod. 2020, 250, 119492. [Google Scholar] [CrossRef]
  17. Moon, T.; Choi, H.Y.; Jung, D.H.; Chang, S.H.; Son, J.E. Prediction of CO2 Concentration via Long Short-Term Memory Using Environmental Factors in Greenhouses. Hortic. Sci. Technol. 2020, 38, 201–209. [Google Scholar] [CrossRef]
  18. Singh, P.K.; Pandey, A.K.; Ahuja, S.; Kiran, R. Multiple forecasting approach: A prediction of CO2 emission from the paddy crop in India. Environ. Sci. Pollut. Res. 2022, 29, 25461–25472. [Google Scholar] [CrossRef]
  19. Saqr, A.E.-S.; Saraya, M.S.; El-Kenawy, E.-S.M. Enhancing CO2 emissions prediction for electric vehicles using Greylag Goose Optimization and machine learning. Sci. Rep. 2025, 15, 16612. [Google Scholar] [CrossRef]
  20. Khajavi, H.; Rastgoo, A. Predicting the carbon dioxide emission caused by road transport using a Random Forest (RF) model combined by Meta-Heuristic Algorithms. Sustain. Cities Soc. 2023, 93, 104503. [Google Scholar] [CrossRef]
  21. Alhussan, A.A.; Metwally, M.; Towfek, S.K. Predicting CO2 Emissions with Advanced Deep Learning Models and a Hybrid Greylag Goose Optimization Algorithm. Mathematics 2025, 13, 1481. [Google Scholar] [CrossRef]
  22. Phatai, G.; Luangrungruang, T. Modeling Energy-Related CO2 Emissions with Backpropagation and Metaheuristics. In Proceedings of the 2024 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Osaka, Japan, 19–22 February 2024; pp. 429–433. [Google Scholar] [CrossRef]
  23. Cambria, E.; Huang, G.-B.; Kasun, L.L.C.; Zhou, H.; Vong, C.M.; Lin, J.; Yin, J.; Cai, Z.; Liu, Q.; Li, K.; et al. Extreme Learning Machines [Trends & Controversies]. IEEE Intell. Syst. 2013, 28, 30–59. [Google Scholar] [CrossRef]
  24. Algwil, A.R.A.; Khalifa, W.M.S. An enhanced moth flame optimization extreme learning machines hybrid model for predicting CO2 emissions. Sci. Rep. 2025, 15, 11948. [Google Scholar] [CrossRef]
  25. Van Thieu, N.; Houssein, E.H.; Oliva, D.; Hung, N.D. IntelELM: A python framework for intelligent metaheuristic-based extreme learning machine. Neurocomputing 2025, 618, 129062. [Google Scholar] [CrossRef]
  26. Pradhan, D.; Muduli, D.; Zamani, A.T.; Yaqoob, S.I.; Alanazi, S.M.; Kumar, R.R.; Parveen, N.; Shameem, M. Refined Software Defect Prediction Using Enhanced JAYA Optimization and Extreme Learning Machine. IEEE Access 2024, 12, 141559–141579. [Google Scholar] [CrossRef]
  27. Van Thieu, N.; Nguyen, N.H.; Sherif, M.; El-Shafie, A.; Ahmed, A.N. Integrated metaheuristic algorithms with extreme learning machine models for river streamflow prediction. Sci. Rep. 2024, 14, 13597. [Google Scholar] [CrossRef] [PubMed]
  28. Abba, S.I.; Pham, Q.B.; Malik, A.; Costache, R.; Gaya, M.S.; Abdullahi, J.; Mati, S.; Usman, A.G.; Saini, G. Optimization of Extreme Learning Machine with Metaheuristic Algorithms for Modelling Water Quality Parameters of Tamburawa Water Treatment Plant in Nigeria. Water Resour. Manag. 2025, 39, 1377–1401. [Google Scholar] [CrossRef]
  29. Saleh, A.; Zulkifley, M.A. Prediction of suspended sediment load in Sungai Semenyih using extreme learning machines and metaheuristic optimization approach. J. Environ. Manag. 2025, 380, 124987. [Google Scholar] [CrossRef]
  30. Sun, W.; Wang, C.; Zhang, C. Factor analysis and forecasting of CO2 emissions in Hebei, using extreme learning machine based on particle swarm optimization. J. Clean. Prod. 2017, 162, 1095–1101. [Google Scholar] [CrossRef]
  31. Wei, S.; Yuwei, W.; Chongchong, Z. Forecasting CO2 emissions in Hebei, China, through moth-flame optimization based on the random forest and extreme learning machine. Environ. Sci. Pollut. Res. 2018, 25, 28985–28997. [Google Scholar] [CrossRef]
  32. Wang, J.; Tan, Y.; Yu, J.; Yu, H.; Wang, M.; Zhou, M. A deep hybrid prediction framework for building operational carbon emissions: Integrating enhanced extreme learning machines. Energy Rep. 2025, 13, 4126–4140. [Google Scholar] [CrossRef]
  33. Fu, S.; Li, K.; Huang, H.; Ma, C.; Fan, Q.; Zhu, Y. Red-billed blue magpie optimizer: A novel metaheuristic algorithm for 2D/3D UAV path planning and engineering design problems. Artif. Intell. Rev. 2024, 57, 134. [Google Scholar] [CrossRef]
  34. Huang, G.-B.; Zhu, Q.-Y.; Siew, C.-K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501. [Google Scholar] [CrossRef]
  35. Chai, S.; Zhang, Z.; Zhang, Z. Carbon price prediction for China’s ETS pilots using variational mode decomposition and optimized extreme learning machine. Ann. Oper. Res. 2025, 345, 809–830. [Google Scholar] [CrossRef]
  36. Liang, J.J.; Qu, B.Y.; Suganthan, P.N.; Chen, Q. Problem Definitions and Evaluation Criteria for the CEC 2015 Competition on Learning-Based Real-Parameter Single Objective Optimization; Technical Report 201411A; Computational Intelligence Laboratory, Zhengzhou University: Zhengzhou, China; Nanyang Technological University: Singapore, 2014; Volume 29. [Google Scholar]
  37. Abualigah, L.; Yousri, D.; Abd Elaziz, M.; Ewees, A.A.; Al-qaness, M.A.A.; Gandomi, A.H. Aquila Optimizer: A novel meta-heuristic optimization algorithm. Comput. Ind. Eng. 2021, 157, 107250. [Google Scholar] [CrossRef]
  38. Abdel-Basset, M.; El-Shahat, D.; Jameel, M.; Abouhawwash, M. Exponential distribution optimizer (EDO): A novel math-inspired algorithm for global optimization and engineering problems. Artif. Intell. Rev. 2023, 56, 9329–9400. [Google Scholar] [CrossRef]
  39. Heidari, A.A.; Mirjalili, S.; Faris, H.; Aljarah, I.; Mafarja, M.; Chen, H. Harris hawks optimization: Algorithm and applications. Future Gener. Comput. Syst. 2019, 97, 849–872. [Google Scholar] [CrossRef]
  40. Venkata Rao, R. Jaya: A simple and new optimization algorithm for solving constrained and unconstrained optimization problems. Int. J. Ind. Eng. Comput. 2016, 7, 19–34. [Google Scholar] [CrossRef]
  41. Yuan, C.; Zhao, D.; Heidari, A.A.; Liu, L.; Chen, Y.; Chen, H. Polar lights optimizer: Algorithm and applications in image segmentation and feature selection. Neurocomputing 2024, 607, 128427. [Google Scholar] [CrossRef]
  42. Mirjalili, S. SCA: A Sine Cosine Algorithm for solving optimization problems. Knowl.-Based Syst. 2016, 96, 120–133. [Google Scholar] [CrossRef]
  43. Khajavi, H.; Rastgoo, A.; Masoumi, F. Enhanced Streamflow Forecasting for Crisis Management Based on Hybrid Extreme Gradient Boosting Model. Iran. J. Sci. Technol. Trans. Civ. Eng. 2025, 1–22. [Google Scholar] [CrossRef]
  44. Molnar, C. Permutation Feature Importance. In Interpretable Machine Learning. Available online: https://christophm.github.io/interpretable-ml-book/feature-importance.html (accessed on 2 July 2025).
  45. Altmann, A.; Toloşi, L.; Sander, O.; Lengauer, T. Permutation importance: A corrected feature importance measure. Bioinformatics 2010, 26, 1340–1347. [Google Scholar] [CrossRef]
  46. Dreher, A.; Gaston, N.; Martens, P. Measuring Globalisation; Springer: New York, NY, USA, 2008. [Google Scholar] [CrossRef]
  47. Gygli, S.; Haelg, F.; Potrafke, N.; Sturm, J.-E. The KOF Globalisation Index—Revisited. Rev. Int. Organ. 2019, 14, 543–574. [Google Scholar] [CrossRef]
  48. Rasool, Y.; Jianguo, D.; Ali, K. Exploring the linkage between globalization and environmental degradation: A disaggregate analysis of Indonesia. Environ. Dev. Sustain. 2024, 26, 16887–16915. [Google Scholar] [CrossRef]
  49. Abbas, M.; Yang, L.; Lahr, M.L. Globalization’s effects on South Asia’s carbon emissions, 1996–2019: A multidimensional panel data perspective via FGLS. Humanit. Soc. Sci. Commun. 2024, 11, 1171. [Google Scholar] [CrossRef]
  50. Hernández, Q.; Badías, A.; González, D.; Chinesta, F.; Cueto, E. Structure-preserving neural networks. J. Comput. Phys. 2021, 426, 109950. [Google Scholar] [CrossRef]
  51. Xiao, S.; Chen, M.; Zhang, R.; Tang, Y. Structure-Preserving Recurrent Neural Networks for a Class of Birkhoffian Systems. J. Syst. Sci. Complex. 2024, 37, 441–462. [Google Scholar] [CrossRef]
  52. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
Figure 1. Extreme Learning Machine.
Figure 2. RBMO-ELM Flowchart.
Figure 3. Convergence Plot of the RBMO and other Optimizers.
Figure 4. Box Plot of RBMO and Compared Optimizers on the CEC2015 function.
Figure 5. Correlation Matrix of Variables.
Figure 6. Distribution of Variables.
Figure 7. Scatter Plot of Predicted vs. Actual for Training and Testing.
Figure 8. Error Plots of Training and Testing.
Figure 9. Convergence Curve.
Figure 10. Feature Importance Score.
Table 1. Parameter Settings of Optimizers.
Optimizer | Hyperparameters
AO | µ = 0.00565, ω = 0.005, α = δ = 0.1
EDO | Switch Parameter = 0.5
HHO | E0 = [2, 0]
JAYA | —
PLO | m = 100, a = [1, 1.5]
SCA | a = 2
RBMO | ε = 0.5
Table 2. Results of the RBMO and Compared Optimizers on CEC2015.
Function | Metric | AO | EDO | HHO | JAYA | PLO | SCA | RBMO
F1 | AVG | 4.673 × 10^7 | 7.707 × 10^9 | 1.888 × 10^10 | 6.290 × 10^9 | 6.754 × 10^5 | 1.290 × 10^10 | 2.703 × 10^3
F1 | STD | 3.619 × 10^7 | 3.837 × 10^9 | 6.666 × 10^9 | 2.385 × 10^9 | 2.090 × 10^5 | 2.860 × 10^9 | 3.242 × 10^3
F2 | AVG | 4.132 × 10^4 | 3.376 × 10^4 | 5.874 × 10^4 | 6.673 × 10^4 | 6.358 × 10^4 | 3.846 × 10^4 | 2.000 × 10^2
F2 | STD | 5.106 × 10^3 | 7.202 × 10^3 | 2.567 × 10^3 | 1.796 × 10^4 | 9.265 × 10^3 | 5.769 × 10^3 | 1.039 × 10^3
F3 | AVG | 3.273 × 10^2 | 3.293 × 10^2 | 3.398 × 10^2 | 3.343 × 10^2 | 3.188 × 10^2 | 3.361 × 10^2 | 3.128 × 10^2
F3 | STD | 3.393 | 2.965 | 2.429 | 2.995 | 1.327 | 2.560 | 3.949
F4 | AVG | 4.614 × 10^3 | 7.153 × 10^3 | 5.344 × 10^3 | 7.748 × 10^3 | 3.581 × 10^3 | 7.542 × 10^3 | 3.662 × 10^3
F4 | STD | 5.767 × 10^2 | 4.755 × 10^2 | 8.097 × 10^2 | 3.870 × 10^2 | 3.877 × 10^2 | 3.338 × 10^2 | 4.489 × 10^2
F5 | AVG | 5.013 × 10^2 | 5.030 × 10^2 | 5.017 × 10^2 | 5.026 × 10^2 | 5.007 × 10^2 | 5.027 × 10^2 | 5.007 × 10^2
F5 | STD | 5.375 × 10^-1 | 4.409 × 10^-1 | 5.724 × 10^-1 | 3.321 × 10^-1 | 1.176 × 10^-1 | 3.054 × 10^-1 | 2.438 × 10^-1
F6 | AVG | 6.007 × 10^2 | 6.023 × 10^2 | 6.033 × 10^2 | 6.011 × 10^2 | 6.005 × 10^2 | 6.021 × 10^2 | 6.003 × 10^2
F6 | STD | 1.327 × 10^-1 | 1.043 | 3.465 × 10^-1 | 3.059 × 10^-1 | 5.882 × 10^-2 | 4.445 × 10^-1 | 5.300 × 10^-2
F7 | AVG | 7.006 × 10^2 | 7.300 × 10^2 | 7.437 × 10^2 | 7.181 × 10^2 | 7.002 × 10^2 | 7.248 × 10^2 | 7.004 × 10^2
F7 | STD | 2.365 × 10^-1 | 1.154 × 10^1 | 6.052 | 4.267 | 2.913 × 10^-2 | 5.141 | 1.948 × 10^-1
F8 | AVG | 8.408 × 10^2 | 7.927 × 10^4 | 4.176 × 10^5 | 1.491 × 10^4 | 8.146 × 10^2 | 4.513 × 10^4 | 8.050 × 10^2
F8 | STD | 9.933 | 1.298 × 10^5 | 5.129 × 10^5 | 9.928 × 10^3 | 1.454 | 5.614 × 10^4 | 1.152
F9 | AVG | 9.124 × 10^2 | 9.134 × 10^2 | 9.131 × 10^2 | 9.133 × 10^2 | 9.121 × 10^2 | 9.133 × 10^2 | 9.112 × 10^2
F9 | STD | 4.319 × 10^-1 | 2.337 × 10^-1 | 3.916 × 10^-1 | 2.533 × 10^-1 | 3.793 × 10^-1 | 2.187 × 10^-1 | 4.550 × 10^-1
F10 | AVG | 1.852 × 10^6 | 1.153 × 10^6 | 2.519 × 10^7 | 5.757 × 10^6 | 4.941 × 10^5 | 9.643 × 10^6 | 2.243 × 10^4
F10 | STD | 1.224 × 10^6 | 1.473 × 10^6 | 1.428 × 10^7 | 2.203 × 10^6 | 1.927 × 10^5 | 4.361 × 10^6 | 1.194 × 10^3
F11 | AVG | 4.527 × 10^3 | 1.434 × 10^4 | 9.176 × 10^5 | 5.276 × 10^5 | 1.137 × 10^3 | 8.158 × 10^6 | 1.142 × 10^3
F11 | STD | 4.695 × 10^3 | 1.026 × 10^4 | 2.497 × 10^6 | 4.552 × 10^5 | 2.408 × 10^1 | 7.609 × 10^6 | 6.184 × 10^1
F12 | AVG | 4.813 × 10^3 | 6.926 × 10^4 | 3.498 × 10^9 | 5.145 × 10^5 | 3.002 × 10^3 | 8.429 × 10^8 | 1.383 × 10^3
F12 | STD | 1.326 × 10^3 | 3.320 × 10^5 | 1.516 × 10^10 | 8.770 × 10^5 | 4.957 × 10^2 | 1.299 × 10^9 | 4.260 × 10^2
F13 | AVG | 1.562 × 10^3 | 1.656 × 10^3 | 1.848 × 10^3 | 1.570 × 10^3 | 1.559 × 10^3 | 1.585 × 10^3 | 1.558 × 10^3
F13 | STD | 1.540 | 2.948 × 10^1 | 1.417 × 10^2 | 9.009 | 9.386 × 10^-1 | 8.156 | 6.216
F14 | AVG | 2.074 × 10^3 | 4.818 × 10^3 | 4.444 × 10^3 | 2.367 × 10^3 | 2.115 × 10^3 | 2.853 × 10^3 | 1.973 × 10^3
F14 | STD | 8.698 × 10^1 | 5.004 × 10^2 | 1.480 × 10^3 | 1.911 × 10^2 | 1.353 × 10^2 | 2.297 × 10^2 | 3.022 × 10^-1
F15 | AVG | 2.843 × 10^3 | 2.378 × 10^3 | 2.753 × 10^3 | 2.786 × 10^3 | 2.572 × 10^3 | 2.910 × 10^3 | 2.365 × 10^3
F15 | STD | 4.992 × 10^1 | 3.537 × 10^2 | 8.056 × 10^1 | 1.431 × 10^2 | 8.141 × 10^1 | 3.159 × 10^1 | 2.663 × 10^2
p-value |  | 6.550 × 10^-4 | 6.550 × 10^-4 | 6.550 × 10^-4 | 6.550 × 10^-4 | 1.703 × 10^-2 | 6.550 × 10^-4 | 6.550 × 10^-4
Friedman Mean |  | 3.27 | 4.93 | 5.93 | 4.90 | 2.17 | 5.57 | 1.23
Friedman Rank |  | 3 | 5 | 7 | 4 | 2 | 6 | 1
Bold means the best value.
Table 3. Training results with 5-fold Cross Validation of the RBMO-ELM.
Cross Validation | Model | R2 | RMSE | MSE | ME | RAE
Fold 1 | AO-ELM | 0.890731 | 4.0066 × 10^-2 | 1.6050 × 10^-3 | 1.1848 × 10^-1 | 1.8758 × 10^-1
Fold 1 | EDO-ELM | 0.975701 | 1.8427 × 10^-2 | 3.4000 × 10^-4 | 4.3473 × 10^-2 | 8.6112 × 10^-2
Fold 1 | HHO-ELM | 0.931951 | 3.1619 × 10^-2 | 1.0000 × 10^-3 | 7.2702 × 10^-2 | 1.4803 × 10^-1
Fold 1 | JAYA-ELM | 0.949819 | 2.8074 × 10^-2 | 7.8800 × 10^-4 | 6.1632 × 10^-2 | 1.3229 × 10^-1
Fold 1 | PLO-ELM | 0.983397 | 1.5618 × 10^-2 | 2.4400 × 10^-4 | 3.4396 × 10^-2 | 7.3118 × 10^-2
Fold 1 | SCA-ELM | 0.984547 | 1.5068 × 10^-2 | 2.2700 × 10^-4 | 3.7415 × 10^-2 | 7.0541 × 10^-2
Fold 1 | RBMO-ELM | 0.997927 | 5.7070 × 10^-3 | 3.3000 × 10^-5 | 2.3126 × 10^-2 | 2.6891 × 10^-2
Fold 1 | ELM | 0.939773 | 2.9746 × 10^-2 | 8.8500 × 10^-4 | 9.8621 × 10^-2 | 1.3926 × 10^-1
Fold 2 | AO-ELM | 0.91195 | 3.6210 × 10^-2 | 1.3110 × 10^-3 | 9.3294 × 10^-2 | 1.7603 × 10^-1
Fold 2 | EDO-ELM | 0.898329 | 3.7361 × 10^-2 | 1.3960 × 10^-3 | 9.0330 × 10^-2 | 1.7812 × 10^-1
Fold 2 | HHO-ELM | 0.949311 | 2.7474 × 10^-2 | 7.5500 × 10^-4 | 6.9879 × 10^-2 | 1.3356 × 10^-1
Fold 2 | JAYA-ELM | 0.938262 | 2.9150 × 10^-2 | 8.5000 × 10^-4 | 9.2647 × 10^-2 | 1.3255 × 10^-1
Fold 2 | PLO-ELM | 0.983997 | 1.5437 × 10^-2 | 2.3800 × 10^-4 | 3.5256 × 10^-2 | 7.5046 × 10^-2
Fold 2 | SCA-ELM | 0.986408 | 1.4227 × 10^-2 | 2.0200 × 10^-4 | 2.9028 × 10^-2 | 6.9161 × 10^-2
Fold 2 | RBMO-ELM | 0.99652 | 6.9200 × 10^-3 | 4.8000 × 10^-5 | 1.8310 × 10^-2 | 3.1468 × 10^-2
Fold 2 | ELM | 0.936239 | 3.0814 × 10^-2 | 9.4900 × 10^-4 | 9.3922 × 10^-2 | 1.4980 × 10^-1
Fold 3 | AO-ELM | 0.930005 | 3.3130 × 10^-2 | 1.0980 × 10^-3 | 7.6624 × 10^-2 | 1.5226 × 10^-1
Fold 3 | EDO-ELM | 0.966657 | 2.2814 × 10^-2 | 5.2000 × 10^-4 | 4.3149 × 10^-2 | 1.0841 × 10^-1
Fold 3 | HHO-ELM | 0.976324 | 1.9268 × 10^-2 | 3.7100 × 10^-4 | 5.4520 × 10^-2 | 8.8552 × 10^-2
Fold 3 | JAYA-ELM | 0.94844 | 2.7950 × 10^-2 | 7.8100 × 10^-4 | 6.8194 × 10^-2 | 1.3523 × 10^-1
Fold 3 | PLO-ELM | 0.987548 | 1.3974 × 10^-2 | 1.9500 × 10^-4 | 3.4310 × 10^-2 | 6.4219 × 10^-2
Fold 3 | SCA-ELM | 0.986808 | 1.4383 × 10^-2 | 2.0700 × 10^-4 | 3.2648 × 10^-2 | 6.6101 × 10^-2
Fold 3 | RBMO-ELM | 0.995707 | 8.0650 × 10^-3 | 6.5000 × 10^-5 | 1.8547 × 10^-2 | 3.9022 × 10^-2
Fold 3 | ELM | 0.948733 | 2.8354 × 10^-2 | 8.0400 × 10^-4 | 7.2765 × 10^-2 | 1.3031 × 10^-1
Fold 4 | AO-ELM | 0.885407 | 3.7812 × 10^-2 | 1.4300 × 10^-3 | 1.0660 × 10^-1 | 1.7353 × 10^-1
Fold 4 | EDO-ELM | 0.932378 | 3.2731 × 10^-2 | 1.0710 × 10^-3 | 8.6088 × 10^-2 | 1.5206 × 10^-1
Fold 4 | HHO-ELM | 0.902305 | 3.4913 × 10^-2 | 1.2190 × 10^-3 | 7.7119 × 10^-2 | 1.6022 × 10^-1
Fold 4 | JAYA-ELM | 0.942135 | 2.8082 × 10^-2 | 7.8900 × 10^-4 | 7.8125 × 10^-2 | 1.3091 × 10^-1
Fold 4 | PLO-ELM | 0.980638 | 1.5543 × 10^-2 | 2.4200 × 10^-4 | 3.3148 × 10^-2 | 7.1330 × 10^-2
Fold 4 | SCA-ELM | 0.98093 | 1.5425 × 10^-2 | 2.3800 × 10^-4 | 3.4529 × 10^-2 | 7.0788 × 10^-2
Fold 4 | RBMO-ELM | 0.996448 | 6.9570 × 10^-3 | 4.8000 × 10^-5 | 1.6825 × 10^-2 | 3.2432 × 10^-2
Fold 4 | ELM | 0.938749 | 2.7644 × 10^-2 | 7.6400 × 10^-4 | 1.0055 × 10^-1 | 1.2687 × 10^-1
Fold 5 | AO-ELM | 0.918109 | 3.4764 × 10^-2 | 1.2090 × 10^-3 | 7.3694 × 10^-2 | 1.6327 × 10^-1
Fold 5 | EDO-ELM | 0.880042 | 4.0124 × 10^-2 | 1.6100 × 10^-3 | 9.7470 × 10^-2 | 1.8370 × 10^-1
Fold 5 | HHO-ELM | 0.873904 | 4.3138 × 10^-2 | 1.8610 × 10^-3 | 1.1263 × 10^-2 | 2.0260 × 10^-1
Fold 5 | JAYA-ELM | 0.974952 | 1.8887 × 10^-2 | 3.5700 × 10^-3 | 4.7962 × 10^-2 | 8.8049 × 10^-2
Fold 5 | PLO-ELM | 0.986998 | 1.3852 × 10^-2 | 1.9200 × 10^-4 | 3.2134 × 10^-2 | 6.5057 × 10^-2
Fold 5 | SCA-ELM | 0.984467 | 1.5141 × 10^-2 | 2.2900 × 10^-4 | 3.6857 × 10^-2 | 7.1108 × 10^-2
Fold 5 | RBMO-ELM | 0.997483 | 5.9870 × 10^-3 | 3.6000 × 10^-5 | 1.9211 × 10^-2 | 2.7911 × 10^-2
Fold 5 | ELM | 0.94044 | 2.9648 × 10^-2 | 8.7900 × 10^-4 | 9.0713 × 10^-2 | 1.3924 × 10^-1
Bold means the best value.
Table 4. Test Results with 5-fold Cross Validation of the RBMO-ELM.
Cross Validation | Model | R2 | RMSE | MSE | ME | RAE
Fold 1 | AO-ELM | 0.900711 | 3.7251 × 10^-2 | 1.3880 × 10^-3 | 8.1431 × 10^-2 | 1.7438 × 10^-1
Fold 1 | EDO-ELM | 0.976232 | 1.9915 × 10^-2 | 3.9700 × 10^-4 | 4.0401 × 10^-2 | 9.3899 × 10^-2
Fold 1 | HHO-ELM | 0.878359 | 4.1231 × 10^-2 | 1.7000 × 10^-3 | 6.8421 × 10^-2 | 1.9301 × 10^-1
Fold 1 | JAYA-ELM | 0.912735 | 2.8869 × 10^-2 | 8.3300 × 10^-4 | 5.8517 × 10^-2 | 1.3188 × 10^-1
Fold 1 | PLO-ELM | 0.981300 | 1.6166 × 10^-2 | 2.6100 × 10^-4 | 3.7234 × 10^-2 | 7.5677 × 10^-2
Fold 1 | SCA-ELM | 0.981557 | 1.6055 × 10^-2 | 2.5800 × 10^-4 | 3.7685 × 10^-2 | 7.5155 × 10^-2
Fold 1 | RBMO-ELM | 0.991548 | 8.9840 × 10^-3 | 8.1000 × 10^-5 | 2.4345 × 10^-2 | 4.1042 × 10^-2
Fold 1 | ELM | 0.943023 | 2.8219 × 10^-2 | 7.9600 × 10^-4 | 6.7522 × 10^-2 | 1.3210 × 10^-1
Fold 2 | AO-ELM | 0.887378 | 3.5240 × 10^-2 | 1.2420 × 10^-3 | 7.2771 × 10^-2 | 1.4569 × 10^-1
Fold 2 | EDO-ELM | 0.915493 | 3.8588 × 10^-2 | 1.4890 × 10^-3 | 8.2820 × 10^-2 | 1.6928 × 10^-1
Fold 2 | HHO-ELM | 0.927116 | 2.8349 × 10^-2 | 8.0400 × 10^-4 | 5.9033 × 10^-2 | 1.1720 × 10^-1
Fold 2 | JAYA-ELM | 0.884011 | 4.2803 × 10^-2 | 1.8320 × 10^-3 | 9.9881 × 10^-2 | 2.2885 × 10^-1
Fold 2 | PLO-ELM | 0.979390 | 1.5075 × 10^-2 | 2.2700 × 10^-4 | 3.1758 × 10^-2 | 6.2323 × 10^-2
Fold 2 | SCA-ELM | 0.979963 | 1.4864 × 10^-2 | 2.2100 × 10^-4 | 2.6537 × 10^-2 | 6.1450 × 10^-2
Fold 2 | RBMO-ELM | 0.993066 | 1.0465 × 10^-2 | 1.1000 × 10^-4 | 3.1616 × 10^-2 | 5.5953 × 10^-2
Fold 2 | ELM | 0.951569 | 2.3109 × 10^-2 | 5.3400 × 10^-4 | 5.8281 × 10^-2 | 9.5537 × 10^-2
Fold 3 | AO-ELM | 0.840602 | 3.9643 × 10^-2 | 1.5720 × 10^-3 | 7.4635 × 10^-2 | 2.0174 × 10^-1
Fold 3 | EDO-ELM | 0.935282 | 2.4409 × 10^-2 | 5.9600 × 10^-4 | 4.1236 × 10^-2 | 1.0799 × 10^-1
Fold 3 | HHO-ELM | 0.945644 | 2.3150 × 10^-2 | 5.3600 × 10^-4 | 4.8822 × 10^-2 | 1.1781 × 10^-1
Fold 3 | JAYA-ELM | 0.908178 | 3.0144 × 10^-2 | 9.0900 × 10^-4 | 5.2873 × 10^-2 | 1.2573 × 10^-1
Fold 3 | PLO-ELM | 0.968510 | 1.7620 × 10^-2 | 3.1000 × 10^-4 | 4.0261 × 10^-2 | 8.9669 × 10^-2
Fold 3 | SCA-ELM | 0.968746 | 1.7554 × 10^-2 | 3.0800 × 10^-4 | 3.0425 × 10^-2 | 8.9332 × 10^-2
Fold 3 | RBMO-ELM | 0.991675 | 9.0760 × 10^-3 | 8.2000 × 10^-5 | 1.6118 × 10^-2 | 3.7857 × 10^-2
Fold 3 | ELM | 0.884659 | 3.3723 × 10^-2 | 1.1370 × 10^-3 | 9.8628 × 10^-2 | 1.7161 × 10^-1
Fold 4 | AO-ELM | 0.848498 | 5.5825 × 10^-2 | 3.1160 × 10^-3 | 1.2691 × 10^-1 | 2.8611 × 10^-1
Fold 4 | EDO-ELM | 0.857874 | 3.6160 × 10^-2 | 1.3080 × 10^-3 | 9.1723 × 10^-2 | 1.7492 × 10^-1
Fold 4 | HHO-ELM | 0.918573 | 4.0926 × 10^-2 | 1.6750 × 10^-3 | 9.7528 × 10^-2 | 2.0975 × 10^-1
Fold 4 | JAYA-ELM | 0.944426 | 3.1634 × 10^-2 | 1.0010 × 10^-3 | 6.7481 × 10^-2 | 1.5076 × 10^-1
Fold 4 | PLO-ELM | 0.989706 | 1.4552 × 10^-2 | 2.1200 × 10^-4 | 3.1934 × 10^-2 | 7.4579 × 10^-2
Fold 4 | SCA-ELM | 0.989921 | 1.4399 × 10^-2 | 2.0700 × 10^-4 | 3.3970 × 10^-2 | 7.3795 × 10^-2
Fold 4 | RBMO-ELM | 0.998143 | 5.7830 × 10^-3 | 3.3000 × 10^-5 | 1.2794 × 10^-2 | 2.7561 × 10^-2
Fold 4 | ELM | 0.940644 | 3.4942 × 10^-2 | 1.2210 × 10^-3 | 7.4547 × 10^-2 | 1.7908 × 10^-1
Fold 5 | AO-ELM | 0.951941 | 2.5601 × 10^-2 | 6.5500 × 10^-4 | 5.1078 × 10^-2 | 1.1833 × 10^-1
Fold 5 | EDO-ELM | 0.833062 | 5.4056 × 10^-2 | 2.9220 × 10^-3 | 1.2144 × 10^-2 | 2.8049 × 10^-1
Fold 5 | HHO-ELM | 0.890743 | 3.8601 × 10^-2 | 1.4900 × 10^-3 | 9.5016 × 10^-2 | 1.7842 × 10^-1
Fold 5 | JAYA-ELM | 0.966422 | 2.2964 × 10^-2 | 5.2700 × 10^-4 | 5.1359 × 10^-2 | 1.0942 × 10^-1
Fold 5 | PLO-ELM | 0.975741 | 1.8189 × 10^-2 | 3.3100 × 10^-4 | 3.1967 × 10^-2 | 8.4070 × 10^-2
Fold 5 | SCA-ELM | 0.977301 | 1.7595 × 10^-2 | 3.1000 × 10^-4 | 3.3996 × 10^-2 | 8.1323 × 10^-2
Fold 5 | RBMO-ELM | 0.987822 | 1.3829 × 10^-2 | 1.9100 × 10^-4 | 4.3513 × 10^-2 | 6.5896 × 10^-2
Fold 5 | ELM | 0.939254 | 2.8783 × 10^-2 | 8.2800 × 10^-4 | 5.7253 × 10^-2 | 1.3304 × 10^-1
Bold means the best value.
Table 5. Training Results of 20 independent runs of the RBMO-ELM.
Metric | Statistic | AO-ELM | EDO-ELM | HHO-ELM | JAYA-ELM | PLO-ELM | SCA-ELM | RBMO-ELM | ELM
R2 | AVG | 0.934629 | 0.935460 | 0.937513 | 0.936696 | 0.984792 | 0.984936 | 0.997847 | 0.938278
R2 | STD | 2.9454 × 10^-2 | 2.7098 × 10^-2 | 2.5185 × 10^-2 | 2.1303 × 10^-2 | 8.8751 × 10^-4 | 1.0810 × 10^-3 | 4.1445 × 10^-4 | 2.2204 × 10^-16
R2 | BEST | 0.981044 | 0.968894 | 0.974115 | 0.983935 | 0.986439 | 0.987507 | 0.998519 | 0.938278
RMSE | AVG | 2.9797 × 10^-2 | 2.8521 × 10^-2 | 2.9284 × 10^-2 | 3.1872 × 10^-2 | 1.4738 × 10^-2 | 1.4665 × 10^-2 | 5.9545 × 10^-3 | 2.9703 × 10^-2
RMSE | STD | 6.8248 × 10^-3 | 5.7428 × 10^-3 | 5.9716 × 10^-3 | 6.0272 × 10^-3 | 4.3167 × 10^-4 | 5.3699 × 10^-4 | 5.7322 × 10^-4 | 0
RMSE | BEST | 1.6461 × 10^-2 | 2.0198 × 10^-2 | 1.9236 × 10^-2 | 1.6340 × 10^-2 | 1.3923 × 10^-4 | 1.3364 × 10^-2 | 4.9610 × 10^-3 | 2.9703 × 10^-2
MSE | AVG | 9.3445 × 10^-4 | 8.4655 × 10^-4 | 8.9325 × 10^-4 | 1.0521 × 10^-3 | 2.1735 × 10^-4 | 2.1540 × 10^-4 | 3.5750 × 10^-5 | 8.8200 × 10^-4
MSE | STD | 4.2106 × 10^-4 | 3.5542 × 10^-4 | 3.6003 × 10^-4 | 3.5410 × 10^-4 | 1.2615 × 10^-5 | 1.5347 × 10^-5 | 6.8984 × 10^-6 | 5.4210 × 10^-19
MSE | BEST | 2.7100 × 10^-4 | 4.0800 × 10^-4 | 3.7000 × 10^-4 | 2.6700 × 10^-4 | 1.9400 × 10^-4 | 1.7900 × 10^-4 | 2.5000 × 10^-5 | 8.8200 × 10^-4
ME | AVG | 8.3745 × 10^-2 | 6.9831 × 10^-2 | 7.5297 × 10^-2 | 8.1140 × 10^-2 | 3.5662 × 10^-2 | 3.3720 × 10^-2 | 1.9722 × 10^-2 | 9.6242 × 10^-2
ME | STD | 2.6277 × 10^-2 | 2.2064 × 10^-2 | 1.7958 × 10^-2 | 2.4238 × 10^-2 | 2.3099 × 10^-2 | 2.1992 × 10^-3 | 4.4385 × 10^-3 | 2.7756 × 10^-17
ME | BEST | 4.1000 × 10^-2 | 3.7874 × 10^-2 | 3.8928 × 10^-2 | 4.2437 × 10^-2 | 3.0536 × 10^-2 | 2.9552 × 10^-2 | 1.4964 × 10^-2 | 9.6242 × 10^-2
RAE | AVG | 1.4115 × 10^-1 | 1.3347 × 10^-1 | 1.3872 × 10^-1 | 1.4818 × 10^-1 | 6.9815 × 10^-2 | 6.9466 × 10^-2 | 2.7684 × 10^-2 | 1.4071 × 10^-1
RAE | STD | 3.2329 × 10^-2 | 2.6874 × 10^-2 | 2.8288 × 10^-2 | 2.8022 × 10^-2 | 2.0446 × 10^-3 | 2.5440 × 10^-3 | 2.6649 × 10^-3 | 2.7756 × 10^-17
RAE | BEST | 7.7977 × 10^-2 | 9.4517 × 10^-2 | 9.1121 × 10^-2 | 7.5971 × 10^-2 | 6.5954 × 10^-2 | 6.3304 × 10^-2 | 2.3064 × 10^-2 | 1.4071 × 10^-1
Bold means the best value.
Table 6. Test Results of 20 independent runs of the RBMO-ELM.
Metric | Statistic | AO-ELM | EDO-ELM | HHO-ELM | JAYA-ELM | PLO-ELM | SCA-ELM | RBMO-ELM | ELM
R2 | AVG | 0.904820 | 0.950116 | 0.909230 | 0.874945 | 0.980525 | 0.979919 | 0.989096 | 0.943142
R2 | STD | 4.4333 × 10^-2 | 2.4488 × 10^-2 | 3.9990 × 10^-2 | 5.0991 × 10^-2 | 2.0611 × 10^-3 | 3.3931 × 10^-3 | 2.8251 × 10^-3 | 2.2204 × 10^-16
R2 | BEST | 0.974666 | 0.976549 | 0.967505 | 0.961818 | 0.984377 | 0.984174 | 0.993547 | 0.943142
RMSE | AVG | 3.6839 × 10^-2 | 2.8963 × 10^-2 | 3.6112 × 10^-2 | 3.3952 × 10^-2 | 1.7115 × 10^-2 | 1.7348 × 10^-2 | 1.0178 × 10^-2 | 2.9284 × 10^-2
RMSE | STD | 8.8579 × 10^-3 | 6.6741 × 10^-3 | 8.0632 × 10^-3 | 7.3024 × 10^-3 | 9.0292 × 10^-4 | 1.3856 × 10^-3 | 1.2473 × 10^-3 | 1.3878 × 10^-17
RMSE | BEST | 1.9548 × 10^-2 | 2.0378 × 10^-2 | 2.2138 × 10^-2 | 1.9190 × 10^-2 | 1.5350 × 10^-2 | 1.5450 × 10^-2 | 7.8890 × 10^-3 | 2.9284 × 10^-2
MSE | AVG | 1.4355 × 10^-3 | 8.8330 × 10^-4 | 1.3690 × 10^-3 | 1.2060 × 10^-3 | 2.9380 × 10^-4 | 3.0285 × 10^-4 | 1.0515 × 10^-4 | 8.5800 × 10^-4
MSE | STD | 6.6860 × 10^-4 | 4.3352 × 10^-4 | 6.0309 × 10^-4 | 4.9182 × 10^-4 | 3.1088 × 10^-5 | 5.1108 × 10^-5 | 2.7275 × 10^-5 | 1.0842 × 10^-19
MSE | BEST | 3.8200 × 10^-4 | 4.1500 × 10^-4 | 4.9000 × 10^-4 | 3.6800 × 10^-4 | 2.3600 × 10^-4 | 2.3900 × 10^-4 | 6.2000 × 10^-5 | 8.5800 × 10^-4
ME | AVG | 8.4552 × 10^-2 | 6.1373 × 10^-2 | 7.6927 × 10^-2 | 8.1366 × 10^-2 | 3.8515 × 10^-2 | 3.8256 × 10^-2 | 2.5465 × 10^-2 | 6.8495 × 10^-2
ME | STD | 2.6365 × 10^-2 | 1.4616 × 10^-2 | 1.9383 × 10^-2 | 2.4915 × 10^-2 | 2.1780 × 10^-3 | 5.6819 × 10^-3 | 5.9898 × 10^-3 | 1.3878 × 10^-17
ME | BEST | 4.3945 × 10^-2 | 3.5855 × 10^-2 | 3.9776 × 10^-2 | 4.1733 × 10^-2 | 3.2816 × 10^-2 | 3.1578 × 10^-2 | 1.6918 × 10^-2 | 6.8495 × 10^-2
RAE | AVG | 1.6802 × 10^-1 | 1.3572 × 10^-1 | 1.6471 × 10^-1 | 1.6156 × 10^-1 | 7.8061 × 10^-2 | 7.9124 × 10^-2 | 4.8434 × 10^-2 | 1.3357 × 10^-1
RAE | STD | 4.0401 × 10^-2 | 3.1275 × 10^-2 | 3.6777 × 10^-2 | 3.4749 × 10^-2 | 4.1182 × 10^-3 | 6.3198 × 10^-3 | 5.9354 × 10^-3 | 2.7756 × 10^-17
RAE | BEST | 8.9157 × 10^-2 | 9.5493 × 10^-2 | 1.0097 × 10^-1 | 9.1313 × 10^-2 | 7.0014 × 10^-2 | 7.0467 × 10^-2 | 3.7541 × 10^-2 | 1.3357 × 10^-1
Bold means the best value.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
