Article

Deep Learning for Sustainable Finance: Robust ESG Index Forecasting in an Emerging Market Context

by Umawadee Detthamrong 1, Rapeepat Klangbunrueang 2, Wirapong Chansanam 2,* and Rasita Dasri 1
1 College of Local Administration, Khon Kaen University, Khon Kaen 40002, Thailand
2 Department of Information Science, Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen 40002, Thailand
* Author to whom correspondence should be addressed.
Sustainability 2026, 18(1), 110; https://doi.org/10.3390/su18010110
Submission received: 4 December 2025 / Revised: 19 December 2025 / Accepted: 19 December 2025 / Published: 22 December 2025

Abstract

Sustainable finance increasingly relies on Environmental, Social, and Governance (ESG) data, yet forecasting ESG-based stock indices remains challenging in an emerging-market context. Taking Thailand as a representative case, and because historical information is limited, this study constructs a realistic simulated SET ESG Index using free-float-adjusted market capitalization and semiannual rebalancing rules that reflect the methodology of the Stock Exchange of Thailand. With this index as the forecasting target, the study compares traditional statistical time series models (ARIMA, SARIMA, SARIMAX) with seven deep learning architectures (RNN, GRU, LSTM, DF-RNN, DeepAR, DSSM, Deep Renewal) to evaluate performance in multi-step (36-day) prediction. Results reveal that deep learning models significantly outperform statistical approaches, with GRU delivering the highest accuracy and the most consistent robustness across reduced-data scenarios. These findings highlight the ability of advanced AI techniques to better capture nonlinear ESG market dynamics. This study provides a replicable modeling pipeline for ESG index forecasting in data-constrained contexts, with practical implications for sustainable investment decision-making, risk management, and market resilience in emerging economies.

1. Introduction

Over the past decade, global sustainability pressures and responsible investment priorities have reshaped capital markets, with more than 80% of institutional investors now integrating environmental, social, and governance (ESG) information into portfolio decisions. Yet, a critical question remains: can ESG-oriented investment strategies reliably deliver sustainable financial performance in an emerging-market context, using Thailand as a representative case? Prior meta-analyses show that strong ESG performance is generally associated with non-negative and often positive financial outcomes, reinforcing long-term competitiveness rather than imposing a cost burden [1,2,3]. Additional studies emphasize heightened benefits in developing economies, including improved reporting integrity and enhanced resilience during shocks such as the COVID-19 crisis [4,5]. Despite this progress, an important gap persists in translating ESG principles into market-aligned index construction and high-frequency predictive analytics, both of which are essential to inform sustainable investment strategies.
ESG indices represent a key instrument for capital allocation by reflecting the financial performance of sustainability-screened firms. However, their ability to serve as reliable investment benchmarks depends heavily on methodological transparency—particularly with respect to free-float-adjusted market capitalization weighting, constituent filtering, and rebalancing frequency [6,7,8]. Although the SET ESG Index exists as Thailand’s primary sustainability benchmark, academic research has yet to replicate its construction rules or utilize the resulting index as a target variable for advanced forecasting techniques. Existing forecasting studies largely focus on broad market indices or individual stocks without explicitly incorporating ESG screens, limiting their relevance for sustainable finance applications.
Meanwhile, advances in artificial intelligence (AI) and deep learning have transformed financial forecasting. Models such as Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and more advanced designs, including DeepAR, Deep State Space Models, and factor-driven recurrent architectures (DF-RNN), demonstrate superior ability to capture nonlinear dependencies and evolving market patterns compared to traditional time-series models like ARIMA and SARIMA [9,10,11,12]. However, most studies assume long, complete data histories, overlooking the scarcity and non-stationarity that typify ESG datasets in an emerging-market setting. Although recent research highlights the importance of data quality and imputation strategies in maintaining forecasting reliability [13,14], there remains limited evidence on how deep learning models perform when training data are meaningfully constrained.
This study addresses these unresolved issues by constructing a realistic, market-rule-based simulated SET ESG Index using daily price data from 2014 to 2025, weighted by free-float-adjusted market capitalization, with semiannual rebalancing and divisor adjustments in accordance with SET indexing standards. The simulated index is used to evaluate two major forecasting approaches: (1) statistical benchmarks (ARIMA, SARIMA, and SARIMAX) and (2) seven state-of-the-art deep learning models (RNN, GRU, LSTM, DF-RNN, DeepAR, DSSM, and Deep Renewal) within a 36-day-ahead multi-step framework. To emulate real-world data constraints, robustness is assessed by reducing the training data proportion to 50% and 25%.
By integrating sustainable index construction with deep learning-based forecasting under varying data conditions, this research contributes to the growing literature at the intersection of ESG finance and AI-driven market analytics. The findings provide insights directly applicable to sustainable asset management, risk assessment, and index product development in emerging markets such as Thailand, advancing both the scholarly understanding and the practical implementation of robust ESG-based financial forecasting.
It is important to clarify the scope of this study’s contribution. The primary objective is not to identify the structural economic determinants of ESG asset pricing, but to evaluate the reliability and robustness of alternative forecasting methodologies when applied to ESG-based indices in data-constrained emerging markets. In this context, predictability is treated as an empirical property of the index time series—arising from institutional features such as ESG screening, index rebalancing rules, persistence in capital allocation, and gradual information diffusion—rather than as evidence of market inefficiency or a specific causal mechanism. Accordingly, the study adopts a comparative modeling perspective to determine which forecasting approaches are most suitable for practical ESG index prediction under realistic data limitations.
This study adopts a methodological comparative research paradigm at the intersection of sustainable finance and financial data science. Its primary objective is to assess the robustness and reliability of alternative forecasting techniques when applied to ESG-based stock indices in this emerging-market application under realistic data constraints. Rather than examining the causal economic mechanisms of ESG asset pricing, the study focuses on identifying which statistical and deep learning models are most suitable for practical ESG index forecasting. Economic and financial theory is employed to contextualize the relevance of ESG indices and to interpret observed predictability in terms of institutional characteristics, index construction rules, and persistence in sustainability-oriented capital allocation. This positioning ensures coherence between the research objective, empirical design, and contribution to the sustainable finance literature. Accordingly, the manuscript first establishes the relevance of ESG indices, then presents a market-aligned index construction methodology, and finally conducts a systematic comparison of forecasting models under varying data availability conditions. While the empirical analysis centers on Thailand, the proposed methodological framework is transferable to other developing economies that face similar ESG data limitations.
For readability, subsequent references to the empirical setting omit repeated qualifiers, as the study consistently refers to Thailand as a representative emerging-market case unless stated otherwise.

2. Literature Review

The literature informing this study can be broadly grouped into three interconnected areas that underpin the development and forecasting of the simulated SET ESG Index. First, extensive empirical research has examined how ESG performance relates to firms’ financial outcomes, with growing evidence suggesting positive or at least non-negative relationships across various market settings [1,2,3]. Second, studies on ESG index construction emphasize that methodological choices—such as free-float adjustments, constituent selection criteria, and rebalancing frequency—directly shape market representation and investment interpretation [6,7,8]. Finally, advancements in financial time-series forecasting highlight the value of both traditional statistical approaches and modern deep learning techniques, while also underscoring the importance of appropriate data preprocessing and missing-value handling to ensure credible predictive performance [11,12,13,14]. Collectively, these strands establish the theoretical and methodological foundation for building a realistic ESG index and evaluating the ability of state-of-the-art forecasting models to capture its dynamics.

2.1. ESG and Financial Performance

A substantial body of research has explored the relationship between ESG performance and firms’ financial outcomes, with growing consensus that ESG contributes to long-term value creation rather than imposing purely compliance-driven costs. A comprehensive meta-analysis of more than 2000 studies shows that roughly 90% of reported relationships between ESG and corporate financial performance (CFP) are non-negative, and the majority are positive, supporting the notion of ESG as a strategic asset that fosters competitive advantage [1]. More recent evidence from 2016 to 2023 continues to align with stakeholder and agency theory, suggesting that strong ESG practices help mitigate conflicts of interest and enhance managerial accountability [2]. However, when examining specific markets and industries, more differentiated patterns emerge. For example, in South Korea, high ESG ratings are positively associated with profitability (ROA), yet short-term stock returns do not respond strongly, implying incomplete incorporation of ESG information into prices in capital markets [15]. In China, ESG strongly influences financial performance in the new energy sector [3]. In contrast, in the healthcare industry, the effect is weaker and often statistically insignificant, reflecting context-dependent and nonlinear dynamics [16]. Several studies highlight indirect mechanisms through which ESG affects firm value: financial performance mediates the ESG–market value relationship [17], and dynamic interdependencies exist between ESG and profitability [18]. Research in other emerging markets reveals even more complex behavior; for instance, a U-shaped association between ESG disclosure and profitability in Egypt indicates that firms with either strong or minimal disclosure outperform those in the middle, while carbon reduction initiatives correlate with higher earnings [5].
Moreover, ESG has been shown to reduce earnings management and strengthen reporting quality, particularly during crises such as the COVID-19 pandemic [4], reinforcing its governance role in enhancing organizational resilience. Taken together, the literature provides clear justification for using ESG-screened stock groups or ESG indices as relevant proxies for sustainable corporate performance, a concept that directly supports the construction of the simulated SET ESG Index used in this study.

2.2. ESG Indices, Index Construction, and Market Dynamics

The construction of a thematic ESG stock index extends beyond simply selecting eligible constituents; it critically depends on the methodological rules applied in index calculation, which can materially influence portfolio characteristics and market interpretation. A prominent trend is the adoption of free-float–adjusted market capitalization to better reflect the portion of shares actively traded. Evidence from India’s Nifty index, for example, shows that transitioning from full-market-cap to free-float weighting substantially reduces the dominance of state-owned enterprises with low tradable shares while strengthening the representation of private firms, thereby demonstrating how index rules can systematically reshape portfolio structure [7]. Similarly, the launch of Sri Lanka’s S&P SL20 index reveals that while the creation of a new index does not always produce abnormal returns, constituent selection and periodic reviews influence market representation and passive investment flows [8]. Additional work highlights that although machine-learning-driven stock selection can enhance in-sample performance, out-of-sample results often deteriorate, underscoring the need to assess model robustness [19,20]. Specific to ESG indexing, the calculation rules of the Stock Exchange of Thailand [6] emphasize free-float weighting, scheduled rebalancing, and divisor adjustments to maintain index continuity.
Meanwhile, emerging studies argue that AI and machine-learning techniques could support ESG index construction—by identifying firms with superior ESG profiles and quantifying sustainability-related financial value—although empirical validation is still evolving [21]. Overall, the literature indicates that methodological decisions, such as weighting schemes, rebalancing frequency, and stock-universe definition, fundamentally shape index behavior and interpretation, particularly under emerging-market data conditions, where data availability and structural constraints require careful calibration. These insights underpin the design of this study’s simulated SET ESG Index, which incorporates free-float–adjusted capitalization, semiannual rebalancing, divisor adjustments, and the constituent universe defined by SET ESG eligibility, thereby ensuring market realism and analytical credibility.

2.3. Financial Time-Series Forecasting Techniques and Data Management

Traditional stock-index forecasting research has long relied on classical statistical time-series models such as ARIMA and SARIMA, which have proven effective for capturing general market trends and thus remain widely used benchmarks for univariate forecasting in financial applications [11,22]. However, as financial markets exhibit nonlinear dynamics and long-range temporal dependencies, recent studies demonstrate that deep learning architectures frequently outperform these conventional models in both accuracy and adaptability. Evidence shows that LSTM networks can effectively model long-term structural relationships in stock price movements [23], while comparative analyses reveal that GRU and CNN architectures often achieve lower error rates than LSTM across various market conditions [9,24,25]. Additional investigations comparing neural architectures—RNN, LSTM, GRU, CNN, and ANN—confirm that model design and feature engineering are critical to unlocking superior predictive capability [26,27,28]. Hybrid and ensemble approaches have emerged to integrate complementary strengths, such as combining LSTM/GRU with Support Vector Regression [29] or coupling decomposition techniques, ARIMA/SARIMA, and multi-feature LSTM into a unified meta-model [30].
At the same time, overfitting remains a common concern, especially for LSTM when trained excessively, highlighting the need for mechanisms such as early stopping and careful hyperparameter tuning [31]. Across this literature, a consistent conclusion is that data preprocessing strategies significantly influence forecasting reliability. Studies comparing imputation techniques show that advanced methods such as MICE or k-nearest neighbors can reduce prediction errors in the presence of missing or anomalous observations [13,14], while anomaly-detection and denoising frameworks have also been proposed to further enhance data quality for learning algorithms [32,33]. Together, these insights reinforce the necessity of combining robust data treatments with advanced deep learning approaches to address the nonlinear, non-stationary characteristics of financial time series, an approach directly embedded in the methodological design of the present study.
Synthesizing the reviewed literature reveals three central insights that directly inform the design and purpose of this study. First, substantial empirical evidence shows that ESG performance is positively related to profitability, firm value, and organizational resilience, though the strength and form of these relationships vary across market and industry contexts [1,2,5,15,17,18]. This supports the use of Thai ESG-screened firms as a credible proxy for sustainable corporate performance in constructing a market-reflective index. Second, research on index construction and portfolio rules demonstrates that methodological decisions (such as free-float weighting, semiannual rebalancing, and the application of divisor adjustments) can materially affect index composition and investment activity, particularly in an emerging-market setting where information and liquidity constraints are prevalent [6,7,8,19,20]. These insights guide the design of a simulated SET ESG Index that adheres closely to real-market calculation standards to ensure interpretive validity and practical relevance. Third, advances in financial time-series forecasting emphasize that while ARIMA and SARIMA remain important benchmark models, deep learning architectures (including RNN, GRU, LSTM, DF-RNN, DeepAR, and DSSM) generally deliver superior performance when nonlinear dynamics are present and data are properly prepared [11,12,13,14,23,27,29,30].
The reviewed literature provides the theoretical foundation for this study by establishing ESG performance as an economically meaningful construct associated with firm value, resilience, and long-term competitiveness, particularly in the Thai market. Rather than operationalizing ESG theory through causal or structural modeling, this study leverages these theoretical insights to justify the construction and forecasting of an ESG-screened equity index as a relevant financial object. In this sense, theory informs the choice of the forecasting target—an ESG-based index reflecting sustainability-oriented capital allocation—while the empirical focus centers on evaluating the robustness of alternative forecasting methodologies under realistic data constraints. This design choice aligns with recent sustainable finance research that treats ESG indices as composite market signals whose dynamics can be analyzed and predicted without imposing restrictive assumptions on firm-level causal mechanisms.
Despite these developments, limited research has examined whether these advanced models retain their advantage under limited data availability—a common challenge in emerging-market ESG datasets. Addressing these gaps, this study constructs a realistic Thai ESG index aligned with the Stock Exchange’s index rules, employs a 36-day multi-step forecasting framework, and systematically compares statistical and deep learning models under varying levels of training data, thereby providing new evidence on both forecasting accuracy and robustness in the context of ESG index prediction in this emerging-market application.

3. Materials and Methods

This study pursues two primary objectives: first, to construct a simulated SET ESG index for the Stock Exchange of Thailand (referred to as the Thai SET ESG Realistic Index), using free-float–adjusted market capitalization weighting and semiannual portfolio rebalancing consistent with the Exchange’s index methodology; and second, to evaluate and compare the forecasting performance of conventional statistical models and advanced deep learning architectures over a 36-trading-day horizon, assessed using both numerical error and directional-accuracy measures. To achieve these aims, the methodological framework is organized around two core components: the preparation of financial data and the development of a realistic ESG index aligned with market principles, followed by a systematic performance comparison between statistical time-series models and deep learning-based forecasting approaches.

3.1. Data Collection and Preliminary Preparation

The stock universe for this study consists of all Thai equities included in the SET ESG Index during the second half of 2025, totaling 108 firms that meet the Stock Exchange of Thailand’s sustainability assessment criteria, consistent with the premise that ESG performance contributes to long-term value enhancement and sustainable development [1,2,3]. Daily closing prices for these firms were sourced from Yahoo Finance using the yfinance Python library (version 0.2.40), covering the period from 1 January 2014 to 31 October 2025 to span multiple market cycles and ensure a sufficiently rich dataset for robust time-series forecasting [11,12]. Market capitalization data were also collected and used to compute free-float–adjusted market weights, following established index-construction practices widely used at global exchanges [6,7].

3.2. Data Cleaning and Handling of Missing Values

To ensure the dataset’s suitability for both index construction and forecasting analysis, several data-cleaning procedures were implemented. First, the continuity of daily closing price data was verified for each firm, and those lacking complete price histories for the full 2014–2025 period were excluded to avoid downward bias in aggregate market valuations caused by data gaps. Second, the index base date was set to 2 January 2014, one of the earliest trading days in the dataset, and any firm without a closing price on that date was removed so that the initial index value would accurately represent prevailing market conditions [6,7]. Third, sporadic missing values were imputed using a forward-fill method, a widely accepted approach for financial time-series data that maintains temporal consistency and minimizes imputation error when the fraction of missing observations is low [13,14]. Finally, the retained securities were checked to ensure that all series had the same number of trading observations, and the dataset was chronologically aligned into a standardized DataFrame structure. Following these filtering steps, approximately 60–70 securities remained, which is sufficient to construct an index that credibly represents Thailand’s ESG-aligned equity market.
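The filtering and imputation steps described above can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the study's actual code: the function name `clean_prices`, the toy tickers, and the `max_missing_frac` threshold that separates "sporadic" gaps from incomplete histories are all hypothetical, since the text does not report a numeric cutoff.

```python
import numpy as np
import pandas as pd


def clean_prices(prices: pd.DataFrame, base_date: str,
                 max_missing_frac: float = 0.05) -> pd.DataFrame:
    """Filter and fill a (dates x tickers) table of daily closing prices.

    max_missing_frac is an illustrative assumption: firms whose share of
    missing observations exceeds it are treated as lacking a complete history.
    """
    base = pd.Timestamp(base_date)
    # Keep only observations from the index base date onward.
    prices = prices.sort_index().loc[base:]
    # Step 1: drop any firm without a closing price on the base date.
    kept = prices.loc[:, prices.loc[base].notna()]
    # Step 2: drop firms with too many gaps over the sample period.
    kept = kept.loc[:, kept.isna().mean() <= max_missing_frac]
    # Step 3: forward-fill the remaining sporadic gaps.
    kept = kept.ffill()
    # Step 4: confirm every retained series is fully observed.
    if kept.isna().any().any():
        raise ValueError("missing values remain after forward fill")
    return kept


# Toy data: six business days starting on the base date, four made-up tickers.
dates = pd.bdate_range("2014-01-02", periods=6)
raw = pd.DataFrame(
    {
        "AAA": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],           # complete
        "BBB": [np.nan, 5.0, 5.0, 5.0, 5.0, 5.0],              # no base-date price
        "CCC": [20.0, 21.0, np.nan, 23.0, 24.0, 25.0],         # one sporadic gap
        "DDD": [30.0, np.nan, np.nan, np.nan, np.nan, 31.0],   # mostly missing
    },
    index=dates,
)
clean = clean_prices(raw, "2014-01-02", max_missing_frac=0.2)
print(clean)
```

In this toy example only AAA and CCC survive, and the single gap in CCC is filled with the previous day's close. The threshold of 0.2 is chosen only so that the small example exercises every branch.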

3.3. Free Float Simulation and Free-Float–Adjusted Market Capitalization Weighting

According to the index calculation methodologies adopted by the SET and other major international benchmarks such as Nifty, index constituents should be weighted using free-float–adjusted market capitalization rather than total outstanding shares, as this approach more accurately reflects the shares that are actively tradable in the market [6,7]. Because firm-level free-float ratios for Thai listed companies are not publicly disclosed in a machine-readable format suitable for large-scale quantitative analysis, this study adopts a simulation-based approximation consistent with international index construction practices. Specifically, free-float ratios are drawn from a uniform distribution bounded between 0.40 and 0.90. This range reflects empirically observed free-float thresholds applied by major exchanges and index providers, including the Stock Exchange of Thailand and comparable emerging markets, where minimum free-float requirements typically range from 15% to 35%. Highly liquid large-cap firms frequently exhibit free-float ratios exceeding 80%. Prior index methodology studies in emerging-market contexts similarly employ bounded stochastic approximations to reflect heterogeneity in tradable share structures when precise firm-level disclosures are unavailable. Importantly, the purpose of this simulation is not to estimate firm-specific free-float levels, but to introduce realistic variation in tradable market capitalization while preserving aggregate index dynamics and rebalancing behavior. Sensitivity analysis confirms that the forecasting performance rankings of the evaluated models remain stable under alternative free-float bounds, indicating that this assumption does not drive the study’s conclusions.
The free-float–adjusted market capitalization of company $i$ is defined as follows:
$$\mathrm{AdjCap}_i = M_i \times F_i,$$
where $M_i$ is the total market capitalization and $F_i$ is the free-float factor of company $i$. The initial weight of each company on the base date is then computed as:
$$w_i = \frac{\mathrm{AdjCap}_i}{\sum_{j=1}^{N} \mathrm{AdjCap}_j},$$
where $N$ is the total number of screened companies, and $\sum_{j=1}^{N} \mathrm{AdjCap}_j$ represents the aggregate free-float–adjusted market capitalization of all constituents.
This approach is consistent with index construction methodologies used in major global stock exchanges [6,7,8].
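As a small numerical illustration of the weighting scheme, the sketch below draws free-float factors from a uniform distribution on [0.40, 0.90], computes the adjusted capitalizations, and normalizes them into weights. The function name `free_float_weights`, the random seed, and the three market-capitalization figures are hypothetical and serve only to make the example reproducible.

```python
import numpy as np


def free_float_weights(market_cap, low=0.40, high=0.90, seed=0):
    """Return simulated free-float factors, adjusted caps, and index weights."""
    rng = np.random.default_rng(seed)           # seed is an assumption
    market_cap = np.asarray(market_cap, dtype=float)
    # F_i ~ Uniform(low, high): simulated free-float factor per company.
    free_float = rng.uniform(low, high, size=market_cap.shape)
    # AdjCap_i = M_i * F_i
    adj_cap = market_cap * free_float
    # w_i = AdjCap_i / sum_j AdjCap_j
    weights = adj_cap / adj_cap.sum()
    return free_float, adj_cap, weights


market_cap = [500.0, 300.0, 200.0]              # made-up values
free_float, adj_cap, weights = free_float_weights(market_cap)
print(free_float, adj_cap, weights, weights.sum())
```

Because the weights are normalized by the aggregate adjusted capitalization, they sum to one regardless of the draws.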

3.4. Simulated SET ESG Index Calculation and Semiannual Rebalancing

The simulated SET ESG Index is assigned a base value of 1000 points on 2 January 2014. The daily index level on each subsequent trading day is calculated using a free-float–adjusted market capitalization–weighted index formula:
$$I_t = \frac{\sum_{i=1}^{N} P_{i,t} \times W_i(t)}{D_t} \times 1000,$$
where
  • $P_{i,t}$ = closing price of stock $i$ on day $t$,
  • $W_i(t)$ = most recently updated weight of stock $i$ on day $t$, and
  • $D_t$ = divisor on day $t$, which ensures index continuity.
To ensure that the index reflects the updated market environment and reduces distortions caused by changes in free float and market capitalization, portfolio rebalancing is performed every six months. On each rebalancing day, the free-float–adjusted market capitalization is recalculated and a new set of index weights $\tilde{W}_i$ is determined. A new divisor $D_t$ is then computed so that the index level does not “jump” if stock prices remain unchanged. Specifically:
$$D_t = \frac{\sum_{i=1}^{N} P_{i,t} \times \tilde{W}_i}{I_{t-1}/1000}.$$
This approach is widely adopted in global equity index construction to maintain index continuity despite changes in constituent membership or weight redistribution [6,7,8].
The resulting dataset is a daily time series that reflects both the price dynamics of ESG-compliant Thai equities and the free-float–adjusted market value structure, thereby closely replicating the characteristics of real-world ESG indices.
It is important to emphasize that the simulated SET ESG Index is not intended to replicate the official index level published by the Stock Exchange of Thailand on a point-by-point basis. Instead, it is designed to replicate the index construction logic and dynamic behavior (including free-float weighting, constituent screening, and scheduled rebalancing) under publicly available information constraints. Because historical daily values of the official SET ESG Index and detailed firm-level free-float data are not publicly accessible for the whole study period, a simulation-based approach is necessary to enable transparent, reproducible, and extensible analysis. This approach is consistent with prior index and portfolio studies in emerging markets, where researchers replicate exchange methodologies to study structural dynamics when official series are unavailable.
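The index and divisor formulas in this section can be traced with a short NumPy sketch. It follows the two formulas as written in the text; the function name `simulate_index`, the dictionary of rebalancing weights keyed by row position, and the toy prices are assumptions made for illustration.

```python
import numpy as np

SCALE = 1000.0  # base value and scaling constant from the index formula


def simulate_index(prices, weights_by_day):
    """Compute index levels from a (T, N) price array.

    weights_by_day maps a row position to the weight vector adopted on that
    day; position 0 (the base date) must be present.
    """
    prices = np.asarray(prices, dtype=float)
    n_days = prices.shape[0]
    weights = np.asarray(weights_by_day[0], dtype=float)
    # Initial divisor chosen so that the base-date level equals 1000.
    divisor = prices[0] @ weights
    index = np.empty(n_days)
    index[0] = SCALE
    for t in range(1, n_days):
        if t in weights_by_day:
            # Rebalancing day: adopt new weights, then reset the divisor
            # D_t = sum_i P_{i,t} * W~_i / (I_{t-1} / 1000).
            weights = np.asarray(weights_by_day[t], dtype=float)
            divisor = (prices[t] @ weights) / (index[t - 1] / SCALE)
        # I_t = sum_i P_{i,t} * W_i(t) / D_t * 1000
        index[t] = (prices[t] @ weights) / divisor * SCALE
    return index


# Two hypothetical stocks over four days; prices are flat between day 1
# and day 2, and the weights change on day 2.
prices = [[10.0, 20.0], [11.0, 20.0], [11.0, 20.0], [12.0, 22.0]]
weights_by_day = {0: [0.5, 0.5], 2: [0.7, 0.3]}
index = simulate_index(prices, weights_by_day)
print(index)
```

With the divisor formula as stated, the index on a rebalancing day equals the previous day's level, so the change in weights on day 2 produces no jump; subsequent price moves are then measured under the new weights.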

3.5. Target Variable Construction and Train–Test Data Splitting

After generating the daily simulated SET ESG Index, the resulting series was structured as a univariate time series aligned with actual trading dates from the Stock Exchange of Thailand, and any residual missing values were removed to prevent disruptions in model learning, consistent with best practices in financial data preprocessing [13,14]. For the deep learning models, training inputs were constructed using a rolling-window scheme, in which sequences of past observations are mapped to future predicted values, enabling multi-step forecasting with a 36-day prediction horizon tailored to practical investment and portfolio-management needs. Examination of the rolling-window statistics reveals temporal variations in the mean and standard deviation, confirming non-stationarity in the index dynamics (Figure 1) and reinforcing the need for flexible sequence-learning architectures capable of capturing evolving market behavior.
Figure 1 illustrates the non-stationary characteristics of the simulated SET ESG Index using rolling statistical measures. The visible variations in both the mean and standard deviation over time indicate structural changes in market conditions and justify the use of advanced forecasting models capable of capturing nonlinear and time-varying patterns. A time-based split was used to divide the dataset into a training set and a test set, rather than random sampling, to avoid future data leakage—an essential concern in time-series forecasting [12,30]. Data from 2014 to 31 October 2023 were allocated to the training set, while data from 1 November 2023 to the end of the study period in 2025 were reserved for the test set.
The test set was used exclusively for evaluating model performance in unseen future periods. This temporal split is clearly illustrated in Figure 2.
Figure 2 presents the temporal separation of the dataset into training and testing subsets to prevent information leakage. The training sample spans January 2014 to October 2023, while the testing period covers November 2023 to October 2025, enabling a realistic evaluation of forecast accuracy under unseen market conditions.
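The time-based split and the rolling-window construction can be illustrated as follows. The look-back length is not reported in this section, so the value of 20 used here is purely an assumption, as are the function name `make_windows` and the synthetic series; only the 36-step horizon and the 31 October 2023 cutoff come from the text.

```python
import numpy as np


def make_windows(series, lookback, horizon=36):
    """Map each window of `lookback` past values to the next `horizon` values."""
    series = np.asarray(series, dtype=float)
    n_samples = series.size - lookback - horizon + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window sizes")
    X = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = np.stack([series[i + lookback:i + lookback + horizon]
                  for i in range(n_samples)])
    return X, y


# Synthetic daily series (calendar days, for simplicity) around the cutoff.
dates = np.arange("2023-09-01", "2024-01-01", dtype="datetime64[D]")
values = np.arange(dates.size, dtype=float)

# Chronological split: everything up to the cutoff is training data.
cutoff = np.datetime64("2023-10-31")
train = values[dates <= cutoff]
test = values[dates > cutoff]

# Windows are built from the training segment only, so no target value
# comes from the test period.
X_train, y_train = make_windows(train, lookback=20, horizon=36)
print(train.size, test.size, X_train.shape, y_train.shape)
```

Building the windows after the split, rather than before it, is one simple way to guarantee that no training target overlaps the held-out period.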

3.6. Statistical Forecasting Models

The statistical forecasting framework in this study relies on three widely adopted time-series models—ARIMA, SARIMA, and SARIMAX—each of which has been shown to perform effectively in predicting stock market indices and asset prices [11,22,34]. Model identification involved estimating the autoregressive, differencing, and moving-average parameters, along with their corresponding seasonal components, and selecting the specification with the lowest Akaike Information Criterion (AIC) to ensure the most efficient fit. To validate model adequacy, residual diagnostics were conducted using standardized residual plots, histograms with density overlays, Q–Q plots, and correlograms, confirming that the residual variation approximated the characteristics of white noise and thus met the assumptions for reliable time-series inference.
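To make the AIC-based selection step concrete, the sketch below scores a small family of candidate orders and keeps the one with the lowest AIC. It is a deliberately simplified stand-in, not the estimation procedure used in the study: it considers only ARIMA(p, 1, 0) models, fits them by conditional least squares on the first-differenced series, and runs on synthetic data. The function name `aic_for_ar_order`, the search range, and the simulated coefficients are assumptions; in practice a dedicated library (for example, the ARIMA and SARIMAX classes in statsmodels) would typically be used to cover moving-average and seasonal terms.

```python
import numpy as np


def aic_for_ar_order(diffed, p, max_p):
    """Gaussian AIC (up to a constant) of an AR(p) model with intercept.

    Every candidate is scored on the same observations (those after the
    first max_p), so AIC values are comparable across orders.
    """
    y = diffed[max_p:]
    n = y.size
    # Column k holds the series lagged by k periods, aligned with y.
    lags = [diffed[max_p - k:diffed.size - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n)] + lags)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    n_params = p + 2  # AR coefficients, intercept, innovation variance
    return n * np.log(rss / n) + 2 * n_params


# Synthetic index level whose daily changes follow an AR(2) process.
rng = np.random.default_rng(42)
n_obs = 2000
eps = rng.normal(0.0, 1.0, n_obs)
change = np.zeros(n_obs)
for t in range(2, n_obs):
    change[t] = 0.6 * change[t - 1] - 0.3 * change[t - 2] + eps[t]
level = 1000.0 + np.cumsum(change)

diffed = np.diff(level)          # first difference (d = 1)
max_p = 5                        # illustrative search range
aic = {p: aic_for_ar_order(diffed, p, max_p) for p in range(max_p + 1)}
best_p = min(aic, key=aic.get)
print(aic, best_p)
```

Scoring all orders on a common sample matters here: if each order used a different number of observations, the likelihood terms would not be comparable and the AIC ranking could be distorted.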
Figure 3 illustrates the standardized residuals from the ARIMA model fitted to the simulated SET ESG Index, demonstrating generally stable behavior around zero throughout most of the estimation period. This pattern suggests that the model captures the index’s primary time-series structure reasonably well. However, the significant spike at the beginning of the sample and the noticeable increase in residual volatility toward the later years indicate periods when the model struggled to fully account for abrupt market-driven fluctuations or structural changes. These deviations imply potential heteroskedasticity and evolving dynamics in the index that a purely linear specification may not adequately model. Thus, while ARIMA offers an acceptable baseline forecast for broad market trends, the residual patterns highlight the need for more adaptive or nonlinear forecasting architectures, particularly during episodes of heightened uncertainty, to more accurately reflect the complex behavior of ESG-linked financial indices under emerging-market data conditions.
Figure 4 shows the distribution of residuals from the ARIMA model fitted to the simulated SET ESG Index. The histogram indicates that most residuals are concentrated near zero, suggesting that the model captures the index’s general level. However, the presence of noticeable tails and a small number of considerable deviations reflect occasional periods of sharp misalignment between the model’s predictions and actual index movements. This pattern suggests that while ARIMA provides a reasonable fit for normal market conditions, it is less effective during episodes of heightened volatility or structural shifts, underscoring the need for more flexible, nonlinear forecasting models for ESG-related financial time series in an emerging-market setting.
Figure 5 presents a Q–Q plot of the ARIMA residuals against a theoretical normal distribution. The blue dots represent the empirical quantiles of the standardized ARIMA residuals, while the red reference line indicates the quantiles expected under normality. The middle portion of the distribution aligns reasonably well with the reference line, but deviations become pronounced in both tails, where extreme positive and negative residuals diverge substantially from normality. This pattern points to heavy-tailed behavior and occasional extreme shocks in the simulated SET ESG Index that the linear ARIMA specification does not fully capture. Consequently, the residuals are not normally distributed, reinforcing the need for models capable of handling the nonlinear dynamics and larger fluctuations typically observed in financial markets with ESG-related components.
Figure 6 displays the autocorrelation function of the ARIMA model residuals. The blue dots represent the sample autocorrelation coefficients at each lag, while the horizontal reference line marks the zero-autocorrelation baseline. Nearly all autocorrelation values fall within the 95% confidence bounds across the examined lags, indicating that the residuals exhibit no systematic serial correlation and that the ARIMA specification removes the majority of linear temporal dependence in the simulated SET ESG Index data. The residuals therefore behave approximately as white noise, supporting the model’s adequacy in capturing short-term linear dynamics, although the earlier diagnostics show that nonlinear behavior and heavy tails remain unaccounted for.
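The correlogram check can be sketched in a few lines. The code below computes sample autocorrelations and flags lags that exceed the approximate 95% white-noise bounds of ±1.96/√n. The function names and the toy residual series are illustrative assumptions, not the study's implementation, and the strongly alternating toy series is chosen deliberately so that the check fails.

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def lags_outside_bounds(x, max_lag):
    """Return the lags whose autocorrelation exceeds +/-1.96/sqrt(n), and the bound."""
    bound = 1.96 / math.sqrt(len(x))
    acf = sample_acf(x, max_lag)
    flagged = [k + 1 for k, r in enumerate(acf) if abs(r) > bound]
    return flagged, bound

# Toy residuals with obvious serial dependence (alternating sign).
residuals = [1.0, -1.0] * 20
flagged, bound = lags_outside_bounds(residuals, max_lag=2)
print(flagged, round(bound, 3))
```

For residuals that behave like white noise, the list of flagged lags should be empty or nearly so, which is the pattern described for Figure 6.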
In the SARIMAX specification, a monthly calendar indicator was incorporated as an exogenous regressor to capture potential calendar-related effects commonly observed in equity markets. Specifically, the indicator was implemented as a set of binary dummy variables representing calendar months, following standard practice in time-series econometric modeling. The inclusion of a monthly structure is motivated by institutional features of the Thai capital market, including periodic portfolio rebalancing, reporting cycles, and recurring liquidity patterns associated with month-end trading behavior, which have been documented in emerging-market equity studies. Preliminary inspection of autocorrelation and partial autocorrelation functions suggested weak but non-negligible seasonal regularities at monthly frequencies. The statistical significance of the monthly indicators was evaluated within the SARIMAX framework, and while some coefficients reached conventional significance levels, their overall contribution to forecast accuracy was modest. Consequently, SARIMAX is retained as a benchmark model rather than a primary forecasting approach, providing a reference point against which the performance gains of deep learning models can be assessed [5,17].
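A minimal sketch of how monthly calendar dummies might be built for use as exogenous regressors is given below. Omitting one month as the baseline category (here January) is an assumption made to avoid perfect collinearity with an intercept; the text does not state which baseline, if any, was used. The helper name `month_dummies` is hypothetical.

```python
from datetime import date

def month_dummies(dates, drop_month=1):
    """Build one 0/1 column per calendar month, omitting `drop_month` as baseline."""
    months = [m for m in range(1, 13) if m != drop_month]
    return [[1 if d.month == m else 0 for m in months] for d in dates]

# Two illustrative trading dates: one in January (baseline), one in March.
rows = month_dummies([date(2024, 1, 15), date(2024, 3, 1)])
print(rows[0])  # all zeros: January is the omitted baseline
print(rows[1])  # a single 1 in the March column
```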

3.7. Statistical Diagnostics and Data Characteristics

To validate the suitability of the simulated SET ESG Index for forecasting analysis, a series of econometric diagnostics were conducted prior to model estimation. First, the Augmented Dickey–Fuller (ADF) and Kwiatkowski–Phillips–Schmidt–Shin (KPSS) tests were applied to assess stationarity. The ADF test failed to reject the unit-root hypothesis (p > 0.05), while the KPSS test rejected the null of stationarity (p < 0.05), indicating that the index behaves as a difference-stationary process, consistent with equity price dynamics in the Thai market [11]. Residual correlogram and Ljung–Box statistics further demonstrated significant serial correlation, while the ARCH test returned a p-value < 0.05, indicating time-varying volatility typical of financial time series. Together, these results justify incorporating differencing in ARIMA-based models and motivate the use of deep learning architectures that can learn nonlinear dependencies and heteroscedasticity.
Table 1 summarizes the results of the stationarity and diagnostic tests, confirming that the simulated SET ESG Index exhibits stylized properties commonly observed in financial time series, including strong persistence, serial dependence, and volatility clustering. The Augmented Dickey–Fuller (ADF) test fails to reject the null hypothesis of a unit root (p = 0.9912), indicating that the simulated SET ESG Index is non-stationary in levels. Consistently, the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test rejects the null hypothesis of trend stationarity (p = 0.0100), further confirming the absence of a stable long-run mean or variance. In addition, the Ljung–Box test reveals highly significant serial correlation (p < 0.001), demonstrating that past index movements exert a strong influence on future dynamics, while the ARCH LM test indicates statistically significant heteroscedasticity (p < 0.001), consistent with time-varying volatility typical of equity markets. Together, these results indicate that the simulated SET ESG Index follows a difference-stationary process, characteristic of financial price series in emerging markets. These properties justify the application of differencing in ARIMA-based models and motivate the use of advanced nonlinear forecasting architectures capable of capturing long-memory behavior, autocorrelation, and dynamic volatility.
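The joint reading of the two tests can be made explicit with a small decision helper. This is a sketch of the usual interpretation logic, in which the ADF null is a unit root and the KPSS null is stationarity; the function name and the wording of the returned labels are assumptions. The p-values passed in are those reported in Table 1.

```python
def classify_stationarity(adf_p, kpss_p, alpha=0.05):
    """Combine ADF (null: unit root) and KPSS (null: stationarity) p-values."""
    adf_rejects_unit_root = adf_p < alpha
    kpss_rejects_stationarity = kpss_p < alpha
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary in levels: difference the series"
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary in levels"
    return "inconclusive: the two tests disagree"

# p-values as reported for the simulated SET ESG Index.
verdict = classify_stationarity(adf_p=0.9912, kpss_p=0.0100)
print(verdict)
```

Both tests point the same way for the reported values, which is why differencing is applied in the ARIMA-family models.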

3.8. Deep Learning Model Configuration

The deep learning models utilized in this study include RNN, GRU, LSTM, DF-RNN, DeepAR, Deep State Space Model (DSSM), and Deep Renewal, all implemented in TensorFlow/Keras and aligned with methodological practices commonly adopted in comparative stock-index forecasting research [9,10,12,23]. Advanced architectures such as DF-RNN, DeepAR, DSSM, and Deep Renewal are designed to capture latent co-movements and probabilistic uncertainty in index trajectories more effectively than conventional recurrent networks [12]. In this study, DeepAR and DSSM were applied in proxy configurations that integrate GRU/LSTM cores with likelihood-based outputs, reflecting practical implementations frequently adopted in real-world financial modeling [27,28,29]. All models were trained using the Adam optimizer and mean squared error loss, with a rolling-window length of 60 trading days and a batch size of 32. Training was capped at 200 epochs, with early stopping based on validation-loss monitoring to mitigate overfitting, which is particularly relevant for LSTM architectures [31]. Hidden-layer configurations were standardized to ensure comparability: RNN, GRU, and LSTM employed 64 recurrent units and 0.2 dropout, whereas DF-RNN, DSSM, and Deep Renewal incorporated deeper structures with increased units and 0.3 dropout to support enhanced representational capability. Collectively, these design decisions balance predictive accuracy and computational efficiency, enabling a fair evaluation of deep learning model robustness under varying data-availability conditions.
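To make the rolling-window setup concrete, the sketch below converts a univariate series into supervised samples with a 60-step input window and a 36-step target. Treating the 36-day horizon as a direct multi-output target is an assumption; the text does not state whether forecasts are produced directly or recursively. The helper name `make_windows` is hypothetical.

```python
def make_windows(series, window=60, horizon=36):
    """Slice a series into (input window, multi-step target) pairs."""
    samples = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        x = series[start:start + window]
        y = series[start + window:start + window + horizon]
        samples.append((x, y))
    return samples

# Toy series of 100 observations: 100 - 60 - 36 + 1 = 5 samples.
series = list(range(100))
samples = make_windows(series)
print(len(samples), len(samples[0][0]), len(samples[0][1]))
```

Each target begins immediately after its input window ends, so no future value leaks into the inputs.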
Table 2 summarizes the hyperparameter configurations employed to ensure a fair, consistent, and reproducible comparison of forecasting performance across all deep learning architectures evaluated in this study. All models were trained using the Adam optimizer with a learning rate of 0.001 and the mean squared error loss function, except for DeepAR and DSSM, which use a Gaussian likelihood-based MSE objective to better represent probabilistic uncertainty in predicted index trajectories. All configurations share a batch size of 32, a maximum of 200 epochs, and an early stopping patience of 20 epochs to mitigate overfitting and control computational cost. The rolling-window input was set to 60 trading days, roughly three months of market activity, consistent with common practice in financial forecasting. Baseline architectures, including RNN, GRU, and LSTM, adopted a single hidden layer of 64 units with a dropout rate of 0.2, whereas advanced structures, such as DF-RNN and Deep Renewal, incorporated multi-layer configurations with increased dropout to enhance generalization. Proxied implementations of DeepAR and DSSM were achieved by embedding GRU/LSTM recurrent cores to ensure stability and applicability to real-world financial data, following practical frameworks in prior research [27,28,29,31]. Overall, the hyperparameter strategy reflects an evidence-based tuning process, guided by preliminary grid search and validation performance, designed to optimize predictive capability while maintaining efficiency and robustness in ESG index forecasting for an emerging-market context.
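The early-stopping rule (patience of 20 epochs, cap of 200 epochs) can be sketched independently of any deep learning framework. The function below is an illustrative assumption, not the study's training loop: given a sequence of validation losses, it reports the epoch at which training would halt and the epoch with the best validation loss.

```python
def early_stopping_epoch(val_losses, patience=20, max_epochs=200):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs pass without a new
    minimum in validation loss, or when `max_epochs` is reached.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return min(len(val_losses), max_epochs), best_epoch

# Toy curve: improves for three epochs, then plateaus at a worse value.
toy_losses = [1.0, 0.8, 0.7] + [0.75] * 30
stop_epoch, best_epoch = early_stopping_epoch(toy_losses)
print(stop_epoch, best_epoch)
```

In this toy case the best epoch is 3 and training halts at epoch 23, that is, 20 epochs after the last improvement.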
This study deliberately adopts a univariate forecasting framework based solely on past index values. This design choice reflects the study’s primary objective of evaluating the robustness of alternative forecasting architectures under data-constrained conditions typical of emerging ESG markets. Incorporating external variables—such as commodity prices, interest rates, political events, or ESG-related news sentiment—would require reliable, high-frequency, and consistently measured covariates, which are often unavailable or incomplete in emerging-market ESG contexts. Moreover, restricting the analysis to univariate inputs ensures a controlled comparison across models, allowing observed performance differences to be attributed to model architecture rather than to heterogeneous information sets.

3.9. Training Procedure and Performance Evaluation

To evaluate robustness under conditions of constrained data availability—an inherent challenge in this emerging-market application—the forecasting models were trained and tested using three different proportions of the historical dataset: 100%, 50%, and 25%. This design supports a systematic examination not only of predictive accuracy but also of the models’ resilience when information is limited. Performance was assessed using a combination of numerical error metrics and directional accuracy indicators, allowing for a comprehensive understanding of how reductions in available data affect forecasting precision and stability across the selected modeling approaches. Any approximation error arising from index simulation affects all forecasting models symmetrically, ensuring that relative performance comparisons and robustness conclusions remain internally valid.
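One way to construct the reduced training sets is sketched below. Keeping the most recent portion of the training sample is an assumption made here for illustration; the text specifies only the proportions (100%, 50%, and 25%), not which observations are retained. The helper name `reduce_training_set` is hypothetical.

```python
def reduce_training_set(train, fraction):
    """Keep the most recent `fraction` of a chronologically ordered training set."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, int(len(train) * fraction))
    return train[-keep:]

# Toy training sample of 200 ordered observations.
train_full = list(range(200))
subsets = {f: reduce_training_set(train_full, f) for f in (1.0, 0.5, 0.25)}
print({f: len(s) for f, s in subsets.items()})
```

Because every subset ends at the same observation, the test period that follows is identical across the three scenarios.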
The evaluation metrics are defined as follows:
Mean Squared Error (MSE) and Root Mean Squared Error (RMSE).
These metrics measure average squared error and its square root and are widely used in stock index forecasting [12,28,29]:
$$\mathrm{MSE} = \frac{1}{T}\sum_{t=1}^{T}\left(y_t - \hat{y}_t\right)^2$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(y_t - \hat{y}_t\right)^2}$$
Mean Absolute Error (MAE).
Applied to statistical models to provide an alternative measure of absolute error less sensitive to outliers [11,22]:
$$\mathrm{MAE} = \frac{1}{T}\sum_{t=1}^{T}\left|y_t - \hat{y}_t\right|$$
Mean Absolute Percentage Error (MAPE).
Defined using predicted values in the denominator to avoid abnormal percentage distortion when actual values approach zero [12,28]:
$$\mathrm{MAPE} = \frac{100}{T}\sum_{t=1}^{T}\left|\frac{y_t - \hat{y}_t}{\hat{y}_t}\right|$$
Percentage of Correct Direction (POCID).
Evaluates the model’s ability to correctly predict the directional movement (up/down) of the index, which is critical for strategic investment decisions [10,27]:
$$\mathrm{POCID} = \frac{100}{T}\sum_{t=1}^{T}\mathbb{1}\left[\left(y_t - y_{t-1}\right)\left(\hat{y}_t - y_{t-1}\right) > 0\right]$$
Theil’s Inequality Coefficient (Theil’s U).
Used to benchmark performance against a random-walk model to determine whether the forecasting approach provides meaningful improvement [5,12]:
$$\text{Theil's } U = \frac{\sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(y_t - \hat{y}_t\right)^2}}{\sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(y_t - y_{t-1}\right)^2}}$$
In this study, Theil’s U is reported in scaled form for comparability across models and data-reduction scenarios. Accordingly, absolute values greater than unity do not indicate inferior performance relative to a naïve random-walk benchmark in isolation, but should be interpreted comparatively across models evaluated under identical conditions.
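The metrics listed above can be sketched in plain Python as follows. Several choices are assumptions rather than confirmed details of the study: Theil's U is written as a ratio of root mean squared errors against a random-walk forecast, POCID and Theil's U are averaged over the T − 1 steps for which a previous actual value exists, and the scaling applied to the reported Theil's U values is not reproduced. The toy series is illustrative only.

```python
import math

def mse(y, yhat):
    return sum((a - f) ** 2 for a, f in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    return math.sqrt(mse(y, yhat))

def mae(y, yhat):
    return sum(abs(a - f) for a, f in zip(y, yhat)) / len(y)

def mape(y, yhat):
    # Predicted values in the denominator, following the definition in the text.
    return 100.0 / len(y) * sum(abs((a - f) / f) for a, f in zip(y, yhat))

def pocid(y, yhat):
    # Share of steps where actual and predicted changes from y[t-1] agree in sign.
    hits = sum(
        1 for t in range(1, len(y))
        if (y[t] - y[t - 1]) * (yhat[t] - y[t - 1]) > 0
    )
    return 100.0 * hits / (len(y) - 1)

def theils_u(y, yhat):
    # Model error relative to a random-walk forecast (yhat_t = y[t-1]).
    num = sum((y[t] - yhat[t]) ** 2 for t in range(1, len(y)))
    den = sum((y[t] - y[t - 1]) ** 2 for t in range(1, len(y)))
    return math.sqrt(num / den)

# Toy actual and predicted values (not from the study).
y = [100.0, 102.0, 101.0, 104.0]
yhat = [100.0, 101.0, 103.0, 103.0]
print(mse(y, yhat), mae(y, yhat), round(pocid(y, yhat), 2), round(theils_u(y, yhat), 4))
```

In unscaled form, a Theil's U below one means the model's errors are smaller than those of the random-walk forecast over the same steps.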
Robustness in this study is defined as the ability of a forecasting model to maintain relatively stable performance rankings and controlled degradation of error metrics (MSE, RMSE, MAPE, POCID, and Theil’s U) as training data availability is progressively reduced from 100% to 50% and 25%. Accordingly, robustness is assessed in relative terms across competing models under identical data constraints, rather than against an absolute performance threshold.

3.10. Summary of Methodology

This study integrates a market-aligned ESG index construction methodology that incorporates free-float–adjusted market capitalization, semiannual rebalancing, and divisor adjustments consistent with international index management standards [6,7,8], together with time-series preprocessing techniques specifically tailored to the statistical properties of financial data, including missing-value treatments informed by prior empirical evidence [13,14]. In parallel, the research conducts a systematic comparison of contemporary forecasting approaches, spanning both traditional time-series models and advanced deep learning architectures under a multi-step 36-day prediction horizon [9,11,12]. This combined methodological design enables the development of a Thai ESG index that closely reflects actual market mechanisms while also providing a rigorous evaluation framework for assessing AI-enabled forecasting performance in the context of emerging financial markets.
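The index mechanics summarized above can be illustrated with a minimal sketch of a free-float-adjusted, capitalization-weighted index with a divisor adjustment at rebalancing. The constituent prices, share counts, free-float factors, and function names are hypothetical; this is a generic illustration of the mechanism, not the study's construction code.

```python
def free_float_cap(prices, shares, free_float):
    """Aggregate free-float-adjusted market capitalization."""
    return sum(p * s * f for p, s, f in zip(prices, shares, free_float))

def index_level(prices, shares, free_float, divisor):
    """Index level = free-float-adjusted capitalization / divisor."""
    return free_float_cap(prices, shares, free_float) / divisor

def rebalance_divisor(old_cap, new_cap, old_divisor):
    """Rescale the divisor so the index level is unchanged by a rebalance."""
    return old_divisor * new_cap / old_cap

# Before the rebalance: two hypothetical constituents.
old = ([10.0, 20.0], [100, 50], [0.5, 0.8])
divisor = 13.0
level_before = index_level(*old, divisor)

# After the rebalance: a third constituent is added at unchanged prices.
new = ([10.0, 20.0, 5.0], [100, 50, 200], [0.5, 0.8, 0.65])
new_divisor = rebalance_divisor(free_float_cap(*old), free_float_cap(*new), divisor)
level_after = index_level(*new, new_divisor)
print(level_before, level_after)
```

The divisor adjustment ensures that changes in membership alone do not move the index; only subsequent price changes do.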

4. Results

This section presents the empirical findings derived from forecasting the simulated SET ESG Index using both statistical and deep learning models. The results are organized first to assess model performance under full-data conditions and then to evaluate robustness as the training dataset size is systematically reduced. Performance is compared using multiple accuracy and directional metrics to determine not only the predictive precision of each model but also its capacity to generalize under data constraints commonly encountered in the studied market. Together, these results provide a comprehensive evaluation of the relative strengths and limitations of traditional and AI-based forecasting approaches for ESG index prediction.

4.1. Results of Statistical Forecasting Models

Table 3 summarizes the performance of the ARIMA, SARIMA, and SARIMAX models in forecasting the simulated SET ESG Index. All three statistical approaches produce comparable error magnitudes: ARIMA yields the lowest MSE (206,896.80), closely followed by SARIMAX and SARIMA, while RMSE values cluster between roughly 454 and 456 index points, MAE values fall around 324–325 points, and MAPE values are near 11–11.5%. These results indicate that although conventional time-series models can reasonably approximate the overall movement of the index and capture its general directional shifts, their accuracy is limited when more precise numerical forecasts are required, particularly in comparison with the deep learning models, which exhibit markedly lower errors, as demonstrated in the following section [11,22,34].
ARIMA was selected as the primary statistical benchmark in this study due to its widespread adoption and proven ability to capture autoregressive dependencies in financial time series while providing interpretable baseline forecasts for stock indices [11,22]. Parameter selection followed the Akaike Information Criterion (AIC), and the final model passed standard diagnostic checks, including residual white noise and the absence of strong serial structure, as confirmed by standardized residual plots and correlograms. Although ARIMA effectively models broad-level index trajectories, its linear structure limits its predictive precision under the nonlinear, high-volatility conditions characteristic of ESG assets in emerging markets, highlighting the need for more adaptive deep learning architectures. The inclusion of monthly calendar effects in the SARIMAX model did not materially improve forecast accuracy relative to ARIMA and SARIMA, further supporting the use of ARIMA as the primary statistical benchmark.

4.2. Deep Learning Models Using 100% Training Data

An assessment of all seven deep learning architectures (RNN, GRU, LSTM, DF-RNN, DeepAR, DSSM, and Deep Renewal) using the complete training dataset (100%) demonstrates that the GRU model provides the most accurate numerical forecasts, achieving the lowest MSE (12,299.00) and RMSE (110.90 index points), along with the smallest MAPE of 3.12%, thereby showing the strongest ability to approximate the actual value of the simulated SET ESG Index with minimal error [9,10,12,23]. The directional accuracy of most models, measured by POCID, falls within a relatively narrow range of approximately 48–51%, which is close to the level of a random directional guess; the principal strength of these deep learning approaches therefore lies in their superior numerical precision rather than their capacity to anticipate market direction. These findings, as summarized in Table 4 and illustrated in Figure 7, highlight the significant performance advantage of deep learning over traditional statistical forecasting techniques in this ESG-focused emerging market context.
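As a quick arithmetic check on the reported figures, the sketch below recomputes RMSE from MSE and expresses the gap between ARIMA (MSE reported in Section 4.1) and GRU as ratios. Only values stated in the text are used; the variable names are illustrative.

```python
import math

# Values as reported in the text for the full-data scenario.
arima_mse = 206896.80
gru_mse = 12299.00
gru_rmse_reported = 110.90

arima_rmse = math.sqrt(arima_mse)   # about 454.86 index points
gru_rmse = math.sqrt(gru_mse)       # about 110.90 index points

mse_ratio = arima_mse / gru_mse     # ARIMA MSE relative to GRU MSE
rmse_ratio = arima_rmse / gru_rmse  # same comparison on the RMSE scale

print(round(arima_rmse, 2), round(gru_rmse, 2))
print(round(mse_ratio, 1), round(rmse_ratio, 1))
```

On these figures, ARIMA's MSE is roughly 17 times that of GRU, which corresponds to an RMSE about four times larger, since RMSE is the square root of MSE.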
It is important to note a divergence between numerical forecasting accuracy and directional predictability. While GRU and related deep learning models achieve substantially lower MSE, RMSE, and MAPE values, their Percentage of Correct Direction (POCID) remains only marginally above 50%, indicating limited ability to consistently predict short-term index direction. This pattern suggests that the models are more effective at approximating index levels than at anticipating directional turning points, a phenomenon commonly observed in financial time-series forecasting.
Compared with the statistical models, even the least effective deep learning model, RNN, achieves substantially lower MSE and MAPE values than the ARIMA-based approaches, reinforcing the superior ability of neural network architectures to learn nonlinear and non-stationary dynamics typical of ESG-driven equity movements [9,12,23]. As shown in Table 4, GRU, DF-RNN, and Deep Renewal consistently exhibit the lowest numerical forecasting errors (MSE, RMSE, and MAPE) and Theil’s U values among the deep learning models evaluated, despite showing only marginal improvements in directional accuracy (POCID) relative to their counterparts. These three architectures are therefore the most reliable performers under the full-information training scenario. To further examine model robustness under realistic constraints frequently encountered in the Thai market, these three models were retrained using progressively reduced training sets of 50% and 25%, while holding the test period and preprocessing procedures constant. The results presented in Table 5, Table 6, and Figure 8 illustrate the degree to which each model’s error metrics deteriorate as training data availability declines, providing insight into their resilience and practical suitability for ESG index forecasting.

4.3. Deep Learning Models Using 50% Training Data

When the training set was reduced to 50%, the GRU network continued to perform best across all evaluation metrics, as shown in Table 5. GRU achieved the lowest MSE (11,820.39) and an RMSE of 108.72 index points, while also producing the smallest MAPE (3.06%) and the lowest Theil’s U value (11.01), confirming its numerical accuracy under more restrictive data availability. Additionally, GRU recorded the highest POCID value (53.09%), indicating that it forecasts index levels more precisely than the other models and also predicts the direction of index movements somewhat more often, although its margin over 50% remains modest.

4.4. Deep Learning Models Using 25% Training Data

When the training set was further reduced to just 25%, the divergence in model performance became even more apparent. As reported in Table 6, GRU remained the most accurate architecture, yielding the lowest MSE (59,050.45) and RMSE (243.00 index points), as well as the smallest MAPE (6.43%) and the lowest Theil’s U value (42.61). Notably, GRU sustained a POCID of 53.09%, indicating that it continued to provide comparatively stronger directional prediction capabilities than DF-RNN and Deep Renewal despite the significant reduction in available training information. This sustained superior performance highlights GRU’s robustness in handling constrained data environments, a characteristic particularly relevant for financial forecasting in this emerging-market application, where historical ESG index records are often limited.
Although absolute Theil’s U values increase for all models under extreme data reduction, GRU exhibits the smallest relative deterioration across evaluation metrics, supporting its classification as the most robust architecture within the comparative framework adopted in this study.
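The notion of relative deterioration can be made concrete using the GRU values reported in the text. The sketch below expresses each reduced-data result as a percentage change from the full-data baseline; the helper name `relative_change` is an assumption, and the corresponding figures for DF-RNN and Deep Renewal (Tables 5 and 6) would be processed in the same way.

```python
def relative_change(baseline, value):
    """Percentage change of `value` relative to `baseline`."""
    return 100.0 * (value - baseline) / baseline

# GRU results as reported in the text, keyed by training-data percentage.
gru_mse = {100: 12299.00, 50: 11820.39, 25: 59050.45}
gru_mape = {100: 3.12, 50: 3.06, 25: 6.43}

mse_change = {k: relative_change(gru_mse[100], v) for k, v in gru_mse.items()}
mape_change = {k: relative_change(gru_mape[100], v) for k, v in gru_mape.items()}

for k in (50, 25):
    print(k, round(mse_change[k], 2), round(mape_change[k], 2))
```

On these figures, GRU's MSE is about 4% lower at 50% of the data than at 100%, and it rises by roughly 380% at 25%, while MAPE approximately doubles. The robustness claim is therefore relative to the other models rather than absolute.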
Figure 8 illustrates the training and validation learning curves for the three best-performing deep learning models—GRU, DF-RNN, and Deep Renewal—under full training data (100%) and reduced data conditions (50% and 25%). Across all scenarios, training losses decrease rapidly within the initial epochs, demonstrating efficient pattern learning from the input sequences. Validation losses closely track training losses, with no substantial divergence over time, indicating that overfitting is effectively controlled through early stopping and dropout regularization. GRU exhibits the most stable convergence behavior, maintaining consistently low validation losses even when trained with only 25% of the data, reflecting strong robustness to data scarcity. DF-RNN and Deep Renewal also achieve satisfactory convergence, although their curves reveal slightly higher variability and slower stabilization under reduced data availability, particularly in the 25% training condition. These results collectively demonstrate that the selected architectures generalize well across varying data environments, confirming model reliability and supporting the conclusion that GRU provides the most resilient forecasting performance for the simulated SET ESG Index in an emerging-market setting.
In sum, learning curves show consistently lower validation loss for GRU than for DF-RNN and Deep Renewal, confirming superior generalization performance. The early-stopping criterion effectively prevents overfitting, particularly in LSTM-derived architectures.
When the results are visualized across all three training-data conditions, GRU maintains comparatively low forecasting errors and more stable Theil’s U values than the other deep learning models, reinforcing its ability to preserve predictive performance even as training data become increasingly limited. As illustrated in Figure 9, this consistent advantage demonstrates the robustness of GRU under realistic data-scarcity scenarios typical of emerging markets and highlights its suitability for forecasting ESG-linked financial indices where long historical series are often unavailable.
Figure 9 illustrates how the forecasting performance of the three best-performing deep learning models responds to reductions in training data, demonstrating the GRU architecture’s superior robustness. Across all four evaluation metrics—MSE, RMSE, MAPE, and Theil’s U—GRU consistently maintains lower error values than DF-RNN and Deep Renewal, even when the available training data declines from 100% to 50% and 25%. In contrast, DF-RNN and Deep Renewal show marked increases in error when training data become scarce, indicating greater dependence on extensive historical information to achieve accurate predictions. The GRU model’s comparatively stable performance highlights its resilience under realistic data-constrained conditions often encountered in the studied market, reinforcing its suitability for ESG index forecasting, where long, complete observation histories are not always available.

5. Discussion

This study set out to develop a realistic simulated SET ESG Index and evaluate the forecasting performance of traditional statistical models and contemporary deep learning architectures under multi-step prediction settings. The empirical results demonstrate that deep learning models significantly outperform their statistical counterparts in predicting the future trajectory of the ESG equity index. In particular, GRU, DF-RNN, and Deep Renewal exhibit markedly lower MSE, RMSE, and MAPE values while maintaining more favorable Theil’s U statistics, illustrating that deep learning methods capture nonlinear temporal dependencies that statistical time-series models fail to replicate. These findings reinforce existing evidence that advanced neural networks often surpass linear benchmarks in financial forecasting tasks where market dynamics are non-stationary and influenced by complex behavioral drivers [9,11,12,14].
From an economic perspective, the observed predictability of the simulated SET ESG Index does not necessarily imply exploitable inefficiencies, but rather reflects structural and institutional characteristics of ESG-oriented equity portfolios. ESG indices typically exhibit higher persistence due to stable constituent selection, periodic rather than continuous rebalancing, and long-term investment horizons of sustainability-oriented investors. These features can generate smoother dynamics and longer memory than broad market indices, thereby enhancing forecastability over short- to medium-term horizons. Deep learning models, particularly GRU, appear well-suited to capturing such persistence and nonlinear adjustment patterns without requiring explicit specification of underlying economic drivers. Thus, while this study does not seek to disentangle causal mechanisms, it provides empirical evidence that ESG index dynamics possess a learnable temporal structure that data-driven models can reliably capture.
A central contribution of this research lies in its robustness analysis. Contrary to much of the prior work that assumes sufficiently long and complete datasets, this study explicitly tests model stability across multiple training data scenarios to reflect real-world ESG data environments in an emerging-market context, using Thailand as a representative case, where data scarcity persists due to evolving disclosure practices and inconsistencies in sustainability reporting [2,15]. The GRU model retains strong performance even when the training set is reduced by 75%, achieving consistent directional accuracy and acceptable proportional error. Such resilience suggests that GRU is better equipped to handle inherent data limitations than other architectures, making it a promising tool for supporting sustainable investment strategies in markets that are still transitioning toward comprehensive ESG data infrastructures. Additional sensitivity checks using narrower free-float bounds (e.g., 0.50–0.85) yielded consistent relative model performance, further supporting the robustness of the proposed forecasting framework.
The observed discrepancy between strong numerical accuracy and modest directional accuracy has important practical implications. Low MAPE values indicate that GRU produces forecasts that closely track the magnitude of index movements, which is valuable for risk assessment, portfolio valuation, and scenario analysis. However, the relatively weak POCID performance suggests that these forecasts should not be interpreted as reliable short-term trading signals. Instead, the primary utility of GRU-based forecasts lies in medium-horizon planning and quantitative risk management, rather than directional market timing.
While multivariate forecasting frameworks that integrate macroeconomic indicators, commodity prices, or sentiment measures can enhance predictive accuracy in developed markets, their effectiveness depends critically on data quality and availability. In emerging markets, ESG-relevant external variables are often noisy, delayed, or inconsistently defined, introducing additional sources of estimation error. By focusing on univariate dynamics, this study provides a conservative assessment of model capability. It demonstrates that deep learning architectures—particularly GRU—can extract meaningful temporal structure from ESG index series even in the absence of rich external information.
These findings logically build on the Introduction, which emphasized the growing importance of ESG information for long-term value creation and market resilience, particularly in developing economies where ESG integration is accelerating [1,4,5]. By demonstrating that reliable ESG index forecasting can be achieved despite data constraints, this study directly addresses the knowledge gap identified at the outset regarding the applicability of deep learning to emerging ESG markets. The results indicate that AI-driven forecasting can contribute to more informed capital allocation, improved risk management, and enhanced investor confidence—mechanisms that strengthen sustainable finance ecosystems and promote more equitable economic outcomes.
The empirical findings on the superior robustness of GRU are based on a single ESG index constructed for the Thai equity market and should therefore be interpreted within this specific context. Although Thailand shares structural characteristics with many emerging markets (such as evolving ESG disclosure regimes, moderate liquidity, and limited historical ESG data), the study does not claim universal generalization across all developing economies. Instead, it demonstrates that, under realistic data constraints typical of emerging markets, GRU is more robust than alternative models in this representative case. Extending the analysis to ESG indices from additional developing countries would provide a valuable avenue for future research and enable broader cross-market validation.
Nevertheless, certain limitations should be acknowledged. The analysis relies solely on univariate time-series forecasting without incorporating external economic or corporate sustainability drivers that may influence ESG index performance. Additionally, while the simulated index closely adheres to fundamental market construction guidelines, it remains a proxy and is subject to further validation once longer ESG series from the SET become available. These limitations suggest that future work should integrate macro-financial indicators, ESG disclosure quality metrics, and explainable AI techniques to deepen interpretability and ensure reliability in investment decision-making.
While the use of a simulated ESG index introduces approximation relative to the official exchange index, this limitation primarily affects absolute index levels rather than the temporal dependency structure exploited by forecasting models. As the study’s conclusions are based on relative model performance under identical data conditions, potential simulation-induced bias does not compromise the comparative validity of the results. Future research may extend this framework to use official ESG index series once longer, more granular data become publicly available.
Overall, the findings offer a clear takeaway: among the models compared, deep learning, and particularly GRU, is the most consistently robust approach to forecasting ESG-focused stock indices in the Thai market when training data are substantially constrained. This positions AI not only as a technological enhancement but also as a strategic enabler of sustainable finance, capable of supporting regulatory objectives, advancing responsible investment practices, and accelerating the transition toward resilient, sustainability-oriented capital markets in Thailand and similar economies. Accordingly, the theoretical contribution of this study lies not in testing ESG–financial performance relationships, but in demonstrating that ESG-informed market structures generate predictable index dynamics that can be effectively modeled using advanced data-driven techniques.

6. Conclusions

This study demonstrates that deep learning approaches, particularly GRU, provide substantially more accurate and robust multi-step numerical forecasts of a realistic simulated SET ESG Index than conventional statistical models, even when training data are scarce, underscoring their suitability for emerging markets where ESG reporting remains incomplete. By constructing an index aligned with real free-float weighting and rebalancing rules while testing model performance under multiple data availability scenarios, the study fills a critical gap in sustainable finance research by showing that AI-driven forecasting can remain reliable despite structural data limitations. Although the analysis focuses solely on univariate forecasting and a proxy index representation, these limitations present opportunities to integrate macroeconomic indicators, ESG disclosure quality metrics, and explainable AI into future work to further advance interpretability and decision support. Ultimately, the findings highlight how resilient deep learning tools can strengthen investment decision-making, risk management, and policy design in sustainability-focused capital markets, contributing to more competitive, transparent, and resilient financial ecosystems in Thailand and other rapidly developing economies that embrace ESG principles. Accordingly, the conclusions should be interpreted as evidence from a representative emerging-market case rather than as universal claims, with future multi-country studies needed to validate cross-market robustness further.
Future research may extend this framework by incorporating macroeconomic and macro-financial variables, commodity prices, ESG disclosure quality measures, firm-level sustainability indicators, and ESG-related news sentiment to examine how external information affects ESG index predictability in multivariate forecasting settings. Such extensions would enable a more explicit exploration of the economic drivers underlying ESG index dynamics while preserving the methodological foundation established in this study.

Author Contributions

Conceptualization: U.D. and W.C.; Methodology: U.D., R.K., W.C., and R.D.; Software: U.D., R.K., W.C., and R.D.; Validation: U.D., R.K., W.C., and R.D.; Formal analysis: U.D., R.K., W.C., and R.D.; Investigation: U.D., R.K., W.C., and R.D.; Resources: U.D. and W.C.; Data curation: U.D. and W.C.; Writing (original draft preparation): U.D., R.K., W.C., and R.D.; Writing (review and editing): U.D., R.K., W.C., and R.D.; Visualization: U.D. and W.C.; Supervision: W.C.; Project administration: U.D. and W.C.; Funding acquisition: U.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work (Grant No. MHESI-CMDF 68-003) was supported by the Office of the Permanent Secretary, Ministry of Higher Education, Science, Research and Innovation (OPS MHESI), the Thailand Capital Market Development Fund (CMDF), and Khon Kaen University, Thailand.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The publicly available dataset analyzed in this study is the ESG Thailand Dataset (2014–2024). All code used for data preprocessing, model training, evaluation, and explainability analysis is openly accessible at https://github.com/wirapong/ESG_Thailand_Market.git (accessed on 18 December 2025). The repository contains version-controlled scripts, documented dependencies, and detailed instructions to support full transparency and reproducibility of the reported results.

Acknowledgments

During the preparation of this work, the authors used ChatGPT (OpenAI, GPT-4.0 version) and Grammarly (Grammarly Inc., Premium version, web-based application) for language editing. The authors declare that they reviewed and edited the final output as needed and take full responsibility for the content of the published article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Friede, G.; Busch, T.; Bassen, A. ESG and financial performance: Aggregated evidence from more than 2000 empirical studies. J. Sustain. Financ. Investig. 2015, 5, 210–233. [Google Scholar] [CrossRef]
  2. Li, X.; Saat, M.M.; Khatib, S.F.; Liu, Y. Sustainable development and firm value: How ESG performance shapes corporate success—A systematic literature review. Bus. Strategy Dev. 2024, 7, e70026. [Google Scholar] [CrossRef]
  3. Xu, J.; Wan, L. ESG performance and financial performance: Enterprise analysis from a sustainability perspective. Account. Audit. Financ. 2024, 5, 75–81. [Google Scholar] [CrossRef]
  4. Elhawary, E.; Elbolok, R. The implications of COVID-19 on ESG performance and financial reporting quality in Egypt. J. Financ. Report. Account. 2024; ahead-of-print. [Google Scholar] [CrossRef]
  5. Gabr, D.H.; ElBannan, M.A. Is it worth it to go green? ESG disclosure, carbon emissions and firm financial performance in an emerging-market context, using Thailand as a representative case. Rev. Account. Financ. 2024, 24, 193–217. [Google Scholar] [CrossRef]
  6. The Stock Exchange of Thailand. Ground Rules for the SET Index Series; Stock Exchange of Thailand: Bangkok, Thailand, 2023. [Google Scholar]
  7. Thunuguntla, J. Indian Nifty basis change from full market cap to free-float basis. Econ. Times 2009. [Google Scholar] [CrossRef]
  8. Perera, U.; Dissanayake, R.; Jayasundara, M. Assessing the impact of S&P SL20 index construction on listed companies in Colombo Stock Exchange (CSE). Int. J. Econ. Financ. 2016, 8, 159–177. [Google Scholar] [CrossRef]
  9. Barua, M.; Kumar, T.; Raj, K.; Roy, A.M. Comparative analysis of deep learning models for stock price prediction in the Indian market. FinTech 2024, 3, 551–568. [Google Scholar] [CrossRef]
  10. Chahuán-Jiménez, K. Neural network-based predictive models for stock market index forecasting. J. Risk Financ. Manag. 2024, 17, 242. [Google Scholar] [CrossRef]
  11. Jiang, L.C.; Subramanian, P. Forecasting of stock price using autoregressive integrated moving average model. J. Comput. Theor. Nanosci. 2019, 16, 3519–3524. [Google Scholar] [CrossRef]
  12. Patel, H.; Bolla, B.K.; Sabeesh, E.; Bhumireddy, D.R. Comparative study of predicting stock index using deep learning models. In International Conference on Cognitive Computing and Cyber Physical Systems; Springer Nature: Cham, Switzerland, 2023; pp. 45–57. [Google Scholar] [CrossRef]
  13. Utama, A.B.P.; Wibawa, A.P.; Handayani, A.N.; Irianto, W.S.G.; Nyoto, A. Improving time-series forecasting performance using imputation techniques in deep learning. In Proceedings of the 2024 International Conference on Smart Computing, IoT and Machine Learning (SIML), Surakarta, Indonesia, 6–7 June 2024; pp. 232–238. [Google Scholar] [CrossRef]
  14. Wongoutong, C. Performance comparison of various imputation methods for missing data mechanisms (MAR, MCAR, and MNAR) in a nonstationary time series. Int. J. Math. Math. Sci. 2025, 2025, 3031708. [Google Scholar] [CrossRef]
  15. Kim, J. ESG performance and financial performance. Korean Career Entrep. Bus. Assoc. 2025, 9, 319–336. [Google Scholar] [CrossRef]
  16. Jin, Y.; Shen, Z.; Liu, J.; Tansuchat, R. The impact of the digital economy on the health industry from the perspective of threshold and intermediary effects: Evidence from China. Sustainability 2023, 15, 11141. [Google Scholar] [CrossRef]
  17. Zhou, G.; Liu, L.; Luo, S. Sustainable development, ESG performance and company market value: Mediating effect of financial performance. Bus. Strategy Environ. 2022, 31, 3371–3387. [Google Scholar] [CrossRef]
  18. Zhang, G. Test of synergy between ESG performance and financial performance. Adv. Econ. Manag. Political Sci. 2025, 162, 73–83. [Google Scholar] [CrossRef]
  19. Moćić, B. Robust portfolio optimization strategies in the Serbian stock market. Manag. J. Sustain. Bus. Manag. Solut. Emerg. Econ. 2023, 28, 65–78. [Google Scholar] [CrossRef]
  20. Thong-Ou, P.; Nadee, W. Stock selection by machine learning in Thailand stock market (SET). In Proceedings of the 2024 8th International Conference on Information Technology (InCIT), Chonburi, Thailand, 14–15 November 2024; pp. 467–472. [Google Scholar] [CrossRef]
  21. Ferraro, G.; Quinto, I.; Scandurra, G.; Thomas, A. The impact of artificial intelligence and sustainability management on fostering ESG practices and competitive perspectives among SMEs. Corp. Soc. Responsib. Environ. Manag. 2025, 32, 6641–6657. [Google Scholar] [CrossRef]
  22. Meher, B.K.; Hawaldar, I.T.; Spulbar, C.M.; Birau, F.R. Forecasting stock market prices using mixed ARIMA model: A case study of Indian pharmaceutical companies. Investig. Manag. Financ. Innov. 2021, 18, 42–54. [Google Scholar] [CrossRef]
  23. Yu, Y. LSTM-based time series prediction model: A case study with yfinance stock data. ITM Web Conf. 2025, 70, 03015. [Google Scholar] [CrossRef]
  24. Chang, V.I.; Xu, Q.A.; Chidozie, A.; Wang, H. Predicting economic trends and stock market prices with deep learning and advanced machine learning techniques. Electronics 2024, 13, 3396. [Google Scholar] [CrossRef]
  25. Rathna, K.; Student, P.; Vinita, S.; Devi, S. RNN, GRU, and LSTM model analysis for AI-powered Indian stock market index forecasting. In Proceedings of the 2025 International Conference on Visual Analytics and Data Visualization (ICVADV), Tirunelveli, India, 4–6 March 2025; pp. 1517–1523. [Google Scholar] [CrossRef]
  26. Amrin, D.; Mukhanov, S.; Amanzholova, S.; Amirgaliyev, B. A comparative analysis of neural network models on predicting stock prices. Bull. Shakarim Univ. Tech. Sci. 2025, 3, 64–72. [Google Scholar] [CrossRef] [PubMed]
  27. Mehtab, S.; Sen, J.; Dasgupta, S. Robust analysis of stock price time series using CNN and LSTM-based deep learning models. In Proceedings of the 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 5–7 November 2020; pp. 1481–1486. [Google Scholar] [CrossRef]
  28. Sharaf, M.; Hemdan, E.E.; El-Sayed, A.; El-Bahnasawy, N.A. StockPred: A framework for stock price prediction. Multimed. Tools Appl. 2021, 80, 17923–17954. [Google Scholar] [CrossRef]
  29. Das, J.; Thulasiram, R.K.; Bowala, S.; Saumyamala, A.; Thavaneswaran, A. Hybrid LSTM/GRU and support vector regression models for stock index prediction. In Proceedings of the 2025 IEEE 49th Annual Computers, Software, and Applications Conference (COMPSAC), Toronto, ON, Canada, 8–11 July 2025; pp. 1917–1922. [Google Scholar] [CrossRef]
  30. Fu, P. A study on stock market forecasting based on time series with integrated multi-feature LSTM modeling. In Proceedings of the 2024 4th International Signal Processing, Communications and Engineering Management Conference (ISPCEM), Montreal, QC, Canada, 28–30 November 2024; pp. 384–388. [Google Scholar] [CrossRef]
  31. Wang, Q. Stock prediction under GRU, LSTM and GNN. In Proceedings of the 2024 IEEE 2nd International Conference on Image Processing and Computer Applications (ICIPCA), Shenyang, China, 28–30 June 2024; pp. 460–464. [Google Scholar] [CrossRef]
  32. Lima, J.; Salles, R.; Porto, F.; Coutinho, R.D.; Alpis, P.; Escobar, L.E.; Pacitti, E.; Ogasawara, E.S. Forward and backward inertial anomaly detector: A novel time series event detection method. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 18–23 July 2022; pp. 1–8. [Google Scholar] [CrossRef]
  33. Wang, Z.; Ventre, C. A financial time series denoiser based on diffusion models. In Proceedings of the 5th ACM International Conference on AI in Finance, Brooklyn, NY, USA, 14–17 November 2024; pp. 72–80. [Google Scholar] [CrossRef]
  34. Ramkumar, G.; Mohanavel, V.; S, R.; Tamilselvi, M.; Vijayashanthi, R.S.; Amruthavalli, P. Development of a robust stock market prediction mechanism based on enhanced comprehensive learning principles. In Proceedings of the 2023 International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE), Chennai, India, 1–2 November 2023; pp. 1–8. [Google Scholar] [CrossRef]
Figure 1. Rolling mean and rolling standard deviation of the simulated SET ESG Index.
Figure 2. Time-based train–test split for multi-step forecasting.
Figure 3. Standardized residuals from the ARIMA model fitted to the simulated SET ESG Index.
Figure 4. Distribution of ARIMA residuals for the simulated SET ESG Index.
Figure 5. Q–Q plot of ARIMA residuals against the normal distribution.
Figure 6. Autocorrelation function (ACF) of ARIMA residuals.
Figure 7. Performance comparison of seven deep learning architectures (RNN, GRU, LSTM, DF-RNN, DeepAR, DSSM, and Deep Renewal) in forecasting the simulated SET ESG Index using 100% training data.
Figure 8. Learning Curves for the Three Best Models (GRU, DF-RNN, Deep Renewal).
Figure 9. Robustness of the three best-performing deep learning models (GRU, DF-RNN, and Deep Renewal) under different training data proportions (100%, 50%, and 25%): (A) MSE, (B) RMSE, (C) MAPE, and (D) Theil’s U.
Table 1. Stationarity and Autocorrelation Diagnostic Results for the Simulated SET ESG Index.

| Diagnostic Test | Statistic | p-Value | Interpretation | Conclusion |
|---|---|---|---|---|
| ADF Test | 0.775397 | 0.9912 | Fail to reject the null hypothesis of a unit root (p > 0.05) | Non-stationary |
| KPSS Test | 1.048925 | 0.0100 | p < 0.05 indicates non-stationarity | Expected: Trend-stationarity violated |
| Ljung–Box (lags 20) | 53,778.784076 | 0.0000 | Significant → autocorrelation present | Likely autocorrelated |
| ARCH LM Test | 2858.323754 | 0.0000 | Significant → heteroscedasticity present | Likely GARCH-like volatility |
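
As a rough illustration of how the Ljung–Box statistic in Table 1 is formed, the sketch below computes Q = n(n + 2) Σ ρ_k² / (n − k) over 20 lags in plain Python. The synthetic random walk, the seed, and the function name `ljung_box_q` are assumptions made for this example and are not taken from the study's code or data; in practice a library routine such as `acorr_ljungbox` from statsmodels would normally be used.

```python
import random


def ljung_box_q(series, max_lag=20):
    """Ljung-Box Q statistic: n(n+2) * sum_k rho_k^2 / (n - k)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)  # n times the sample variance
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical data: a random walk stands in for a trending index level.
random.seed(0)
level = [100.0]
for _ in range(499):
    level.append(level[-1] + random.gauss(0.0, 1.0))

# First differences of the same series (daily changes).
diffs = [b - a for a, b in zip(level, level[1:])]

q_level = ljung_box_q(level)
q_diff = ljung_box_q(diffs)
print(f"Q(level) = {q_level:.1f}, Q(differences) = {q_diff:.1f}")
```

A strongly persistent level series produces a very large Q, while its first differences do not, which is consistent with the large statistic Table 1 reports for the index level.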
Table 2. Hyperparameter Settings for All Deep Learning Models.

| Model | Learning Rate | Optimizer | Loss Function | Batch Size | Epochs (Max) | Early Stopping Patience | Window Length (Time Steps) | Hidden Layers | Units per Layer | Dropout Rate |
|---|---|---|---|---|---|---|---|---|---|---|
| RNN | 0.001 | Adam | MSE | 32 | 200 | 20 | 60 | 1 | 64 | 0.2 |
| GRU | 0.001 | Adam | MSE | 32 | 200 | 20 | 60 | 1 | 64 | 0.2 |
| LSTM | 0.001 | Adam | MSE | 32 | 200 | 20 | 60 | 1 | 64 | 0.2 |
| DF-RNN | 0.001 | Adam | MSE | 32 | 200 | 20 | 60 | 2 (Factor + RNN Fusion) | 64 | 0.2 |
| DeepAR (proxied) | 0.001 | Adam | Gaussian Likelihood-Based MSE | 32 | 200 | 20 | 60 | 2 | 64 | 0.2 |
| DSSM (proxied) | 0.001 | Adam | Gaussian Likelihood-Based MSE | 32 | 200 | 20 | 60 | State-Space + GRU Core | 64 | 0.2 |
| Deep Renewal | 0.001 | Adam | MSE | 32 | 200 | 20 | 60 | 2 (Renewal Memory) | 64 | 0.2 |
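
The window length of 60 time steps in Table 2 and the 36-day forecast horizon imply a supervised framing in which each training sample pairs 60 past index values with the 36 values that follow. A minimal sketch of that slicing is given below; the function name `make_windows`, the toy series, and the choice of direct multi-output targets (rather than recursive one-step forecasting) are assumptions, since the forecasting strategy is not specified here.

```python
def make_windows(series, window=60, horizon=36):
    """Slice a series into (input window, target horizon) pairs.

    window=60 follows Table 2; horizon=36 follows the 36-day multi-step
    setting. Direct multi-output targets are an assumption of this sketch.
    """
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        x = series[start:start + window]                     # 60 past values
        y = series[start + window:start + window + horizon]  # next 36 values
        pairs.append((x, y))
    return pairs


# Hypothetical toy series: 200 consecutive integers instead of index levels.
toy_series = list(range(200))
pairs = make_windows(toy_series)

first_x, first_y = pairs[0]
print(len(pairs), len(first_x), len(first_y))
```

Because the slices are taken in chronological order, a time-based train–test split can be applied to the resulting pairs without shuffling.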
Table 3. Comparative performance results of the statistical forecasting models.

| Model | MSE | MAE | RMSE | MAPE |
|---|---|---|---|---|
| ARIMA | 206,896.80 | 324.16 | 454.85 | 11.44 |
| SARIMA | 208,050.16 | 324.79 | 456.12 | 11.45 |
| SARIMAX | 207,230.02 | 325.20 | 455.22 | 11.48 |
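
For reference, the four error measures reported in Table 3 can be computed as follows. This is a minimal sketch using their standard textbook definitions; the function names and the toy values are illustrative assumptions and are not drawn from the study's repository.

```python
import math


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the index."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical values, chosen only to exercise the functions.
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [102.0, 108.0, 123.0, 126.0]

print(mse(actual, predicted), mae(actual, predicted))
print(round(rmse(actual, predicted), 4), round(mape(actual, predicted), 4))
```

RMSE is the square root of MSE, which is why the two columns in Table 3 rank the models identically.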
Table 4. Performance evaluation results of deep learning models using 100% training data.

| Model | MSE | RMSE | POCID | Theil’s U | MAPE |
|---|---|---|---|---|---|
| RNN | 61,022.14 | 247.02 | 50.00 | 15.86 | 6.45 |
| GRU | 12,299.00 | 110.90 | 49.74 | 8.99 | 3.12 |
| LSTM | 25,128.66 | 158.52 | 48.45 | 12.76 | 4.69 |
| DF-RNN | 18,113.30 | 134.58 | 49.48 | 10.72 | 3.76 |
| DeepAR | 64,599.52 | 254.16 | 50.25 | 29.67 | 7.02 |
| DSSM | 47,518.62 | 217.98 | 50.51 | 26.71 | 5.78 |
| Deep Renewal | 13,985.99 | 118.26 | 51.00 | 5.35 | 3.35 |
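
POCID (prediction of change in direction) is commonly defined as the percentage of steps at which the forecast moves in the same direction as the actual series. The sketch below implements that common definition; it is assumed, not confirmed, to match the variant used for Table 4, and the sample values are made up for illustration.

```python
def pocid(actual, predicted):
    """Percentage of steps where forecast and actual changes share a sign.

    Assumed definition: a step counts as correct when
    (predicted[t] - predicted[t-1]) * (actual[t] - actual[t-1]) > 0.
    """
    hits = 0
    steps = len(actual) - 1
    for t in range(1, len(actual)):
        actual_change = actual[t] - actual[t - 1]
        predicted_change = predicted[t] - predicted[t - 1]
        if actual_change * predicted_change > 0:
            hits += 1
    return 100.0 * hits / steps


# Hypothetical values: the forecast gets two of the four directions right.
actual_path = [100.0, 105.0, 103.0, 108.0, 107.0]
forecast_path = [101.0, 104.0, 105.0, 107.0, 109.0]

score = pocid(actual_path, forecast_path)
print(score)
```

Under this definition, a value near 50 means the forecast direction agrees with the actual direction about half the time.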
Table 5. Performance evaluation results of deep learning models using 50% training data.

| Model | MSE | RMSE | POCID | Theil’s U | MAPE |
|---|---|---|---|---|---|
| GRU | 11,820.39 | 108.72 | 53.09 | 11.01 | 3.06 |
| DF-RNN | 41,747.66 | 204.32 | 51.25 | 31.23 | 5.54 |
| Deep Renewal | 28,865.01 | 169.89 | 51.28 | 24.63 | 4.83 |
Table 6. Performance evaluation results of deep learning models using 25% training data.

| Model | MSE | RMSE | POCID | Theil’s U | MAPE |
|---|---|---|---|---|---|
| GRU | 59,050.45 | 243.00 | 53.09 | 42.61 | 6.43 |
| DF-RNN | 62,767.34 | 250.53 | 51.03 | 76.93 | 7.72 |
| Deep Renewal | 74,251.46 | 272.49 | 52.06 | 102.69 | 8.15 |
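
The robustness comparison can be read directly off Tables 3 to 6. The short script below uses only the RMSE values reported in those tables. It checks which model has the lowest RMSE at each training proportion and how far each full-data model reduces RMSE relative to ARIMA. The variable names are arbitrary, and the script is a reading aid, not part of the study's pipeline.

```python
# RMSE values copied from Table 3 (ARIMA) and Tables 4-6 (deep learning models).
ARIMA_RMSE = 454.85

rmse_by_share = {
    "GRU": {100: 110.90, 50: 108.72, 25: 243.00},
    "DF-RNN": {100: 134.58, 50: 204.32, 25: 250.53},
    "Deep Renewal": {100: 118.26, 50: 169.89, 25: 272.49},
}

# Model with the lowest RMSE at each training-data proportion (percent).
best_model = {
    share: min(rmse_by_share, key=lambda name: rmse_by_share[name][share])
    for share in (100, 50, 25)
}

# Percentage reduction in RMSE versus ARIMA when 100% of training data is used.
reduction_vs_arima = {
    name: 100.0 * (ARIMA_RMSE - values[100]) / ARIMA_RMSE
    for name, values in rmse_by_share.items()
}

for share, name in best_model.items():
    print(f"{share}% training data: lowest RMSE from {name}")
for name, pct in reduction_vs_arima.items():
    print(f"{name}: {pct:.1f}% lower RMSE than ARIMA")
```

On these reported figures, GRU has the lowest RMSE in all three data scenarios.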
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Detthamrong, U.; Klangbunrueang, R.; Chansanam, W.; Dasri, R. Deep Learning for Sustainable Finance: Robust ESG Index Forecasting in an Emerging Market Context. Sustainability 2026, 18, 110. https://doi.org/10.3390/su18010110
