Predicting Stock Volatility Using Multidimensional Financial Risk: Evidence from Machine Learning and Hybrid GARCH–Deep Learning Models

Ibrahim, Yara; Hussainey, Khaled; Moawad, Taghred Mokhtar Sayed

doi:10.3390/jrfm19060444


by Yara Ibrahim ¹, Khaled Hussainey ²,* and Taghred Mokhtar Sayed Moawad ²

¹ Investment and Finance Department, Egypt Japan University for Science and Technology, Alexandria 21934, Egypt

² The Albert Gubay Business School, Bangor University, Bangor LL57 2DG, UK

* Author to whom correspondence should be addressed.

J. Risk Financial Manag. 2026, 19(6), 444; https://doi.org/10.3390/jrfm19060444 (registering DOI)

Submission received: 1 May 2026 / Revised: 15 June 2026 / Accepted: 16 June 2026 / Published: 19 June 2026

(This article belongs to the Special Issue Editorial Board Members’ Collection Series: Journal of Risk and Financial Management, 2nd Edition)


Abstract

This study investigates the determinants and predictability of stock return volatility by integrating firm-specific financial characteristics with advanced econometric and volatility modeling techniques. Using an unbalanced panel dataset comprising 1596 firms and 19,752 firm-year observations from MENA stock markets over the period 2010–2024, the analysis employs fixed-effects panel regression models, conditional volatility models, and machine learning-based forecasting approaches. Following extensive diagnostic testing, including tests for heteroskedasticity, serial correlation, cross-sectional dependence, and model specification, a two-way fixed-effects model with Driscoll–Kraay standard errors is adopted as the preferred estimation framework. The results indicate that liquidity ratio, cash ratio, sales growth, firm age, lagged volatility, and lagged returns are significant determinants of stock return volatility, whereas leverage, tangibility, board independence, firm size, Tobin’s Q, and profitability do not exhibit statistically significant effects after controlling for firm-specific and time-specific heterogeneity. The volatility analysis reveals substantial persistence in stock return volatility, with the EGARCH-t specification providing the best fit among the competing GARCH-family models according to the Akaike Information Criterion. The estimated asymmetry parameters indicate that volatility responds differently to positive and negative shocks, supporting the presence of asymmetric volatility dynamics and the suitability of asymmetric volatility models. The forecasting analysis shows that advanced machine learning and deep learning models achieve competitive predictive performance; however, differences in predictive accuracy across models are generally modest.

Keywords: stock volatility; financial risk; machine learning; GARCH models; deep learning

1. Introduction

The precise measurement and forecasting of stock return volatility is a cornerstone of contemporary financial economics, given its critical importance for asset pricing, portfolio allocation, and financial risk management (Christensen et al., 2023; Gu et al., 2020). Volatility is defined as the conditional variability of asset returns and is construed as a measure of market risk and uncertainty (Andersen et al., 2003; Patton, 2011). Motivated by the stylized facts that characterize financial returns, including volatility clustering (Campbell et al., 2001), leverage effects (Black, 1976), and fat-tailed distributions (E. Fama, 1965), research over the past few decades has developed econometric models that account for the time-varying variance of return series (Bollerslev, 1986; Engle, 1982). While these models remain fundamental, recent studies show that they fail to account for the nonlinearity and high-dimensional interactions underlying modern financial markets (Gu et al., 2020; Zhang et al., 2019), which can lead to inaccurate predictions and mispricing of assets. As a result, machine learning and deep learning approaches have become one of the most compelling avenues for improving volatility forecasting performance, as they permit flexible, data-driven modeling without restrictive functional-form assumptions (Christensen et al., 2021; Dixon et al., 2020; Fischer & Krauss, 2018).

With the growing complexity of financial systems and their increasing reliance on quantitative risk management frameworks, the demand for accurate volatility forecasts has grown. Volatility estimates are necessary for pricing derivatives, for portfolio optimization, and for the risk measures used by regulators (value at risk and expected shortfall) that inform investors about worst-case loss scenarios (Daníelsson, 2011; Patton, 2011). The inability to model volatility correctly can lead to severe mispricing of this risk and to insufficient capital allocation, especially during financial turmoil (Baruník & Křehlík, 2018). Modern empirical research also points to the potential importance of firm-specific characteristics, such as leverage and liquidity conditions, which reflect firms' information environment and financial health and may influence volatility beyond what historical returns alone explain (Bates et al., 2009; Dechow et al., 2010; E. F. Fama & French, 2015). Nevertheless, most volatility forecasting models have been based almost entirely on market-based variables, with little integration of firm-level financial risk factors. This limits their ability to predict volatility accurately across different market conditions, particularly during periods of financial distress or economic downturn, when firm-specific characteristics become more relevant.

This is particularly relevant in emerging markets, and more specifically in the MENA region, where financial systems are characterized by structural heterogeneity, developing regulatory regimes, and varying levels of integration into global markets (Bekaert & Harvey, 2002; Ben Naceur et al., 2008). Aloui et al. (2011) and Bekaert et al. (2007) argue that the MENA region has lower informational efficiency, higher transaction costs, and is more sensitive to firm-specific and macroeconomic shocks, which generate stronger volatility dynamics. Moreover, corporate governance systems, financial openness, and de jure preferences for external finance differ widely across firms, further increasing heterogeneity in risk exposure (Claessens & Yurtoglu, 2013). Recent studies show that nonlinear modeling techniques, such as those borrowed from machine learning, are more appropriate for capturing the complexities of emerging markets, where traditional linear models may not sufficiently characterize the underlying dynamics (Gu et al., 2020; Zhang et al., 2019).

Against this backdrop, the present study examines whether stock volatility can be better estimated by embedding multidimensional financial risk factors within a single modeling framework. This study addresses three key empirical questions. First, what is the impact of financial structure and liquidity conditions on firm-level stock volatility? Second, do machine learning and deep learning models outperform traditional econometric methodologies in capturing these relationships? Third, do hybrid modeling approaches that combine traditional econometric volatility estimates with neural network architectures deliver better predictive accuracy? These questions are motivated by the aim of integrating theory-driven financial modeling with data-driven prediction, particularly in complex and heterogeneous environments.

This study makes three contributions. First, it proposes a structured, multidimensional financial risk framework that incorporates accounting-based and financial indicators into volatility forecasting, expanding beyond the traditional focus on market-based predictors (Dechow et al., 2010; E. F. Fama & French, 2015). Second, it offers a comprehensive empirical comparison of econometric, machine learning, and deep learning models, adding to the rapidly growing literature on AI applications in finance (Christensen et al., 2021; Dixon et al., 2020; Gu et al., 2020). Third, it develops a hybrid modeling framework that combines GARCH-based volatility estimates with recurrent neural networks, leveraging both statistical structure and nonlinear learning to improve predictive power. Integrating complementary modeling paradigms using hybrid approaches has been shown to enhance forecasting performance, particularly in complex time-series environments (Krauss et al., 2017; Zhang et al., 2019).

The novelty of this study lies in integrating multiple firm-level financial risk dimensions with advanced predictive models within a unified empirical framework, with a particular focus on emerging markets. While recent studies have demonstrated the effectiveness of machine learning techniques in forecasting financial volatility (Fischer & Krauss, 2018; Gu et al., 2020), relatively few studies have simultaneously incorporated a comprehensive set of firm-specific financial risk indicators into volatility prediction models. Furthermore, although hybrid econometric–machine learning approaches have gained increasing attention, existing research has primarily concentrated on aggregate market indices or developed economies, leaving a significant gap in understanding firm-level volatility dynamics in emerging regions such as the MENA region (Zhang et al., 2019). This study contributes to the financial econometrics literature by identifying additional determinants of financial risk and enhancing the understanding of volatility predictability. It also advances the application of machine learning in finance by providing evidence from an underexplored emerging-market context.

The rest of this paper is organized as follows. A review of the literature on volatility modeling, financial risk determinants, and machine learning applications in finance is presented in the next section. The following section details the research methodology. The empirical findings are then presented and discussed, followed by some concluding remarks and avenues for future research.

2. Theoretical Background and Literature Review

Stock return volatility is an essential subject in financial economics, serving as a proxy for market risk and uncertainty (Andersen et al., 2003; Patton, 2011). Theoretical frameworks such as the Efficient Market Hypothesis assume that volatility arises from the continuous arrival of new information (E. F. Fama, 1970). Empirical studies have repeatedly documented volatility clustering (Ané & Geman, 2000), fat tails (Bollerslev, 1986), and nonlinear dependence (Andersen et al., 2003; Cont, 2001) in return series, and these features have been the core motivation for developing econometric models that account for time-varying conditional variance.

The ARCH model (Engle, 1982) and its generalized form, the GARCH model (Bollerslev, 1986), have become the workhorse frameworks for modeling volatility dynamics because they capture persistence and clustering effects effectively. Extensions such as EGARCH enable the modeling of asymmetric volatility responses to positive and negative shocks, consistent with leverage effects (Black, 1976; Nelson, 1991). These models provide a structured and theoretically grounded framework; however, their reliance on parametric assumptions and linear specifications becomes a limitation when the goal is to capture the nonlinear relationships observed in financial data.
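For reference, the standard textbook forms of the two specifications named above can be written as follows. This is only a sketch in conventional notation: the symbols ω, α, β, γ and the standardized shock z_t are assumptions introduced here, and the parameterization estimated in this paper may differ.

```latex
\begin{aligned}
\text{GARCH(1,1):}\quad & \sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2,\\
\text{EGARCH(1,1):}\quad & \ln\sigma_t^2 = \omega + \beta\,\ln\sigma_{t-1}^2
  + \alpha\left(|z_{t-1}| - \mathbb{E}|z_{t-1}|\right) + \gamma\,z_{t-1},
  \qquad z_t = \varepsilon_t / \sigma_t .
\end{aligned}
```

In the GARCH(1,1) form, a value of α + β close to one indicates highly persistent volatility. In the EGARCH form, a nonzero γ lets positive and negative shocks of equal size move volatility by different amounts, and γ < 0 corresponds to the leverage effect.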

Simultaneously with the development of volatility models, a substantial body of research examines the influence of firm-specific factors on stock return behavior. From a corporate finance perspective, capital structure is a key risk driver. Trade-off theory implies that as leverage increases, the risk of financial distress also rises, and equity returns are more sensitive to shocks (Kraus & Litzenberger, 1973). Although controlling for firm heterogeneity and other risk factors has been known to weaken the relationship (Bartram et al., 2011; E. F. Fama & French, 1992), empirical studies have evinced a positive effect of leverage on volatility (Bhandari, 1988; Christie, 1982).

Another theoretical ambiguity pertains to liquidity, a crucial element of financial risk management. From one perspective, the likelihood of financial distress diminishes with increased liquidity, thereby enhancing the stability of the firm’s value (Bates et al., 2009; Opler et al., 1999). On the other hand, excessive liquidity may facilitate agency problems or the misallocation of capital; viewed through this lens, excess liquidity may create uncertainty (Jensen, 1986). The vagueness of the trust concept is paralleled by empirical evidence of stabilizing effects (Basu, 1997), weak links between trust and growth (Guiso et al., 2008; Knack & Keefer, 1997), and context-dependent effects (Harrison & Paton, 2004).

Various other stylized facts, such as corporate performance or growth variables intrinsically linked to financial structure and reporting quality, have been shown to be associated with different volatility dynamics. Profitability, e.g., Return on Assets (ROA), is often used as a proxy for financial stability (E. Fama & French, 2006).

But there is also the possibility that such firms attract more investor attention and speculative trading in their stocks, which increases return variability. Likewise, growth opportunities (defined as sales growth) introduce uncertainty about future cash flows, leading to unclear empirical relationships (Hong et al., 2026). This suggests that the relationship between firm fundamentals and volatility is fundamentally nonlinear and context dependent.

Given these complexities, traditional econometric models may fail to disentangle the interplay between firm characteristics and volatility dynamics. This has led to increased adoption of machine learning approaches in financial modeling. Many machine learning algorithms, including Random Forest and gradient boosting methods, are flexible nonparametric frameworks that can accommodate high-dimensional and nonlinear relationships (Chen & Guestrin, 2016; Christensen et al., 2023; Gu et al., 2020; Jordan & Mitchell, 2015; Ke et al., 2017; Quinlan, 1986). More importantly, empirical evidence shows that these methods yield better predictive performance than traditional approaches in settings where complex interactions between predictors are expected.

In this respect, deep learning models go one step further and exploit temporal dependencies in financial time series. Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) architectures of recurrent neural networks have proven very effective for modeling general sequence data over the past few years and, in particular, for the dynamic, temporal structure of asset returns (Dixon et al., 2020; Fischer & Krauss, 2018). However, empirical performance gains are often modest, and even when statistically significant, they may fall below conventional thresholds, raising concerns about model fit and interpretability.

Hybrid modeling frameworks have been proposed to offer the advantages of both approaches, combining the structure of econometric models with the flexibility of machine learning. They combine volatility estimates generated by GARCH-type models with ML or DL architectures to leverage nonlinear learning capabilities without sacrificing too much model transparency (Krauss et al., 2017; Zhang et al., 2019). Although hybrid models have the potential to improve forecasting performance, existing evidence indicates that the gains are modest and context specific.

Such findings reinforce the call for a framework that goes beyond aggregation and incorporates multidimensional firm-level financial risk factors using sophisticated modeling methods. This study adds to the literature by examining these associations within a joint framework while offering an extensive comparison of econometric, volatility-based, machine learning, deep learning, and hybrid methods in a common empirical environment, focusing on emerging markets.

3. Hypotheses Development

Building on the theoretical and empirical literature reviewed in Section 2, this study develops a set of testable hypotheses concerning the determinants and predictability of stock return volatility in the MENA region. The hypotheses are structured around three key dimensions: (1) the explanatory role of multidimensional firm-level financial risk factors, (2) the predictive performance of advanced modeling techniques, and (3) the incremental value of hybrid econometric–machine learning approaches.

3.1. Multidimensional Financial Risk and Stock Volatility

According to the Efficient Market Hypothesis (E. F. Fama, 1970), stock return volatility reflects the continuous incorporation of new information into security prices. Firm-level characteristics play a critical role in shaping investors’ perceptions of risk and uncertainty and, consequently, influence the variability of stock returns (Campbell et al., 2001; E. F. Fama & French, 1992). Prior research suggests that financial structure, liquidity position, governance quality, and firm performance constitute important dimensions of firm-specific risk that may affect stock return volatility.

From a financial structure perspective, capital structure theory argues that firms with higher leverage face greater financial obligations and increased exposure to financial distress, making their equity returns more sensitive to economic shocks and market uncertainty (Bhandari, 1988; Christie, 1982; Kraus & Litzenberger, 1973). In addition, asset tangibility influences a firm’s financing capacity and risk profile because tangible assets can serve as collateral, potentially reducing uncertainty and stabilizing firm value (Almeida et al., 2011). Consequently, financial structure characteristics may significantly affect stock return volatility.

Liquidity risk represents another important determinant of market uncertainty. Firms with stronger liquidity positions are generally better able to meet short-term obligations and absorb adverse shocks, which may reduce volatility (Bates et al., 2009; Opler et al., 1999). However, excessive liquidity may also create agency problems and inefficient resource allocation, potentially increasing uncertainty and return variability (Jensen, 1986). Given these competing theoretical arguments, the relationship between liquidity and stock volatility remains an empirical question.

Corporate governance mechanisms may also influence stock volatility by reducing information asymmetry and improving monitoring effectiveness. Board independence, in particular, enhances oversight and managerial accountability, potentially lowering firm-specific risk and stabilizing stock returns (Elmagrhi et al., 2018; Khan et al., 2017). Nevertheless, the effectiveness of governance mechanisms may vary across institutional environments, especially in emerging markets.

Firm performance is another factor that may affect stock volatility. Highly profitable firms generally exhibit stronger financial stability and lower operational risk, which may reduce volatility. Conversely, profitable firms may attract greater investor attention and speculative trading activity, potentially increasing stock price fluctuations (E. Fama & French, 2006; Hou et al., 2015).

These considerations are particularly relevant in the MENA region, where financial markets are characterized by heterogeneous institutional environments, varying levels of market efficiency, and greater exposure to firm-specific shocks (Aloui et al., 2011; Bekaert & Harvey, 2002). Consequently, different dimensions of firm-level financial risk may exert distinct effects on stock return volatility.

Accordingly, the following hypotheses are proposed:

H1a.

Financial structure risk, proxied by leverage and tangibility, is significantly associated with stock return volatility.

H1b.

Liquidity risk, proxied by the liquidity ratio and cash ratio, is significantly associated with stock return volatility.

H1c.

Governance risk, proxied by board independence, is significantly associated with stock return volatility.

H1d.

Firm performance, proxied by return on assets (ROA), is significantly associated with stock return volatility.

3.2. Predictive Performance of Advanced Models

Because traditional econometric models (e.g., ARCH and GARCH) can capture volatility clustering and persistence, they have been widely used in option pricing (Bollerslev, 1986; Engle, 1982). However, these parametric models rest on a set of restrictions and often lack the capacity to model the nonlinear relationships and interdependencies common in financial data (Cont, 2001; Gu et al., 2020).

In recent decades, machine learning (ML) methods have also been gradually applied to financial modeling to mitigate the limitations of behavioral sentiment-based market models. For instance, Random Forest and gradient boosting are well-known techniques that can fit high-dimensional, nonlinear relationships without specifying a restrictive functional form (Breiman, 2001; Chen & Guestrin, 2016; Gu et al., 2020). For this reason, these models have the potential to provide better predictions, especially in complex environments, as empirical studies suggest that predicting similar persons yields more accurate outcomes than aggregating across persons.

Deep learning (DL) models take it a step further by capturing temporal dependencies to remember trends in financial time series. Recurrent Neural Networks, specifically Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures, have been implemented in sequential data representation and applied for financial forecasting problems with promising results (Dixon et al., 2020; Fischer & Krauss, 2018).

However, more recent evidence suggests that the predictive gains obtained with these models (ML and DL) are often small and particularly sensitive not only to data characteristics but also to model specification (Christensen et al., 2021; Kelly et al., 2019). This sensitivity is especially pronounced in emerging markets, which are characterized by limited data availability and frequent structural breaks.

Based on this literature, the following hypotheses are formulated:

H2a.

Machine learning models provide improved volatility forecasts compared to traditional econometric models.

H2b.

Deep learning models outperform traditional machine learning models in forecasting stock return volatility.

3.3. Incremental Value of Hybrid Modeling Approaches

To leverage the benefits of both econometric and machine learning approaches, hybrid modeling frameworks have been suggested. Specifically, GARCH-type models can capture time-series features such as volatility clustering and persistence, whereas machine and deep learning models can model nonlinear relations and intricate interactions (Krauss et al., 2017; Zhang et al., 2019).

Hybrid models aim to combine the strengths of both approaches, improving forecasting performance while preserving the interpretability of traditional econometric models. Empirical evidence indicates that hybrid frameworks tend to achieve better predictive accuracy, especially when the system exhibits nonlinear dynamics and structural complexity (Zhang et al., 2019).

However, these gains remain context dependent. In practice, hybrid models often fail to capture extreme volatility events because volatility dynamics become highly irregular and noisy during periods of stress or large shocks (Baruník & Křehlík, 2018).

Accordingly, the following hypotheses are proposed:

H3a.

Hybrid GARCH–deep learning models outperform standalone econometric and machine learning models in forecasting stock return volatility.

H3b.

Hybrid models improve the tracking of volatility dynamics but have limited ability to capture extreme volatility spikes.

4. Research Methodology

4.1. Sample and Data Sources

The sample employed in this study consists of an unbalanced panel of publicly listed non-financial firms operating across Middle East and North Africa (MENA) capital markets over the period 2010–2024. The initial dataset was obtained from the London Stock Exchange Group (LSEG) and contained 2093 firms and 26,505 firm-year observations spanning 67 industries and 13 countries. The broad geographical coverage provides substantial variation in institutional quality, financial development, corporate governance practices, and market structures, making the sample particularly suitable for examining the determinants of stock return volatility in emerging markets. Following the data cleaning procedures and the exclusion of financial institutions and REITs, the final cleaned dataset comprised 1596 firms and 19,752 firm-year observations. However, the econometric analyses were conducted using a final estimation sample of 12,895 observations. This reduction is attributable to the use of lagged explanatory variables, lagged stock volatility, and lagged stock returns in the empirical models. Because all explanatory variables were specified in lagged form (t − 1), observations lacking the required prior-year information could not be retained in the estimation sample.

Firm-level financial and governance information was extracted from LSEG and subsequently transformed into a structured panel dataset. The data cleaning process involved several stages. First, duplicate variables and inconsistently labeled observations resulting from the original extraction format were identified and removed. Second, the dataset was reshaped into a firm-year panel structure, and all variables were standardized to ensure consistency across firms and reporting periods. Missing values were examined extensively, and observations with incomplete identifiers or insufficient information were corrected where possible. To preserve the maximum amount of information while minimizing data loss, missing observations were addressed using interpolation techniques and multiple imputation procedures for selected variables. Furthermore, logarithmic and inverse hyperbolic sine (IHS) transformations were applied to highly skewed financial variables to improve their distributional properties and reduce the influence of extreme values.

To mitigate the impact of outliers commonly observed in accounting and market-based variables, all continuous financial variables were winsorized at the 1st and 99th percentiles. Winsorization reduces the influence of extreme observations without eliminating valid firm-year information and is widely employed in empirical corporate finance and accounting research. Following winsorization, the distributions of the transformed variables were reassessed using skewness and kurtosis measures, resulting in substantially improved distributional characteristics suitable for econometric analysis.
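The inverse hyperbolic sine transformation and the percentile winsorization described above can be sketched as follows. This is a minimal illustration assuming NumPy. The function names `ihs` and `winsorize` and the sample values are hypothetical, and applying IHS before winsorizing is an assumption, since the text does not state the order used for each variable.

```python
import numpy as np

def ihs(x):
    """Inverse hyperbolic sine, log(x + sqrt(x^2 + 1)); defined for zero and negative values."""
    return np.arcsinh(np.asarray(x, dtype=float))

def winsorize(x, lower=1.0, upper=99.0):
    """Clip values to the [lower, upper] percentile range instead of dropping them."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanpercentile(x, [lower, upper])  # percentiles ignore missing values
    return np.clip(x, lo, hi)

# Hypothetical skewed ratio with one extreme observation.
raw = np.array([0.2, 0.5, 0.8, 1.1, 1.4, 2.0, 2.5, 3.0, 4.0, 250.0])
transformed = winsorize(ihs(raw))
```

Clipping keeps every firm-year in the sample while pulling the extreme observation back to the 99th-percentile value, which is the property the paragraph above relies on.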

Consistent with prior corporate finance and risk management research, financial institutions were excluded from the final sample. Specifically, firms classified within the Banks, Capital Markets, Consumer Finance, Financial Services, and Insurance industries were removed because their capital structures, regulatory environments, and risk profiles differ fundamentally from those of non-financial corporations. In addition, Real Estate Investment Trusts (REITs), including Diversified REITs, Hotel & Resort REITs, Industrial REITs, and Residential REITs, were excluded because their operating and financing structures are governed by specialized regulatory frameworks that make them incomparable to conventional corporations.

After applying these screening criteria, the final sample comprises 1596 non-financial firms and 19,752 firm-year observations distributed across 58 industries and 13 MENA countries. The sample remains highly diversified geographically and industrially, providing a comprehensive representation of non-financial corporate activity across the MENA region as shown in Table 1, Table 2 and Table 3.

The resulting dataset forms an unbalanced panel, reflecting differences in firm listing dates, disclosure practices, delistings, and data availability across countries and years. Such unbalanced panels are common in emerging-market research and are particularly suitable for panel-data econometric techniques because they maximize the use of available information while minimizing potential measurement error arising from excessive observation deletion. The final sample therefore provides a comprehensive and representative basis for investigating the determinants of stock return volatility across non-financial firms in MENA capital markets.

The dependent variable is stock return volatility, which serves as a measure of firm-level market risk. The explanatory variables are classified into three categories: financial structure risk, liquidity risk, and governance risk. Financial structure risk is represented by leverage and tangibility, liquidity risk is measured using the liquidity ratio and cash ratio, and governance risk is captured through board independence.

Several firm-specific control variables are included to account for differences in firm characteristics that may influence stock return volatility. These controls comprise firm size, measured as the natural logarithm of total assets; a market valuation proxy, measured as enterprise value divided by total assets; profitability, measured by return on assets (ROA); sales growth, measured as the annual percentage change in revenue per share; and firm age, measured as the number of years since incorporation. To address potential simultaneity bias and reduce endogeneity concerns, all explanatory and control variables are specified in lagged form (t − 1). In addition, lagged stock volatility and lagged stock returns are incorporated as dynamic control variables to capture the persistence and clustering effects commonly observed in financial market volatility. The definitions of all the variables are presented in Table 4.

4.2. Data Structure and Variable Construction

Let $i = 1, \dots, N$ denote firms and $t = 1, \dots, T$ denote years. Because the study relies on annual firm-level data, stock return volatility is measured using annual stock returns derived from year-end stock prices. Annual stock returns are calculated as follows:

$$\text{Return}_{i,t} = \frac{P_{i,t} - P_{i,t-1}}{P_{i,t-1}}$$

where $P_{i,t}$ represents the stock price of firm $i$ at the end of year $t$. Stock return volatility is measured as the annualized standard deviation of daily stock returns within each year and is calculated as follows:

$$\text{Volatility}_{i,t} = \sqrt{252} \times SD(r_{i,d})$$

where $r_{i,d}$ denotes the daily stock return of firm $i$ on trading day $d$ during year $t$, and $SD(\cdot)$ represents the standard deviation of all daily returns observed within that year. The factor $\sqrt{252}$ annualizes the volatility measure based on the average number of trading days in a year.

This measure captures the annual level of stock return variability and serves as the study’s primary proxy for firm-level market risk. By utilizing daily return information while retaining an annual firm-year structure, the measure provides a more comprehensive assessment of volatility than volatility estimates derived solely from annual stock returns.
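The two formulas above can be sketched in code as follows. This is a minimal illustration assuming pandas and a long-format table of daily prices. The column names (`firm`, `date`, `price`), the function name `firm_year_measures`, and the toy prices are hypothetical, and the use of the sample standard deviation (ddof = 1) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # annualization factor stated in the text

def firm_year_measures(daily: pd.DataFrame) -> pd.DataFrame:
    """Annual return and annualized volatility per firm-year.

    Assumes hypothetical columns: firm, date (datetime64), price.
    """
    df = daily.sort_values(["firm", "date"]).copy()
    df["year"] = df["date"].dt.year
    # Daily simple return r_{i,d}, computed within each firm.
    df["ret"] = df["price"] / df.groupby("firm")["price"].shift(1) - 1.0

    grouped = df.groupby(["firm", "year"])
    # Volatility_{i,t} = sqrt(252) * SD(r_{i,d}); sample SD is an assumption.
    vol = np.sqrt(TRADING_DAYS) * grouped["ret"].std()
    # Return_{i,t} = (P_{i,t} - P_{i,t-1}) / P_{i,t-1}, using year-end prices.
    year_end = grouped["price"].last()
    annual_ret = year_end / year_end.groupby(level="firm").shift(1) - 1.0

    return pd.DataFrame({"annual_return": annual_ret, "volatility": vol})

# Hypothetical toy prices for one firm across two calendar years.
toy = pd.DataFrame({
    "firm": ["A"] * 5,
    "date": pd.to_datetime(
        ["2020-12-30", "2020-12-31", "2021-01-04", "2021-06-30", "2021-12-31"]
    ),
    "price": [100.0, 102.0, 101.0, 105.0, 107.1],
})
measures = firm_year_measures(toy)
```

The first year of each firm has no prior year-end price, so its annual return is missing, which is one reason lagged specifications shrink the estimation sample.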

All explanatory variables are specified in lagged form, $X_{i,t-1}$, to mitigate simultaneity bias and reduce potential endogeneity concerns. The explanatory variables are classified into three categories: financial structure risk, liquidity risk, and governance risk. Financial structure risk is represented by leverage, measured as total liabilities divided by shareholders’ equity, and tangibility, measured as net property, plant, and equipment divided by total assets. Liquidity risk is captured using the liquidity ratio (current assets divided by current liabilities) and the cash ratio (cash and cash equivalents divided by current liabilities). Governance risk is measured through board independence, calculated as the proportion of independent directors on the board.

Several firm-specific control variables are included to account for differences in firm characteristics that may affect stock return volatility. These controls include firm size, measured as the natural logarithm of total assets; market valuation, proxied by enterprise value divided by total assets; profitability, measured using return on assets (ROA); sales growth, measured as the annual percentage change in revenue per share; and firm age, measured as the number of years since incorporation. To capture the persistence typically observed in financial market volatility, the empirical models also include lagged stock volatility and lagged stock returns as dynamic control variables. This specification enables the analysis to account for both firm-specific characteristics and the time-series dynamics of stock return volatility.

4.3. Data Preprocessing and Statistical Diagnostics

Missing observations were handled using a sequential imputation procedure. First, missing values were linearly interpolated within each firm using the panel structure of the dataset. Second, any remaining gaps were addressed through forward-fill and backward-fill procedures. Finally, residual missing observations were imputed using Multiple Imputation by Chained Equations (MICE) implemented through Scikit-learn’s IterativeImputer with 20 iterations and a fixed random seed of 42. Following the imputation process, no missing values remained in the final analytical dataset.
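The first two steps of this sequential procedure can be sketched for a single firm's series as follows. This is a simplified illustration in plain Python with hypothetical function names and values; the final MICE step is not reproduced here, and a series with no observed values is left untouched for that step.

```python
def interpolate_linear(values):
    """Fill interior gaps (None) by linear interpolation between known neighbours."""
    out = list(values)
    known = [k for k, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for k in range(left + 1, right):
            out[k] = out[left] + step * (k - left)
    return out


def forward_backward_fill(values):
    """Forward-fill, then backward-fill, any gaps that remain at the edges."""
    out = list(values)
    for k in range(1, len(out)):             # forward fill
        if out[k] is None:
            out[k] = out[k - 1]
    for k in range(len(out) - 2, -1, -1):    # backward fill
        if out[k] is None:
            out[k] = out[k + 1]
    return out


def impute_firm_series(values):
    """Steps 1 and 2 of the sequential procedure for one firm's series."""
    return forward_backward_fill(interpolate_linear(values))


# Hypothetical yearly series with leading, interior, and trailing gaps.
series = [None, 2.0, None, None, 5.0, None]
filled = impute_firm_series(series)   # [2.0, 2.0, 3.0, 4.0, 5.0, 5.0]
```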

The dataset is preprocessed according to standard procedures. Extreme observations are winsorized at the 1st and 99th percentiles. Stationarity is tested using time-series and panel unit root tests, namely the Augmented Dickey–Fuller (ADF), Levin–Lin–Chu (LLC), and Im–Pesaran–Shin (IPS) tests, with appropriate transformations applied where necessary. Additionally, cross-sectional dependence is tested using the Pesaran CD test.
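Winsorization at the 1st and 99th percentiles can be sketched as below. The sketch assumes that the cut-offs are computed on the pooled sample with linear interpolation between order statistics; the text does not state whether the cut-offs are pooled or computed by year, and the function names are illustrative.

```python
def percentile(sorted_vals, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def winsorize(values, lower=1.0, upper=99.0):
    """Clip values to the [lower, upper] percentile range of the sample."""
    ordered = sorted(values)
    lo_cut = percentile(ordered, lower)
    hi_cut = percentile(ordered, upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]


# Hypothetical sample: the integers 0..100 plus one extreme outlier.
sample = list(range(101)) + [10_000]
clipped = winsorize(sample)
```

Observations are clipped rather than dropped, so the sample size is unchanged.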

Correlation matrices and Variance Inflation Factors (VIFs) are used to assess multicollinearity. The Breusch–Pagan and White tests are used to test for heteroskedasticity. All econometric specifications use robust standard errors clustered at the firm level.

Structural stability is examined using a COVID-19 dummy variable representing the pandemic period. An F-test compares the restricted and unrestricted specifications to determine whether a significant structural break occurred during the COVID-19 period.

4.4. Econometric Analysis

To examine the determinants of stock volatility, this study employs a panel data framework that accounts for unobserved firm-specific heterogeneity (Hausman, 2015; Hommes, 2013). The baseline model is specified as follows:

\mathrm{Vol}_{i,t} = α_{i} + β^{'} X_{i,t-1} + ϵ_{i,t}

where $\mathrm{Vol}_{i,t}$ represents the stock volatility of firm $i$ in period $t$. The term $α_{i}$ captures firm-specific effects that are invariant over time, while $X_{i,t-1}$ denotes a vector of lagged explanatory variables. The parameter vector $β$ measures the impact of the explanatory variables on stock volatility. Finally, $ϵ_{i,t}$ represents the idiosyncratic error term. All independent variables are lagged by one period to reduce possible endogeneity and simultaneity issues.
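The role of the firm effect $α_{i}$ can be illustrated with the within transformation, which removes it by subtracting firm-level means before estimating the slope. The following sketch is restricted to a single regressor and firm effects only; the data and function names are hypothetical.

```python
from collections import defaultdict


def demean_by_firm(firm_ids, values):
    """Subtract each firm's own mean (the within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for f, v in zip(firm_ids, values):
        totals[f] += v
        counts[f] += 1
    return [v - totals[f] / counts[f] for f, v in zip(firm_ids, values)]


def within_slope(firm_ids, x_lagged, y):
    """Fixed-effects slope for a single regressor via OLS on demeaned data."""
    x_dm = demean_by_firm(firm_ids, x_lagged)
    y_dm = demean_by_firm(firm_ids, y)
    return sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)


# Hypothetical panel: two firms with different intercepts but a common slope of 0.5.
firms = ["A", "A", "A", "B", "B", "B"]
x_lag = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
vol = [0.10 + 0.5 * x for x in x_lag[:3]] + [0.90 + 0.5 * x for x in x_lag[3:]]
beta_hat = within_slope(firms, x_lag, vol)   # recovers 0.5 up to rounding
```

Because the two firm intercepts are differenced out, the estimate depends only on variation within each firm over time.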

Both fixed-effects and random-effects models were estimated. The Hausman test strongly rejected the null hypothesis of no systematic difference between the estimators (χ² = 884.38, p < 0.001), indicating that the fixed-effects specification is more appropriate. Consequently, the study relies on a two-way fixed-effects model with Driscoll–Kraay standard errors for statistical inference.

Diagnostic testing, however, rejects classical panel assumptions such as homoskedasticity and cross-sectional independence. The model is therefore estimated with Driscoll–Kraay standard errors, which provide statistical inference that is robust to heteroskedasticity, serial correlation, and cross-sectional dependence. The adjusted variance–covariance matrix is as follows:

\hat{V}_{DK} = (X^{'} X)^{-1} \left( \sum_{t=-L}^{L} w_{t} \hat{Γ}_{t} \right) (X^{'} X)^{-1}

where $\hat{Γ}_{t}$ denotes the covariance of residuals across cross-sections at lag $t$, and $w_{t}$ represents kernel-based weights. This approach ensures consistent estimation in panels characterized by complex dependence structures, which are common in financial data.
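The sandwich structure of this estimator can be illustrated for the scalar case of a single regressor. The sketch below assumes Bartlett kernel weights and applies no finite-sample correction; both are assumptions, since the text specifies only that the weights are kernel-based. The function name and data are hypothetical.

```python
from collections import defaultdict


def driscoll_kraay_variance(time_ids, x, resid, max_lag):
    """Driscoll-Kraay variance of a single OLS slope (scalar case of the sandwich)."""
    # Sum the moment x_it * e_it over firms within each period.
    h_by_time = defaultdict(float)
    for t, xi, ei in zip(time_ids, x, resid):
        h_by_time[t] += xi * ei
    h = [h_by_time[t] for t in sorted(h_by_time)]

    # Kernel-weighted sum of autocovariances of h_t (Bartlett weights, an assumption).
    meat = sum(v * v for v in h)                         # lag 0
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)
        gamma = sum(h[s] * h[s - lag] for s in range(lag, len(h)))
        meat += 2.0 * weight * gamma                     # lags +lag and -lag

    bread = 1.0 / sum(xi * xi for xi in x)               # (X'X)^{-1}
    return bread * meat * bread


# Hypothetical panel of two firms observed in two periods.
time_ids = [1, 1, 2, 2]
x = [1.0, 1.0, 1.0, 1.0]
resid = [1.0, 1.0, 2.0, 0.0]
var_lag0 = driscoll_kraay_variance(time_ids, x, resid, max_lag=0)   # 0.5
var_lag1 = driscoll_kraay_variance(time_ids, x, resid, max_lag=1)   # 0.75
```

Summing the moments across firms within each period before forming autocovariances is what allows for cross-sectional dependence, while the lag terms allow for serial correlation.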

4.5. Volatility Analysis

In addition to the panel framework, this study models the time-series dynamics of stock return volatility using ARCH-family models, which capture volatility clustering and persistence (Antonakakis et al., 2020; Bollerslev, 1986; Hamilton & Susmel, 1994). The ARCH model specifies conditional variance as a function of past squared residuals:

σ_{t}^{2} = ω + \sum_{i = 1}^{q} α_{i} ϵ_{t - i}^{2}

To allow for both short-term shock effects and long-term persistence, the GARCH (1,1) model is employed:

σ_{t}^{2} = ω + α ϵ_{t - 1}^{2} + β σ_{t - 1}^{2}

where α measures the impact of recent shocks (the ARCH effect) and β captures volatility persistence (the GARCH effect). In conventional GARCH models, the condition α + β < 1 ensures covariance stationarity of the volatility process, whereas values approaching or exceeding unity indicate a highly persistent volatility process in which shocks dissipate slowly over time. Consequently, the magnitude of α + β provides an indication of the degree of volatility persistence in the return series.
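The recursion and the stationarity condition can be illustrated with a short sketch. The parameter values and residuals are hypothetical and chosen only to show the mechanics; initializing the recursion at the long-run variance is a common convention rather than a choice documented in the text.

```python
def garch11_variances(residuals, omega, alpha, beta, initial_variance):
    """Conditional variances sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}."""
    sigma2 = [initial_variance]
    for eps_prev in residuals[:-1]:
        sigma2.append(omega + alpha * eps_prev ** 2 + beta * sigma2[-1])
    return sigma2


def unconditional_variance(omega, alpha, beta):
    """Long-run variance omega / (1 - alpha - beta); defined only when alpha + beta < 1."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta >= 1: the process is not covariance stationary")
    return omega / (1.0 - alpha - beta)


# Hypothetical parameters and residuals, chosen only to illustrate the recursion.
omega, alpha, beta = 0.1, 0.1, 0.8
residuals = [1.0, -2.0, 0.5]
long_run = unconditional_variance(omega, alpha, beta)
path = garch11_variances(residuals, omega, alpha, beta, initial_variance=long_run)
```

With these values the large second residual raises the next conditional variance from about 1.0 to about 1.3, after which it decays at a rate governed by α + β.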

To further account for potential asymmetry in the response of volatility to positive and negative shocks, the Exponential GARCH (EGARCH) model is estimated as follows:

\ln(σ_{t}^{2}) = ω + β \ln(σ_{t-1}^{2}) + α \left( \left| \frac{ϵ_{t-1}}{σ_{t-1}} \right| - E \left| \frac{ϵ_{t-1}}{σ_{t-1}} \right| \right) + γ \frac{ϵ_{t-1}}{σ_{t-1}}

where $γ$ captures the asymmetric effect of shocks on conditional volatility and is commonly interpreted as the leverage effect in financial markets.
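A single step of this recursion shows how the sign of a shock matters. The sketch assumes standard normal innovations, for which E|z| = √(2/π); the parameter values are hypothetical and are not estimates from this study.

```python
import math

# E|z| for a standard normal innovation; a Student-t innovation needs a different constant.
EXPECTED_ABS_Z = math.sqrt(2.0 / math.pi)


def egarch_log_variance(prev_log_variance, z_prev, omega, alpha, beta, gamma):
    """One EGARCH(1,1) step, where z_prev = eps_{t-1} / sigma_{t-1}."""
    return (omega
            + beta * prev_log_variance
            + alpha * (abs(z_prev) - EXPECTED_ABS_Z)
            + gamma * z_prev)


# Hypothetical parameters; gamma < 0 is the sign usually associated with a leverage effect.
params = dict(omega=0.0, alpha=0.2, beta=0.9, gamma=-0.1)
after_negative_shock = egarch_log_variance(0.0, -1.0, **params)
after_positive_shock = egarch_log_variance(0.0, +1.0, **params)
```

For shocks of equal magnitude, the two log-variances differ by exactly 2|γ| times the shock size, so a nonzero γ is what produces the asymmetric response.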

The EGARCH model is also estimated under a Student-t distribution (EGARCH-t) to account for possible deviations from normality and to capture heavy-tailed behavior in financial returns. This specification provides greater flexibility for modeling extreme observations, which are common in financial time series.

Combining a panel econometric model with ARCH-family volatility models provides a framework for analyzing both the cross-sectional determinants of stock volatility and its dynamic behavior. This combined approach ensures that both fundamental firm characteristics and conditional volatility dynamics are accounted for in the empirical analysis.

The GARCH-family models were estimated using the pooled series of lagged stock returns obtained from the final panel dataset. Consequently, the estimated volatility models capture aggregate volatility dynamics represented in the pooled sample rather than firm-specific or country-specific conditional volatility processes.

4.6. Robustness Checks

To assess the robustness of the empirical findings, additional estimations were conducted by incorporating industry and country fixed effects. Industry fixed effects control for sector-specific characteristics that may influence stock return volatility, while country fixed effects account for differences in institutional environments, regulatory frameworks, and levels of financial market development across MENA countries. The results remained qualitatively similar to the baseline specification, indicating that the main findings are not driven by industry-specific or country-specific effects. Therefore, the conclusions regarding the relationship between financial risk factors and stock return volatility appear robust to alternative model specifications.

4.7. Machine Learning Models

To capture nonlinear relationships, ensemble machine learning models are implemented, specifically Random Forest and Extreme Gradient Boosting (XGBoost), as shown in Figure 1 (Breiman, 2001; Chen & Guestrin, 2016). Model hyperparameters are optimized via grid search with time-series cross-validation, ensuring temporal ordering is preserved.
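The idea of a grid search under time-series cross-validation can be sketched with expanding-window folds in which the training indices always precede the validation indices. In the sketch below, a simple moving-average forecaster stands in for the Random Forest and XGBoost models, and the fold construction and all names are assumptions made for illustration.

```python
def expanding_window_splits(n_obs, n_splits):
    """Yield (train_indices, validation_indices); training always precedes validation."""
    fold = n_obs // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = fold * k
        val_end = n_obs if k == n_splits else fold * (k + 1)
        yield list(range(train_end)), list(range(train_end, val_end))


def mean_of_last(train_values, window):
    """Stand-in model: forecast with the mean of the last `window` training values."""
    recent = train_values[-window:]
    return sum(recent) / len(recent)


def select_window(series, candidate_windows, n_splits=3):
    """Grid search: return the window with the lowest mean validation MSE."""
    best_window, best_mse = None, float("inf")
    for window in candidate_windows:
        fold_errors = []
        for train_idx, val_idx in expanding_window_splits(len(series), n_splits):
            forecast = mean_of_last([series[i] for i in train_idx], window)
            fold_errors.append(
                sum((series[i] - forecast) ** 2 for i in val_idx) / len(val_idx))
        mse = sum(fold_errors) / len(fold_errors)
        if mse < best_mse:
            best_window, best_mse = window, mse
    return best_window


# Hypothetical upward-trending series: the most recent value is the better forecast.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
best = select_window(series, candidate_windows=[1, 2])   # selects 1
```

No fold ever validates on observations that precede its training data, which is the property that preserves temporal ordering.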

The Random Forest prediction function is given by the following:

\hat{f} (x) = \frac{1}{B} \sum_{b = 1}^{B} T_{b} (x)

while XGBoost models the output as follows:

{\hat{y}}_{i} = \sum_{k = 1}^{K} f_{k} (x_{i})

To prevent data leakage, the dataset is partitioned using a chronological split, where training data precede validation and testing data. Model performance is evaluated on strictly out-of-sample observations. These models are particularly suited to high-dimensional financial datasets characterized by nonlinear dependencies.

4.8. Deep Learning Models

To capture the nonlinear and dynamic nature of stock return volatility, this study employs two recurrent neural network architectures: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks. These models are particularly suitable for financial time-series forecasting because they can capture temporal dependencies and sequential patterns that may not be adequately modeled using traditional econometric techniques (Cahuantzi et al., 2023).

The LSTM architecture addresses the vanishing gradient problem through a memory cell and a set of gating mechanisms that regulate the flow of information over time, as presented in Figure 2. The model is represented as follows:

f_{t} = σ(W_{f} x_{t} + U_{f} h_{t-1} + b_{f})

i_{t} = σ(W_{i} x_{t} + U_{i} h_{t-1} + b_{i})

\tilde{C}_{t} = \tanh(W_{c} x_{t} + U_{c} h_{t-1} + b_{c})

C_{t} = f_{t} ⊙ C_{t-1} + i_{t} ⊙ \tilde{C}_{t}

o_{t} = σ(W_{o} x_{t} + U_{o} h_{t-1} + b_{o})

h_{t} = o_{t} ⊙ \tanh(C_{t})

where $f_{t}$, $i_{t}$, and $o_{t}$ denote the forget, input, and output gates, respectively; $C_{t}$ represents the cell state; $h_{t}$ denotes the hidden state; $σ(\cdot)$ is the sigmoid activation function; and $⊙$ denotes element-wise multiplication.
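These equations can be traced through a single-unit cell in which every weight is a scalar, so that element-wise multiplication reduces to ordinary multiplication. The sketch is purely illustrative: the weights are hypothetical and the study's models use layers of 64 and 32 units.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a single-unit LSTM cell; p maps names such as 'W_f' to scalar weights."""
    f_t = sigmoid(p["W_f"] * x_t + p["U_f"] * h_prev + p["b_f"])        # forget gate
    i_t = sigmoid(p["W_i"] * x_t + p["U_i"] * h_prev + p["b_i"])        # input gate
    c_tilde = math.tanh(p["W_c"] * x_t + p["U_c"] * h_prev + p["b_c"])  # candidate state
    c_t = f_t * c_prev + i_t * c_tilde                                  # new cell state
    o_t = sigmoid(p["W_o"] * x_t + p["U_o"] * h_prev + p["b_o"])        # output gate
    h_t = o_t * math.tanh(c_t)                                          # new hidden state
    return h_t, c_t


# Hypothetical weights: every input and recurrent weight 0.5 and every bias 0.0.
weights = {f"{kind}_{gate}": 0.5 for kind in ("W", "U") for gate in "fico"}
weights.update({f"b_{gate}": 0.0 for gate in "fico"})

h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.1]:          # a short input sequence
    h, c = lstm_step(x, h, c, weights)
```

The cell state `c` is carried forward additively through the forget gate, which is the mechanism that mitigates vanishing gradients.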

The GRU architecture provides a computationally simpler alternative that relies on update and reset gates, without a separate memory cell, while preserving the ability to capture long-term dependencies in sequential data.

To prevent look-ahead bias and data leakage, observations are partitioned using a chronological time-series split in which the training sample precedes the validation sample and the validation sample precedes the testing sample. The dataset is divided into 70% training observations, 15% validation observations, and 15% testing observations. Rolling sequences of historical observations are constructed and used as inputs to predict future stock volatility. Model performance is evaluated exclusively on strictly out-of-sample test observations.
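The chronological split and the construction of rolling input sequences can be sketched as follows. The lookback length of three periods and the choice to build sequences within each partition are assumptions made for illustration; the series is hypothetical.

```python
def chronological_split(observations, train_pct=70, val_pct=15):
    """Split time-ordered observations into train/validation/test without shuffling."""
    n = len(observations)
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return observations[:train_end], observations[train_end:val_end], observations[val_end:]


def rolling_sequences(series, lookback):
    """Pair each window of `lookback` past values with the value that follows it."""
    return [(series[k - lookback:k], series[k]) for k in range(lookback, len(series))]


# Hypothetical time-ordered volatility series.
series = [0.20, 0.22, 0.25, 0.21, 0.30, 0.28, 0.26, 0.31, 0.29, 0.27,
          0.33, 0.35, 0.30, 0.32, 0.34, 0.36, 0.31, 0.38, 0.37, 0.40]
train, val, test = chronological_split(series)
train_pairs = rolling_sequences(train, lookback=3)
```

Building the sequences after the split keeps every input window inside its own partition, so no training window contains validation or test observations.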

Both the LSTM and GRU models consist of two hidden layers containing 64 and 32 units, respectively, with a dropout rate of 0.20 applied between layers to reduce overfitting. The models are estimated on a pooled dataset of firms, preserving the chronological ordering of observations, and are trained using the Adam optimizer and the mean squared error (MSE) loss function, with a batch size of 32 observations and a maximum of 100 training epochs. Hyperparameters are selected based on validation performance, and early stopping based on validation loss is implemented to prevent overfitting and improve generalization. These implementation choices support the transparency and reproducibility of the forecasting framework.

4.9. Hybrid GARCH–Deep Learning Model

A hybrid modeling framework, presented in Figure 3, is employed to integrate econometric and deep learning approaches. In the first stage, firm-level GJR-GARCH models are estimated to generate conditional variance forecasts $\hat{h}_{i,t}$. In the second stage, these forecasts are incorporated as inputs into the deep learning models alongside lagged returns and financial variables:

\mathrm{Vol}_{i,t} = f(R_{i,t-1}, \hat{h}_{i,t-1}, X_{i,t-1})

This approach is designed to capture both structured volatility dynamics and complex nonlinear interactions, with the aim of enhancing predictive performance.

A rolling-origin forecasting approach is adopted to ensure a realistic out-of-sample evaluation. Models are estimated using data up to time $t$ and used to generate one-step-ahead forecasts for $t + 1$, with the estimation window expanding iteratively.

Forecast accuracy is evaluated using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), out-of-sample R², and the QLIKE loss function. Statistical differences in predictive performance are assessed using the Diebold–Mariano test.
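These evaluation measures can be written compactly as below. The text does not give their exact definitions, so the sketch makes several assumptions: the out-of-sample R² is measured against a benchmark forecast supplied by the user, QLIKE takes the form a/f − ln(a/f) − 1 applied to variances, and the Diebold–Mariano statistic uses squared-error loss for one-step forecasts without autocovariance terms.

```python
import math


def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def out_of_sample_r2(actual, forecast, benchmark):
    """1 - SSE(model) / SSE(benchmark); negative when the benchmark forecasts better."""
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 1.0 - sse_model / sse_bench


def qlike(actual_variance, forecast_variance):
    """Mean of a/f - ln(a/f) - 1, which is zero for a perfect variance forecast."""
    ratios = [a / f for a, f in zip(actual_variance, forecast_variance)]
    return sum(r - math.log(r) - 1.0 for r in ratios) / len(ratios)


def diebold_mariano(errors_a, errors_b):
    """DM statistic on squared-error loss; positive values favor model B."""
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    mean_d = sum(d) / len(d)
    var_d = sum((v - mean_d) ** 2 for v in d) / len(d)
    return mean_d / math.sqrt(var_d / len(d))


# Hypothetical realized volatilities, model forecasts, and a constant benchmark.
actual = [0.30, 0.40, 0.50]
forecast = [0.35, 0.40, 0.45]
benchmark = [0.40, 0.40, 0.40]
r2_oos = out_of_sample_r2(actual, forecast, benchmark)   # approximately 0.75
```

A negative out-of-sample R² therefore means only that the model's squared errors exceed those of the chosen benchmark.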

4.10. Model Implementation and Replication Procedures

To enhance transparency and reproducibility, all machine learning, deep learning, and hybrid forecasting models were implemented within a standardized computational framework shown in Appendix A. Data preprocessing included missing-value treatment, variable transformations, winsorization at the 1st and 99th percentiles, and the construction of lagged explanatory variables. All models were estimated using identical input variables and evaluation procedures to ensure comparability across forecasting approaches.

As described in Section 4.8, observations are partitioned chronologically into training (70%), validation (15%), and testing (15%) samples, with the training set strictly preceding the validation and testing sets, and historical data are transformed into rolling sequences that serve as inputs for predicting future stock volatility. The LSTM and GRU models use two hidden layers (64 and 32 units, respectively), a dropout rate of 0.20 between layers, the Adam optimizer, the mean squared error (MSE) loss function, a batch size of 32, a maximum of 100 epochs, and early stopping based on validation loss. Model performance is evaluated exclusively on out-of-sample test observations.

Hyperparameter tuning was performed using the validation dataset. For the Random Forest model, the number of trees, maximum tree depth, and minimum node size were evaluated across alternative specifications. For XGBoost, tuning included the learning rate, tree depth, number of estimators, and subsampling parameters. For the LSTM and GRU models, alternative sequence lengths, hidden-layer dimensions, dropout rates, batch sizes, and learning rates were evaluated. The final model specifications were selected based on the lowest validation error.

To ensure reproducibility, all experiments were conducted using a fixed random seed. The complete forecasting workflow followed a consistent sequence of data preprocessing, model training, hyperparameter optimization, out-of-sample forecasting, and performance evaluation. Forecasting accuracy was assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²).

The forecasting procedure is summarized in Algorithm 1:

Algorithm 1. Stock Volatility Forecasting Framework

Input: Firm-level financial variables, governance variables, and stock return data.
Output: Out-of-sample stock volatility forecasts.

1: Collect firm-level financial data and stock return data.
2: Clean, transform, and winsorize variables.
3: Construct lagged explanatory variables and stock volatility measures.
4: Partition observations into training (70%), validation (15%), and testing (15%) samples.
5: Generate rolling input sequences for recurrent neural network models.
6: Train Random Forest and XGBoost models using the training sample.
7: Train LSTM and GRU models and optimize hyperparameters using the validation sample.
8: Estimate GARCH-family models and generate volatility forecasts.
9: Construct hybrid GARCH–deep learning models using GARCH volatility estimates as additional inputs.
10: Generate out-of-sample forecasts using the testing sample.
11: Evaluate forecasting performance using RMSE, MAE, MAPE, and R².
12: Compare forecasting performance across econometric, machine learning, deep learning, and hybrid models.

5. Results

5.1. Descriptive Analysis

Table 5 presents the descriptive statistics for all variables employed in the analysis. The average stock volatility is 0.3968, with a median value of 0.2956, indicating moderate variation in stock price fluctuations across firms. The standard deviation of 0.5016 suggests considerable dispersion in volatility levels among the sampled firms.

Regarding firm-specific characteristics, the mean leverage ratio is 2.2025, while the median is substantially lower at 0.8034. Similarly, Tobin’s Q exhibits a mean of 7.2272 and a median of 0.8719. The large discrepancies between means and medians, together with extremely high maximum values, indicate the presence of outliers and positively skewed distributions. Sales growth also demonstrates substantial variability, with a mean of 6.0096 and a maximum value of 33,878.5915, reflecting significant heterogeneity in firm growth performance.

Liquidity measures display notable dispersion. The mean liquidity ratio and cash ratio are 3.5676 and 1.2676, respectively, whereas their median values are considerably lower at 1.4746 and 0.2168. This suggests that a relatively small number of firms maintain exceptionally high liquidity positions. Board independence averages 4.8899, indicating that independent directors constitute a meaningful proportion of board membership across the sample.

The average firm size, measured by the natural logarithm of total assets, is 18.6814, with a median value of 18.6551, suggesting a relatively symmetric distribution. Firm age averages 27.51 years, indicating that the sample primarily consists of mature firms. Profitability, measured by return on assets (ROA), exhibits a mean of −0.0572 and a median of 0.0315, implying that while most firms report positive profitability, several firms experienced substantial losses during the sample period.

The lagged variables exhibit characteristics similar to their contemporaneous counterparts. In particular, lagged volatility has a mean of 0.3908 and a median of 0.2915, reflecting persistence in stock volatility over time. Likewise, lagged returns display a mean of 0.0753 and a median close to zero, suggesting considerable variation in firms’ historical stock performance.

5.2. Correlation Analysis, Multicollinearity, and Diagnostic Tests

Table 6 presents the Pearson correlation matrix for all variables included in this study. The results indicate that most pairwise correlations among the explanatory variables are relatively low, suggesting weak linear relationships. However, a relatively high correlation is observed between Lag Liquidity Ratio and Lag Cash Ratio (r = 0.800), which may signal potential multicollinearity according to some thresholds in the literature. In addition, a strong correlation is found between Stock Volatility and Lagged Volatility (r = 0.828), which is theoretically expected and reflects the persistence of volatility over time rather than redundancy among independent variables, as it involves a dependent variable and its lagged value.

Despite the relatively high correlation between the liquidity and cash ratios, reliance on pairwise correlations alone is insufficient to conclude the presence of multicollinearity. Therefore, the Variance Inflation Factor (VIF) is employed as a more robust diagnostic measure. As reported in Table 7, the VIF values range from 1.000 to 2.140, which are well below the commonly accepted thresholds of 5 and 10. The highest VIF values correspond to Lag Liquidity Ratio (2.140) and Lag Cash Ratio (2.050), consistent with their correlation level. Nevertheless, these values remain within acceptable limits, indicating that multicollinearity is not severe and does not undermine the reliability of the estimated coefficients.

Table 8 summarizes the diagnostic tests conducted to evaluate the suitability of the panel regression model. The Breusch–Pagan test yields a statistically significant result (BP = 440.430, p < 0.001), indicating the presence of heteroskedasticity. Similarly, the Wooldridge/Breusch–Godfrey test reveals serial correlation in the idiosyncratic errors (χ² = 6.007, p = 0.014). The Pesaran CD test further indicates significant cross-sectional dependence among firms (z = 34.336, p < 0.001), suggesting that common market-wide shocks affect multiple firms simultaneously. In addition, the structural break test associated with the COVID-19 period is highly significant (F = 19.998, p < 0.001), providing evidence that the pandemic altered the volatility process during the sample period.

Finally, the Hausman specification test strongly rejects the null hypothesis that the random-effects estimator is consistent (χ² = 884.380, p < 0.001). This finding indicates that firm-specific effects are correlated with the explanatory variables and supports the use of a fixed-effects model rather than a random-effects model. Given the presence of heteroskedasticity, serial correlation, and cross-sectional dependence, this study employs fixed-effects estimation with Driscoll–Kraay robust standard errors. This approach produces reliable statistical inference by correcting for these econometric issues while accounting for unobserved firm heterogeneity.

5.3. Econometric Analysis Results

Table 9 reports the estimation results from pooled OLS, firm fixed-effects, two-way fixed-effects, and two-way fixed-effects models with Driscoll–Kraay standard errors. As reported in Section 5.2, the diagnostic tests reveal the presence of heteroskedasticity, serial correlation, and cross-sectional dependence. Accordingly, the two-way fixed-effects model with Driscoll–Kraay standard errors is adopted as the preferred specification, as it provides more reliable inference under these conditions.

The results show that lagged volatility is the most consistent determinant of current stock volatility. Its coefficient remains positive and highly statistically significant across all model specifications, confirming the presence of strong volatility persistence over time. Lagged return also exhibits a positive and statistically significant effect in the preferred model, suggesting that higher past returns are associated with increased subsequent volatility.

Regarding firm-specific characteristics, the liquidity ratio is positively related to stock volatility and remains statistically significant in the preferred specification. In contrast, the cash ratio shows a negative association with volatility, although its level of statistical significance weakens after applying Driscoll–Kraay standard errors. These findings suggest that while higher liquidity may be linked to greater exposure to market fluctuations, holding more cash can contribute to stabilizing stock price movements.

Some variables, however, do not display robust effects across model specifications. In particular, leverage, tangibility, board independence, firm size, Tobin’s Q, and profitability are not statistically significant in the preferred model, despite showing significance in pooled OLS in some cases. This indicates that their effects are sensitive to controlling for unobserved heterogeneity and time effects.

Firm age shows a positive and statistically significant effect in the preferred model, although its magnitude differs across specifications, suggesting some degree of sensitivity. Finally, while lagged sales growth is statistically significant under Driscoll–Kraay standard errors, its coefficient is economically negligible, implying a limited practical impact on stock volatility.

Consequently, the results provide evidence that stock volatility is primarily driven by its own past behavior and, to a lesser extent, by return dynamics and selected liquidity-related factors. However, the sensitivity of some coefficients across specifications suggests that the findings should be interpreted with appropriate caution.

5.4. Robustness Analysis of Standard Errors

Table 10 reports a comparison of alternative standard error estimators for the two-way fixed-effects model. The results highlight clear differences between conventional and adjusted standard errors, underscoring the importance of correcting for violations of classical regression assumptions.

In general, standard errors increase for several key variables when more robust estimation techniques are employed. This pattern is particularly evident for Lagged Volatility and Lagged Return. The standard error of Lagged Volatility rises substantially from 0.00789 under conventional estimation to 0.05473 using Driscoll–Kraay corrections. Similarly, the standard error of Lagged Return increases from 0.00411 to 0.01301. These findings indicate that conventional fixed-effects standard errors may considerably underestimate true variability when heteroskedasticity, serial correlation, and cross-sectional dependence are present.

For other variables, such as Lag Liquidity Ratio and Lag ROA, standard errors also increase under robust and clustered estimators, although to a lesser extent. In contrast, some variables, including Lag Tangibility and Lag Board Independence, exhibit relatively small changes or even slight reductions in standard errors, suggesting that the impact of misspecification is not uniform across regressors.

A comparison across estimators shows that standard errors clustered at the firm and year levels are, in some cases, comparable to Driscoll–Kraay estimates. However, notable differences remain for key variables, particularly Lagged Volatility and Lagged Return, where Driscoll–Kraay standard errors are substantially larger. This reinforces the relevance of accounting for cross-sectional dependence in addition to heteroskedasticity and serial correlation.

It is also worth noting that the standard errors associated with Lag Sales Growth are extremely small across all specifications. This likely reflects the limited scale of the variable rather than exceptionally high estimation precision.

Accordingly, the evidence suggests that inference based on conventional standard errors may be misleading in this context. Given that the diagnostic tests confirm the presence of heteroskedasticity, serial correlation, and cross-sectional dependence, the use of Driscoll–Kraay standard errors is warranted, as this approach simultaneously addresses these econometric issues and provides more reliable statistical inference.

Furthermore, to verify the robustness of the baseline findings, the model is re-estimated by incorporating country fixed effects, industry fixed effects, and both sets simultaneously (Table 11). The results show that the main variables of interest remain largely stable across specifications, although some control variables exhibit changes in statistical significance.

In particular, the Lag Liquidity Ratio remains positive and highly statistically significant across all models, with coefficients ranging from 0.00193 to 0.00207. Similarly, the Lag Cash Ratio retains a negative and highly significant effect, with coefficients between −0.00288 and −0.00315. Lagged Volatility continues to display the strongest positive association with current stock volatility, with coefficients close to 0.79 in all specifications, indicating persistent volatility dynamics. Lagged Return also remains positive and statistically significant, with coefficients around 0.04.

With respect to control variables, some coefficients exhibit sensitivity to the inclusion of additional fixed effects. For instance, firm size becomes statistically significant across specifications, while profitability (ROA) and firm age show significance in certain models. These variations suggest that controlling for country and industry heterogeneity affects the estimated impact of selected firm characteristics.

The explanatory power of the model improves slightly after accounting for country and industry effects. The R² increases to 0.6708 under the country fixed-effects specification, 0.6698 under the industry specification, and 0.6727 when both are included. However, the differences across these specifications remain limited, indicating that both dimensions contribute comparably to explaining variation in stock volatility.

Thus, the results are broadly consistent with the baseline findings, providing supportive evidence for the robustness of the main conclusions while highlighting some sensitivity among control variables.

5.5. Volatility Analysis Results

To examine the dynamic behavior of stock return volatility, this study estimates three conditional volatility models: GARCH (1,1), EGARCH, and EGARCH-t. These models are designed to capture volatility clustering, persistence, and potential asymmetric responses to market shocks. Model performance is assessed using the Akaike Information Criterion (AIC), where lower values indicate a superior fit. All volatility models were estimated using maximum likelihood estimation on the aggregate stock return series.

Table 12 presents the estimated parameters and model selection statistics. Among the competing specifications, the EGARCH-t model yields the lowest AIC value (0.7704), followed by the EGARCH model (1.6673) and the standard GARCH (1,1) model (1.7698). While these results favor the EGARCH-t specification, the differences across models should be interpreted with caution.

The persistence parameter (β) is highest in the EGARCH-t model (β = 0.9720), followed by the EGARCH model (β = 0.9205), indicating that volatility shocks are highly persistent and dissipate gradually over time. By comparison, the GARCH(1,1) model exhibits lower persistence (β = 0.4413), implying a weaker carryover effect of past volatility.

The shock parameter (α) captures the immediate response of volatility to new information. Within the EGARCH framework, the estimated coefficients (α = −0.3175 for EGARCH and α = −0.6931 for EGARCH-t) suggest a relatively strong response of volatility to shocks, particularly under the Student-t specification. The GARCH(1,1) model also exhibits a substantial shock response (α = 0.6208), although direct comparisons across model types should be interpreted with caution due to differences in model structure.

The estimated asymmetry coefficients further highlight differences across specifications. The EGARCH model produces a positive asymmetry coefficient (γ = 0.4325), while the EGARCH-t model reports a larger positive coefficient (γ = 1.0550). These results indicate the presence of asymmetric volatility dynamics, suggesting that positive and negative shocks have different effects on conditional volatility. The larger coefficient under the Student-t specification further indicates that asymmetry becomes more pronounced when heavy-tailed return behavior is taken into account.

The GARCH(1,1) model was estimated using maximum likelihood under the assumption of conditional normality. Both the shock parameter (α = 0.6208) and the persistence parameter (β = 0.4413) are statistically significant. The combined persistence measure (α + β = 1.0621) exceeds unity, so the covariance-stationarity condition (α + β < 1) is not satisfied. The estimated process is therefore highly persistent and behaves like an integrated, rather than mean-reverting, volatility process.
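The implication of α + β > 1 can be illustrated with the standard multi-step variance forecast recursion for GARCH(1,1), in which each additional step multiplies the previous forecast by α + β and adds the constant ω. The sketch below uses the α and β estimates reported above; the constant ω, the starting variance, and the stationary comparison parameters are hypothetical values chosen only for illustration.

```python
def garch11_variance_forecast(omega, alpha, beta, sigma2_next, horizon):
    """Multi-step conditional variance forecasts for a GARCH(1,1) model."""
    path = [sigma2_next]  # one-step-ahead variance
    for _ in range(horizon - 1):
        # E[sigma2_{t+h}] = omega + (alpha + beta) * E[sigma2_{t+h-1}]
        path.append(omega + (alpha + beta) * path[-1])
    return path

alpha, beta = 0.6208, 0.4413  # estimates reported in the text
omega = 0.001                 # hypothetical: not reported in this section
persistence = alpha + beta

# With alpha + beta > 1 the forecasts grow with the horizon
explosive = garch11_variance_forecast(omega, alpha, beta, 0.03, 20)

# Hypothetical stationary case (alpha + beta = 0.95) for comparison:
# forecasts converge to the unconditional variance omega / (1 - alpha - beta)
stationary = garch11_variance_forecast(omega, 0.05, 0.90, 0.03, 500)
long_run = omega / (1 - 0.95)

print(round(persistence, 4), explosive[-1] > explosive[0], stationary[-1], long_run)
```

In the stationary case the forecasts settle at a finite long-run variance, whereas with the reported estimates no such finite level exists.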

Diagnostic tests support the adequacy of the GARCH specification. The Ljung–Box tests on standardized squared residuals fail to reject the null hypothesis of no remaining serial correlation, while the ARCH-LM tests indicate the absence of residual ARCH effects. In addition, the sign bias tests do not indicate significant model misspecification. Hence, these results suggest that the estimated volatility models provide a broadly adequate representation of the underlying volatility process.

5.6. Machine Learning and Deep Learning Forecasting Performance

To complement the econometric and volatility modeling frameworks, this study evaluates the out-of-sample forecasting performance of several machine learning and deep learning approaches. Specifically, Random Forest (RF), Extreme Gradient Boosting (XGBoost), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) models are employed to forecast stock volatility. Forecast accuracy is assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²). Furthermore, the Diebold–Mariano (DM) test is conducted to compare predictive accuracy across competing forecasting frameworks.

Table 13 presents a comprehensive cross-framework comparison encompassing econometric, volatility, machine learning, deep learning, and hybrid forecasting models. The results indicate that forecasting performance is highly similar across the alternative approaches. The Pooled OLS model achieves the lowest RMSE (0.1704) and shares the lowest MAE (0.1471) with the LSTM model. LSTM exhibits the lowest MAPE (75.9936%), followed closely by Random Forest (76.4015%) and Pooled OLS (76.4019%). In contrast, the volatility-based GARCH(1,1) and EGARCH models produce larger forecasting errors, with RMSE values of 0.1840 and 0.1839, respectively, and MAPE values exceeding 92%. The Hybrid GARCH–LSTM model does not generate meaningful forecasting gains relative to the standalone LSTM model, recording marginally higher error measures. Moreover, all competing models exhibit negative or near-zero R² values, suggesting limited explanatory power and underscoring the inherent difficulty of accurately forecasting firm-level stock volatility.

The Diebold–Mariano predictive accuracy statistic is −1.2520. In absolute value, this falls below the conventional two-sided 5% critical value of 1.96, so the null hypothesis of equal predictive accuracy between the forecasts under comparison cannot be rejected. Consistent with the small variations observed in RMSE, MAE, and MAPE across the leading models, the findings indicate that advanced machine learning, deep learning, and hybrid forecasting architectures offer only limited incremental predictive benefits over simpler econometric specifications. The evidence supports the view that stock volatility remains highly persistent and challenging to forecast, regardless of the modeling framework employed.
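For readers who wish to reproduce this type of comparison, the following is a minimal sketch of a Diebold–Mariano test under squared-error loss. It assumes an asymptotic standard normal reference distribution and, by default, one-step-ahead forecasts; the function name, the simulated error series, and these settings are illustrative assumptions rather than the exact procedure used in the study.

```python
import math
import numpy as np

def diebold_mariano(e1, e2, h=1, power=2):
    """DM statistic and two-sided p-value for two forecast error series."""
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    d = np.abs(e1) ** power - np.abs(e2) ** power  # loss differential
    n = d.size
    dc = d - d.mean()
    # long-run variance of d using autocovariances up to lag h - 1
    lrv = np.dot(dc, dc) / n
    for k in range(1, h):
        lrv += 2.0 * np.dot(dc[k:], dc[:-k]) / n
    stat = d.mean() / math.sqrt(lrv / n)
    # two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value

# Simulated forecast errors (illustrative): model A is more accurate than model B
rng = np.random.default_rng(42)
errors_a = rng.normal(scale=1.0, size=200)
errors_b = rng.normal(scale=2.0, size=200)
stat, p_value = diebold_mariano(errors_a, errors_b)

# p-value implied by the reported statistic under the same normal approximation
reported_p = math.erfc(abs(-1.2520) / math.sqrt(2.0))
print(stat, p_value, round(reported_p, 2))
```

A negative statistic indicates that the first model has the lower average loss. Under the normal approximation, the reported statistic of −1.2520 corresponds to a two-sided p-value of roughly 0.21.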

The out-of-sample forecasting results are presented in Figure 4, which compares actual stock volatility with forecasts generated by the Random Forest and Hybrid GARCH–LSTM models. The results reveal distinct forecasting behaviors across the two approaches. The Random Forest model exhibits slightly greater responsiveness to fluctuations in realized volatility than the GRU model. However, both forecasting approaches generate substantially smoother volatility paths and do not fully capture the magnitude of extreme volatility spikes and troughs observed in the actual series. In contrast, the Hybrid GARCH–LSTM model produces smoother forecasts that follow the underlying trend of volatility but are less responsive to abrupt changes in realized volatility. While the hybrid model effectively filters noise and captures broader volatility regimes, its forecasts exhibit a smoothing effect that reduces sensitivity to extreme market movements. Figure 4 suggests that the Random Forest model provides closer short-term tracking of realized volatility, whereas the Hybrid GARCH–LSTM model is more effective in capturing long-run volatility trends.

To evaluate the stability of forecasting performance over time, Figure 5 presents the rolling RMSE calculated across the testing period. The results indicate that forecast errors fluctuate within a relatively narrow range, varying between approximately 0.16 and 0.18 throughout the evaluation horizon. Although periods of higher and lower forecasting accuracy are evident, no persistent upward or downward trend is observed. Instead, the rolling RMSE exhibits cyclical movements, suggesting that forecasting performance remains relatively stable over time while responding to changing market conditions and volatility regimes. These findings highlight the time-varying nature of stock volatility and the ongoing challenges associated with maintaining consistent forecasting accuracy across different market environments.
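A rolling RMSE of this kind can be computed as in the sketch below. The window length is not reported in this section, so the value used here is hypothetical, as are the function name and the toy input series.

```python
import numpy as np

def rolling_rmse(actual, predicted, window):
    """RMSE over each block of `window` consecutive test observations."""
    err2 = (np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)) ** 2
    # cumulative sums give every window total in a single pass
    csum = np.concatenate(([0.0], np.cumsum(err2)))
    return np.sqrt((csum[window:] - csum[:-window]) / window)

# Toy series (illustrative): squared errors are 0, 1, 4, 9
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 1.0, 1.0, 1.0]
rolled = rolling_rmse(actual, predicted, window=2)  # hypothetical window length
print(rolled)
```

The output contains one value per window position, that is, n − window + 1 values for a test sample of n observations.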

Figure 6 presents the residual diagnostics of the Hybrid GARCH–LSTM model. The left panel illustrates the residual dispersion profile by plotting empirical forecasting errors against the conditional volatility predictions generated by the model. The residuals are distributed around the zero-error benchmark without any discernible systematic pattern, suggesting that forecast errors are largely random and that the model does not exhibit substantial bias across different volatility levels. Moreover, the dispersion of residuals remains relatively stable across fitted values, indicating the absence of pronounced heteroskedasticity.

The right panel of Figure 6 displays the empirical distribution of residuals. The histogram and kernel density estimate show that residuals are centered close to zero and are broadly symmetric around the reference line, suggesting that the model does not systematically overestimate or underestimate stock volatility. While residual dispersion remains evident, the overall error distribution supports the adequacy of the Hybrid GARCH–LSTM framework in generating balanced volatility forecasts. Collectively, Figure 6 indicates that prediction errors are largely random, although the magnitude of residual variation highlights the inherent difficulty of forecasting stock market volatility.

Figure 7 and Figure 8 present SHAP-based feature attribution beeswarm plots for the Random Forest and XGBoost volatility forecasting models, respectively. The figures provide insights into both the relative importance of predictor variables and the direction of their influence on volatility forecasts. In both models, most SHAP values are concentrated around zero, indicating that individual predictors exert modest marginal effects on the predicted volatility. Nevertheless, notable differences emerge in the ranking and distribution of influential features.

As shown in Figure 7, the Random Forest model identifies Lag_Cash_Ratio, Lag_Liquidity_Ratio, and Lag_Board_Independence as the most influential determinants of stock volatility. Higher values of liquidity-related variables generally contribute positively to the model output, while lower values tend to reduce predicted volatility. Governance characteristics, particularly board independence, also display meaningful explanatory power, suggesting that corporate governance conditions play a role in shaping future volatility dynamics. Financial market variables such as Lagged_Return and Lagged_Volatility exhibit comparatively smaller SHAP magnitudes, indicating a more limited contribution within the Random Forest framework.

In contrast, Figure 8 demonstrates that the XGBoost model assigns the highest importance to Lag_Tobins_Q, followed by Lag_Firm_Age, Lag_Cash_Ratio, and Lag_Leverage. The wider dispersion of SHAP values observed for Tobin’s Q indicates that firm valuation metrics contribute more strongly to volatility predictions under the gradient boosting architecture. The positive SHAP values associated with high Tobin’s Q observations suggest that firms with stronger market valuations are more likely to experience elevated future volatility. Similarly, firm age, leverage, and liquidity measures exhibit nonlinear effects, reflecting the ability of XGBoost to capture complex interactions among firm characteristics.

A comparison of Figure 7 and Figure 8 reveals that while both models emphasize liquidity, governance, and firm-specific financial indicators, the XGBoost model places greater weight on market valuation and firm maturity variables, whereas the Random Forest model highlights liquidity and governance attributes. These findings suggest that the determinants of stock volatility are multifaceted and model-dependent, with different machine learning algorithms capturing distinct aspects of the underlying data-generating process. The SHAP analysis enhances the interpretability of the forecasting models by identifying the key drivers of volatility predictions and clarifying how variations in firm-level characteristics influence model outputs.

6. Discussion

The empirical findings provide important insights into the determinants of stock return volatility in MENA markets. The results indicate that liquidity ratio, cash ratio, sales growth, firm age, lagged volatility, and lagged returns are significant determinants of stock volatility under the preferred two-way fixed-effects model with Driscoll–Kraay standard errors. In contrast, leverage, tangibility, board independence, firm size, Tobin’s Q, and profitability (ROA) do not exhibit statistically significant effects after controlling for firm-specific and time-specific heterogeneity.

The results provide mixed evidence regarding the proposed hypotheses. Hypothesis H1a posited that financial structure risk, proxied by leverage and tangibility, is significantly associated with stock return volatility. However, neither leverage nor tangibility is statistically significant in the preferred specification. These findings suggest that financial structure characteristics do not exert an independent influence on stock volatility after controlling for firm-specific heterogeneity, time effects, and cross-sectional dependence. This result contrasts with prior studies that document a significant relationship between leverage and volatility (Bhandari, 1988; Christie, 1982), indicating that the influence of financial structure may be less pronounced in the MENA context.

Hypothesis H1b proposed that liquidity risk, proxied by the liquidity ratio and cash ratio, is significantly associated with stock return volatility. The empirical evidence strongly supports this hypothesis. The liquidity ratio exhibits a positive and statistically significant association with stock volatility, whereas the cash ratio displays a negative and statistically significant effect. These findings suggest that different dimensions of liquidity influence stock volatility in distinct ways. Firms with higher current asset positions may face greater uncertainty arising from growth opportunities, investment decisions, or increased investor attention, leading to higher volatility. In contrast, larger cash reserves appear to stabilize stock performance by reducing financial distress concerns and improving firms’ ability to absorb adverse shocks (Bates et al., 2009; Opler et al., 1999).

Hypothesis H1c suggested that governance risk, proxied by board independence, is significantly associated with stock return volatility. The results do not support this hypothesis, as board independence is not statistically significant in the preferred model. This finding indicates that board composition may not be a primary determinant of stock volatility within the sampled MENA firms after accounting for other firm-specific characteristics and market dynamics.

Hypothesis H1d proposed that firm performance, proxied by return on assets (ROA), is significantly associated with stock return volatility. The insignificant coefficient on ROA suggests that profitability does not explain stock volatility once firm-specific and time-specific effects are properly controlled for. This finding implies that profitability may play a limited role in explaining volatility dynamics in MENA markets relative to other financial risk dimensions.

Beyond the hypothesized variables, several additional factors emerge as important determinants of stock volatility. The positive and statistically significant coefficient on firm age suggests that older firms experience greater stock return variability. Although this result differs from the conventional expectation that mature firms are less risky, it may reflect the greater visibility, market prominence, and investor attention typically associated with established firms in the MENA region. Similarly, the significance of sales growth indicates that expanding firms may face greater uncertainty regarding future performance and investment opportunities, contributing to increased volatility.

Finally, the highly significant coefficients on lagged volatility and lagged returns confirm the importance of dynamic market effects. The strong persistence observed in lagged volatility provides further evidence of volatility clustering, a well-established characteristic of financial markets (Engle, 1982; Bollerslev, 1986). Likewise, the positive association between past returns and current volatility suggests that previous market performance contains valuable information for understanding future volatility dynamics.

The volatility analysis further reinforces the importance of dynamic volatility processes. Among the competing specifications, the EGARCH-t model provides the best fit according to the Akaike Information Criterion (AIC = 0.7704), outperforming both the standard EGARCH model (AIC = 1.6673) and the GARCH(1,1) model (AIC = 1.7698). The persistence parameter is highest in the EGARCH-t specification (β = 0.9720), indicating that volatility shocks dissipate slowly and confirming strong persistence over time. Moreover, the asymmetry parameters are positive and substantial in both the EGARCH (γ = 0.4325) and EGARCH-t (γ = 1.0550) models, providing evidence that positive and negative shocks affect volatility differently. This aligns with the broader volatility literature documenting asymmetric behavior in financial markets (Black, 1976; Nelson, 1991) and supports the use of asymmetric volatility models.

The forecasting results provide only partial support for the predictive hypotheses. Hypothesis H2a proposed that machine learning models would outperform traditional econometric approaches, while Hypothesis H2b suggested that deep learning models would outperform conventional machine learning techniques. Although deep learning models such as GRU and LSTM achieve slightly better forecasting performance, the Diebold–Mariano tests fail to identify statistically significant differences in predictive accuracy. Therefore, neither H2a nor H2b receives strong empirical support, suggesting that the predictive gains from more sophisticated algorithms remain modest and data-dependent (Christensen et al., 2021; Kelly et al., 2019).

Similarly, the evidence does not provide strong support for Hypothesis H3a, which predicted that hybrid GARCH–deep learning models would outperform standalone approaches. While hybrid models produce competitive forecasts, they do not generate statistically significant improvements in predictive accuracy. However, some support is found for Hypothesis H3b, as hybrid frameworks are able to combine volatility dynamics with nonlinear modeling, although their performance remains limited during periods of extreme market volatility. This is consistent with prior research indicating that hybrid models may perform better under normal conditions but struggle during volatility spikes (Baruník & Křehlík, 2018).

Accordingly, the insignificance of leverage and profitability, combined with the importance of liquidity and dynamic factors, provides important insight into risk pricing in MENA equity markets. The results suggest that investors place greater emphasis on time-varying market conditions and liquidity dynamics than on relatively stable accounting-based indicators. This may be because variables such as leverage and profitability exhibit limited within-firm variation and are largely absorbed by fixed effects, whereas liquidity conditions and past volatility provide more timely signals about changing risk. Consequently, stock volatility appears to be driven by a combination of firm-specific characteristics and evolving market expectations, particularly in emerging markets characterized by institutional changes and heightened exposure to external shocks.

Given the inclusion of lagged explanatory variables, the empirical specification was supplemented with multiple robustness checks, including alternative model specifications, fixed-effects estimations, and robust standard error corrections. The consistency of the findings across these specifications provides additional confidence in the reported results.

7. Conclusions

This study examined whether stock return volatility in MENA equity markets can be explained and predicted using firm-level financial characteristics, volatility models, and advanced forecasting techniques. Using an unbalanced panel dataset covering 1596 firms and 19,752 firm-year observations during the period 2010–2024, the analysis combined panel-data econometric models, GARCH-family volatility models, machine learning algorithms, deep learning architectures, and hybrid forecasting frameworks. The preferred econometric specification was a two-way fixed-effects model with Driscoll–Kraay standard errors, selected on the basis of extensive diagnostic testing.

The results indicate that stock volatility is driven primarily by liquidity conditions and dynamic market effects rather than by traditional measures of capital structure and profitability. Specifically, liquidity ratio, cash ratio, sales growth, firm age, lagged volatility, and lagged returns emerge as significant determinants, whereas leverage, tangibility, board independence, firm size, Tobin’s Q, and return on assets do not remain statistically significant after controlling for firm-specific heterogeneity, time effects, and dependence structures. These findings support Hypothesis H1b (liquidity risk) but not Hypothesis H1a (financial structure risk), and suggest that liquidity management and volatility persistence play a more central role in explaining market risk in MENA equity markets.

The volatility analysis further confirms the importance of dynamic processes. The EGARCH-t model provides the best fit according to the Akaike Information Criterion, indicating that stock return volatility is characterized by both persistence and asymmetry. The high persistence parameters imply that volatility shocks dissipate slowly, while positive asymmetry coefficients indicate that shocks affect volatility differently depending on their sign. These results support the use of asymmetric volatility models over symmetric GARCH specifications.

The forecasting results offer only limited support for the predictive hypotheses. Although deep learning models such as GRU and LSTM show slightly improved performance, the Diebold–Mariano tests do not reveal statistically significant differences across models. As a result, Hypotheses H2a and H2b receive weak empirical support. Similarly, Hypotheses H3a and H3b receive mixed empirical support. Hybrid GARCH–deep learning models do not produce statistically significant improvements over standalone approaches (rejecting H3a), although they demonstrate some ability to integrate volatility dynamics with nonlinear relationships, with reduced effectiveness during periods of extreme volatility (supporting H3b).

These findings contribute to the literature by challenging the generalizability of leverage and profitability as key drivers of stock volatility, while emphasizing the role of liquidity conditions and persistence effects. In addition, the results suggest that increasing model complexity does not necessarily translate into meaningful gains in predictive accuracy, highlighting the practical limitations of advanced forecasting techniques in financial markets.

Several limitations should be acknowledged. The analysis focuses on firm-level determinants and does not explicitly incorporate macroeconomic, political, or behavioral factors that may influence volatility. Moreover, all models exhibit limited ability to predict extreme volatility episodes and sudden market disruptions. As with many dynamic panel specifications, the inclusion of lagged variables may introduce estimation challenges. Nevertheless, the consistency of the findings across alternative specifications suggests that the main conclusions are not driven by a particular model formulation. Future research could extend the analysis by incorporating macroeconomic indicators, sentiment measures, alternative data sources, and regime-switching frameworks capable of capturing structural breaks. Cross-country comparisons may also provide additional insights into institutional differences in volatility dynamics.

In conclusion, the findings indicate that stock volatility in MENA markets is shaped more by liquidity conditions and persistent volatility dynamics than by traditional financial indicators. At the same time, the modest forecasting gains achieved by advanced models underscore the continuing challenges associated with predicting financial market volatility, supporting a balanced approach that integrates firm fundamentals with robust econometric modeling.

Author Contributions

Conceptualization, T.M.S.M. and Y.I.; methodology, Y.I.; software, Y.I.; validation, T.M.S.M.; formal analysis, Y.I.; investigation, T.M.S.M.; resources, Y.I.; data curation, Y.I., K.H. and T.M.S.M.; writing—original draft preparation, Y.I., K.H. and T.M.S.M.; writing—review and editing, Y.I., K.H. and T.M.S.M.; visualization, Y.I.; supervision, K.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available from the corresponding author upon reasonable request.

Acknowledgments

During the preparation of this manuscript, the authors used Grammarly for proofreading and ChatGPT (5.5) to generate Figure 1, Figure 2 and Figure 3. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

VOL	Stock Return Volatility
RET	Stock Return
LEV	Leverage
LIQ	Liquidity Ratio
SIZE	Firm Size
TQ	Tobin’s Q
ROA	Return on Assets
SG	Sales Growth
AGE	Firm Age
VR	Value Relevance
CS	Classification Shifting
VOL(t−1)	Lagged Stock Volatility
RET(t−1)	Lagged Stock Return
PCA	Principal Component Analysis
LASSO	Least Absolute Shrinkage and Selection Operator
VIF	Variance Inflation Factor
ADF	Augmented Dickey–Fuller Test
LLC	Levin–Lin–Chu Test
IPS	Im–Pesaran–Shin Test
RMSE	Root Mean Squared Error
MAE	Mean Absolute Error
QLIKE	Quasi-Likelihood (QLIKE) Loss Function
VaR	Value-at-Risk
LSTM	Long Short-Term Memory
GRU	Gated Recurrent Unit
XGBoost	Extreme Gradient Boosting
SHAP	Shapley Additive Explanations

Appendix A. Reproducibility, Computational Environment, and Model Implementation Details

Appendix A.1. Reproducibility Statement

The empirical analyses were conducted using Python 3.12.13 within a Linux-based computational environment (Linux 6.6.122+, x86_64 architecture). The implementation relied primarily on NumPy 2.0.2, Pandas 2.2.2, Scikit-learn 1.6.1, TensorFlow 2.20.0, and XGBoost 3.2.0. Model estimation and training were performed in a GPU-enabled environment to improve computational efficiency.

To ensure reproducibility and consistency across all machine learning and deep learning experiments, random seeds were fixed at 42 for both NumPy and TensorFlow throughout the analysis.

Table A1. Computational Environment.

Component	Specification
Python	3.12.13
Operating System	Linux 6.6.122+
Architecture	x86_64
NumPy	2.0.2
Pandas	2.2.2
Scikit-learn	1.6.1
TensorFlow	2.20.0
XGBoost	3.2.0
Hardware	GPU-enabled environment
Random Seed	42

Appendix A.2. Data Preparation and Experimental Design

The dataset was obtained from the Refinitiv (LSEG) database and subsequently prepared for machine learning and deep learning analyses. Prior to model estimation, incomplete observations were removed to ensure data quality and consistency.

Given the presence of extreme observations frequently encountered in financial datasets, all predictor variables were normalized using the RobustScaler transformation. Temporal information was incorporated through a rolling-window sequence generation procedure employing a look-back window of five periods.

To preserve chronological ordering and eliminate look-ahead bias, the dataset was partitioned sequentially into training, validation, and testing samples. Specifically, 70% of observations were allocated to model training, 15% to validation, and the remaining 15% to out-of-sample testing. The validation sample was used for model tuning and overfitting control, whereas the testing sample was reserved exclusively for final performance evaluation.
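The sequence generation and chronological partitioning described above can be sketched as follows. This is a simplified single-series illustration: the alignment of each label with the observation immediately following its window, the use of 12 predictors, and the simulated data are assumptions, and the handling of firm boundaries in the panel is not shown.

```python
import numpy as np

def make_sequences(features, target, lookback=5):
    """Stack `lookback` consecutive rows per sample; label = target after the window."""
    n = len(features) - lookback
    X = np.stack([features[i:i + lookback] for i in range(n)])
    y = target[lookback:]
    return X, y

def chronological_split(n, train=0.70, val=0.15):
    """Index slices for a sequential train/validation/test split (no shuffling)."""
    train_end = round(n * train)
    val_end = train_end + round(n * val)
    return slice(0, train_end), slice(train_end, val_end), slice(val_end, n)

# Simulated data (illustrative): 205 periods, 12 hypothetical predictors
rng = np.random.default_rng(42)
features = rng.normal(size=(205, 12))
target = rng.normal(size=205)

X, y = make_sequences(features, target, lookback=5)
train_idx, val_idx, test_idx = chronological_split(len(X))
X_train, X_val, X_test = X[train_idx], X[val_idx], X[test_idx]
print(X.shape, len(X_train), len(X_val), len(X_test))
```

Because the slices are contiguous and ordered in time, every validation and test observation occurs after the training sample. Fitting any scaler on the training slice only would likewise keep later information out of the training stage, although whether the study scaled the data in this way is not stated.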

Table A2. Data Processing and Experimental Settings.

Parameter	Value
Scaling Method	RobustScaler
Sequence Length	5 periods
Training Sample	70%
Validation Sample	15%
Testing Sample	15%
Random Seed	42

Appendix A.3. Random Forest Model Specification

The Random Forest model was implemented using bootstrap aggregation and the squared-error splitting criterion. The final model consisted of 100 decision trees with a maximum tree depth of 10 levels. Node splitting required a minimum of two observations, while terminal nodes contained at least one observation. Feature selection was performed using all available predictor variables at each split. Parallel processing was enabled using all available processor cores to improve computational efficiency.

Table A3. Random Forest Hyperparameters.

Hyperparameter	Value
Number of Trees (n_estimators)	100
Maximum Depth	10
Maximum Features	1.0
Minimum Samples Split	2
Minimum Samples Leaf	1
Criterion	Squared Error
Bootstrap Sampling	True
Random State	42
Parallel Processing	Enabled (−1)

Appendix A.4. XGBoost Model Specification

The XGBoost model was estimated using gradient boosting regression with the squared-error objective function. The final specification employed 100 boosting trees, a maximum tree depth of 6, and a learning rate of 0.05. The random seed was fixed at 42 to ensure reproducibility.

No explicit regularization parameters were imposed, and the default XGBoost settings for gamma, L1 regularization (reg_alpha), and L2 regularization (reg_lambda) were retained.

Table A4. XGBoost Hyperparameters.

Hyperparameter	Value
Objective Function	reg:squarederror
Number of Trees (n_estimators)	100
Maximum Depth	6
Learning Rate	0.05
Random State	42
L1 Regularization (reg_alpha)	Default
L2 Regularization (reg_lambda)	Default
Gamma	Default

Appendix A.5. Deep Learning Model Specification

The deep learning framework incorporated Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks together with several enhanced hybrid architectures. All neural networks were trained using the Adam optimizer and Mean Squared Error (MSE) loss function.

Training was conducted using a batch size of 32 observations and a maximum of 100 epochs. Early stopping procedures were implemented to mitigate overfitting and improve generalization performance.

Table A5. Deep Learning Training Configuration.

Parameter	Value
Optimizer	Adam
Loss Function	MSE
Batch Size	32
Maximum Epochs	100
Early Stopping	Enabled
Random Seed	42

Appendix A.6. LSTM Architecture

The baseline LSTM model consisted of two stacked recurrent layers containing 64 and 32 hidden units, respectively. Dropout regularization layers were introduced between recurrent layers to reduce overfitting. The final output was generated through a dense layer containing a single neuron.

The complete architecture contained 32,161 trainable parameters.

Table A6. Baseline LSTM Architecture.

Layer	Units
LSTM Layer 1	64
Dropout	Applied
LSTM Layer 2	32
Dropout	Applied
Dense Output Layer	1
Trainable Parameters	32,161

Appendix A.7. GRU Architecture

The baseline GRU model employed two recurrent layers containing 64 and 32 hidden units, respectively, followed by dropout regularization and a dense output layer. Compared with the LSTM model, the GRU architecture utilized fewer trainable parameters while maintaining the ability to capture temporal dependencies.

The model contained 24,417 trainable parameters.

Table A7. Baseline GRU Architecture.

Layer	Units
GRU Layer 1	64
Dropout	Applied
GRU Layer 2	32
Dropout	Applied
Dense Output Layer	1
Trainable Parameters	24,417
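The parameter counts reported for the baseline LSTM and GRU models can be checked with the standard per-layer formulas. The sketch below assumes Keras-style layers, including the default GRU variant with two bias vectors per gate (reset_after=True), and assumes 12 input features. The number of input features is inferred from the reported totals and is not stated in this appendix.

```python
def lstm_params(n_in, units):
    # four gates, each with input weights, recurrent weights, and one bias vector
    return 4 * (units * (n_in + units) + units)

def gru_params(n_in, units):
    # three gates, each with input weights, recurrent weights, and two bias vectors
    return 3 * (units * (n_in + units) + 2 * units)

def dense_params(n_in, units):
    return n_in * units + units

def stacked_total(cell_params, n_features):
    """Two recurrent layers (64 then 32 units) followed by a single-neuron dense layer."""
    return cell_params(n_features, 64) + cell_params(64, 32) + dense_params(32, 1)

n_features = 12  # assumption inferred from the reported totals
lstm_total = stacked_total(lstm_params, n_features)
gru_total = stacked_total(gru_params, n_features)
print(lstm_total, gru_total)  # 32161 24417
```

Under these assumptions, both totals match the 32,161 and 24,417 trainable parameters reported in Tables A6 and A7; dropout layers add no trainable parameters.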

Appendix A.8. Hybrid Deep Learning Architectures

To capture more complex nonlinear relationships and interaction effects, several hybrid deep learning architectures were evaluated. These models combined recurrent neural networks with fully connected layers, Batch Normalization layers, Dropout regularization, and nonlinear activation functions.

The optimized hybrid architectures were designed to enhance learning stability, improve convergence behavior, and increase representational flexibility. Additional dense layers and Batch Normalization components were incorporated to strengthen nonlinear pattern extraction. Furthermore, the High-Gain architecture integrated LeakyReLU activation functions and deeper network structures to improve predictive performance.

Table A8. Deep Learning Architecture Summary.

Model	Number of Layers	Trainable Parameters
LSTM	5	32,161
GRU	5	24,417
Hybrid LSTM	4	32,417
Hybrid Model	6	32,929
Optimized Hybrid	8	33,633
Panel LSTM	5	6497
Demeaned LSTM	5	6497
High-Gain Model	9	33,377
Bounded Model	5	2225

Appendix A.9. Model Evaluation

The predictive performance of all machine learning and deep learning models was evaluated using multiple complementary forecasting metrics. Specifically, Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²) were employed to assess predictive accuracy and explanatory performance.

In addition, the Diebold–Mariano (DM) predictive accuracy test was conducted to evaluate whether forecast errors differed significantly across competing forecasting frameworks.

Models exhibiting lower forecast errors and higher explanatory power were considered superior.

Table A9. Performance Evaluation Metrics.

Metric	Interpretation
RMSE	Average magnitude of prediction errors
MAE	Average absolute deviation
MAPE	Percentage forecasting error
R²	Explanatory power of the model
Diebold–Mariano Statistic	Relative predictive accuracy between competing forecasts
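The four accuracy metrics can be computed as in the following sketch. The function name and the toy values are hypothetical. The example is chosen to show how R² becomes negative when a forecast performs worse than the sample mean of the realized series.

```python
import numpy as np

def forecast_metrics(actual, predicted):
    """RMSE, MAE, MAPE (in percent), and R-squared for a set of forecasts."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = a - p
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        # MAPE is undefined when any actual value equals zero
        "mape": float(100.0 * np.mean(np.abs(err / a))),
        # R-squared compares the model with a constant forecast at the sample mean
        "r2": float(1.0 - np.sum(err ** 2) / np.sum((a - a.mean()) ** 2)),
    }

# Toy example (illustrative): a constant forecast of 0.20 against realized values
actual = [0.10, 0.20, 0.40, 0.30]
predicted = [0.20, 0.20, 0.20, 0.20]
metrics = forecast_metrics(actual, predicted)
print(metrics)
```

Because MAPE divides each error by the realized value, it can become very large when realized volatility is close to zero, which is worth bearing in mind when interpreting percentage errors for volatility series.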

Appendix A.10. Explainability and Feature Importance Analysis

To improve model interpretability and address the black-box nature of machine learning algorithms, SHAP (SHapley Additive exPlanations) analysis was performed for both the Random Forest and XGBoost models.

SHAP values decompose model predictions into variable-specific contributions and provide a theoretically grounded measure of feature importance. Variable rankings were generated using mean absolute SHAP values, while SHAP summary beeswarm plots were employed to visualize both the magnitude and direction of predictor effects.
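The additive decomposition underlying SHAP can be illustrated with an exact Shapley value calculation for a small toy model. The sketch below is not the procedure used in the study, which relies on SHAP analysis of the fitted Random Forest and XGBoost models. The toy model, the single all-zero baseline, and the function names are hypothetical, and exact enumeration over all feature orderings is feasible only for a handful of features.

```python
from itertools import permutations

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a single baseline point."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        z = list(baseline)
        prev = f(z)
        for i in order:
            z[i] = x[i]           # switch feature i from baseline to its actual value
            cur = f(z)
            phi[i] += cur - prev  # marginal contribution of feature i in this ordering
            prev = cur
    return [total / len(orders) for total in phi]

def toy_model(z):
    # one additive term and one interaction term
    return 2.0 * z[0] + z[1] * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi, sum(phi), toy_model(x) - toy_model(baseline))
```

The contributions sum to the difference between the prediction and the baseline prediction, and the interaction term is shared equally between the two features involved. Ranking features by the mean absolute value of such contributions across observations yields the type of importance ordering described above.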

The analysis enabled identification of the most influential determinants of stock volatility and provided additional insights into the nonlinear relationships captured by the machine learning models. Comparative SHAP analyses further highlighted differences in feature attribution patterns between the Random Forest and XGBoost frameworks.
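
The additive decomposition and the mean-absolute-SHAP ranking can be illustrated without any tree model. The sketch below uses a linear model as a stand-in, because its Shapley values have a closed form when features are treated as independent: the contribution of feature j to observation i is the coefficient times the feature's deviation from its mean. The data, coefficients, and feature names are hypothetical, and this is not the tree-based procedure applied to the Random Forest and XGBoost models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictors with different scales.
names = ["lagged_volatility", "lagged_return", "liquidity_ratio"]
X = rng.normal(size=(200, 3)) * np.array([1.0, 2.0, 0.5])
beta = np.array([0.8, -0.1, 0.3])
intercept = 0.4

pred = intercept + X @ beta

# Closed-form SHAP values for a linear model (independent-feature assumption).
phi = beta * (X - X.mean(axis=0))
base_value = pred.mean()

# Additivity: base value plus the contributions recovers each prediction.
reconstructed = base_value + phi.sum(axis=1)

# Global importance: mean absolute SHAP value per feature.
importance = np.abs(phi).mean(axis=0)
ranking = [names[j] for j in np.argsort(-importance)]

print(np.allclose(reconstructed, pred))
print(dict(zip(names, importance.round(3))))
print(ranking)
```

The sign of each entry in `phi` gives the direction of the effect for that observation, which is the information a beeswarm plot displays alongside magnitude.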

Appendix A.11. Replication Workflow

The complete analytical workflow can be summarized as follows:

1. Extract firm-level financial and market variables from Refinitiv (LSEG).
2. Construct lagged explanatory variables and stock volatility measures.
3. Remove incomplete observations.
4. Apply RobustScaler normalization.
5. Generate rolling sequences using a five-period look-back window.
6. Partition the dataset into training (70%), validation (15%), and testing (15%) samples.
7. Estimate Random Forest and XGBoost forecasting models.
8. Train LSTM, GRU, and hybrid deep learning architectures.
9. Generate out-of-sample volatility forecasts.
10. Evaluate predictive performance using RMSE, MAE, MAPE, and R².
11. Conduct Diebold–Mariano predictive accuracy testing.
12. Perform SHAP-based explainability analysis for Random Forest and XGBoost models.
13. Compare forecasting performance across econometric, volatility, machine learning, deep learning, and hybrid forecasting frameworks.
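
The scaling, sequence-generation, and partitioning steps of the workflow above can be sketched as follows. This is a minimal illustration on a single synthetic series, not the authors' code: the function names are hypothetical, the median/interquartile-range scaling is written by hand to mirror what RobustScaler does by default, the split is assumed to be chronological, and fitting the scaling statistics on the earliest rows only is an added precaution against look-ahead that the workflow does not specify. In a firm-year panel the sequences would be built firm by firm.

```python
import numpy as np


def robust_scale(X, fit_on):
    """Centre by the median and scale by the interquartile range of `fit_on`."""
    med = np.median(fit_on, axis=0)
    q25, q75 = np.percentile(fit_on, [25, 75], axis=0)
    iqr = q75 - q25
    iqr[iqr == 0] = 1.0  # avoid division by zero for constant columns
    return (X - med) / iqr


def make_sequences(X, y, lookback=5):
    """Pair each target y[t] with the preceding `lookback` rows of X."""
    idx = range(lookback, len(X))
    seqs = np.stack([X[t - lookback:t] for t in idx])
    targets = np.array([y[t] for t in idx])
    return seqs, targets


def chronological_split(n, train=0.70, val=0.15):
    """Return slices for train, validation, and test in time order."""
    n_train = int(n * train)
    n_val = int(n * val)
    return (slice(0, n_train),
            slice(n_train, n_train + n_val),
            slice(n_train + n_val, n))


rng = np.random.default_rng(42)
T, k = 40, 3                                 # hypothetical: 40 periods, 3 features
X = rng.normal(size=(T, k))
y = rng.uniform(0.1, 0.6, size=T)            # hypothetical volatility series

X_scaled = robust_scale(X, fit_on=X[:28])    # statistics from the earliest 70% of rows
seqs, targets = make_sequences(X_scaled, y, lookback=5)
tr, va, te = chronological_split(len(seqs))

print(seqs.shape, targets.shape)
print(len(targets[tr]), len(targets[va]), len(targets[te]))
```

The resulting array has shape (samples, look-back, features), which is the input layout expected by recurrent layers such as LSTM and GRU.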

References

Almeida, H., Campello, M., & Weisbach, M. S. (2011). Corporate financial and investment policies when future financing is not frictionless. Journal of Corporate Finance, 17(3), 675–693.
Aloui, R., Aïssa, M. S. B., & Nguyen, D. K. (2011). Global financial crisis, extreme interdependences, and contagion effects: The role of economic structure? Journal of Banking & Finance, 35(1), 130–141.
Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2003). Modeling and forecasting realized volatility. Econometrica, 71(2), 579–625.
Ané, T., & Geman, H. (2000). Order flow, transaction clock, and normality of asset returns. The Journal of Finance, 55(5), 2259–2284.
Antonakakis, N., Chatziantoniou, I., & Gabauer, D. (2020). Refined measures of dynamic connectedness based on time-varying parameter vector autoregressions. Journal of Risk and Financial Management, 13(4), 84.
Bartram, S. M., Brown, G. W., & Stulz, R. M. (2011). Why are U.S. stocks more volatile? SSRN Electronic Journal, 67, 1329–1370.
Baruník, J., & Křehlík, T. (2018). Measuring the frequency dynamics of financial connectedness and systemic risk. Journal of Financial Econometrics, 16(2), 271–296.
Basu, S. (1997). The conservatism principle and the asymmetric timeliness of earnings. Journal of Accounting and Economics, 24(1), 3–37.
Bates, T. W., Kahle, K. M., & Stulz, R. M. (2009). Why do U.S. firms hold so much more cash than they used to? Journal of Finance, 64(5), 1985–2021.
Bekaert, G., & Harvey, C. R. (2002). Research in emerging markets finance: Looking to the future. Emerging Markets Review, 3(4), 429–448.
Bekaert, G., Harvey, C. R., & Lundblad, C. (2007). Liquidity and expected returns: Lessons from emerging markets. The Review of Financial Studies, 20(6), 1783–1831.
Ben Naceur, S., Ghazouani, S., & Omran, M. (2008). Does stock market liberalization spur financial and economic development in the MENA region? Journal of Comparative Economics, 36(4), 673–693.
Bhandari, L. C. (1988). Debt/Equity ratio and expected common stock returns: Empirical evidence. The Journal of Finance, 43(2), 507–528.
Black, F. (1976). Studies of stock price volatility changes. In Proceedings of the American statistical association, business and economic statistics (pp. 177–181). American Statistical Association.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307–327.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Cahuantzi, R., Chen, X., & Güttel, S. (2023). A comparison of LSTM and GRU networks for learning symbolic sequences. In Intelligent computing: Proceedings of the 2023 computing conference, volume 2 (pp. 771–785). Lecture notes in networks and systems, 739 LNNS. Springer.
Campbell, J. Y., Lettau, M., Malkiel, B. G., & Xu, Y. (2001). Have individual stocks become more volatile? An empirical exploration of idiosyncratic risk. The Journal of Finance, 56(1), 1–43.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining (pp. 785–794). Association for Computing Machinery.
Christensen, K., Siggaard, M., & Veliyev, B. (2021). A machine learning approach to volatility forecasting [CREATES Research Papers]. Department of Economics and Business Economics, Aarhus University. Available online: https://ideas.repec.org/p/aah/create/2021-03.html (accessed on 3 March 2026).
Christensen, K., Siggaard, M., & Veliyev, B. (2023). A machine learning approach to volatility forecasting. Journal of Financial Econometrics, 23(1), 1680–1727.
Christie, A. A. (1982). The stochastic behavior of common stock variances: Value, leverage and interest rate effects. Journal of Financial Economics, 10(4), 407–432.
Claessens, S., & Yurtoglu, B. B. (2013). Corporate governance in emerging markets: A survey. Emerging Markets Review, 15, 1–33.
Cont, R. (2001). Empirical properties of asset returns: Stylized facts and statistical issues. Quantitative Finance, 1(2), 223–236.
Daníelsson, J. (2011). Financial risk forecasting: The theory and practice of forecasting market risk, with implementation in R and Matlab. John Wiley & Sons.
Dechow, P., Ge, W., & Schrand, C. (2010). Understanding earnings quality: A review of the proxies, their determinants and their consequences. Journal of Accounting and Economics, 50, 344–401.
Dixon, M. F., Halperin, I., & Bilokon, P. (2020). Machine learning in finance (pp. 1–548). Springer Books.
Elmagrhi, M. H., Ntim, C. G., Malagila, J., Fosu, S., & Tunyi, A. A. (2018). Trustee board diversity, governance mechanisms, capital structure and performance in UK charities. Corporate Governance, 18(3), 478–508.
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), 987–1007.
Estrada, J. (2026). Volatility: A dead ringer for downside risk. Finance Research Open, 2(1), 100099.
Falzon, J., & Micallef, R. (2022). ESG factors: How are stock returns, operating performance, and firm value impacted? Review of Economics and Finance, 20, 144–153.
Fama, E. (1965). Random walks in stock market prices. Financial Analysts Journal, 21(5), 55–59.
Fama, E., & French, K. (2006). Profitability, investment and average returns. Journal of Financial Economics, 82(3), 491–518.
Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The Journal of Finance, 25(2), 383–417.
Fama, E. F., & French, K. R. (1992). The cross-section of expected stock returns. The Journal of Finance, 47(2), 427–465.
Fama, E. F., & French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116(1), 1–22.
Fischer, T., & Krauss, C. (2018). Deep learning with long short-term memory networks for financial market predictions. European Journal of Operational Research, 270(2), 654–669.
Giannopoulos, G., Fagernes, R. V. K., Elmarzouky, M., & Hossain, K. A. B. M. A. (2022). The ESG disclosure and the financial performance of Norwegian listed firms. Journal of Risk and Financial Management, 15(6), 237.
Gu, S., Kelly, B., & Xiu, D. (2020). Empirical asset pricing via machine learning. The Review of Financial Studies, 33(5), 2223–2273.
Guiso, L., Sapienza, P., & Zingales, L. (2008). Trusting the stock market. Journal of Finance, 63(6), 2557–2600.
Hamilton, J. D., & Susmel, R. (1994). Autoregressive conditional heteroskedasticity and changes in regime. Journal of Econometrics, 64(1–2), 307–333.
Harrison, B., & Paton, D. (2004). Transition, the evolution of stock market efficiency and entry into EU: The case of Romania. Economics of Planning, 37(3), 203–223.
Hausman, J. A. (2015). Specification tests in econometrics. Applied Econometrics, 38(2), 112–134.
Hesham, S., Elkomity, M., Mohsen, S., & Ibrahim, Y. (2025). Risk interactions and bank performance in emerging markets: Examining the nexus of credit risk, profitability, and financial stability. Journal of Emerging Markets and Management, 1(2), 81–95.
Hommes, C. (2013). Behavioral rationality and heterogeneous expectations in complex economic systems. Cambridge University Press.
Hong, E., Kottimukkalur, B., & Noh, J. (2026). Uncertain text and price reactions to earnings releases. Journal of Banking & Finance, 182, 107580.
Hou, K., Xue, C., & Zhang, L. (2015). Digesting anomalies: An investment approach. The Review of Financial Studies, 28(3), 650–705.
Jensen, M. C. (1986). Agency costs of free cash flow, corporate finance, and takeovers. The American Economic Review, 76(2), 323–329.
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260.
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T.-Y. (2017, December 4–9). LightGBM: A highly efficient gradient boosting decision tree [Conference session]. 31st International Conference on Neural Information Processing Systems (pp. 3149–3157), Long Beach, CA, USA.
Kelly, B. T., Pruitt, S., & Su, Y. (2019). Characteristics are covariances: A unified model of risk and return. Journal of Financial Economics, 134(3), 501–524.
Khan, H., Hassan, R., & Marimuthu, M. (2017). Diversity on corporate boards and firm performance: An empirical evidence from Malaysia. American Journal of Social Sciences and Humanities, 2(1), 1–8.
Knack, S., & Keefer, P. (1997). Does social capital have an economic payoff? A cross-country investigation. The Quarterly Journal of Economics, 112(4), 1251–1288.
Kraus, A., & Litzenberger, R. H. (1973). A state-preference model of optimal financial leverage. The Journal of Finance, 28(4), 911–922.
Krauss, C., Do, X. A., & Huck, N. (2017). Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500. European Journal of Operational Research, 259(2), 689–702.
Mahmood, F., Ahmed, Z., Hussain, N., & Ben-Zaied, Y. (2023). Working capital financing and firm performance: A machine learning approach. Review of Quantitative Finance and Accounting, 65(1), 71–106.
Modigliani, F., & Miller, M. (1958). The cost of capital, corporation finance and the theory of investment. American Economic Review, 48(3), 261–297.
Mousa, R., Nabil, J., Safty, A., Hassan, I., & Ibrahim, Y. (2025). Liquidity–credit risk dynamics and bank profitability: Hybrid econometric and machine learning evidence from MENA. Journal of Financial Reporting and Accounting, 1–27.
Naeem, N., Cankaya, S., & Bildik, R. (2022). Does ESG performance affect the financial performance of environmentally sensitive industries? A comparison between emerging and developed markets. In Borsa Istanbul review (Vol. 22, pp. S128–S140). Borsa Istanbul Anonim Sirketi.
Nelson, D. B. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica, 59(2), 347–370.
Opler, T., Pinkowitz, L., Stulz, R., & Williamson, R. (1999). The determinants and implications of corporate cash holdings. Journal of Financial Economics, 52(1), 3–46.
Patton, A. J. (2011). Volatility forecast comparison using imperfect volatility proxies. Journal of Econometrics, 160(1), 246–256.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.
Rajan, R. G., & Zingales, L. (1995). What do we know about capital structure? Some evidence from international data. The Journal of Finance, 50(5), 1421–1460.
Tobin, J. (1969). A general equilibrium approach to monetary theory. Journal of Money, Credit and Banking, 1(1), 15–29.
Yahya, S., & Ibrahim, Y. (2021). Determinants of Islamic and conventional banks profitability: A contingency approach. Asian Journal of Business and Accounting, 14(2), 279–319.
Zhang, Z., Zohren, S., & Roberts, S. (2019). Deep reinforcement learning for trading. Available online: https://ideas.repec.org/p/arx/papers/1911.10107.html (accessed on 8 March 2026).

Figure 1. Machine Learning Framework.

Figure 2. Deep Learning Architecture: LSTM Network.

Figure 3. Hybrid GARCH–Deep Learning Framework.

Figure 4. Out-of-Sample Volatility Forecasting: Actual vs. Machine Learning and Hybrid LSTM Predictions.

Figure 5. Rolling RMSE of Forecast Models.

Figure 6. Residual Dispersion and Error Density Analysis of Hybrid GARCH–LSTM Model.

Figure 7. SHAP Global Feature Importance (Random Forest).

Figure 8. SHAP Global Feature Importance (XGBoost).

Table 1. Sample Selection Procedure.

Sample Selection Stage	Firms	Observations
Initial sample	2093	26,505
Less: Financial institutions and REITs	(497)	(6753)
Final regression sample	1596	19,752

Table 2. Distribution of Firms and Observations by Country.

Country	Companies	Observations
Israel	528	6360
Saudi Arabia	322	3113
Egypt	178	2494
Jordan	127	1894
United Arab Emirates	97	1039
Kuwait	85	1224
Oman	76	1064
Morocco	57	813
Tunisia	49	704
Qatar	34	448
Palestine	22	298
Bahrain	18	256
Lebanon	3	45
Final regression sample	1596	19,752

Table 3. Largest Industry Groups in Final Sample.

Industry	Companies	Observations
Real Estate Management & Development	228	3029
Food Products	112	1479
Hotels, Restaurants & Leisure	74	957
Construction Materials	64	875
Chemicals	62	807
Construction & Engineering	59	735
Health Care Providers & Services	50	491
Metals & Mining	50	645
Pharmaceuticals	48	665
Oil, Gas & Consumable Fuels	47	596
Software	45	475
Specialty Retail	43	501
Electronic Equipment, Instruments & Components	37	409
IT Services	36	358
Building Products	34	429
Consumer Staples Distribution & Retail	33	377
Commercial Services & Supplies	31	355
Electrical Equipment	30	385
Diversified Consumer Services	29	307
Biotechnology	27	325

Table 4. Variables’ Definitions.

Variable	Category	Conceptual Definition	Operational Definition	References
Stock Volatility	Market Risk	Degree of variation in a firm’s stock returns, reflecting overall market risk and uncertainty.	Annualized standard deviation of daily stock returns within each year.	Andersen et al. (2003); Engle (1982)
Leverage	Financial Structure Risk	Extent to which a firm relies on debt financing, increasing financial fragility and financial risk.	Total Liabilities/Shareholders’ Equity.	Modigliani and Miller (1958); Rajan and Zingales (1995); Mousa et al. (2025)
Tangibility	Financial Structure Risk	Proportion of fixed assets that can be used as collateral to secure financing.	Net Property, Plant & Equipment/Total Assets.	Almeida et al. (2011)
Liquidity Ratio	Liquidity Risk	Firm’s ability to meet short-term obligations using current assets.	Current Assets/Current Liabilities.	Hesham et al. (2025)
Cash Ratio	Liquidity Risk	Immediate liquidity available to cover short-term liabilities.	Cash and Cash Equivalents/Current Liabilities.	Bates et al. (2009)
Board Independence	Governance Risk	Effectiveness of board oversight and monitoring through independent directors.	Independent Board Members/Board Size.	Elmagrhi et al. (2018); Khan et al. (2017)
Firm Size	Control Variable	Scale of the firm affecting information environment, risk exposure, and market visibility.	Natural logarithm of Total Assets [ln(Total Assets)].	Yahya and Ibrahim (2021)
Tobin’s Q	Control Variable	Market valuation relative to the firm’s asset base and growth opportunities.	Enterprise Value/Total Assets.	Tobin (1969); Naeem et al. (2022)
ROA	Control Variable	Firm profitability reflects operational efficiency and financial performance.	Net Income/Total Assets.	Giannopoulos et al. (2022)
Sales Growth	Control Variable	Growth in firm operations and market opportunities over time.	Percentage change in Revenue per Share.	Falzon and Micallef (2022)
Firm Age	Control Variable	Firm maturity and accumulated experience affect stability and risk.	Current Year−Year of Incorporation.	Mahmood et al. (2023)
Lagged Variables (t − 1)	Model Specification	Use of past firm characteristics to explain current risk outcomes and reduce simultaneity concerns.	One-period lag of all independent variables.	Campbell et al. (2001); Patton (2011); Estrada (2026)
Lagged Volatility	Time-Series Control	Persistence of volatility over time due to volatility clustering effects.	One-period lag of Stock Volatility.	Campbell et al. (2001); Patton (2011); Estrada (2026)
Lagged Returns	Time-Series Control	Influence of previous stock performance on current volatility dynamics.	One-period lag of annual stock returns.	Campbell et al. (2001); Patton (2011); Estrada (2026)

Table 5. Descriptive Statistics.

Variable	Mean	Median	SD	Min	Max
Stock Volatility	0.3968	0.2956	0.5016	0.0000	10.6812
Lag Leverage	2.2025	0.8034	187.0960	−3138.9766	23,482.3333
Lag Tangibility	0.3165	0.2220	3.2141	0.0000	430.7641
Lag Liquidity Ratio	3.5676	1.4746	19.5908	0.0001	1187.6469
Lag Cash Ratio	1.2676	0.2168	10.3772	0.0000	629.3393
Lag Board Independence	4.8899	4.3938	3.0756	0.0000	28.3450
Lag Firm Size	18.6814	18.6551	2.0898	13.3230	24.7983
Lag Tobin’s Q	7.2272	0.8719	279.6224	−265.6180	32,683.7342
Lag ROA	−0.0572	0.0315	1.3701	−134.0082	46.3000
Lag Sales Growth	6.0096	0.0219	352.6802	−1.0000	33,878.5915
Lag Firm Age	27.5139	24.0000	18.7085	−14.0000	119.0000
Lagged Volatility	0.3908	0.2915	0.4947	0.0000	10.6812
Lagged Return	0.0753	0.0000	0.6202	−0.9979	23.6667

Notes: All independent variables are lagged (t − 1) to mitigate endogeneity. Skewness and kurtosis indicate acceptable distributional properties after 1–99% winsorization.

Table 6. Correlation Matrix.

Variable	1	2	3	4	5	6	7	8	9	10	11	12	13
1. Stock Volatility	1.000
2. Lag Leverage	0.003	1.000
3. Lag Tangibility	−0.082	−0.001	1.000
4. Lag Liquidity Ratio	0.044	−0.001	−0.007	1.000
5. Lag Cash Ratio	0.019	−0.001	−0.005	0.800	1.000
6. Lag Board Independence	0.064	0.002	−0.005	0.329	0.286	1.000
7. Lag Firm Size	−0.099	0.000	−0.013	−0.099	−0.089	−0.055	1.000
8. Lag Tobin’s Q	0.249	0.000	0.010	−0.001	0.002	0.007	−0.047	1.000
9. Lag ROA	−0.049	0.001	0.003	0.001	−0.007	−0.088	0.146	−0.016	1.000
10. Lag Sales Growth	−0.004	0.001	−0.007	−0.002	−0.002	−0.007	−0.003	−0.001	0.000	1.000
11. Lag Firm Age	0.016	−0.012	−0.012	−0.048	−0.055	−0.148	0.091	−0.017	0.041	0.009	1.000
12. Lagged Volatility	0.828	0.004	−0.079	0.025	0.016	0.060	−0.101	0.266	−0.059	−0.002	0.013	1.000
13. Lagged Return	0.348	0.003	−0.033	0.003	0.002	0.005	0.025	−0.001	0.025	−0.007	0.026	0.361	1.000

Table 7. Variance Inflation Factors (VIFs).

Variable	VIF
Lag Leverage	1.000208
Lag Tangibility	1.022586
Lag Liquidity Ratio	2.140384
Lag Cash Ratio	2.050214
Lag Board Independence	1.199844
Lag Firm Size	1.070694
Lag Tobin’s Q	1.005638
Lag ROA	1.080893
Lag Sales Growth	1.000284
Lag Firm Age	1.034849
Lagged Volatility	1.157671
Lagged Return	1.146536

All VIF values are well below the common threshold of 5 (and far below 10), indicating that multicollinearity is not a significant concern in the model.
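
For reference, the variance inflation factor of predictor j is 1/(1 − R²ⱼ), where R²ⱼ comes from regressing that predictor on all the others. The sketch below computes it with plain least squares on synthetic data; the function name and the data are hypothetical and are not drawn from the study's sample.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF for each column of X, using an auxiliary OLS regression with intercept."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


rng = np.random.default_rng(7)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # built to correlate with x1
x3 = rng.normal(size=n)                    # independent of the others

vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vifs.round(3))
```

A VIF of 1 means a predictor is uncorrelated with the rest; the two correlated columns here inflate each other's VIF while the independent column stays close to 1.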

Table 8. Diagnostics Analysis.

Diagnostic Test	Statistic	df	p-Value	Conclusion/Decision
Pearson Correlation (Highest Correlation)	0.828	–	–	High correlation between Stock Volatility and Lagged Volatility
Breusch–Pagan Test	440.430	12	<0.001	Heteroskedasticity present
Pesaran CD Test	34.336	–	<0.001	Cross-sectional dependence present
Wooldridge/Breusch–Godfrey Test	6.007	1	0.014	Serial correlation present
COVID Structural Break Test (ANOVA/F-test)	19.998	1	<0.001	Significant structural break during the COVID-19 period
Hausman Test	884.380	12	<0.001	Fixed Effects model preferred over Random Effects

Table 9. Panel Regression Results.

Variables	Pooling OLS	Fixed Effects (Firm)	Two-Way FE	Two-Way FE (Driscoll–Kraay)
Lag Leverage	0.00002	0.00012 †	0.00013 *	0.00013
Lag Tangibility	−0.02315 **	0.00812	0.00252	0.00252
Lag Liquidity Ratio	0.00192 ***	0.00228 ***	0.00226 ***	0.00226 **
Lag Cash Ratio	−0.00291 ***	−0.00331 ***	−0.00328 ***	−0.00328 *
Lag Board Independence	0.00037	−0.00182	−0.00189	−0.00189
Lag Firm Size	−0.00351 **	−0.00468	−0.00534	−0.00534
Lag Tobin’s Q	−0.00008	−0.00002	−0.00002	−0.00002
Lag ROA	−0.01278 **	−0.00441	−0.00418	−0.00418
Lag Sales Growth	−0.000003	−0.000005	−0.000005	−0.000005 **
Lag Firm Age	−0.00012	0.00503 ***	0.09269	0.09269 ***
Lagged Volatility	0.80157 ***	0.65930 ***	0.65921 ***	0.65921 ***
Lagged Return	0.03948 ***	0.03599 ***	0.03775 ***	0.03775 **
Constant	0.16327 ***	–	–	–
R²	0.668	0.437	0.425	0.425
Observations	12,895	12,895	12,895	12,895
Firm FE	No	Yes	Yes	Yes
Year FE	No	No	Yes	Yes
Robust DK SEs	No	No	No	Yes

Notes: *** p < 0.01, ** p < 0.05, * p < 0.10, and † p < 0.10. The final specification employs Driscoll–Kraay standard errors to correct for heteroskedasticity, serial correlation, and cross-sectional dependence.
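
The two-way fixed-effects point estimates rest on a within transformation that removes firm and year means before the slope is estimated. The sketch below illustrates that step on a synthetic unbalanced panel with a single regressor, using alternating demeaning. It is a simplified illustration with hypothetical names and data, it does not compute Driscoll–Kraay standard errors, and it is not the authors' estimation code.

```python
import numpy as np


def demean_two_way(v, firm, year, max_iter=500, tol=1e-10):
    """Remove firm and year means by alternating projections (works for unbalanced panels)."""
    v = np.asarray(v, dtype=float).copy()
    for _ in range(max_iter):
        previous = v.copy()
        for group in (firm, year):
            sums = np.bincount(group, weights=v)
            counts = np.maximum(np.bincount(group), 1)
            v -= (sums / counts)[group]
        if np.max(np.abs(v - previous)) < tol:
            break
    return v


rng = np.random.default_rng(1)
n_firms, n_years = 60, 12
firm = np.repeat(np.arange(n_firms), n_years)
year = np.tile(np.arange(n_years), n_firms)

# Drop about 20% of firm-years at random to make the panel unbalanced.
keep = rng.random(firm.size) > 0.2
firm, year = firm[keep], year[keep]

true_beta = 0.66                                   # hypothetical slope
firm_effect = rng.normal(size=n_firms)[firm]
year_effect = rng.normal(size=n_years)[year]
x = rng.normal(size=firm.size) + 0.5 * firm_effect  # regressor correlated with firm effects
y = true_beta * x + firm_effect + year_effect + 0.1 * rng.normal(size=firm.size)

x_within = demean_two_way(x, firm, year)
y_within = demean_two_way(y, firm, year)
beta_hat = float(x_within @ y_within / (x_within @ x_within))
beta_pooled = float(np.cov(x, y)[0, 1] / np.var(x, ddof=1))

print(round(beta_hat, 3), round(beta_pooled, 3))
```

Because the regressor is correlated with the firm effects in this simulation, the pooled slope is biased while the within estimate recovers the true value, which is the same reasoning that motivates preferring fixed effects after a Hausman test.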

Table 10. Standard Error Comparison.

Variable	Standard SE	Robust HC1 SE	Cluster (Firm) SE	Cluster (Year) SE	Driscoll–Kraay SE
Lag Leverage	0.00007	0.00012	0.00012	0.00013	0.00012
Lag Tangibility	0.02454	0.02899	0.02899	0.02718	0.02120
Lag Liquidity Ratio	0.00030	0.00077	0.00077	0.00081	0.00077
Lag Cash Ratio	0.00053	0.00164	0.00164	0.00121	0.00133
Lag Board Independence	0.00180	0.00159	0.00159	0.00157	0.00126
Lag Firm Size	0.00546	0.00884	0.00884	0.01353	0.01517
Lag Tobin’s Q	0.00043	0.00031	0.00031	0.00031	0.00033
Lag ROA	0.00513	0.01025	0.01025	0.01036	0.01001
Lag Sales Growth	0.00001	0.00000	0.00000	0.00000	0.00000
Lag Firm Age	0.00073	0.00089	0.00089	0.00154	0.00143
Lagged Volatility	0.00789	0.01339	0.01339	0.05535	0.05473
Lagged Return	0.00411	0.00497	0.00497	0.01431	0.01301

Table 11. Robustness Tests: Country and Industry Fixed Effects.

Variable	Country FE	Industry FE	Country + Industry FE
Lag Leverage	0.00002	0.00003	0.00003
Lag Tangibility	−0.00258	−0.01413	0.00243
Lag Liquidity Ratio	0.00207 ***	0.00193 ***	0.00206 ***
Lag Cash Ratio	−0.00315 ***	−0.00288 ***	−0.00311 ***
Lag Board Independence	0.00032	0.00022	0.00024
Lag Firm Size	−0.00454 ***	−0.00349 **	−0.00551 ***
Lag Tobin’s Q	−0.00010	−0.00009	−0.00009
Lag ROA	−0.00942 *	−0.01179 **	−0.00898 *
Lag Sales Growth	−0.00000	−0.00000	−0.00000
Lag Firm Age	−0.00030 *	−0.00012	−0.00033 *
Lagged Volatility	0.79101 ***	0.79765 ***	0.78757 ***
Lagged Return	0.04034 ***	0.03936 ***	0.04012 ***
R²	0.6708	0.6698	0.6727
Observations	12,895	12,895	12,895

Notes: ***, **, and * denote statistical significance at the 1%, 5%, and 10% levels, respectively (*** p < 0.01, ** p < 0.05, * p < 0.10).

Table 12. Volatility Model Comparison.

Model	α (Shock)	β (Persistence)	γ (Asymmetry)	Distribution	AIC
GARCH (1,1)	0.6208	0.4413	–	Normal	1.7698
EGARCH	−0.3175	0.9205	0.4325	Normal	1.6673
EGARCH-t	−0.6931	0.9720	1.0550	Student-t	0.7704

Notes: α measures the impact of new shocks on volatility, β captures volatility persistence, and γ represents the asymmetry (leverage) effect. Model selection is based on the Akaike Information Criterion (AIC), where lower values indicate a better fit.
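
To make the roles of α and β concrete, the sketch below writes out the textbook GARCH(1,1) variance recursion and two summary quantities derived from Table 12. It assumes the tabulated coefficients enter the standard recursion σ²ₜ = ω + α ε²ₜ₋₁ + β σ²ₜ₋₁ and that the EGARCH β is the autoregressive coefficient on log-variance; the intercept ω is not reported in the table, so the value used in the example is purely hypothetical.

```python
import math

import numpy as np


def garch11_variance(shocks, omega, alpha, beta, sigma2_0):
    """Conditional variance path for the standard GARCH(1,1) recursion."""
    sigma2 = np.empty(len(shocks) + 1)
    sigma2[0] = sigma2_0
    for t, eps in enumerate(shocks):
        sigma2[t + 1] = omega + alpha * eps ** 2 + beta * sigma2[t]
    return sigma2


def half_life(beta):
    """Periods for a shock to log-variance to decay by half under AR coefficient beta."""
    return math.log(0.5) / math.log(beta)


# Small worked path with hypothetical inputs.
path = garch11_variance(np.array([1.0, 0.0]), omega=0.1, alpha=0.2, beta=0.7, sigma2_0=1.0)

# Quantities computed from the Table 12 estimates.
garch_persistence = 0.6208 + 0.4413
egarch_half_life = half_life(0.9205)
egarch_t_half_life = half_life(0.9720)

print(path)
print(round(garch_persistence, 4))
print(round(egarch_half_life, 1), round(egarch_t_half_life, 1))
```

Under the standard recursion, covariance stationarity requires α + β < 1. Taken at face value, the GARCH(1,1) estimates sum to more than one, in which case the unconditional variance ω/(1 − α − β) is not defined and the persistence of shocks is better read from the EGARCH specifications.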

Table 13. Cross-Framework Predictive Performance Benchmark.

Model	RMSE	MAE	MAPE (%)	R²
Pooled OLS	0.1704	0.1471	76.4019	−0.0004
GARCH (1,1)	0.1840	0.1556	92.9397	−0.1665
EGARCH	0.1839	0.1555	92.8059	−0.1642
Random Forest	0.1707	0.1472	76.4015	−0.0041
XGBoost	0.1723	0.1480	76.8708	−0.0221
LSTM	0.1705	0.1471	75.9936	−0.0016
GRU	0.1707	0.1473	76.9999	−0.0035
Hybrid GARCH–LSTM	0.1706	0.1472	76.8771	−0.0025
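
A near-zero or negative R² means a model's squared error is about as large as, or larger than, that of simply predicting the test-sample mean. Since R² = 1 − MSE/Var(y), each row of Table 13 implies a standard deviation for the target of RMSE/√(1 − R²). The sketch below performs that arithmetic on the tabulated values; it is a consistency check on the reported figures, not a re-estimation, and the function name is hypothetical.

```python
import math

# (RMSE, R²) pairs copied from Table 13.
table13 = {
    "Pooled OLS": (0.1704, -0.0004),
    "GARCH (1,1)": (0.1840, -0.1665),
    "EGARCH": (0.1839, -0.1642),
    "Random Forest": (0.1707, -0.0041),
    "XGBoost": (0.1723, -0.0221),
    "LSTM": (0.1705, -0.0016),
    "GRU": (0.1707, -0.0035),
    "Hybrid GARCH-LSTM": (0.1706, -0.0025),
}


def implied_target_std(rmse, r2):
    """Standard deviation of the target implied by R² = 1 - MSE / Var(y)."""
    return rmse / math.sqrt(1.0 - r2)


implied = {model: implied_target_std(rmse, r2) for model, (rmse, r2) in table13.items()}
spread = max(implied.values()) - min(implied.values())

for model, value in implied.items():
    print(f"{model:18s} {value:.4f}")
print(f"spread across models: {spread:.4f}")
```

The implied standard deviation is essentially the same for every row, which is what one would expect if all models were scored on the same test sample, and it sits very close to the RMSE values themselves, consistent with the modest differences in accuracy across models.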
