1. Introduction
The precise measurement and forecasting of stock return volatility are a cornerstone of contemporary financial economics, given their critical importance for asset pricing, portfolio allocation, and financial risk management (
Christensen et al., 2023;
Gu et al., 2020). The definition of volatility relates to the conditional variability of asset returns, construed as a measure of market risk and uncertainty (
Andersen et al., 2003;
Patton, 2011). Motivated by the stylized facts that characterize financial returns, including volatility clustering (
Campbell et al., 2001), leverage effects (
Black, 1976), and fat-tailed distributions (
E. Fama, 1965), research over the past few decades has developed econometric models that account for the time-varying variance of return series (
Bollerslev, 1986;
Engle, 1982). While these models remain fundamental, recent studies show that they fail to account for the nonlinearity and high-dimensional interactions underlying modern financial markets (
Gu et al., 2020;
Zhang et al., 2019), leading to inaccurate predictions and mispricing of assets. As a result, machine learning and deep learning approaches have become one of the most compelling avenues for improving volatility forecasting performance, as they permit flexible, data-driven modeling without restrictive functional-form assumptions (
Christensen et al., 2021;
Dixon et al., 2020;
Fischer & Krauss, 2018).
With the growing complexity of financial systems and their increasing reliance on quantitative risk management frameworks, the demand for volatility forecasting has grown. Volatility estimates are necessary for pricing derivatives, for portfolio optimization, and for computing the risk measures used by regulators (value at risk and expected shortfall) that inform investors about worst-case loss scenarios (
Daníelsson, 2011;
Patton, 2011). The inability to model volatility correctly can lead to severe mispricing of this risk and to insufficient capital allocation, especially during financial turmoil (
Baruník & Křehlík, 2018). Recent empirical research also points to the importance of firm-specific characteristics, such as leverage and liquidity conditions, which reflect a firm’s information environment and financial health and influence volatility beyond what historical returns alone convey (
Bates et al., 2009;
Dechow et al., 2010;
E. F. Fama & French, 2015). Nevertheless, most volatility forecasting models have been based almost entirely on market-based variables, with little integration of firm-level financial risk factors, which limits their ability to accurately predict volatility across different market conditions, particularly during periods of financial distress or economic downturns when firm-specific characteristics become more relevant.
This is particularly relevant in emerging markets, and more specifically in the MENA region, where financial systems are characterized by structural heterogeneity, developing regulatory regimes, and varying levels of integration into global markets (
Bekaert & Harvey, 2002;
Ben Naceur et al., 2008).
Aloui et al. (
2011) and
Bekaert et al. (
2007) argue that the MENA region has lower informational efficiency, higher transaction costs, and is more sensitive to firm-specific and macroeconomic shocks, which generate stronger volatility dynamics. Moreover, corporate governance systems, financial openness, and de jure preferences for external finance differ widely across firms, further increasing heterogeneity in risk exposure (
Claessens & Yurtoglu, 2013). Recent studies show that nonlinear modeling techniques, such as those borrowed from machine learning, are more appropriate for capturing the complexities of emerging markets, where traditional linear models may not sufficiently characterize the underlying dynamics (
Gu et al., 2020;
Zhang et al., 2019).
Against this backdrop, the present study examines whether stock volatility can be better estimated by embedding multidimensional financial risk factors within a single modeling framework. This study addresses three key empirical questions: First, what is the impact of financial structure and liquidity conditions on firm-level stock volatility? Second, do machine learning and deep learning models outperform traditional econometric methodologies in capturing these relationships? Third, do hybrid modeling approaches that combine traditional econometric volatility estimates with neural network architectures deliver better predictive accuracy? These questions reflect an effort to integrate theory-driven financial modeling with data-driven prediction, particularly in complex and heterogeneous environments.
This study makes three contributions. The first is a structured, multidimensional financial risk framework that incorporates accounting-based and financial indicators into volatility forecasting, expanding beyond the traditional focus on market-based predictors (
Dechow et al., 2010;
E. F. Fama & French, 2015). Second, it offers a comprehensive empirical comparison of econometric, machine learning, and deep learning models, adding to the rapidly growing literature on AI applications in finance (
Christensen et al., 2021;
Dixon et al., 2020;
Gu et al., 2020). Third, it develops a hybrid modeling framework that combines GARCH-based volatility estimates with recurrent neural networks, leveraging both statistical structure and nonlinear learning to improve predictive power. Hybrid approaches that integrate complementary modeling paradigms have been shown to enhance forecasting performance, particularly in complex time-series environments (
Krauss et al., 2017;
Zhang et al., 2019).
The novelty of this study lies in integrating multiple firm-level financial risk dimensions with advanced predictive models within a unified empirical framework, with a particular focus on emerging markets. While recent studies have demonstrated the effectiveness of machine learning techniques in forecasting financial volatility (
Fischer & Krauss, 2018;
Gu et al., 2020), relatively few studies have simultaneously incorporated a comprehensive set of firm-specific financial risk indicators into volatility prediction models. Furthermore, although hybrid econometric–machine learning approaches have gained increasing attention, existing research has primarily concentrated on aggregate market indices or developed economies, leaving a significant gap in understanding firm-level volatility dynamics in emerging regions such as the MENA region (
Zhang et al., 2019). This study contributes to the financial econometrics literature by identifying additional determinants of financial risk and enhancing the understanding of volatility predictability. It also advances the application of machine learning in finance by providing evidence from an underexplored emerging-market context.
The rest of this paper is organized as follows. A review of the literature on volatility modeling, financial risk determinants, and machine learning applications in finance is presented in the next section. The following section details the research methodology. The empirical findings are then presented and discussed, followed by some concluding remarks and avenues for future research.
2. Theoretical Background and Literature Review
Stock return volatility is an essential subject in financial economics, serving as a proxy for market risk and uncertainty (
Andersen et al., 2003;
Patton, 2011). From a theoretical standpoint, the Efficient Market Hypothesis implies that volatility arises from the continuous arrival of new information (
E. F. Fama, 1970). Empirical studies have repeatedly documented volatility clustering (
Ané & Geman, 2000), fat tails (
Bollerslev, 1986), and nonlinear dependence (
Andersen et al., 2003;
Cont, 2001) in return series, and these stylized facts have motivated the development of econometric models that account for time-varying conditional variance.
The ARCH model (
Engle, 1982) and its generalized form, the GARCH model (
Bollerslev, 1986), have become the workhorse frameworks for modeling volatility dynamics because they capture persistence and clustering effects effectively. Further extensions, such as EGARCH, enable the modeling of asymmetric volatility responses to positive and negative shocks, consistent with leverage effects (
Black, 1976;
Nelson, 1991). These models provide a structured and theoretically grounded framework; however, their reliance on parametric assumptions and largely linear specifications limits their ability to capture the nonlinear relationships observed in financial data.
Simultaneously with the development of volatility models, a substantial body of research examines the influence of firm-specific factors on stock return behavior. From a corporate finance perspective, capital structure is a key risk driver. Trade-off theory implies that as leverage increases, the risk of financial distress also rises, and equity returns are more sensitive to shocks (
Kraus & Litzenberger, 1973). Although controlling for firm heterogeneity and other risk factors has been known to weaken the relationship (
Bartram et al., 2011;
E. F. Fama & French, 1992), empirical studies have evinced a positive effect of leverage on volatility (
Bhandari, 1988;
Christie, 1982).
Liquidity is another crucial element of financial risk management whose theoretical effect is ambiguous. On the one hand, the likelihood of financial distress diminishes with increased liquidity, thereby enhancing the stability of the firm’s value (
Bates et al., 2009;
Opler et al., 1999). On the other hand, excessive liquidity may facilitate agency problems or the misallocation of capital; viewed through this lens, excess liquidity may create uncertainty (
Jensen, 1986). The vagueness of the trust concept is paralleled by empirical evidence of stabilizing effects (
Basu, 1997), weak links between trust and growth (
Guiso et al., 2008;
Knack & Keefer, 1997), and context-dependent effects (
Harrison & Paton, 2004).
Various other firm characteristics, such as corporate performance and growth variables linked to financial structure and reporting quality, have been shown to be associated with different volatility dynamics. Profitability, measured for example by Return on Assets (ROA), is often used as a proxy for financial stability (
E. Fama & French, 2006).
However, highly profitable firms may also attract more investor attention and speculative trading in their stocks, which increases return variability. Likewise, growth opportunities (proxied by sales growth) introduce uncertainty about future cash flows, leading to unclear empirical relationships (
Hong et al., 2026). This suggests that the relationship between firm fundamentals and volatility is fundamentally nonlinear and context dependent.
Given these complexities, traditional econometric models may fail to disentangle the interplay between firm characteristics and volatility dynamics, which has led to the increased adoption of machine learning approaches in financial modeling. Machine learning algorithms, including Random Forest and gradient boosting methods, offer flexible nonparametric frameworks that can accommodate high-dimensional and nonlinear relationships (
Chen & Guestrin, 2016;
Christensen et al., 2023;
Gu et al., 2020;
Jordan & Mitchell, 2015;
Ke et al., 2017;
Quinlan, 1986). More importantly, empirical evidence shows that these methods yield better predictive performance than traditional approaches in settings where complex interactions between predictors are expected.
In this respect, deep learning models go one step further and exploit temporal dependencies in financial time series. Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) architectures of recurrent neural networks have proven very effective for modeling general sequence data over the past few years and, in particular, for the dynamic, temporal structure of asset returns (
Dixon et al., 2020;
Fischer & Krauss, 2018). However, empirical performance gains are often modest and do not always reach conventional levels of statistical significance, raising concerns about model fit and interpretability.
Hybrid modeling frameworks have been proposed to offer the advantages of both approaches, combining the structure of econometric models with the flexibility of machine learning. They feed volatility estimates generated by GARCH-type models into ML or DL architectures, leveraging nonlinear learning capabilities without sacrificing too much model transparency (
Krauss et al., 2017;
Zhang et al., 2019). Even if hybrid models have the potential to improve forecasting performance, existing evidence indicates that the gains from their application are modest and context specific.
Such findings reinforce the call for a framework that goes beyond aggregation and incorporates multidimensional firm-level financial risk factors using sophisticated modeling methods. This study adds to the literature by examining these associations within a joint framework while offering an extensive comparison of econometric, volatility-based, machine learning, deep learning, and hybrid methods in a common empirical environment, focusing on emerging markets.
4. Research Methodology
4.1. Sample and Data Sources
The sample employed in this study consists of an unbalanced panel of publicly listed non-financial firms operating across Middle East and North Africa (MENA) capital markets over the period 2010–2024. The initial dataset was obtained from the London Stock Exchange Group (LSEG) and contained 2093 firms and 26,505 firm-year observations spanning 67 industries and 13 countries. The broad geographical coverage provides substantial variation in institutional quality, financial development, corporate governance practices, and market structures, making the sample particularly suitable for examining the determinants of stock return volatility in emerging markets. Following the data cleaning procedures and the exclusion of financial institutions and REITs, the final cleaned dataset comprised 1596 firms and 19,752 firm-year observations. However, the econometric analyses were conducted using a final estimation sample of 12,895 observations. This reduction is attributable to the use of lagged explanatory variables, lagged stock volatility, and lagged stock returns in the empirical models. Because all explanatory variables were specified in lagged form (t − 1), observations lacking the required prior-year information could not be retained in the estimation sample.
Firm-level financial and governance information was extracted from LSEG and subsequently transformed into a structured panel dataset. The data cleaning process involved several stages. First, duplicate variables and inconsistently labeled observations resulting from the original extraction format were identified and removed. Second, the dataset was reshaped into a firm-year panel structure, and all variables were standardized to ensure consistency across firms and reporting periods. Missing values were examined extensively, and observations with incomplete identifiers or insufficient information were corrected where possible. To preserve the maximum amount of information while minimizing data loss, missing observations were addressed using interpolation techniques and multiple imputation procedures for selected variables. Furthermore, logarithmic and inverse hyperbolic sine (IHS) transformations were applied to highly skewed financial variables to improve their distributional properties and reduce the influence of extreme values.
To mitigate the impact of outliers commonly observed in accounting and market-based variables, all continuous financial variables were winsorized at the 1st and 99th percentiles. Winsorization reduces the influence of extreme observations without eliminating valid firm-year information and is widely employed in empirical corporate finance and accounting research. Following winsorization, the distributions of the transformed variables were reassessed using skewness and kurtosis measures, resulting in substantially improved distributional characteristics suitable for econometric analysis.
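As an illustration, the winsorization and inverse hyperbolic sine steps described above could be implemented along the following lines. This is a minimal sketch rather than the study's actual code: the function names, the simulated variable, and the order in which the two steps are applied are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def winsorize(series: pd.Series, lower: float = 0.01, upper: float = 0.99) -> pd.Series:
    """Clip values to the [lower, upper] quantiles of the non-missing data."""
    lo, hi = series.quantile([lower, upper])
    return series.clip(lower=lo, upper=hi)


def ihs(series: pd.Series) -> pd.Series:
    """Inverse hyperbolic sine, log(x + sqrt(x^2 + 1)); defined for zero and negative values."""
    return np.arcsinh(series)


# Hypothetical right-skewed variable with one extreme observation
rng = np.random.default_rng(0)
leverage = pd.Series(np.append(rng.lognormal(0.0, 1.0, 999), 1e6))

leverage_w = winsorize(leverage)   # extreme value pulled in to the 99th percentile
leverage_ihs = ihs(leverage_w)     # compresses the remaining right tail
```

Unlike trimming, this keeps every firm-year observation in the sample; only the values in the tails are replaced by the percentile cut-offs.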
Consistent with prior corporate finance and risk management research, financial institutions were excluded from the final sample. Specifically, firms classified within the Banks, Capital Markets, Consumer Finance, Financial Services, and Insurance industries were removed because their capital structures, regulatory environments, and risk profiles differ fundamentally from those of non-financial corporations. In addition, Real Estate Investment Trusts (REITs), including Diversified REITs, Hotel & Resort REITs, Industrial REITs, and Residential REITs, were excluded because their operating and financing structures are governed by specialized regulatory frameworks that make them incomparable to conventional corporations.
After applying these screening criteria, the final sample comprises 1596 non-financial firms and 19,752 firm-year observations distributed across 58 industries and 13 MENA countries. The sample remains highly diversified geographically and industrially, providing a comprehensive representation of non-financial corporate activity across the MENA region as shown in
Table 1,
Table 2 and
Table 3.
The resulting dataset forms an unbalanced panel, reflecting differences in firm listing dates, disclosure practices, delistings, and data availability across countries and years. Such unbalanced panels are common in emerging-market research and are particularly suitable for panel-data econometric techniques because they maximize the use of available information while minimizing potential measurement error arising from excessive observation deletion. The final sample therefore provides a comprehensive and representative basis for investigating the determinants of stock return volatility across non-financial firms in MENA capital markets.
The dependent variable is stock return volatility, which serves as a measure of firm-level market risk. The explanatory variables are classified into three categories: financial structure risk, liquidity risk, and governance risk. Financial structure risk is represented by leverage and tangibility, liquidity risk is measured using the liquidity ratio and cash ratio, and governance risk is captured through board independence.
Several firm-specific control variables are included to account for differences in firm characteristics that may influence stock return volatility. These controls comprise firm size, measured as the natural logarithm of total assets; a market valuation proxy, measured as enterprise value divided by total assets; profitability, measured by return on assets (ROA); sales growth, measured as the annual percentage change in revenue per share; and firm age, measured as the number of years since incorporation. To address potential simultaneity bias and reduce endogeneity concerns, all explanatory and control variables are specified in lagged form
(t − 1). In addition, lagged stock volatility and lagged stock returns are incorporated as dynamic control variables to capture the persistence and clustering effects commonly observed in financial market volatility. The definitions of all the variables are presented in
Table 4.
4.2. Data Structure and Variable Construction
Let $i = 1, \dots, N$ denote firms and $t = 1, \dots, T$ denote years. Because the study relies on annual firm-level data, annual stock returns are derived from year-end stock prices and calculated as follows:

$$R_{i,t} = \ln\left(\frac{P_{i,t}}{P_{i,t-1}}\right)$$

where $P_{i,t}$ represents the stock price of firm $i$ at the end of year $t$. Stock return volatility is measured as the annualized standard deviation of daily stock returns within each year and is calculated as follows:

$$\sigma_{i,t} = \sqrt{252} \times \mathrm{SD}\left(r_{i,d,t}\right)$$

where $r_{i,d,t}$ denotes the daily stock return of firm $i$ on trading day $d$ during year $t$, and $\mathrm{SD}(r_{i,d,t})$ represents the standard deviation of all daily returns observed within that year. The factor $\sqrt{252}$ annualizes the volatility measure based on the average number of trading days in a year.
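The volatility measure described above can be sketched in a few lines of pandas. The column names (`firm`, `date`, `ret`), the simulated returns, and the use of 252 trading days are assumptions made for illustration; they are not taken from the study's data files.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed average number of trading days per year


def annual_volatility(daily: pd.DataFrame) -> pd.Series:
    """Annualized standard deviation of daily returns for each firm-year.

    Expects columns 'firm', 'date' (datetime64) and 'ret' (daily return).
    """
    with_year = daily.assign(year=daily["date"].dt.year)
    daily_sd = with_year.groupby(["firm", "year"])["ret"].std(ddof=1)
    return daily_sd * np.sqrt(TRADING_DAYS)


# Hypothetical firm with a 1% daily standard deviation over two years
dates = pd.bdate_range("2020-01-01", "2021-12-31")
rng = np.random.default_rng(1)
daily = pd.DataFrame({
    "firm": "A",
    "date": dates,
    "ret": rng.normal(0.0, 0.01, len(dates)),
})
vol = annual_volatility(daily)  # one value per (firm, year)
```

With a daily standard deviation of about 1%, the annualized figure is close to 0.01 × √252 ≈ 0.16, which gives a quick plausibility check on the scaling.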
This measure captures the annual level of stock return variability and serves as the study’s primary proxy for firm-level market risk. By utilizing daily return information while retaining an annual firm-year structure, the measure provides a more comprehensive assessment of volatility than volatility estimates derived solely from annual stock returns.
All explanatory variables are specified in lagged form (t − 1) to mitigate simultaneity bias and reduce potential endogeneity concerns. The explanatory variables are classified into three categories: financial structure risk, liquidity risk, and governance risk. Financial structure risk is represented by leverage, measured as total liabilities divided by shareholders’ equity, and tangibility, measured as net property, plant, and equipment divided by total assets. Liquidity risk is captured using the liquidity ratio (current assets divided by current liabilities) and the cash ratio (cash and cash equivalents divided by current liabilities). Governance risk is measured through board independence, calculated as the proportion of independent directors on the board.
Several firm-specific control variables are included to account for differences in firm characteristics that may affect stock return volatility. These controls include firm size, measured as the natural logarithm of total assets; market valuation, proxied by enterprise value divided by total assets; profitability, measured using return on assets (ROA); sales growth, measured as the annual percentage change in revenue per share; and firm age, measured as the number of years since incorporation. To capture the persistence typically observed in financial market volatility, the empirical models also include lagged stock volatility and lagged stock returns as dynamic control variables. This specification enables the analysis to account for both firm-specific characteristics and the time-series dynamics of stock return volatility.
4.3. Data Preprocessing and Statistical Diagnostics
Missing observations were handled using a sequential imputation procedure. First, missing values were linearly interpolated within each firm using the panel structure of the dataset. Second, any remaining gaps were addressed through forward-fill and backward-fill procedures. Finally, residual missing observations were imputed using Multiple Imputation by Chained Equations (MICE) implemented through Scikit-learn’s IterativeImputer with 20 iterations and a fixed random seed of 42. Following the imputation process, no missing values remained in the final analytical dataset.
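The three-step imputation sequence described above might look as follows in code. This is a sketch under stated assumptions: the column names and the toy panel are hypothetical, and `IterativeImputer` with its default settings produces a single chained-equation imputation rather than several imputed datasets. Only `max_iter=20` and `random_state=42` come from the text.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer


def impute_panel(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    """Interpolate within firm, then forward/backward fill, then chained-equation imputation."""
    out = df.sort_values(["firm", "year"]).copy()
    # Step 1: linear interpolation of interior gaps within each firm
    out[cols] = out.groupby("firm")[cols].transform(
        lambda s: s.interpolate(method="linear", limit_area="inside")
    )
    # Step 2: forward-fill, then backward-fill, still within each firm
    out[cols] = out.groupby("firm")[cols].transform(lambda s: s.ffill().bfill())
    # Step 3: remaining gaps (e.g. a firm with no observed value) via IterativeImputer
    if out[cols].isna().any().any():
        imputer = IterativeImputer(max_iter=20, random_state=42)
        out[cols] = imputer.fit_transform(out[cols])
    return out


# Hypothetical panel: firm A has an interior gap, firm B never reports leverage
panel = pd.DataFrame({
    "firm": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "year": [2010, 2011, 2012] * 3,
    "leverage": [1.0, np.nan, 3.0, np.nan, np.nan, np.nan, 2.0, 2.5, 3.0],
    "roa": [0.05, 0.06, 0.07, 0.02, 0.03, 0.04, 0.01, 0.02, 0.03],
})
clean = impute_panel(panel, ["leverage", "roa"])
```

Forward- and backward-filling across years carries values through time, so in a forecasting setting it may be preferable to fit the imputation steps on the training period only; the sketch does not address this.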
The dataset is preprocessed according to standard procedures. Extreme observations are winsorized at the 1st and 99th percentiles. Stationarity is tested using time-series and panel unit root tests, namely the Augmented Dickey–Fuller (ADF), Levin–Lin–Chu (LLC), and Im–Pesaran–Shin (IPS) tests, with appropriate transformations applied where necessary. Additionally, cross-sectional dependence is tested using the Pesaran CD test.
Correlation matrices and Variance Inflation Factors are used to assess multicollinearity. The Breusch–Pagan and White tests are used to test for heteroskedasticity. All econometric specifications use robust standard errors clustered at the firm level.
Structural stability is examined using a COVID-19 dummy variable representing the pandemic period. An F-test compares the restricted and unrestricted specifications to determine whether a significant structural break occurred during the COVID-19 period.
4.4. Econometric Analysis
To examine the determinants of stock volatility, this study employs a panel data framework that accounts for unobserved firm-specific heterogeneity (
Hausman, 2015;
Hommes, 2013). The baseline model is specified as follows:

$$\sigma_{i,t} = \mu_i + \beta' X_{i,t-1} + \varepsilon_{i,t}$$

where $\sigma_{i,t}$ represents the stock volatility of firm $i$ in period $t$. The term $\mu_i$ captures firm-specific effects that are invariant over time, while $X_{i,t-1}$ denotes a vector of lagged explanatory variables. The parameter vector $\beta$ measures the impact of the explanatory variables on stock volatility. Finally, $\varepsilon_{i,t}$ represents the idiosyncratic error term. All independent variables are lagged by one period to reduce possible endogeneity and simultaneity issues.
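To make the baseline fixed-effects specification concrete, the following sketch estimates the slope coefficients with the within transformation, using regressors lagged by one period. The function and column names and the simulated panel are hypothetical. The one-row `shift` also assumes consecutive years within each firm, which would need additional checks in an unbalanced panel with gaps.

```python
import numpy as np
import pandas as pd


def fe_within(df: pd.DataFrame, y: str, xs: list[str]) -> pd.Series:
    """Firm fixed-effects (within) estimator with one-period-lagged regressors."""
    d = df.sort_values(["firm", "year"]).copy()
    lagged = []
    for x in xs:
        name = f"{x}_lag1"
        d[name] = d.groupby("firm")[x].shift(1)  # value from the firm's previous row
        lagged.append(name)
    d = d.dropna(subset=[y] + lagged)
    # Subtract firm means to remove time-invariant firm effects
    cols = [y] + lagged
    demeaned = d[cols] - d.groupby("firm")[cols].transform("mean")
    beta, *_ = np.linalg.lstsq(demeaned[lagged].to_numpy(), demeaned[y].to_numpy(), rcond=None)
    return pd.Series(beta, index=lagged)


# Simulated panel with a known slope of 0.5 on the lagged regressor
rng = np.random.default_rng(0)
firms = np.repeat(np.arange(50), 10)
years = np.tile(np.arange(2010, 2020), 50)
df = pd.DataFrame({"firm": firms, "year": years, "x": rng.normal(size=500)})
firm_effect = rng.normal(size=50)[firms]
x_lag = df.groupby("firm")["x"].shift(1).fillna(0.0)
df["vol"] = firm_effect + 0.5 * x_lag + rng.normal(scale=0.01, size=500)

beta_hat = fe_within(df, "vol", ["x"])
```

Each firm's first observation has no lagged value and is dropped, which is the same mechanism by which the estimation sample shrinks once lagged variables are introduced.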
Both fixed-effects and random-effects models were estimated. The Hausman test strongly rejected the null hypothesis of no systematic difference between the estimators (χ² = 884.38, p < 0.001), indicating that the fixed-effects specification is more appropriate. Consequently, the study relies on a two-way fixed-effects model with Driscoll–Kraay standard errors for statistical inference.
Diagnostic testing, however, indicates that classical panel assumptions are violated, with evidence of both cross-sectional dependence and heteroskedasticity. The model is therefore estimated with Driscoll–Kraay standard errors, which yield consistent statistical inference that is robust to heteroskedasticity, serial correlation, and cross-sectional dependence. The expression for the adjusted variance–covariance matrix is as follows:

$$\hat{V}(\hat{\beta}) = (X'X)^{-1}\,\hat{S}_T\,(X'X)^{-1}, \qquad \hat{S}_T = \hat{\Omega}_0 + \sum_{j=1}^{m} w(j, m)\left[\hat{\Omega}_j + \hat{\Omega}_j'\right]$$

where $\hat{\Omega}_j$ denotes the covariance of residuals across cross-sections at lag $j$, and $w(j, m)$ represents kernel-based weights. This approach ensures consistent estimation in panels characterized by complex dependence structures, which are common in financial data.
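For readers who wish to see how such a covariance matrix is assembled, the sketch below computes a Driscoll–Kraay estimate for pooled OLS coefficients. It assumes Bartlett kernel weights and omits finite-sample corrections, so it should be read as an illustration of the construction rather than as the study's implementation. The function name and the simulated data are hypothetical.

```python
import numpy as np


def driscoll_kraay_cov(X, resid, time_ids, max_lag):
    """Driscoll-Kraay covariance of OLS coefficients with Bartlett kernel weights."""
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    time_ids = np.asarray(time_ids)
    periods = np.unique(time_ids)  # sorted time periods
    # Cross-sectional sum of x_it * e_it for each period (one k-vector per period)
    h = np.vstack([
        (X[time_ids == t] * resid[time_ids == t][:, None]).sum(axis=0)
        for t in periods
    ])
    S = h.T @ h  # lag-0 term
    for j in range(1, max_lag + 1):
        w = 1.0 - j / (max_lag + 1.0)   # Bartlett weight (assumed kernel)
        omega_j = h[j:].T @ h[:-j]      # sum over t of h_t h_{t-j}'
        S += w * (omega_j + omega_j.T)
    bread = np.linalg.inv(X.T @ X)
    return bread @ S @ bread


# Simulated panel: 30 firms observed over 12 years
rng = np.random.default_rng(0)
n_firms, n_years = 30, 12
n_obs = n_firms * n_years
time_ids = np.tile(np.arange(n_years), n_firms)
X = np.column_stack([np.ones(n_obs), rng.normal(size=n_obs)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n_obs)
beta = np.linalg.lstsq(X, y, rcond=None)[0]
V = driscoll_kraay_cov(X, y - X @ beta, time_ids, max_lag=2)
std_errors = np.sqrt(np.diag(V))
```

Because the moment conditions are summed across firms within each period before the lagged products are formed, arbitrary correlation between firms in the same year is absorbed into the estimate.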
4.5. Volatility Analysis
In addition to the panel framework, this study models the time-series dynamics of stock return volatility using ARCH-family models, which capture volatility clustering and persistence (
Antonakakis et al., 2020;
Bollerslev, 1986;
Hamilton & Susmel, 1994). The ARCH model specifies conditional variance as a function of past squared residuals:

$$\sigma_t^2 = \omega + \sum_{j=1}^{q} \alpha_j \varepsilon_{t-j}^2$$

To allow for both short-term shock effects and long-term persistence, the GARCH(1,1) model is employed:

$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2$$

where α measures the impact of recent shocks (the ARCH effect) and β captures volatility persistence (the GARCH effect). In conventional GARCH models, the condition α + β < 1 ensures covariance stationarity of the volatility process, whereas values approaching or exceeding unity indicate a highly persistent volatility process in which shocks dissipate slowly over time. Consequently, the magnitude of α + β provides an indication of the degree of volatility persistence in the return series.
To further account for potential asymmetry in the response of volatility to positive and negative shocks, the Exponential GARCH (EGARCH) model is estimated as follows:

$$\ln\left(\sigma_t^2\right) = \omega + \beta \ln\left(\sigma_{t-1}^2\right) + \alpha \left|\frac{\varepsilon_{t-1}}{\sigma_{t-1}}\right| + \gamma \frac{\varepsilon_{t-1}}{\sigma_{t-1}}$$

where $\gamma$ captures the asymmetric effect of shocks on conditional volatility and is commonly interpreted as the leverage effect in financial markets.
The EGARCH model can also be estimated under a Student's t distribution to account for possible deviations from normality and to capture heavy-tailed behavior in financial returns. This specification provides greater flexibility for modeling large observations, which are common in financial time series.
The joint specification combining a panel econometric model with ARCH-family volatility models provides a complete framework for analyzing the cross-sectional determinants of stock volatility and its dynamic behavior simultaneously. This joint estimation scheme ensures that the fundamental firm characteristics and the stochastic volatility processes are jointly accounted for in the empirical analysis.
The GARCH-family models were estimated using the pooled series of lagged stock returns obtained from the final panel dataset. Consequently, the estimated volatility models capture aggregate volatility dynamics represented in the pooled sample rather than firm-specific or country-specific conditional volatility processes.
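The role of α + β in the GARCH(1,1) model can be illustrated with a short variance filter and a persistence summary. The parameter values below are hypothetical and chosen only for the example; in practice they would be obtained by maximum likelihood, which this sketch does not perform.

```python
import numpy as np


def garch11_variance(returns, omega, alpha, beta):
    """Filter conditional variances: s2[t] = omega + alpha * e[t-1]^2 + beta * s2[t-1]."""
    eps = np.asarray(returns, dtype=float)
    eps = eps - eps.mean()        # demeaned returns used as residuals
    s2 = np.empty_like(eps)
    s2[0] = eps.var()             # initialize at the sample variance
    for t in range(1, len(eps)):
        s2[t] = omega + alpha * eps[t - 1] ** 2 + beta * s2[t - 1]
    return s2


def persistence_summary(omega, alpha, beta):
    """Persistence, unconditional variance and shock half-life implied by GARCH(1,1)."""
    p = alpha + beta
    stationary = p < 1.0
    return {
        "persistence": p,
        "stationary": stationary,
        "unconditional_variance": omega / (1.0 - p) if stationary else float("inf"),
        "half_life": float(np.log(0.5) / np.log(p)) if stationary else float("inf"),
    }


# Hypothetical parameters and simulated pooled returns
omega, alpha, beta = 1e-5, 0.10, 0.85
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, 500)
cond_var = garch11_variance(returns, omega, alpha, beta)
summary = persistence_summary(omega, alpha, beta)
```

With α + β = 0.95, a shock to the conditional variance takes ln(0.5)/ln(0.95) ≈ 13.5 periods to decay by half, which shows why values close to unity are described as highly persistent.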
4.6. Robustness Checks
To assess the robustness of the empirical findings, additional estimations were conducted by incorporating industry and country fixed effects. Industry fixed effects control for sector-specific characteristics that may influence stock return volatility, while country fixed effects account for differences in institutional environments, regulatory frameworks, and levels of financial market development across MENA countries. The results remained qualitatively similar to the baseline specification, indicating that the main findings are not driven by industry-specific or country-specific effects. Therefore, the conclusions regarding the relationship between financial risk factors and stock return volatility appear robust to alternative model specifications.
4.7. Machine Learning Models
To capture nonlinear relationships, ensemble machine learning models are implemented, specifically Random Forest and Extreme Gradient Boosting (XGBoost), as shown in
Figure 1 (
Breiman, 2001;
Chen & Guestrin, 2016). Model hyperparameters are optimized via grid search with time-series cross-validation, ensuring temporal ordering is preserved.
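The Random Forest prediction function is given by the following:

$$\hat{y}(x) = \frac{1}{B}\sum_{b=1}^{B} T_b(x)$$

while XGBoost models the output as follows:

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \mathcal{F}$$

where $T_b$ is the $b$-th of $B$ regression trees grown on bootstrap samples, and each $f_k$ is a regression tree from the function space $\mathcal{F}$ that is added sequentially to the boosted ensemble.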
To prevent data leakage, the dataset is partitioned using a chronological split, where training data precede validation and testing data. Model performance is evaluated on strictly out-of-sample observations. These models are particularly suited to high-dimensional financial datasets characterized by nonlinear dependencies.
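The tuning procedure described above can be sketched with scikit-learn's `TimeSeriesSplit`, in which every validation fold lies strictly after its training fold. The simulated data, the hyperparameter grid, and the 85/15 hold-out are assumptions for illustration, and only the Random Forest is shown. For a firm-year panel, rows would first need to be ordered by date so that a given year never appears on both sides of a split.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Hypothetical lagged features, already sorted in chronological order
rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 4))
y = np.abs(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=n)

# Chronological hold-out: the last 15% of observations are never used for tuning
cut = int(n * 0.85)
X_dev, y_dev = X[:cut], y[:cut]
X_test, y_test = X[cut:], y[cut:]

cv = TimeSeriesSplit(n_splits=3)
search = GridSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},  # illustrative grid
    cv=cv,
    scoring="neg_mean_squared_error",
)
search.fit(X_dev, y_dev)

# Out-of-sample fit of the selected model on the untouched test block
test_r2 = search.best_estimator_.score(X_test, y_test)
```

Ordinary shuffled k-fold cross-validation would let later observations inform the prediction of earlier ones, which is the leakage the chronological design is meant to rule out.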
4.8. Deep Learning Models
To capture the nonlinear and dynamic nature of stock return volatility, this study employs two recurrent neural network architectures: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks. These models are particularly suitable for financial time-series forecasting because they can capture temporal dependencies and sequential patterns that may not be adequately modeled using traditional econometric techniques (
Cahuantzi et al., 2023).
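The LSTM architecture addresses the vanishing gradient problem through a memory cell and a set of gating mechanisms that regulate the flow of information over time, as presented in
Figure 2. The model is represented as follows:

$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)$$
$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right)$$
$$o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right)$$
$$\tilde{c}_t = \tanh\left(W_c [h_{t-1}, x_t] + b_c\right)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh(c_t)$$

where $f_t$, $i_t$, and $o_t$ denote the forget, input, and output gates, respectively; $c_t$ represents the cell state; $h_t$ denotes the hidden state; $\sigma$ is the sigmoid activation function; and $\odot$ denotes element-wise multiplication.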
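The gating mechanism can be made concrete with a single forward step of an LSTM cell written in NumPy. This sketch follows the standard LSTM formulation; stacking the four gate weight matrices into one array `W` is a convenience assumed here, and the dimensions in the example are arbitrary.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    W has shape (4n, n + d) and maps the concatenated [h_prev, x_t] to the
    pre-activations of the forget, input and output gates and the candidate state.
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0:n])            # forget gate
    i = sigmoid(z[n:2 * n])        # input gate
    o = sigmoid(z[2 * n:3 * n])    # output gate
    g = np.tanh(z[3 * n:4 * n])    # candidate cell state
    c = f * c_prev + i * g         # element-wise update of the cell state
    h = o * np.tanh(c)             # hidden state passed to the next step
    return h, c


# Arbitrary sizes: 3 input features, 4 hidden units
rng = np.random.default_rng(0)
d, n = 3, 4
W = rng.normal(scale=0.1, size=(4 * n, n + d))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x_t in rng.normal(size=(5, d)):   # run the cell over a sequence of length 5
    h, c = lstm_step(x_t, h, c, W, b)
```

The additive update of the cell state is what allows gradients to flow across many time steps, in contrast to a plain recurrent layer in which the state is repeatedly passed through a squashing nonlinearity.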
The GRU architecture provides a computationally simpler alternative by combining update and reset gates while preserving the ability to capture long-term dependencies in sequential data.
To prevent look-ahead bias and data leakage, observations are partitioned using a chronological time-series split in which the training sample precedes the validation sample and the validation sample precedes the testing sample. The dataset is divided into 70% training observations, 15% validation observations, and 15% testing observations. Rolling sequences of historical observations are constructed and used as inputs to predict future stock volatility. Model performance is evaluated exclusively on strictly out-of-sample test observations.
Both the LSTM and GRU models consist of two hidden layers containing 64 and 32 units, respectively, with a dropout rate of 0.20 applied between layers to reduce overfitting. The models are estimated on a pooled dataset of firms, preserving the chronological ordering of observations, and are trained using the Adam optimizer and the mean squared error (MSE) loss function, with a batch size of 32 observations and a maximum of 100 training epochs. Hyperparameters are selected based on validation performance, and early stopping based on validation loss is implemented to prevent overfitting and improve model generalization. These implementation choices enhance the transparency and reproducibility of the forecasting framework.
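The construction of rolling input sequences and the chronological 70/15/15 split can be sketched as follows. The window length, the array names, and the toy data are assumptions. For a pooled panel, the sequences would be built firm by firm before pooling so that no window spans two firms, a step this sketch omits.

```python
import numpy as np


def make_sequences(features, target, window):
    """Build inputs of shape (samples, window, n_features).

    Each target is the observation immediately following its input window.
    """
    X = np.stack([features[s:s + window] for s in range(len(features) - window)])
    y = target[window:]
    return X, y


def chronological_split(X, y, train=0.70, val=0.15):
    """Split ordered samples so that training precedes validation, which precedes testing."""
    n = len(X)
    i, j = int(n * train), int(n * (train + val))
    return (X[:i], y[:i]), (X[i:j], y[i:j]), (X[j:], y[j:])


# Toy series of 100 ordered observations with 3 features
features = np.column_stack([np.arange(100.0), np.zeros(100), np.ones(100)])
target = np.arange(100.0)

X, y = make_sequences(features, target, window=5)
(train_X, train_y), (val_X, val_y), (test_X, test_y) = chronological_split(X, y)
```

Any scaling of the inputs would likewise be fitted on the training block alone and then applied to the validation and test blocks, so that no information from later periods enters the training stage.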
4.9. Hybrid GARCH–Deep Learning Model
A hybrid modeling framework, presented in Figure 3, is employed to integrate the econometric and deep learning approaches. In the first stage, firm-level GJR-GARCH models are estimated to generate conditional variance forecasts. In the second stage, these forecasts are incorporated as inputs into the deep learning models alongside lagged returns and financial variables. This approach is intended to capture both structured volatility dynamics and complex nonlinear interactions, with the aim of enhancing predictive performance.
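A minimal sketch of the two-stage idea follows, assuming a zero-mean GJR-GARCH(1,1) recursion with already-estimated parameters. The parameter values, the simulated returns, and the placeholder financial variables are hypothetical and serve only to show how the first-stage forecasts can be aligned with the other inputs.

```python
import numpy as np

def gjr_garch_variance(returns, omega, alpha, gamma, beta):
    """Conditional variance path of a zero-mean GJR-GARCH(1,1).

    sigma2[t + 1] is the one-step-ahead forecast formed with returns up to t.
    """
    eps = np.asarray(returns, dtype=float)
    sigma2 = np.empty(len(eps) + 1)
    # Illustrative initialisation; it uses the full sample, which a strict
    # out-of-sample setup would replace with a training-sample estimate.
    sigma2[0] = eps.var()
    for t in range(len(eps)):
        leverage = gamma if eps[t] < 0 else 0.0  # extra weight on negative shocks
        sigma2[t + 1] = omega + (alpha + leverage) * eps[t] ** 2 + beta * sigma2[t]
    return sigma2

rng = np.random.default_rng(1)
r = rng.normal(scale=0.02, size=250)  # simulated returns (illustrative)
fin = rng.normal(size=(250, 2))       # placeholder lagged financial variables
sigma2 = gjr_garch_variance(r, omega=1e-5, alpha=0.05, gamma=0.08, beta=0.88)

# Row i describes period i + 1 using only information available beforehand:
# the GARCH forecast for that period, the previous return, and the previous
# financial variables.
hybrid_inputs = np.column_stack([sigma2[1:-1], r[:-1], fin[:-1]])
```

The resulting matrix would then be converted into rolling sequences and passed to the LSTM or GRU in the second stage.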
A recursive, expanding-window forecasting approach is adopted to ensure a realistic out-of-sample evaluation. Models are estimated using data up to time t and used to generate one-step-ahead forecasts for t + 1; the estimation window then expands by one period and the procedure is repeated.
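The forecasting scheme described above can be sketched with a simple linear model standing in for the actual forecasting models. The function name and the OLS stand-in are assumptions, and the predictors in `X` are assumed to be lagged values already known at the forecast origin.

```python
import numpy as np

def expanding_window_forecasts(y, X, first_train_size):
    """One-step-ahead OLS forecasts; the estimation sample grows each step."""
    preds = []
    for t in range(first_train_size, len(y)):
        # Estimate on observations 0 .. t-1 only.
        X_train = np.column_stack([np.ones(t), X[:t]])
        coef, *_ = np.linalg.lstsq(X_train, y[:t], rcond=None)
        # Forecast observation t, which was not used in estimation.
        preds.append(np.concatenate([[1.0], X[t]]) @ coef)
    return np.asarray(preds)

X_demo = np.arange(30.0).reshape(-1, 1)
y_demo = 2.0 + 3.0 * X_demo[:, 0]  # exact linear relation for checking
preds = expanding_window_forecasts(y_demo, X_demo, first_train_size=10)
```

Replacing `X[:t]` and `y[:t]` with the most recent fixed number of observations would turn this into a fixed-length rolling window instead.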
Forecast accuracy is evaluated using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), out-of-sample R², and the QLIKE loss function. Statistical differences in predictive performance are assessed using the Diebold–Mariano test.
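For concreteness, the loss functions named above can be written as follows. The out-of-sample R² is computed here relative to a benchmark forecast (for example, the historical mean), and QLIKE uses the normalized form y/f − ln(y/f) − 1; both choices are assumptions, as the text does not state which variants were used.

```python
import numpy as np

def rmse(y, f):
    return float(np.sqrt(np.mean((y - f) ** 2)))

def mae(y, f):
    return float(np.mean(np.abs(y - f)))

def oos_r2(y, f, benchmark):
    """1 - SSE(model) / SSE(benchmark); negative if the model loses to the benchmark."""
    return float(1.0 - np.sum((y - f) ** 2) / np.sum((y - benchmark) ** 2))

def qlike(y, f):
    """QLIKE loss for strictly positive volatility proxies y and forecasts f."""
    ratio = y / f
    return float(np.mean(ratio - np.log(ratio) - 1.0))

y = np.array([0.04, 0.09, 0.16])  # toy realized values
f = np.array([0.05, 0.08, 0.16])  # toy forecasts
bench = np.full_like(y, 0.10)     # hypothetical benchmark forecast
metrics = {
    "rmse": rmse(y, f),
    "mae": mae(y, f),
    "oos_r2": oos_r2(y, f, bench),
    "qlike": qlike(y, f),
}
```

QLIKE equals zero for a perfect forecast and penalizes under-prediction more heavily than over-prediction of the same size.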
4.10. Model Implementation and Replication Procedures
To enhance transparency and reproducibility, all machine learning, deep learning, and hybrid forecasting models were implemented within a standardized computational framework, shown in Appendix A. Data preprocessing included missing-value treatment, variable transformations, winsorization at the 1st and 99th percentiles, and the construction of lagged explanatory variables. All models were estimated using identical input variables and evaluation procedures to ensure comparability across forecasting approaches.
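Two of the preprocessing steps, winsorization and lag construction, can be sketched as below. The helper names are hypothetical, and the panel is assumed to be sorted by firm and then by period so that a lag never crosses from one firm to the next.

```python
import numpy as np

def winsorize(x, lower_pct=1.0, upper_pct=99.0):
    """Clip values to the given lower and upper percentiles."""
    lo, hi = np.nanpercentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)

def lag_within_firm(values, firm_ids, periods=1):
    """Lag a column of a firm-sorted, time-sorted panel (periods >= 1)."""
    values = np.asarray(values, dtype=float)
    firm_ids = np.asarray(firm_ids)
    lagged = np.full(values.shape, np.nan)
    lagged[periods:] = values[:-periods]
    # Keep a lag only when it comes from the same firm.
    same_firm = np.zeros(values.shape, dtype=bool)
    same_firm[periods:] = firm_ids[periods:] == firm_ids[:-periods]
    lagged[~same_firm] = np.nan
    return lagged

clipped = winsorize(np.arange(101.0))
lagged = lag_within_firm([10, 11, 12, 20, 21], firm_ids=[1, 1, 1, 2, 2])
```

To stay consistent with the chronological split, percentile cut-offs would ordinarily be computed on the training sample and then applied to the validation and test samples.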
To ensure robust performance and prevent look-ahead bias, observations are partitioned using a chronological time-series split (70% training, 15% validation, and 15% testing), ensuring that the training set strictly precedes the validation and testing sets. Historical data are transformed into rolling sequences, which serve as inputs for predicting future stock volatility.
Both the LSTM and GRU models consist of two hidden layers (64 and 32 units, respectively) with a 0.20 dropout rate applied between layers to mitigate overfitting. The models are trained using the Adam optimizer and the mean squared error (MSE) loss function, with a batch size of 32 and a maximum of 100 epochs. To further improve model generalization and prevent overfitting, early stopping based on validation loss is implemented. Model performance is evaluated exclusively on strictly out-of-sample test observations, ensuring the transparency and reproducibility of the forecasting framework.
Hyperparameter tuning was performed using the validation dataset. For the Random Forest model, the number of trees, maximum tree depth, and minimum node size were evaluated across alternative specifications. For XGBoost, tuning included the learning rate, tree depth, number of estimators, and subsampling parameters. For the LSTM and GRU models, alternative sequence lengths, hidden-layer dimensions, dropout rates, batch sizes, and learning rates were evaluated. The final model specifications were selected based on the lowest validation error.
To ensure reproducibility, all experiments were conducted using a fixed random seed. The complete forecasting workflow followed a consistent sequence of data preprocessing, model training, hyperparameter optimization, out-of-sample forecasting, and performance evaluation. Forecasting accuracy was assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²).
The forecasting procedure is summarized in Algorithm 1:
Algorithm 1. Stock Volatility Forecasting Framework
Input: Firm-level financial variables, governance variables, and stock return data.
Output: Out-of-sample stock volatility forecasts.
▪ Collect firm-level financial data and stock return data.
▪ Clean, transform, and winsorize variables.
▪ Construct lagged explanatory variables and stock volatility measures.
▪ Partition observations into training (70%), validation (15%), and testing (15%) samples.
▪ Generate rolling input sequences for recurrent neural network models.
▪ Train Random Forest and XGBoost models using the training sample.
▪ Train LSTM and GRU models and optimize hyperparameters using the validation sample.
▪ Estimate GARCH-family models and generate volatility forecasts.
▪ Construct hybrid GARCH–deep learning models using GARCH volatility estimates as additional inputs.
▪ Generate out-of-sample forecasts using the testing sample.
▪ Evaluate forecasting performance using RMSE, MAE, MAPE, and R².
▪ Compare forecasting performance across econometric, machine learning, deep learning, and hybrid models.
5. Results
5.1. Descriptive Analysis
Table 5 presents the descriptive statistics for all variables employed in the analysis. The average stock volatility is 0.3968, with a median value of 0.2956, indicating moderate variation in stock price fluctuations across firms. The standard deviation of 0.5016 suggests considerable dispersion in volatility levels among the sampled firms.
Regarding firm-specific characteristics, the mean leverage ratio is 2.2025, while the median is substantially lower at 0.8034. Similarly, Tobin’s Q exhibits a mean of 7.2272 and a median of 0.8719. The large discrepancies between means and medians, together with extremely high maximum values, indicate the presence of outliers and positively skewed distributions. Sales growth also demonstrates substantial variability, with a mean of 6.0096 and a maximum value of 33,878.5915, reflecting significant heterogeneity in firm growth performance.
Liquidity measures display notable dispersion. The mean liquidity ratio and cash ratio are 3.5676 and 1.2676, respectively, whereas their median values are considerably lower at 1.4746 and 0.2168. This suggests that a relatively small number of firms maintain exceptionally high liquidity positions. Board independence averages 4.8899, indicating that independent directors constitute a meaningful proportion of board membership across the sample.
The average firm size, measured by the natural logarithm of total assets, is 18.6814, with a median value of 18.6551, suggesting a relatively symmetric distribution. Firm age averages 27.51 years, indicating that the sample primarily consists of mature firms. Profitability, measured by return on assets (ROA), exhibits a mean of −0.0572 and a median of 0.0315, implying that while most firms report positive profitability, several firms experienced substantial losses during the sample period.
The lagged variables exhibit characteristics similar to their contemporaneous counterparts. In particular, lagged volatility has a mean of 0.3908 and a median of 0.2915, reflecting persistence in stock volatility over time. Likewise, lagged returns display a mean of 0.0753 and a median close to zero, suggesting considerable variation in firms’ historical stock performance.
5.2. Correlation Analysis, Multicollinearity, and Diagnostic Tests
Table 6 presents the Pearson correlation matrix for all variables included in this study. The results indicate that most pairwise correlations among the explanatory variables are relatively low, suggesting weak linear relationships. However, a relatively high correlation is observed between Lag Liquidity Ratio and Lag Cash Ratio (r = 0.800), which may signal potential multicollinearity according to some thresholds in the literature. In addition, a strong correlation is found between Stock Volatility and Lagged Volatility (r = 0.828), which is theoretically expected and reflects the persistence of volatility over time rather than redundancy among independent variables, as it involves a dependent variable and its lagged value.
Despite the relatively high correlation between the liquidity and cash ratios, reliance on pairwise correlations alone is insufficient to conclude the presence of multicollinearity. Therefore, the Variance Inflation Factor (VIF) is employed as a more robust diagnostic measure. As reported in Table 7, the VIF values range from 1.000 to 2.140, which are well below the commonly accepted thresholds of 5 and 10. The highest VIF values correspond to Lag Liquidity Ratio (2.140) and Lag Cash Ratio (2.050), consistent with their correlation level. Nevertheless, these values remain within acceptable limits, indicating that multicollinearity is not severe and does not undermine the reliability of the estimated coefficients.
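For readers unfamiliar with the diagnostic, the VIF of regressor j is 1 / (1 − R²ⱼ), where R²ⱼ comes from regressing that variable on all other regressors. The sketch below computes it with NumPy on simulated data; the data and function name are illustrative and unrelated to the values in Table 7.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X, using an auxiliary OLS with an intercept."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        r2 = 1.0 - resid.var() / y.var()  # R-squared of the auxiliary regression
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

rng = np.random.default_rng(42)
a = rng.normal(size=500)
b = 0.8 * a + 0.6 * rng.normal(size=500)  # correlated with a
c = rng.normal(size=500)                  # roughly unrelated
X_sim = np.column_stack([a, b, c])
vifs = variance_inflation_factors(X_sim)
```

The same numbers appear on the diagonal of the inverse correlation matrix of the regressors, which offers a quick cross-check.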
Table 8 summarizes the diagnostic tests conducted to evaluate the suitability of the panel regression model. The Breusch–Pagan test yields a statistically significant result (BP = 440.430, p < 0.001), indicating the presence of heteroskedasticity. Similarly, the Wooldridge/Breusch–Godfrey test reveals serial correlation in the idiosyncratic errors (χ² = 6.007, p = 0.014). The Pesaran CD test further indicates significant cross-sectional dependence among firms (z = 34.336, p < 0.001), suggesting that common market-wide shocks affect multiple firms simultaneously. In addition, the structural break test associated with the COVID-19 period is highly significant (F = 19.998, p < 0.001), providing evidence that the pandemic altered the volatility process during the sample period.
Finally, the Hausman specification test strongly rejects the null hypothesis that the random-effects estimator is consistent (χ² = 884.380, p < 0.001). This finding indicates that firm-specific effects are correlated with the explanatory variables and supports the use of a fixed-effects model rather than a random-effects model. Given the presence of heteroskedasticity, serial correlation, and cross-sectional dependence, this study employs fixed-effects estimation with Driscoll–Kraay robust standard errors. This approach produces reliable statistical inference by correcting for these econometric issues while accounting for unobserved firm heterogeneity.
5.3. Econometric Analysis Results
Table 9 reports the estimation results from pooled OLS, firm fixed-effects, two-way fixed-effects, and two-way fixed-effects models with Driscoll–Kraay standard errors. Because the diagnostic tests reveal heteroskedasticity, serial correlation, and cross-sectional dependence, the two-way fixed-effects model with Driscoll–Kraay standard errors is adopted as the preferred specification, as it provides more reliable inference under these conditions.
The results show that lagged volatility is the most consistent determinant of current stock volatility. Its coefficient remains positive and highly statistically significant across all model specifications, confirming the presence of strong volatility persistence over time. Lagged return also exhibits a positive and statistically significant effect in the preferred model, suggesting that higher past returns are associated with increased subsequent volatility.
Regarding firm-specific characteristics, the liquidity ratio is positively related to stock volatility and remains statistically significant in the preferred specification. In contrast, the cash ratio shows a negative association with volatility, although its level of statistical significance weakens after applying Driscoll–Kraay standard errors. These findings suggest that while higher liquidity may be linked to greater exposure to market fluctuations, holding more cash can contribute to stabilizing stock price movements.
Some variables, however, do not display robust effects across model specifications. In particular, leverage, tangibility, board independence, firm size, Tobin’s Q, and profitability are not statistically significant in the preferred model, despite showing significance in pooled OLS in some cases. This indicates that their effects are sensitive to controlling for unobserved heterogeneity and time effects.
Firm age shows a positive and statistically significant effect in the preferred model, although its magnitude differs across specifications, suggesting some degree of sensitivity. Finally, while lagged sales growth appears statistically significant under Driscoll–Kraay standard errors, its coefficient is economically negligible, implying a limited practical impact on stock volatility.
Consequently, the results provide evidence that stock volatility is primarily driven by its own past behavior and, to a lesser extent, by return dynamics and selected liquidity-related factors. However, the sensitivity of some coefficients across specifications suggests that the findings should be interpreted with appropriate caution.
5.4. Robustness Analysis of Standard Errors
Table 10 reports a comparison of alternative standard error estimators for the two-way fixed-effects model. The results highlight clear differences between conventional and adjusted standard errors, underscoring the importance of correcting for violations of classical regression assumptions.
In general, standard errors increase for several key variables when more robust estimation techniques are employed. This pattern is particularly evident for Lagged Volatility and Lagged Return. The standard error of Lagged Volatility rises substantially from 0.00789 under conventional estimation to 0.05473 using Driscoll–Kraay corrections. Similarly, the standard error of Lagged Return increases from 0.00411 to 0.01301. These findings indicate that conventional fixed-effects standard errors may considerably underestimate true variability when heteroskedasticity, serial correlation, and cross-sectional dependence are present.
For other variables, such as Lag Liquidity Ratio and Lag ROA, standard errors also increase under robust and clustered estimators, although to a lesser extent. In contrast, some variables, including Lag Tangibility and Lag Board Independence, exhibit relatively small changes or even slight reductions in standard errors, suggesting that the impact of misspecification is not uniform across regressors.
A comparison across estimators shows that standard errors clustered at the firm and year levels are, in some cases, comparable to Driscoll–Kraay estimates. However, notable differences remain for key variables, particularly Lagged Volatility and Lagged Return, where Driscoll–Kraay standard errors are substantially larger. This reinforces the relevance of accounting for cross-sectional dependence in addition to heteroskedasticity and serial correlation.
It is also worth noting that the standard errors associated with Lag Sales Growth are extremely small across all specifications. This likely reflects the limited scale of the variable rather than exceptionally high estimation precision.
Accordingly, the evidence suggests that inference based on conventional standard errors may be misleading in this context. Given that the diagnostic tests confirm the presence of heteroskedasticity, serial correlation, and cross-sectional dependence, the use of Driscoll–Kraay standard errors is warranted, as this approach simultaneously addresses these econometric issues and provides more reliable statistical inference.
Furthermore, to verify the robustness of the baseline findings, the model is re-estimated by incorporating country fixed effects, industry fixed effects, and both sets simultaneously (Table 11). The results show that the main variables of interest remain largely stable across specifications, although some control variables exhibit changes in statistical significance.
In particular, the Lag Liquidity Ratio remains positive and highly statistically significant across all models, with coefficients ranging from 0.00193 to 0.00207. Similarly, the Lag Cash Ratio retains a negative and highly significant effect, with coefficients between −0.00288 and −0.00315. Lagged Volatility continues to display the strongest positive association with current stock volatility, with coefficients close to 0.79 in all specifications, indicating persistent volatility dynamics. Lagged Return also remains positive and statistically significant, with coefficients around 0.04.
With respect to control variables, some coefficients exhibit sensitivity to the inclusion of additional fixed effects. For instance, firm size becomes statistically significant across specifications, while profitability (ROA) and firm age show significance in certain models. These variations suggest that controlling for country and industry heterogeneity affects the estimated impact of selected firm characteristics.
The explanatory power of the model improves slightly after accounting for country and industry effects. The R² increases to 0.6708 under the country fixed-effects specification, 0.6698 under the industry specification, and 0.6727 when both are included. However, the differences across these specifications remain limited, indicating that both dimensions contribute comparably to explaining variation in stock volatility.
Thus, the results are broadly consistent with the baseline findings, providing supportive evidence for the robustness of the main conclusions while highlighting some sensitivity among control variables.
5.5. Volatility Analysis Results
To examine the dynamic behavior of stock return volatility, this study estimates three conditional volatility models: GARCH(1,1), EGARCH, and EGARCH-t. These models are designed to capture volatility clustering, persistence, and potential asymmetric responses to market shocks. Model performance is assessed using the Akaike Information Criterion (AIC), where lower values indicate a superior fit. All volatility models were estimated using maximum likelihood estimation on the aggregate stock return series.
Table 12 presents the estimated parameters and model selection statistics. Among the competing specifications, the EGARCH-t model yields the lowest AIC value (0.7704), followed by the EGARCH model (1.6673) and the standard GARCH(1,1) model (1.7698). While these results favor the EGARCH-t specification, the differences across models should be interpreted with caution.
The persistence parameter (β) is highest in the EGARCH-t model (β = 0.9720), followed by the EGARCH model (β = 0.9205), indicating that volatility shocks are highly persistent and dissipate gradually over time. By comparison, the GARCH(1,1) model exhibits a lower β (0.4413), implying a weaker carryover of past conditional variance.
The shock parameter (α) captures the immediate response of volatility to new information. Within the EGARCH framework, the estimated coefficients (α = −0.3175 for EGARCH and α = −0.6931 for EGARCH-t) suggest a relatively strong response of volatility to shocks, particularly under the Student-t specification. The GARCH(1,1) model also exhibits a substantial shock response (α = 0.6208), although direct comparisons across model types should be interpreted with caution due to differences in model structure.
The estimated asymmetry coefficients further highlight differences across specifications. The EGARCH model produces a positive asymmetry coefficient (γ = 0.4325), while the EGARCH-t model reports a larger positive coefficient (γ = 1.0550). These results indicate the presence of asymmetric volatility dynamics, suggesting that positive and negative shocks have different effects on conditional volatility. The larger coefficient under the Student-t specification further indicates that asymmetry becomes more pronounced when heavy-tailed return behavior is taken into account.
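To make the roles of the three parameters concrete, one common way of writing an EGARCH(1,1) model is given below. The text does not state the exact parameterization used in estimation, so the mapping of α to the magnitude term and γ to the sign term is an assumption that follows the labels used above; some software labels these two terms the other way around, which changes how the signs should be read.

```latex
\ln \sigma_t^{2}
  = \omega
  + \beta \ln \sigma_{t-1}^{2}
  + \alpha \left( |z_{t-1}| - \mathbb{E}|z_{t-1}| \right)
  + \gamma \, z_{t-1},
\qquad
z_t = \frac{\varepsilon_t}{\sigma_t}.
```

Under this form, a positive standardized shock changes log-variance by (α + γ)|z| and a negative shock of the same size by (α − γ)|z|, apart from the centering constant, so any nonzero γ implies an asymmetric response.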
The GARCH(1,1) model was estimated using maximum likelihood under the assumption of conditional normality. Both the shock parameter (α = 0.6208) and the persistence parameter (β = 0.4413) are statistically significant. The combined persistence measure (α + β = 1.0621) exceeds unity, which violates the covariance-stationarity condition α + β < 1 and points to an integrated-type volatility process in which shocks do not die out, rather than a covariance-stationary one.
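The implication of α + β exceeding one can be checked with a few lines of arithmetic. In the sketch below, α and β are the reported estimates, while `omega` and the starting variance are hypothetical because they are not reported in this section.

```python
alpha, beta = 0.6208, 0.4413  # GARCH(1,1) estimates reported in the text
persistence = alpha + beta

def unconditional_variance(omega, alpha, beta):
    """Long-run variance omega / (1 - alpha - beta); undefined if alpha + beta >= 1."""
    if alpha + beta >= 1.0:
        return None
    return omega / (1.0 - alpha - beta)

def variance_forecast(sigma2_next, omega, alpha, beta, horizon):
    """Multi-step forecasts via E[s2(t+h)] = omega + (alpha + beta) * E[s2(t+h-1)]."""
    path = [sigma2_next]
    for _ in range(horizon - 1):
        path.append(omega + (alpha + beta) * path[-1])
    return path

omega = 0.01  # hypothetical value for illustration
long_run = unconditional_variance(omega, alpha, beta)
path = variance_forecast(0.10, omega, alpha, beta, horizon=5)
```

With these values the long-run variance is undefined and the multi-step forecasts grow with the horizon instead of reverting to a finite level.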
Diagnostic tests support the adequacy of the GARCH specification. The Ljung–Box tests on standardized squared residuals fail to reject the null hypothesis of no remaining serial correlation, while the ARCH-LM tests indicate the absence of residual ARCH effects. In addition, the sign bias tests do not indicate significant model misspecification. Hence, these results suggest that the estimated volatility models provide a broadly adequate representation of the underlying volatility process.
5.6. Machine Learning and Deep Learning Forecasting Performance
To complement the econometric and volatility modeling frameworks, this study evaluates the out-of-sample forecasting performance of several machine learning and deep learning approaches. Specifically, Random Forest (RF), Extreme Gradient Boosting (XGBoost), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) models are employed to forecast stock volatility. Forecast accuracy is assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²). Furthermore, the Diebold–Mariano (DM) test is conducted to compare predictive accuracy across competing forecasting frameworks.
Table 13 presents a comprehensive cross-framework comparison encompassing econometric, volatility, machine learning, deep learning, and hybrid forecasting models. The results indicate that forecasting performance is highly similar across the alternative approaches. The Pooled OLS model achieves the lowest RMSE (0.1704) and shares the lowest MAE (0.1471) with the LSTM model. Among the machine learning and deep learning techniques, LSTM exhibits the lowest MAPE (75.9936%), with Random Forest (76.4015%) and Pooled OLS (76.4019%) close behind. In contrast, the volatility-based GARCH(1,1) and EGARCH models produce substantially larger forecasting errors, with RMSE values of 0.1840 and 0.1839, respectively, and MAPE values exceeding 92%. The Hybrid GARCH–LSTM model does not generate meaningful forecasting gains relative to the standalone LSTM model, recording marginally higher error measures. Moreover, all competing models exhibit negative or near-zero R² values, suggesting limited explanatory power and underscoring the inherent difficulty of accurately forecasting firm-level stock volatility.
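A small numerical example may help explain how a forecast can show a negative R² and a large MAPE at the same time. The numbers below are invented for illustration and are not drawn from Table 13.

```python
import numpy as np

def mape(y, f):
    """Mean absolute percentage error, in percent (requires nonzero y)."""
    return float(np.mean(np.abs((y - f) / y)) * 100.0)

def r_squared(y, f):
    """1 - SS_res / SS_tot, evaluated on the test sample."""
    ss_res = np.sum((y - f) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

y_toy = np.array([0.10, 0.20, 0.30, 0.40])  # invented realized volatility
f_toy = np.array([0.30, 0.30, 0.30, 0.30])  # constant forecast off the test mean

toy_mape = mape(y_toy, f_toy)
toy_r2 = r_squared(y_toy, f_toy)
```

Here the forecast does worse than the test-sample mean, so R² is negative, and the percentage error is dominated by the observation with the smallest realized value.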
The Diebold–Mariano predictive accuracy statistic is −1.2520. Under the test's asymptotic standard normal distribution, its absolute value falls below the 1.96 critical value, so the null hypothesis of equal predictive accuracy cannot be rejected at the 5% level and the differences in forecast accuracy between the forecasts under comparison are not statistically significant. Consistent with the small variations observed in RMSE, MAE, and MAPE across the leading models, the findings indicate that advanced machine learning, deep learning, and hybrid forecasting architectures offer only limited incremental predictive benefits over simpler econometric specifications. The evidence supports the view that stock volatility remains highly persistent and challenging to forecast, regardless of the modeling framework employed.
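The test statistic can be sketched as follows for one-step-ahead forecasts under squared-error loss. This is a simplified version that ignores autocorrelation in the loss differential and uses a normal reference distribution; which pair of models underlies the reported statistic is not stated in the text, so the error series here are toy values.

```python
import math
import numpy as np

def diebold_mariano(e1, e2):
    """DM statistic and two-sided p-value for one-step-ahead forecast errors."""
    d = np.asarray(e1, dtype=float) ** 2 - np.asarray(e2, dtype=float) ** 2
    n = len(d)
    dm = d.mean() / math.sqrt(d.var(ddof=1) / n)   # mean loss differential / its std. error
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))  # two-sided, standard normal
    return dm, p_value

# Toy forecast errors from two hypothetical models.
dm_toy, p_toy = diebold_mariano([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 2.0])

# Two-sided p-value implied by the reported statistic of -1.2520.
p_reported = math.erfc(1.2520 / math.sqrt(2.0))
```

For the reported statistic, the implied two-sided p-value is roughly 0.21 under the normal approximation.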
The out-of-sample forecasting results are presented in Figure 4, which compares actual stock volatility with forecasts generated by the Random Forest and Hybrid GARCH–LSTM models. The results reveal distinct forecasting behaviors across the two approaches. The Random Forest model exhibits slightly greater responsiveness to fluctuations in realized volatility than the GRU model. However, both forecasting approaches generate substantially smoother volatility paths and do not fully capture the magnitude of extreme volatility spikes and troughs observed in the actual series. In contrast, the Hybrid GARCH–LSTM model produces smoother forecasts that follow the underlying trend of volatility but are less responsive to abrupt changes in realized volatility. While the hybrid model effectively filters noise and captures broader volatility regimes, its forecasts exhibit a smoothing effect that reduces sensitivity to extreme market movements.
Figure 4 suggests that the Random Forest model provides closer short-term tracking of realized volatility, whereas the Hybrid GARCH–LSTM model is more effective in capturing long-run volatility trends.
To evaluate the stability of forecasting performance over time, Figure 5 presents the rolling RMSE calculated across the testing period. The results indicate that forecast errors fluctuate within a relatively narrow range, varying between approximately 0.16 and 0.18 throughout the evaluation horizon. Although periods of higher and lower forecasting accuracy are evident, no persistent upward or downward trend is observed. Instead, the rolling RMSE exhibits cyclical movements, suggesting that forecasting performance remains relatively stable over time while responding to changing market conditions and volatility regimes. These findings highlight the time-varying nature of stock volatility and the ongoing challenges associated with maintaining consistent forecasting accuracy across different market environments.
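A rolling RMSE of the kind plotted in Figure 5 can be computed as in the sketch below. The window length is hypothetical, since the text does not report the one used for the figure.

```python
import numpy as np

def rolling_rmse(y, f, window):
    """RMSE over each block of `window` consecutive test observations."""
    sq_err = (np.asarray(y, dtype=float) - np.asarray(f, dtype=float)) ** 2
    csum = np.concatenate([[0.0], np.cumsum(sq_err)])
    # Differences of the cumulative sum give each window's sum of squared errors.
    return np.sqrt((csum[window:] - csum[:-window]) / window)

y_test = np.zeros(6)
f_test = np.array([1.0, 1.0, 1.0, 2.0, 2.0, 2.0])
rr = rolling_rmse(y_test, f_test, window=3)
```

The output has one value per complete window, so a test sample of n observations yields n − window + 1 points.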
Figure 6 presents the residual diagnostics of the Hybrid GARCH–LSTM model. The left panel illustrates the residual dispersion profile by plotting empirical forecasting errors against the conditional volatility predictions generated by the model. The residuals are distributed around the zero-error benchmark without any discernible systematic pattern, suggesting that forecast errors are largely random and that the model does not exhibit substantial bias across different volatility levels. Moreover, the dispersion of residuals remains relatively stable across fitted values, indicating the absence of pronounced heteroskedasticity.
The right panel of Figure 6 displays the empirical distribution of residuals. The histogram and kernel density estimate show that residuals are centered close to zero and are broadly symmetric around the reference line, suggesting that the model does not systematically overestimate or underestimate stock volatility. While residual dispersion remains evident, the overall error distribution supports the adequacy of the Hybrid GARCH–LSTM framework in generating balanced volatility forecasts. Collectively, Figure 6 indicates that prediction errors are largely random, although the magnitude of residual variation highlights the inherent difficulty of forecasting stock market volatility.
Figure 7 and Figure 8 present SHAP-based feature attribution beeswarm plots for the Random Forest and XGBoost volatility forecasting models, respectively. The figures provide insights into both the relative importance of predictor variables and the direction of their influence on volatility forecasts. In both models, most SHAP values are concentrated around zero, indicating that individual predictors exert modest marginal effects on the predicted volatility. Nevertheless, notable differences emerge in the ranking and distribution of influential features.
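SHAP values are based on Shapley values from cooperative game theory. The sketch below computes exact Shapley attributions for a three-feature toy model by averaging marginal contributions over all feature orderings, using a single baseline point. The toy model, its coefficients, and the baseline are hypothetical; tree-based SHAP implementations use faster algorithms and a background sample rather than this brute-force approach.

```python
from itertools import permutations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline)."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]      # switch feature j from baseline to actual value
            new = predict(current)
            phi[j] += new - prev   # marginal contribution in this ordering
            prev = new
    return [p / factorial(n) for p in phi]

def toy_model(v):
    # Hypothetical volatility model with one interaction term.
    lag_vol, lag_cash, lag_q = v
    return 0.1 + 0.8 * lag_vol - 0.02 * lag_cash + 0.01 * lag_vol * lag_q

x_obs = [0.5, 1.2, 3.0]
x_base = [0.3, 1.0, 1.0]
phi = shapley_values(toy_model, x_obs, x_base)
gap = toy_model(x_obs) - toy_model(x_base)
```

The attributions sum to the difference between the prediction and the baseline prediction, which is the additivity property that makes per-feature SHAP values comparable within a model.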
As shown in Figure 7, the Random Forest model identifies Lag_Cash_Ratio, Lag_Liquidity_Ratio, and Lag_Board_Independence as the most influential determinants of stock volatility. Higher values of liquidity-related variables generally contribute positively to the model output, while lower values tend to reduce predicted volatility. Governance characteristics, particularly board independence, also display meaningful explanatory power, suggesting that corporate governance conditions play a role in shaping future volatility dynamics. Financial market variables such as Lagged_Return and Lagged_Volatility exhibit comparatively smaller SHAP magnitudes, indicating a more limited contribution within the Random Forest framework.
In contrast, Figure 8 demonstrates that the XGBoost model assigns the highest importance to Lag_Tobins_Q, followed by Lag_Firm_Age, Lag_Cash_Ratio, and Lag_Leverage. The wider dispersion of SHAP values observed for Tobin’s Q indicates that firm valuation metrics contribute more strongly to volatility predictions under the gradient boosting architecture. The positive SHAP values associated with high Tobin’s Q observations suggest that firms with stronger market valuations are more likely to experience elevated future volatility. Similarly, firm age, leverage, and liquidity measures exhibit nonlinear effects, reflecting the ability of XGBoost to capture complex interactions among firm characteristics.
A comparison of Figure 7 and Figure 8 reveals that while both models emphasize liquidity, governance, and firm-specific financial indicators, the XGBoost model places greater weight on market valuation and firm maturity variables, whereas the Random Forest model highlights liquidity and governance attributes. These findings suggest that the determinants of stock volatility are multifaceted and model-dependent, with different machine learning algorithms capturing distinct aspects of the underlying data-generating process. The SHAP analysis enhances the interpretability of the forecasting models by identifying the key drivers of volatility predictions and clarifying how variations in firm-level characteristics influence model outputs.
6. Discussion
The empirical findings provide important insights into the determinants of stock return volatility in MENA markets. The results indicate that liquidity ratio, cash ratio, sales growth, firm age, lagged volatility, and lagged returns are significant determinants of stock volatility under the preferred two-way fixed-effects model with Driscoll–Kraay standard errors. In contrast, leverage, tangibility, board independence, firm size, Tobin’s Q, and profitability (ROA) do not exhibit statistically significant effects after controlling for firm-specific and time-specific heterogeneity.
The results provide mixed evidence regarding the proposed hypotheses. Hypothesis H1a posited that financial structure risk, proxied by leverage and tangibility, is significantly associated with stock return volatility. However, neither leverage nor tangibility is statistically significant in the preferred specification. These findings suggest that financial structure characteristics do not exert an independent influence on stock volatility after controlling for firm-specific heterogeneity, time effects, and cross-sectional dependence. This result contrasts with prior studies that document a significant relationship between leverage and volatility (Bhandari, 1988; Christie, 1982), indicating that the influence of financial structure may be less pronounced in the MENA context.
Hypothesis H1b proposed that liquidity risk, proxied by the liquidity ratio and cash ratio, is significantly associated with stock return volatility. The empirical evidence strongly supports this hypothesis. The liquidity ratio exhibits a positive and statistically significant association with stock volatility, whereas the cash ratio displays a negative and statistically significant effect. These findings suggest that different dimensions of liquidity influence stock volatility in distinct ways. Firms with higher current asset positions may face greater uncertainty arising from growth opportunities, investment decisions, or increased investor attention, leading to higher volatility. In contrast, larger cash reserves appear to stabilize stock performance by reducing financial distress concerns and improving firms’ ability to absorb adverse shocks (Bates et al., 2009; Opler et al., 1999).
Hypothesis H1c suggested that governance risk, proxied by board independence, is significantly associated with stock return volatility. The results do not support this hypothesis, as board independence is not statistically significant in the preferred model. This finding indicates that board composition may not be a primary determinant of stock volatility within the sampled MENA firms after accounting for other firm-specific characteristics and market dynamics.
Hypothesis H1d proposed that firm performance, proxied by return on assets (ROA), is significantly associated with stock return volatility. The insignificant coefficient on ROA suggests that profitability does not explain stock volatility once firm-specific and time-specific effects are properly controlled for. This finding implies that profitability may play a limited role in explaining volatility dynamics in MENA markets relative to other financial risk dimensions.
Beyond the hypothesized variables, several additional factors emerge as important determinants of stock volatility. The positive and statistically significant coefficient on firm age suggests that older firms experience greater stock return variability. Although this result differs from the conventional expectation that mature firms are less risky, it may reflect the greater visibility, market prominence, and investor attention typically associated with established firms in the MENA region. Similarly, the significance of sales growth indicates that expanding firms may face greater uncertainty regarding future performance and investment opportunities, contributing to increased volatility.
Finally, the highly significant coefficients on lagged volatility and lagged returns confirm the importance of dynamic market effects. The strong persistence observed in lagged volatility provides further evidence of volatility clustering, a well-established characteristic of financial markets (
Engle, 1982;
Bollerslev, 1986). Likewise, the positive association between past returns and current volatility suggests that previous market performance contains valuable information for understanding future volatility dynamics.
The volatility analysis further reinforces the importance of dynamic volatility processes. Among the competing specifications, the EGARCH-t model provides the best fit according to the Akaike Information Criterion (AIC = 0.7704), outperforming both the standard EGARCH model (AIC = 1.6673) and the GARCH model (AIC = 1.7698). The persistence parameter is highest in the EGARCH-t specification (β = 0.9720), indicating that volatility shocks dissipate slowly and confirming strong persistence over time. Moreover, the asymmetry parameters are positive and substantial in both the EGARCH (γ = 0.4325) and EGARCH-t (γ = 1.0550) models, providing evidence that positive and negative shocks affect volatility differently. This aligns with the broader volatility literature documenting asymmetric behavior in financial markets (
Black, 1976;
Nelson, 1991) and supports the use of asymmetric volatility models.
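To make the persistence estimate concrete, the reported coefficient can be translated into a half-life, that is, the number of periods after which half of a shock's initial effect on log-variance remains. The following is a minimal sketch rather than the procedure used in this study. It assumes that β is the coefficient on lagged log-variance in an EGARCH(1,1) specification, so that the effect of a shock decays geometrically at rate β; the function name half_life is hypothetical.

```python
import math


def half_life(beta: float) -> float:
    """Periods until half of a shock's effect on log-variance has decayed.

    Assumes log-variance follows a first-order recursion with coefficient
    `beta` (an assumption about the EGARCH(1,1) parameterization), so the
    effect of a shock after k periods is proportional to beta ** k.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    # Solve beta ** k = 0.5 for k.
    return math.log(0.5) / math.log(beta)


# Persistence estimate reported for the EGARCH-t specification.
egarch_t_beta = 0.9720
print(f"Half-life: {half_life(egarch_t_beta):.1f} periods")
```

Under these assumptions, β = 0.9720 implies a half-life of roughly 24 periods, where the length of a period depends on the sampling frequency of the return series.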
The forecasting results provide only partial support for the predictive hypotheses. Hypothesis H2a proposed that machine learning models would outperform traditional econometric approaches, while Hypothesis H2b suggested that deep learning models would outperform conventional machine learning techniques. Although deep learning models such as GRU and LSTM achieve slightly better forecasting performance, the Diebold–Mariano tests fail to identify statistically significant differences in predictive accuracy. Therefore, neither H2a nor H2b receives strong empirical support, suggesting that the predictive gains from more sophisticated algorithms remain modest and data-dependent (
Christensen et al., 2021;
Kelly et al., 2019).
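The logic of the Diebold–Mariano comparison can be illustrated with a short sketch. This is not the implementation used in this study: it assumes squared-error loss, one-step-ahead forecasts (so that the long-run variance of the loss differential reduces to its sample variance), and the asymptotic normal approximation without a small-sample correction. The function name diebold_mariano and the example error series are hypothetical.

```python
import math
from statistics import mean


def diebold_mariano(errors_a, errors_b):
    """Two-sided Diebold-Mariano test under squared-error loss.

    Assumes one-step-ahead forecasts, so no autocovariance terms are
    added to the variance of the loss differential.
    Returns (statistic, p_value). A negative statistic means model A
    has the lower average loss.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error series must have equal length")
    # Loss differential: squared error of A minus squared error of B.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = mean(d)
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    if var_d == 0:
        raise ValueError("loss differential has zero variance")
    stat = d_bar / math.sqrt(var_d / n)
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Hypothetical forecast errors, for illustration only.
errors_model_a = [0.12, -0.08, 0.15, -0.10, 0.09, -0.11]
errors_model_b = [0.10, -0.09, 0.14, -0.12, 0.08, -0.10]
stat, p = diebold_mariano(errors_model_a, errors_model_b)
print(f"DM statistic = {stat:.3f}, p-value = {p:.3f}")
```

A p-value above the chosen significance level means that the null hypothesis of equal predictive accuracy cannot be rejected, even if one model has a lower average loss in the sample.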
Similarly, the evidence does not provide strong support for Hypothesis H3a, which predicted that hybrid GARCH–deep learning models would outperform standalone approaches. While hybrid models produce competitive forecasts, they do not generate statistically significant improvements in predictive accuracy. However, some support is found for Hypothesis H3b, as hybrid frameworks are able to combine volatility dynamics with nonlinear modeling, although their performance remains limited during periods of extreme market volatility. This is consistent with prior research indicating that hybrid models may perform better under normal conditions but struggle during volatility spikes (
Baruník & Křehlík, 2018).
Taken together, the insignificance of leverage and profitability, combined with the importance of liquidity and dynamic factors, provides important insight into risk pricing in MENA equity markets. The results suggest that investors place greater emphasis on time-varying market conditions and liquidity dynamics than on relatively stable accounting-based indicators. This may be because variables such as leverage and profitability exhibit limited within-firm variation and are largely absorbed by fixed effects, whereas liquidity conditions and past volatility provide more timely signals about changing risk. Consequently, stock volatility appears to be driven by a combination of firm-specific characteristics and evolving market expectations, particularly in emerging markets characterized by institutional changes and heightened exposure to external shocks.
Given the inclusion of lagged explanatory variables, the empirical specification was supplemented with multiple robustness checks, including alternative model specifications, fixed-effects estimations, and robust standard error corrections. The consistency of the findings across these specifications provides additional confidence in the reported results.
7. Conclusions
This study examined whether stock return volatility in MENA equity markets can be explained and predicted using firm-level financial characteristics, volatility models, and advanced forecasting techniques. Using an unbalanced panel dataset covering 1596 firms and 19,752 firm-year observations during the period 2010–2024, the analysis combined panel-data econometric models, GARCH-family volatility models, machine learning algorithms, deep learning architectures, and hybrid forecasting frameworks. The preferred econometric specification was a two-way fixed-effects model with Driscoll–Kraay standard errors, selected on the basis of extensive diagnostic testing.
The results indicate that stock volatility is driven primarily by liquidity conditions and dynamic market effects rather than by traditional measures of capital structure and profitability. Specifically, liquidity ratio, cash ratio, sales growth, firm age, lagged volatility, and lagged returns emerge as significant determinants, whereas leverage, tangibility, board independence, firm size, Tobin’s Q, and return on assets do not remain statistically significant after controlling for firm-specific heterogeneity, time effects, and dependence structures. These findings support Hypothesis H1b but not Hypotheses H1a, H1c, or H1d, and suggest that liquidity management and volatility persistence play a more central role in explaining market risk in MENA equity markets.
The volatility analysis further confirms the importance of dynamic processes. The EGARCH-t model provides the best fit according to the Akaike Information Criterion, indicating that stock return volatility is characterized by both persistence and asymmetry. The high persistence parameters imply that volatility shocks dissipate slowly, while positive asymmetry coefficients indicate that shocks affect volatility differently depending on their sign. These results support the use of asymmetric volatility models over symmetric GARCH specifications.
The forecasting results offer only limited support for the predictive hypotheses. Although deep learning models such as GRU and LSTM show slightly improved performance, the Diebold–Mariano tests do not reveal statistically significant differences across models. As a result, Hypotheses H2a and H2b receive weak empirical support, and Hypotheses H3a and H3b receive mixed support. Hybrid GARCH–deep learning models do not produce statistically significant improvements over standalone approaches (failing to support H3a), although they demonstrate some ability to integrate volatility dynamics with nonlinear relationships, with reduced effectiveness during periods of extreme volatility (partially supporting H3b).
These findings contribute to the literature by challenging the generalizability of leverage and profitability as key drivers of stock volatility, while emphasizing the role of liquidity conditions and persistence effects. In addition, the results suggest that increasing model complexity does not necessarily translate into meaningful gains in predictive accuracy, highlighting the practical limitations of advanced forecasting techniques in financial markets.
Several limitations should be acknowledged. The analysis focuses on firm-level determinants and does not explicitly incorporate macroeconomic, political, or behavioral factors that may influence volatility. Moreover, all models exhibit limited ability to predict extreme volatility episodes and sudden market disruptions. As with many dynamic panel specifications, including a lagged dependent variable in a fixed-effects model may bias the coefficient estimates (the Nickell bias), particularly when the time dimension is short. Nevertheless, the consistency of the findings across alternative specifications suggests that the main conclusions are not driven by a particular model formulation. Future research could extend the analysis by incorporating macroeconomic indicators, sentiment measures, alternative data sources, and regime-switching frameworks capable of capturing structural breaks. Cross-country comparisons may also provide additional insights into institutional differences in volatility dynamics.
In conclusion, the findings indicate that stock volatility in MENA markets is shaped more by liquidity conditions and persistent volatility dynamics than by traditional financial indicators. At the same time, the modest forecasting gains achieved by advanced models underscore the continuing challenges associated with predicting financial market volatility, supporting a balanced approach that integrates firm fundamentals with robust econometric modeling.