Article

Simulation Framework to Determine Suitable Innovations for Volatility Persistence Estimation: The GARCH Approach

by Richard T. A. Samuel 1,*,†, Charles Chimedza 1,† and Caston Sigauke 2,†
1 School of Statistics and Actuarial Science, University of the Witwatersrand, Private Bag 3, Johannesburg 2050, South Africa
2 Department of Mathematical and Computational Sciences, University of Venda, Private Bag X5050, Thohoyandou 0950, South Africa
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
J. Risk Financial Manag. 2023, 16(9), 392; https://doi.org/10.3390/jrfm16090392
Submission received: 2 June 2023 / Revised: 14 August 2023 / Accepted: 29 August 2023 / Published: 1 September 2023
(This article belongs to the Section Risk)

Abstract:
This study rolls out a robust framework relevant for simulation studies through the Generalised Autoregressive Conditional Heteroscedasticity (GARCH) model using the rugarch package. The package is thoroughly investigated, and novel findings are identified for improved and effective simulations. The focus of the study is to provide the necessary simulation steps to determine appropriate distributions of innovations relevant for estimating the persistence of volatility. The simulation steps involve “background (optional), defining the aim, research questions, method of implementation, and summarised conclusion”. The method of implementation is a workflow that includes writing the code, setting the seed, setting the true parameters a priori, the data generation process, and performance assessment through meta-statistics. These novel, easy-to-understand steps are demonstrated on financial returns using an illustrative Monte Carlo simulation with empirical verification. Among the findings, the study shows that regardless of the arrangement of the seed values, the efficiency and consistency of an estimator generally remain the same as the sample size increases. The study also derives a new and flexible true-parameter-recovery measure which researchers can use to determine the level of recovery of the true parameter by the MCS estimator. It is anticipated that the outcomes of this study will be broadly applicable in finance, with intuitive appeal in other areas, for volatility modelling.

1. Introduction

A simulation-based experiment is not often included in research because many upcoming researchers do not have an adequate understanding of the nitty-gritty involved. Although the details involved in simulation modelling are generally inexhaustible, this study unveils a crucial framework relevant for the simulation of financial time series data using the Generalised Autoregressive Conditional Heteroscedasticity (GARCH) model for volatility persistence estimation. Volatility persistence describes the effect of a shock on the future expectation of the variance process (see Ding and Granger 1996). The ultimate goal of the study is to familiarise researchers with the concepts of simulation modelling through this model. The framework utilises the robust simulating resources of the GARCH model, through set parameters, to generate data that are analysed, and the estimates from the process are then used with chosen metrics to explain the behaviour of selected statistics of interest.
Monte Carlo simulation (MCS) studies are computer-based experiments that use known probability distributions to create data by pseudo-random sampling. The data may be simulated through a parametric model or via repeated resampling (Morris et al. 2019). MCS applies the concept of imitating a real-life scenario on the computer through a certain model that can hypothetically generate the scenario. By repeating this process a considerably large number of times, it is possible to obtain outcomes that enable precise computation of the desired quantities of interest, such as the possible assumed error distribution(s) that can suitably describe a given stock market. In preparing for a simulation experiment, reasonably ample time is needed to organise well-written and readable computer code and to generate the simulated data. Implementing a good simulation experiment and reporting its outcomes require adequate planning.
A series of R application software packages, such as rugarch (Ghalanos 2022), GAS (Ardia et al. 2019), GRETL (Baiocchi and Distaso 2003), fGARCH (Wuertz et al. 2020), SimDesign (Chalmers and Adkins 2020) and tidyverse (Wickham et al. 2019), to mention but a few, are currently available for simulation studies. This study exemplifies how the GARCH model, through the rugarch package, can be effectively used to improve volatility modelling through an MCS experiment, with outcomes verified empirically. The simulation steps are designed to be reasonably general, with the expectation that any other related packages should be able to replicate the routine. Although there are good books on simulation approaches in general (see Bratley et al. 2011; Kleijnen 2015), up until now, to the best of our knowledge, there has not been any monograph with a direct, step-by-step, comprehensive layout of a simulation framework using the GARCH model. Hence, this study rolls out an inclusive simulation design required for robust simulation practice in finance to determine appropriate assumed innovations, relevant for estimating the persistence of volatility using this model, with the knowledge applicable in other fields. Here, the MCS approach is used to obtain the most suitable assumed innovation, through which the volatility persistence is empirically estimated.
Since the rugarch package does not make provision for calculating the coverage probability, this study also computes the MCS estimator’s recovery levels through the “true parameter recovery (TPR)” measure as a proxy for the coverage. The results show that the MCS estimates considerably recover the true parameters. The raw data used for this study are the daily closing S&P South African sovereign bond index, abbreviated to the S&P SA bond index. They are Standard & Poor data for the bond market in US dollars from Datastream (2021) for the period 4 January 2000 to 17 June 2021, with 5598 observations. The rest of the paper is organised as follows: Section 2 reviews the theories underpinning two heteroscedastic models, the TPR measure, and the description of the design of the simulation framework. Section 3 presents the practical illustration of the simulation framework, with empirical verification, on financial bond return data. Section 4 discusses the key findings, and Section 5 concludes.

2. Materials and Methods

2.1. The GARCH Model

The GARCH model was developed by Bollerslev (1986) as a generalisation of the Autoregressive Conditional Heteroscedasticity (ARCH) model introduced by Engle (1982). It is a classical model that is normally defined by its conditional mean and variance equations for modelling financial returns volatility (Kim et al. 2020). The mean equation is stated as
$$ r_t = \mu_t + \varepsilon_t, \quad (1) $$
where $r_t$ is the return series; $\varepsilon_t = z_t \sigma_t$ denotes the residual part of the return series that is random and unpredictable, where $z_t \sim N(0,1)$ are the standardised residuals, which are independent and identically distributed (i.i.d.) random variables with mean 0 and variance 1 (McNeil and Frey 2000; Smith 2003); and $\mu_t$ is the mean function, usually stated as an Autoregressive Moving Average (ARMA) process,
$$ \mu_t = \sum_{i=1}^{p} \phi_i r_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}, \quad (2) $$
where $\phi_i$ ($i = 1, \ldots, p$) and $\theta_i$ ($i = 1, \ldots, q$) are unknown parameters. The variance equation of the GARCH($u,v$) model is defined as
$$ \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \cdots + \alpha_v \varepsilon_{t-v}^2 + \beta_1 \sigma_{t-1}^2 + \cdots + \beta_u \sigma_{t-u}^2, \quad (3) $$
where $\omega > 0$ is the intercept (white noise), and the coefficients $\alpha_j \ge 0$ ($j = 1, \ldots, v$) and $\beta_i \ge 0$ ($i = 1, \ldots, u$), respectively, measure the short-term and long-term effects of $\varepsilon_t$ on the conditional variance (Maciel and Ballini 2017). The non-negativity restrictions on the unknown parameters $\alpha_j$ and $\beta_i$ are imposed to ensure $\sigma_t^2 > 0$. The equation shows that the conditional variance $\sigma_t^2$ is a linear function of the past squared innovations $\varepsilon_{t-j}^2$ and the past conditional variances $\sigma_{t-i}^2$. The GARCH model is a more parsimonious specification (Nelson 1991; Samiev 2012) since it is equivalent to a certain ARCH($\infty$) model (Zivot 2009). When $u = 0$ in Equation (3), the GARCH model reduces to the ARCH model with conditional variance stated as
$$ \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \cdots + \alpha_v \varepsilon_{t-v}^2. \quad (4) $$
GARCH(1,1) is the simplest model specification, with $u = 1$ and $v = 1$ in Equation (3), and it is conceivably the best candidate GARCH model for several applications (Fan et al. 2014; Zivot 2009). The volatility persistence of the GARCH(1,1) model is defined as $\alpha_1 + \beta_1$ (see Engle and Bollerslev 1986; Ghalanos 2018). Volatility persistence is used to evaluate the speed of decay of shocks to volatility (Kim et al. 2020). Volatility exhibits long persistence into the future if $\alpha + \beta \to 1$; hence, the closer the sum of the coefficients is to one (zero), the greater (lesser) the persistence. However, if the sum is equal to one, then shocks to volatility persist forever and the unconditional variance is not determined by the model. This process is called integrated GARCH (Chou 1988; Engle and Bollerslev 1986). If the sum is greater than one, the conditional variance process is explosive, suggesting that shocks to the conditional variance are highly persistent. Covariance stationarity of the GARCH model is ensured when $\sum_{j=1}^{v} \alpha_j + \sum_{i=1}^{u} \beta_i < 1$, while the unconditional variance of $\varepsilon_t$ is $\sigma^2 = E[\varepsilon_t^2] = \omega / \{1 - (\sum_{j=1}^{v} \alpha_j + \sum_{i=1}^{u} \beta_i)\}$ (Zivot 2009).
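As a quick numerical illustration, the sketch below computes the GARCH(1,1) persistence and the implied unconditional variance; the parameter values are hypothetical, chosen only for demonstration:

```r
# Illustrative check of GARCH(1,1) persistence and unconditional variance;
# hypothetical parameter values, not estimates from this study's data
omega <- 0.05; alpha1 <- 0.09; beta1 <- 0.90
persistence <- alpha1 + beta1            # 0.99: close to one, so highly persistent
uncond_var <- omega / (1 - persistence)  # sigma^2 = omega / {1 - (alpha1 + beta1)} = 5
```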
For maximum likelihood estimation (MLE), the likelihood function of the unknown parameters given the observations is stated as
$$ L(\vartheta \mid \varepsilon) = \prod_{t=1}^{N} \frac{1}{\sqrt{2\pi\sigma_t^2}} \exp\left( -\frac{\varepsilon_t^2}{2\sigma_t^2} \right), \quad (5) $$
where $\vartheta = (\mu, \omega, \alpha_1, \ldots, \alpha_v, \beta_1, \ldots, \beta_u)$ is a vector of parameters, and $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_N)$ is a realisation of length $N$. The quasi-maximum likelihood estimation (QMLE) based on the Normal distribution and the MLE have the same set of instructions for estimating $\hat{\vartheta}$; the only difference, however, is in the estimation of a robust standard deviation of $\hat{\vartheta}$ (see Bollerslev and Wooldridge 1992; Feng and Shi 2017; Francq and Zakoïan 2004).
The maximised log-likelihood function with Student’s t distribution (Duda and Schmidt 2009) is stated as
$$ \ln L(\vartheta \mid \varepsilon) = \sum_{t=1}^{N} \left[ \ln\left( \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{(\nu-2)\pi}} \right) - \frac{1}{2}\ln(\sigma_t^2) - \frac{\nu+1}{2}\ln\left( 1 + \frac{\varepsilon_t^2}{\sigma_t^2(\nu-2)} \right) \right], \quad (6) $$
where $\Gamma(\cdot)$ and $\nu$ are the gamma function and the degrees of freedom, respectively.
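For concreteness, Equation (6) can be transcribed directly into R as a sketch; `eps` (residuals), `sigma2` (conditional variances) and `nu` (degrees of freedom) are hypothetical inputs:

```r
# Direct transcription of the Student's t log-likelihood in Equation (6);
# eps, sigma2 and nu are hypothetical inputs for illustration
t_loglik <- function(eps, sigma2, nu) {
  sum(lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log((nu - 2) * pi) -
        0.5 * log(sigma2) -
        ((nu + 1) / 2) * log(1 + eps^2 / (sigma2 * (nu - 2))))
}
```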

2.2. The fGARCH Model

The family GARCH (fGARCH) model, developed by Hentschel (1995), is an inclusive model that nests some important symmetric and asymmetric GARCH models as sub-models. The nesting includes the simple GARCH (sGARCH) model (Bollerslev 1986), the Absolute Value GARCH (AVGARCH) model (Schwert 1990; Taylor 1986), the GJR GARCH (GJRGARCH) model (Glosten et al. 1993), the Threshold GARCH (TGARCH) model (Zakoian 1994), the Nonlinear ARCH (NGARCH) model (Higgins and Bera 1992), the Nonlinear Asymmetric GARCH (NAGARCH) model (Engle and Ng 1993), the Exponential GARCH (EGARCH) model (Nelson 1991), and the Asymmetric Power ARCH (apARCH) model (Ding et al. 1993). The sub-model apARCH is also a family model (but less general than the fGARCH model) that nests the sGARCH, AVGARCH, GJRGARCH, TGARCH and NGARCH models, and the Log ARCH model (Geweke 1986; Pantula 1986). The fGARCH($u,v$) model is stated as
$$ \sigma_t^{\gamma} = \omega + \sum_{j=1}^{v} \alpha_j \sigma_{t-j}^{\gamma} \left( |z_{t-j} - \lambda_{2j}| - \lambda_{1j}\{z_{t-j} - \lambda_{2j}\} \right)^{\delta} + \sum_{j=1}^{u} \beta_j \sigma_{t-j}^{\gamma}. \quad (7) $$
This robust fGARCH model allows different powers for $\sigma_t$ and $z_t$ to drive how the residuals are decomposed in the conditional variance equation. Equation (7) is the conditional standard deviation’s Box–Cox transformation, where the transformation of the absolute value function is carried out by the parameter $\delta$, and $\gamma$ determines the shape. The $\lambda_{2j}$ and $\lambda_{1j}$ control the shifts for asymmetric small shocks and the rotations for large shocks, respectively. The fit of the full fGARCH model can be implemented with $\gamma = \delta$ (see Ghalanos 2018). Volatility clustering in the returns can be quantified through the model’s volatility persistence, stated as
$$ \hat{P} = \sum_{j=1}^{u} \beta_j + \sum_{j=1}^{v} \alpha_j \varrho_j, \quad (8) $$
where $\varrho_j$, expressed in Equation (9), is the expected value of the Box–Cox transformation of the absolute-value asymmetry term in $z_{t-j}$. Volatility clustering implies that large changes in returns tend to be followed by large changes and small changes tend to be followed by small changes. The persistence is obtained in this study through the “persistence()” function in the R rugarch package. See (Ghalanos 2022; Hentschel 1995) for details on fGARCH and the nested models.
$$ \varrho_j = E\left[ \left( |z_{t-j} - \lambda_{2j}| - \lambda_{1j}(z_{t-j} - \lambda_{2j}) \right)^{\delta} \right] = \int_{-\infty}^{\infty} \left( |z - \lambda_{2j}| - \lambda_{1j}(z - \lambda_{2j}) \right)^{\delta} f(z, 0, 1, \ldots)\, dz \quad (9) $$
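A minimal sketch of obtaining Equation (8) through the persistence() function is shown below; `returns` is a placeholder for the user’s return series, and "ALLGARCH" is rugarch’s full fGARCH submodel:

```r
library(rugarch)
# Full family GARCH(1,1) specification with Student's t innovations
spec <- ugarchspec(variance.model = list(model = "fGARCH", garchOrder = c(1, 1),
                                         submodel = "ALLGARCH"),
                   mean.model = list(armaOrder = c(1, 1)),
                   distribution.model = "std")
fit <- ugarchfit(spec, data = returns)  # `returns`: the user's return series
persistence(fit)                        # P-hat of Equation (8)
```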

2.3. The True Parameter Recovery Measure

Since the focus of MCS studies involves the ability of the estimator to recover the true parameter (see Chalmers 2019), this study applies the “true parameter recovery (TPR)” measure in Equation (10) to compute the level (degree) of recovery of the true parameter through the MCS estimator. The TPR measure is a means of evaluating the performance of the MCS estimates in recovering the true parameter. That is, it is used to determine how much of the true parameter value is recovered by the MCS estimator.
$$ \mathrm{TPR} = \left[ K - \frac{(\vartheta - \hat{\vartheta})}{\vartheta} \times K \right]\%, \quad (10) $$
where $K = 0, 1, 2, \ldots, 100$ is the nominal recovery level, $\vartheta$ is the true data-generating parameter, and $\hat{\vartheta}$ is the estimator from the simulated (synthetic) data. For instance, a TPR estimate of 95% or 100% denotes that the MCS estimator recovers the complete 95% or 100% of the true parameter. This complete recovery of the true parameter $\vartheta$ is achieved by the MCS estimator $\hat{\vartheta}$ when $\hat{\vartheta} = \vartheta$, where $\vartheta > 0$, such that the TPR outcome equals the given nominal recovery level $K$ (i.e., TPR = $K\%$).
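A minimal R sketch of the TPR measure in Equation (10) follows; the function name `tpr` is our own illustrative choice:

```r
# True parameter recovery (TPR), Equation (10); K is the nominal recovery level in %
tpr <- function(theta, theta_hat, K = 95) {
  K - ((theta - theta_hat) / theta) * K
}
tpr(theta = 0.999, theta_hat = 0.999)  # complete recovery: returns 95
tpr(theta = 0.999, theta_hat = 0.950)  # partial recovery: about 90.3
```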

2.4. Simulation Design

The design of the simulation framework includes “background (optional), defining the aim, research questions, method of implementation, and summarised conclusion”. The method of implementation is a workflow that involves writing the code, setting the seed, setting the true parameter/s a priori, data generation process, and performance evaluation through meta-statistics. As summarised by the flowchart in Figure 1, these crucial steps are relevant for successful simulations through the GARCH model. The details of each design step are as follows:

2.4.1. Aim of the Simulation Study

After optionally stating the background that explains crucial underlying facts about the study, the next step is to define the aim of the study, which must be clearly, concisely and unambiguously stated for the reader’s understanding. The focus of MCS studies generally dwells on the estimators’ capabilities in recovering the true parameters $\vartheta$, such that $E(\hat{\vartheta}) = \vartheta$ for unbiasedness, $\hat{\vartheta} \to \vartheta$ as the sample size $N \to \infty$ for consistency, and the root mean square error (RMSE) or standard error (SE) tending to zero as $N \to \infty$ for good efficiency or precision of the true parameter’s estimator. Hence, the aim of the study may revolve around those properties, such as the bias or unbiasedness, consistency, and efficiency or precision of the estimator. The aim can also evolve from comparisons of multiple entities, such as comparing the efficiency of various error distributions or the performance of multiple models, or from improving an existing method.

2.4.2. State the Research Questions

After defining the study’s aim, relevant questions concerning the purpose of the simulation should be outlined. These serve as pointers to the objectives of the study. The intricacies of some statistical research questions make them better resolved via simulation approaches. Simulation provides a robust procedure for responding to a wide range of theoretical and methodological questions and can offer a flexible structure for answering specific questions pertinent to researchers’ needs (Hallgren 2013).

2.4.3. Method of Implementation

The simulation and empirical modelling of this study are implemented in R Statistical Software, version 4.0.3, with RStudio version 2022.12.0+353, using the rugarch (Ghalanos 2018, 2022), SimDesign (Chalmers and Adkins 2020), tidyverse (Wickham et al. 2019), zoo (Zeileis and Grothendieck 2005), aTSA (Qiu 2015) and forecast (Hyndman and Khandakar 2008) packages. Computation is executed on an Intel(R) Core(TM) i5-8265U CPU @ 1.60 GHz (1.80 GHz). The method of implementing the simulation is as follows:
  • Write the code: Carrying out a proper simulation experiment that mirrors real-life situations can be very demanding and computationally intensive; hence, readable computer code with the right syntax must be produced. The code in this study is written to fit the true model to the real data to obtain the true parameter representations for the MCS. These true parameter values and other outputs from the fit are used in the code to generate simulated datasets that are analysed to obtain the MCS estimators. The standard errors of the estimates are also obtained in the process.
  • Set the seed: Simulation code will generate a different sequence of random numbers each time it is run unless a seed is set (Danielsson 2011). A set seed initialises the random number generator (Ghalanos 2022) and ensures reproducibility, where the same result is obtained for different runs of the simulation process (Foote 2018). The seed needs to be set only once, for each simulation, at the start of the simulation session (Ghalanos 2022; Morris et al. 2019), and it is better to use the same seed values throughout the process (Morris et al. 2019).
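The reproducibility point can be seen in a two-line sketch: re-setting the same seed reproduces the same pseudo-random draws exactly.

```r
set.seed(12345); x1 <- rnorm(5)  # first run
set.seed(12345); x2 <- rnorm(5)  # second run, same seed
identical(x1, x2)                # TRUE: the draws are reproduced exactly
```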
Now, through the GARCH model, this study carries out an MCS experiment to ascertain whether the seed values’ pattern or arrangement affects the estimators’ efficiency and consistency properties. Two sets of seeds are used for the experiment, where each set contains three different patterns of seed values. The first set is $S_1$ = {12345, 54321, 15243}, while the second set is $S_2$ = {34567, 76543, 36547}. In each set, the study uses seed values arranged in ascending order, then reverses the order, and finally mixes up the ordered arrangement. The simulation starts by using GARCH(1,1)-Student’s t, with degrees of freedom $\nu = 3$, as the true model under four assumed error distributions: the Normal, Student’s t, Generalised Error Distribution (GED) and Generalised Hyperbolic (GHYP) distribution. Details on these selected error distributions can be seen in Ghalanos (2018) and Barndorff-Nielsen et al. (2013). The true parameter values used are $(\mu, \omega, \alpha, \beta)$ = (0.0678, 0.0867, 0.0931, 0.9059), and they are obtained by fitting GARCH(1,1)-Student’s t to the SA bond return data.
Using each of the seed patterns in turn, simulated datasets of sample size N = 12,000, replicated 1000 times, are generated through the parameter values. However, because of the effect of initial values in the data generating process, which may lead to size distortion (Su 2011), the first N = {11,000, 10,000, 9000, 8000} sets of observations are each discarded at each stage of the generated 12,000 observations to circumvent such distortion. That is, only the last N = {1000, 2000, 3000, 4000} are used under each of the four assumed error distributions, as shown in Table A1, Appendix A. These trimming steps are carried out following the simulation structure of Feng and Shi (2017). An observation-driven process such as the GARCH can be size-distorted with regard to its kurtosis, where strong size distortion may be a result of high kurtosis (Silvennoinen and Teräsvirta 2016). The extracts of the RMSE and SE outcomes for the GARCH volatility persistence estimator $\hat{\alpha} + \hat{\beta}$ are shown in Table A1. For $S_1$ in Panel A of the table, as N tends to its peak, the performance of the RMSE from the lowest to the highest under the four error distribution assumptions is Student’s t, GHYP, GED and Normal in that order, while that of the SE from the lowest to the highest is GHYP, Student’s t, GED and Normal in that order, for the three arrangements of seed values.
For $S_2$ in Panel B of the table, as N reaches its peak at 4000, the performance of the RMSE from the lowest to the highest is Student’s t, GHYP, GED and Normal in that order, while that of the SE from the lowest to the highest is GHYP, GED, Student’s t and Normal in that order, for the three $S_2$ patterns of seed values. Hence, efficiency and precision in terms of RMSE and SE are the same as the sample size N becomes larger under the three seeds, regardless of the arrangement of the seed values under $S_2$, as also observed under $S_1$. In addition, the trends of the estimator’s $\sqrt{N}$ consistency under the seed values in $S_1$ are roughly the same; this is also applicable to those of the seed values in $S_2$. The plotted outcomes can be visualised as displayed by the trend lines within the 95% confidence intervals in Figure 2 for the three seed values of sets $S_1$ in Panel A and $S_2$ in Panel B, where the efficiency and consistency outcomes are roughly the same with increase in N.
To summarise, this study observes that, as $N \to \infty$, the pattern or arrangement of the seed values does not affect the estimator’s overall consistency and efficiency properties, although this may depend on the quality of the model used. The seed is primarily used to ensure reproducibility. Panels C and D of the figure further reveal that the RMSE/SE → 0 as $N \to \infty$ for the four error distributions in $S_1$ and $S_2$.
Table A1 further shows that the MCS estimator $\hat{\alpha} + \hat{\beta}$ considerably recovers the true parameter $\alpha + \beta$ at the 95% nominal recovery level, where some of the estimates even recover the complete true value (0.9990) with TPR outcomes of 95%. These recovery outcomes can be seen in the visual plots of Figure 3 (or as shown in Panels A and B of Figure A1, Appendix B), where Panels A(i) and B(i) reveal that the MCS estimates perform quite well in recovering the true parameter, as shown by the closeness of the TPR outcomes to the 95% (i.e., 0.95) nominal recovery level for $S_1$ and $S_2$, respectively. The bunched-up TPR outcomes in Panels A(i) and B(i) are clearly spread out as shown in Panels A(ii) and B(ii) for $S_1$ and $S_2$, respectively. From these recovery outputs, two distinct features can be observed. First, the TPR results do not depend on the sample size, as shown in Panels A and B of Figure 4 for $S_1$ and $S_2$, which is a feature of coverage probability (see Hilary 2002); second, the closer (farther) the MCS estimate is to zero, the smaller (larger) the TPR outcome, as revealed in Panels C and D of the figure.
  • After setting the seed, the true parameter representations of the true sampling distribution (or true model) are then set a priori (Koopman et al. 2017; Mooney 1997).
  • Next, simulated observations are generated using the true sampling distribution or the true model given some sets of (or different sets of) fixed parameters. Generation of simulated datasets through the GARCH model is carried out using the R package “rugarch”. Random data generation involving this package can be implemented using either of two approaches. The first approach is to carry out the data-generating simulation directly on a fitted object “fit” using the ugarchsim function for the simulated random data. The second approach uses the ugarchpath function, which enables simulation of a desired number of volatility paths through different parameter combinations (see Ghalanos 2018, 2022; Pfaff 2016 for relevant details on the two functions and their usage), as sketched below.
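A minimal sketch of the two routes is given below, assuming `fit` and `spec` come from earlier ugarchfit/ugarchspec calls; the sample sizes and seed are illustrative:

```r
# Route 1: simulate directly from a fitted object
sim <- ugarchsim(fit, n.sim = 12000, m.sim = 1000, rseed = 12345)
paths <- fitted(sim)       # simulated return series, one column per replication

# Route 2: simulate from a specification with chosen parameter combinations
spec_fixed <- spec                          # `spec` from an earlier ugarchspec call
setfixed(spec_fixed) <- as.list(coef(fit))  # fix the parameters to simulate from
path <- ugarchpath(spec_fixed, n.sim = 12000, m.sim = 1000, rseed = 12345)
```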
The simulation or data-generating process can be run once or replicated multiple times. This study carries out another MCS investigation through the GARCH model to determine the effect (on the outcomes) of running a given GARCH simulation once versus replicating it multiple times. That is, for a given sample size and seed value, the outcome of running the simulation once is compared to that of running it with different numbers of replications, such as 2500, 1000 and 300. This MCS experiment uses GARCH(1,1)-Student’s t, with $\nu = 3$, as the true model under four assumed error distributions: the Normal, Student’s t, GED and GHYP. However, it should be understood that any non-Normal error distribution (apart from the Student’s t that is used here) can also be used with the GARCH(1,1) model as the true model. The GARCH(1,1)-Student’s t fitted to the SA bond return data yields the true parameter values $(\mu, \omega, \alpha, \beta)$ = (0.0678, 0.0867, 0.0931, 0.9059).
Using these parameter values, datasets of sample size N = 12,000 are generated in each of the four distinct simulations (i.e., simulations with 1, 2500, 1000 and 300 replicates). After the necessary trimmings in each simulation, to evade the effect of initial values, the last N = {1000, 2000, 3000} sets of observations are used at each stage of the generated 12,000 observations under the four assumed innovation distributions. That is, datasets of the last three sample sizes, each simulated once and then replicated L = {2500, 1000, 300} times, are consecutively generated. From the modelling outputs, it is observed that the log-likelihood (llk), RMSE, SE and bias outcomes of the $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\alpha} + \hat{\beta}$ estimators for each simulation under the four assumed errors are the same for the three sample-size datasets with the same seed value, regardless of whether the simulation is run once or replicated multiple times. For brevity, this study only displays the outcomes of the experiment under the assumed GED error for each run in Table 1. However, increasing the number of replications may reduce sampling uncertainty in meta-statistics (Chalmers and Adkins 2020).
  • The generated (simulated) data are analysed, and the estimates from them are evaluated using classic methods through meta-statistics to derive relevant information about the estimators. Meta-statistics (see Chalmers and Adkins 2020) are performance measures or metrics for assessing the modelling outputs by judging the closeness between an estimate and the true parameter. A few of the frequently used meta-statistical summaries, as described below, include bias, root mean square error (RMSE) and standard error (SE). For more meta-statistics, see Chalmers and Adkins (2020); Morris et al. (2019); Sigal and Chalmers (2016).

Bias

The bias, on average, measures the tendency of the simulated estimators $\hat{\vartheta}$ to be smaller or larger than their true parameter value $\vartheta$. It is defined as the average difference between the true (population) parameter and its estimate (Feng and Shi 2017). The optimal value of bias is 0 (Harwell 2018; Sigal and Chalmers 2016). An unbiased estimator, on average, yields the correct value of the true parameter. Bias with a positive (negative) value indicates that the true parameter value is over-estimated (under-estimated). However, in absolute value, the closer the estimator is to 0, the better it is. Bias is stated mathematically as $E(\tilde{\vartheta} - \vartheta)$, but can be presented in MCS (see Chalmers 2019) as $\mathrm{bias} = \frac{1}{L}\sum_{i=1}^{L}(\hat{\vartheta}_i - \vartheta)$. The two formulae are connected as
$$ E(\tilde{\vartheta} - \vartheta) = E(\tilde{\vartheta}) - \vartheta = \frac{1}{L}\sum_{i=1}^{L} \hat{\vartheta}_i - \vartheta = \frac{1}{L}\sum_{i=1}^{L} (\hat{\vartheta}_i - \vartheta), \quad (11) $$
where $\tilde{\vartheta} = \hat{\vartheta}_i$ ($i = 1, \ldots, L$) denotes the $i$th sample estimate $\hat{\vartheta}$ over the finite number of datasets, $L$ is the number of replications, and $\vartheta$ is the true parameter.
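For illustration, the bias of a handful of hypothetical replicate estimates can be computed manually or with the SimDesign helper:

```r
library(SimDesign)
theta <- 0.999                       # hypothetical true parameter
theta_hat <- c(0.990, 1.001, 0.998)  # hypothetical replicate estimates
mean(theta_hat - theta)              # manual bias, Equation (11)
bias(theta_hat, theta)               # SimDesign's built-in equivalent
```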

Standard Error

Sampling variability in the estimation can be evaluated via the standard error (SE) as stated (see Chalmers 2019; Yuan et al. 2015) in Equation (12). Also called the Monte Carlo standard deviation, it is a measure of the efficiency or precision of the true parameter’s estimator, which is used to estimate the long-run standard deviation of $\hat{\vartheta}_i$ for finite repetitions. It does not require knowing the true parameter $\vartheta$ but depends on its estimator $\hat{\vartheta}_i$ only. The smaller the sampling variability, the greater the efficiency or precision of the estimator of $\vartheta$ (see Morris et al. 2019). Sampling variability decreases with increased sample size (Sigal and Chalmers 2016).
$$ \mathrm{SE} = \sqrt{\frac{1}{L}\sum_{i=1}^{L} (\hat{\vartheta}_i - \bar{\vartheta})^2}, \quad \text{where } \bar{\vartheta} = \frac{1}{L}\sum_{i=1}^{L} \hat{\vartheta}_i. \quad (12) $$

RMSE

The root mean square error (RMSE) is an accuracy measure for evaluating the difference between a model’s true value and its prediction. The RMSE indicates the sampling error of an estimator when compared to the true parameter value (Sigal and Chalmers 2016), and it is stated as
$$ \mathrm{RMSE} = \sqrt{\frac{1}{L}\sum_{i=1}^{L} (\hat{\vartheta}_i - \vartheta)^2}. \quad (13) $$
Its computation involves the true parameter $\vartheta$. An estimator with a smaller RMSE is more efficient in recovering the true parameter value (Sigal and Chalmers 2016; Yuan et al. 2015), and the minimum RMSE produces maximum precision (Wang et al. 2018). Consistency of the estimator occurs when the RMSE decreases such that $\hat{\vartheta} \to \vartheta$ as the sample size $N \to \infty$ (Ghalanos 2018; Morris et al. 2019). The RMSE is related to the bias and sampling variability as
$$ \mathrm{RMSE} = \sqrt{\mathrm{bias}^2 + \mathrm{SE}^2}. \quad (14) $$
That is, the RMSE is an inclusive measure that combines bias and SE, such that a low SE can be penalised for bias. The mean squared error (MSE) is obtained by squaring the RMSE. MCS is highly reliant on the law of large numbers, and it is expected that the distribution of an appropriately large sample should converge to that of the underlying population as the sample size increases (Gilli et al. 2019). It is also expected that the Monte Carlo sampling error should decrease as the sample size increases, but this is not always the case. That is, the sample size cannot always be sufficiently increased to limit the sampling error to a tolerable level (Gilli et al. 2019).
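The decomposition in Equation (14) can be verified numerically on hypothetical estimates (using the 1/L form of Equations (12) and (13)):

```r
theta <- 0.999                                     # hypothetical true value
theta_hat <- c(0.990, 1.001, 0.998)                # hypothetical replicate estimates
b <- mean(theta_hat - theta)                       # bias, Equation (11)
se <- sqrt(mean((theta_hat - mean(theta_hat))^2))  # SE, Equation (12)
rmse <- sqrt(mean((theta_hat - theta)^2))          # RMSE, Equation (13)
all.equal(rmse^2, b^2 + se^2)                      # TRUE, confirming Equation (14)
```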

2.4.4. Discussion and Summary

After implementing the method, the last stage in the framework steps is the conclusion, which needs to reflect a summary discussion of all logical findings from the experiments, with answers to the research questions. The conclusion brings out the novelty of the research and may also include the limitations experienced and opportunities for future work. In addition, relevant information on simulation results can be conveyed through graphics, tabular presentation, or both.

3. Results: Simulation and Empirical

3.1. Practical Illustrations of the Simulation Design: Application to Bond Return Data

By way of illustration, this section practically describes how the stated steps can be applied using Monte Carlo simulations with a real data empirical verification.

3.1.1. The Background

It is believed that observation-driven models can appropriately estimate volatility when fitted with a suitable error distribution (Bollerslev 1987). Observation-driven modelling exists in the presence of time-varying parameters, where parameters are functions of lagged dependent variables, concurrent variables and lagged exogenous variables (see Buccheri et al. 2021; Creal et al. 2013 for details). Data generation using the rugarch package can be performed through a variety of models that include the simple GARCH, the exponential GARCH (EGARCH), the GJR-GARCH, the Component GARCH (CGARCH) (Lee and Engle 1999), the Multiplicative Component GARCH (MCGARCH) (Engle and Sokalska 2012), among others, and two omnibus models apARCH and fGARCH (as described in Section 2.2).
The apARCH model is less robust than the fGARCH model (Ghalanos 2018); hence, the latter is used for the data generation in this study. Specifically, the fGARCH(1,1) model is used as the true data-generating process (DGP) for the MC simulation because the first lag of conditional variability can considerably capture the volatility clustering in the time series data. In other words, volatility depends more on recent past activities than on distant past activities (Javed and Mantalos 2013). Hence, this illustrative study showcases the effectiveness of the observation-driven fGARCH model for estimating the persistence of volatility, where the outcomes of the model fitted with each of ten assumed innovation distributions, namely the Normal, skew-Normal, Student’s t, skew-Student’s t, GED, skew-GED, GHYP, Normal Inverse Gaussian (NIG), Generalised Hyperbolic Skew-Student’s t (GHST) distribution and Johnson’s reparametrised SU (JSU) distribution, are compared. Details on the error distributions can be found in Ashour and Abdel-hameed (2010); Azzalini (1985); Azzalini and Capitanio (2003); Barndorff-Nielsen et al. (2013); Branco and Dey (2001); Eling (2014); Ghalanos (2018); Lee and Pai (2010); Pourahmadi (2007).
The DGP fGARCH(1,1) model, as stated in Equation (15), is used to generate simulated return observations using the non-Normal Student’s t error with $\nu = 4.1$ as the true error distribution.
$$ \sigma_t^{\gamma} = \omega + \alpha_1 \sigma_{t-1}^{\gamma} \left( |z_{t-1} - \lambda_{21}| - \lambda_{11}\{z_{t-1} - \lambda_{21}\} \right)^{\delta} + \beta_1 \sigma_{t-1}^{\gamma} \quad (15) $$
Here, a Student’s t with shape parameter or degrees of freedom $\nu = 4.1$ is used to ensure that $E[z_t^4] < \infty$, which enables $\sqrt{N}$ consistency of the QML estimation following the assumption of Francq and Thieu (2019) (see Hoga 2022). Moreover, the Student’s t distribution is used as the true error distribution in this study because it can suitably deal with the leptokurtic or fat-tailed features (Duda and Schmidt 2009; Lin and Shen 2006) experienced in financial data (Hentschel 1995), and it is also assumed that stock prices appear to have a distribution much like the Student’s t (Heracleous 2007). However, based on relevance and research needs, users may choose any leptokurtic distribution, such as the GED or others, for their data generation. Simulation through the rugarch package can be carried out using the ugarchsim and ugarchpath functions, but not all the stated data-generating models currently support the use of the ugarchpath method (see Ghalanos 2018). Hence, this illustration is implemented using the ugarchsim function through the “fit object” approach. The ugarchsim function has been used in Shahriari et al. (2023); Søfteland and Iversen (2021); Zhang (2017), as it gives the user flexibility and control.
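A sketch of the corresponding DGP specification in rugarch is shown below; fixing the shape parameter at 4.1 encodes the Student’s t ($\nu = 4.1$) true error, and "ALLGARCH" denotes the full family submodel:

```r
# fGARCH(1,1) data-generating specification with Student's t (nu = 4.1) errors
spec_dgp <- ugarchspec(
  variance.model = list(model = "fGARCH", garchOrder = c(1, 1),
                        submodel = "ALLGARCH"),
  mean.model = list(armaOrder = c(1, 1)),
  distribution.model = "std",
  fixed.pars = list(shape = 4.1)   # pins the degrees of freedom of the t error
)
```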
Further background study reveals the findings of Morris et al. (2019), where the authors showed that the RMSE is more applicable as a performance measure when the objective of the simulation is prediction rather than estimation. The authors also discussed how the RMSE is more sensitive than the SE or bias alone to the choice of the number of observations used during method comparisons. Hence, for fairness in performance assessments, the SE is used as the key metric or measure of efficiency (precision) in this illustrative study.
It is also noticed from the outcomes of the fGARCH modelling that two sets of standard error (SE) estimates are returned: the default MLE SEs and the robust QMLE SEs (Ghalanos 2018; White 1982; Zivot 2013). This study uses the robust fGARCH QMLE SEs for the simulation illustrations because they are claimed to be consistent (but not efficient) and asymptotically normally distributed if the volatility and mean equations are well specified (Bollerslev and Wooldridge 1992; Wuertz et al. 2020).

3.1.2. Aim of the Simulation Study

This study aims at obtaining the most appropriate assumed error distribution for volatility persistence estimation when the underlying (true) error distribution is unknown.

3.1.3. Research Questions

This simulation study should result in responses to the following questions:
  • Which among the assumed error distributions is the most appropriate from the fGARCH process simulation for estimating the persistence of the volatility?
  • Financial data are fat-tailed (Li 2008), i.e., non-Normal. Hence, will the combined volatility estimator $\hat{\alpha} + \hat{\beta}$ of the most suitable error assumption still be consistent under departure from the Normal assumption?
  • What type (i.e., strong, weak or inconsistent) of $\sqrt{N}$ consistency, in terms of the RMSE and SE, does the fGARCH estimator $\hat{\alpha} + \hat{\beta}$ exhibit?
  • How well does the MCS estimator $\hat{\alpha} + \hat{\beta}$ perform in recovering the true parameter?

3.1.4. Method of Implementation

To initiate the implementation method, the written code in Appendix C is first used to fit the true model fGARCH(1,1)-Student’s t to the SA bond return data (BondDataSA) through the ugarchfit function to obtain the fGARCH fit object. Next, through the ugarchsim function, using seed 12345 in the code, the outputs from the fit are set (or used) a priori as the true parameter values $(\alpha, \beta)$ = (0.0748, 0.9243) for the simulation process, as shown in Table 2. These parameter values, with other estimates from the fit object, are used directly to generate (simulate) N = 15,000 observations, replicated 1000 times. However, following the simulation structure of Feng and Shi (2017) to prevent the effect of initial values, the simulated dataset is trimmed down by removing the first N = {7000, 6000, 5000} sets of observations at each stage of the simulated 15,000 observations, so that the last N = {8000, 9000, 10,000} observations are processed under each of the ten assumed innovation distributions, as shown in Table 2. For brevity, the presented code in Appendix C only shows the command lines for the first stage of the simulated data, with the trimming; a sketch of the trimming step appears below. This briefly illustrates how the 15,000 observations are generated through the ugarchsim function and then trimmed down to 8000. The remaining two stages (i.e., N = 9000 and 10,000) of the data generation and trimming follow this same pattern.
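The trimming itself reduces to dropping the leading rows of the simulated matrix; a minimal sketch, assuming `sim` is the ugarchsim output with n.sim = 15,000 and m.sim = 1000, is:

```r
paths <- fitted(sim)            # 15,000 x 1000 matrix of simulated returns
trimmed <- paths[-(1:7000), ]   # drop the first 7000 rows; the last 8000 remain
```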
Figure 5 displays the visual outlooks of the simulated returns and volatilities for the first three series in the 1000 replicated series for N = 8000. These sampled visuals show that each of the 1000 replicated series of the simulated (synthetic) data has a unique randomness and shape that make them different from every other series. Hence, the estimate from the family GARCH simulation is the average of all the estimates from the different replicated series.
After generating the simulated observations, the fGARCH(1,1) model is fitted to each simulated dataset under the ten error assumptions, as shown in the code. However, for brevity in the written code, the fGARCH(1,1) model is only fitted under the Normal error using the distribution.model = "norm" argument in the ugarchspec function. All the other error assumptions can be fitted in the same pattern by simply replacing "norm" with the naming convention of the relevant error distribution, e.g., "snorm" for skew-Normal, "std" for Student’s t, "sstd" for skew-Student’s t (see Ghalanos 2018 for details; the complete code can be found at https://github.com/rsamuel11, accessed on 14 August 2023), as sketched after this paragraph. The parsimonious ARMA(1,1) model is also used in the code as the most suitable among the tested candidate ARMA models to remove serial correlation in the simulated observations. However, consistency can still be achieved in simulation modelling regardless of correlated sample draws. That is, the sampled variates do not need to be independent to achieve consistency (Chib 2015).
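Refitting under the remaining nine assumptions amounts to swapping the distribution.model argument; a compact sketch, with `sim_returns` standing in for one simulated return series, is:

```r
# rugarch naming conventions for the ten assumed innovation distributions
dists <- c("norm", "snorm", "std", "sstd", "ged", "sged",
           "ghyp", "nig", "ghst", "jsu")
fits <- lapply(dists, function(d) {
  spec_d <- ugarchspec(variance.model = list(model = "fGARCH", garchOrder = c(1, 1),
                                             submodel = "ALLGARCH"),
                       mean.model = list(armaOrder = c(1, 1)),
                       distribution.model = d)
  ugarchfit(spec_d, data = sim_returns)  # `sim_returns`: one simulated series
})
names(fits) <- dists
```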
Next, the selected meta-statistics are used to evaluate the estimators. The most suitable assumed error distribution for estimating the persistence will be obtained from the estimator/s with the best precision and efficiency from the meta-statistical comparisons made under all the selected error assumptions. Three meta-statistical summaries, namely the bias, RMSE and SE, are used in this illustration. The computation of the metrics is direct but may sometimes be nerve-racking, and manual programming may even cause unanticipated coding errors and other abrupt setbacks. To circumvent this, the SimDesign statistical package (Chalmers and Adkins 2020; Sigal and Chalmers 2016), with in-built meta-statistical functions for computational accuracy, is used in this illustration, beginning with bias estimation. The log-likelihood (llk) of the estimates, with the RMSE, bias and SE for the $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\alpha} + \hat{\beta}$ estimators, are displayed in Table 2, but the SE is the key performance measure for efficiency and precision.
Now comparing the RMSE for $\hat{\alpha}$, the results from the table show that both the skew-GED and skew-Student’s t outperform the other assumed innovations in efficiency with the least values as N tends to the peak at 10,000. For $\hat{\beta}$, the JSU, followed by the GHST and NIG, outperforms the rest of the innovation assumptions in efficiency with the least RMSE value as N tends to the peak. For $\hat{\alpha} + \hat{\beta}$, the GHST followed by the skew-Student’s t outperform the remaining eight innovation assumptions as N tends to the peak, but the skew-Student’s t is the best as the sample size reaches the middle at N = 9000 for both $\hat{\alpha} + \hat{\beta}$ and $\hat{\beta}$.
Comparing the bias for $\hat{\alpha}$, as N approaches the peak, the absolute values of the biases for the GED and skew-Student’s t outperform the rest. For $\hat{\beta}$, the JSU and the true innovation Student’s t both take the lead as N reaches the peak. For $\hat{\alpha} + \hat{\beta}$, the JSU followed by the GHYP outperform the other innovations in absolute values of biases as N reaches the peak.
For the precision and efficiency comparison in terms of the key performance metric SE, the skew-Student’s t relatively outperforms the others in efficiency and precision as N tends to the peak for $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\alpha} + \hat{\beta}$, in particular for $\hat{\beta}$ and $\hat{\alpha} + \hat{\beta}$. Finally, for the llk comparison, the GHYP outperforms the rest, with the largest estimates at the three N sample sizes. To summarise, when the true innovation is Student’s t, the skew-Student’s t assumed innovation distribution relatively outperforms the other nine innovation assumptions in efficiency and precision, while the GHYP performs better than the rest through the log-likelihood. It is observed here that the SEs of the $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\alpha} + \hat{\beta}$ estimators for the assumed Normal innovation distribution are the largest when compared with those of the other nine assumed innovation distributions. This justifies the claim that the QMLE of the family GARCH model (with Normal innovation) is inefficient. Furthermore, it is observed from the tabulated outputs that the RMSE and SE of the estimators are considerably $\sqrt{N}$ consistent in recovering the true parameters under the assumed innovations. The visual illustrations of the consistency for the outputs of the $\hat{\alpha} + \hat{\beta}$ estimator are graphically displayed in Figure 6. The figure shows that the closer the absolute values of the biases are to zero, the closer the SE is to the RMSE. Whenever the bias drifts away from zero, the gap between the SE and RMSE widens; otherwise, their trend lines closely follow the same trajectory. The visual plots also show that the RMSE and SE decrease as N increases, but the bias is independent of N.
It is also observed from the table, and as shown in Panels A and B of Figure 7, that the MCS estimates for the estimator $\hat{\alpha} + \hat{\beta}$ considerably recover the true (volatility) parameter value of 0.9991, with TPR outcomes closely clustered around the 95% (i.e., 0.95) nominal recovery level under the ten error assumptions. This indicates a good performance of the MCS experiments with suitably valid outputs. However, the non-Normal errors perform slightly better in the recovery than the Normal and skew-Normal errors, as clearly revealed in Panel B. It can also be seen from the tabular results that the TPR outcomes are independent of N, and the closer the MCS estimate is to zero, the smaller the TPR estimate.

3.2. Empirical Verification

Next, the outcomes of the MCS experiments are empirically verified using the real return data from the SA bond market index. Among the ten assumed error distributions, the most appropriate for the fGARCH process to estimate the volatility persistence of the bond market’s returns is examined. For the market index, the price data are transformed to log-daily returns by taking the difference of the logarithms of the price, expressed in percentage as
$$ r_t = \ln\left( \frac{P_t}{P_{t-1}} \right) \times 100. \quad (16) $$
Here, $P_t$ and $P_{t-1}$ are the closing bond price index at time $t$ and the previous day’s closing price at time $t-1$, respectively; $r_t$ is the current return, and $\ln$ represents the natural logarithm.
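In R, Equation (16) is a one-liner; `price` stands in for the vector of daily closing index values:

```r
returns <- diff(log(price)) * 100  # log-daily returns in percent, Equation (16)
```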

3.2.1. Exploratory Data Analysis

To start with, the price index and returns are first inspected through exploratory data analysis (EDA), as displayed in Figure 8. The EDA visually sheds light on the content of the dataset to reveal relevant information and potential outliers. Figure 8 unearths some downswings or steep falls in the volatility of the price (in plot a) and returns (in plot c) around the years 2002, 2008, 2016 and 2020. The most recent, in 2020, as shown by the plots, was due to the global COVID-19 pandemic.
For further inspection, the figure is now separated into two panels: left and right. The left panels contain plots a, b, e, f for daily bond prices, while the right panels consist of plots c, d, g, h for the returns. The left panels reveal non-stationarity in the price index as observed in the price series plot, the density plot, quantile–quantile (QQ) plot and the box plot. On the other hand, the right panels show stationarity in the returns through the return series plot, the density plot, the QQ plot and the box plot. These summarily elucidate the non-stationarity in the SA daily bond prices and stationarity in the returns.

3.2.2. Tests for Serial Correlation and Heteroscedasticity

Next, linear dependence (or serial correlation) and heteroscedasticity are filtered out by fitting ARMA-fGARCH models with each of the ten innovation distributions to the stationary return series. The ARMA(1,1) model, as stated in Equation (17), is found to be the most adequate, among all the examined candidate ARMA(p,q) models, to remove serial correlation from the SA bond market’s return residuals. Table 3 presents the outcomes of the Weighted Ljung–Box (WLB) test (see Fisher and Gallagher 2012 for details) for fitting the ARMA(1,1) model. The p-values of the test at lag 5 all exceed 0.05 under each error distribution. Based on this, we fail to reject the null hypothesis of “no serial correlation” in the SA bond market’s returns. This means there is no evidence of autocorrelation in the return residuals.
$$ r_t = \phi_0 + \phi_1 r_{t-1} + \varphi_1 \varepsilon_{t-1} + \varepsilon_t \quad (17) $$
Following the filtering of the linear dependence in the return series, Engle’s ARCH test (see Engle 1982) is carried out using the Lagrange Multiplier (LM) and Portmanteau-Q (PQ) tests to check for the presence of heteroscedasticity or ARCH effects in the residuals. These tests are implemented based on the null hypothesis of homoscedasticity in the residuals of an Autoregressive Integrated Moving Average (ARIMA) model. Both tests yield highly significant p-values of approximately zero, as shown in Figure 9. Hence, the null hypothesis of “no ARCH effect” in the residuals is rejected, which denotes the existence of volatility clustering. Based on this, a heteroscedastic model can be fitted to remove the ARCH effects in the series. To achieve this, the candidate robust fGARCH($u,v$) models, with each of the ten error distributions, are fitted to the SA bond returns, where the fit of the parsimonious fGARCH(1,1) model, as shown in Equation (18), is found to be the most suitable.
$$ \sigma_t^{\gamma} = \omega + \alpha_1 \sigma_{t-1}^{\gamma} \left( |z_{t-1} - \lambda_{21}| - \lambda_{11}\{z_{t-1} - \lambda_{21}\} \right)^{\delta} + \beta_1 \sigma_{t-1}^{\gamma}. \quad (18) $$
After fitting the fGARCH model to the returns, the weighted ARCH LM test is used to ascertain if ARCH effects have been filtered out. The p-value of the “ARCH LM statistic (7)” at lag 7 in Table 3 is greater than 5% under each of the ten innovation distributions. Hence, this indicates that heteroscedasticity is filtered out since we fail to reject the null hypothesis of “no ARCH effect” in the residuals. These outcomes show that the variance equation is well specified.
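For reference, a sketch of Engle’s pre-fit ARCH test on the ARMA(1,1) residuals via the aTSA package listed in Section 2.4.3 is shown below (the weighted post-fit ARCH LM statistics are reported automatically in the rugarch fit summary):

```r
library(aTSA)
fit_arma <- arima(returns, order = c(1, 0, 1))  # ARMA(1,1) mean filter
arch.test(fit_arma)  # prints Portmanteau-Q and Lagrange Multiplier results by lag
```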

3.2.3. Selection of the Most Suitable Error Distribution

Next, the selection of the most suitable assumed error distribution to describe the market’s returns, when fitted with the fGARCH model for volatility persistence estimation, is obtained from Table 3. It is observed from the table that all but two of the fGARCH volatility parameter estimates ($\hat{\omega}$, $\hat{\alpha}$, $\hat{\beta}$, $\hat{\lambda}_{11}$, $\hat{\lambda}_{21}$ and $\hat{\gamma}$) under the ten innovation assumptions are statistically significant at the 1% level. This means that these parameters are actively needed in the model. The two exceptions are the insignificant $\hat{\omega}$ for the Normal and skew-Normal, and the estimates $\hat{\lambda}_{21}$ that are mostly not significant or barely significant. The strongly significant $\hat{\lambda}_{11}$ indicates the dominance of asymmetric large shocks in the return series.
Comparisons of the error distributions are carried out using the log-likelihood and four information criteria that include the Akaike information criterion (AIC), Bayesian information criterion (BIC), Hannan–Quinn information criterion (HQIC) and Shibata information criterion (SIC) (see Ghalanos 2018 for details). The largest log-likelihood value with the smallest values of the information criteria under a given assumed innovation indicates that it is the most appropriate innovation distribution to describe the market for volatility persistence estimation.
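In rugarch, both comparison statistics come straight from a fitted object (`fit` as before):

```r
likelihood(fit)    # log-likelihood of a ugarchfit object
infocriteria(fit)  # Akaike, Bayes, Shibata and Hannan-Quinn criteria
```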
It is observed from Table 3 that the values of all four information criteria are smallest under the skew-Student’s t innovation distribution, but the GHYP innovation has the highest log-likelihood value. Hence, the skew-Student’s t is the most suitable innovation assumption strictly based on the information criteria, while the GHYP is the most appropriate if the decision is made using the log-likelihood. The GHYP and skew-Student’s t also yield better goodness of fit (GoF) outcomes when compared with the remaining eight errors, as shown by their large p-values in the table, which shows that they are the best fit among the ten error assumptions for the distribution of the SA bond’s return residuals. Hence, the volatility persistence of the SA bond market’s returns can be most suitably estimated through the ARMA(1,1)-fGARCH(1,1) model fitted with the GHYP or skew-Student’s t assumed error distribution. These empirical results are consistent with the Monte Carlo simulation outcomes. The estimated volatility persistence under these most suitable error distributions is 0.9795 for the GHYP and 0.9792 for the skew-Student’s t. Hence, this indicates that the volatility of the SA bond market’s returns is considerably highly persistent.
This study also checked the empirical outcomes of fitting the less omnibus apARCH(1,1) model to the bond return data and we arrived at the same results (of skew-Student’s t through information criteria and GHYP via log-likelihood) obtained by the fGARCH(1,1) model (see Table 4). The table only shows the outcomes of the log-likelihood and information criteria for brevity.
For the run-time, it is observed that the skew-Student’s t is about four (approximately 4.2) times faster than the GHYP for both simulation and empirical modelling. That is, it takes the GHYP about four times the computational time it takes the skew-Student’s t to run the same process. Since the empirical and simulation run-times are approximately the same, we only present the empirical run-times for the ten innovations in Table 3 to conserve space. For both simulation and empirical runs, the GHYP has the highest runtime among the ten innovations, followed by the NIG, while the Normal has the least.
From the outputs of the ARMA(1,1)-fGARCH(1,1) model in Table 3, the mean and variance (from the conditional standard deviation’s Box–Cox transformation in Section 2.2) equations of the model fitted with each of the GHYP and skew-Student’s t are stated as
With GHYP:
$$ r_t = \mu_t + \varepsilon_t = 0.0156 + \varepsilon_t, $$
$$ \sigma_t^{\gamma} = \omega + \alpha_1 \sigma_{t-1}^{\gamma} \left( |z_{t-1} - \lambda_{21}| - \lambda_{11}\{z_{t-1} - \lambda_{21}\} \right)^{\delta} + \beta_1 \sigma_{t-1}^{\gamma}, $$
$$ \sigma_t^{1.2086} = 0.0267 + 0.0661\, \sigma_{t-1}^{1.2086} \left( |z_{t-1} - 0.0942| - 0.3370\{z_{t-1} - 0.0942\} \right)^{1.2086} + 0.9241\, \sigma_{t-1}^{1.2086}. $$
With skew-Student’s t:
$$ r_t = \mu_t + \varepsilon_t = 0.0177 + \varepsilon_t, $$
$$ \sigma_t^{\gamma} = \omega + \alpha_1 \sigma_{t-1}^{\gamma} \left( |z_{t-1} - \lambda_{21}| - \lambda_{11}\{z_{t-1} - \lambda_{21}\} \right)^{\delta} + \beta_1 \sigma_{t-1}^{\gamma}, $$
$$ \sigma_t^{1.2058} = 0.0270 + 0.0661\, \sigma_{t-1}^{1.2058} \left( |z_{t-1} - 0.0943| - 0.3445\{z_{t-1} - 0.0943\} \right)^{1.2058} + 0.9236\, \sigma_{t-1}^{1.2058}. $$

4. Discussion and Summarised Conclusions

In conclusion, it is observed that, using the ugarchsim approach for the MCS illustration, the GHYP and skew-Student’s t emerge as the most suitable assumed error distributions to use with the fGARCH model for the volatility persistence estimation of the SA bond returns when the underlying error distribution is unknown. These outcomes are verified empirically. The estimated persistence of volatility under these most suitable error distributions is 0.9795 for the GHYP and 0.9792 for the skew-Student’s t. Hence, this indicates considerably high volatility persistence in the SA bond market’s returns. It is also observed that the use of the ugarchsim approach provided considerably consistent estimates (and recovery) of the true data-generating parameters.
The conclusion under this section continues by providing answers to the four research questions. In this study, consistency is termed “strong” when the estimator’s RMSE/SE value decreases as the sample size N increases without distortion; otherwise, it is weak. Now, answering the questions: first, the GHYP and the skew-Student’s t distributions are the most appropriate among the stated assumed error distributions from the fGARCH process simulation for the volatility persistence estimation. Second, the volatility estimator $\hat{\alpha} + \hat{\beta}$ for each of the most suitable assumed errors GHYP and skew-Student’s t is strongly $\sqrt{N}$ consistent for both the RMSE and SE under departures from the Normal assumption, as revealed in Panels G and D of Table 2. Third, there are strong $\sqrt{N}$ consistencies in the RMSE and SE of the fGARCH estimator $\hat{\alpha} + \hat{\beta}$ under all but one of the ten assumed error distributions, as shown in Table 2. The lone exception is the weak consistency in the SE of the Normal assumption. Fourth, as a proxy for the coverage of the MCS experiment, the MCS estimator $\hat{\alpha} + \hat{\beta}$ performed well in recovering the true parameter $\alpha + \beta$ through the TPR measure at the 95% nominal recovery level, as revealed in Table 2 and Figure 7. The results show that the TPR outcomes are suitably close to the 95% nominal recovery level under the ten error assumptions.

5. Conclusions

This study showcases a robust step-by-step framework for a comprehensive simulation by presenting the functionalities of the rugarch package in R for simulating and estimating time-varying parameters through the family GARCH observation-driven model. The framework sets out an organised approach to a Monte Carlo simulation (MCS) study that involves “background (optional), defining the aim, research questions, method of implementation, and summarised conclusion”. The method of implementation is a workflow that includes writing the code, setting the seed, setting the true parameters a priori, the data-generation process, and performance assessment through meta-statistics.
This novel, easy-to-understand framework is illustrated using financial return data; hence, users can easily adopt it for effective MCS studies. With the ugarchsim simulation approach involved in the modelling, the implementation method is clearly explained with relevant details. Key observations are identified, and novel findings are brought to light. The framework also lays out clear coding guidelines for data generation using the package, since data generation is without a doubt an integral part of MCS studies. The key observations and novel findings in this study include, first, that as the sample size N becomes larger, the consistency and efficiency properties of an estimator in a Monte Carlo process are generally not affected by the pattern or arrangement of the seed values, although this may depend on the quality of the model used. Hence, regardless of the arrangement of the seed values, the efficiency and consistency of an estimator generally remain the same as N tends to infinity.
Second, it is investigated and revealed in this study that the outcomes of the GARCH MCS experiments are the same regardless of whether the simulation or data-generating process is run once or replicated multiple times. Third, this study derived a “true parameter recovery (TPR)” measure as a proxy for the coverage of the MCS experiment. This new measure is flexible to apply and can henceforth be used by upcoming researchers to determine the level of recovery of the true parameter value by the MCS estimates. It is also observed that the volatility estimator of the fGARCH model used displays considerably strong $\sqrt{N}$ consistency.
Lastly, the outcomes of the illustrative study show that the GHYP and skew-Student’s t errors are the most suitable among the ten assumed innovations to describe the SA bond returns for volatility persistence estimation. The fit of these two error assumptions with the fGARCH model revealed considerably high volatility persistence in the returns. On a wider scale, since volatility is a practical measure of risk, the fit of the GHYP and skew-Student’s t errors with a specification of the fGARCH (or apARCH) model for robust volatility modelling may benefit financial institutions and markets by enhancing the accuracy of their risk estimations. This could potentially lead to a significant reduction in asset losses. Moreover, it is documented that shocks with a permanent influence on the variance will have a greater effect on price than those with a temporary influence (Arago and Fernandez-Izquierdo 2003). Hence, through the fit of these innovations, policymakers and other financial market participants may benefit from a better understanding of the effects of shocks on future volatility, especially from knowing whether the effects of the shocks are transient or (highly) persistent. It is anticipated that researchers will leverage this study’s novel findings and robust design for improved simulation studies in finance and other sectors.

5.1. Limitations in the Study

Three limitations or challenges were encountered in this study. The first concerns how to obtain a sample size sufficient to generate accurate outcomes. To tackle this in the illustrative example, a selected set of sample sizes is tested, each in turn, until a pattern of efficiency and/or consistency starts to emerge under the stated error distributions. The most efficient error distribution, in terms of the given performance measure at a particular sample size, is carefully noted. If a set of sample sizes yields the same efficiency outcomes, the one with the best consistency among the set can be used for the final decision. This serves as a guide to determining the required sample size(s); a schematic version of the search is sketched below.
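The sketch below is a minimal, assumed version of such a search: fit is the fitted true model of Listing A1 (Appendix C), and mcs_rmse() is a hypothetical single-replication stand-in for the full replicated RMSE computation used in the study.

# Hypothetical sample-size search: sweep candidate N and innovation
# distributions, recording an error proxy for the persistence estimator.
mcs_rmse <- function(n, dist, true = 0.9990) {
  sim    <- ugarchsim(fit, n.sim = n, m.sim = 1, rseed = 12345)
  spec_d <- ugarchspec(variance.model = list(model = "fGARCH",
                                             garchOrder = c(1, 1),
                                             submodel = "ALLGARCH"),
                       mean.model = list(armaOrder = c(1, 1),
                                         include.mean = TRUE),
                       distribution.model = dist)
  refit <- ugarchfit(data = as.numeric(fitted(sim)), spec = spec_d)
  abs(as.numeric(persistence(refit)) - true)   # single-replication error proxy
}
sizes <- c(1000, 2000, 3000, 4000)
dists <- c("norm", "std", "ged", "ghyp")       # rugarch distribution codes
rmse_grid <- sapply(dists, function(d) sapply(sizes, function(n) mcs_rmse(n, d)))
dimnames(rmse_grid) <- list(sizes, dists)
rmse_grid  # choose the smallest N at which the errors stabilise across distributions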
The second is running time. A large simulated dataset may sometimes be needed to obtain accurate computations, and this may come at the cost of a large computational (running) time, depending on the model used. This can be very demanding, especially when dealing with several stages of large sample sizes; per-fit timings such as those reported in Table 3 can be recorded as shown below. Third, since the rugarch package makes no provision for calculating the coverage probability, this study derived the TPR measure as a proxy for coverage, and it is observed that the MCS estimates recover the true parameters to a considerable degree.
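For reference, running times can be captured with base R's system.time(); the snippet below is illustrative and assumes the R_simGARCH and spec objects of Listing A1 (Appendix C).

# Record the elapsed time of a single refit (in seconds, as in Table 3)
elapsed <- system.time(
  refit <- ugarchfit(data = R_simGARCH[, 1], spec = spec)
)["elapsed"]
elapsed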

5.2. Future Research Interest

The authors intend to further use the ugarchpath function of the rugarch package, through any of the models that support it, to illustrate the framework. The authors also intend to extend the simulation framework to other volatility (persistence) estimation models, such as the Generalised Autoregressive Score (GAS) model, and to the modelling and estimation of multivariate processes. Future extensions also include a framework for volatility forecasting and portfolio management.

Author Contributions

Conceptualization, R.T.A.S., C.C. and C.S.; methodology, R.T.A.S., C.C. and C.S.; software, R.T.A.S., C.C. and C.S.; validation, R.T.A.S., C.C. and C.S.; formal analysis, R.T.A.S.; investigation, R.T.A.S.; resources, R.T.A.S., C.C. and C.S.; data curation, R.T.A.S.; writing—original draft preparation, R.T.A.S.; writing—review and editing, R.T.A.S., C.C. and C.S.; visualization, R.T.A.S., C.C. and C.S.; supervision, C.C. and C.S.; project administration, C.C., C.S. and R.T.A.S.; funding acquisition, R.T.A.S., C.C. and C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

This study used the daily closing Standard & Poor’s (S&P) South African sovereign bond data for the period 4 January 2000 to 17 June 2021, with 5598 observations. The data were collected from Thomson Reuters Datastream (accessed on 18 June 2021). The analytic data can be downloaded from https://github.com/rsamuel11 (accessed on 13 August 2023).

Acknowledgments

The authors are grateful to Thomson Reuters Datastream for providing the data, and to the University of the Witwatersrand and the University of Venda for their resources. The authors would also like to thank the anonymous referees for their valuable comments that helped to improve the quality of the work.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
MCS: Monte Carlo simulation
SA: South Africa
GARCH: Generalised Autoregressive Conditional Heteroscedasticity
ugarchsim: Univariate GARCH Simulation
ugarchpath: Univariate GARCH Path Simulation
ARCH: Autoregressive Conditional Heteroscedasticity
TPR: True Parameter Recovery
S&P: Standard & Poor’s
ARMA: Autoregressive Moving Average
ARIMA: Autoregressive Integrated Moving Average
i.i.d.: Independent and identically distributed
MLE: Maximum likelihood estimation
QMLE: Quasi-maximum likelihood estimation
fGARCH: family GARCH
sGARCH: simple GARCH
AVGARCH: Absolute Value GARCH
GJR GARCH: Glosten–Jagannathan–Runkle GARCH
TGARCH: Threshold GARCH
NGARCH: Nonlinear ARCH
NAGARCH: Nonlinear Asymmetric GARCH
EGARCH: Exponential GARCH
apARCH: Asymmetric Power ARCH
CGARCH: Component GARCH
MCGARCH: Multiplicative Component GARCH
P̂: Persistence
DGP: Data generation process
RMSE: Root mean square error
SE: Standard error
GED: Generalised Error Distribution
GHYP: Generalised Hyperbolic
NIG: Normal Inverse Gaussian
GHST: Generalised Hyperbolic Skew-Student’s t
JSU: Johnson’s reparametrised SU
llk: log-likelihood
EDA: Exploratory Data Analysis
QQ: Quantile–Quantile
LM: Lagrange Multiplier
PQ: Portmanteau-Q
WLB: Weighted Ljung–Box
AIC: Akaike information criterion
BIC: Bayesian information criterion
HQIC: Hannan–Quinn information criterion
SIC: Shibata information criterion
AP-GoF: Adjusted Pearson Goodness-of-Fit
p-value: Probability value
GAS: Generalised Autoregressive Score

Appendix A. Outcomes of Different Patterns of Seed Values for Sets S1 and S2

Table A1. Outcomes of different patterns of seed values, with true parameter α + β = 0.9990. For each seed, the columns report α̂+β̂, RMSE(α̂+β̂), SE(α̂+β̂) and TPR at the 95% nominal level.

Panel A (S1)          Seed 12345                         Seed 54321                         Seed 15243
              N    α̂+β̂    RMSE    SE      TPR       α̂+β̂    RMSE    SE      TPR       α̂+β̂    RMSE    SE      TPR
Normal      1000  0.9990  0.0862  0.0862  95.00%    0.9563  0.0757  0.0625  90.94%    0.9909  0.0801  0.0796  94.23%
            2000  0.9771  0.0934  0.0908  92.91%    0.9903  0.0419  0.0410  94.17%    0.9839  0.0490  0.0466  93.56%
            3000  0.9727  0.0756  0.0709  92.50%    0.9850  0.0373  0.0345  93.67%    0.9791  0.0591  0.0557  93.11%
            4000  0.9700  0.0693  0.0629  92.24%    0.9846  0.0412  0.0387  93.64%    0.9972  0.0304  0.0303  94.83%
Student's t 1000  0.9990  0.0525  0.0525  95.00%    0.9833  0.0441  0.0412  93.50%    0.9990  0.0768  0.0768  95.00%
            2000  0.9902  0.0500  0.0492  94.16%    0.9990  0.0349  0.0349  95.00%    0.9958  0.0388  0.0386  94.69%
            3000  0.9973  0.0327  0.0327  94.83%    0.9977  0.0308  0.0308  94.87%    0.9956  0.0318  0.0316  94.68%
            4000  0.9918  0.0277  0.0267  94.31%    0.9974  0.0258  0.0258  94.85%    0.9963  0.0247  0.0246  94.74%
GED         1000  0.9875  0.0719  0.0710  93.90%    0.9688  0.0499  0.0397  92.13%    0.9899  0.0630  0.0624  94.13%
            2000  0.9663  0.0608  0.0512  91.89%    0.9908  0.0336  0.0326  94.22%    0.9847  0.0387  0.0360  93.64%
            3000  0.9684  0.0441  0.0317  92.09%    0.9846  0.0347  0.0315  93.63%    0.9795  0.0385  0.0333  93.15%
            4000  0.9692  0.0410  0.0282  92.16%    0.9833  0.0328  0.0288  93.51%    0.9839  0.0300  0.0259  93.57%
GHYP        1000  0.9940  0.0557  0.0555  94.52%    0.9785  0.0437  0.0386  93.05%    0.9897  0.0657  0.0650  94.12%
            2000  0.9748  0.0507  0.0446  92.70%    0.9979  0.0328  0.0328  94.89%    0.9871  0.0364  0.0344  93.87%
            3000  0.9780  0.0353  0.0284  93.00%    0.9901  0.0305  0.0292  94.15%    0.9849  0.0325  0.0293  93.66%
            4000  0.9776  0.0322  0.0241  92.97%    0.9898  0.0263  0.0247  94.12%    0.9892  0.0247  0.0226  94.07%

Panel B (S2)          Seed 34567                         Seed 76543                         Seed 36547
              N    α̂+β̂    RMSE    SE      TPR       α̂+β̂    RMSE    SE      TPR       α̂+β̂    RMSE    SE      TPR
Normal      1000  0.9856  0.0583  0.0568  93.72%    0.9942  0.0424  0.0421  94.54%    0.9823  0.3888  0.3884  93.41%
            2000  0.9814  0.0396  0.0354  93.33%    0.9891  0.0370  0.0357  94.06%    0.9806  0.1419  0.1407  93.25%
            3000  0.9845  0.0708  0.0693  93.62%    0.9809  0.0334  0.0281  93.28%    0.9822  0.0805  0.0787  93.40%
            4000  0.9990  0.0397  0.0397  95.00%    0.9778  0.0326  0.0248  92.98%    0.9779  0.0575  0.0535  92.99%
Student's t 1000  0.9971  0.0474  0.0474  94.82%    0.9990  0.0422  0.0422  95.00%    0.9990  0.0329  0.0329  95.00%
            2000  0.9789  0.0364  0.0303  93.08%    0.9990  0.0281  0.0281  95.00%    0.9990  0.0315  0.0315  95.00%
            3000  0.9781  0.0326  0.0249  93.01%    0.9975  0.0237  0.0236  94.86%    0.9990  0.0234  0.0234  95.00%
            4000  0.9871  0.0253  0.0223  93.87%    0.9955  0.0218  0.0215  94.67%    0.9946  0.0238  0.0234  94.58%
GED         1000  0.9802  0.0463  0.0423  93.21%    0.9899  0.0389  0.0378  94.13%    0.9986  0.0490  0.0490  94.96%
            2000  0.9726  0.0386  0.0282  92.49%    0.9898  0.0280  0.0265  94.13%    0.9879  0.0388  0.0371  93.94%
            3000  0.9710  0.0398  0.0282  92.33%    0.9820  0.0276  0.0218  93.38%    0.9808  0.0303  0.0243  93.27%
            4000  0.9800  0.0285  0.0213  93.19%    0.9782  0.0284  0.0194  93.02%    0.9752  0.0321  0.0215  92.73%
GHYP        1000  0.9863  0.0436  0.0417  93.80%    0.9928  0.0383  0.0378  94.41%    0.9990  0.0370  0.0370  95.00%
            2000  0.9744  0.0377  0.0285  92.66%    0.9952  0.0265  0.0262  94.64%    0.9990  0.0358  0.0358  95.00%
            3000  0.9737  0.0351  0.0242  92.59%    0.9872  0.0243  0.0213  93.88%    0.9894  0.0256  0.0237  94.09%
            4000  0.9810  0.0278  0.0212  93.29%    0.9835  0.0246  0.0192  93.53%    0.9816  0.0275  0.0213  93.35%

Appendix B. Further Visual Illustrations of S1 and S2 TPR Outcomes

Figure A1. The TPR outcomes of S1 and S2 in Panels (A,B), respectively, where the dotted lines are the 95% (i.e., 0.95) nominal recovery levels.

Appendix C. The Code for the DGP through the ugarchsim Function

Listing A1. The Code for the Method of Implementation of the MCS Experiment.
library(rugarch)
attach(BondDataSA)
BondDataSA <- as.data.frame(BondDataSA)

# Fit the true model: ARMA(1,1)-fGARCH(1,1) (submodel ALLGARCH) with
# Student's t innovations and a fixed shape parameter
spec <- ugarchspec(variance.model = list(model = "fGARCH",
                                         garchOrder = c(1, 1),
                                         submodel = "ALLGARCH"),
                   mean.model = list(armaOrder = c(1, 1),
                                     include.mean = TRUE),
                   distribution.model = "std",
                   fixed.pars = list(shape = 4.1))
fit <- ugarchfit(data = BondDataSA[, 4, drop = FALSE], spec = spec)
fit
coef(fit)

# Simulate for N = 8000 (15,000 draws per path; the first 7000 are discarded below)
sim <- ugarchsim(fit, n.sim = 15000, n.start = 1, m.sim = 1000,
                 rseed = 12345, startMethod = "sample")
simGARCH <- as.data.frame(fitted(sim))

# Remove the first 7000 observations for the initial-values effect
R_simGARCH <- simGARCH[-c(1:7000), ]

# Fit ARMA(1,1)-fGARCH(1,1) to the simulated dataset R_simGARCH
# For Normal innovations; ugarchfit expects a univariate series, so each
# simulated path (column) is fitted in turn -- the first path is shown here
spec <- ugarchspec(variance.model = list(model = "fGARCH",
                                         garchOrder = c(1, 1),
                                         submodel = "ALLGARCH"),
                   mean.model = list(armaOrder = c(1, 1),
                                     include.mean = TRUE),
                   distribution.model = "norm")
fit <- ugarchfit(data = R_simGARCH[, 1], spec = spec)
show(fit)
coef(fit)
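The listing ends at the refit of a single path. As a complement, the short sketch below shows one way (an assumed post-processing step, not code from the paper) to refit every simulated path and summarise the persistence estimates into the bias, SE and RMSE meta-statistics used in this study; persistence() is rugarch's persistence extractor, and the true value 0.9990 is illustrative (cf. Table A1).

# Assumed post-processing: refit each simulated path (column of R_simGARCH)
# and summarise the persistence estimator across the m.sim replications.
true_persist <- 0.9990                     # illustrative true persistence (cf. Table A1)
est <- apply(R_simGARCH, 2, function(path) {
  f <- tryCatch(ugarchfit(data = path, spec = spec), error = function(e) NULL)
  if (is.null(f)) NA_real_ else as.numeric(persistence(f))
})
est  <- est[!is.na(est)]                   # drop any non-converged replications
bias <- mean(est) - true_persist           # Monte Carlo bias
se   <- sd(est)                            # Monte Carlo standard error
rmse <- sqrt(bias^2 + se^2)                # RMSE via the bias-variance identity
round(c(bias = bias, SE = se, RMSE = rmse), 4)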

Notes

1. The GRETL (Baiocchi and Distaso 2003; Cottrell and Lucchetti 2023), GAS (Ardia et al. 2019) and fGARCH (Pfaff 2016; Wuertz et al. 2020) packages are also among the freely available and applicable software.
2. Coverage probability is the probability that a confidence interval of estimates contains or covers the true parameter value (Hilary 2002).
3. We describe the true model as the data-generating model fitted with the true sampling distribution (see Feng and Shi 2017).
4. When the true model is fitted to the real data, the estimates from the fit represent the true parameters.
5. This study only follows the authors’ trimming steps for the initial-values effect. The authors’ other trimming for “simulation bias” (where some initial replications are further discarded after the initial-values adjustment) is not used here because it is observed that it sometimes distorts the estimator’s N consistency.
6. See the “Synopsis of R packages”, pages 125–27, in Pfaff (2016) for relevant details on the rugarch package and the ugarchsim function.

References

  1. Arago, Vicent, and Angeles Fernandez-Izquierdo. 2003. GARCH models with changes in variance: An approximation to risk measurements. Journal of Asset Management 4: 277–87. [Google Scholar] [CrossRef]
  2. Ardia, David, Kris Boudt, and Leopoldo Catania. 2019. Generalized autoregressive score models in R: The GAS package. Journal of Statistical Software 88: 1–28. [Google Scholar] [CrossRef]
  3. Ashour, Samir K., and Mahmood A. Abdel-hameed. 2010. Approximate skew normal distribution. Journal of Advanced Research 1: 341–50. [Google Scholar] [CrossRef]
  4. Azzalini, Adelchi. 1985. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics 12: 171–78. [Google Scholar]
  5. Azzalini, Adelchi, and Antonella Capitanio. 2003. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. Journal of the Royal Statistical Society. Series B: Statistical Methodology 65: 367–89. [Google Scholar] [CrossRef]
  6. Baiocchi, Giovanni, and Walter Distaso. 2003. GRETL: Econometric software for the GNU generation. Journal of Applied Econometrics 18: 105–10. [Google Scholar] [CrossRef]
  7. Barndorff-Nielsen, Ole E., Thomas Mikosch, and Sidney I. Resnick. 2013. Lévy Processes: Theory and Applications. Boston: Birkhäuser/Springer Science+Business Media. [Google Scholar] [CrossRef]
  8. Bollerslev, Tim. 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31: 307–27. [Google Scholar] [CrossRef]
  9. Bollerslev, Tim. 1987. A conditionally heteroskedastic time series model for speculative prices and rates of return. The Review of Economics and Statistics 69: 542–47. [Google Scholar] [CrossRef]
  10. Bollerslev, Tim, and Jeffrey M. Wooldridge. 1992. Quasi-maximum likelihood estimation and inference in dynamic models with time-varying covariances. Econometric Reviews 11: 143–72. [Google Scholar] [CrossRef]
  11. Branco, Márcia D., and Dipak K. Dey. 2001. A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis 79: 99–113. [Google Scholar] [CrossRef]
  12. Bratley, Paul, Bennett L. Fox, and Linus E. Schrage. 2011. A Guide to Simulation, 2nd ed. New York: Springer Science+Business Media. [Google Scholar] [CrossRef]
  13. Buccheri, Giuseppe, Giacomo Bormetti, Fulvio Corsi, and Fabrizio Lillo. 2021. A score-driven conditional correlation model for noisy and asynchronous data: An application to high-frequency covariance dynamics. Journal of Business and Economic Statistics 39: 920–36. [Google Scholar] [CrossRef]
  14. Chalmers, Phil. 2019. Introduction to Monte Carlo Simulations with Applications in R Using the SimDesign Package. pp. 1–46. Available online: philchalmers.github.io/SimDesign/pres.pdf (accessed on 28 August 2023).
  15. Chalmers, R. Philip, and Mark C. Adkins. 2020. Writing effective and reliable Monte Carlo simulations with the SimDesign package. The Quantitative Methods for Psychology 16: 248–80. [Google Scholar] [CrossRef]
  16. Chib, Siddhartha. 2015. Monte Carlo Methods and Bayesian Computation: Overview. Amsterdam: Elsevier, pp. 763–67. [Google Scholar] [CrossRef]
  17. Chou, Ray Yeutien. 1988. Volatility persistence and stock valuations: Some empirical evidence using GARCH. Journal of Applied Econometrics 3: 279–94. [Google Scholar] [CrossRef]
  18. Cottrell, Allin, and Riccardo J. Lucchetti. 2023. Gnu regression, econometrics, and time-series library. In Gretl User’s Guide. Boston: Free Software Foundation, pp. 1–486. [Google Scholar]
  19. Creal, Drew, Siem Jan Koopman, and André Lucas. 2013. Generalized autoregressive score models with applications. Journal of Applied Econometrics 28: 777–95. [Google Scholar] [CrossRef]
  20. Danielsson, Jon. 2011. Financial Risk Forecasting: The Theory and Practice of Forecasting Market Risk with Implementation in R and Matlab. Chichester: John Wiley & Sons. [Google Scholar]
  21. Datastream. 2021. Thomson Reuters Datastream. Available online: https://solutions.refinitiv.com/datastream-macroeconomic-analysis? (accessed on 17 June 2021).
  22. Ding, Zhuanxin, and Clive W. J. Granger. 1996. Modeling volatility persistence of speculative returns: A new approach. Journal of Econometrics 73: 185–215. [Google Scholar] [CrossRef]
  23. Ding, Zhuanxin, Clive W. J. Granger, and Robert F. Engle. 1993. A long memory property of stock market returns and a new model. Journal of Empirical Finance 1: 83–106. [Google Scholar] [CrossRef]
  24. Duda, Matej, and Henning Schmidt. 2009. Evaluation of Various Approaches to Value at Risk. Master’s thesis, Lund University, Lund, Sweden. Available online: https://lup.lub.lu.se/luur/download?func=downloadFile&recordOId=1436923&fileOId=1646971 (accessed on 28 August 2023).
  25. Eling, Martin. 2014. Fitting asset returns to skewed distributions: Are the skew-normal and skew-student good models? Insurance: Mathematics and Economics 59: 45–56. [Google Scholar] [CrossRef]
  26. Engle, Robert F. 1982. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50: 987–1008. [Google Scholar] [CrossRef]
  27. Engle, Robert F., and Magdalena E. Sokalska. 2012. Forecasting intraday volatility in the US equity market: Multiplicative component GARCH. Journal of Financial Econometrics 10: 54–83. [Google Scholar] [CrossRef]
  28. Engle, Robert F., and Tim Bollerslev. 1986. Modelling the persistence of conditional variances. Econometric Reviews 5: 1–50. [Google Scholar]
  29. Engle, Robert F., and Victor K. Ng. 1993. Measuring and testing the impact of news on volatility. The Journal of Finance 48: 1749–78. [Google Scholar] [CrossRef]
  30. Fan, Jianqing, Lei Qi, and Dacheng Xiu. 2014. Quasi-maximum likelihood estimation of GARCH models with heavy-tailed likelihoods. Journal of Business and Economic Statistics 32: 178–91. [Google Scholar] [CrossRef]
  31. Feng, Lingbing, and Yanlin Shi. 2017. A simulation study on the distributions of disturbances in the GARCH model. Cogent Economics and Finance 5: 1355503. [Google Scholar] [CrossRef]
  32. Fisher, Thomas J., and Colin M. Gallagher. 2012. New weighted portmanteau statistics for time series goodness of fit testing. Journal of the American Statistical Association 107: 777–87. [Google Scholar] [CrossRef]
  33. Foote, William G. 2018. Financial Engineering Analytics: A Practice Manual Using R. Available online: https://bookdown.org/wfoote01/faur/ (accessed on 28 August 2023).
  34. Francq, Christian, and Jean Michel Zakoïan. 2004. Maximum likelihood estimation of pure GARCH and ARMA-GARCH processes. Bernoulli 10: 605–37. [Google Scholar] [CrossRef]
  35. Francq, Christian, and Le Quyen Thieu. 2019. QML inference for volatility models with covariates. Econometric Theory 35: 37–72. [Google Scholar] [CrossRef]
  36. Geweke, John. 1986. Comment on: Modelling the persistence of conditional variances. Econometric Reviews 5: 57–61. [Google Scholar] [CrossRef]
  37. Ghalanos, Alexios. 2018. Introduction to the Rugarch Package. (Version 1.3-8). Available online: mirrors.nic.cz/R/web/packages/rugarch/vignettes/Introduction_to_the_rugarch_package.pdf (accessed on 28 August 2023).
  38. Ghalanos, Alexios. 2022. Rugarch: Univariate GARCH Models. R Package Version 1.4-7. Available online: https://cran.r-project.org/web/packages/rugarch/rugarch.pdf (accessed on 28 August 2023).
  39. Gilli, Manfred, D. Maringer, and Enrico Schumann. 2019. Generating Random Numbers. Cambridge: Academic Press, pp. 103–32. [Google Scholar] [CrossRef]
  40. Glosten, Lawrence R., Ravi Jagannathan, and David E. Runkle. 1993. On the relation between the expected value and the volatility of the nominal excess return on stocks. The Journal of Finance 48: 1779–801. [Google Scholar] [CrossRef]
  41. Hallgren, Kevin A. 2013. Conducting simulation studies in the R programming environment. Tutorials in Quantitative Methods for Psychology 9: 43–60. [Google Scholar] [CrossRef]
  42. Harwell, Michael. 2018. A strategy for using bias and RMSE as outcomes in Monte Carlo studies in statistics. Journal of Modern Applied Statistical Methods 17: 1–16. [Google Scholar] [CrossRef]
  43. Hentschel, Ludger. 1995. All in the family: Nesting symmetric and asymmetric GARCH models. Journal of Financial Economics 39: 71–104. [Google Scholar] [CrossRef]
  44. Heracleous, Maria S. 2007. Sample Kurtosis, GARCH-t and the Degrees of Freedom Issue. EUI Working Papers. Florence: European University Institute, pp. 1–22. Available online: http://hdl.handle.net/1814/7636 (accessed on 28 August 2023).
  45. Higgins, Matthew L., and Anil K. Bera. 1992. A class of nonlinear ARCH models. International Economic Review 33: 137–58. [Google Scholar] [CrossRef]
  46. Hilary, Term. 2002. Descriptive Statistics for Research. Available online: https://www.stats.ox.ac.uk/pub/bdr/IAUL/Course1Notes2.pdf (accessed on 28 August 2023).
  47. Hoga, Yannick. 2022. Extremal dependence-based specification testing of time series. Journal of Business and Economic Statistics, 1–14. [Google Scholar] [CrossRef]
  48. Hyndman, Rob J., and Yeasmin Khandakar. 2008. Automatic time series forecasting: The forecast package for R. Journal of Statistical Software 27: 1–22. [Google Scholar] [CrossRef]
  49. Javed, Farrukh, and Panagiotis Mantalos. 2013. GARCH-type models and performance of information criteria. Communications in Statistics: Simulation and Computation 42: 1917–33. [Google Scholar] [CrossRef]
  50. Kim, Su Young, David Huh, Zhengyang Zhou, and Eun Young Mun. 2020. A comparison of Bayesian to maximum likelihood estimation for latent growth models in the presence of a binary outcome. International Journal of Behavioral Development 44: 447–57. [Google Scholar] [CrossRef]
  51. Kleijnen, Jack P. C. 2015. Design and Analysis of Simulation Experiments, 2nd ed. New York: Springer, vol. 230. [Google Scholar] [CrossRef]
  52. Koopman, Siem Jan, André Lucas, and Marcin Zamojski. 2017. Dynamic Term Structure Models with Score Driven Time Varying Parameters: Estimation and Forecasting. NBP Working Paper No. 258. Warszawa: Narodowy Bank Polski, Education & Publishing Department, Poland. Available online: https://static.nbp.pl/publikacje/materialy-i-studia/258_en.pdf (accessed on 28 August 2023).
  53. Lee, Gary J., and Robert F. Engle. 1999. A Permanent and Transitory Component Model of Stock Return Volatility. In Cointegration, Causality and Forecasting: A Festschrift in Honor of Clive W. J. Granger. New York: Oxford University Press, pp. 475–97. Available online: https://scirp.org/reference/referencespapers.aspx?referenceid=1232518 (accessed on 28 August 2023).
  54. Lee, Yen-Hsien, and Tung-Yueh Pai. 2010. REIT volatility prediction for skew-GED distribution of the GARCH model. Expert Systems with Applications 37: 4737–41. [Google Scholar] [CrossRef]
  55. Li, Qianru. 2008. Three Essays on Stock Market Volatility. All Graduate Theses and Dissertations, Spring 1920 to Summer 2023. p. 308. Available online: http://digitalcommons.usu.edu/etd/308/ (accessed on 28 August 2023).
  56. Lin, Chu Hsiung, and Shan Shan Shen. 2006. Can the student-t distribution provide accurate value at risk? Journal of Risk Finance 7: 292–300. [Google Scholar] [CrossRef]
  57. Maciel, Leandro dos Santos, and Rosangela Ballini. 2017. Value-at-risk modeling and forecasting with range-based volatility models: Empirical evidence. Revista Contabilidade e Financas 28: 361–76. [Google Scholar] [CrossRef]
  58. McNeil, Alexander J., and Rudiger Frey. 2000. Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach. Journal of Empirical Finance 7: 271–300. [Google Scholar] [CrossRef]
  59. Mooney, Christopher Z. 1997. Monte Carlo Simulation, 1st ed. Thousand Oaks: SAGE Publications, vol. 116. [Google Scholar]
  60. Morris, Tim P., Ian R. White, and Michael J. Crowther. 2019. Using simulation studies to evaluate statistical methods. Statistics in Medicine 38: 2074–102. [Google Scholar] [CrossRef] [PubMed]
  61. Nelson, Daniel B. 1991. Conditional heteroskedasticity in asset returns: A new approach. Econometrica 59: 347–70. [Google Scholar] [CrossRef]
  62. Pantula, Sastry G. 1986. Comment: Modelling the persistence of conditional variances. Econometric Reviews 5: 71–74. [Google Scholar] [CrossRef]
  63. Pfaff, Bernhard. 2016. Modelling Volatility. Hoboken: John Wiley & Sons, Ltd., pp. 116–32. [Google Scholar]
  64. Pourahmadi, Mohsen. 2007. Construction of skew-normal random variables: Are they linear combinations of normal and half-normal? Journal of Statistical Theory and Applications 3: 314–28. [Google Scholar]
  65. Qiu, Debin. 2015. aTSA: Alternative Time Series Analysis. Available online: https://cran.r-project.org/web/packages/aTSA/aTSA.pdf (accessed on 28 August 2023).
  66. Samiev, Sarvar. 2012. GARCH (1,1) with Exogenous Covariate for EUR/SEK Exchange Rate Volatility: On the Effects of Global Volatility Shock on Volatility. Master’s thesis, Umea University, Umea, Sweden. Available online: https://www.diva-portal.org/smash/get/diva2:676106/FULLTEXT01.pdf (accessed on 28 August 2023).
  67. Schwert, G. William. 1990. Stock volatility and the crash of ’87. The Review of Financial Studies 3: 77–102. [Google Scholar]
  68. Shahriari, Siroos, S. A. Sisson, and Taha Rashidi. 2023. Copula ARMA-GARCH modelling of spatially and temporally correlated time series data for transportation planning use. Transportation Research Part C 146: 103969. [Google Scholar] [CrossRef]
  69. Sigal, Matthew J., and Philip R. Chalmers. 2016. Play it again: Teaching statistics with Monte Carlo simulation. Journal of Statistics Education 24: 136–56. [Google Scholar] [CrossRef]
  70. Silvennoinen, Annastiina, and Timo Teräsvirta. 2016. Testing constancy of unconditional variance in volatility models by misspecification and specification tests. Studies in Nonlinear Dynamics and Econometrics 20: 347–64. [Google Scholar] [CrossRef]
  71. Smith, Richard L. 2003. Statistics of extremes, with applications in environment, insurance, and finance. In Extreme Values in Finance, Telecommunications, and the Environment. Boca Raton: Chapman & Hall/CRC, pp. 20–97. [Google Scholar]
  72. Su, Jen Je. 2011. On the oversized problem of Dickey-Fuller-type tests with GARCH errors. Communications in Statistics: Simulation and Computation 40: 1364–72. [Google Scholar] [CrossRef]
  73. Søfteland, Andreas, and Glenn Stian Iversen. 2021. Applying GARCH-EVT-Copula Forecasting in Active Portfolio Management. Master’s thesis, NTNU, Trondheim, Norway. Available online: https://no.ntnu_inspera_82752696_84801861.pdf (accessed on 13 August 2023).
  74. Taylor, Stephen J. 1986. Modelling Financial Time Series, 2nd ed. Singapore: World Scientific Publishing Co. Pte. Ltd. [Google Scholar]
  75. Wang, Chao, Richard Gerlach, and Qian Chen. 2018. A semi-parametric realized joint value-at-risk and expected shortfall regression framework. arXiv arXiv:1807.02422. [Google Scholar]
  76. White, Halbert. 1982. Maximum likelihood estimation of misspecified models. Econometrica 50: 1–25. [Google Scholar] [CrossRef]
  77. Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy McGowan, Romain François, Garrett Grolemund, Alex Hayes, Lionel Henry, Jim Hester, and et al. 2019. Welcome to the Tidyverse. Journal of Open Source Software 4: 1–6. [Google Scholar] [CrossRef]
  78. Wuertz, Diethelm, Tobias Setz, Yohan Chalabi, Chris Boudt, Pierre Chausse, and Michal Miklovac. 2020. fGarch: Rmetrics - Autoregressive Conditional Heteroskedastic Modelling. R Package Version 3042.83.2. Available online: https://cran.r-project.org/web/packages/fGarch/fGarch.pdf (accessed on 28 August 2023).
  79. Yuan, Ke Hai, Xin Tong, and Zhiyong Zhang. 2015. Bias and efficiency for SEM with missing data and auxiliary variables: Two-stage robust method versus two-stage ML. Structural Equation Modeling: A Multidisciplinary Journal 22: 178–92. [Google Scholar] [CrossRef]
  80. Zakoian, Jean-Michel. 1994. Threshold heteroscedastic models. Journal of Economic Dynamics and Control 18: 931–55. [Google Scholar]
  81. Zeileis, Achim, and Gabor Grothendieck. 2005. Zoo: S3 infrastructure for regular and irregular time series. Journal of Statistical Software 14: 1–27. [Google Scholar]
  82. Zhang, Yidong Terrence. 2017. Volatility Forecasting with the Multifractal Model of Asset Returns. Gainesville: University of Florida, pp. 1–24. [Google Scholar] [CrossRef]
  83. Zivot, Eric. 2009. Practical Issues in the Analysis of Univariate GARCH Models. Berlin and Heidelberg: Springer, pp. 113–55. [Google Scholar] [CrossRef]
  84. Zivot, Eric. 2013. Univariate GARCH. Available online: https://faculty.washington.edu/ezivot/econ589/univariateGarch2012powerpoint.pdf (accessed on 28 August 2023).
Figure 1. Simulation design flowchart to determine suitable assumed innovations.
Figure 2. Panels (A,B) show the efficiency and N consistency in S1 and S2 for each seed pattern, while Panels (C,D) reveal the impacts of sample size on RMSE and SE under the assumed errors in S1 and S2.
Figure 3. TPR outcomes in Panels A(i) and B(i) for S1 and S2, respectively. The outcomes are clearly spread out in Panels A(ii) and B(ii) for S1 and S2. The dotted lines are the 95% (i.e., 0.95) nominal recovery levels.
Figure 4. The TPR outcomes against sample size are shown in Panels (A,B), while Panels (C,D) show the TPR outcomes against the MCS estimates.
Figure 5. Simulated returns (Panel A) and simulated volatility (Panel B) of the first three replicated series.
Figure 6. The impact of sample size on RMSE, SE and bias for the fGARCH(1,1)-Student’s t MCS modelling. The RMSE and SE are considerably N consistent, but bias is independent of N.
Figure 7. Panels (A,B) display the TPR outcomes, where the clustered outcomes in Panel (A) are clearly spread out in Panel (B). The dotted line is the 95% (i.e., 0.95) nominal recovery level.
Figure 8. EDA of price (panels (a,b,e,f)) and returns (panels (c,d,g,h)) for the SA Bond Index.
Figure 9. ARCH Portmanteau Q and Lagrange Multiplier tests.
Table 1. Outcomes of different simulation replicates. True parameters: α = 0.0931, β = 0.9059.

N      llk       RMSE α̂   Bias α̂    SE α̂    RMSE β̂   Bias β̂    SE β̂    RMSE α̂+β̂   Bias α̂+β̂   SE α̂+β̂
Panel A: Simulation run once
1000   −2020.5   0.0504   0.0328    0.0383   0.0551   −0.0443   0.0327   0.0719      −0.0115     0.0710
2000   −3813.8   0.0246   0.0046    0.0241   0.0462   −0.0374   0.0271   0.0608      −0.0327     0.0512
3000   −5734.2   0.0156   −0.0037   0.0152   0.0316   −0.0269   0.0166   0.0441      −0.0306     0.0317
Panel B: Simulation run with 2500 replications
1000   −2020.5   0.0504   0.0328    0.0383   0.0551   −0.0443   0.0327   0.0719      −0.0115     0.0710
2000   −3813.8   0.0246   0.0046    0.0241   0.0462   −0.0374   0.0271   0.0608      −0.0327     0.0512
3000   −5734.2   0.0156   −0.0037   0.0152   0.0316   −0.0269   0.0166   0.0441      −0.0306     0.0317
Panel C: Simulation run with 1000 replications
1000   −2020.5   0.0504   0.0328    0.0383   0.0551   −0.0443   0.0327   0.0719      −0.0115     0.0710
2000   −3813.8   0.0246   0.0046    0.0241   0.0462   −0.0374   0.0271   0.0608      −0.0327     0.0512
3000   −5734.2   0.0156   −0.0037   0.0152   0.0316   −0.0269   0.0166   0.0441      −0.0306     0.0317
Panel D: Simulation run with 300 replications
1000   −2020.5   0.0504   0.0328    0.0383   0.0551   −0.0443   0.0327   0.0719      −0.0115     0.0710
2000   −3813.8   0.0246   0.0046    0.0241   0.0462   −0.0374   0.0271   0.0608      −0.0327     0.0512
3000   −5734.2   0.0156   −0.0037   0.0152   0.0316   −0.0269   0.0166   0.0441      −0.0306     0.0317
Table 2. True model fGARCH(1,1)-Student’s t with true parameters α = 0.0748, β = 0.9243 and α + β = 0.9991.

Panel                N       α̂      β̂      α̂+β̂    llk         RMSE α̂  Bias α̂   SE α̂    RMSE β̂  Bias β̂   SE β̂    RMSE α̂+β̂  Bias α̂+β̂  SE α̂+β̂  TPR α̂+β̂ (95%)
A: Normal            8000    0.0835  0.9234  1.0069  −13,860.4   0.0390  0.0087   0.0380  0.0377  −0.0009  0.0377  0.0761     0.0078     0.0757   95.74%
                     9000    0.0790  0.9259  1.0049  −15,490.9   0.0297  0.0042   0.0294  0.0465  0.0015   0.0464  0.0760     0.0058     0.0758   95.55%
                     10,000  0.0803  0.9281  1.0085  −17,081.0   0.0088  0.0055   0.0069  0.0091  0.0038   0.0082  0.0178     0.0094     0.0151   95.89%
B: skew-Normal       8000    0.0834  0.9235  1.0069  −13,860.4   0.0385  0.0086   0.0375  0.0372  −0.0009  0.0372  0.0752     0.0078     0.0748   95.74%
                     9000    0.0792  0.9262  1.0055  −15,490.6   0.0138  0.0044   0.0131  0.0216  0.0019   0.0215  0.0352     0.0064     0.0346   95.61%
                     10,000  0.0801  0.9284  1.0085  −17,080.2   0.0085  0.0053   0.0066  0.0089  0.0041   0.0079  0.0173     0.0094     0.0145   95.89%
C: Student’s t       8000    0.0736  0.9226  0.9963  −13,337.1   0.0060  −0.0012  0.0058  0.0059  −0.0017  0.0056  0.0118     −0.0029    0.0115   94.73%
                     9000    0.0727  0.9279  1.0006  −14,912.0   0.0054  −0.0021  0.0050  0.0056  0.0036   0.0043  0.0094     0.0014     0.0093   95.14%
                     10,000  0.0735  0.9263  0.9999  −16,428.3   0.0043  −0.0013  0.0041  0.0035  0.0020   0.0028  0.0070     0.0008     0.0069   95.07%
D: skew-Student’s t  8000    0.0732  0.9225  0.9957  −13,337.1   0.0084  −0.0016  0.0083  0.0064  −0.0018  0.0062  0.0149     −0.0034    0.0145   94.68%
                     9000    0.0715  0.9262  0.9977  −14,912.2   0.0061  −0.0033  0.0051  0.0040  0.0019   0.0036  0.0088     −0.0014    0.0087   94.87%
                     10,000  0.0743  0.9277  1.0020  −16,428.4   0.0035  −0.0005  0.0034  0.0041  0.0034   0.0024  0.0065     0.0029     0.0058   95.27%
E: GED               8000    0.0770  0.9244  1.0014  −13,386.3   0.0079  0.0022   0.0076  0.0076  0.0001   0.0076  0.0153     0.0023     0.0152   95.22%
                     9000    0.0734  0.9266  1.0000  −14,966.3   0.0056  −0.0014  0.0054  0.0053  0.0023   0.0048  0.0103     0.0009     0.0103   95.09%
                     10,000  0.0753  0.9275  1.0028  −16,492.3   0.0036  0.0005   0.0035  0.0042  0.0032   0.0027  0.0073     0.0037     0.0062   95.35%
F: skew-GED          8000    0.0750  0.9221  0.9971  −13,386.2   0.0059  0.0002   0.0059  0.0059  −0.0022  0.0054  0.0115     −0.0020    0.0113   94.81%
                     9000    0.0734  0.9265  0.9999  −14,966.0   0.0055  −0.0014  0.0054  0.0054  0.0022   0.0049  0.0103     0.0008     0.0103   95.08%
                     10,000  0.0753  0.9275  1.0028  −16,492.3   0.0035  0.0006   0.0035  0.0040  0.0031   0.0025  0.0070     0.0037     0.0060   95.35%
G: GHYP              8000    0.0732  0.9234  0.9966  −13,336.3   0.0065  −0.0016  0.0063  0.0054  −0.0009  0.0053  0.0119     −0.0025    0.0116   94.76%
                     9000    0.0720  0.9279  0.9999  −14,911.4   0.0057  −0.0028  0.0050  0.0056  0.0036   0.0043  0.0093     0.0008     0.0093   95.08%
                     10,000  0.0729  0.9265  0.9994  −16,427.7   0.0045  −0.0019  0.0040  0.0035  0.0022   0.0027  0.0067     0.0003     0.0067   95.03%
H: NIG               8000    0.0731  0.9229  0.9961  −13,343.3   0.0058  −0.0017  0.0056  0.0057  −0.0014  0.0055  0.0115     −0.0031    0.0111   94.71%
                     9000    0.0719  0.9275  0.9994  −14,919.7   0.0059  −0.0029  0.0052  0.0053  0.0031   0.0043  0.0095     0.0003     0.0095   95.03%
                     10,000  0.0729  0.9266  0.9995  −16,438.1   0.0045  −0.0019  0.0041  0.0034  0.0023   0.0025  0.0066     0.0004     0.0066   95.04%
I: GHST              8000    0.0711  0.9218  0.9930  −13,435.0   0.0067  −0.0036  0.0056  0.0070  −0.0025  0.0065  0.0135     −0.0062    0.0121   94.42%
                     9000    0.0699  0.9261  0.9960  −15,027.3   0.0071  −0.0049  0.0051  0.0049  0.0018   0.0046  0.0102     −0.0031    0.0097   94.71%
                     10,000  0.0734  0.9266  0.9999  −16,569.1   0.0038  −0.0014  0.0035  0.0034  0.0022   0.0026  0.0061     0.0008     0.0061   95.08%
J: JSU               8000    0.0731  0.9232  0.9963  −13,337.1   0.0057  −0.0017  0.0055  0.0057  −0.0011  0.0055  0.0114     −0.0028    0.0110   94.74%
                     9000    0.0719  0.9277  0.9996  −14,912.4   0.0057  −0.0029  0.0050  0.0053  0.0033   0.0042  0.0091     0.0005     0.0091   95.04%
                     10,000  0.0727  0.9264  0.9991  −16,429.3   0.0045  −0.0020  0.0040  0.0033  0.0020   0.0026  0.0066     0.0000     0.0066   95.00%
Table 3. ARMA(1,1)-fGARCH(1,1) models’ empirical outcomes on SA Bond return data.

                        Panel A      Panel B       Panel C       Panel D            Panel E
                        Normal       skew-Normal   Student’s t   skew-Student’s t   GED
μ̂                      0.0164 ***   0.0078 *      0.0387 *      0.0177             0.0378 **
ω̂                      0.0323       0.0278        0.0297 *      0.0270 *           0.0311 *
α̂                      0.0701 *     0.0670 *      0.0690 *      0.0661 *           0.0700 *
β̂                      0.9093 *     0.9170 *      0.9188 *      0.9236 *           0.9137 *
λ̂11                    0.2504 *     0.2344 *      0.3499 *      0.3445 *           0.2879 *
λ̂21                    0.2245       0.2209 ***    0.0729        0.0943             0.1445 *
γ̂ = δ̂                 1.4550 *     1.4233 *      1.2362 *      1.2058 *           1.3436 *
Persistence (P̂)        0.9794       0.9825        0.9764        0.9792             0.9762
WLB (5)                 0.3227       0.8383        0.9103        1.6060             1.3361
p-value (5)             (1.0000)     (1.0000)      (1.0000)      (0.9955)           (0.9995)
ARCH LM statistic (7)   3.0979       3.1854        3.8897        4.1266             3.4264
p-value (7)             (0.4953)     (0.4793)      (0.3627)      (0.3287)           (0.4369)
AP-GoF                  87.2         64.56         42.32         18.48              53.68
p-value                 (0.0000)     (0.0000)      (0.0016)      (0.4908)           (0.0000)
Log-likelihood          −8909.189    −8886.553     −8803.012     −8790.528          −8825.745
AIC                     3.1862       3.1785        3.1486        3.1445             3.1568
BIC                     3.1969       3.1903        3.1605        3.1576             3.1686
SIC                     3.1862       3.1785        3.1486        3.1445             3.1567
HQIC                    3.1899       3.1826        3.1528        3.1491             3.1609
Run-time (seconds)      4.3245       6.6636        7.6463        11.9177            9.1407

                        Panel F      Panel G       Panel H       Panel I            Panel J
                        skew-GED     GHYP          NIG           GHST               JSU
μ̂                      0.0157       0.0156        0.0155        −0.0062            0.0159
ω̂                      0.0273 *     0.0267 *      0.0261 *      0.0251 *           0.0265 *
α̂                      0.0665 *     0.0661 *      0.0657 *      0.0650 *           0.0658 *
β̂                      0.9206 *     0.9241 *      0.9246 *      0.9284 *           0.9243 *
λ̂11                    0.2823 *     0.3370 *      0.3341 *      0.3202 *           0.3378 *
λ̂21                    0.1592 **    0.0942        0.0964        0.1163 **          0.0949
γ̂ = δ̂                 1.3048 *     1.2086 *      1.2171 *      1.1942 *           1.2102 *
Persistence (P̂)        0.9797       0.9795        0.9800        0.9826             0.9796
WLB (5)                 2.5350       1.5990        1.8260        2.5920             1.7170
p-value (5)             (0.7599)     (0.9957)      (0.9822)      (0.7277)           (0.9906)
ARCH LM statistic (7)   3.6331       4.0705        4.0249        4.2354             4.0750
p-value (7)             (0.4026)     (0.3365)      (0.3430)      (0.3139)           (0.3359)
AP-GoF                  46.18        17.01         22.23         29.37              21.66
p-value                 (0.0005)     (0.5890)      (0.2730)      (0.0604)           (0.3013)
Log-likelihood          −8810.111    −8790.079     −8793.107     −8800.387          −8791.112
AIC                     3.1515       3.1447        3.1454        3.1480             3.1447
BIC                     3.1646       3.1589        3.1585        3.1611             3.1578
SIC                     3.1515       3.1447        3.1454        3.1480             3.1447
HQIC                    3.1561       3.1497        3.1500        3.1526             3.1493
Run-time (seconds)      19.0058      49.9461       20.7803       16.8525            10.6434
Note: “*”, “**” and “***” denote significance at the 1%, 5% and 10% levels, respectively. p-values are given in round brackets, while “(5)” and “(7)” denote lags 5 and 7, respectively. The AP-GoF (for group 20) is the Adjusted Pearson Goodness-of-Fit test, and WLB denotes the Weighted Ljung–Box test.
Table 4. ARMA(1,1)-apARCH(1,1) models’ empirical outcomes on SA Bond data.

                  Panel A      Panel B       Panel C       Panel D            Panel E
                  Normal       skew-Normal   Student’s t   skew-Student’s t   GED
Log-likelihood    −8910.136    −8887.475     −8803.200     −8790.782          −8826.007
AIC               3.1862       3.1784        3.1483        3.1443             3.1565
BIC               3.1957       3.1891        3.1590        3.1561             3.1671
SIC               3.1862       3.1784        3.1483        3.1443             3.1565
HQIC              3.1895       3.1822        3.1521        3.1484             3.1602

                  Panel F      Panel G       Panel H       Panel I            Panel J
                  skew-GED     GHYP          NIG           GHST               JSU
Log-likelihood    −8810.472    −8790.315     −8793.329     −8802.039          −8791.340
AIC               3.1513       3.1444        3.1452        3.1483             3.1445
BIC               3.1631       3.1575        3.1570        3.1601             3.1563
SIC               3.1513       3.1444        3.1452        3.1483             3.1445
HQIC              3.1554       3.1490        3.1493        3.1524             3.1486
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
