Enhancing Integer Time Series Model Estimations through Neural Network-Based Fuzzy Time Series Analysis

Abstract: This investigation explores the effects of applying neural network-based fuzzy time series (FTSs) to the estimation of a variety of spectral functions in integer time series models, focusing on the new skew integer autoregressive model of order one (NSINAR(1)). To support this estimation, a dataset of NSINAR(1) realizations with a sample size of n = 1000 is generated. These input values are fuzzified via fuzzy logic. The strength of artificial neural networks at identifying fuzzy relationships is harnessed to improve prediction accuracy by generating output values. The study analyzes the improvement in the smoothing of spectral function estimators for NSINAR(1) obtained by using both the input and the output values. The effectiveness of the output-value estimates is evaluated against the input-value estimates through a mean-squared error (MSE) analysis, which quantifies how much better the output-value estimates perform.


Introduction
In recent years, a variety of FTS methodologies have emerged in the literature. An FTS can be defined as a time series that employs fuzzy sets to represent its observations. Ref. [1] provided the initial description of FTSs and outlined their fundamental definitions. Fundamentally, an FTS analysis comprises three consecutive steps: fuzzification, fuzzy relationship formation, and defuzzification. Numerous studies investigating these processes [2][3][4][5] have focused solely on the last step. Additionally, because the formation of fuzzy relationships is directly related to forecasting, many studies have concentrated on it [6][7][8][9][10]. Artificial neural networks appear to be quite good at identifying fuzzy relationships that increase forecasting accuracy. Generally, artificial neural network techniques have shown success in a wide range of applications, as in [11][12][13], and researchers often employ neural networks to build the fuzzy relationships of an FTS. Over time, there have been notable breakthroughs in the modeling of integer-valued time series, owing to the substantial attention given to this topic. The integer autoregressive (INAR) model is among the best for modeling count series. In recent years, researchers have been striving to improve the INAR model's capacity to reproduce observed data accurately. This effort has involved modifications to various aspects of INAR models: some adjust the marginal distribution (see [14][15][16][17]), others the order of the models, as in [18,19], and yet others the thinning operators, as in [20][21][22][23][24][25][26][27]. Ref. [28] investigates certain statistical metrics for a number of INAR(1) models. To match specific data and characterize it more accurately, [29] proposed an INAR(1) model based on two thinning operators. The fuzzy logic employed in [30] enhanced the estimation of all density functions. Ref. [31] conducted research on the FTS technique to enhance periodograms while preserving their statistical characteristics. Ref. [32] investigates a few statistical properties and all density functions for the ZTPINAR(1) process. Ref. [33] applies an FTS based on the Chen method to smooth estimates for the DCGINAR(1) process. Ref. [34] used a novel FTS technique to improve the estimation of stationary processes' unknown parameters obtained by traditional methods. Ref. [35] employed a fuzzy Markov chain to enhance the estimates of density functions for the MCGINAR(1) process.
Since it focuses on building fuzzy relationships that lead to forecasting, [36] was the first to apply neural networks to FTS prediction. Ref. [37] proposes a novel method for handling high-order multivariate fuzzy time series based on artificial neural networks. Ref. [38] develops a fuzzy time series model based on neural networks to enhance the forecasting of observations. Ref. [39] provides a new hybrid fuzzy time series technique that uses artificial neural networks for defuzzification and a fuzzy c-means (FCM) method for fuzzification. Ref. [40] creates and develops precise statistical forecasting models to predict the monthly API and assesses these models to track the state of air quality. In a distinctive approach, [41] created a PSNN to build fuzzy relationships in high-order FTSs and modified PSO to train the network's weights. By combining a convolutional neural network and an FTS method, [42] suggested a short-term load forecasting approach. Ref. [43] employed various techniques to forecast the air pollution index (API) of Kuala Lumpur, Malaysia, for the year 2017; these included artificial neural networks (ANNs), autoregressive integrated moving average (ARIMA), models with trigonometric regressors, Box-Cox transformation, ARMA errors, trend and seasonality (TBATS), and multiple fuzzy time series (FTS) models. Ref. [44] built an LSTM recurrent neural network based on trend fuzzy granulation for long-term time series forecasting. A novel multi-functional recurrent fuzzy neural network (MFRFNN) for time series prediction was developed in [45]. Ref. [46] suggested shallow and deep neural network models for demand forecasting with fuzzy time series pharmaceutical data.
The purpose of this research is to apply fuzzy time series (FTSs) based on neural network models [36] to enhance estimates of density functions for integer time series, namely the spectral, bispectral, and normalized bispectral density functions. The NSINAR(1) model is used in this strategy. All spectral functions and their smoothed estimates for NSINAR(1) are computed for this purpose. We use neural network-based FTSs to forecast realizations and thereby generate the output values for the NSINAR(1) observational "input values". All density functions are estimated with both input and output values. The contribution of the output values of the neural network-based FTS to the smoothing of these estimates is investigated by contrasting the two cases.

Developing the Forecasting Models
The following are the procedures for establishing a forecasting model: data preparation; neural network construction (in terms of input variable selection, transfer function, structure, etc.); and model evaluation and selection [47].

Preparing Data
A neural network forecaster must make a number of crucial decisions, including data preparation, input variable selection, network type and design, the transfer function, training methodology, and model validation, assessment, and selection. Some of these choices can be made while building the model, but others must be carefully thought through before beginning any modeling work. Given that neural networks operate on data-driven principles, data preparation is a pivotal initial step in crafting an effective neural network model. Indeed, the creation of a practical and predictive neural network model relies heavily on the availability of a robust, sufficient, and representative dataset. Consequently, the quality of the data significantly influences the reliability of neural network models. Moreover, ample data are indispensable for training neural networks. Numerous studies have used a practical ratio ranging from 70%:30% to 90%:10% for distinguishing between in-samples and out-of-samples [47]. As a result, we use NSINAR(1) to construct a series of size n = 1000. For our estimation (the in-sample), we chose the data from the first observation to the 800th, and for our forecast (the out-of-sample), we used the data from the 801st to the 1000th. In other words, the ratio is 800/1000 : 200/1000 = 80%:20%, which falls within that range.
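The in-sample/out-of-sample split above can be sketched minimally as follows; the Poisson placeholder series is purely illustrative, merely standing in for NSINAR(1) realizations of size n = 1000.

```python
import numpy as np

# Illustrative sketch of the 80%:20% split; the Poisson placeholder series
# stands in for NSINAR(1) realizations of size n = 1000.
rng = np.random.default_rng(0)
series = rng.poisson(3, size=1000)

in_sample = series[:800]      # observations 1..800, used for estimation
out_of_sample = series[800:]  # observations 801..1000, used for forecasting

ratio = (len(in_sample) / len(series), len(out_of_sample) / len(series))  # (0.8, 0.2)
```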

Setup of a Neural Network
The model most used in forecasting is a multilayer feedforward structure [48]. As a result, backpropagation [49] was selected as the model, and PC Neuron Institutional, a backpropagation software package, was selected as the tool for creating the forecasting model. In the setup, our goal was to first build (or train) the fuzzy associations between each FLR before forecasting. An FLR is a 1-1 relationship; hence, there is one input layer and one output layer with one node each. The majority of forecasting applications have employed only one hidden layer and a suitable number of hidden nodes, even though there are several criteria for selecting the number of hidden layers and of hidden nodes in each layer [50][51][52]. A minimal neural network was used [53] to avoid over-fitting. As a result, we employed one hidden layer with two hidden nodes, giving the neural network structure shown in Figure 1. Figure 2 shows the function of each node in the hidden and output layers. Each node r of the previous layer, like X_r in the diagram, provides an input to node s. A weight, W_rs, representing the strength of the connection, is attached to each connection from node r to node s. The following formula is used to calculate node s's output, Y_t [30]:
Y_t = f(∑_r W_rs X_r), where f(z) is a sigmoid function.
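A minimal sketch of this node computation, assuming the common logistic sigmoid f(z) = 1/(1 + e^(−z)) and purely illustrative weights:

```python
import math

def node_output(inputs, weights):
    """Output Y of a node s: a sigmoid applied to the weighted sum of the
    inputs X_r from the previous layer, with connection weights W_rs."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-z))  # f(z): logistic sigmoid (an assumption)

# One input node feeding two hidden nodes feeding one output node,
# matching the minimal 1-2-1 structure described above (weights are illustrative).
x = [0.4]
hidden = [node_output(x, [0.5]), node_output(x, [-1.2])]
y = node_output(hidden, [0.8, 0.3])
```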

Model Selection and Evaluation
A pertinent study [54] states that no single approach has proven to be consistently best for all kinds of applications, including neural network applications. Thus, we used a simple model as well as a hybrid one. The observations were divided into an in-sample and an out-of-sample category. The out-of-sample observations were further divided into known and unknown patterns: observations that appeared both outside and inside the sample were regarded as known patterns, in contrast to the unknown patterns, which constituted the majority of the out-of-sample data. We used the mean-squared error (MSE), with respect to the corresponding original functions, to compare the performance of the three estimators in each of the three scenarios. For details on the calculation of the MSE, please refer to [32,33]; this yields a separate evaluation for each of the three types of observations.
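The MSE comparison described above can be sketched as a simple grid-based error measure; the three-point arrays here are illustrative stand-ins for an estimated density function and its true values.

```python
def mse(estimate, truth):
    """Mean-squared error between an estimated function (sampled on a grid)
    and the true function values on the same grid."""
    assert len(estimate) == len(truth)
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)

# Illustrative comparison: the estimator closer to the true curve wins.
truth = [1.0, 2.0, 3.0]
err_a = mse([1.1, 2.1, 2.9], truth)
err_b = mse([1.5, 2.5, 3.5], truth)
```

A lower MSE marks the preferred estimator, which is how the input-value and output-value estimates are ranked later in the paper.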

Fuzzy Time Series Models Utilizing Neural Networks
This study applied a backpropagation neural network to forecast the fuzzy observations of the FTS of NSINAR(1). The study consists of six steps, which can be reported as follows:
1. Defining and partitioning the universe of discourse: According to the problem domain of the NSINAR(1) series, the universe of discourse for observations, U = [starting, ending], is defined. After the length of the intervals, l, is determined, U can be partitioned into equal-length intervals u_1, u_2, u_3, ..., u_b with corresponding midpoints m_1, m_2, m_3, ..., m_b, respectively.
2. Defining fuzzy sets for observations: Each linguistic observation, A_i, can be defined by the intervals u_1, u_2, u_3, ..., u_b.
3. Fuzzifying the observations: A fuzzy set is created by fuzzifying each observation. An observation is fuzzified to A_i if its greatest degree of membership is in A_i, just as in [6,55].
4. Establishing the fuzzy relationships (neural network training): The fuzzy associations in the FLRs were built (or trained) using a backpropagation neural network. For each FLR, A_i → A_j, index i served as the input and index j as the corresponding output. These FLRs became the input and output patterns for neural network training.
5. Forecasting: A description of the basic and hybrid models follows. The basic model uses a neural network methodology to forecast every piece of data, while the hybrid model uses the same neural network approach to forecast the known patterns together with a simple strategy to handle the unknown patterns.
Model 1 (basic model): Assume F(t − 1) = A_i′. We chose i′ as the forecast input in order to simplify the calculations. Assume that j′ is the neural network's output. We then say the fuzzy forecast is A_j′.
Model 2 (hybrid model): Assume F(t − 1) = A_i′. If A_i′ is a known pattern, the basic model is used to obtain the fuzzy forecast. If A_i′ is an unknown pattern, then we simply take A_i′ as the fuzzy forecast for F(t), in accordance with Chen's model [6].
6. Defuzzifying: No matter which model is used, the defuzzified forecast is always the midpoint of the fuzzy forecast. Assume A_k′ is the fuzzy prediction for F(t); the defuzzified forecast corresponds to the midpoint of A_k′.
For additional details on this methodology, see [36]. Therefore, the forecast observations obtained from the realizations generated by the NSINAR(1) process (the "input values") are the "output values" of this approach.
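The partitioning, fuzzification, and defuzzification steps above can be sketched as follows; the interval count b and the toy series are illustrative, and membership is taken as 1 inside an interval and 0 outside, so each observation maps to the interval containing it.

```python
import numpy as np

def partition(series, b):
    """Step 1: split the universe of discourse U = [min, max] into b
    equal-length intervals; return their boundaries and midpoints."""
    lo, hi = min(series), max(series)
    edges = np.linspace(lo, hi, b + 1)
    mids = (edges[:-1] + edges[1:]) / 2
    return edges, mids

def fuzzify(x, edges):
    """Step 3: map an observation to the index i of the interval u_i where its
    membership degree is greatest (here, simply the interval containing x)."""
    i = int(np.searchsorted(edges, x, side="right")) - 1
    return min(max(i, 0), len(edges) - 2)

def defuzzify(i, mids):
    """Step 6: the defuzzified forecast is the midpoint of the fuzzy set A_i."""
    return float(mids[i])

series = [2, 5, 7, 3, 9, 4]
edges, mids = partition(series, b=4)
idx = [fuzzify(x, edges) for x in series]   # integer indices feed the network
back = [defuzzify(i, mids) for i in idx]    # midpoints recover crisp values
```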

The New Skew INAR(1) Model
The NSINAR(1) process was first defined by [56]. Models of this kind, also known as true integer autoregressive models, have become necessary for modeling count data with positive and negative values. The NSINAR(1) process is defined as
X_t = β ⋆ X_{t−1} + η_t,
where β ⋆ X_{t−1} = β ∗ A_{t−1} − β ∘ B_{t−1} is the difference between the negative binomial thinning and binomial thinning operators, and the counting series β ∗ A_{t−1} and β ∘ B_{t−1} are independent random variables. Here β ∗ A = ∑_{i=1}^{A} W_i and β ∘ B = ∑_{i=1}^{B} U_i, where {W_i} is a sequence of i.i.d. geometric random variables and {U_i} is a sequence of i.i.d. Bernoulli random variables independent of {W_i}. Moreover, X_t = A_t − B_t, where A_t ∼ geometric(γ/(1+γ)) and B_t ∼ Poisson(λ), so {X_t} is a sequence of random variables having a geometric-Poisson(γ, λ) distribution. The innovation η_t has the distribution of ϵ_t − ε_t, where {ϵ_t} and {ε_t} are independent random variables, and X_t and η_{t−l} are independent for all l ≥ 1. The ε_t are i.i.d. random variables with a common Poisson(λ(1 − β)) distribution, and ϵ_t is a mixture of two random variables with geometric(γ/(1+γ)) and geometric(β/(1+β)) distributions. The process {X_t} is stationary when 0 ≤ β ≤ γ/(1+γ) and non-stationary otherwise; here, our study is restricted to the stationary case. Some properties of X_t, such as its mean, are given in [56]. Refer to [23] and [56] for detailed information on the thinning operator ⋆ and the NSINAR(1) model, respectively.
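A minimal sketch of the two thinning operators, assuming the usual conventions (each counting variable W_i geometric with mean β, each U_i Bernoulli(β)); the innovation structure of η_t is omitted, so this is not a full NSINAR(1) simulator.

```python
import numpy as np

rng = np.random.default_rng(42)

def nb_thin(beta, n):
    """Negative binomial thinning beta * n: sum of n i.i.d. geometric counting
    variables W_i (assumed here to have mean beta, success prob. 1/(1+beta))."""
    if n <= 0:
        return 0
    # numpy's geometric counts trials (>= 1); subtracting n converts each draw
    # to the number of failures before the first success, with mean beta
    return int(rng.geometric(1.0 / (1.0 + beta), size=n).sum()) - n

def bin_thin(beta, n):
    """Binomial thinning beta o n: sum of n i.i.d. Bernoulli(beta) variables U_i."""
    return int(rng.binomial(n, beta)) if n > 0 else 0

# The signed operator beta ⋆ X_{t-1} applied to X_{t-1} = A_{t-1} - B_{t-1}:
beta, a_prev, b_prev = 0.25, 6, 2
signed = nb_thin(beta, a_prev) - bin_thin(beta, b_prev)
```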

Spectral and Bispectral Density Functions
In time series analysis, statistical spectral analysis plays a number of roles, including estimation, hypothesis testing, data reduction, and description. It is quite helpful to examine a time series in both the time and frequency domains. The Fourier transform, which converts the series from the time domain to the frequency domain, is the fundamental tool of spectral analysis, and the frequency domain frequently contains a wealth of important information that requires further investigation to uncover. Second-order spectra have played a very important role in the analysis of linear and Gaussian time series data and in signal processing: when the process is Gaussian, the second-order spectra contain all the necessary and useful information about it, and higher-order spectra need not be considered. Before analyzing time series data, one usually assumes that the series is generated by a linear process, and often even by a Gaussian process, to simplify the analysis; the reason for assuming linearity is that fitting a linear model is far easier than fitting the actual non-linear model. But, in reality, this does not hold for every process. To study non-linear and non-Gaussian processes, one needs to consider higher-order spectra, since second-order spectra do not adequately characterize such a series (unless it is Gaussian). Theoretically, spectra of any order can be computed, but doing so is computationally very costly. Hence, we consider only the bispectrum, the simplest higher-order spectrum; bispectral analysis is the simplest type of higher-order spectral analysis. This leads to the bispectral density function, which provides important information regarding the nonlinearity of the process. For a linear non-Gaussian time series, the modulus of the normalized bispectrum is flat (constant).
The spectral density function (SDF), bispectral density function (BDF), and normalized bispectral density function (NBDF) of NSINAR(1) are provided in this section.
Theorem 1. If (8) is satisfied by {X_t}, then the SDF, denoted by g(w), is computed as in (9). Proof.
Theorem 2. If (8) is satisfied by {X_t}, then the BDF, denoted by g(w_1, w_2), is computed as in (11). Proof. We can write g(w_1, w_2) (see [17]) in the stated form. Applying the symmetry characteristics of the third-order cumulants (see [57]), and using the expressions of R_3(0, τ), R_3(τ, τ), and R_3(s, s + τ), we obtain the required summations. All these summations can be evaluated in the same manner; hence, after a few computations and summations, the proof is complete. The NBDF, represented by f(w_1, w_2), is calculated from g(w_1, w_2) and g(w), which are obtained by (11) and (9), respectively.
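Since the explicit formulas (9) and (11) are given in the original, the normalization step can only be sketched here under one common convention for the normalized bispectrum, |g(w_1, w_2)| divided by the square root of the product of the SDFs at w_1, w_2, and w_1 + w_2; the exact normalization used in the paper may differ.

```python
import numpy as np

def nbdf(bdf, sdf, w1, w2):
    """Normalized bispectral density under one common convention (an assumption):
    |g(w1, w2)| / sqrt(g(w1) * g(w2) * g(w1 + w2))."""
    return abs(bdf(w1, w2)) / np.sqrt(sdf(w1) * sdf(w2) * sdf(w1 + w2))

# For a linear process the NBDF modulus is flat; with constant toy stand-ins
# for the BDF and SDF, the normalized value is the same at every frequency pair:
v1 = nbdf(lambda u, v: 4.0, lambda u: 2.0, 0.3, 0.7)
v2 = nbdf(lambda u, v: 4.0, lambda u: 2.0, 1.1, 0.2)
```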
Figure 3 illustrates the generated observations from the NSINAR(1) process at λ = 3, β = 0.25, and γ = 5. Figures 4, 5, and 6 show, respectively, the SDF, BDF, and NBDF of NSINAR(1) at λ = 3, β = 0.25, and γ = 5. We infer from Figures 5 and 6 and the results in Tables 1 and 2 that the model is linear, since the NBDF values, which fall between (0.73, 0.78), are flatter (nearly constant, very tightly clustered) than the BDF values, which fall between (5, 20). The simulated series of the forecasted NSINAR(1) observations of the neural network-based FTS, in both the basic model and the hybrid model, are shown in Figure 7, which shows that the shape and properties of the time series are maintained by the forecasted observations. From Figures 3 and 7, one can deduce that the series is stationary and takes both positive and negative values, which agrees with the definition of the NSINAR(1) model.

Estimations of SDF, BDF, and NBDF
The smoothed periodogram technique with a lag window is utilized to estimate the spectral density functions; the lag window is one-dimensional in the case of the SDF and two-dimensional in the case of the BDF. Generally, if X_1, X_2, ..., X_N are the realizations of a real-valued third-order stationary process {X_t} with mean γ, autocovariance R_2(s), and third cumulant R_3(s_1, s_2), the smoothed spectrum, smoothed bispectrum, and smoothed normalized bispectrum are given by (see [28]) expressions involving R̂_2(t) and R̂_3(t_1, t_2), the natural estimators for R_2(t) and R_3(t_1, t_2), where s_1, s_2 ≥ 0, β = max(0, s_1, s_2), s = 0, ±1, ±2, ..., ±(N − 1), −π ≤ w_1, w_2 ≤ π, Φ(s) is a one-dimensional lag window, and Φ(s_1, s_2) = Φ(s_1 − s_2)Φ(s_1)Φ(s_2) is a two-dimensional lag window given by [58]. This section calculates estimates of the SDF, BDF, and NBDF by employing a smoothed periodogram approach that depends on these lag windows. The Daniell [59] window was chosen with two different numbers of frequencies, M = 7 and M = 9. These estimates are calculated for three scenarios: (i) the generated realizations from NSINAR(1), which serve as the FTS's "input values", and (ii) and (iii), the forecasted observations of the basic and hybrid models, which serve as the output values of the neural network-based FTS. Figures 8 and 9 illustrate the estimated SDF for the input and output values of the FTS using Daniell lag windows at M = 7 and M = 9, respectively. Figures 10 and 11

Discussion of Results
Comparing the input values for the FTS with the output values of the neural network-based FTS (in the case of the basic and hybrid models), and relying on the MSE shown at the top of each image, we discover the following:
• Figures 8-13 illustrate the preference for the hybrid model, followed by the basic model, over the input values in estimating the SDF, BDF, and NBDF, as the hybrid model's outputs had the lowest mean-squared error (refer to [32,33] for how to compute the MSE). In general, this indicates that the neural network-based FTS (whether used with the basic model or the hybrid model) significantly improved the smoothing of the density function estimates.

• The Daniell window at M = 7 performs better than that at M = 9 in estimating the SDF for all three situations (input values and output values for the basic and hybrid models), according to a comparison of Figures 8 and 9.
• The Daniell window at M = 7 performs better than that at M = 9 in estimating the BDF for all three situations (input values and output values for the basic and hybrid models), according to a comparison of Figures 10 and 11.
• The Daniell window at M = 7 performs better than that at M = 9 in estimating the NBDF for all three situations (input values and output values for the basic and hybrid models), according to a comparison of Figures 12 and 13.
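The one-dimensional lag-window smoothing compared above can be sketched as follows. The exact form of the Daniell lag window and the biased autocovariance estimator are assumptions following common practice; the Poisson series is an illustrative stand-in for NSINAR(1) data.

```python
import numpy as np

def acov(x, s):
    """Natural (biased) autocovariance estimator R2_hat(s)."""
    x = np.asarray(x, float)
    n, xbar, s = len(x), x.mean(), abs(s)
    return float(((x[:n - s] - xbar) * (x[s:] - xbar)).sum() / n)

def daniell(s, M):
    """One-dimensional Daniell lag window (form assumed here):
    sin(pi s / M) / (pi s / M), taken as 1 at s = 0."""
    if s == 0:
        return 1.0
    z = np.pi * s / M
    return float(np.sin(z) / z)

def smoothed_sdf(x, w, M=7):
    """Lag-window (smoothed periodogram) estimate of the SDF at frequency w."""
    n = len(x)
    total = acov(x, 0)
    for s in range(1, n):  # exploit symmetry R2(-s) = R2(s)
        total += 2.0 * daniell(s, M) * acov(x, s) * np.cos(w * s)
    return total / (2.0 * np.pi)

rng = np.random.default_rng(7)
x = rng.poisson(3, size=256)  # placeholder counts standing in for NSINAR(1) data
g7 = smoothed_sdf(x, 0.5, M=7)
g9 = smoothed_sdf(x, 0.5, M=9)
```

Varying M trades bias against variance, which is why the estimates at M = 7 and M = 9 are compared throughout the paper.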

Empirical Analysis
To create a series of size n = 1000, we employed NSINAR(1). Data ranging from the first observation to the 800th were utilized for our estimation (the in-sample), whereas data from the 801st to the 1000th were used for our forecast (the out-of-sample). Such observations were generated from NSINAR(1) fifteen times, so fifteen time series are available, and the proposed method was applied to each of them. The estimators of the spectral density functions were found in the following scenarios: for the observations produced by the NSINAR(1) model, and for the observations forecasted by the basic and hybrid models using the suggested methodology. The MSE was used to compare these estimators in the three scenarios, and the input values were compared with the respective performances of the two models. Based on the mean-squared errors (MSEs) of each of the fifteen series, we found that the hybrid model outperformed the basic model. Across all the series, the hybrid model outperformed the input values in 12 out of the 15 series, while the basic model outperformed the input values in 9 out of the 15 series.

Conclusions
A novel strategy was suggested for obtaining more precise smoothing estimates of the SDF, BDF, and NBDF for INAR(1) models. A neural network-based FTS was employed in this method. Accordingly, the SDF, BDF, and NBDF for a recognized process (NSINAR(1)) were computed. The observations produced by this process served as the FTS input values. To forecast the NSINAR(1) observations, two models were employed, the basic and hybrid models, both applying neural networks. These forecasted observations served as the output values of the neural network-based FTS. Both the input and output values were used for calculating the estimates of the SDF, BDF, and NBDF of NSINAR(1). By comparing all density functions with their estimates in each scenario, a definite improvement in estimation was found for the hybrid model over the basic model, as well as over the input values. As a result, the neural network-based FTS (via either the basic model or the hybrid model) helped further improve the smoothing of the INAR(1) estimates. Future research will attempt to improve the aforementioned technique using pi-sigma artificial neural networks, even though this requires far more work than the present method to achieve the best estimate smoothing compared with the findings reached here.

Figure 2.
Figure 2. A node in a neural network.

Figure 7.
Figure 7. The simulated series of the forecasted NSINAR(1) observations of the neural network-based FTS.
illustrate the estimated BDF for the input and output values of the FTS using Daniell lag windows at M = 7 and M = 9, respectively. Figures 12 and 13 illustrate the estimated NBDF for the input and output values of the FTS using Daniell lag windows at M = 7 and M = 9, respectively. Tables 3-5 display the estimated BDF obtained with the Daniell window when M = 7 for the input values of the FTS, as well as the output values of the neural network-based FTS (in the case of the basic and hybrid models).

Figure 8. Figure 9.
Figure 8. The SDF and estimated SDF obtained with the Daniell window when M = 7, both in the case of the input and output values of the FTS.

Figure 10. Figure 11. Figure 12. Figure 13.
Figure 10. Estimated BDF obtained with the Daniell window when M = 7, both in the case of the input and output values.

Table 3 .
Estimated BDF of NSINAR(1) using the Daniell window at M = 7 from the input values of the FTS.

Table 4 .
Estimated BDF of NSINAR(1) using the Daniell window at M = 7 from the output values of the neural network-based FTS "Model 1".

Table 5 .
Estimated BDF of NSINAR(1) using the Daniell window at M = 7 from the output values of the neural network-based FTS "Model 2".