The decision to carry out this study was made during the spectrum measurement campaign held in Bogotá, Colombia; the measurements used here come from a spectrum occupancy study carried out previously [1,32]. The band analyzed was GSM 850 MHz, because it is in constant use and can be analyzed as a function of time with conventional equipment such as a spectrum analyzer. The measurements used in this study span one week, from 23 December to 29 December 2012. Some studies [33] indicate that, without any a priori information about a band, a reasonable way to obtain representative data is to use measurement periods of at least 24 h, so that the occupancy of bands with temporal patterns is neither under- nor overestimated. While a 24-h measurement period could be considered adequate to characterize the activity of a given spectrum band [34], in this research 7 days were analyzed, covering the patterns of both workdays and weekends. Additionally, this period is sufficient to measure occupancy in mobile networks with low use, as indicated in [2,34].
The channels to be modeled were selected after measuring the duty cycles of 60 channels in the GSM band. From these, three channels with different occupancy levels (high, medium and low) were chosen.
Figure 1 presents the power measured over one week for three downlink channels with different power levels. The spectrum analyzer was configured for this band with a resolution bandwidth of 100 kHz and a sweep time of 290 ms, which guarantees detection of GSM signals with a bandwidth of 200 kHz. Daily duty cycles of the PUs on the selected channels are shown in Figure 2. The threshold (λ) used, which in this case is −89 dBm, was obtained from Equation (6) with a probability of false alarm (Pfa) of 1% [35]; it lies above the detected noise floor of −102 dBm [1]. In Equation (6), Γ(.) and Γ(. , .) are the complete and incomplete gamma functions, respectively, and m is the time-bandwidth product.
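As a minimal sketch of how a duty cycle can be derived from thresholded power samples, the following assumes that a sample counts as occupied when its power exceeds the −89 dBm threshold. The function name `duty_cycle` and the sample values are hypothetical and are not taken from the measurement campaign.

```python
def duty_cycle(power_dbm, threshold_dbm=-89.0):
    """Fraction of samples whose power exceeds the decision threshold."""
    if not power_dbm:
        raise ValueError("power_dbm must contain at least one sample")
    busy = sum(1 for p in power_dbm if p > threshold_dbm)
    return busy / len(power_dbm)


# Hypothetical samples in dBm (illustrative only, not measured data).
samples = [-101.5, -95.0, -88.2, -80.0, -102.3, -86.7, -99.1, -90.4]

# Three of the eight samples lie above -89 dBm.
print(duty_cycle(samples))
```

Applying the same function to each day of a channel would give daily duty cycles of the kind plotted in Figure 2.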
Figure 3, Figure 4 and Figure 5 present histograms of the distribution of opportunities by duration for the GSM band channels. These opportunities show an exponential behavior, whose approximate equations and coefficients of determination (R²) are given in each figure. Thus, the number of occurrences increases with channel occupancy, especially for the shorter time periods. For the low, medium and high occupancy channels, the total opportunity times were approximately 84 h, 81 h and 78 h, respectively, which indicates relatively low occupancy [2].
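The opportunity durations behind such histograms can be sketched as the lengths of consecutive idle runs in a binary occupancy sequence. This is only an illustration: the function name, the toy sequence, and the use of the 290 ms sweep time as the sample period are assumptions.

```python
from itertools import groupby


def opportunity_durations(occupied, sample_period_s):
    """Durations (in seconds) of consecutive idle runs in a 0/1 occupancy sequence."""
    return [
        sum(1 for _ in run) * sample_period_s
        for state, run in groupby(occupied)
        if state == 0
    ]


# Hypothetical occupancy sequence: 1 = channel busy, 0 = channel idle.
occupancy = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

# Assumption: one sample per analyzer sweep (290 ms).
durations = opportunity_durations(occupancy, 0.29)
print(durations)
print(sum(durations))  # total opportunity time in seconds
```

Binning these durations gives the histogram to which an exponential curve can then be fitted.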
Following this, we analyzed the time series of the measured channels over the week, which amounted to 1,062,514 samples. The ACF is presented first, in Figure 6. The ACF diagrams for the three channels alternate between positive and negative values while decaying to zero; the 95% confidence bounds are shown with blue lines. This pattern indicates that the series is correlated [2,26].
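For readers who want to reproduce this kind of diagram, a minimal sketch of the sample ACF and of the usual large-sample 95% bound for white noise (±1.96/√N) is given below. The function names and the toy series are hypothetical, and the bound is the common approximation, not necessarily the one used to draw Figure 6.

```python
import math


def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant; the ACF is undefined")
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def confidence_bound(n, z=1.96):
    """Approximate 95% bound for the ACF of a white-noise series of length n."""
    return z / math.sqrt(n)


# Toy alternating series (illustrative only): its ACF alternates in sign.
series = [1.0, -1.0] * 50
rho = acf(series, 3)
bound = confidence_bound(len(series))
significant = [k for k, r in enumerate(rho) if k > 0 and abs(r) > bound]
print(rho, bound, significant)
```

Lags whose autocorrelation falls outside the bound are the ones usually read as evidence of correlation.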
When analyzing the stationarity of the channels in Figure 6, it is observed that the mean and variance are constant and similar across the days from Monday to Friday. The weekend measurements are therefore excluded when training the analyzed models, because their mean and variance differ significantly from those of the Monday-to-Friday measurements.
3.1. Design of SARIMA Algorithm
Figure 7 presents the trend and seasonality of the occupancy level for the three channels. The seasonality has a period of 24 h, with practically no trend and with stationary components, which makes it possible to use a SARIMA model to forecast the behavior of the GSM channels [26].
The seasonal difference lag s, which in this case is set to five (∆₅), corresponds to the number of days of the week in which the signal was stationary [15]. Applying the augmented Dickey–Fuller test [36] to the Monday-to-Friday series of the three channels, the null hypothesis of a unit root is rejected, which indicates stationarity. To find the parameters of the SARIMA(p,d,q)(P,D,Q)ₛ model, the ACF and PACF were calculated for ∆₅ of the respective channels, as shown in Figure 8.
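The lag-s difference can be sketched as y_t = x_t − x_{t−s}. The example below is an assumption-laden illustration: the function name is hypothetical, and the toy series uses one value per weekday so that s = 5 spans one working week; how the lag maps onto the sample index of the measured series depends on its sampling.

```python
def seasonal_difference(x, s=5):
    """Return y_t = x_t - x_{t-s} for t = s, ..., len(x) - 1."""
    if s <= 0 or s >= len(x):
        raise ValueError("s must satisfy 0 < s < len(x)")
    return [x[t] - x[t - s] for t in range(s, len(x))]


# Hypothetical daily occupancy values for two working weeks (illustrative only).
week_values = [10, 12, 11, 13, 12, 11, 13, 12, 14, 13]
print(seasonal_difference(week_values, s=5))
```

The differenced series is s samples shorter than the input, which matters when aligning it with the original time axis.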
Using the Box-Jenkins methodology [23], Figure 8 shows that the PACF of ∆₅ decays to zero with a seasonal pattern and first crosses the confidence bound at lag 5 on the negative side. This suggests that a non-seasonal AR(1) term could be used and a seasonal MA(5) term could be added.
To avoid forecast overestimation (small variance and large errors), the Akaike information criterion (AIC) [37] was used to evaluate different reasonable combinations, as shown in Table 2. The selected models were SARIMA(1,0,5)(1,0,1)₅, SARIMA(1,0,5)(0,0,1)₅ and SARIMA(1,0,5)(0,0,1)₅ for the low, medium and high occupancy channels, respectively; the characteristic equations, in the same order, are:
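As a reading aid only, the structure implied by these model orders can be written in backshift notation. The symbols are introduced here for illustration and are assumptions: B is the backshift operator, y_t the series, ε_t a white-noise term, and φ, Φ, θ, Θ generic coefficients. Sign conventions differ between texts, any constant term is omitted, and no estimated coefficient values are implied.

```latex
\begin{align}
% Low occupancy channel: SARIMA(1,0,5)(1,0,1)_5
(1 - \phi_1 B)\,(1 - \Phi_1 B^{5})\, y_t
  &= \Bigl(1 + \sum_{i=1}^{5} \theta_i B^{i}\Bigr)\,(1 + \Theta_1 B^{5})\, \varepsilon_t \\
% Medium and high occupancy channels: SARIMA(1,0,5)(0,0,1)_5
(1 - \phi_1 B)\, y_t
  &= \Bigl(1 + \sum_{i=1}^{5} \theta_i B^{i}\Bigr)\,(1 + \Theta_1 B^{5})\, \varepsilon_t
\end{align}
```

The medium and high occupancy channels share the same structure and differ only in their fitted coefficients.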
3.2. Design of GARCH Algorithm
Detailed analysis of the large volume of acquired data revealed a non-constant standard deviation; therefore the GARCH algorithm was used to forecast the behavior of the measured series. The stochastic ARIMA and SARIMA models are univariate modeling methods. The main difference between those models and the GARCH model lies in the constant-variance assumption: ARIMA and SARIMA assume a constant variance, whereas GARCH models a variance that changes over time.
Even though the original signal from Monday to Friday is stationary, in this case the fifth difference is used because it yields a greater degree of stationarity. Figure 9 presents the difference for each channel. Channel measurements are converted into returns by a logarithmic transformation. The logarithmic returns are defined in Equation (10), where Pt is the power value at time t and Pt−1 is the power value at time t−1.
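Assuming the usual definition of a logarithmic return, r_t = ln(P_t / P_{t−1}), the transformation can be sketched as follows. The symbol r_t, the function name and the sample values are hypothetical, and whether the measured series is expressed in dBm or in linear units at this step is not stated, so the units here are an assumption.

```python
import math


def log_returns(power):
    """Logarithmic returns r_t = ln(P_t / P_{t-1}) of consecutive samples."""
    returns = []
    for prev, curr in zip(power, power[1:]):
        if prev == 0:
            raise ValueError("samples must be non-zero")
        ratio = curr / prev
        if ratio <= 0:
            raise ValueError("consecutive samples must have the same sign")
        returns.append(math.log(ratio))
    return returns


# Hypothetical power values (illustrative only).
power = [2.0, 4.0, 2.0, 3.0]
print(log_returns(power))
```

Because consecutive returns telescope, their sum equals ln(P_last / P_first), which is a quick sanity check on the transformed series.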
We now present formal statistical tests to establish the presence of correlation and ARCH effects in the data. H = 0 implies that there is no significant correlation, whereas H = 1 indicates that there is a significant correlation. In Table 3 and Table 4, all the p values show that the Ljung-Box-Pierce Q-test and the Engle ARCH test at lags 10, 15 and 20 are significant, revealing the presence of ARCH effects (heteroskedasticity) and indicating that GARCH modeling is appropriate.
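To make the first of these tests concrete, here is a minimal sketch of the Ljung-Box statistic, Q = n(n+2) Σ ρ_k² / (n−k), compared against the 95% chi-square critical value for 10 degrees of freedom (18.307). The function name and the toy series are hypothetical; the Engle ARCH test is not shown, and a statistics package would normally report p values directly.

```python
def ljung_box_q(x, lags):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..lags} rho_k^2 / (n - k)."""
    n = len(x)
    if lags >= n:
        raise ValueError("lags must be smaller than the series length")
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant; the statistic is undefined")
    total = 0.0
    for k in range(1, lags + 1):
        rho_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        total += rho_k * rho_k / (n - k)
    return n * (n + 2) * total


# 95% critical value of the chi-square distribution with 10 degrees of freedom.
CHI2_95_DF10 = 18.307

# Toy strongly correlated series (illustrative only).
toy = [1.0, -1.0] * 50
q10 = ljung_box_q(toy, 10)
h = 1 if q10 > CHI2_95_DF10 else 0  # H = 1: significant correlation at lag 10
print(q10, h)
```

Applying the same statistic to the squared returns is a common way to probe for heteroskedasticity alongside the Engle test.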
Dependence in the data x1,…, xn was determined by computing correlations, represented by the ACF.
If the time series is the result of a completely random phenomenon, the autocorrelation should be close to zero for all time-lag separations. Otherwise, one or more of the autocorrelations will be significantly different from zero. Another useful way to examine dependencies in the series is to inspect the PACF, where the dependence on intermediate elements (those within the lag) is eliminated. In Figure 10, the graphs of the ACF, the PACF and the ACF of the squared returns show the existence of correlation in the channel occupancy data.
Table 5, Table 6 and Table 7 below present the evaluation and selection of the GARCH model for each channel.
The GARCH model for each channel was selected by first checking the criterion that makes the model stationary, and then taking the values of MAE, MAPE and SMAPE closest to zero in Table 5, Table 6 and Table 7. The selected models for the low, medium and high channels are therefore GARCH(2,2), GARCH(0,2) and GARCH(0,1), respectively.
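The three error measures used for the selection can be sketched as follows. The definitions are the common textbook ones and are assumptions about the exact variants used, in particular for SMAPE, which has several forms in the literature. The function names and sample values are hypothetical.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error (in percent); actual values must be non-zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual, forecast):
    """Symmetric MAPE (in percent), using |a - f| / ((|a| + |f|) / 2)."""
    return 100.0 * sum(
        2.0 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)
    ) / len(actual)


# Hypothetical actual and forecast values (illustrative only).
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(mae(actual, forecast), mape(actual, forecast), smape(actual, forecast))
```

Among candidate models that pass the stationarity check, the one with the smallest values of these measures would be preferred.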
Parameters for the low channel model, GARCH(2,2), were estimated and are presented in Table 8; the stationarity criterion is fulfilled for this model.
Thus, the model according to Table 8 is presented in Equations (11) and (12),
For the medium channel, GARCH(0,2), the estimated model values are presented in Table 9.
Therefore, Equations (13) and (14) are obtained,
For the high channel, GARCH(0,1), the parameters shown in Table 10 were obtained.
The model is then described in Equations (15) and (16),
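As a generic illustration of what such variance equations compute, the sketch below evaluates a GARCH conditional variance recursion and the usual covariance-stationarity check (the sum of all ARCH and GARCH coefficients below one). Several things here are assumptions: the function names, the coefficient values (placeholders, not the estimates of Tables 8 to 10), the presample initialization with the unconditional variance, and the reading of GARCH(0,1) as a model with no lagged-variance term and one lagged squared-innovation term.

```python
def is_stationary(arch, garch):
    """Covariance-stationarity check: the sum of all coefficients is below one."""
    return sum(arch) + sum(garch) < 1.0


def conditional_variance(residuals, omega, arch, garch):
    """sigma2_t = omega + sum_i arch[i-1] * eps_{t-i}^2 + sum_j garch[j-1] * sigma2_{t-j}."""
    if not is_stationary(arch, garch):
        raise ValueError("coefficients violate the stationarity condition")
    # Presample values are replaced by the unconditional variance (an assumption).
    uncond = omega / (1.0 - sum(arch) - sum(garch))
    sigma2 = []
    for t in range(len(residuals)):
        value = omega
        for i, a in enumerate(arch, start=1):
            value += a * (residuals[t - i] ** 2 if t - i >= 0 else uncond)
        for j, b in enumerate(garch, start=1):
            value += b * (sigma2[t - j] if t - j >= 0 else uncond)
        sigma2.append(value)
    return sigma2


# Placeholder coefficients and innovations (illustrative only).
eps = [0.0, 0.0, 0.0]
variances = conditional_variance(eps, omega=0.1, arch=[0.2], garch=[0.3])
print(variances)

# A model with no lagged-variance term is handled by passing an empty list.
print(conditional_variance([0.5, -0.2], omega=0.1, arch=[0.4], garch=[]))
```

Dividing each innovation by the square root of its conditional variance gives the standardized innovations on which the residual diagnostics are based.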
ARCH-GARCH model analysis is based on the evaluation of standardized residuals [31]. One assumption of the GARCH model is that, for a good model, the residuals should follow a white noise process; that is, the residuals are expected to be random, independent and identically distributed, following a normal distribution.
Figure 11 presents the relationship between the innovations (residuals) derived from the fitted model, the corresponding conditional standard deviations and the returns.
Figure 11 shows that both innovations and returns exhibit variations. We next check whether fitting the GARCH model removes the autocorrelation of the standardized innovations, which would indicate the effectiveness of the GARCH model.
Figure 12 shows the autocorrelation of the squared standardized innovations, in which no correlation was observed.
Table 11 and Table 12 present the results of the Ljung-Box-Pierce Q-test and the Engle ARCH test applied to the standardized innovations. These tests indicate no presence of correlation or ARCH effects. The GARCH effects and the correlation between innovations found in the original data thus disappear after fitting the model. Therefore, the GARCH model is a proper model for explaining the variances of the three channels.
Normality was verified by analyzing the histograms of the residuals and the normal probability plot, as shown in Figure 13. The histograms of the three channels show that the residuals are normally distributed. In turn, the probability plot confirms that the residuals follow a normal distribution, since most of the data lie along the straight line.