Space–Time Kriging of Precipitation: Modeling the Large-Scale Variation with Model GAMLSS

: Knowing the dynamics of spatial–temporal precipitation distribution is of vital signiﬁcance for the management of water resources, in highlight, in the northeast region of Brazil (NEB). Several models of large-scale precipitation variability are based on the normal distribution, not taking into consideration the excess of null observations that are prevalent in the daily or even monthly precipitation information of the region under study. This research proposes a novel way of modeling the trend component by using an inﬂated gamma distribution of zeros. The residuals of this regression are generally space–time dependent and have been modeled by a space–time covariance function. The ﬁndings show that the new techniques have provided reliable and precise precipitation estimates, exceeding the techniques used previously. The modeling provided estimates of precipitation in nonsampled locations and unobserved periods, thus serving as a tool to assist the government in improving water management, anticipating society’s needs and preventing water crises.


Introduction
An effort has emerged in several areas of knowledge, especially in the climate sciences, to analyze a phenomenon by both space and time [1][2][3]. It is already established that the spatial-temporal behavior of rainfall is of great importance for the water resources of a region and that it has a direct influence on human activities, such as agriculture and commerce [4].
The northeast region of Brazil (NEB) is characterized by irregular distributions of rainfall, resulting in high spatial and temporal variability of precipitation [4]. In this region, it is common to observe high rainfall indices in a particular location and no record of rain in its surroundings [5]. The state of Paraíba, located in this region, presents the same characteristics, with periods of drought in rainy seasons and behavior different from the precipitation between the mesoregions that constitute the region [6]. Consequently, Paraíba presents a problem concerning the spatiotemporal variability of rainfall. The high variability of recorded rainfall in this region may be attributed to the characteristics of the meteorological systems and their dates of installation in different regions of the state. The periodicity is related to the general behavior of sea surface temperature (SST) in the tropical Pacific region, contributing to the events of El Niño and La Niña, and most notably, by the behavior of the SST of the SST in the tropical Atlantic region due to its proximity to the study area. In the literature, several studies have evaluated precipitation in this region using spatial statistics techniques [7,8]. However, these previous works did not take spatial and temporal dependence into account simultaneously.
Several previous works used multiple linear regression to model the deterministic component, such as the Gaussian model [9][10][11]. These works assumed that the variable response to be studied presented a behavior that could be correctly modeled by a Gaussian approach.
However, recent studies have evaluated non-Gaussian models, including generalized linear models (GLM), with variable response Gamma [12,13]. Once the Gamma model is assumed, the variable domain has to be strictly positive, and it does not include the value zero. For example, Menezes et al. (2016) [12] and Monteiro et al. (2017) [13] analyzed the concentration of NO2, which had asymmetric behavior with excesses of zero. However, when using GLM with Gamma distribution, it is not possible to model the frequency of zeroes in the variable under study. To circumvent this limitation, some models captured the frequency of zeroes in continuous variables. Stauffer et al. (2017) [3] analyzed the spatial-temporal distribution of daily precipitation at 117 stations located in Tirol, Austria, over 42 years. These authors pointed to the high frequency of zero observations in the data, suggesting the use of the normal zero-censored model. However, only the trend component was modeled, not taking into account the spatial-temporal dependence of precipitation in this region.
In this study, we analyzed the total monthly rainfall measured in the State of Paraíba, Brazil, which presents an asymmetrical behavior and has a high incidence of zeros, which is a climate characteristic of the region under study. Thus, to model this behavior, a new approach in the regression adjustment is proposed by applying generalized additive models for location, scale, and shape (GAMLSS) [14]. These models have several advantages, such as all distribution parameters can be modeled as covariate functions, cases of over-dispersion and excess of zeroes in the data can be modeled, and any distribution to the response variable can be adjusted.

Materials
The study region was Paraíba State, Brazil (Figure 1), located in the NEB, which is positioned between parallels 6° and 8° S and meridians 34° and 39° W, with an area of approximately 56,500 km². Paraíba is included in the Tropical Region, and its area is divided into four geographical mesoregions: Zona da Mata, Agreste, Borborema, and Sertão.   Table 1. The total monthly precipitation data measured at 54 pluviometric stations were considered during the period from 1994 to 2014. Table 1 shows the identification number, name of pluviometric station, latitude, longitude, altitude above sea level, and quantiles of precipitation (mm) at 2.5%, 50%, and 97.5%.

Methods
In space−time geostatistics, observations are modeled as a stochastic process using a random function Z(s, t) : (s, t) ∈ D ⊆ R d × R . We assumed that Z has the first and second moments. Thus, Z(s, t) was decomposed by the sum of the trend components and the stochastic residue: In Equation (1), the trend component m(s, t) = E[Z(s, t)] is not constant in space and time, representing the large-scale variation. The residual component ε(s, t) describes random fluctuations on a small scale and encompasses the three components: spatial, temporal, and interaction.

Trend Component
The trend component is responsible for the large-scale variation of the stochastic process. In spatiotemporal datasets, this component is not a deterministic function. It was, therefore, necessary to specify stochastic patterns that represented the observed variability. The specification of the trend component included the geographic coordinates as covariables, as well as time. The purpose of this study was to adjust this component using GAMLSS models. In the GAMLSS models, it is assumed that the observations z i are independent, with i = 1, 2, . . . , n and conditioned to θ k = (µ, σ, ν, τ), with k = 1, 2, . . . , n. So, Z θ k ∼ G(µ, σ, ν, τ) , where G represents Z distribution. The parameters µ, σ, ν, and τ are classified as the parameters of the location, scale, asymmetry, and kurtosis, respectively, where the latter two represent the distribution shape [14]. Thus, according to these authors, the GAMLSS models were expressed in the form, where g k (·) is a monotonic link function, and θ k is related to explanatory variables and random effects; θ k and η k are vectors of size n; β k is a parameter vector of size J k ; X k and Y jk are the fixed (covariate) matrices of size n × J k and n × q ij , respectively; and γ jk is a random variable of dimension q ij . In a purely parametric linear model, we have that J k j=1 Y jk γ jk = 0, that is, there are no additive terms associated with the distribution parameters in the model. In the space-time case, we have η(s, t) = g(m(s, t)).
In the modeling of the trend, the component was considered Y jk = I n , where I n is the identity matrix of order n × n and γ jk = f jk = f jk x jk . Thus, Equation (2) is rewritten as: The model given in Equation (3) was fitted to the data of this study, where f jk is an unknown function of the explanatory variable x jk and f jk x jk is a vector that evaluates the function f jk in x jk . This model is called the semiparametric additive GAMLSS and may contain parametric, nonparametric and random effects terms [14]. We considered the geographic coordinates (latitude and longitude) and the temporal index (time) as covariables to model the trend. This time index was constructed as follows: an index equal to 1 denoted Jan/1994, the index equal to 2 denoted Feb./1994, and so on. In all covariates, cubic splines (cs) were inserted, including the covariate time that was designed to model the seasonal effect of precipitation. Thus, the terms of the trend component were expressed as: where x 11 , x 12 , and x 13 represent the covariates of longitude, latitude, and time, respectively (Equation (4)). The parameters β 10 , β 11 , β 12 , and β 13 are related to the lease vector of the GAMLSS model, β 20 is the parameter related to the intercept for the scale vector, and β 30 is the parameter associated with covariable time and that is responsible for modeling the occurrence of no zeroes in the response variable.
In order to obtain the parameter estimates for this regression model, the penalized maximum likelihood method was used, which differs from conventional methods because it assigns the value zero to the coefficients of nonsignificant variables. However, this method does not take into account the assumption that errors are uncorrelated. Thus, it was necessary to evaluate the residuals of this model, taking into account the correlation structure. For this, we proposed to fit a valid nonseparable space-time variogram model.

Spatiotemporal Variogram
When inserting structures in the trend component, one advantage is that it describes the large-scale variation of the phenomenon being studied, causing the residuals to present small-scale variations, consequently reducing the prediction error in the kriging. After obtaining the estimates of the trend component parameters, the stochastic residue is obtained by difference,ε(s, t) = Z(s, t) −m(s, t). Therefore, the next step of the analysis was the empirical estimation of the covariance function or spatiotemporal variogram to model this residue [2]. Spatiotemporal covariates are usually described using a space-time variogram (γ st ), which measures the mean difference between separated data in the space-time domain using the distance vector. The estimate of this variogram, obtained by the moment's method, was given by: where L(r s , r t ) is the cardinality for the set L(r s , r t ). It is noted in Equation (5) that the empirical variogram depends only on spatial and temporal distances, and this may not result in a valid variogram. Thus, to obtain estimates ofγ st in any arbitrary lag, the empirical variogram must be smoothed. As a solution to this limitation found in the empirical variogram, one should use a theoretical model, γ st (h s , h t , ω), which fits, as best as possible, the space-time dependence structure of the residues [15]. The vector ω contains all the unknown parameters to be estimated. The typical approach is to fit a theoretical model to the empirical variogram. The least-squares estimation method is the most common approach. However, there are other methods, such as maximum likelihood, generalized estimation equations, and composite likelihood, among others [16]. In this study, the least-squares method was used.
There are no assumptions about the probability distribution of the empirical variogram's generated values in the least-squares method. The approach considers the statistical model of the formγ(h s , h t ) = γ(h s , h t ) + ε (h s , h t ). Assume that ε (h s , h t ) has null mean and covariance matrix R = R(ω), which depends on ω. We suggested that the diagonal of R(ω) can be approximated to: where N h s i , h t j is the number of pairs at each space-time distance. Thus, the weights assigned in the weighting matrix are proportional to the number of pairs in a given space-time distance. The adjustment of the theoretical space-time variogram model was made by using the weighted least-squares (WLS) method, which replaces R(ω) with the diagonal matrix W(ω). The elements of W(ω) were obtained from Equation (6). Thus, using the weights given in this Equation (7), the WLS method is given by: Generally, statistical assumptions are considered, which comprise a space-time covariance function and ensure that this function is defined as positive [17][18][19]. Several researchers have studied the construction of valid space-time covariance functions [15,20,21]. Although this type of construction is a difficult task, the complexity becomes more significant when it is desired to determine valid models of space-time covariance. In the space-time approach, there are two types of general model classes: the separable and the nonseparable.
The separable models assume that spatial and temporal processes are uncorrelated and that variograms are composed of purely spatial and purely temporal models. As an example, the space-time covariance function can be expressed as the product between purely spatial and purely temporal components. The major drawback of this type of model is the nonincorporation of the spatial-temporal interaction component [18]. However, the nonseparable models assume that the space-time phenomenon is correlated in space-time [22]. We used the generalized sum-product model, which belongs to the nonseparable covariance model class.
The stationary covariance model product-sum [17,23] is given by Assuming that ε(s, t) is second-order stationary, we have that: , in which, based on the assumption of second-order stationarity, we have that the product-sum covariance function is defined as: where k 1 > 0, k 2 ≥ 0 e are constants that ensure that the covariance function is positive definite (Equation (8)). C s (0) e C t (0) are the spatial and temporal thresholds, respectively. Applying the relation, γ(h s , h t ) = C(0, 0) − C(h s , h t ), we have the product-sum variogram. This model gives origin to the following relations: where k s and k t are the proportionality coefficients between the space-time variograms, γ s (h s , 0) and γ t (0, h t ), and the spatial and temporal variograms, γ s (h s ) and γ t (h t ), respectively. Therefore, modeling γ(h s , 0) is equivalent to modeling γ s (h s ). Likewise, estimating and modeling γ(0, h t ) is equal to estimating and modeling γ t (h t ). The two relations established in Equation (9) can be simplified by imposing three constraints: These constraints facilitate the estimation of the model parameters γ(h s , 0) and γ(0, h t ) using γ s (h s ) and γ t (h t ), for this it is necessary to determine k 1 , k 2 , and k 3 . These three coefficients of the model can be written in terms of the thresholds and parameters k s and k t [17]. In this particular case, this leads to modeling γ s (h s ) and γ t (h t ), using the models of γ(h s , 0) and γ(0, h t ), respectively. In addition, these two parameters were combined into a single parameter k, resulting in the generalized product-sum model (Equation (10)).
By combining the parameters k s and k t to form a single parameter k, the space-time variogram expression, γ(h s , h t ), was simplified to, where k = k 1 k s k t . In Equation (11), the parameter k is expressed as: We used the covariance generalized product-sum model, and the variogram function of this model is expressed by: where γ st (h s , 0) and γ st (0, h t ) are the marginal spatial and temporal variograms, respectively (Equation (12)). The great advantage of this type of model is that sample-based marginal adjustments are used, and only one global threshold parameter is incorporated for space-time interaction [24]. In Equation (13) the parameter k is positive and has an identity involving the global threshold, C st (0, 0) together with the spatial and temporal threshold, C s (0) and C t (0), respectively, given by:

Geostatistical Prediction
Once the trend component was determined and estimated and a valid space-time variogram model fitted, the next step consisted of the space-time kriging of the residuals using the ordinary kriging method. The variogram model is crucial in space-time kriging to calculate the best invariant linear predictor. Thus, it was necessary to obtain the space-time kriging equations, which provide the weights of the observations in such a predictor and the prediction variation that indicates the prediction accuracy [1].
Ordinary space-time kriging [15] consists of predicting the value ε(s 0 , t 0 ), at the nonsampled space-time point (s 0 , t 0 ), from the stochastic component ε(s, t 0 ). For this, the linear predictor, , is used, where λ i are the kriging weights obtained by imposing the condition that the prediction error expectation is zero and that it has minimum variance; that is, it is the best linear unbiased predictor (BLUP). To obtain the weights, we used the ordinary space-time kriging equations that were calculated in terms of the variogram: In the system of Equations (14), α is the Lagrange multiplier required to minimize the prediction variance. The prediction variance (Equation (15)) is given by: in which the final predictor (ẑ(s 0 , t 0 )) for the precipitation variable (Z), at the location (s 0 , t 0 ), is defined as:ẑ (s 0 , t 0 ) =m(s 0 , t 0 ) +ε(s 0 , t 0 ) wherem(s 0 , t 0 ) is the estimated value for the location (s 0 , t 0 ) obtained by adjusting GAMLSS models (Equation (16)). Figure 2 shows the flowchart of regression kriging with GAMLSS model. The interpolation performance in space-time kriging was evaluated using the leave-one-out method, which consisted of removing an observed point from the precipitation in space-time ( ( 0 , )) and then predicting its value (̂( 0 , )). This process was then repeated for all remaining points, and the residue (E) of this procedure was then obtained by the difference between the observed and predicted values at each location ( =̂( , ) − ( 0 , )). After that, the RMSE (root mean squared error), MAE (mean absolute error), and R² (coefficient of determination) were extracted as selection criteria.

Descriptive Analysis
The precipitation variable, measured monthly in each of the 54 locations, in the period 1994-2014, presented a median value of 29.05 mm and an interquartile range of 3.60-90.20 mm. In the whole historical series, approximately 20% of the observations presented 0.00 mm of total monthly precipitation. In order to evaluate the precipitation distribution of the studied region, a histogram was initially developed for the entire dataset ( Figure 3). The interpolation performance in space-time kriging was evaluated using the leave-one-out method, which consisted of removing an observed point from the precipitation in space-time (p(s 0 , t)) and then predicting its value (p(s 0 , t)). This process was then repeated for all remaining points, and the residue (E) of this procedure was then obtained by the difference between the observed and predicted values at each location (E =p(s 0 , t) − p(s 0 , t)). After that, the RMSE (root mean squared error), MAE (mean absolute error), and R 2 (coefficient of determination) were extracted as selection criteria.

Descriptive Analysis
The precipitation variable, measured monthly in each of the 54 locations, in the period 1994-2014, presented a median value of 29.05 mm and an interquartile range of 3.60-90.20 mm. In the whole historical series, approximately 20% of the observations presented 0.00 mm of total monthly precipitation. In order to evaluate the precipitation distribution of the studied region, a histogram was initially developed for the entire dataset ( Figure 3).
In Figure 3, it is possible to observe the excess of zeroes values in the dataset and, consequently, the presence of asymmetric behavior, indicating that the Gaussian model is not the most suitable for modeling.
Box plots were built to explore the behavior of precipitation at the 54 pluviometric stations (Figure 4). In Figure 3, it is possible to observe the excess of zeroes values in the dataset and, consequently, the presence of asymmetric behavior, indicating that the Gaussian model is not the most suitable for modeling.
Box plots were built to explore the behavior of precipitation at the 54 pluviometric stations ( Figure 4).   Table 1, in this study.
Due to the asymmetric behavior and the high variability of precipitation, the State of Paraíba suffers from irregular distribution of rainfall throughout the region. This distribution presented high variability among the mesoregions analyzed in this work. A positive asymmetry was observed in all the locations, as well as the presence of atypical points (Figure 4).

Trend Analysis
The distribution of precipitation relative to geographical coordinates (latitude and longitude) and temporal index, with the corresponding adjustments through cubic splines, is presented in  Table 1, in this study.
Due to the asymmetric behavior and the high variability of precipitation, the State of Paraíba suffers from irregular distribution of rainfall throughout the region. This distribution presented high variability among the mesoregions analyzed in this work. A positive asymmetry was observed in all the locations, as well as the presence of atypical points (Figure 4).

Trend Analysis
The distribution of precipitation relative to geographical coordinates (latitude and longitude) and temporal index, with the corresponding adjustments through cubic splines, is presented in Figure 5. An exploratory analysis by adjusting the cubic smoothness function of the spline showed that precipitation occurred more frequently in the extremes of the studied region, corresponding to the mesoregions of Sertão and Zona da Mata, and that there was limited precipitation in the areas located near the meridian 36 W, corresponding to the Borborema mesoregion (Figure 5a). The spatial distribution of rainfall in the region presented a spatial tendency that varied with the latitudinal and longitudinal gradients (Figure 5a,b). For the analysis of the time trend, the graph of the time series of precipitation was calculated, based on the average of 54 pluviometric stations (Figure 5c). This series showed an evident seasonal behavior of 12 months. Thus, for this series, a GAMLSS model was fitted using cubic smoothing splines function.
To model the trend of precipitation data, the GAMLSS model was used, via zero-adjusted gamma (ZAGA) distribution, as indicated in the system of Equations (4). Table 2 presents the coefficient estimates, standard error, t value, p-value, and the coefficient of determination (R²) in the model adjustment.  An exploratory analysis by adjusting the cubic smoothness function of the spline showed that precipitation occurred more frequently in the extremes of the studied region, corresponding to the mesoregions of Sertão and Zona da Mata, and that there was limited precipitation in the areas located near the meridian 36 • W, corresponding to the Borborema mesoregion (Figure 5a). The spatial distribution of rainfall in the region presented a spatial tendency that varied with the latitudinal and longitudinal gradients (Figure 5a,b). For the analysis of the time trend, the graph of the time series of precipitation was calculated, based on the average of 54 pluviometric stations (Figure 5c). This series showed an evident seasonal behavior of 12 months. Thus, for this series, a GAMLSS model was fitted using cubic smoothing splines function.
To model the trend of precipitation data, the GAMLSS model was used, via zero-adjusted gamma (ZAGA) distribution, as indicated in the system of Equations (4). Table 2 presents the coefficient estimates, standard error, t value, p-value, and the coefficient of determination (R 2 ) in the model adjustment. The results presented in Table 2 indicate, at the level of 0.01 of significance, that the covariates longitude, latitude, and the temporal index are related to the precipitation. The R 2 shows that 48% of the large-scale variation of rainfall was explained by the trend component, corroborating with the results found through the descriptive analysis.
After estimating the trend component, which was described in Equation (1), we then analyzed the stochastic residue,ε(s, t) = Z(s, t) −m(s, t). Next, the space-time sampling variogram of these residues was calculated, taking as reference the adjustment of the generalized product-sum theoretical model.

Geostatistical Analysis
The residuals, obtained from the regression via GAMLSS models, presented an evident pattern of spatiotemporal correlation (Figure 6a). There was a strong spatial correlation between nearby sites, and the spatial structure became weaker as time differences increased. Similarly, the same occurred with the temporal structure ( Figure 6b). Thus, spatial-temporal kriging of the residue was required. The results presented in Table 2 indicate, at the level of 0.01 of significance, that the covariates longitude, latitude, and the temporal index are related to the precipitation. The R² shows that 48% of the large-scale variation of rainfall was explained by the trend component, corroborating with the results found through the descriptive analysis.
After estimating the trend component, which was described in Equation (1), we then analyzed the stochastic residue, ( , ) = ( , ) −̂( , ). Next, the space-time sampling variogram of these residues was calculated, taking as reference the adjustment of the generalized product-sum theoretical model.

Geostatistical Analysis
The residuals, obtained from the regression via GAMLSS models, presented an evident pattern of spatiotemporal correlation (Figure 6a). There was a strong spatial correlation between nearby sites, and the spatial structure became weaker as time differences increased. Similarly, the same occurred with the temporal structure ( Figure 6b). Thus, spatial-temporal kriging of the residue was required.  The spatial variability exhibited by the marginal spatial variogram (Figure 7a) differed from the temporal variability shown in the marginal temporal variogram (Figure 7b), since a difference, concerning the threshold, was observed between the two marginal variograms. For this reason, the spatiotemporal variability of the residues could be modeled through the generalized product-sum The results presented in Table 2 indicate, at the level of 0.01 of significance, that the covariates longitude, latitude, and the temporal index are related to the precipitation. The R² shows that 48% of the large-scale variation of rainfall was explained by the trend component, corroborating with the results found through the descriptive analysis.
After estimating the trend component, which was described in Equation (1), we then analyzed the stochastic residue, ( , ) = ( , ) −̂( , ). Next, the space-time sampling variogram of these residues was calculated, taking as reference the adjustment of the generalized product-sum theoretical model.

Geostatistical Analysis
The residuals, obtained from the regression via GAMLSS models, presented an evident pattern of spatiotemporal correlation (Figure 6a). There was a strong spatial correlation between nearby sites, and the spatial structure became weaker as time differences increased. Similarly, the same occurred with the temporal structure ( Figure 6b). Thus, spatial-temporal kriging of the residue was required.  The spatial variability exhibited by the marginal spatial variogram (Figure 7a) differed from the temporal variability shown in the marginal temporal variogram (Figure 7b), since a difference, concerning the threshold, was observed between the two marginal variograms. For this reason, the spatiotemporal variability of the residues could be modeled through the generalized product-sum The spatial variability exhibited by the marginal spatial variogram (Figure 7a) differed from the temporal variability shown in the marginal temporal variogram (Figure 7b), since a difference, concerning the threshold, was observed between the two marginal variograms. For this reason, the spatiotemporal variability of the residues could be modeled through the generalized product-sum model. One of the advantages of constructing marginal variograms is the fact that it is possible to identify some theoretical variogram models (for example, Gaussian, Exponential, and Spherical) that can compose the structure in the generalized product-sum model (Equation (12)).
As described above, the Normal, Gamma, and ZAGA models have been adjusted for large-scale variations. The different structures of the generalized product-sum model were adjusted for the residues obtained in each of these models, followed by the spatial-temporal kriging of that model. The cross-validation results are provided in Table 3. The results of the validation were calculated after the values adjusted by the trend component were added to the values interpolated in space-time kriging. Table 3. Space-time kriging results combining the different trend models (NORMAL, GAMMA and ZAGA) with the different variogram models: Gaussian (GAU), Exponential (EXP) and Spherical (SPH). The RMSE (root mean squared error), MAE (mean absolute error), and R 2 (coefficient of determination) estimates were obtained by "leave-one-out" cross-validation. The results presented in Table 3 indicate that the Exponential model was the best fit for the spatial (γ st (h s , 0)) and temporal (γ st (0, h t )) structures in all regression models, since it presented higher R 2 and smaller RMSE and MAE. Note that the ZAGA model (EXP + EXP) presented the best results, with RMSE of 34.598 mm, MAE of 20.970 mm, and R 2 equal to 82.8%. These results demonstrated the effectiveness of adjusting the precipitation data to an adequate model that contemplates the spatial-temporal characteristics presented by the phenomenon.

MODEL
The components of the generalized product-sum model (Equation (12)) and their estimates are shown in Table 4. The components were chosen based on the results in Table 3. Table 4. Parameter estimates of the generalized product-sum variogram model adjusted to precipitation residues. The final model is obtained by replacing these estimates in Equation (12).

Component
Variogram The parameter estimates (Table 4) indicated that all components act on the spatial-temporal pattern of the ZAGA regression residuals. The parameter estimation related to the spatial extent was high, indicating that the residues were correlated in distances of up to 171 km. The temporal scale estimate indicated that sites with a delay of up to 47 days had residual autocorrelation.
The main advantage of the methodology proposed in this study is that with the space-time kriging technique, predictions can be made for unobserved locations and times [21,24]. As an example of this methodology applicability, the spatial-temporal kriging of precipitation was carried out in the period 2015 ( Figure 8). In Figure 8, it is possible to evaluate the precipitation behavior in nonsampled sites, as well as in times not observed in the sample, helping policymakers manage the allocation of water in the present and the future. It is noted that the spatial distribution of rainfall presented variability within each mesoregion and also among the four mesoregions. Precipitation in the region occurred more intensely during the first half of the year, but during the same period, there was an irregularity in the distribution of rainfall between the mesoregions. In the Sertão, the high spatial variability can be justified by meteorological systems in the area, such as the the intertropical convergence zone (ITCZ) and the high-level cyclonic vortices (VCAN). It is observed that in this region, the rainfall distribution during February to May occurred irregularly. In years when El Niño and La Niña occurred with strong intensity, there was substantial evidence that rainfall occurred, respectively, above or below the expected value, resulting in high temporal variability of precipitation [27]. The rainy season in this area is concentrated between February and May, presenting high variation in space and time. The low volume of rainfall and the irregularity of its distribution have contributed, for example, to the occurrence of the desertification phenomenon [5,28].

Discussion
The state of Paraíba has abnormalities in the distribution of precipitation, with high spatial and temporal variability, resulting in heavy rains with short duration and prolonged periods of drought. Consequently, the scarcity of rain directly affects water resources, resulting in severe problems in the region's economy, which focuses on family agriculture and livestock [29]. Also, drought has affected water distribution for human consumption, electricity supply, the occurrence of diseases and migration, among other issues, resulting in an unprecedented water crisis [30]. Thus, a space-time distribution modeling of rainfall becomes of paramount importance for the water resources management to reduce the impacts caused by drought in this region.
Irregular rainfall, resulting in periods of prolonged drought, is associated with meteorological phenomena. Strong El Niño years have caused intense droughts in Paraíba [27]. Rainfall periods are concentrated in a few months, occurring mainly in winter, which is associated with the ITCZ action In Figure 8, it is possible to evaluate the precipitation behavior in nonsampled sites, as well as in times not observed in the sample, helping policymakers manage the allocation of water in the present and the future. It is noted that the spatial distribution of rainfall presented variability within each mesoregion and also among the four mesoregions. Precipitation in the region occurred more intensely during the first half of the year, but during the same period, there was an irregularity in the distribution of rainfall between the mesoregions. In the Sertão, the high spatial variability can be justified by meteorological systems in the area, such as the the Intertropical Convergence Zone (ITCZ) and the Upper Tropospheric Cyclonic Vortices (UTCV). It is observed that in this region, the rainfall distribution during February to May occurred irregularly. In years when El Niño and La Niña occurred with strong intensity, there was substantial evidence that rainfall occurred, respectively, above or below the expected value, resulting in high temporal variability of precipitation [27]. The rainy season in this area is concentrated between February and May, presenting high variation in space and time. The low volume of rainfall and the irregularity of its distribution have contributed, for example, to the occurrence of the desertification phenomenon [5,28].

Discussion
The state of Paraíba has abnormalities in the distribution of precipitation, with high spatial and temporal variability, resulting in heavy rains with short duration and prolonged periods of drought. Consequently, the scarcity of rain directly affects water resources, resulting in severe problems in the region's economy, which focuses on family agriculture and livestock [29]. Also, drought has affected water distribution for human consumption, electricity supply, the occurrence of diseases and migration, among other issues, resulting in an unprecedented water crisis [30]. Thus, a space-time distribution modeling of rainfall becomes of paramount importance for the water resources management to reduce the impacts caused by drought in this region. Irregular rainfall, resulting in periods of prolonged drought, is associated with meteorological phenomena. Strong El Niño years have caused intense droughts in Paraíba [27]. Rainfall periods are concentrated in a few months, occurring mainly in winter, which is associated with the ITCZ action and the passage of cold fronts from the South Atlantic Ocean [31]. Due to the high rainfall variability of the region, rainfall behavior can be classified into three periods, namely: pre-rainy season, rainy season and post-rainy season. During the pre-rainy season, which runs from December to January in the Sertão mesoregion, meteorological systems such as UTCV and instability associated with cold fronts occur. The rainy season itself persists from February to May, when the ITCZ acts as the primary meteorological system responsible for the recharge of water resources in the state. The post-rainy season usually occurs between May and July in the Zona da Mata mesoregion and some localities of the Southeast. The Eastern Wave Disturbances (EWDs) are the primary meteorological system acting in this period.
In this study, we used the gamma distribution, which shows asymmetric behavior, to model precipitation data. However, it is known that this probability distribution does not include values equal to zero, not having a parameter that can be used to model the excess of zero that is common in data of monthly total precipitation throughout the NEB. As an advantage, the methodology allowed modeling the occurrence of null values, resulting in better estimates when compared with models that do not contemplate the excess of zeroes in the dataset.
Several studies that deal with spatial-temporal modeling and which have null values in the database have used the regression model with Normal distribution [11,32] or the Gamma regression model [12,13]. However, the Normal regression does not solve the data asymmetry problem, and the Gama regression does not contemplate in its domain the occurrence of zeroes. Thus, we propose a new way of modeling the trend component through GAMLSS models, especially using the ZAGA distribution, allowing one of its three parameters to model the excess of zeroes that occurs with high frequency in data of monthly total precipitation.
The trend component alone does not model the spatiotemporal dependence that is present in the regression residuals. Thus, the need for space-time modeling of these residues arises. Previous studies did not take into account this spatial and/or temporal residue dependence [7,33]. Consequently, we modeled the spatiotemporal dependence structure of this stochastic component, using a covariance function. However, this approach has numerous parameters present in the space-time theoretical variogram, resulting in great computational difficulties to determine the parameter estimates.
A significant benefit of the suggested methodology is that it can predict in nonsampled places and in periods that have not been observed, thereby assisting water-related initiatives in the region. The Agreste and Borborema mesoregions, with a reduced amount of precipitation, were discovered to be an aggravating factor in the uneven rainfall distribution in the study region. The majority of the state of Paraíba is part of the Brazilian semi-arid region, which has a climate characterized by low humidity and rainfall. Studies in recent years have revealed the process of desertification in the semi-arid Northeast regions [30]. This research can, therefore, identify areas that will soon be affected by water supplies for human and animal consumption. These findings can serve as a supporting tool to help government agencies in water management to predict societal needs and avoid water shortages that are rapidly intensifying, not only in Paraíba State, but also in the NEB as a whole.

Conclusions
In this study, total monthly precipitation in the state of Paraíba was analyzed using spatiotemporal geostatistical tools. Rainfall data of 54 pluviometric stations from 1994 to 2014 were analyzed. Large-scale variation was modeled using spatial and temporal covariates to remove the spatiotemporal trend in the regression residuals. It was proposed to use the adjustment of GAMLSS models with different distributions for the response variable (Normal, Gamma, and ZAGA). The results of the cross-validation indicated that the ZAGA distribution best fits the data, with the exponential model for the spatial component and also for the time component of the generalized product-sum variogram model. Also, the results indicated the presence of the space-time correlation in the rainfall phenomenon, implying in this way the proper use of space-time kriging proposed in this study. Finally, space-time kriging in Paraíba state for the year 2015 evidenced the great irregularity in the distribution of rainfall throughout the region, providing a detailed visual analysis of sectors suffering from scarcity and excess of rain, favoring government policies that address water resources.