# Spatial Modeling of Precipitation Based on Data-Driven Warping of Gaussian Processes

## Abstract

## 1. Introduction

## 2. Methodology

#### 2.1. Introduction to Gaussian Process Regression

#### 2.2. Warping (Gaussian Anamorphosis) for Non-Gaussian Distributions

#### 2.3. Data-Driven Warping of Gaussian Processes

#### 2.4. Hyperparameter Estimation

#### 2.5. Assessment of Predictive Performance

## 3. Application of GPR and Warped GPR to Synthetic Data

## 4. Application of GPR and Warped GPR to Reanalysis Data

#### 4.1. Study Area and Data Description

#### 4.2. Exploratory Statistical Analysis

#### 4.3. GPR and Warped GPR Comparison Based on the Reanalysis Data

## 5. Discussion and Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
AIC | Akaike Information Criterion |
BIC | Bayesian Information Criterion |
GEV | Generalized Extreme Value distribution |
GP | Generalized Pareto distribution |
GPR | Gaussian process regression |
KCDE | Kernel cumulative distribution estimate |
KDE | Kernel density estimate |
wGPR | Warped Gaussian process regression |

## Appendix A. Variogram Models

## Appendix B. Cross-Validation Metrics

## References

- Stocker, T.; Qin, D.; Plattner, G.K.; Tignor, M.; Allen, S.; Boschung, J.; Nauels, A.; Xia, Y.; Bex, V.; Midgley, P.M. (Eds.) Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK, 2013.
- IPCC. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Masson-Delmotte, V., Zhai, P., Pirani, A., Connors, S.L., Péan, C., Berger, S., Caud, N., Chen, Y., Goldfarb, L., Gomis, M.I., et al., Eds.; Cambridge University Press: Cambridge, UK, 2021.
- Varouchakis, E.A.; Corzo, G.A.; Karatzas, G.P.; Kotsopoulou, A. Spatio-temporal analysis of annual rainfall in Crete, Greece. Acta Geoph. **2018**, 66, 319–328.
- Zribi, M.; Brocca, L.; Tramblay, Y.; Molle, F. Water Resources in the Mediterranean Region; Elsevier: Amsterdam, The Netherlands, 2020.
- Deng, K.A.K.; Lamine, S.; Pavlides, A.; Petropoulos, G.P.; Srivastava, P.K.; Bao, Y.; Hristopulos, D.; Anagnostopoulos, V. Operational soil moisture from ASCAT in support of water resources management. Remote Sens. **2019**, 11, 579.
- Giorgi, F. Climate change hot-spots. Geoph. Res. Lett. **2006**, 33, L08707.
- Luterbacher, J.; Xoplaki, E.; Casty, C.; Wanner, H.; Pauling, A.; Küttel, M.; Brönnimann, S.; Fischer, E.; Fleitmann, D.; Gonzalez-Rouco, F.J.; et al. Mediterranean climate variability over the last centuries: A review. In Mediterranean Climate Variability; Lionello, P., Malanotte-Rizzoli, P., Boscolo, R., Eds.; Elsevier: Amsterdam, The Netherlands, 2006; Volume 4, pp. 27–148.
- Norrant, C.; Douguédroit, A. Monthly and daily precipitation trends in the Mediterranean (1950–2000). Theor. Appl. Climatol. **2006**, 83, 89–106.
- Christakos, G. Spatiotemporal Random Fields: Theory and Applications; Elsevier: Amsterdam, The Netherlands, 2017.
- Varouchakis, E.A.; Hristopulos, D.T. Comparison of spatiotemporal variogram functions based on a sparse dataset of groundwater level variations. Spat. Stat. **2019**, 34, 100245.
- Porcu, E.; Furrer, R.; Nychka, D. 30 Years of space–time covariance functions. Wiley Interdiscip. Rev. Comput. Stat. **2021**, 13, e1512.
- Christakos, G. Random Field Models in Earth Sciences; Academic Press: San Diego, CA, USA, 1992.
- Cressie, N. Statistics for Spatial Data, revised ed.; Series in Probability and Statistics; Wiley: New York, NY, USA, 1993.
- Wackernagel, H. Multivariate Geostatistics; Springer: Berlin, Germany, 2003.
- Olea, R.A. Geostatistics for Engineers and Earth Scientists; Springer: New York, NY, USA, 1999.
- Chilès, J.P.; Delfiner, P. Geostatistics: Modeling Spatial Uncertainty, 2nd ed.; John Wiley & Sons: New York, NY, USA, 2012.
- Boer, E.P.; de Beurs, K.M.; Hartkamp, A.D. Kriging and thin plate splines for mapping climate variables. Int. J. Appl. Earth Obser. Geoinfor. **2001**, 3, 146–154.
- Guan, H.; Wilson, J.L.; Makhnin, O. Geostatistical mapping of mountain precipitation incorporating autosearched effects of terrain and climatic characteristics. J. Hydrometeorol. **2005**, 6, 1018–1031.
- Moral, F.J. Comparison of different geostatistical approaches to map climate variables: Application to precipitation. Int. J. Climatol. **2010**, 30, 620–631.
- Verdin, A.; Funk, C.; Rajagopalan, B.; Kleiber, W. Kriging and local polynomial methods for blending satellite-derived and gauge precipitation estimates to support hydrologic early warning systems. IEEE Trans. Geosci. Remote Sens. **2016**, 54, 2552–2562.
- Agou, V.D.; Varouchakis, E.A.; Hristopulos, D.T. Geostatistical analysis of precipitation in the island of Crete (Greece) based on a sparse monitoring network. Environ. Monit. Assess. **2019**, 191, 1573–2959.
- Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006.
- Hristopulos, D. Random Fields for Spatial Data Modeling: A Primer for Scientists and Engineers; Springer: Dordrecht, The Netherlands, 2020.
- Papalexiou, S.M.; Serinaldi, F.; Porcu, E. Advancing space-time simulation of random fields: From storms to cyclones and beyond. Water Resour. Res. **2021**, 57, e2020WR029466.
- Papalexiou, S.M.; Serinaldi, F. Random fields simplified: Preserving marginal distributions, correlations, and intermittency, with applications from rainfall to humidity. Water Resour. Res. **2020**, 56, e2019WR026331.
- Snelson, E.; Rasmussen, C.E.; Ghahramani, Z. Warped Gaussian processes. Adv. Neural Inf. Process. Syst. **2004**, 16, 337–344.
- Pavlides, A.; Agou, V.; Hristopulos, D.T. Non-parametric kernel-based estimation of probability distributions for precipitation modeling. arXiv **2021**, arXiv:2109.09961.
- Papalexiou, S.M.; AghaKouchak, A.; Foufoula-Georgiou, E. A diagnostic framework for understanding climatology of tails of hourly precipitation extremes in the United States. Water Resour. Res. **2018**, 54, 6725–6738.
- Ye, L.; Hanson, L.S.; Ding, P.; Wang, D.; Vogel, R.M. The probability distribution of daily precipitation at the point and catchment scales in the United States. Hydrol. Earth Syst. Sci. **2018**, 22, 6519–6531.
- Wilks, D.S. Maximum likelihood estimation for the gamma distribution using data containing zeros. J. Clim. **1990**, 3, 1495–1501.
- Wilks, D.S.; Eggleston, K.L. Estimating monthly and seasonal precipitation distributions using the 30- and 90-day outlooks. J. Clim. **1992**, 5, 252–259.
- Shoji, T.; Kitaura, H. Statistical and geostatistical analysis of rainfall in central Japan. Comput. Geosci. **2006**, 32, 1007–1024.
- Kedem, B.; Chiu, L.S.; North, G.R. Estimation of mean rain rate: Application to satellite observations. J. Geoph. Res. Atmos. **1990**, 95, 1965–1972.
- Cho, H.K.; Bowman, K.P.; North, G.R. A comparison of gamma and lognormal distributions for characterizing satellite rain rates from the tropical rainfall measuring mission. J. Appl. Meteorol. **2004**, 43, 1586–1597.
- Wang, Z.; Zeng, Z.; Lai, C.; Lin, W.; Wu, X.; Chen, X. A regional frequency analysis of precipitation extremes in mainland China with fuzzy c-means and L-moments approaches. Int. J. Climatol. **2017**, 37, 429–444.
- Coles, S. An Introduction to Statistical Modeling of Extreme Values; Springer Series in Statistics; Springer: London, UK, 2001.
- Gellens, D. Combining regional approach and data extension procedure for assessing GEV distribution of extreme precipitation in Belgium. J. Hydrol. **2002**, 268, 113–126.
- Scheuerer, M. Probabilistic quantitative precipitation forecasting using ensemble model output statistics. Quart. J. R. Meteorol. Soc. **2014**, 140, 1086–1096.
- Koutsoyiannis, D. Statistics of extremes and estimation of extreme rainfall: II. Empirical investigation of long rainfall records. Hydrol. Sci. J. **2004**, 49, 591–610.
- Moccia, B.; Papalexiou, S.M.; Russo, F.; Napolitano, F. Spatial variability of precipitation extremes over Italy using a fine-resolution gridded product. J. Hydrol. Reg. Stud. **2021**, 37, 100906.
- Baxevani, A.; Lennartsson, J. A spatiotemporal precipitation generator based on a censored latent Gaussian field. Water Resour. Res. **2015**, 51, 4338–4358.
- Botev, Z.I.; Grotowski, J.F.; Kroese, D. Kernel density estimation via diffusion. Ann. Stat. **2010**, 38, 2916–2957.
- Madsen, H.; Lawrence, D.; Lang, M.; Martinkova, M.; Kjeldsen, T. Review of trend analysis and climate change projections of extreme precipitation and floods in Europe. J. Hydrol. **2014**, 519, 3634–3650.
- Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. **1962**, 33, 1065–1076.
- Arenas, A.; Chorin, A.J. On the existence and scaling of structure functions in turbulence according to the data. Proc. Nat. Acad. Sci. USA **2006**, 103, 4352–4355.
- Matheron, G. Principles of Geostatistics. Econ. Geol. **1963**, 58, 1246–1266.
- Arlot, S.; Celisse, A. A survey of cross-validation procedures for model selection. Stat. Surv. **2010**, 4, 40–79.
- Grossman, R.; Seni, G.; Elder, J.; Agarwal, N.; Liu, H. Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions; Morgan & Claypool: San Rafael, CA, USA, 2010.
- Li, K.C. Asymptotic optimality for $C_p$, $C_L$, cross-validation and generalized cross-validation: Discrete index set. Ann. Statist. **1987**, 15, 958–975.
- Stone, M. Cross-validatory choice and assessment of statistical predictions. J. R. Statist. Soc. Ser. B **1974**, 36, 111–147.
- Burman, P. A comparative study of ordinary cross-validation, υ-fold cross-validation and the repeated learning-testing methods. Biometrika **1989**, 76, 503–514.
- Efron, B. Estimating the error rate of a prediction rule: Improvement on cross-validation. J. Am. Statist. Assoc. **1983**, 78, 316–331.
- Watrous, L. Lasithi: A History of Settlement on a Highland Plain in Crete, xviii ed.; American School of Classical Studies: Princeton, NJ, USA, 1982.
- Copernicus Climate Change Service C3S. ERA5: Fifth Generation of ECMWF Atmospheric Reanalyses of the Global Climate. 2018. Available online: https://cds.climate.copernicus.eu/cdsapp#!/home (accessed on 10 March 2020).
- Dee, D.; Fasullo, J.; Shea, D. The Climate Data Guide: Atmospheric Reanalysis: Overview & Comparison Tables. Last modified 12 December 2016. Available online: https://climatedataguide.ucar.edu/climate-data/atmospheric-reanalysis-overview-comparison-tables (accessed on 7 October 2021).
- Reichle, R.H.; Liu, Q.; Koster, R.D.; Draper, C.S.; Mahanama, S.P.P.; Partyka, G.S. Land Surface Precipitation in MERRA-2. J. Clim. **2017**, 30, 1643–1664.
- Muñoz Sabater, J.; Dutra, E.; Agustí-Panareda, A.; Albergel, C.; Arduini, G.; Balsamo, G.; Boussetta, S.; Choulga, M.; Harrigan, S.; Hersbach, H.; et al. ERA5-Land: A state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data **2021**, 13, 4349–4383.
- Xu, X.; Frey, S.K.; Ma, D. Hydrological performance of ERA5 and MERRA-2 precipitation products over the Great Lakes Basin. J. Hydrol. Reg. Stud. **2022**, 39, 100982.
- Google Earth Pro 7.3.4.8248. (14 December 2015). Crete island, Greece, 35°16’12.97"N, 25°1’25.14"E, Eye alt 273.94 km. SIO, NOAA, U.S. Navy, NGA, GEBCO. Image Landsat/Copernicus. Available online: https://earth.google.com/web/ (accessed on 13 April 2021).
- Gneiting, T.; Raftery, A.E. Strictly proper scoring rules, prediction, and estimation. J. Am. Stat. Assoc. **2007**, 102, 359–378.
- Marmin, S.; Baccou, J.; Liandrat, J.; Ginsbourger, D. Non-parametric warping via local scale estimation for non-stationary Gaussian process modelling. Int. Soc. Opt. Photonics **2017**, 10394, 1039421.
- Lu, C.K.; Shafto, P. Conditional deep Gaussian processes: Multi-fidelity kernel learning. Entropy **2021**, 23, 1545.
- Peters, G.W.; Nevat, I.; Nagarajan, S.G.; Matsui, T. Spatial warped Gaussian processes: Estimation and efficient field reconstruction. Entropy **2021**, 23, 1323.
- Xu, G.; Genton, M.G. Tukey g-and-h random fields. J. Am. Stat. Assoc. **2017**, 112, 1236–1249.
- Barbero, G.; Moisello, U.; Todeschini, S. Evaluation of the Areal Reduction Factor in an Urban Area through Rainfall Records of Limited Length: A Case Study. J. Hydrol. Engin. **2014**, 19, 05014016.
- Hristopulos, D.T.; Agou, V.D. Stochastic local interaction model with sparse precision matrix for space–time interpolation. Spat. Stat. **2020**, 40, 100403.
- Hristopulos, D.T.; Pavlides, A.; Agou, V.D.; Gkafa, P. Stochastic local interaction model: An alternative to kriging for massive datasets. Math. Geosci. **2021**, 53, 1907–1949.
- Hegde, P.; Heinonen, M.; Kaski, S. Variational zero-inflated Gaussian processes with sparse kernels. In Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI), Monterey, CA, USA, 6–10 August 2018; Globerson, A., Silva, R., Eds.; AUAI Press: Corvallis, OR, USA, 2018; Volume 1, pp. 361–371.
- Hristopulos, D.T. Covariance functions motivated by spatial random field models with local interactions. Stoch. Environ. Res. Risk Assess. **2015**, 29, 739–754.

**Figure 1.** Empirical variograms (markers) and model fits (continuous lines) for the training set data (**a**) and the warped-space (normalized) data (**b**).
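The empirical variograms shown as markers in Figure 1 can be illustrated with a minimal sketch. It assumes the classical method-of-moments (Matheron) estimator with isotropic, equal-width lag bins; the function name `empirical_variogram`, the binning scheme, and the three sample points are hypothetical choices made for illustration and are not taken from the article.

```python
import math

def empirical_variogram(coords, values, bin_width, n_bins):
    """Method-of-moments variogram estimate on equal-width distance bins.

    Assumption: isotropic lags, bins [k*w, (k+1)*w). Returns a list of
    (bin_center, gamma, pair_count) tuples for the non-empty bins.
    """
    if len(coords) != len(values):
        raise ValueError("coords and values must have the same length")
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])   # Euclidean lag distance
            k = int(h // bin_width)               # index of the lag bin
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # gamma(h) = sum of squared increments / (2 * number of pairs)
    return [((k + 0.5) * bin_width, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_bins) if counts[k] > 0]

# Hypothetical toy data: three collinear points.
variogram = empirical_variogram([(0, 0), (1, 0), (2, 0)], [1.0, 2.0, 4.0], 1.0, 3)
```

A variogram model (for example, exponential or Matérn) would then be fitted to the `(bin_center, gamma)` pairs; that fitting step is not shown here.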

**Figure 2.** GPR and wGPR approximation of the function in Equation (16). Blue dots: training set. Black line: the function $z\left(\mathbf{s}\right)$ plotted versus $\pi s$ on the horizontal axis. GPR approximations (classical GPR: magenta line; warped GPR: blue line) and 95.45% prediction intervals (GPR: green dashed lines; wGPR: cyan dashed lines).

**Figure 3.** Geomorphological map of Crete showing the 65 nodes (blue markers) of the ERA5 grid covering Crete, at which the precipitation reanalysis data used in this study are located [59].

**Figure 4.** Violin plots for the mean, median, minimum and maximum values of monthly ERA5 precipitation statistics based on 246 monthly values. Each monthly statistic is based on the data at the 65 ERA5 grid nodes. The values for CoV (coefficient of variation), Skew (skewness) and Kurt (kurtosis) are dimensionless. All other values are measured in mm.

**Figure 5.** Distribution of monthly precipitation during the wet season of 2008. Histograms are based on ERA5 precipitation data at 65 grid locations over and around the island of Crete. The best-fit Gaussian PDFs (red lines) are also shown. The vertical axis of the histograms represents frequency; the horizontal axis represents precipitation amount measured in mm.

**Figure 6.** GPR and wGPR LOO-CV mean error (ME) and mean absolute error (MAE) for the wet-season ERA5 precipitation data. The subscripts “1” and “2” refer to the exponential and Matérn models, respectively.

**Figure 7.** GPR and wGPR LOO-CV root mean square error (RMSE) and Spearman correlation coefficient (RS) between the true and predicted values for the wet-season ERA5 precipitation data. The subscripts “1” and “2” refer to the exponential and Matérn models, respectively.

**Figure 8.** GPR and wGPR LOO-CV results for two interval scores: the empirical interval coverage (CVG) and the negatively oriented interval score (NINTS) for the wet-season ERA5 precipitation data. The subscripts “1” and “2” refer to the exponential and Matérn covariance kernels, respectively.

**Table 1.** Cross-validation metrics for GPR and wGPR based on the validation set of 400 points from the function of Equation (16). ME: Mean error. MAE: Mean absolute error. RMSE: Root mean square error. RP: Pearson’s correlation coefficient. NS: Nash-Sutcliffe coefficient. ErrMin: $\min_{\mathbf{s}_{1}^{*},\dots,\mathbf{s}_{P}^{*}}\left(z\left(\mathbf{s}_{p}^{*}\right)-\widehat{z}\left(\mathbf{s}_{p}^{*}\right)\right)$. ErrMax: $\max_{\mathbf{s}_{1}^{*},\dots,\mathbf{s}_{P}^{*}}\left(z\left(\mathbf{s}_{p}^{*}\right)-\widehat{z}\left(\mathbf{s}_{p}^{*}\right)\right)$.

Method | ME | MAE | RMSE | RP | NS | ErrMin | ErrMax |
---|---|---|---|---|---|---|---|
GPR | −0.012 | 0.095 | 0.147 | 0.985 | 0.97 | −0.50 | 0.38 |
wGPR | −0.016 | 0.050 | 0.119 | 0.990 | 0.98 | −0.76 | 0.64 |
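The metrics named in the caption of Table 1 can be computed from paired true and predicted values, as in the following sketch. It assumes that every error uses the sign convention of the ErrMin and ErrMax definitions (true value minus prediction) and the standard textbook forms of the Pearson and Nash-Sutcliffe coefficients; the function name `cv_metrics` and the four sample values are hypothetical.

```python
import math

def cv_metrics(z_true, z_pred):
    """Validation metrics for paired true and predicted values."""
    n = len(z_true)
    if n == 0 or n != len(z_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumed sign convention (as in ErrMin/ErrMax): error = z - z_hat.
    err = [t - p for t, p in zip(z_true, z_pred)]
    mean_t = sum(z_true) / n
    mean_p = sum(z_pred) / n
    ss_err = sum(e * e for e in err)
    ss_t = sum((t - mean_t) ** 2 for t in z_true)
    ss_p = sum((p - mean_p) ** 2 for p in z_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(z_true, z_pred))
    return {
        "ME": sum(err) / n,                   # mean error (bias)
        "MAE": sum(abs(e) for e in err) / n,  # mean absolute error
        "RMSE": math.sqrt(ss_err / n),        # root mean square error
        "RP": cov / math.sqrt(ss_t * ss_p),   # Pearson correlation
        "NS": 1.0 - ss_err / ss_t,            # Nash-Sutcliffe coefficient
        "ErrMin": min(err),
        "ErrMax": max(err),
    }

# Hypothetical toy values, not data from the article.
metrics = cv_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

The correlation and Nash-Sutcliffe terms are undefined when the true (or predicted) values are all identical, so this sketch is only meaningful for validation sets with some spread.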

**Table 2.** Mean, median, minimum and maximum values (shown across rows) of monthly ERA5 precipitation statistics (shown across the columns) based on 246 monthly values. Each monthly statistic is based on the data at the 65 ERA5 grid nodes. The values for CoV (coefficient of variation), Skew (skewness) and Kurt (kurtosis) are dimensionless. All other values are measured in mm.

Across months | Mean | Median | Min | Max | Std | CoV | Skew | Kurt |
---|---|---|---|---|---|---|---|---|
Mean | 61.25 | 55.69 | 26.19 | 132.70 | 25.53 | 0.48 | 0.82 | 3.16 |
Median | 59.19 | 51.78 | 21.23 | 123.98 | 23.67 | 0.45 | 0.81 | 3.04 |
Minimum | 1.75 | 1.05 | 0.05 | 6.10 | 1.16 | 0.16 | −0.01 | 1.56 |
Maximum | 198.27 | 194.15 | 110.03 | 375.32 | 81.54 | 1.57 | 2.26 | 7.75 |
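The per-month statistics in the columns of Table 2 can be computed for one time slice as sketched below. The sketch assumes moment-based estimators normalized by the sample size and non-excess kurtosis (equal to 3 for a Gaussian sample); the article does not state which estimators were used, so these choices, the function name `summary_statistics`, and the sample values are assumptions.

```python
import math

def summary_statistics(values):
    """Summary statistics of one spatial sample (e.g., one month of grid values)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Central moments normalized by n (assumed convention).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0 or mean == 0:
        raise ValueError("constant or zero-mean sample: CoV/Skew/Kurt undefined")
    ordered = sorted(values)
    mid = n // 2
    median = ordered[mid] if n % 2 else 0.5 * (ordered[mid - 1] + ordered[mid])
    std = math.sqrt(m2)
    return {
        "Mean": mean, "Median": median,
        "Min": ordered[0], "Max": ordered[-1],
        "Std": std,
        "CoV": std / mean,        # coefficient of variation (dimensionless)
        "Skew": m3 / m2 ** 1.5,   # skewness (dimensionless)
        "Kurt": m4 / m2 ** 2,     # non-excess kurtosis (dimensionless)
    }

# Hypothetical toy sample, not ERA5 data.
stats = summary_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
```

Applying such a function to each of the monthly slices and then taking the mean, median, minimum and maximum of every statistic over the months would yield a table with the layout of Table 2.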

**Table 3.** Optimal probability distribution fits (based on BIC) for the monthly ERA5 precipitation data in the year 2008. The models studied include the following: “GP”: Generalized Pareto, “InvGauss”: Inverse Gaussian, “Logn”: Lognormal, and “Wei”: Weibull distribution. The optimal probability distribution for a given wet-season month varies from year to year.

January | February | March | October | November | December |
---|---|---|---|---|---|
GP | InvGauss | Logn | GP | GP | Wei |
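Model selection by BIC, as used for Table 3, can be sketched for two of the candidate families whose maximum likelihood estimates have closed forms (lognormal and inverse Gaussian). The Generalized Pareto and Weibull fits require numerical optimization and are omitted. The function names and the sample values are hypothetical, and the BIC is taken in its usual form $k\ln n - 2\ln L$.

```python
import math

def bic_lognormal(x):
    """BIC of a lognormal fit using closed-form maximum likelihood estimates."""
    n = len(x)
    logs = [math.log(v) for v in x]
    mu = sum(logs) / n
    var = sum((l - mu) ** 2 for l in logs) / n
    loglik = -sum(logs) - 0.5 * n * math.log(2.0 * math.pi * var) - 0.5 * n
    return 2 * math.log(n) - 2.0 * loglik   # k = 2 parameters

def bic_inverse_gaussian(x):
    """BIC of an inverse Gaussian fit using closed-form maximum likelihood estimates."""
    n = len(x)
    mu = sum(x) / n
    lam = 1.0 / (sum(1.0 / v - 1.0 / mu for v in x) / n)
    loglik = (0.5 * n * math.log(lam / (2.0 * math.pi))
              - 1.5 * sum(math.log(v) for v in x)
              - sum(lam * (v - mu) ** 2 / (2.0 * mu * mu * v) for v in x))
    return 2 * math.log(n) - 2.0 * loglik   # k = 2 parameters

def select_by_bic(x):
    """Return the label of the lowest-BIC model and all BIC values."""
    if len(x) < 2 or min(x) <= 0 or max(x) == min(x):
        raise ValueError("need at least two positive, non-identical values")
    bic = {"Logn": bic_lognormal(x), "InvGauss": bic_inverse_gaussian(x)}
    return min(bic, key=bic.get), bic

# Hypothetical precipitation amounts in mm, not ERA5 data.
sample_mm = [12.4, 30.1, 45.8, 51.3, 60.2, 75.9, 98.7, 140.5]
best_model, bic_values = select_by_bic(sample_mm)
```

Because both families here have two parameters, the BIC comparison reduces to a comparison of maximized log-likelihoods; the penalty term matters once families with different parameter counts are included.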

**Table 4.** Average values of LOO-CV metrics based on the 246 time slices of ERA5 precipitation data for the wet-season months. “Expo” and “Mate” denote the exponential and Matérn covariance kernels, respectively.

Model | ME | MAE | RMSE | RS | NINTS | CVG |
---|---|---|---|---|---|---|
GPR (Expo) | −0.15 | 7.60 | 0.25 | 0.90 | 67.31 | 0.97 |
GPR (Mate) | −0.15 | 7.60 | 0.25 | 0.90 | 67.31 | 0.97 |
wGPR (Expo) | −0.53 | 7.53 | 0.21 | 0.90 | 65.50 | 0.98 |
wGPR (Mate) | −0.53 | 7.53 | 0.21 | 0.90 | 65.50 | 0.98 |
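The two interval metrics reported in Table 4 can be illustrated as follows. The sketch assumes that CVG is the fraction of true values falling inside their prediction intervals and that NINTS is the mean negatively oriented interval score of Gneiting and Raftery for a central $(1-\alpha)$ interval; the nominal level used in the article is not stated here, so `alpha` is left as a parameter. The function name and the toy inputs are hypothetical.

```python
def interval_scores(z_true, lower, upper, alpha=0.05):
    """Empirical coverage and mean negatively oriented interval score.

    Assumed score for one interval [l, u] and observation z:
    (u - l) + (2/alpha)*(l - z) if z < l, + (2/alpha)*(z - u) if z > u.
    Smaller scores are better.
    """
    n = len(z_true)
    if n == 0 or n != len(lower) or n != len(upper):
        raise ValueError("inputs must be non-empty and of equal length")
    covered = 0
    total = 0.0
    for z, lo, up in zip(z_true, lower, upper):
        if lo > up:
            raise ValueError("lower bound exceeds upper bound")
        score = up - lo                        # interval width
        if z < lo:
            score += (2.0 / alpha) * (lo - z)  # penalty: value below interval
        elif z > up:
            score += (2.0 / alpha) * (z - up)  # penalty: value above interval
        else:
            covered += 1
        total += score
    return covered / n, total / n

# Hypothetical toy intervals; alpha = 0.5 is chosen only to keep the arithmetic simple.
cvg, nints = interval_scores([1.0, 2.0, 5.0], [0.0, 1.0, 2.0], [2.0, 3.0, 4.0], alpha=0.5)
```

The score rewards narrow intervals and penalizes misses in proportion to their distance from the interval, which is why it is read together with the coverage rather than on its own.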

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Agou, V.D.; Pavlides, A.; Hristopulos, D.T.
Spatial Modeling of Precipitation Based on Data-Driven Warping of Gaussian Processes. *Entropy* **2022**, *24*, 321.
https://doi.org/10.3390/e24030321

**Chicago/Turabian Style**

Agou, Vasiliki D., Andrew Pavlides, and Dionissios T. Hristopulos.
2022. "Spatial Modeling of Precipitation Based on Data-Driven Warping of Gaussian Processes" *Entropy* 24, no. 3: 321.
https://doi.org/10.3390/e24030321