# Statistical Modelling of the Annual Rainfall Pattern in Guanacaste, Costa Rica


## Abstract


## 1. Introduction

- Stochastic rainfall simulation. These models provide a statistical representation of rainfall for forecasting floods and similar events. A particular variant generates synthetic rainfall sequences for scenario building. The functional part of these models uses atmospheric thermodynamics and semi-empirical models for rainfall in various weather situations. The models are neither location- nor time-specific and are used in both regional and point applications. Examples of such models appear in the literature [16,17].
- Operational rainfall modelling. With this approach, rainfall is generated inside a fully constituted weather forecast modelling system of the kind run continually by many national weather services. The models are based on the fundamental constitutive equations of the atmosphere. Precipitation is modelled by semi-empirical schemes that use local thermodynamic, topographic, and weather conditions. The output, a sequence of predictions, is generally produced approximately once per hour on a grid with a resolution of a few tens of kilometres. This kind of modelling is enormously computationally intensive.
- General circulation modelling. These models are also based on the fundamental constitutive equations of the atmosphere. However, in contrast with operational rainfall modelling, these models are averaged in space and time over larger areas and longer times. This type of modelling is designed to study climate change on a scale of 100 years or longer. The spatial resolution is quite coarse: tens to hundreds of kilometres. Statistical downscaling approaches can be used to produce finer-resolution (regional scale) rainfall predictions. Like operational modelling, this kind of modelling is enormously computationally intensive. The literature [18] provides a description of this approach to precipitation modelling.

## 2. Station Data and Preliminary Analyses

## 3. Statistical Models and Estimation of the ONI Effect and Long-Term Time Trend

#### 3.1. Double-Gaussian Model

#### 3.2. Regression Model with ARMA Errors

We fitted the model using the `arima` function in the R package `stats`. Table 2 shows the estimates of some of the parameters and their standard errors. (The remainder appear in the online supplementary material.) Of note are the statistically significant effects of ONI on the spring peak (Liberia) and summer peaks (Liberia and Nicoya). However, we found no evidence of a long-term time trend. A caveat to our findings is that we used the same dataset both to select the model and to make inferences about the regression coefficients. Consequently, the reported p-values may be too small. The estimated lag-1 coefficients are small for both stations. In other words, after adjusting for month and ONI effects, the rainfall observations are approximately independent. These results are broadly consistent with those based on the double-Gaussian model, lending strength to our conclusions based on the latter and, importantly, further justifying the independence assumption.
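As a rough check on how the standard errors in Table 2 relate to the reported p-values, the sketch below computes a two-sided Wald p-value from an estimate and its standard error. It assumes a normal approximation, which the text does not state explicitly, and it uses the rounded values from Table 2, so small differences from the tabulated p-values are to be expected. The function name `wald_p_value` is hypothetical.

```python
import math


def wald_p_value(estimate: float, std_error: float) -> float:
    """Two-sided p-value for H0: parameter = 0 under a normal approximation."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), with Phi the standard normal CDF.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Rounded values copied from Table 2 (Liberia).
p_gamma1 = wald_p_value(-0.104, 0.049)  # coefficient gamma_1
p_lag1 = wald_p_value(0.076, 0.045)     # lag-1 coefficient

print(f"gamma_1: p = {p_gamma1:.3f}")
print(f"lag 1:   p = {p_lag1:.3f}")
```

Under this approximation the lag-1 coefficient is less than two standard errors from zero, which is in line with the statement that the observations are approximately independent after adjusting for month and ONI.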

#### 3.3. Tweedie Model

We used the `multiroot` function in the R package `rootSolve` to solve the estimating equations. For simplicity, we handled the missing values in the Nicoya data by replacing them with the sample median rainfall for their corresponding months. Following previous authors [24], we did not estimate ${q}_{t}$. Instead, for $t=1,2,3,4$, we set ${q}_{t}=1.62$ (the maximum likelihood estimate obtained by assuming independent Tweedie observations in those months). For $t=5,6,\dots ,12$, we set ${q}_{t}=2$, which corresponds to the gamma distribution. This distribution provides a reasonable fit to the rainfall in those months (in which no zero rainfall observations occurred). The online supplementary material contains details about the goodness of fit of this model.
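The two preprocessing choices described above, imputing missing values with the month-specific median and fixing the power parameter ${q}_{t}$ by month, can be sketched as follows. This is an illustration in Python rather than the authors' R code; the function names and the toy data are hypothetical.

```python
from statistics import median


def impute_monthly_median(series):
    """Replace missing rainfall values with the median for the same calendar month.

    `series` is a list of (month, rainfall) pairs with month in 1..12 and
    rainfall set to None when the observation is missing.
    """
    observed = {}
    for month, value in series:
        if value is not None:
            observed.setdefault(month, []).append(value)
    # Raises KeyError below if a month has no observed values at all.
    monthly_median = {m: median(v) for m, v in observed.items()}
    return [(m, monthly_median[m] if v is None else v) for m, v in series]


def tweedie_power(month):
    """Fixed power parameter from the text: 1.62 for January to April, 2 otherwise."""
    return 1.62 if month <= 4 else 2.0


# Hypothetical toy data, not taken from the station records.
data = [(1, 0.0), (1, 4.0), (1, None), (6, 210.0), (6, None), (6, 250.0), (6, 190.0)]
filled = impute_monthly_median(data)
print(filled)
print([tweedie_power(m) for m in (1, 4, 5, 12)])
```

A power of 2 gives the gamma distribution, which places no mass at zero, so it is only suitable for months in which every observation is positive.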

#### 3.4. Summary of Models and Inferential Results

## 4. Rainfall Forecasting

- Spring peak (total rainfall in May, June, and July) conditional on all observations prior to January of the year in question.
- Summer peak (total rainfall in August, September, and October) conditional on all observations prior to January of the year in question.
- Annual total conditional on all observations prior to January of the year in question.
- Monthly rainfall conditional on all observations prior to the month in question (“One-month-ahead predictions”).
- Monthly rainfall conditional on all but the last three observations prior to the month in question (“Three-month-ahead predictions”).
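The seasonal quantities in the list above are simple sums of monthly values. A minimal sketch, assuming each year is stored as a list of 12 monthly totals with January first (the function name and the example year are hypothetical):

```python
def seasonal_totals(monthly):
    """Return spring, summer, and annual totals for one year of monthly rainfall.

    `monthly` must hold 12 values ordered from January to December.
    """
    if len(monthly) != 12:
        raise ValueError("expected 12 monthly values")
    return {
        "spring": sum(monthly[4:7]),   # May, June, July
        "summer": sum(monthly[7:10]),  # August, September, October
        "annual": sum(monthly),
    }


# Hypothetical monthly totals for a single year, not taken from the station records.
example_year = [1, 0, 5, 30, 200, 250, 150, 200, 350, 300, 100, 10]
totals = seasonal_totals(example_year)
print(totals)
```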

#### 4.1. Double-Gaussian Model

#### 4.2. Regression Model with AR(1) Errors

We used the `forecast` function in the R package `forecast` to predict rainfall when the future ONI is treated as known or unknown.
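To see why serial correlation contributes little to these forecasts, recall the textbook point forecast for a regression with AR(1) errors: the predicted error $h$ steps ahead is the last residual multiplied by the lag-1 coefficient raised to the power $h$. The sketch below is a Python illustration of that formula only, not a reproduction of the `forecast` package; the function names are hypothetical and the coefficient is the Liberia lag-1 estimate from Table 2.

```python
def ar1_error_forecast(last_residual, phi, horizon):
    """Expected AR(1) error `horizon` steps ahead: phi ** horizon * last_residual."""
    return (phi ** horizon) * last_residual


def point_forecast(regression_mean, last_residual, phi, horizon):
    """Regression mean for the target month plus the forecast AR(1) error."""
    return regression_mean + ar1_error_forecast(last_residual, phi, horizon)


phi_liberia = 0.076  # lag-1 coefficient for Liberia from Table 2
one_step = ar1_error_forecast(1.0, phi_liberia, 1)
three_step = ar1_error_forecast(1.0, phi_liberia, 3)
print(one_step, three_step)
```

With a coefficient this small, a unit residual in the most recent month shifts the one-month-ahead forecast only slightly and has almost no effect three months ahead, so the forecasts are driven mainly by the month and ONI terms.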

#### 4.3. Tweedie Model

#### 4.4. Summary of Models’ Predictive Performance

## 5. Discussion and Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Magaña, V.; Amador, J.A.; Medina, S. The midsummer drought over Mexico and Central America. J. Clim. **1999**, 12, 1577–1588.
2. Hund, S.V.; Grossmann, I.; Steyn, D.G.; Allen, D.M.; Johnson, M.S. Changing water resources under El Niño, climate change, and growing water demands in seasonally dry tropical watersheds. Water Resour. Res. **2021**, 57, e2020WR028535.
3. Waylen, P.R.; Caviedes, C.; Quesada, M. Interannual variability of monthly precipitation in Costa Rica. J. Clim. **1996**, 9, 2606–2613.
4. Waylen, P.R.; Quesada, M.E.; Caviedes, C.N. Temporal and spatial variability of annual precipitation in Costa Rica and the Southern Oscillation. Int. J. Climatol. **1996**, 16, 173–193.
5. Portig, W. The Climate of Central America. In World Survey of Climates; Schwerdtfeger, W., Ed.; Elsevier: Amsterdam, The Netherlands, 1976; pp. 405–478.
6. Enfield, D.B.; Alfaro, E.J. The dependence of Caribbean rainfall on the interaction of the tropical Atlantic and Pacific Oceans. J. Clim. **1999**, 12, 2093–2103.
7. Wu, R.; Zhang, L. Biennial relationship of rainfall variability between central America and Equatorial South America. Geophys. Res. Lett. **2010**, 37, L08701.
8. Wang, L.; Yu, J.Y.; Paek, H. Enhanced biennial variability in the Pacific due to Atlantic capacitor effect. Nat. Commun. **2017**, 8, 14887.
9. Peña, M.; Douglas, M.W. Characteristics of wet and dry spells over the Pacific side of Central America during the rainy season. Mon. Weather Rev. **2002**, 130, 3054–3073.
10. Karnauskas, K.B.; Seager, R.; Giannini, A.; Busalacchi, A. A simple mechanism for the climatological midsummer drought along the Pacific coast of Central America. Atmósfera **2013**, 26, 261–281.
11. Waylen, P.R.; Quesada, M. The Effect of Atlantic and Pacific Sea Surface Temperatures on the Mid-Summer Drought of Costa Rica. In Environmental Change and Water Sustainability; García-Ruiz, J., Jones, J., Arnáez, J., Eds.; Instituto Pirenaico de Ecología, Consejo Superior de Investigaciones Científicas: Zaragoza, Spain, 2002; pp. 197–209.
12. Curtis, S. Diurnal cycle of rainfall and surface winds and the mid-summer drought of Mexico/Central America. Clim. Res. **2004**, 27, 1–8.
13. Rauscher, S.A.; Giorgi, F.; Diffenbaugh, N.S.; Seth, A. Extension and intensification of the Meso-American mid-summer drought in the twenty-first century. Clim. Dyn. **2008**, 31, 551–571.
14. Small, R.J.O.; de Szoeke, S.P.; Xie, S.P. The Central American midsummer drought: Regional aspects and large-scale forcing. J. Clim. **2007**, 20, 4853–4873.
15. Maldonado, T.; Rutgersson, A.; Alfaro, E.; Amador, J.; Claremar, B. Interannual variability of the midsummer drought in Central America and the connection with sea surface temperatures. Adv. Geosci. **2016**, 42, 35–50.
16. Cowpertwait, P.; Isham, V.; Onof, C. Point process models of rainfall: Developments for fine-scale structure. Proc. R. Soc. A Math. Phys. Eng. Sci. **2007**, 463, 2569–2587.
17. Burton, A.; Kilsby, C.G.; Fowler, H.J.; Cowpertwait, P.; O’Connell, P. RainSim: A spatial-temporal stochastic rainfall modelling system. Environ. Model. Softw. **2008**, 23, 1356–1369.
18. Lettenmaier, D. Stochastic modeling of precipitation with applications for climate model downscaling. In Analysis of Climate Variability; von Storch, H., Navarra, A., Eds.; Springer: Berlin/Heidelberg, Germany, 1995; pp. 197–212.
19. Steyn, D.; Moisseeva, N.; Harari, O.; Welch, W.J. Temporal and Spatial Variability of Annual Rainfall Patterns in Guanacaste, Costa Rica. Technical Report, The University of British Columbia. 2016. Available online: https://hdl.handle.net/2429/59971 (accessed on 8 February 2023).
20. Giannini, A.; Kushnir, Y.; Cane, M.A. Interannual variability of Caribbean rainfall, ENSO, and the Atlantic Ocean. J. Clim. **2000**, 13, 297–311.
21. Davis, R.A.; Dunsmuir, W.T.; Wang, Y. On autocorrelation in a Poisson regression model. Biometrika **2000**, 87, 491–505.
22. Box, G.E.P.; Jenkins, G. Time Series Analysis, Forecasting and Control; Holden-Day, Incorporated: San Francisco, CA, USA, 1990.
23. Hyndman, R.J.; Athanasopoulos, G. Forecasting: Principles and Practice, 2nd ed.; OTexts: Melbourne, Australia, 2013.
24. Hasan, M.T.; Yan, G.; Ma, R. Analysis of periodic patterns of daily precipitation through simultaneous modeling of its serially observed occurrence and amount. Environ. Ecol. Stat. **2014**, 21, 811–824.
25. Jørgensen, B. The Theory of Dispersion Models; Monographs on Statistics and Applied Probability; Chapman & Hall: London, UK, 1997.
26. Pace, L.; Salvan, A. Principles of Statistical Inference from a Neo-Fisherian Perspective; Advanced Series on Statistical Science and Applied Probability; World Scientific: Singapore, 1997.
27. McCulloch, C.E.; Searle, S.R. Generalized, Linear, and Mixed Models; John Wiley & Sons: Hoboken, NJ, USA, 2001.
28. Hund, S.V. Community-Based Stream and Groundwater Monitoring and Future Change Impact Modelling of a Socio-Ecohydrological System to Inform Drought Adaptation in the Seasonally-Dry Tropics. Ph.D. Thesis, The University of British Columbia, Vancouver, BC, Canada, 2018.

**Figure 1.** Location map of the Futuragua study domain showing topography and the five meteorological stations.

**Figure 2.** Annual cycle of total monthly rainfall values at the Liberia station (**left**) from 1980 to 2020 and at the Nicoya station (**right**) from 1980 to 2016. The red line represents the observed median rainfall in each month.

**Figure 3.** Predicted (using the double-Gaussian and naive approaches) and observed total spring rainfall values at the Liberia station (**left**) from 2006 to 2020 and at the Nicoya station (**right**) from 2007 to 2016. Rainfall is reported on the scale $\log({Y}_{it}+1)$.

**Figure 4.** Predicted (using the double-Gaussian and naive approaches) and observed total summer rainfall values at the Liberia station (**left**) from 2006 to 2020 and at the Nicoya station (**right**) from 2007 to 2016. Rainfall is reported on the scale $\log({Y}_{it}+1)$.

**Figure 5.** Predicted (using the double-Gaussian and naive approaches) and observed total annual rainfall values at the Liberia station (**left**) from 2006 to 2020 and at the Nicoya station (**right**) from 2007 to 2016. Rainfall is reported on the scale $\log({Y}_{it}+1)$.

**Figure 6.** One-month-ahead predicted (using the double-Gaussian and naive approaches) and observed monthly rainfall values at the Liberia station from 2006 to 2020. “Month” is the number of months since January 2006.

**Figure 7.** One-month-ahead predicted (using the double-Gaussian and naive approaches) and observed monthly rainfall values at the Nicoya station from 2007 to 2016. “Month” is the number of months since January 2007.

**Figure 8.** Three-month-ahead predicted (using the double-Gaussian and naive approaches) and observed monthly rainfall values at the Liberia station from 2006 to 2020. “Month” is the number of months since January 2006.

**Figure 9.** Three-month-ahead predicted (using the double-Gaussian and naive approaches) and observed monthly rainfall values at the Nicoya station from 2007 to 2016. “Month” is the number of months since January 2007.

Station | Parameter | Estimate | Standard Error | p-Value |
---|---|---|---|---|
Liberia | ${\beta}_{10}$ | 5.774 | 0.150 | 0.000 |
Liberia | ${\beta}_{11}$ | −0.583 | 0.117 | 0.000 |
Liberia | ${\beta}_{12}$ | −0.342 | 0.237 | 0.150 |
Liberia | ${l}_{1}$ | 5.820 | 0.050 | 0.000 |
Liberia | ${\varphi}_{1}^{*}$ | −0.021 | 0.083 | 0.798 |
Liberia | ${\beta}_{20}$ | 5.849 | 0.128 | 0.000 |
Liberia | ${\beta}_{21}$ | −0.278 | 0.065 | 0.000 |
Liberia | ${\beta}_{22}$ | 0.185 | 0.206 | 0.368 |
Liberia | ${l}_{2}$ | 9.198 | 0.062 | 0.000 |
Liberia | ${\varphi}_{2}^{*}$ | 0.593 | 0.074 | 0.000 |
Liberia | ${\delta}^{*}$ | 0.426 | 0.105 | 0.000 |
Liberia | ${\sigma}_{0}^{*}$ | 0.143 | - | - |
Liberia | ${\sigma}_{1}^{*}$ | −0.330 | - | - |
Nicoya | ${\beta}_{10}$ | 5.825 | 0.099 | 0.000 |
Nicoya | ${\beta}_{11}$ | −0.217 | 0.077 | 0.005 |
Nicoya | ${\beta}_{12}$ | −0.237 | 0.163 | 0.144 |
Nicoya | ${l}_{1}$ | 5.885 | 0.071 | 0.000 |
Nicoya | ${\varphi}_{1}^{*}$ | 0.621 | 0.096 | 0.000 |
Nicoya | ${\beta}_{20}$ | 6.054 | 0.094 | 0.000 |
Nicoya | ${\beta}_{21}$ | −0.209 | 0.048 | 0.000 |
Nicoya | ${\beta}_{22}$ | 0.093 | 0.154 | 0.544 |
Nicoya | ${l}_{2}$ | 9.137 | 0.060 | 0.000 |
Nicoya | ${\varphi}_{2}^{*}$ | 0.652 | 0.079 | 0.000 |
Nicoya | ${\delta}^{*}$ | 0.594 | 0.166 | 0.000 |
Nicoya | ${\sigma}_{0}^{*}$ | 0.343 | - | - |
Nicoya | ${\sigma}_{1}^{*}$ | −0.782 | - | - |

**Table 2.** Estimates and standard errors of some of the parameters of the regression model with AR(1) errors.

Station | Parameter | Estimate | Standard Error | p-Value |
---|---|---|---|---|
Liberia | ${\gamma}_{1}$ | −0.104 | 0.049 | 0.032 |
Liberia | ${\gamma}_{2}$ | −0.380 | 0.108 | 0.000 |
Liberia | ${\gamma}_{3}$ | −0.303 | 0.074 | 0.000 |
Liberia | ${\gamma}_{4}$ | 0.014 | 0.115 | 0.902 |
Liberia | Lag 1 Coefficient | 0.076 | 0.045 | - |
Liberia | Residual Variance | 0.459 | - | - |
Nicoya | ${\gamma}_{1}$ | −0.025 | 0.055 | 0.646 |
Nicoya | ${\gamma}_{2}$ | −0.152 | 0.121 | 0.212 |
Nicoya | ${\gamma}_{3}$ | −0.239 | 0.083 | 0.004 |
Nicoya | ${\gamma}_{4}$ | −0.105 | 0.136 | 0.441 |
Nicoya | Lag 1 Coefficient | 0.100 | 0.048 | - |
Nicoya | Residual Variance | 0.538 | - | - |

Station | Parameter | Units | Estimate | Standard Error | p-Value |
---|---|---|---|---|---|
Liberia | ${b}_{1}$ | °C${}^{-1}$ | −0.004 | 0.097 | 0.967 |
Liberia | ${b}_{2}$ | °C${}^{-1}$ | −0.281 | 0.089 | 0.002 |
Liberia | ${b}_{3}$ | °C${}^{-1}$ | −0.322 | 0.048 | <0.001 |
Liberia | ${b}_{4}$ | °C${}^{-1}$ | 0.037 | 0.111 | 0.738 |
Liberia | ${\tau}^{2}$ | - | 0.024 | - | - |
Liberia | $\rho $ | - | 0.437 | - | - |
Nicoya | ${b}_{1}$ | °C${}^{-1}$ | −0.029 | 0.086 | 0.732 |
Nicoya | ${b}_{2}$ | °C${}^{-1}$ | −0.147 | 0.070 | 0.036 |
Nicoya | ${b}_{3}$ | °C${}^{-1}$ | −0.231 | 0.040 | <0.001 |
Nicoya | ${b}_{4}$ | °C${}^{-1}$ | −0.173 | 0.095 | 0.069 |
Nicoya | ${\tau}^{2}$ | - | 0.022 | - | - |
Nicoya | $\rho $ | - | 0.467 | - | - |

**Table 4.** Ratio of the prediction error (PE) based on the double-Gaussian model (where ONI is assumed known) to that based on the naive method (sample means/medians by month).

Station | Quantity | Scale | Mean or Median | Error Measure | PE Ratio |
---|---|---|---|---|---|
Liberia | Spring total | $\log({Y}_{it}+1)$ | Mean | MSE | 0.730 |
Liberia | Summer total | $\log({Y}_{it}+1)$ | Mean | MSE | 0.591 |
Liberia | Annual total | $\log({Y}_{it}+1)$ | Mean | MSE | 0.608 |
Liberia | 1-month-ahead | ${Y}_{it}$ | Median | MAE | 1.050 |
Liberia | 3-month-ahead | ${Y}_{it}$ | Median | MAE | 1.049 |
Nicoya | Spring total | $\log({Y}_{it}+1)$ | Mean | MSE | 0.836 |
Nicoya | Summer total | $\log({Y}_{it}+1)$ | Mean | MSE | 0.655 |
Nicoya | Annual total | $\log({Y}_{it}+1)$ | Mean | MSE | 0.515 |
Nicoya | 1-month-ahead | ${Y}_{it}$ | Median | MAE | 1.003 |
Nicoya | 3-month-ahead | ${Y}_{it}$ | Median | MAE | 0.999 |
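The PE ratios reported in Tables 4 to 6 compare a model's prediction error with that of the naive method, using mean squared error (MSE) for mean predictions and mean absolute error (MAE) for median predictions. A minimal sketch of that calculation follows; the function names and the toy numbers are hypothetical and are not taken from the study.

```python
def mse(predicted, observed):
    """Mean squared error between two equal-length sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def mae(predicted, observed):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def pe_ratio(model_pred, naive_pred, observed, error=mse):
    """Model prediction error divided by naive prediction error."""
    return error(model_pred, observed) / error(naive_pred, observed)


# Hypothetical toy values on an arbitrary scale.
observed = [5.0, 6.0, 7.0]
model_pred = [5.5, 6.0, 6.5]
naive_pred = [6.0, 6.0, 6.0]  # e.g., a historical mean for the same period

ratio_mse = pe_ratio(model_pred, naive_pred, observed, error=mse)
ratio_mae = pe_ratio(model_pred, naive_pred, observed, error=mae)
print(ratio_mse, ratio_mae)
```

A ratio below 1 means the model predicts better than the naive method on the chosen error measure; a ratio near 1 means the model adds little.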

**Table 5.** Ratio of the prediction error (PE) based on the regression model with AR(1) errors (where ONI may be assumed known or unknown) to that based on the naive method (sample means/medians by month).

Station | Quantity | Scale | Mean or Median | Error Measure | PE Ratio (ONI Known) | PE Ratio (ONI Unknown) |
---|---|---|---|---|---|---|
Liberia | Spring total | ${Y}_{it}^{0.25}$ | Mean | MSE | 0.732 | 1.000 |
Liberia | Summer total | ${Y}_{it}^{0.25}$ | Mean | MSE | 0.604 | 1.000 |
Liberia | Annual total | ${Y}_{it}^{0.25}$ | Mean | MSE | 0.606 | 0.988 |
Liberia | 1-month-ahead | ${Y}_{it}$ | Median | MAE | 0.927 | 0.962 |
Liberia | 3-month-ahead | ${Y}_{it}$ | Median | MAE | 0.947 | 0.998 |
Nicoya | Spring total | ${Y}_{it}^{0.25}$ | Mean | MSE | 0.895 | 1.002 |
Nicoya | Summer total | ${Y}_{it}^{0.25}$ | Mean | MSE | 0.642 | 1.000 |
Nicoya | Annual total | ${Y}_{it}^{0.25}$ | Mean | MSE | 0.512 | 0.951 |
Nicoya | 1-month-ahead | ${Y}_{it}$ | Median | MAE | 0.926 | 0.949 |
Nicoya | 3-month-ahead | ${Y}_{it}$ | Median | MAE | 0.975 | 1.006 |

**Table 6.** Ratio of the prediction error (PE) based on the Tweedie model (where ONI may be assumed known or unknown) to that based on the naive method (sample means/medians by month).

Station | Quantity | Scale | Mean or Median | Error Measure | PE Ratio (ONI Known) | PE Ratio (ONI Unknown) |
---|---|---|---|---|---|---|
Liberia | Spring Total | ${Y}_{it}$ | Mean | MSE | 0.728 | 1.000 |
Liberia | Summer Total | ${Y}_{it}$ | Mean | MSE | 0.703 | 1.000 |
Liberia | Annual Total | ${Y}_{it}$ | Mean | MSE | 0.338 | 1.000 |
Liberia | 1-Month-Ahead | ${Y}_{it}$ | Mean | MSE | 0.931 | 0.940 |
Liberia | 3-Month-Ahead | ${Y}_{it}$ | Mean | MSE | 1.017 | 0.987 |
Nicoya | Spring Total | ${Y}_{it}$ | Mean | MSE | 0.904 | 1.000 |
Nicoya | Summer Total | ${Y}_{it}$ | Mean | MSE | 0.610 | 1.000 |
Nicoya | Annual Total | ${Y}_{it}$ | Mean | MSE | 0.344 | 1.000 |
Nicoya | 1-Month-Ahead | ${Y}_{it}$ | Mean | MSE | 0.919 | 0.965 |
Nicoya | 3-Month-Ahead | ${Y}_{it}$ | Mean | MSE | 0.963 | 0.970 |


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Altman, R.M.; Harari, O.; Moisseeva, N.; Steyn, D.
Statistical Modelling of the Annual Rainfall Pattern in Guanacaste, Costa Rica. *Water* **2023**, *15*, 700.
https://doi.org/10.3390/w15040700
