# Evaluation of Analysis by Cross-Validation. Part I: Using Verification Metrics


## Abstract

We evaluate surface air quality analyses of O₃ and PM₂.₅ against passive and active observations using standard model verification metrics such as bias, fractional bias, fraction of correct within a factor of 2, correlation and variance. The results show that verification of analyses against active observations always gives an overestimation of the correlation and an underestimation of the variance. Evaluation against passive or any independent observations displays a minimum of variance and a maximum of correlation as we vary the observation weight, thus providing a means to obtain the optimal observation weight. For the times and dates considered, the correlation between (independent) observations and the model is 0.55 for O₃ and 0.3 for PM₂.₅, and for the analysis with optimal observation weight it increases to 0.74 for O₃ and 0.54 for PM₂.₅. We show that bias can be a misleading measure of evaluation and recommend the use of a fractional bias such as the modified normalized mean bias (MNMB). An evaluation of the model bias and variance as a function of model values also shows a clear linear dependence on the model values for both O₃ and PM₂.₅.

## 1. Introduction

Hourly surface objective analyses of O₃, PM₂.₅, PM₁₀, NO₂, and SO₂ are produced from the AirNow gateway with additional observations from Canada. As those surface analyses are not used to initialize an air quality model, the question arises of how to evaluate them. We conduct routine evaluations using the same set of observations as those used to produce the analysis. Once in a while, when there is a change in the system, a more thorough evaluation is conducted in which we leave out a certain fraction of the observations and use them as independent observations, a process known as cross-validation. Observations used in producing the analysis are called active observations, while those left out and used only for evaluation are called passive observations. Cross-validation is used to validate any model that depends on data. In air quality applications it has been used, for example, for mapping and exposure models [6,7,8]. The purpose of this two-part paper is to examine the relative merits of using active or passive observations (or independent observations in general) viewed through different evaluation metrics, and also to develop, in the second part, a mathematical framework to estimate the analysis error and, in doing so, to improve the analysis.

## 2. Experimental Design

#### 2.1. Design of the Objective Analysis Solver

#### 2.2. Cross-Validation

Observations first undergo a gross-error check, with values rejected when they exceed a prescribed threshold (in μg/m³ for PM₂.₅). Observations are also discarded based on innovations (or observed-minus-background values) when, for ozone, they exceed 50 ppbv (100 μg/m³ for PM₂.₅) in absolute value. The quality-controlled observations are then separated into three sets of observations of equal numbers, i.e., a 3-fold cross-validation procedure, as illustrated in Figure 1.
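The innovation screening and the 3-fold separation by station can be sketched as follows. This is a minimal illustration under our own naming conventions, not code from the operational system; the regular picking by station ID mimics the selection strategy described for Figure 1.

```python
import numpy as np

def quality_control(obs, bkg, innov_max):
    """Keep observations whose innovation (O - B) is within the allowed
    absolute threshold, e.g., 50 ppbv for ozone."""
    obs = np.asarray(obs, float)
    bkg = np.asarray(bkg, float)
    return np.abs(obs - bkg) <= innov_max

def threefold_split(station_indices):
    """Assign stations to 3 subsets by regular picking on the sorted order,
    which yields a spatially random selection of stations (cf. Figure 1)."""
    order = np.argsort(station_indices)
    return [order[k::3] for k in range(3)]

# toy example: 9 stations, one of which fails the innovation check
obs = np.array([60.0, 42.0, 35.0, 120.0, 55.0, 48.0, 40.0, 33.0, 70.0])
bkg = np.array([50.0, 40.0, 30.0, 45.0, 50.0, 45.0, 42.0, 30.0, 65.0])
keep = quality_control(obs, bkg, innov_max=50.0)
folds = threefold_split(np.arange(9)[keep])
```

Each fold then serves in turn as the passive set while the other two thirds remain active.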

#### 2.3. Verification Metrics

#### 2.4. Description of the Ensemble of Analyses and Their Verification Statistics

Analyses of O₃ and PM₂.₅ at 21 UTC for a period of 60 days (14 June to 12 August 2014) were performed with given input error statistics using the operational model GEM-MACH and the real-time AirNow observations as described in the introduction, with quality-controlled observations (see Section 2.2 above). In all experiments, the observation and background error variances, ${\sigma}_{o}^{2}$ and ${\sigma}_{b}^{2}$, used in the analysis are uniform. The prescribed observation and background error covariances are given as $\tilde{R}={\sigma}_{o}^{2}I$ and $\tilde{B}={\sigma}_{b}^{2}C$, where the correlation model $C$ is a homogeneous isotropic second-order autoregressive model with a correlation length obtained by maximum likelihood, as in Ménard et al. [17]. Note that aside from quality control, which rejects some observations, the analysis uses the observation values and model realizations as is, with no bias correction.
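The structure of such an analysis can be sketched with a small optimal-interpolation example using the SOAR correlation model. This is a schematic of the standard textbook formulation, not the operational solver; grid, observation placement, and parameter values are illustrative only.

```python
import numpy as np

def soar_correlation(dist, L):
    """Homogeneous isotropic second-order autoregressive (SOAR) correlation,
    c(r) = (1 + r/L) exp(-r/L). L is the length-scale parameter, which is
    related to but distinct from the correlation length (see Section 3)."""
    r = np.asarray(dist, float) / L
    return (1.0 + r) * np.exp(-r)

def oi_analysis(xb, y, H, sigma_b2, sigma_o2, C):
    """Optimal-interpolation analysis with B = sigma_b2*C and R = sigma_o2*I:
        x_a = x_b + B H^T (H B H^T + R)^{-1} (y - H x_b)."""
    B = sigma_b2 * C
    R = sigma_o2 * np.eye(len(y))
    w = np.linalg.solve(H @ B @ H.T + R, y - H @ xb)
    return xb + B @ H.T @ w

# 1-D grid of 5 points, 100 km apart, with observations at points 1 and 3
x = np.arange(5) * 100.0
C = soar_correlation(np.abs(x[:, None] - x[None, :]), L=124.0)
H = np.zeros((2, 5))
H[0, 1] = 1.0
H[1, 3] = 1.0
xb = np.zeros(5)                      # background (innovation coordinates)
y = np.array([10.0, -4.0])            # observations
xa = oi_analysis(xb, y, H, sigma_b2=1.0, sigma_o2=1.0, C=C)
```

With equal error variances ($\gamma = 1$) the analysis at an observed point draws part way toward the observation; as ${\sigma}_{o}^{2}\to 0$ it reproduces the observations exactly.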

## 3. Verification against Passive and Active Observations

The analyses of O₃ and PM₂.₅ were produced using a fixed homogeneous isotropic correlation function, where the correlation length was obtained by maximum likelihood using a second-order autoregressive model and the error variances were computed using a local Hollingsworth-Lönnberg fit [17]. A correlation length of 124 km was obtained for O₃ and of 196 km for PM₂.₅. Our correlation length is defined from the curvature at the origin, as in Daley [27], and is different from the length-scale parameter of the correlation model (see Ménard et al. [17] for a discussion of these issues). We performed a series of 60-day analyses for different values of ${\sigma}_{o}^{2}$ and ${\sigma}_{b}^{2}$, but such that their sum respects the innovation variance consistency, ${\sigma}_{o}^{2}+{\sigma}_{b}^{2}=\mathrm{var}(O-B)$, an important condition for an optimal analysis [25], as explained in Section 2.4. This is the experimental procedure used to generate Figures 2-7. The results are shown for a wide range of variance ratios $\gamma ={\sigma}_{o}^{2}/{\sigma}_{b}^{2}$, from ${10}^{-2}$ to ${10}^{2}$, in Figures 2-5 and 7 in particular. Note that $\gamma \ll 1$ corresponds to a very large observation weight while $\gamma \gg 1$ corresponds to a very small observation weight.
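The two constraints, fixed sum and prescribed ratio, determine both variances uniquely for each $\gamma$, which can be sketched as:

```python
import numpy as np

def variances_from_gamma(gamma, var_omb):
    """Split the innovation variance var(O-B) into sigma_o^2 and sigma_b^2
    so that sigma_o^2 + sigma_b^2 = var(O-B) (innovation variance
    consistency) while gamma = sigma_o^2 / sigma_b^2."""
    sigma_b2 = var_omb / (1.0 + gamma)
    return gamma * sigma_b2, sigma_b2

# the range 10^-2 .. 10^2 explored in Figures 2-7 (var(O-B)=100 illustrative)
gammas = np.logspace(-2, 2, 41)
pairs = [variances_from_gamma(g, var_omb=100.0) for g in gammas]
```

Each pair then defines one analysis experiment along the abscissa of the figures.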

Figure 2 displays $\mathrm{var}(O-A)$, the variance of observation-minus-analysis residuals, for O₃ (left panel) and PM₂.₅ (right panel). The solid blue line represents $\mathrm{var}(O-B)$, the variance of observation-minus-model, i.e., prior to an analysis. As mentioned in Section 2.4, in the cross-validation experiments we averaged the verification metric over the 3-fold subsets so that, in effect, the total number of observations used for verification is ${N}_{s}$, the total number of stations. We thus argue that the verification sampling error for the cross-validation experiments (red curve) is the same as for the active observations using the full analysis (i.e., the analysis using the total number of stations; black curve). In addition, note from Figure 1 that the station sampling strategy gives rise to a spatially random selection of stations, so that the individual metrics on each set should be comparable. Furthermore, there are roughly 1300 quality-controlled O₃ observations over the domain and 750 quality-controlled PM₂.₅ observations, each with 60 time samples or less. To give some qualitative idea of the sampling error, the metric values for the individual 3-fold sets are presented in the Supplementary Material Figures S1 and S2, where we can see that for $\mathrm{var}(O-A)$ and $\mathrm{cor}(O,A)$ the values for the individual sets are nearly indistinguishable from the means of the three subsets.

- red curve: using analysis with $2{N}_{s}/3$ observations with an evaluation at passive sites
- green curve: using analysis with $2{N}_{s}/3$ observations with an evaluation at active sites
- black curve: using analysis with ${N}_{s}$ observations with an evaluation at active sites.

For O₃ and an optimal ratio, the $\mathrm{var}(O-A)$ at passive sites is 51.02 ppbv² (red curve) while at active sites it is 22.77 ppbv² (green curve). For PM₂.₅ and an optimal ratio, the $\mathrm{var}(O-A)$ at passive sites is 38.09 ${(\mathsf{\mu}{\mathrm{g}/\mathrm{m}}^{3})}^{2}$ (red curve) while at active sites it is 15.41 ${(\mathsf{\mu}{\mathrm{g}/\mathrm{m}}^{3})}^{2}$ (green curve). For both species, evaluation at the active sites thus underestimates the error variance by more than a factor of 2.
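This systematic effect can be reproduced with a toy scalar experiment on synthetic data (ours, not the paper's analyses): because the analysis is fitted to the active observation, its residual against that same observation shrinks, while an independent observation at the same site sees the full analysis error plus its own observation error.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
truth = rng.normal(0.0, 2.0, n)
sigma_o = sigma_b = 1.0                              # gamma = 1
obs_active = truth + rng.normal(0.0, sigma_o, n)     # assimilated
obs_passive = truth + rng.normal(0.0, sigma_o, n)    # withheld
background = truth + rng.normal(0.0, sigma_b, n)

# scalar optimal analysis drawing only on the active observation
k = sigma_b**2 / (sigma_b**2 + sigma_o**2)
analysis = background + k * (obs_active - background)

# active-site verification shrinks by (1-k)^2: (1-k)^2 * var(O-B) = 0.5
var_active = np.var(obs_active - analysis)
# passive-site verification sees sigma_o^2 + analysis error = 1.5
var_passive = np.var(obs_passive - analysis)
```

Here the active-site variance underestimates the passive-site variance by a factor of 3, illustrating why cross-validation is needed for an honest error estimate.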

Figure 3 shows the correlation between observations and analysis. For O₃, it increases from a value of 0.55 with respect to the model to a value of 0.74 with respect to an optimal analysis (when $\gamma ={\sigma}_{o}^{2}/{\sigma}_{b}^{2}$ is optimal). For PM₂.₅, the correlation against the model has a value of 0.3, which indicates essentially no skill, increasing to a value of 0.54 for an optimal analysis, which represents a modest but useable skill. The correlation evaluated at the active sites for an optimal ratio is 0.85 for O₃ (green curve) and 0.74 for PM₂.₅ (green curve), a substantial overestimation with respect to the values obtained at passive sites.

Figure 4 displays the fraction of correct within a factor of 2 (FC2) for O₃ (left panel) and PM₂.₅ (right panel). Note that the scale on the ordinate is quite different between the left and right panels. Although the results bear similarity with the correlation between O and A presented in Figure 3, the maximum with passive observations is reached at larger $\gamma$ values than those obtained for $\mathrm{var}(O-A)$ or $\mathrm{cor}(O,A)$, which are identical to each other. Individual fold results are presented in the Supplementary Material Figure S3.
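The FC2 metric is not restated in this excerpt; a common ratio-based definition, which we assume here, counts the fraction of analysis values within a factor of 2 of the observations:

```python
import numpy as np

def fc2(obs, pred):
    """Fraction of correct within a factor of 2: fraction of pairs with
    0.5 <= pred/obs <= 2 (assumes strictly positive concentrations)."""
    ratio = np.asarray(pred, float) / np.asarray(obs, float)
    return float(np.mean((ratio >= 0.5) & (ratio <= 2.0)))

# toy example: ratios 0.5, 0.9, 2.1, 1.5 -> three of four within a factor of 2
score = fc2([10.0, 10.0, 10.0, 10.0], [5.0, 9.0, 21.0, 15.0])
```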

The biases for O₃ and PM₂.₅ in Figure 5 behave differently. The blue curve is the mean $(O-B)$ and thus indicates that for O₃, on average over all observation stations (for the times and dates considered), the model overpredicts, and that for PM₂.₅ the model underpredicts.

The bias sampling error is small for O₃ at passive sites and of the order of $\pm 0.1\text{ }\mathsf{\mu}{\mathrm{g}/\mathrm{m}}^{3}$ (on average) for PM₂.₅ at passive sites (results shown in the Supplementary Material Figure S4). The distinction between the red, black and green curves may not be statistically significant for either O₃ or PM₂.₅. However, the difference between the analysis bias and the model bias is large and statistically significant (see Supplementary Material). For O₃, the model bias is eliminated at the passive observation sites (red curve) as long as the observation weight satisfies $\gamma \le 1$. The situation is not so clear for PM₂.₅. In fact, when the observation weight is small, we get the intriguing result that the bias of the analysis is larger than that of the model. How can that be when the observation weight is small (i.e., $\gamma >1$); should the analysis not be close to the model values? This apparent contradiction reveals a more complex issue underlying the bias metric.

The per-bin biases in Figure 6 for both O₃ and PM₂.₅ show an underprediction for low model values and an overprediction for large model values. The origin of this bias is not known, but one would argue that it is not directly related to chemistry as such, since both constituents, O₃ and PM₂.₅, present the same feature. Possible explanations could be related to the model boundary layer; emissions that are too low in lightly polluted areas and too large in polluted areas; insufficient transport away from polluted areas to unpolluted areas; or species destruction/scavenging that is too weak in lightly polluted areas and too strong in polluted areas. The lower panels of Figure 6a,b show the count of stations per model bin. We observe that the majority of stations have O₃ model values in the range of 40 to 55 ppbv, where the bias is negative. Over all the stations, this gives rise to a negative mean $(O-B)$, and this is how we make the claim that the model overpredicts. However, for PM₂.₅ the situation is different: the majority of stations lie in the low model value range, and there are gradually fewer stations for increasingly larger model values. Although $(O-B)$ has large negative values in the high model value bins while the small model value bins have positive $(O-B)$ values, the effect over all stations is to yield a modestly positive mean $(O-B)$, and thus the model underestimates PM₂.₅. The results of the analysis evaluated at the passive observation sites are presented with the yellow and grey histogram boxes: in yellow, near-optimal analyses with the optimal observation weight, as determined by the minimum of $\mathrm{var}(O-A)$; in grey, non-optimal analyses with $\gamma =10$.

Since for O₃ the majority of observations lie in the range 40 to 55 ppbv, the $(O-A)$ for the optimal analyses at passive observation sites is nearly zero. However, for the non-optimal analysis with $\gamma =10$, the $(O-A)$ at passive sites is negative, i.e., the analysis is overpredicting, as shown in Figure 5.

For PM₂.₅, the weighted sum of the $(O-A)$ bins is such that over all stations the bias for an optimal analysis is nearly zero. In the case of the non-optimal analysis with $\gamma =10$, the weighted sum of the nearly anti-symmetric $(O-A)$ bias per bin gives more weight to the positive bias at smaller model values, so that overall there is a positive $(O-A)$, as in Figure 5.

The modified normalized mean biases (MNMB) of O₃ and PM₂.₅ for passive and active observations are displayed in Figure 7, using the same colors as in Figure 2. We note immediately that the MNMB analysis bias does not exceed the MNMB model bias, in contrast with what we observed for the plain bias metric of PM₂.₅ (Figure 5, right panel). The MNMB also varies smoothly as a function of $\gamma$ (at variance with the bias metric for PM₂.₅ in Figure 5).

Individual-fold results are presented in the Supplementary Material Figure S5, in particular for PM₂.₅, where we can actually deduce that the difference between the cross-validation and the validation against active observations is statistically significant when $\gamma <1$. There is also another important point: although analyses are designed to reduce the error variance, it so happens that for a near-optimal analysis the fractional bias MNMB is very small, around 1% for O₃ and about 1-2% for PM₂.₅. We argue that this results from an optimal use of the observations.
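For reference, the MNMB is commonly defined as a bounded, symmetric fractional bias; the sketch below orients it as observation-minus-analysis to match the $(O-A)$ convention used here (the paper's exact sign convention is not restated in this excerpt):

```python
import numpy as np

def mnmb(obs, ana):
    """Modified normalized mean bias:
        MNMB = (2/N) * sum_i (O_i - A_i) / (O_i + A_i),
    bounded in [-2, 2] and symmetric under over- and underprediction,
    unlike the plain mean bias."""
    obs = np.asarray(obs, float)
    ana = np.asarray(ana, float)
    return float(2.0 * np.mean((obs - ana) / (obs + ana)))

# a perfect analysis yields a zero fractional bias
base = mnmb([40.0, 50.0, 60.0], [40.0, 50.0, 60.0])
```

Because each term is normalized by the local magnitude, large-concentration bins cannot dominate the metric the way they can with the plain bias.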

For O₃, the model error variance against observations increases gradually with larger model values (Figure 8). However, the fraction of analysis variance versus model variance is roughly uniform across all bins. This can be explained by the fact that the observation and background error variances are uniform, so the reduction of variance across all bins is uniform as well. However, the situation is different for PM₂.₅. We note a relatively poor performance of the model at low model values, with a standard deviation of 7 μg/m³. For slightly larger model values (3-6 μg/m³), the error standard deviation drops to 5.5 μg/m³ and then increases almost linearly with model values. The fraction of analysis variance versus model variance decreases steadily with larger model values. These results thus indicate that the assumption that the observation and background error variances are uniform and independent of the model value may have to be revisited.
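The per-bin diagnostics of Figures 6 and 8 can be sketched as follows (names and bin edges are ours; the residuals are $(O-B)$ or $(O-A)$ values per station):

```python
import numpy as np

def binned_residual_stats(model, resid, edges):
    """For each bin of model values, return (station count, mean residual,
    residual variance), i.e., the per-bin bias and variance diagnostics."""
    model = np.asarray(model, float)
    resid = np.asarray(resid, float)
    idx = np.digitize(model, edges)          # bin index per station
    out = []
    for k in range(1, len(edges)):
        sel = idx == k
        n = int(sel.sum())
        out.append((n,
                    float(np.mean(resid[sel])) if n else float("nan"),
                    float(np.var(resid[sel])) if n else float("nan")))
    return out

# toy example: 4 stations, two bins of model values
model = np.array([5.0, 6.0, 15.0, 18.0])
resid = np.array([1.0, 3.0, -2.0, -4.0])
stats = binned_residual_stats(model, resid, edges=[0.0, 10.0, 20.0])
```

The station counts per bin are exactly the weights that turn the per-bin biases into the overall mean bias discussed above.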

## 4. Conclusions

We have produced surface analyses of O₃ and PM₂.₅ over North America for a period of 60 days and presented an evaluation using different metrics: bias, modified normalized mean bias, variance of observation-minus-analysis residuals, correlation between observations and analysis, and fraction of correct within a factor of 2.

An optimal analysis reduces the error variance at independent observation sites with respect to the model for both species (by 21% for PM₂.₅). The correlation between the analysis and independent observations is also significantly improved with an optimal analysis: the correlation between the model and independent observations is 0.55 for O₃ and increases to 0.74 with the analysis, while for PM₂.₅ the correlation between the model and independent observations is only 0.3 (essentially no skill) but rises to 0.54 for the analysis.

The bias per bin of model values displays the same linear dependence on model values for both O₃ and PM₂.₅, thus indicating that the source of this bias is not related to chemistry. The fact that, over the entire domain, the model overestimates O₃ and underestimates PM₂.₅ is simply a result of the distribution of concentration values across stations. We have not conducted a systematic study of model error for other times of day and other periods of the year, but it would be very interesting to do so, to see whether changes of biases are due primarily to changes in the distribution of values rather than a fundamental change in the bias per model value bin.

A dependence of the error variance on model values is also observed, for O₃ and in a more pronounced way for PM₂.₅.

## Supplementary Materials

The following are available online. Figure S1: Variance of observation-minus-analysis residuals of O₃ and PM₂.₅ for the individual sets. Figure S2: Same as Figure S1 but for the correlation between observations and analysis. Figure S3: Same as Figure S1 but for the fraction of correct within a factor of 2. Figure S4: Same as Figure S1 but for bias. Figure S5: Same as Figure S1 but for modified normalized mean bias.

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

1. Ménard, R.; Robichaud, A. The chemistry-forecast system at the Meteorological Service of Canada. In Proceedings of the ECMWF Seminar on Global Earth-System Monitoring, Reading, UK, 5-9 September 2005; pp. 297-308.
2. Robichaud, A.; Ménard, R. Multi-year objective analysis of warm season ground-level ozone and PM₂.₅ over North America using real-time observations and Canadian operational air quality models. Atmos. Chem. Phys. **2014**, 14, 1769-1800.
3. Robichaud, A.; Ménard, R.; Zaïtseva, Y.; Anselmo, D. Multi-pollutant surface objective analyses and mapping of air quality health index over North America. Air Qual. Atmos. Health **2016**, 9, 743-759.
4. Moran, M.D.; Ménard, S.; Pavlovic, R.; Anselmo, D.; Antonopoulus, S.; Robichaud, A.; Gravel, S.; Makar, P.A.; Gong, W.; Stroud, C.; et al. Recent Advances in Canada's National Operational Air Quality Forecasting System, 32nd ed.; Springer: Dordrecht, The Netherlands, 2014.
5. Pudykiewicz, J.A.; Kallaur, A.; Smolarkiewicz, P.K. Semi-Lagrangian modelling of tropospheric ozone. Tellus B **1997**, 49, 231-248.
6. Cressie, N.; Wikle, C.K. Statistics for Spatio-Temporal Data; Wiley: Hoboken, NJ, USA, 2011.
7. Schneider, P.; Castell, N.; Vogt, M.; Dauge, F.R.; Lahoz, W.A.; Bartonova, A. Mapping urban air quality in near real-time using observations from low-cost sensors and model information. Environ. Int. **2017**, 106, 234-247.
8. Lindström, J.; Szpiro, A.A.; Oron, P.D.; Richards, M.; Larson, T.V.; Sheppard, L. A flexible spatio-temporal model for air pollution and spatio-temporal covariates. Environ. Ecol. Stat. **2014**, 21, 411-433.
9. Carmichael, G.R.; Sandu, A.; Chai, T.; Daescu, D.N.; Constantinescu, E.M.; Tang, Y. Predicting air quality: Improvements through advanced methods to integrate models and measurements. J. Comput. Phys. **2008**, 227, 3540-3571.
10. Dabberdt, W.F.; Carroll, M.A.; Baumgardner, D.; Carmichael, G.; Cohen, R.; Dye, T.; Ellis, J.; Grell, G.; Grimmond, S.; Hanna, S.; et al. Meteorological research needs for improved air quality forecasting: Report of the 11th prospectus development team of the US weather research program. Bull. Am. Meteorol. Soc. **2004**, 85, 563-586.
11. Sportisse, B. A review of current issues in air pollution modeling and simulation. Comput. Geosci. **2007**, 11, 159-181.
12. Elbern, H.; Strunk, A.; Nieradzik, L. Inverse modelling and combined state-source estimation for chemical weather. In Data Assimilation; Lahoz, W., Khattatov, B., Ménard, R., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 491-513.
13. Bocquet, M.; Elbern, H.; Eskes, H.; Hirtl, M.; Žabkar, R.; Carmichael, G.R.; Flemming, J.; Inness, A.; Pagowski, M.; Pérez Camaño, J.L.; et al. Data assimilation in atmospheric chemistry models: Current status and future prospects for coupled chemistry meteorology models. Atmos. Chem. Phys. **2015**, 15, 5325-5358.
14. Chai, T.; Carmichael, G.R.; Sandu, A.; Tang, Y.H.; Daescu, D.N. Chemical data assimilation of transport and chemical evolution over the Pacific (TRACE-P) aircraft measurements. J. Geophys. Res. **2006**, 111, D02301.
15. Sandu, A.; Chai, T. Chemical data assimilation—An overview. Atmosphere **2011**, 2, 426-463.
16. Marseille, G.J.; Barkmeijer, J.; De Haan, S.; Verkle, W. Assessment and tuning of data assimilation systems using passive observations. Q. J. R. Meteorol. Soc. **2016**, 142, 3001-3014.
17. Ménard, R.; Deshaies-Jacques, M.; Gasset, N. A comparison of correlation-length estimation methods for the objective analysis of surface pollutants at Environment and Climate Change Canada. J. Air Waste Manag. Assoc. **2016**, 66, 874-895.
18. Cohn, S.E.; Da Silva, A.; Guo, J.; Sienkiewicz, M.; Lamich, D. Assessing the effects of data selection with the DAO physical-space statistical analysis system. Mon. Weather Rev. **1998**, 126, 2913-2926.
19. Houtekamer, P.L.; Mitchell, H.L. A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Weather Rev. **2001**, 129, 123-137.
20. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; Chapman & Hall: New York, NY, USA, 1993.
21. Seigneur, C.; Pun, B.; Pai, P.; Louis, J.F.; Solomon, P.; Emery, C.; Morris, R.; Zahniser, M.; Worsnop, D.; Koutrakis, P.; et al. Guidance for the performance evaluation of three-dimensional air quality modeling systems for particulate matter and visibility. J. Air Waste Manag. Assoc. **2000**, 50, 588-599.
22. Chang, J.C.; Hanna, S.R. Air quality model performance evaluation. Meteorol. Atmos. Phys. **2004**, 87, 167-196.
23. Savage, N.H.; Agnew, P.; Davis, L.S.; Ordóñez, C.; Thorpe, R.; Johnson, C.E.; O'Connor, F.M.; Dalvi, M. Air quality modelling using the Met Office Unified Model (AQUM OS24-26): Model description and initial evaluation. Geosci. Model Dev. **2013**, 6, 353-372.
24. Katragkou, E.; Zanis, P.; Tsikerdekis, A.; Kapsomenakis, J.; Melas, D.; Eskes, H.; Flemming, J.; Huijnen, V.; Inness, A.; Schultz, M.G.; et al. Evaluation of near surface ozone over Europe from the MACC reanalysis. Geosci. Model Dev. **2015**, 8, 2299-2314.
25. Ménard, R. Error covariance estimation methods based on analysis residuals: Theoretical foundation and convergence properties derived from simplified observation networks. Q. J. R. Meteorol. Soc. **2016**, 142, 257-273.
26. Desroziers, G.; Berre, L.; Chapnik, B.; Poli, P. Diagnosis of observation, background, and analysis-error statistics in observation space. Q. J. R. Meteorol. Soc. **2005**, 131, 3385-3396.
27. Daley, R. Atmospheric Data Analysis; Cambridge University Press: New York, NY, USA, 1991; p. 457.
28. Ménard, R.; Deshaies-Jacques, M. Evaluation of analysis by cross-validation, Part II: Diagnostic and optimization of analysis error covariance. Atmosphere **2018**, 9, 70.

**Figure 1.** Spatial distribution of the three subsets of PM₂.₅ observations used for cross-validation. The selection algorithm is based on regular picking of stations by ID number.

**Figure 2.** Variance of observation-minus-analysis residuals of O₃ and PM₂.₅ for both active and cross-validation passive observations as a function of $\gamma ={\sigma}_{o}^{2}/{\sigma}_{b}^{2}$. (**a**) is for O₃ with ordinates in ppbv² units, and (**b**) is for PM₂.₅ with ordinates in ${(\mathsf{\mu}{\mathrm{g}/\mathrm{m}}^{3})}^{2}$. The red curve results from the evaluation at the passive observation sites (average of the 3-fold subsets). The black curve results from the evaluation at the active observation sites with analyses using all observations. The green curve results from the evaluation at the active observation sites in the cross-validation experiment (i.e., using 2/3 of the observations; average of the three subsets). The blue curve is the variance of observation-minus-model.

**Figure 3.** Correlation between observations and analysis for (**a**) O₃ and (**b**) PM₂.₅ for both active and cross-validation passive observations as a function of $\gamma ={\sigma}_{o}^{2}/{\sigma}_{b}^{2}$. The red, black and green curves are as in Figure 2.

**Figure 4.** Fraction of correct within a factor of 2 (FC2) for (**a**) O₃ and (**b**) PM₂.₅ for both active and cross-validation passive observations as a function of $\gamma ={\sigma}_{o}^{2}/{\sigma}_{b}^{2}$. The red, black and green curves are as in Figure 2.

**Figure 5.** Bias between observation and analysis for (**a**) O₃ and (**b**) PM₂.₅ for both active and cross-validation passive observations as a function of $\gamma ={\sigma}_{o}^{2}/{\sigma}_{b}^{2}$. The red, black and green curves are as in Figure 2.

**Figure 6.** Biases per bin of model values: (**a**) presents the statistics for O₃ and (**b**) for PM₂.₅. The upper portions of (**a**,**b**) are the residual statistics per bin: in black, the $(O-B)$; in grey, the $(O-A)$ at passive observation sites (mean of the 3-fold subsets) for a non-optimal analysis with $\gamma =10$; and in yellow, the $(O-A)$ at passive observation sites (mean of the 3-fold subsets) using the optimal observation weight. The lower portions of (**a**,**b**) are the station counts per bin of model values.

**Figure 7.** Modified normalized mean bias (MNMB) between observation and analysis for (**a**) O₃ and (**b**) PM₂.₅ for both active and cross-validation passive observations as a function of $\gamma ={\sigma}_{o}^{2}/{\sigma}_{b}^{2}$. The red, black and green curves are as in Figure 2.

**Figure 8.** Same as Figure 6 except that we display the variance of analysis-minus-passive observations per bin of model values.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Ménard, R.; Deshaies-Jacques, M. Evaluation of Analysis by Cross-Validation. Part I: Using Verification Metrics. *Atmosphere* **2018**, *9*, 86. https://doi.org/10.3390/atmos9030086