# Cokriging Prediction Using as Secondary Variable a Functional Random Field with Application in Environmental Pollution

## Abstract

## 1. Introduction and Bibliographical Review

## 2. Cokriging Using as Secondary Variable a Functional Random Field

#### 2.1. Ordinary Cokriging Predictor

- (i) $2{\gamma}_{lq}({\mathit{s}}_{i},{\mathit{s}}_{j})=\mathrm{Var}({X}_{l}({\mathit{s}}_{i})-{X}_{q}({\mathit{s}}_{j}))$, for $l,q=1,\dots ,m$.
- (ii) ${\mathit{\gamma}}_{lq}^{\top}=({\gamma}_{lq}({\mathit{s}}_{1},{\mathit{s}}_{0}),\cdots ,{\gamma}_{lq}({\mathit{s}}_{n},{\mathit{s}}_{0}))$.
- (iii) ${\mathsf{\Gamma}}_{lq}=\left(\begin{array}{ccc}{\gamma}_{lq}({\mathit{s}}_{1},{\mathit{s}}_{1})& \cdots & {\gamma}_{lq}({\mathit{s}}_{1},{\mathit{s}}_{n})\\ \vdots & \ddots & \vdots \\ {\gamma}_{lq}({\mathit{s}}_{n},{\mathit{s}}_{1})& \cdots & {\gamma}_{lq}({\mathit{s}}_{n},{\mathit{s}}_{n})\end{array}\right)$.
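The vector ${\mathit{\gamma}}_{lq}$ and the matrix ${\mathsf{\Gamma}}_{lq}$ can be assembled numerically once a variogram model is available. The following is a minimal sketch, assuming an isotropic spherical model and a direct variogram ($l=q$, so that the value at distance zero is 0). The function names, the site coordinates, and the sill are hypothetical and are not taken from the paper. For $l\ne q$, a fitted cross-variogram function would be passed in place of `gamma`.

```python
import math

def spherical_variogram(h, sill, range_km, nugget=0.0):
    """Spherical semivariogram model (assumed here for illustration)."""
    if h == 0.0:
        return 0.0  # direct variogram: zero at distance zero
    if h >= range_km:
        return nugget + sill
    r = h / range_km
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def distance(p, q):
    """Euclidean distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def variogram_matrix(sites, gamma):
    """n x n matrix with entries gamma(|s_i - s_j|)."""
    return [[gamma(distance(si, sj)) for sj in sites] for si in sites]

def variogram_vector(sites, s0, gamma):
    """Vector with entries gamma(|s_i - s_0|) for the prediction site s_0."""
    return [gamma(distance(si, s0)) for si in sites]

# Hypothetical site coordinates (km) and prediction location.
sites = [(0.0, 0.0), (3.0, 4.0), (8.0, 0.0)]
s0 = (3.0, 0.0)

def gamma(h):
    # Unit sill and an 8 km range are illustrative values only.
    return spherical_variogram(h, sill=1.0, range_km=8.0)

Gamma = variogram_matrix(sites, gamma)
gamma0 = variogram_vector(sites, s0, gamma)
```

Because the model depends only on distance, `Gamma` is symmetric with a zero diagonal, and any pair of sites farther apart than the range takes the sill value.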

#### 2.2. Cokriging Predictor Using Functional Secondary Variables

**Remark**

## 3. Real Data Analysis

#### 3.1. Definition of the Problem upon Study

#### 3.2. Cokriging Prediction of PM10 Using WS Curves

The calculations were carried out with the `R` software. The `R` package named `gstat` was used to estimate the corresponding parameters. The data and `R` code used in this empirical application are available at [40]. The estimated range of the simple and cross-variograms was 8 km (about one-third of the maximum distance between the monitoring sites in Figure 1).

## 4. Concluding Remarks, Discussion, and Future Research

#### 4.1. Conclusions and Discussion

- (i) A cokriging predictor considering a functional secondary variable was proposed.
- (ii) After smoothing by using basis functions, an ordinary cokriging predictor was defined by means of several secondary variables (as many as the basis functions used for smoothing the data).
- (iii) Cokriging was considered to be a better option than kriging, because including one or more secondary variables in the prediction process reduces uncertainty.
- (iv) It was shown how to use the proposed methodology when there are many measurements of a secondary variable over time.
- (v) An illustration with a real data set was considered to predict PM10 values in Bogotá city by using a cokriging predictor with wind speed as the functional secondary variable.
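Items (i) and (ii) can be written compactly as follows. This is a sketch under assumed notation and not necessarily the authors' exact formulation: $B_j$, $\chi_{\mathit{s}_i}$, and the weights $\lambda_{li}$ are symbols introduced here for illustration, $X_1$ denotes the primary variable (PM10 in the application), and $a_{ij}$ are the basis coefficients reported in Table 1.

$$
\chi_{\mathit{s}_i}(t)\approx \sum_{j=1}^{k} a_{ij}\,B_j(t),\qquad i=1,\dots ,n,
$$

$$
\widehat{X}_1(\mathit{s}_0)=\sum_{i=1}^{n}\lambda_{1i}\,X_1(\mathit{s}_i)+\sum_{j=1}^{k}\sum_{i=1}^{n}\lambda_{(j+1)i}\,a_{ij},
$$

$$
\sum_{i=1}^{n}\lambda_{1i}=1,\qquad \sum_{i=1}^{n}\lambda_{(j+1)i}=0,\quad j=1,\dots ,k.
$$

With these standard ordinary cokriging constraints, and assuming each variable has a constant unknown mean, the predictor is unbiased. The $k$ coefficient fields act as $m-1=k$ secondary variables, which gives $k=7$ in the application with a Fourier basis.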

#### 4.2. Further Work

- (i) A cokriging predictor with functional variables can be studied under non-stationarity [21].
- (ii) An extension to the multivariate case is also of practical relevance [41].
- (iii)
- (iv)
- (v) The new methodology proposed in this investigation can be of interest in diverse areas where functional data analysis is used [29].
- (vi)
- (vii) Autoregressive model-based fuzzy clustering can be used for detecting information redundancy in air pollution monitoring networks [49].
- (viii)

## Appendix A

#### Appendix A.1. Cokriging System of Equations

#### Appendix A.2. Coefficients of Basis Functions

## References

**Figure 1.** Spatial location of ten air quality monitoring stations in the Bogotá area. Circles are proportional to maximum PM10 values (the maximum at each station was calculated from data collected hourly from 26 January 2011 at 12:00 p.m. to 3 February 2011 at 2:00 p.m.).

**Figure 2.** WS values (in m/s) at ten air quality monitoring stations in the Bogotá area (data at each monitoring station were collected every two hours from 26 January 2011 at 10:00 a.m. to 3 February 2011 at 12:00 p.m.). In total, WS data were collected in 98 time periods (time period 1 corresponds to 10:00 a.m. on 26 January 2011 and time period 98 to 12:00 p.m. on 3 February 2011).

**Figure 3.** WS curves obtained by smoothing the data set of each station using a Fourier basis (of dimension $k=7$). Time period 1 corresponds to 10:00 a.m. on 26 January 2011 and time period 98 to 12:00 p.m. on 3 February 2011.

**Figure 4.** Map of PM10 predictions in the Bogotá area obtained with cokriging and a secondary variable corresponding to WS curves.

**Figure 5.** Map of PM10 prediction variances, with the highest magnitudes (dark grey) corresponding to zones distant from the sampling sites.

**Table 1.** PM10 values and coefficients ${a}_{ij}$, for $i=1,\dots ,n$ and $j=1,\dots ,7$, of Fourier basis functions fitted to WS data (collected every two hours between 26 January 2011 and 3 February 2011 at each of ten environmental monitoring stations in Bogotá, Colombia).

Station ID | Monitoring Station | PM10 | $a_{i1}$ | $a_{i2}$ | $a_{i3}$ | $a_{i4}$ | $a_{i5}$ | $a_{i6}$ | $a_{i7}$ |
---|---|---|---|---|---|---|---|---|---|
1 | Carvajal | 242 | 4.35 | 2.31 | 1.87 | 1.40 | -0.40 | -0.10 | -0.67 |
2 | Fontibon | 174 | 11.24 | 3.97 | 2.12 | 2.17 | -1.05 | -0.37 | -0.83 |
3 | Guaymaral | 179 | 3.49 | 1.67 | 1.26 | 0.85 | -0.48 | -0.51 | -0.50 |
4 | Kennedy | 265 | 9.76 | 3.52 | 1.72 | 1.54 | -0.55 | -0.05 | -0.60 |
5 | Las Ferias | 116 | 6.90 | 2.17 | 1.37 | 0.91 | -0.69 | -0.26 | -0.56 |
6 | Simón Bolivar | 130 | 5.77 | 2.13 | 1.52 | 0.88 | -0.27 | -0.48 | -0.24 |
7 | Puente Aranda | 237 | 9.80 | 3.00 | 1.49 | 1.77 | -0.37 | -0.48 | -0.63 |
8 | Suba | 157 | 6.41 | 0.76 | 1.63 | 0.51 | -0.13 | -0.83 | -0.23 |
9 | Tunal | 162 | 4.64 | 1.94 | 1.27 | 1.07 | -0.33 | -0.01 | -0.35 |
10 | Usaquen | 118 | 5.39 | 1.12 | -0.22 | 0.19 | -0.07 | -0.17 | -0.59 |
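Coefficients such as the $a_{ij}$ in Table 1 are typically obtained by a least-squares fit of the basis functions to each station's time series. The sketch below illustrates this for a Fourier basis of dimension $k=7$ over 98 time periods. It is a sketch under stated assumptions, not the authors' code: the series is synthetic, the period of 97 time units is assumed, and the basis is left unnormalized, so the scale of the coefficients may differ from that in Table 1. All function names are hypothetical.

```python
import math

def fourier_basis(t, k, period):
    """Evaluate k Fourier basis functions at t: 1, sin(wt), cos(wt), sin(2wt), ..."""
    omega = 2.0 * math.pi / period
    values = [1.0]
    harmonic = 1
    while len(values) < k:
        values.append(math.sin(harmonic * omega * t))
        if len(values) < k:
            values.append(math.cos(harmonic * omega * t))
        harmonic += 1
    return values

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_fourier(times, values, k, period):
    """Least-squares coefficients from the normal equations (B'B) a = B'y."""
    B = [fourier_basis(t, k, period) for t in times]
    BtB = [[sum(row[p] * row[q] for row in B) for q in range(k)] for p in range(k)]
    Bty = [sum(row[p] * y for row, y in zip(B, values)) for p in range(k)]
    return solve(BtB, Bty)

# Synthetic wind-speed-like series built from known coefficients (not real data).
times = list(range(1, 99))  # 98 time periods
period = 97.0               # assumed period, in time-period units
true_coef = [2.0, 0.5, -0.3, 0.0, 0.2, 0.0, 0.1]
ws = [sum(a * b for a, b in zip(true_coef, fourier_basis(t, 7, period)))
      for t in times]

coef = fit_fourier(times, ws, k=7, period=period)
```

Since the synthetic series lies exactly in the span of the seven basis functions, `coef` recovers `true_coef` up to rounding error. With real measurements the fit would instead smooth out variation not captured by the basis.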

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

