Trend assessment for a CO2 and CH4 data series in northern Spain

The main objective of this paper is to implement different methods to assess the salient features of the data trend for a CO2 and CH4 data series. Said series was obtained at the Low Atmosphere Research Centre (41°48′49′′ N, 4°55′59′′ W) using a Picarro analyser (G1301). Different functions were employed to determine and quantify the data trend. The first was a harmonic function based on a third-degree polynomial. An increasing trend, below 2.30 ppm year-1 for CO2 and below 11.90 ppb year-1 for CH4, was reported. Epanechnikov, Gaussian, biweight, tricubic, rectangular and triangle kernels, were applied with a 500-day bandwidth for the trend. The best fit was obtained by the biweight kernel (r>0.20), with an increasing trend around 1.80 ppm year-1 for CO2 and around 7.15 ppb year-1 for CH4. The final analysis, which included local linear regression functions also applying a 500-day bandwidth, revealed increasing trends for both CO2, around 1.98 ppm year-1, and CH4, around 10.85 ppb year-1. Trend values were far more accelerated in the latter years of the series, regardless of the chosen function. This paper demonstrated the usefulness of the mathematical functions, allowing for an accurate determination of the data trend.


Introduction
CO2 and CH4 have been the focus of intensive research at different sites in an effort to gain insights into their evolution, given their link to climate change [1]. Small changes in the trends of greenhouse gases can have a major impact on the Earth, and thus may be of great significance for climate change [2].
The decomposition of an atmospheric time series into its constituent parts is essential for identifying and isolating variations of interest from a data set, and is widely used to obtain information about sources, sinks and trends in climatically important gases. An atmospheric data record is the result of a long-term trend, a short term and a random term resulting from anomalous observations. Due to the periodic and irregular variations on both long and short timescales, analysis of atmospheric time series is a complex process. The long-term is mainly derived from fossil fuel burning processes and from land-use change emissions, overlapped to a large interannual variability related to climate-driven changes in source and sinks [2,3]. Studying the data trend requires subtracting the seasonal component from the complete time series by a deseasonalization process. This procedure involves fitting appropriate mathematical functions to the data [2]. pollution analysis. Cleveland [7] suggested local regression methods to obtain visual information from scatterplots [8]. Kernel smoothing functions are an extension of the locally weighted average method which has been widely used in atmospheric temporal series [9]. Moreover, kernel regression methods have successfully been used in different air pollution data. According to de Haan [10], Lorimer was the first author to use the kernel method for atmospheric dispersion modelling of inert airborne contaminants [11]. Since then, many authors have employed this mathematical technique to atmospheric time series [9,12].
Thus, the main goal of the current study is to quantify the trend evolution of a CO2 and CH4 data series by applying harmonic functions, kernel techniques and, finally, local linear functions. The results may be helpful as regards improving our understanding of the temporal evolution of both gases in a Mediterranean rural area of northern Spain.

Data acquisition and calibration
The measuring campaign took place from mid October 2010 to late February 2016, and consisted of 30-min mixing ratio averages. Semi-hourly measurements were divided into diurnal and nocturnal records. Measurements were carried out at the Low Atmosphere Research Centre, CIBA station (41°48′49″ N, 4°55′59″ W) with a Picarro G1301 analyser, which is based on cavity ring-down spectroscopy. A detailed explanation of this technique is given in [13,14].
The analyser is equipped with three external solenoid valves which measure at a height of 1.8, 3.7 and 8.3 m. Only the highest level was considered in the current paper. Values provided by the device were slightly corrected by using linear equations obtained with calibrations performed fortnightly with three NOAA calibration gas standards. The following equations, taken from Pérez et al. [15] (in ppm), were applied to the data to correct the measurements obtained with the analyser: where C means the corrected value.

Site location
The sampling area is located in a semi-natural area over an extensive plateau 845 m above sea level in northern Spain. Located 24 and 40 km from two of the region's main cities, and with no major roads nearby, the station was able to provide essential information on CO2 and CH4 sources and sinks within the Spanish upper plateau (Fig. 1).
Agricultural crops, Mediterranean shrubs and stands of coniferous trees make up the surrounding vegetation. The site is classed as Mediterranean North, with maximum precipitation in autumn and spring and minimum during the summer season [16,18] (Fig. 2).

Harmonic function
The harmonic function employed (Eq. 3) is based on the equation proposed by Fernández-Duque et al. [6]. In this study, the authors analysed the temporal evolution for a CO2 and CH4 data set with a harmonic function comprising a third-degree polynomial term which expresses the data trend and a series of four harmonics which reflect seasonality. The equation used is described below: where y expresses the mixing ratio in ppm for CO2 and ppb for CH4, whereas time (t) is expressed in years. The four harmonics involved in the equation consider the amplitude as a fixed (k=0) and variable (k=1) term over time in order to obtain more accurate results. The unknown coefficients of the multiple harmonic regression were obtained using MATLAB © software.

Kernel function
The average smoothed mixing ratios employing the kernel procedure can be obtained by using where t refers to time expressed in years, h is the smoothing parameter expressed in days, N is the number of data, K is the kernel function employed, and yi is the mixing ratio in a time ti. Six different kernel functions were considered to smooth out the mixing ratios, and are presented in Table 1. Table 1. Kernel functions employed in the study.

Kernel function a K(u)
Epanechnikov It should be taken into account that the u value for which k (u) > 0 takes positive values ranges between -1 and 1, except for the Gaussian kernel which comprises the whole real line. Each of the kernels employed in the smoothing process is centred as its respective data value ti and is scaled in width relative to the shapes by the smoothing parameter h. If the h value is small, great weight is attached to the neighbouring observations, resulting in a rough estimation. On the other hand, a large h value may lead to significant smoothing, masking the behaviour at certain scales [17]. In order to avoid oscillations, the use of high smoothing factors is recommended even though some peak locations may be lost. A 500-day bandwidth for expressing the trend was used.

Local linear function
The second non-parametric technique employed in this paper was the weighted local linear regression function, which may be stated as follows: where a0 and a1 were calculated according to expressions 6 and 7: where wi are the weights and and are weighted mean values, calculated according to the procedure described by Pérez et al. [8]. It should be considered that in this case, the Epanechnikov kernel was used, since it is the most widely used kernel in atmospheric studies and also because of its simplicity. A 500-day bandwidth for expressing the trend was also used for this analysis.   Figure 3 (a-b) shows the variability of average CO2 mixing ratios over time. CO2 evolution presents a maximum in the winter season, followed by another maximum in spring which is only detectable for nocturnal records. The winter peak is mainly related with local emissions from vehicles, industrial activities and domestic heating, as suggested García et al. [18]. The spring maximum at nights is related with strong temperature inversions at the sampling site, which contribute to trapping CO2 emissions in the highly stable stratified mixing layer during the growing season [5]. Furthermore, an increase in precipitation in spring leads to vigorous vegetation growth.

CO2 and CH4 mixing ratio evolution
Thus, respiration processes play an important role, leading to higher mixing ratio values at nights.
Minimum mixing ratio values were obtained during summer when a decrease in the consumption of fossil fuel is observed together with a photosynthetic uptake contribution.
The CH4 cycle (Fig. 3 c-d) is less pronounced than the CO2 cycle, although differences between day and night could also be inferred. Higher values were detected at nights due to lower planetary boundary layer heights which inhibit vertical mixing movements. 90 th percentile values, which are around 1.95 ppb, might be attributed to fugitive emissions from an urban landfill near the sampling site and from high emissions from the nearby cities as pointed out by Sánchez et al. [19]. The main statistics for the CO2 and CH4 mixing ratios at the CIBA station are presented in Table 2. All the statistics studied (Table 2) display higher values for nocturnal records. The different behaviour between day and night is in line with a lower development of the mixing layer height together with greater emissions from vehicles at night leading to higher mixing ratio values. In contrast, more intense solar radiation during the daytime produces an increase in the mixing layer height and, therefore, efficient dilution of gases from surface decreasing mixing ratio measures [18].
Had we considered the whole data set without differentiating day measurements from night measurements, the main CO2 statistics presented in Table 2 would be consistent with those presented by García et al. [18]. Figure 4 reflects the CO2 and CH4 trend evolution by using the three different mathematical functions described in section 2.3. As regards harmonic functions, only the polynomial term was employed in this section since this is the term that expresses the data trend.

Data trend evolution
Six different kernel functions, listed in Table 1, were used to obtain the CO2 and CH4 trend. However, only the biweight kernel is presented in Fig. 4 since this function provides the best fit for both the CO2(r>0.40) and CH4(r>0.20) dataset. As regards the local weighted method, the Epanechnikov kernel was employed because of its simplicity and low computational effort, since only a fraction of the observations in the neighbourhood of the calculation points were considered. In both cases, a 500-day bandwidth was applied to the dataset. We found that trend curves fit the experimental data provided by the Picarro (G1301) analyser well, showing few differences among the three mathematical functions. A well-defined increasing pattern over the years was inferred from Fig. 4, which is partially related with a rise in anthropogenic emissions from industrial activities and from the urban landfill near the monitoring station for CH4, as well as from farming vehicle emissions [8]. The same behaviour was described by García et al. [18]. Table 3 gives a summary of the mean trend values by using the different functions. Higher nocturnal than diurnal values for both CO2 and CH4 were found mainly as a result of the evolution of the boundary layer, biological processes and the influence of industrial emissions. year -1 for CH4. As regards the kernel functions, slightly lower values were recorded for CO2, around 1.80 ppm year -1 , while for CH4 the mean growth rate yielded 7.15 ppb year -1 . In all cases, mean trend values were far more accelerated in the latter years, which might be the result of a rise in emissions of both gases due to industrial activities. These mean trend values were in agreement with those found by Zhang et al. and Zhu et al. [20,21], which were in the range of 1.7-3.6 ppm year -1 for CO2, and with the results reported by several authors [22,23,24], which were in the range of 6-10 ppb year -1 for CH4.
Harmonic functions lead to greater values and smoother trend curves since all the data observations contribute to the calculations, while kernel and local weighted functions only consider a narrow interval calculation, ranging between -1 and 1. Furthermore, the border effect of kernel and local functions causes less smooth graphical outputs at the start and end of the series.

Conclusions
This paper presents the results of the data trend evolution for a CO2 and CH4 data series by using harmonic, kernel and local linear functions at the CIBA station. We have also investigated the influence of six different kernel functions when estimating the data trend applying the kernel procedure. We failed to find any major differences between the kernels, although slightly better fits were obtained with the biweight kernel. For this reason, only the biweight kernel is presented in the present study. Our comparisons between functions showed no major differences between the outputs, and report very similar trends for both CO2 and CH4. Thus, we can conclude that harmonic functions, Kernel functions and local weighted functions were found to be effective methods of describing the CO2 and CH4 data trend in this Mediterranean rural area of northern Spain. These mathematical functions described the behaviour at the site, producing meaningful information for air quality modelling of both important greenhouse gases in the lower atmosphere.
Acknowledgments: The authors of this paper acknowledge financial support from the Spanish Ministry of Economy and Competitiveness and ERDF funds (projects CGL2009-11979 and CGL2014-53948-P).