Comparison of Multiple Maximum and Minimum Temperature Datasets at Local Level: The Case Study of North Horr Sub-County, Kenya

Climate analyses at a local scale are an essential tool in the field of sustainable development. The evolution of reanalysis datasets and their greater reliability contribute to overcoming the scarcity of observed data in the southern areas of the world. The purpose of this study is to compute the reference monthly values and ranges of maximum and minimum temperatures for the eight main inhabited villages of North Horr Sub-County, in northern Kenya. The official ten-day dataset derived from the Kenyan Meteorological Department (KMD) and the monthly datasets derived from the ERA-Interim reanalysis (ERA), the Observational-Reanalysis Hybrid (ORH) and the Climate Limited Area Model driven by HadGEM2-ES (HAD) are assessed on a local scale using the most common statistical indices to determine which is most reliable in representing monthly maximum and minimum temperatures. Overall, ORH datasets showed lower biases and errors in representing local temperatures. Through an innovative methodology, a new set of monthly mean temperature values and ranges derived from ORH datasets is calculated for each location in the study area, in order to provide local communities with a historical benchmark against which present observations can be compared. The findings of this research provide insights for environmental risk management, supporting local populations in reducing their vulnerability.


Introduction
Several studies have classified Africa as the most vulnerable continent to the impacts of climate change, due to its dependence on agricultural activities as well as its poor financial, technical and institutional resilience capacities [1][2][3][4]. Local sustainable development in Africa is heavily threatened by the impacts that climate change has on livelihood activities, ecosystem services and water supply [5]. Among African countries, Kenya is one of the most vulnerable to climate change. According to the Notre Dame Global Adaptation Initiative index, which measures the current level of vulnerability to climate disruption for 182 countries, Kenya was ranked as the 38th most vulnerable country in the world to the effect of climate change and the 23rd least ready to face future impacts in 2020 [6]. Changes in rainfall patterns and increasing temperatures are expected to generate prolonged periods of drought and more intense floods in the country [7], causing huge economic losses, especially for the pastoralist communities in the north of the country [8]. Most of the rural communities in the northern regions live under conditions of poverty and extreme weather events will increasingly threaten the area [9]. Therefore, monitoring, forecasting and early warning systems are the best strategies to mitigate negative socio-economic impacts and strengthen the resilience of the communities [10,11].
The present investigation is based on the principles of the One Health approach. This multidisciplinary approach aims to achieve global health, addressing the needs of the most vulnerable populations and increasing their resilience on the basis of the intimate relationship between human health, animal health and environment conditions [12]. In this context, historical weather observations are essential, since they allow the examination of long-term climate trends and the comparison of past patterns with present and future values, in order to have a broader view of what is happening at the local scale. However, the weather observation network in arid and semi-arid lands (ASALs) has a poor spatial distribution. The majority of the land-based meteorological stations are located in the south and in the coastal areas of the country, which are the territories that attract most of the tourist flows [13]. Furthermore, due to the scarce investment in technological renovation of the infrastructures, the national meteorological network lacks modern facilities for data analysis, which are needed to adequately represent past and present local climate trends [14,15]. These observation deficiencies do not allow the proper interpretation and prediction of local extreme weather events and consequently prevent the mitigation of related risks [16].
With the progress of technologies and the increasing research effort, different sources have been produced over recent decades to fill these data gaps, such as remote sensing, climate models and reanalysis [17]. Recent studies have tried to assess the reliability of different reanalysis datasets in Kenya and, more generally, in East Africa [18][19][20][21]. However, all of these studies approached the issue with a regional rather than a local perspective.
The present study analyzes the reliability of different maximum and minimum temperature datasets derived from temperature reanalysis to assess which is the most appropriate to represent the local climate. In North Horr Sub-County, situated in Marsabit County in northern Kenya, there is only one land-based meteorological station, which was installed in North Horr village in 2019. Due to the lack of historical observations in the sub-county, an area within a 250 km radius from North Horr has been defined and the land-based meteorological stations located inside the area have been selected in order to maximize the spatial coverage and the local representativeness. Specifically, these land-based meteorological stations are situated in Lodwar, Moyale and Marsabit. The climatic products were evaluated against the historical observations recorded by the Lodwar, Marsabit and Moyale land-based meteorological stations using a direct, point-to-pixel validation based on a statistical indices approach [22].
As proposed by [23], the closest grid point to each land-based meteorological station has been selected for each dataset, regardless of the position of the station inside the grid box and of the spatial resolution of the datasets. Although the resolution of the datasets is known to have an influence on the validation results [24,25], the use of this approach is supported by the low topographic complexity of the area [22,26] and by the need to assess the datasets' performances in their original format, i.e., at their original resolutions. Thus, the evaluation concerns the ability of the datasets, as they are made available to end users, to capture local features, which reflects how the selected product will be used in future local-scale research on the area. Five performance indicators were considered in order to assess the goodness of fit of the datasets [23,27,28]. In addition, Taylor diagrams were produced in order to display the statistical agreement between the datasets and the observations. The datasets which provided the highest scores were chosen for the calculation of the reference T MAX and T MIN.
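The closest-grid-point selection described above can be illustrated with a minimal sketch. It assumes a regular latitude/longitude grid and uses the great-circle (haversine) distance; the grid, the station position and the function names `haversine_km` and `nearest_grid_point` are hypothetical and are not taken from the study.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_grid_point(station_lat, station_lon, grid_lats, grid_lons):
    """Return (i, j, distance_km) of the grid node closest to the station."""
    best = None
    for i, glat in enumerate(grid_lats):
        for j, glon in enumerate(grid_lons):
            d = haversine_km(station_lat, station_lon, glat, glon)
            if best is None or d < best[2]:
                best = (i, j, d)
    return best

# Hypothetical regular 0.25-degree grid and a made-up station position
# (illustrative values only, not the real station coordinates).
grid_lats = [2.0 + 0.25 * k for k in range(9)]   # 2.00 ... 4.00
grid_lons = [35.0 + 0.25 * k for k in range(9)]  # 35.00 ... 37.00
i, j, dist = nearest_grid_point(3.10, 35.60, grid_lats, grid_lons)
print(grid_lats[i], grid_lons[j], round(dist, 1))
```

The same selection would be repeated for each dataset at its own resolution, so that every product is compared with the station record through a single grid cell.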
The novelty of this study consists in the creation of an innovative methodology for the calculation of reference T MAX and T MIN values and temperature ranges. The reference values and temperature ranges obtained can provide a benchmark to strengthen the warning and response systems against extreme weather events. This methodology can support local populations in developing adaptation strategies and increasing their resilience against climatic anomalies. Section 2 describes the geographic features in the study area, the dataset used, and the methodology adopted. The results are presented and discussed in Section 3, whereas conclusions are drawn in Section 4.

Study Area
North Horr Sub-County is located within Marsabit County, in northern Kenya. In this area, the main populated villages, reported in Figure 1 as "main locations", are Balesa, Dukana, El-Gade, El-Hadi, Gas, Kalacha, Malabot and North Horr. The region is mainly arid and is classified as ASAL, with the exception of the areas around Mt. Marsabit and Mt. Kulal. The geomorphological configuration of the territory is mainly characterized by extensive plains with an altitude varying from 300 to 900 m a.s.l. The rainfall is variable and the evaporation rate exceeds ten times the rainfall amount, whereas the temperature ranges from 15 °C to 26 °C and the mean annual temperature is about 20.5 °C [29]. The temperature is mostly influenced by the altitude and its oscillation throughout the year is linked to the rainy seasons. Generally, in Kenya, there are two rainy seasons, with peaks from March to May and from October to December. The bimodal rainfall pattern is mainly influenced by the migration of the Inter-Tropical Convergence Zone and occasionally by the El Niño Southern Oscillation and the Indian Ocean Dipole, which can lead to extreme events such as floods and droughts [30][31][32].

Historical Temperature Series
Three land-based meteorological stations, located outside the study area but within a radius of 250 km from North Horr village, allowed us to overcome the lack of historical observations that characterizes the territory [22]. They are located in Lodwar, Marsabit and Moyale. Moreover, an automatic weather station has been active in North Horr since March 2019, within the framework of the One Health international cooperation project carried out in the sub-county. The characteristics of the land-based meteorological and automatic stations are summarized in Table 1. Finally, T MAX and T MIN observations recorded between March 2019 and June 2020 by the land-based automatic weather station located in North Horr were used for the validation of the results.
Regarding the temporal resolution, monthly T MAX and T MIN were considered suitable for the purpose of this study; temperatures recorded at shorter periodic intervals were converted into monthly observations.
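The conversion of shorter-interval records into monthly values can be sketched as follows. This is only an illustration under the assumption that each monthly value is the arithmetic mean of the available records in that month; the function name `monthly_means` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

def monthly_means(records):
    """Average sub-monthly values into one value per (year, month).

    records: iterable of (date, value) pairs; missing values are given as None
    and are skipped.
    """
    buckets = defaultdict(list)
    for day, value in records:
        if value is not None:
            buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

# Made-up ten-day T MAX values in degrees Celsius, for illustration only.
dekadal_tmax = [
    (date(2000, 1, 1), 34.0), (date(2000, 1, 11), 35.0), (date(2000, 1, 21), 36.0),
    (date(2000, 2, 1), 36.0), (date(2000, 2, 11), None), (date(2000, 2, 21), 37.0),
]
monthly = monthly_means(dekadal_tmax)
print(monthly)
```

Skipping missing values keeps a month usable when one record is absent, although a real workflow would probably also set a minimum number of records per month.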
As can be seen from Figure 2, the maximum and minimum temperatures recorded by the Lodwar station are significantly higher than those of Marsabit and Moyale. This difference derives from the different altitudes at which the stations are located: Lodwar station is located at 500 m, while those of Marsabit and Moyale are higher. Across the region, the cold season runs from June to August and the hot season from January to March.

Reanalysis Product Description
Four different climatic products were evaluated for the purpose of this study, as shown in Figure 3. According to [27,35,36], the Observational-Reanalysis Hybrid (ORH) should be the preferred data source for climate change studies in East Africa. ORH is a global [35] and regional (Northern/West/East Africa) [28] three-hourly, daily and monthly meteorological dataset developed through a spatial downscaling of the National Centers for Environmental Prediction and National Center for Atmospheric Research reanalysis, with a spatial resolution of 0.25°. The spatial downscaling accounts for elevation changes and is evaluated against ground stations, which allows random errors to be excluded. Moreover, ORH is corrected for temporal inhomogeneity and biases [28]. Another source of climate information is the regional climatic models derived from global climatic models through the dynamical downscaling method [37]. Recently, within the Coordinated Regional Downscaling Experiment community, different regional climatic models were developed for the African domain, with high spatial resolution and availability in different temporal domains. According to a recent study [21], the Climate Limited Area Model driven by HadGEM2-ES (HAD), with a resolution of 0.1°, is suitable for climate change studies in East Africa and was considered for this analysis. In addition, ERA-Interim (ERA) maximum and minimum temperature datasets with a spatial resolution of 0.125° were assessed. ERA has been issued for the period of 1979-2016 and combines weather observations and short-term forecasts initialized from previous analysis; it integrates daily surface air temperature observations and it is probably the most comprehensive reanalysis currently existing [33]. Finally, the last dataset considered is the ten-day dataset from the Kenyan Meteorological Department (KMD), which has a spatial resolution of 0.0375° and covers the period 1983-2014.
The KMD is a high-resolution, spatially and temporally complete gridded historical temperature dataset produced by the International Research Institute for Climate and Society (IRI), the Earth Institute at Columbia University [24].

Methodology
The study followed five steps (Figure 4):

1. Dataset performance assessment through the comparison between the historical observations of the land-based meteorological stations of Lodwar, Marsabit and Moyale and the dataset point values;
2. Evaluation of systematic errors in T MAX and T MIN seasonal representation of the chosen datasets at the land-based meteorological station level;
3. Calculation and validation of the monthly ranges of T MAX and T MIN for Lodwar, Marsabit and Moyale;
4. Calculation of the monthly reference values and ranges of T MAX and T MIN for all the eight reference points;
5. Validation of the results through the comparison between monthly T MAX and T MIN ranges and the observations recorded by the North Horr automatic land-based weather station.

The five methodological steps are described in detail below.

Dataset Performance Comparison
Point-to-pixel comparison is a commonly used method to compare ground observations with other data products, such as reanalysis temperature datasets [26]. This method has been implemented in the analysis to test the accuracy of each selected product in representing North Horr Sub-County local temperatures. Since the historical observations were available as monthly values, the temperature values from the datasets were firstly aggregated into monthly values. Therefore, KMD, ERA, ORH and HAD datasets were evaluated against the T MAX and T MIN observations recorded by the land-based meteorological stations of Lodwar, Marsabit and Moyale in the selected period of 1983-2014. Five statistical indices have been computed: the Bias, the mean absolute error (MAE), the mean squared error (MSE), the root mean squared error (RMSE) and the correlation coefficient (CC) [27,28,38]. On top of the above statistical indices, a Taylor diagram has been computed to display the statistical agreement between the datasets and the land-based meteorological station data. The diagram shows the agreement through the CC and the standard deviation (σ).
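As an illustration of the five indices, the sketch below computes them for a pair of aligned monthly series. It assumes the bias is defined as dataset minus observation, so that positive values indicate overestimation, and that CC is the Pearson correlation coefficient; the function name `validation_indices` and the sample values are hypothetical.

```python
import math

def validation_indices(obs, mod):
    """Bias, MAE, MSE, RMSE and Pearson CC for paired observed/dataset series."""
    pairs = [(o, m) for o, m in zip(obs, mod) if o is not None and m is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired values")
    diffs = [m - o for o, m in pairs]          # dataset minus observation
    bias = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    mse = sum(d * d for d in diffs) / n
    rmse = math.sqrt(mse)
    mean_o = sum(o for o, _ in pairs) / n
    mean_m = sum(m for _, m in pairs) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in pairs)
    var_o = sum((o - mean_o) ** 2 for o, _ in pairs)
    var_m = sum((m - mean_m) ** 2 for _, m in pairs)
    cc = cov / math.sqrt(var_o * var_m)        # undefined if either series is constant
    return {"bias": bias, "mae": mae, "mse": mse, "rmse": rmse, "cc": cc}

# Made-up monthly temperatures (degrees Celsius): the dataset is 1 degree too warm.
observed = [30.0, 31.0, 33.0, 32.0]
dataset = [31.0, 32.0, 34.0, 33.0]
scores = validation_indices(observed, dataset)
print(scores)
```

In this toy case the constant offset produces a bias, MAE and RMSE of 1 °C together with a perfect correlation, which shows why the error indices and the CC need to be read together.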

Systematic Error Evaluation at Land-Based Station Level
After establishing which datasets showed the best fit for the study area, the analysis of the ability to represent seasonal temperatures was carried out for each land-based meteorological station. Seasonal T MAX and T MIN boxplots were created to compare the differences between the observations and the dataset values. The analysis of the boxplots aimed to assess the distribution of the temperature values and to determine the presence of seasonal systematic errors in the reanalysis. Moreover, the comparison between the boxplots allowed us to choose the most appropriate position index to calculate the monthly reference temperatures.
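The statistics underlying a seasonal boxplot comparison can be sketched as below. The mapping of months to seasons is an assumption (December to February as winter, and so on), since the text does not define it; `SEASON_OF_MONTH`, `seasonal_box_stats` and the sample values are hypothetical.

```python
import statistics

# Assumed month-to-season mapping (not defined in the text).
SEASON_OF_MONTH = {12: "winter", 1: "winter", 2: "winter",
                   3: "spring", 4: "spring", 5: "spring",
                   6: "summer", 7: "summer", 8: "summer",
                   9: "fall", 10: "fall", 11: "fall"}

def seasonal_box_stats(monthly_values):
    """Group (month, value) pairs by season and return boxplot statistics.

    Each season needs at least two values for the quartiles to be computed.
    """
    groups = {}
    for month, value in monthly_values:
        groups.setdefault(SEASON_OF_MONTH[month], []).append(value)
    stats = {}
    for season, vals in groups.items():
        q1, median, q3 = statistics.quantiles(vals, n=4, method="inclusive")
        stats[season] = {"min": min(vals), "q1": q1, "median": median,
                         "q3": q3, "max": max(vals), "mean": statistics.mean(vals)}
    return stats

# Made-up winter values (degrees Celsius); the dataset is shifted by +1 degree.
obs = [(12, 34.0), (1, 35.0), (2, 35.0), (12, 36.0), (1, 37.0)]
mod = [(m, v + 1.0) for m, v in obs]
obs_stats = seasonal_box_stats(obs)
mod_stats = seasonal_box_stats(mod)
median_shift = mod_stats["winter"]["median"] - obs_stats["winter"]["median"]
print(obs_stats["winter"], median_shift)
```

A shift of the same sign in every season would point to a systematic error, whereas shifts of mixed sign suggest no systematic over- or underestimation.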

Reference Values and Range Computation and Validation at Land-Based Station Level
Monthly reference values were computed for both T MAX and T MIN timeseries by averaging the monthly temperature over the entire period considered (1983-2014). Temperature ranges were calculated with an amplitude of 2 degrees Celsius centered on the climatological monthly mean values. The amplitude of the ranges was determined from the difference between the 5th and 95th percentiles of the ORH monthly distribution of T MAX and T MIN. In this way, the ranges contain most of the historical observations and exclude the events of extreme heat or cold.
As a first level of verification, the monthly ranges were compared against historical maximum and minimum temperature observations recorded by the Lodwar and Moyale land-based meteorological stations, to assess the number of observations falling within the ranges and the number of outliers outside of the intervals.
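A minimal sketch of the reference-value and range computation follows. It assumes that "amplitude" denotes the total width of the range, so that a 2 °C amplitude corresponds to the mean plus or minus 1 °C; this reading, the function name `monthly_reference` and the sample values are assumptions rather than details confirmed by the text.

```python
import statistics

def monthly_reference(series, amplitude=2.0):
    """series: dict mapping (year, month) -> monthly temperature.

    Returns, per calendar month, the climatological mean, a range of total
    width `amplitude` centred on it (assumed reading of "amplitude"), and the
    spread between the 5th and 95th percentiles of the monthly values.
    """
    by_month = {}
    for (year, month), value in series.items():
        by_month.setdefault(month, []).append(value)
    out = {}
    for month, vals in sorted(by_month.items()):
        mean = statistics.mean(vals)
        # 19 cut points: 5%, 10%, ..., 95%
        cuts = statistics.quantiles(vals, n=20, method="inclusive")
        out[month] = {"reference": mean,
                      "lower": mean - amplitude / 2,
                      "upper": mean + amplitude / 2,
                      "p5_p95_spread": cuts[-1] - cuts[0]}
    return out

# Made-up January T MAX values (degrees Celsius) for five years.
series = {(1983, 1): 33.0, (1984, 1): 34.0, (1985, 1): 35.0,
          (1986, 1): 36.0, (1987, 1): 37.0}
reference = monthly_reference(series)
print(reference[1])
```

Reporting the percentile spread next to the fixed-width range makes it easy to check whether the chosen amplitude is narrower or wider than the bulk of the distribution for a given month.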
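The verification step can be illustrated by counting how many observations fall inside their monthly range. The sketch assumes inclusive range limits; the function name `range_coverage` and all values are hypothetical.

```python
def range_coverage(observations, ranges):
    """observations: list of (month, value); ranges: dict month -> (lower, upper).

    Returns the fraction of observations inside their monthly range
    (limits included) and the list of outliers.
    """
    inside, outliers = 0, []
    for month, value in observations:
        lower, upper = ranges[month]
        if lower <= value <= upper:
            inside += 1
        else:
            outliers.append((month, value))
    return inside / len(observations), outliers

# Made-up monthly ranges and station observations (degrees Celsius).
ranges = {1: (34.0, 36.0), 2: (35.0, 37.0)}
station_obs = [(1, 34.5), (1, 36.4), (2, 35.0), (2, 36.9), (1, 35.2)]
fraction, outliers = range_coverage(station_obs, ranges)
print(fraction, outliers)
```

Keeping the outliers rather than only the fraction allows the months in which the limits are exceeded to be inspected individually.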

Range Computation at Reference Point Level
Since the temperature ranges and reference values computed for Lodwar and Moyale locations showed good performances, the climatological reference values were calculated for all the reference locations (Balesa, Dukana, El-Hadi, El-Gade, Gas, Kalacha, Malabot, North Horr) using the same criteria.

Result Validation for North Horr
As a final test, North Horr temperature ranges were compared against the T MAX and T MIN observations collected from the automatic weather station located in North Horr for the period from March 2019 to June 2020. Daily temperature observations were converted into monthly measurements and subsequently evaluated in relation to the monthly thresholds.

Dataset Performance Comparison
In this section, the comparison between climate products and historical TMAX and TMIN observations is presented. Five different indices were computed: RMSE, MSE, MAE, bias and CC, as shown in Table 2.
Despite small exceptions, all products showed a tendency to overestimate temperatures, as suggested by positive values of the bias. Referring to MSE and RMSE indices, ORH and KMD showed better results. Indeed, ERA and HAD showed higher RMSE, MSE and biases compared to ORH and KMD datasets at almost every station. KMD showed higher CC results compared to other T MAX datasets; by contrast, mixed results were found for T MIN CCs. Overall, the KMD dataset showed better performances in representing maximum temperature, whereas the ORH dataset fit better for minimum temperature. Furthermore, the Taylor diagrams pictured in Figure 5 present the degree of the statistical agreement between the datasets and the recorded historical observations. In these diagrams, the CC (Table 2) between each series and the observed historical series is expressed by the azimuthal angle. Points closer to the historical series' marker (white point), with similar standard deviation and higher CC, correspond to the best-fit dataset.
According to the Taylor diagrams, on average, KMD and ORH showed a lower standard deviation and higher CC (as has already been presented in Table 2) in representing T MAX and Lodwar T MIN values, whereas HAD showed a higher CC and lower standard deviation in representing Marsabit T MIN , and ERA showed better results in representing Moyale T MIN .
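For readers unfamiliar with Taylor diagrams, the sketch below computes the quantities that place a dataset on the diagram: its standard deviation (radial distance), the azimuthal angle given by the arccosine of the CC, and the centred root-mean-square difference from the observations, which follows from the law of cosines. The function name `taylor_coordinates` and the sample series are hypothetical, and population standard deviations are assumed.

```python
import math
import statistics

def taylor_coordinates(obs, mod):
    """Standard deviations, CC, azimuthal angle (degrees) and centred RMS difference."""
    sd_obs = statistics.pstdev(obs)
    sd_mod = statistics.pstdev(mod)
    mean_o, mean_m = statistics.mean(obs), statistics.mean(mod)
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, mod)) / len(obs)
    cc = cov / (sd_obs * sd_mod)
    cc = max(-1.0, min(1.0, cc))  # guard against rounding slightly outside [-1, 1]
    angle = math.degrees(math.acos(cc))
    # Law-of-cosines relation between the two standard deviations and the CC.
    crmsd = math.sqrt(max(0.0, sd_obs ** 2 + sd_mod ** 2 - 2 * sd_obs * sd_mod * cc))
    return {"sd_obs": sd_obs, "sd_mod": sd_mod, "cc": cc,
            "angle_deg": angle, "crmsd": crmsd}

# Made-up series: the dataset is perfectly correlated but has twice the variability.
obs = [30.0, 32.0, 34.0, 32.0]
mod = [31.0, 35.0, 39.0, 35.0]
coords = taylor_coordinates(obs, mod)
print(coords)
```

In this example the point lies on the zero-angle axis but farther from the origin than the observations, which illustrates why both the angle and the standard deviation have to be close to those of the observed series for a dataset to be considered a good fit.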
Considering the overall validation results presented in Table 2 and the Taylor diagrams, ORH is the most accurate product in representing local T MIN, whereas KMD yields a better performance in describing local T MAX. Despite its coarser resolution, ORH showed significantly lower biases and errors at most of the validation stations. One reason behind these results could be the low topographic complexity of the area. According to [26], high-resolution temperature reanalysis is fundamental in a context with complex geomorphologic features, but not particularly beneficial in flat areas. The better performance of the KMD in representing Marsabit temperatures supports this hypothesis, as Marsabit is located on a hilltop. In this case, the higher-resolution grid may have resulted in better performance. Since KMD and ORH presented the best results in representing local temperatures, we evaluated which of these two datasets best approximated the seasonal distribution of maximum and minimum temperature, in order to decide which dataset to use in calculating the monthly reference values.

Systematic Error Evaluation at Land-Based Station Level
In this section, the ability of the ORH and KMD datasets to depict seasonal distributions of the temperature data is assessed.
Seasonal T MAX and T MIN boxplots were created in order to further compare the differences between observations and the datasets. Figure 6 shows the comparison between seasonal observed data and seasonal data from KMD and ORH reanalysis. ORH showed poor results in representing both T MAX and T MIN in Marsabit, overestimating both maximum and minimum temperatures and confirming the high bias values reported in Table 2. Conversely, ORH showed good results in representing all seasons' T MAX both for Lodwar and Moyale. However, it tended to underestimate Lodwar T MIN during fall, spring and summer and Moyale T MIN during fall, spring and winter, and to overestimate the winter temperature in Lodwar. The density distribution is represented next to the box graph for each season; several seasons appear to have normal distributions, whereas others have bimodal or not centered/symmetric distributions.
On the other hand, Figure 6 shows that the KMD tends to systematically overestimate both maximum and minimum temperatures.
Given the results described by the seasonal boxplots, it emerged that the temperature data from ORH reanalysis, with the exception of Marsabit, fit better than data from KMD to describe the seasonal pattern of the observations recorded by the land-based stations. Indeed, ORH seasonal boxplots did not show a tendency to systematically overestimate or underestimate temperatures recorded by both Lodwar and Moyale land-based meteorological stations. These results are in line with [27,35,36], whose outcomes underline that ORH is the most accurate data source for T MAX and T MIN at monthly resolutions compared to other climatic products in East Africa.
Therefore, monthly mean reference values for each location were calculated as the monthly mean value of T MAX and T MIN data derived from the ORH reanalysis.
As shown in Figure 7, the maximum and minimum ranges computed for Lodwar and Moyale showed good results, as they are able to contain from 80 to 95% of the historical observations. The percentage of values not falling within the limits of the ranges refers to those extreme climatic events that are characterized by a heavy-tailed distribution and differ significantly from the average [39].

Calculation of Seasonal/Monthly Temperature Ranges for the Reference Points
The monthly/seasonal T MAX and T MIN ranges were calculated for the eight reference points, as shown in Tables 3 and 4. A visual representation of the maximum and minimum seasonal temperature values at a local scale is given by the isotherm maps presented in Figures 8 and 9. The annual T MAX and T MIN distributions for each reference point based on monthly ORH temperature timeseries are shown in Figure 10.

Result Validation for North Horr
The lack of land-based meteorological stations in the territory may represent a limitation for the reliability of the results of this study. This issue is overcome through the observations recorded by the North Horr land-based automatic weather station, which provide local evidence of the adequacy of the results. Despite the recent installation, the observations recorded by the North Horr land-based automatic weather station represent a unique opportunity for the validation of our methodology at a local scale. The observations that were collected and tested refer to the period between March 2019 and June 2020. In particular, since the station measures the maximum and minimum temperature on a daily basis, the daily data were averaged on a monthly basis and then compared with the temperature intervals, as shown in Figure 11. Even though the number of observations is small, the visualization of temperature ranges confirms that the majority of the records are contained within the temperature boundaries. However, in May, both minimum and maximum temperature observations were above the upper limit of the climatological range. The upper limit was likewise exceeded in March, April, May and July (T MAX) and in May and June (T MIN). Moreover, in October, monthly average temperature values were below the lower limit, both for T MIN and T MAX, and in February and April, observations fell below the lower limit of the T MAX and T MIN ranges, respectively. Overall, the climatological temperature ranges for both T MAX and T MIN were able to contain nearly 70% of the observations.

Conclusions
The availability of reliable historical climatic datasets is key to understanding and predicting extreme weather events and consequently reducing the vulnerability in African countries, especially in ASAL regions. Similar to many other East African countries, Kenya lacks historical land-based meteorological observations. This study identifies which maximum and minimum temperature datasets are able to fill the data gap in North Horr Sub-County. Four different climatic products are chosen and validated against historical T MAX and T MIN observations recorded by the land-based meteorological stations of Lodwar, Marsabit and Moyale. The comparison between ERA, HAD, KMD and ORH datasets highlights that ORH T MAX and T MIN datasets are able to better represent local temperatures and to successfully describe the seasonal patterns.
To strengthen the response systems against extreme weather events, temperature reference values and climatological temperature ranges were computed using ORH data for the eight main villages located in the study area (Balesa, Dukana, El-Gade, El-Hadi, Gas, Kalacha, Malabot and North Horr). Temperature reference values were calculated by averaging monthly and seasonal T MAX and T MIN data derived from the ORH reanalysis, and climatological ranges were calculated with an amplitude of 2 degrees centered on the monthly temperature reference values. Monthly temperature ranges were compared against (i) the observations recorded by the Lodwar and Moyale land-based meteorological stations and then against (ii) the North Horr land-based automatic weather station observations recorded from March 2019 to June 2020. Lodwar and Moyale temperature ranges contain the majority of the records (80-95%), demonstrating the ability to adapt to the local context and to represent temperatures. By contrast, despite the small number of observations, the temperature ranges computed for North Horr contain roughly 70% of both T MAX and T MIN values recorded from March 2019 to June 2020. The high performances obtained by the temperature ranges in the validation process confirm the adequacy of this methodology's application at a regional level.
High-resolution temperature data are urgently needed to understand climatic trends in East African countries. This study provides a solution to the scarcity of observed data in North Horr Sub-County, by identifying monthly maximum and minimum temperature ranges. Anomalous temperature values can be detected through the comparison between current observations and the temperature ranges, strengthening the local population's ability to cope with forthcoming extreme events.
Future research should test and validate the methodology proposed here in other locations. Moreover, the same methodology should be retested after more observations from the North Horr land-based automatic weather station are provided.

Acknowledgments: This study was conducted within the framework of the International Cooperation Project "ONE HEALTH: Multidisciplinary approach to promote the health and resilience of shepherds' communities in North Kenya" funded by the Italian Agency for Development Cooperation (AICS). The authors would like to thank the project coordinator (CCM) and project partners (TRIM and VSF-Germany) and the Kenyan Meteorological Department.

Conflicts of Interest: The authors declare no conflict of interest.