Mean Daily Temperature Estimations and the Impact on Climatological Applications †

Abstract: Climate variation in temperature describes the fluctuations that occur above or below the average temperature over time. This study analyzes various approaches for calculating the average temperature in order to better capture and represent the true average temperature over the course of a day. As the conventional approach often fails to integrate significant behaviors of temperature variations, nine mathematical formulas are employed with different numbers of daily observations. Specifically, this study introduces the geometric and harmonic means and their variations for the estimation of the average temperature. Furthermore, the application of the formulas to the time series of temperature recordings for three station locations in Greece is analyzed. For the evaluation of the results, six widely used statistical indices for quantifying errors are applied, and their relative effects are also discussed. Finally, as a practical application of the impact of the formula used for the averaging of daily temperature, the various approaches are applied to the same observation dataset for the estimation of the heating degree-days, an index that can have significant climatological and economic implications over a region.


Introduction
Numerous studies have been undertaken over the years to determine the most effective approach for computing the average daily air temperature. Temperature is the most significant weather parameter and the fundamental parameter in all climate classifications. The average daily temperature plays a vital role not only in determining climatological trends but also in the modeling and simulation of diverse economic applications, such as heating subsidies, as well as the design and operation of energy-efficient architectural structures.
If the daily temperature variability follows a harmonic pattern with a period matching the daily cycle, the min-max estimation method is considered to be rigorously valid. In such cases, the estimation accurately captures the temperature dynamics. However, when anharmonicity is present, indicating additional temperature anomalies beyond the typical daily cycle, deviations from the min-max estimation can be expected [1]. The daily mean temperature is conventionally determined by calculating the mean of two variables: the maximum and the minimum temperature observed over a 24-h period, typically spanning from midnight to midnight. This approach to temperature averaging has served as a reliable means to reflect both the transient weather conditions in the short term and the enduring climatic patterns in the long term within a specific geographical area. Moreover, it constitutes a fundamental component of the 30-year climate "normal", as advised by the World Meteorological Organization (WMO) [2].
In this study, various mathematical methods for calculating the average daily air temperature are compared using statistical tools. The aim is to find the optimal method among those commonly used in Meteorological Services and to introduce, examine, and recommend the use of additional methods.

Meteorological Data
In the analysis performed, measurements from three meteorological stations in the Attica region of Greece with different geophysical characteristics were used in order to obtain coverage of the climatological peculiarities of the broader area. The meteorological data used in this study were obtained from the archives of the Hellenic National Meteorological Service (HNMS). The selected meteorological stations are the Hellinikon, Elefsina, and Tatoi stations. These stations are among the most significant within Attica, fulfilling the operational criteria of the World Meteorological Organization [3]. They are adequately staffed, and observations are collected throughout the day as they are situated within military units of the Hellenic Air Force. The choice of these stations ensures the reliability of the observations and the potential avoidance of data gaps. The geographic coordinates and elevation of the stations are provided in Table 1, while for the analysis presented in this paper, data are provided for the three-year period 1995-1997.

Methodology
There are several methods for calculating the average daily temperature, according to the World Meteorological Organization [3]. These include methods that use the daily maximum and minimum, 24-h observations, synoptic observations (every three hours), and observations at specific predetermined hours during the day. The following section focuses on finding the most representative average daily temperature given the restrictions that exist on the number of available observations, as not all meteorological stations can provide long records of hourly temperature values.
In Greece, the National Meteorological Service operates according to World Meteorological Organization guidance and uses arithmetic means, which are defined and calculated as the sum of the observations divided by their count. The higher the frequency of observations, the more accurate the mean [3].
In these formulas, $T_{max}$ and $T_{min}$ are the maximum and minimum temperatures, and $T_i$ represents temperature measurements at specific hours during the day.
In addition to the arithmetic mean, there is the geometric mean, which is defined only for positive numbers ($X > 0$). The mathematical formula is

$$g = \sqrt[T]{\prod_{t=1}^{T} X_t}$$

while for this study the two-, four-, and eight-observation formulations were used, the four-observation form being

$$\text{GEO-4} = \sqrt[4]{T_6 \cdot T_{12} \cdot T_{18} \cdot T_{18}} \qquad (6)$$

Continuing with the mathematical approaches of means or averages, the harmonic mean is defined for numbers other than zero ($X \neq 0$), as

$$h = \frac{T}{\sum_{t=1}^{T} \frac{1}{X_t}}$$

while for this study the corresponding formulations for two, four, and eight observations were used.

The data and measurements processed in this study consist of 2 m temperature observations taken every three hours on a daily basis for the 3-year period, as for most stations and periods this was the interval at which observations were available. The objective is to search for the method of mean temperature calculation that best represents the daily variability, which is statistically evaluated by the mean deviation (error). For each day of the analyzed period, there are eight different temperature measurements, as well as the maximum and minimum temperatures derived from the daily SYNOP reports.
One issue addressed during the processing of the data as input to the formulas was the presence of zero and negative temperature observations. The geometric mean does not accommodate negative values because it involves roots. Similarly, the harmonic mean cannot accept observations equal to zero since each observation is in the denominator of a fraction. To resolve this issue and ensure the reliability of the study, it was decided to convert the temperature from Celsius (°C) to Fahrenheit (°F). With respect to methods for averaging, it was decided to use Formulas (2)-(10). For each mean (arithmetic, geometric, and harmonic), the average daily temperature was computed using the different combinations of observations.
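As an illustration of the conversion and averaging steps described above, the following minimal Python sketch computes the three types of mean from one day of eight observations. The readings are invented for illustration and are not taken from the HNMS archive; the function names are hypothetical, and the sketch uses unweighted eight-observation means only, since the study's exact two- and four-observation formulas are not reproduced here.

```python
import math


def celsius_to_fahrenheit(t_c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return t_c * 9.0 / 5.0 + 32.0


def arithmetic_mean(values):
    """Sum of the observations divided by their count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed via logarithms for stability."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """Count divided by the sum of reciprocals."""
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined for zero values")
    return len(values) / sum(1.0 / v for v in values)


# Hypothetical 3-hourly readings in degrees Celsius (illustrative only).
obs_c = [-1.0, -2.0, 0.0, 4.0, 8.0, 9.0, 5.0, 1.0]

# Zero and negative Celsius values become positive in Fahrenheit here.
obs_f = [celsius_to_fahrenheit(t) for t in obs_c]

arith8 = arithmetic_mean(obs_f)
geo8 = geometric_mean(obs_f)
harm8 = harmonic_mean(obs_f)

print(f"ARITH-8 = {arith8:.2f} F, GEO-8 = {geo8:.2f} F, HARM-8 = {harm8:.2f} F")
```

Note that the Fahrenheit conversion only guarantees positive inputs for temperatures above roughly -17.8 °C, which is why the sketch still validates its inputs.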

Statistical Evaluation
Subsequently, after obtaining the average daily temperature for each day over the three-year period, separately for each station, a comparison was performed among the nine means. Each of the nine average daily temperatures was compared to each observation of the day. In order to achieve this, certain statistical measures of error calculation were utilized. The term "error" is used in a statistical sense, referring to the statistical deviation of each observation from the calculated mean rather than implying "incorrectness." Several statistical indices have been utilized in the meteorological literature for evaluating forecasting processes, and in this analysis, six of the most popular statistical errors are employed, as shown in Table 2 [4].
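The comparison of a daily mean against each observation of the day can be sketched as follows. This is a minimal example under stated assumptions: only the Mean Absolute Error is named in the text available here, so the root-mean-square error is included purely as a familiar second index and may not be one of the six used in the study. The observations are invented, and the daily maximum and minimum are taken from the eight readings for simplicity, whereas the study uses the values from the SYNOP reports.

```python
import math


def mean_absolute_error(observations, estimate):
    """Average absolute deviation of the observations from the daily mean."""
    return sum(abs(o - estimate) for o in observations) / len(observations)


def root_mean_square_error(observations, estimate):
    """Square root of the average squared deviation (assumed extra index)."""
    squared = sum((o - estimate) ** 2 for o in observations)
    return math.sqrt(squared / len(observations))


# Hypothetical 3-hourly readings in degrees Fahrenheit (illustrative only).
obs_f = [30.2, 28.4, 32.0, 39.2, 46.4, 48.2, 41.0, 33.8]

# Two candidate daily means: max/min average and eight-observation average.
arith2 = (max(obs_f) + min(obs_f)) / 2.0
arith8 = sum(obs_f) / len(obs_f)

for name, estimate in (("ARITH-2", arith2), ("ARITH-8", arith8)):
    mae = mean_absolute_error(obs_f, estimate)
    rmse = root_mean_square_error(obs_f, estimate)
    print(f"{name}: mean = {estimate:.2f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```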

Name | Acronym | Formula
Mean Absolute Error | MAE |

The statistical indices were applied to the data from the three stations for each of the nine calculated means on each day of the sample. An example of the derived index values is given in Table 3 for Hellinikon station. Because the six statistical indices represent different properties of the deviations, a ranking was performed based on their relative values for each averaging approach in an effort to identify the averaging method that systematically leads to smaller overall deviations. Specifically, the error measures of each averaging method have been evaluated using the classical method of assigning penalties, where a score of 1 represents the mean with the smallest error and therefore the smallest penalty. Similarly, the mean with a score of 9 has the largest measure and the highest penalty, indicating the largest deviation and the worst statistical outcome. In Figure 1 (left), the ranking of the statistical indices is represented for Hellinikon station as calculated for each averaging method. The bar length signifies the rank value, while the color coding indicates the statistical index used. The same analysis was performed for Elefsina and Tatoi stations, and the trends of the results were almost identical, at least with respect to the averaging methods that rank best or worst. Based on this classification, it is evident that the means with eight observations (ARITH-8, GEO-8, and HARM-8) yield the lowest ranking (smaller errors) compared to the methods that are based on fewer observations. Statistically and climatologically, this is expected since a larger sample of observations optimizes the mean.
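The penalty ranking described above can be sketched in a few lines of Python. The error values below are made up for three methods and two indices and are not results from the study. Summing the per-index ranks into one score is an assumption about how an overall ranking might be formed, and ties are broken simply by input order.

```python
def rank_methods(errors):
    """Map each method to a penalty rank, where 1 is the smallest error.

    `errors` is a dict of method name -> error value for one index.
    Ties are broken by input order (a simplifying assumption).
    """
    ordered = sorted(errors, key=errors.get)
    return {method: position + 1 for position, method in enumerate(ordered)}


def overall_score(index_tables):
    """Sum the penalty ranks of each method over all statistical indices.

    `index_tables` is a dict of index name -> {method name -> error value}.
    """
    totals = {}
    for errors in index_tables.values():
        for method, rank in rank_methods(errors).items():
            totals[method] = totals.get(method, 0) + rank
    return totals


# Hypothetical error values (illustrative only, not the study's results).
index_tables = {
    "MAE": {"ARITH-2": 6.4, "ARITH-8": 6.3, "GEO-8": 6.2},
    "RMSE": {"ARITH-2": 7.3, "ARITH-8": 7.1, "GEO-8": 7.2},
}

scores = overall_score(index_tables)
for method in sorted(scores, key=scores.get):
    print(f"{method}: total penalty = {scores[method]}")
```

With nine methods and six indices, the same functions would produce per-index ranks from 1 to 9 and a summed score per method.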
Moreover, it is shown that the geometric mean of the eight observations is the best overall method, with minimal difference from the harmonic mean of the eight and a clear superiority compared to the arithmetic mean based on the same number of observations. In an effort to summarize the information extracted in the analysis, an overall ranking was also calculated based on all six indices (values ranging from 1 × 9 to 6 × 9) for each station, and the results are presented in Figure 1 (right). Based on this, all station analysis data seem to support the same outcome: that the geometric mean of eight performs slightly better (even more so for the Hellinikon station), while the harmonic mean of eight also follows the same ranking, especially for the Elefsina and Tatoi stations. These conclusions confirm the typical rule that applies to the three means: the harmonic mean is always smaller than or equal to the geometric mean, and the geometric mean is always smaller than or equal to the arithmetic mean, i.e., HARM ≤ GEO ≤ ARITH.
Moreover, it is evident from Figure 1 (right) that the arithmetic means of two and four observations are associated with larger errors. Specifically, the ARITH-4 mean exhibits the highest error ranking for all three stations. Similar results are obtained for the most commonly used ARITH-2, which is based on the maximum and minimum values and ranks second to last in performance. The difference in deviations between these two arithmetic means (based on 2 and 4 observations) and their corresponding geometric and harmonic means is apparent, with the harmonic mean exhibiting superior performance among the four-observation means and the geometric mean among the two-observation means.
The means of eight observations are clearly superior from a statistical perspective, with the geometric and harmonic means outperforming the arithmetic mean. In the case of four observations, the preferred mean is the harmonic mean, followed by the geometric mean. Based on the results, it is evident that the arithmetic mean of four observations has the highest measure of error and is therefore not recommended. In the case of the most widely used method that is based on the maximum and minimum values, the results favor the geometric mean with a significant advantage in almost all statistical error measures compared to the other two means.
In conclusion, in order to better capture the daily temperature variability, the calculation of the mean daily temperature should preferably be based on as many observations as possible and not solely on the maximum and minimum values. If, however, only these two observations are available, the geometric mean is recommended. Furthermore, in the case that a limited number of observation reports are available per day based on specific time intervals, the harmonic mean is suggested.

Applications
As mentioned earlier, there are several applications for the mean daily surface temperature. The conventional approach of temperature averaging is not only utilized to calculate the daily mean temperature but also serves as the basis for deriving various temperature-based climate indices. These indices include heating degree days, cooling degree days, etc., which hold significant importance in assessing residential heating and cooling requirements as well as agricultural activities. By capturing the spatiotemporal variations of these indices, valuable insights can be obtained regarding the temporal patterns and geographic distribution of heating and cooling needs and the suitability of different regions for agricultural practices [5].
The impact of the averaging method on heating degree days is briefly demonstrated in this section. According to the American Society of Heating, Refrigerating, and Air-Conditioning Engineers (ASHRAE) [6] method, the degree days are defined as the difference between the mean daily temperature ($T_{mean}$) and the reference temperature ($T_b$). Therefore, the equation for heating degree days (HDD) is as follows:

$$\mathrm{HDD} = \sum_{\text{days}} \left(T_b - T_{mean}\right)^{+} \qquad (11)$$

where the superscript + indicates that only positive differences are counted. $T_{mean}$ is obtained from the commonly used method through maximum and minimum daily temperatures (ARITH-2), while the reference temperature is set to 15.5 °C. The reference temperature has been established in European Union countries, as recommended by the United Kingdom's Meteorological Service. This specific value is suitable for regions located at moderate latitudes and is followed by the European Environment Agency. In this application, HDD is additionally calculated with the daily mean temperature as derived through Formulas (3)-(10). The objective is to quantify how the use of a different temperature averaging method can affect the amount of HDD and, consequently, the heating allowance that could be attributed to the specific area.

For Elefsina station, the annual HDD as calculated by (11) is given in Table 4. The arithmetic means of two and four observations yield the lowest degree-day values, which consequently lead to a smaller economic impact on heating allowances. However, from a statistical perspective, these means exhibit poorer statistical error measures as they do not adequately capture the temperature fluctuations within a day. Such an important application with socioeconomic ramifications should be based on the most representative and reliable climatological input.
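To illustrate how the choice of daily mean feeds into the degree-day total, the following minimal sketch accumulates heating degree days against the 15.5 °C reference temperature. It assumes the common convention that only days whose mean temperature lies below the reference contribute. The daily means are invented values in degrees Celsius, and the function name is hypothetical. Means computed in Fahrenheit would first need to be converted back to Celsius, or the reference converted to Fahrenheit.

```python
def heating_degree_days(daily_means_c, base_c=15.5):
    """Sum the positive differences between the base and the daily means.

    Days at or above the base temperature contribute zero.
    """
    return sum(max(0.0, base_c - t_mean) for t_mean in daily_means_c)


# Hypothetical daily mean temperatures in degrees Celsius (illustrative only).
means_a = [10.0, 14.5, 16.0, 15.5, 8.0]

# A second averaging method that yields slightly lower daily means,
# as a harmonic or geometric mean would relative to the arithmetic mean.
means_b = [t - 0.3 for t in means_a]

hdd_a = heating_degree_days(means_a)
hdd_b = heating_degree_days(means_b)

print(f"HDD (method A) = {hdd_a:.1f}, HDD (method B) = {hdd_b:.1f}")
```

Because the sum only grows as the daily mean falls, an averaging method that gives systematically lower means can only increase the degree-day total, which is consistent with the arithmetic means yielding the lowest values.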

Recommendations
A comparison was made among nine different mathematical methods for calculating the average daily surface temperature. Daily observations were collected for the period between 1995 and 1997 from three stations in Attica, Greece. The results of this comparison are consistent with the recommendations of the WMO regarding the optimal number and use of available observations for the calculation of the temperature average. Specifically, a clear difference is observed between the means calculated using eight observations versus those using four and two observations, thus confirming the rule that means based on an adequate number of observations yield smaller statistical errors. Among the nine methods, the geometric mean of the eight observations had the smallest deviations within the day, resulting in smaller errors and penalties. The harmonic mean and arithmetic mean of the eight observations followed in the ranking. Considering the method with only four observations, the harmonic mean produced the smallest errors, followed by the geometric mean. In conclusion, regardless of the number of observations, the arithmetic mean is associated with higher errors and yields larger results in most statistical measures. Therefore, from a mathematical, climatological, and economic perspective, it is advisable and recommended to further study and introduce the geometric and harmonic means for calculating the average daily temperature.

Figure 1. (Left): performance rank of the nine averaging 2 m temperature methods based on different statistical indices for Hellinikon station. (Right): summary statistical ranking for all stations based on the nine averaging methods.


Table 3. Statistical indices calculated for Hellinikon station.

Table 4. Annual heating degree days calculated for Elefsina station.