A Method Comparison Study between Open Source and Industrial Weather Stations

Abstract: Open-source devices are widespread and have been accessible to everyone over the past decade. The low cost of such devices has spurred the creation of instruments for various applications, such as smart farming, environmental monitoring, animal behavior monitoring and human health monitoring. This research uses statistical methods for assessing agreement and similarity to compare an open-source weather station, constructed and programmed from scratch, with an industrial weather station. The experiment took place in the experimental greenhouses of the University of Thessaly, Velestino, Greece, for 7 consecutive days. The topology of the experiment consisted of 30 open-source weather stations and three industrial ones, forming three clusters with a ratio of 10 open-source stations to 1 industrial station. The results revealed low to high agreement across the measurement range, with high variability, possibly due to factors that were not considered in the statistical model.


Introduction
The open-source philosophy [1] is applied in numerous areas of our society. The need for inexpensive and trustworthy devices in scientific and industrial applications has fostered the open-source hardware and software community [2]. The reliability of such devices is often questioned, since non-specialized individuals can produce devices that are expected to meet certain specifications in theory but fail to do so in the field. Thus, a formal method should be applied whose outcomes serve as proof of a device's effectiveness. Using appropriate statistical methods, a new device can be validated and its interchangeability with already established devices assessed. Statistical methods for assessing agreement and similarity are used extensively in medicine, applied chemistry [3,4] and other fields to compare approved and widely used devices or methods with experimental ones that might be less expensive, less intrusive, or both [5]. Applying such methods in agriculture is vital, since a great number of devices for environmental monitoring have emerged in recent years [6,7]. According to Altman et al. [8], various comparison methods in common use are inappropriate and yield misleading results. We constructed an inexpensive and user-friendly device for environmental monitoring and assessed its efficiency in field conditions. We then adapted existing statistical methods for assessing agreement and similarity between our device and existing certified industrial ones, producing indices and graphs that individuals can interpret easily without advanced knowledge of statistics.

Open-Source Weather Station
The open-source weather station constructed for this experiment uses a microcontroller and the components listed in Table S1 (Supplementary Materials). It takes an instantaneous air temperature reading every five minutes. The device can store data over WiFi on a private server, Adafruit IO [9], provided by the Adafruit company. Figures S1 and S2 (Supplementary Materials) display the electronic parts and the final product, respectively. The cost of the device, around EUR 54 (2020 market prices), is broken down in Table S2. We programmed the microcontroller using the Arduino IDE [10]; the libraries used are listed in Table S3 (Supplementary Materials). The flowchart in Figure S3 (Supplementary Materials) presents the sequence of the weather station's functions.
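The read-measure-upload cycle described above can be sketched as follows. This is an illustrative Python sketch, not the actual firmware (which is written in the Arduino IDE in C++); the names `read_temperature_c`, `build_payload` and `run_once` are assumptions made for the example, and the sensor driver call is mocked:

```python
UPLOAD_INTERVAL_S = 5 * 60  # one instantaneous reading every five minutes

def read_temperature_c():
    # Placeholder for the actual sensor driver call (illustrative only).
    return 24.7

def build_payload(value):
    # Adafruit IO feeds accept a simple {"value": ...} JSON body.
    return {"value": round(value, 2)}

def run_once():
    """One cycle of the station's loop: read, package, (would) upload."""
    temp = read_temperature_c()
    payload = build_payload(temp)
    # In the real firmware this payload is sent to the Adafruit IO feed
    # over WiFi every UPLOAD_INTERVAL_S seconds; here we just return it.
    return payload
```

In the real device, the loop sleeps for `UPLOAD_INTERVAL_S` between cycles and transmits the payload over WiFi; the flowchart of Figure S3 covers the full sequence, including connection handling.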

Industrial Weather Station
The industrial weather station is a Thygro SDI-1 (Figure S4, Supplementary Materials) [11], equipped with air temperature, humidity and solar radiation sensors, solar panels and a radiation shield. The temperature accuracy is ±0.2 °C, and the resolution is ±0.015 °C. The total cost is EUR 3000, while the service cost for remote monitoring is additional.

Experimental Design
The experiment took place in the experimental greenhouses of the University of Thessaly, Greece. The topology consists of three industrial weather stations and 30 open-source ones, with ten open-source stations surrounding each industrial station, as shown in Figure S5. This configuration reduces the cost of the experiment substantially. The cultivated crop was tomato; one of the industrial weather stations was located in chamber 1 and the other two in chamber 2. The duration of the experiment was 7 days.

Method Comparison Strategy
A method comparison study sets two important goals, which we adopted in our research. According to Choudhary [12], the primary goal is to quantify the extent of agreement between two measurement methods and to determine whether it is sufficient for the methods to be used 'interchangeably'. The secondary goal is to compare important characteristics of the measurement methods, such as biases, precisions and sensitivities, to find the sources of their disagreement. This latter comparison constitutes the evaluation of similarity.
In our setting, the new method is the open-source weather station and the reference method is the industrial station. Indices and graphical illustrations help the researcher understand and interpret the results without advanced knowledge of statistics.
The formal method comparison technique is the assessment of agreement and similarity using ad hoc statistical analysis [3,4]. Common practices for comparing two methods or devices can be misleading or inappropriate; a short exposition is given in Table S4 (Supplementary Materials).
There are many indices and graphs for interpreting agreement between two methods or devices [13]. Each index has advantages and disadvantages, and a combination can be used to understand the results better. The same applies to the graphical representations.
Choudhary et al. [12] proposed bivariate normal and mixed-effects models whose extracted parameters yield useful graphs and indices: agreement indices such as the Concordance Correlation Coefficient (CCC) and the Total Deviation Index (TDI), and similarity indices such as the precision ratio, fixed bias and sensitivity. We followed this approach; the fixed bias we report is the mean difference of the paired measurements of the two devices. The model treats method as a fixed factor with two levels, the industrial station as method 1 and the open-source station as method 2, while each pair of stations, which is the subject in our case, is a random factor. The model we adopted and the index formulas can be found in Table S5 (Supplementary Materials).
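As a concrete illustration of the indices named above, their simple sample versions can be computed directly from paired readings. This is a minimal sketch and not the mixed-model estimators of Table S5; in particular, the TDI here is taken as the empirical p-quantile of the absolute differences rather than the model-based quantity:

```python
from statistics import mean, pvariance

def fixed_bias(x, y):
    # Mean of the paired differences (open-source minus industrial).
    return mean(b - a for a, b in zip(x, y))

def ccc(x, y):
    # Lin's concordance correlation coefficient:
    # 2*cov(x, y) / (var_x + var_y + (mean_x - mean_y)^2)
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return 2 * cov / (pvariance(x) + pvariance(y) + (mx - my) ** 2)

def empirical_tdi(x, y, p=0.90):
    # p-quantile of |differences|: a bound that covers p*100% of the
    # absolute disagreements between the two stations.
    d = sorted(abs(b - a) for a, b in zip(x, y))
    k = max(0, min(len(d) - 1, int(round(p * len(d))) - 1))
    return d[k]
```

For identical series CCC equals 1 and the fixed bias is 0; a constant offset between the two stations lowers the CCC and appears directly as the fixed bias.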
Our data are longitudinal, since readings are produced at 5-minute intervals. However, to avoid model complexity given the large number of instances, we modeled each occasion (5-minute interval) separately for every day.
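The per-occasion treatment amounts to grouping the paired readings by their 5-minute slot before fitting a separate model per occasion. A minimal sketch, with illustrative record fields (`minute_of_day`, `pair_id` and the two readings are assumptions for the example):

```python
from collections import defaultdict

def occasion_index(minute_of_day):
    # Map a timestamp (minutes since midnight) to its 5-minute occasion,
    # giving occasions 0..287 within a day.
    return minute_of_day // 5

def group_by_occasion(records):
    """records: iterable of (minute_of_day, pair_id, industrial_c, open_source_c).
    Returns {occasion: [(pair_id, industrial_c, open_source_c), ...]},
    ready to be modeled occasion by occasion."""
    groups = defaultdict(list)
    for minute, pair_id, ref, new in records:
        groups[occasion_index(minute)].append((pair_id, ref, new))
    return dict(groups)
```

Each resulting group holds one reading per station pair for that occasion, so the per-occasion model sees the pairs as its subjects.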

Visual Representation of the Results
The graphs produced for the visual representation of the data are CCC vs. Occasion, TDI vs. Occasion and Fixed Bias vs. Occasion. Every graph shares a common x-axis corresponding to the occasions per day, and the points are color-coded for every pair according to pre-defined temperature categories. Only Day 1 is displayed below; the rest can be found in the Supplementary Materials (Figures S6-S11).
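The color coding by temperature category can be sketched as a simple binning step. The cut points used here (23, 27 and 29 °C) are inferred from the breakpoints reported in the Day 1 results; the actual category bounds used in the figures are an assumption of this sketch:

```python
# Cut points inferred from the reported breakpoints (23, 27, 29 °C);
# the true category definitions are an assumption for illustration.
CUTS = [23.0, 27.0, 29.0]
LABELS = ["<23 °C", "23-27 °C", "27-29 °C", ">29 °C"]

def temperature_category(temp_c):
    # Return the label of the first bin whose upper cut exceeds temp_c.
    for cut, label in zip(CUTS, LABELS):
        if temp_c < cut:
            return label
    return LABELS[-1]
```

Each plotted point is then colored by `temperature_category` of the industrial station's reading at that occasion.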
For Day 1, the CCC vs. Occasions plot (Figure 1a) reveals low to high agreement over the whole temperature range, owing to the high variance of the index; the index is not consistent across the temperature categories. The TDI vs. Occasions plot (Figure 1b) estimated that 90% of the absolute differences lie between 2.3 °C and 4.7 °C for temperatures below 27 °C, and between 1.25 °C and around 6 °C above 27 °C. Temperatures above 29 °C show higher variance than the rest. The Fixed Bias vs. Occasions plot (Figure 1c) estimates a fixed bias of −2.1 °C to 0.4 °C for temperatures between 23 °C and 27 °C, −2.2 °C to around 0.75 °C for temperatures between 27 °C and 29 °C, and around −1.25 °C to 2.25 °C for the remaining categories. The trend and the corresponding standard errors were estimated using a LOESS smoother. Overall, from occasions 91 to 181, the variability appears higher each day, probably due to the difference in air humidity shown in Figure S11. Humidity is thus a possible factor that could explain the high variability of the indices.

Discussion
Our comparison study uses statistical methods to assess agreement and similarity for open-source devices. We combine three useful indices and interpret them as complements to one another. The analysis of the data revealed low to high agreement, with increased variability at higher temperatures. The open-source weather station's performance might be acceptable for specific applications, and the low production cost might encourage researchers to build such devices. After examining our preliminary data, we conclude that more research is needed to reduce the variance of the indices across temperatures. Specifically, adding other factors, such as air humidity, vegetation density or the location of each pair in the greenhouse, should improve the model, since there may be many confounding variables. Finally, pooling data from the whole duration of the experiment, rather than analyzing each day separately, will reveal trends and help uncover hidden factors that affect the agreement.