Assessment and Correction of Solar Radiation Measurements with Simple Neural Networks

Abstract: Solar radiation received at the Earth’s surface provides the energy driving all micro-meteorological phenomena. Local solar radiation measurements are used to estimate energy-mediated processes such as evapotranspiration (ET); this information is important in managing natural resources. However, the technical requirements for reliably measuring solar radiation limit more extensive adoption of data-driven management. High-quality radiation sensors are expensive, delicate, and require skill to maintain. In contrast, low-cost sensors are widely available, but may lack long-term reliability and intra-sensor repeatability. As weather stations measure solar radiation and other parameters simultaneously, machine learning can be used to integrate various types of environmental data, identify periods of erroneous measurements, and estimate corrected values. We demonstrate two case studies in which we use neural networks (NN) to augment direct radiation measurements with data from co-located sensors, and generate radiation estimates with comparable accuracy to the data typically available from agro-meteorology networks. NN models that incorporated radiometer data reproduced measured radiation with an R² of 0.9–0.98 and RMSE less than 100 W m⁻², while models using only weather parameters obtained R² less than 0.75 and RMSE greater than 140 W m⁻². These cases show that a simple NN implementation can complement standard procedures for estimating solar radiation, create opportunities to measure radiation at low cost, and foster adoption of data-driven management.


Introduction
Reliable measurements of solar radiation are critical to accurately estimate the energy available for all other meteorological processes. As a result of water's heat capacity and the latent heat of evaporation, a large portion of solar energy received at the surface is abstracted through evaporation from soil and water surfaces, and through transpiration from vegetation. Correspondingly, the flux of water from terrestrial surfaces is an important hydrologic linkage at local, regional, and global scales [1,2]. Energy-based methods to calculate evapotranspiration (ET) are central to regional and global water management, particularly for agriculture [3,4]. Downwelling shortwave radiation measurements from regional agro-meteorology networks are used to calculate reference ET and, combined with other weather data, are used to determine crop water requirements via energy-based methods such as the Penman-Monteith equation [5][6][7]. Satellite images of surface temperatures and surface brightness are used to map spatial variability in ET and determine basin-scale water balances, and require corroborating surface measurements for calibration [8][9][10]. Instantaneous and time-averaged measurements of net radiation (shortwave and longwave) are also used in calculating closure of the surface energy budget (SEB). The SEB is calculated from measurements of sensible heat flux, latent heat flux (evapotranspiration), and changes in energy storage (primarily in soil), with the sum of these terms equated with the measured net radiation. A closed SEB lends confidence to direct measurements of surface fluxes by methods such as eddy-covariance, with an expectation that the SEB residual does not exceed 10-20% of the energy provided by net radiation [11][12][13]. Solar radiation models are also important boundary conditions that link climate and hydrologic models [14,15].
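The closure check described above reduces to simple arithmetic on the four flux terms. The following is a minimal sketch, assuming the common sign convention in which net radiation is balanced by the sum of sensible heat, latent heat, and storage (soil) heat flux; the function name and the example values are hypothetical and are not taken from the experiments.

```python
def seb_residual_fraction(net_radiation, sensible, latent, storage):
    """Return the SEB residual as a fraction of net radiation.

    All four terms must share the same units (e.g. W m^-2) and the same
    averaging period. Sign convention (an assumption of this sketch):
    net_radiation = sensible + latent + storage for perfect closure.
    """
    residual = net_radiation - (sensible + latent + storage)
    return residual / net_radiation


# Hypothetical midday values in W m^-2
fraction = seb_residual_fraction(500.0, 150.0, 250.0, 50.0)
print(f"SEB residual: {fraction:.0%} of net radiation")
```

A residual fraction within roughly 0.1 to 0.2 would fall inside the closure range cited above.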
Despite the critical role of solar radiation in many environmental processes, direct measurements of solar radiation are sparse, and the data that are available sometimes lack adequate quality assurance. As a result, there is an abundance of empirical models that are used to approximate actual solar radiation, and global calculations which allow end users to assess the reliability of the resulting information. Typically, these empirical calculations begin with the calculation of exoatmospheric radiation, i.e., radiation received at the top of the atmosphere, based on latitude, day of year, and the solar constant [3,16]. To estimate the actual irradiance at the surface, the global value is adjusted downward, principally to account for cloudiness. Empirical studies have been conducted to correlate atmospheric transmissivity to more easily measured environmental parameters such as daily average temperature [17][18][19] or weather conditions [20], or to regional parameters that can be estimated from synoptic scale weather, such as aerosol and precipitable water concentration [21]. Most of these approximations are valid over daily or longer timescales only. For meteorology networks that report reference ET for agriculture, measured downwelling shortwave radiation is typically used to estimate the actual irradiation. The shortwave irradiation is used in conjunction with a terrestrial longwave emission model to approximate the net radiation over shorter (hourly) time steps [16,22].
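As an illustration of the exoatmospheric calculation described above, the following sketch uses the daily formulation published in FAO-56. Whether the cited references use exactly these coefficients is an assumption, and the function names are hypothetical. The clear-sky estimate uses the simple elevation-based FAO-56 approximation, which is only one of several available options.

```python
import math

GSC = 0.0820  # solar constant, MJ m^-2 min^-1 (FAO-56 value)


def extraterrestrial_radiation(latitude_deg, day_of_year):
    """Daily exoatmospheric radiation Ra in MJ m^-2 day^-1 (FAO-56 daily form)."""
    phi = math.radians(latitude_deg)
    angle = 2.0 * math.pi * day_of_year / 365.0
    dr = 1.0 + 0.033 * math.cos(angle)        # inverse relative Earth-Sun distance
    delta = 0.409 * math.sin(angle - 1.39)    # solar declination, rad
    # Clamp so that polar day/night does not raise a math domain error
    cos_ws = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    ws = math.acos(cos_ws)                    # sunset hour angle, rad
    return (24.0 * 60.0 / math.pi) * GSC * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )


def clear_sky_radiation(ra, elevation_m):
    """Clear-sky shortwave radiation Rso from Ra and station elevation (m)."""
    return (0.75 + 2.0e-5 * elevation_m) * ra


ra = extraterrestrial_radiation(-20.0, 246)   # 20 deg S, 3 September
print(round(ra, 1), round(clear_sky_radiation(ra, 0.0), 1))
```

Measured daily irradiation above the clear-sky value is a useful flag for sensor or calibration problems.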
Increasingly, machine learning (ML) plays a role in prediction and modelling of environmental processes at global and regional scales. As early as twenty years ago, there was widespread interest in using various ML methods to tackle complex problems in hydrology and atmospheric science [23,24]; an ASCE special committee convened to survey hydrologic applications of neural networks [25,26]; and neural networks were compared to stochastic methods for predicting radiation [27]. Subsequent work that was specific to environmental applications has focused on input determination and calculation methods [28][29][30][31], data assimilation, down- or upscaling [32][33][34], and prediction of specific hydrologic and atmospheric processes such as precipitation and precipitable water [35,36], groundwater flow [37], evapotranspiration [38,39], and plant phenology [40,41]. Recently, an increasing number of methods have been presented which model time-varying processes, including solar irradiation [42][43][44][45]. While far from comprehensive, these represent a substantial effort to develop ML methods that focus on prediction of environmental variables. In contrast, there are few published studies focusing on ML as a tool to evaluate site-specific processes and compare on-site environmental sensor data.
In this context, two season-long experiments were conducted in which solar radiation was measured to help evaluate SEB closure and estimate reference and actual ET. Both of these experiments primarily focused on the eddy-covariance method to determine ET and sensible heat flux over short (15 min) averaging periods, and in both cases, there were complications in measuring actual solar radiation. In the first experiment (referred to below as 2015), the net radiometer was fully operational for only the first half of the six-month experiment duration. One component (the uplooking pyranometer) failed midseason, leaving a critical gap in field measurements. Although downwelling shortwave radiation could be approximated by a number of methods as described above, none of these methods were appropriate over time periods as short as 15 min. The abundance of supplementary information, including the other three component radiometers, suggested the possibility of applying neural networks to estimate the actual, site-specific irradiance for fifteen-minute averaging periods. The primary goal of the second experiment (referred to below as 2017) was to explore the ability of machine learning to estimate evapotranspiration by training a neural network to reproduce high-quality measurements from data obtained with low-cost sensors. As part of this experiment, a simple integrated net radiometer and other sensors were deployed for a three-month growing season, while a four-component radiometer was co-located for two short periods to train the neural network and validate the model output. ML methods were not employed to evaluate time-varying processes or for gap-filling of missing data, nor did they attempt to predict future conditions. Rather, the neural networks implemented pattern-recognition techniques to estimate instantaneous solar radiation received at the surface.
The neural networks were trained by identifying characteristic relationships with other meteorological measurements such as air temperature, or in the case of downwelling shortwave radiation, using measurements of other radiation components (e.g., longwave and upwelling). In both cases, the expected outcome was a trained neural network that could estimate radiation with higher fidelity than other estimation methods over time intervals shorter than one day, and with less need for regional or other empirical calibrations, while using in-situ environmental sensor measurements.

Experiments
Measurements of downwelling visible and photosynthetically active radiation (PAR), net solar radiation (long- and shortwave), and other environmental data from two field experiments conducted in 2015 and 2017 are used in this analysis. These experiments were described previously, although the earlier analyses were principally concerned with eddy-covariance flux measurements and did not examine the radiation data in detail.
For the first experiment, which was part of an effort to calibrate remote sensing measurements of ET, solar radiation and other meteorological sensors were installed with an eddy-covariance flux measurement system for a six-month period from 1 April to 1 October 2015 in an irrigated pasture near Rifle, Colorado, USA. A complete description of this field experiment can be found in a report published by the Upper Colorado River Commission [46]. Instrumentation included a four-component net radiometer (Model NR01, Hukseflux, Delft, The Netherlands), and two thermohygrometers (Model HMP155, Vaisala Oyj, Helsinki, Finland). Sensors were measured at five-second intervals, and average and standard deviations of measurements were recorded at 15 min intervals by a datalogger (Model CR1000, Campbell Scientific, Logan, UT, USA). Data from other sensors were not used in this analysis. During the course of this field experiment, the upward-looking shortwave radiometer (pyranometer) suffered an electronic failure and stopped reporting data (from this component sensor only) starting on 20 June 2015. Table 1 lists the sensors and periods of operation from which data used in this study were obtained. Downwelling solar radiation was also measured at the Rifle, CO Wildland Fire Remote Automated Weather Stations (RAWS) located 13 km NW of the experiment site. The 2015 data from this automated station (Station ID RFEC2) were obtained through the Mesowest data portal (https://mesowest.utah.edu/). The second field experiment was conducted to evaluate real-time irrigation decision support, in which machine learning was used to estimate crop water demand. An eddy-covariance system and supplementary meteorological sensors were installed for 85 days (12 June to 4 September 2017) adjacent to a 120 ha irrigated field near Corvallis, Oregon, USA. 
In this analysis, data were included from a thermopile net radiometer (Model Q-7, Radiation and Energy Balance Systems, Inc., Seattle, WA, USA), a PAR quantum sensor (Apogee Instruments, Inc. Logan, UT, USA), and thermohygrometers (Model HMP60, Vaisala Oyj, Helsinki, Finland). For a 12-day period from 21 June to 3 July, a NR01 four-component radiometer and additional instrumentation were co-located with the other sensors. A complete description of this field experiment was provided in previous publications [47,48]. For a second 12-day period (19-31 August), the NR01 radiometer and other sensors were deployed again at the experimental site as part of a large coordinated experiment on atmospheric turbulence during the Great American Eclipse, as described previously [49]. The sensors and periods of operation from which data were used in this study are listed in Table 2. Experimental and meteorological network data were converted from raw text files and processed in Matlab software version 9.5 (The MathWorks Inc., Natick, MA, USA, 2018). Public data obtained for this analysis were checked for quality control flags, although no data were marked as suspect. For the analyses shown in the following results, no records were automatically removed from the experimental data, to avoid biasing the subsequent analysis. Neural network methods in this analysis used the MathWorks Machine Learning and Deep Learning Toolbox in Matlab. As implemented in the toolbox, neural networks (NNs) are structured programmatically in multiple layers of nodes. In each node, weighting and bias vectors and a transfer function control how input parameters are converted to nodal outputs. Environmental sensor data from the two field experiments, such as air temperature and humidity, are used as input parameters in the NN models. Consistent with the relatively small datasets used in this study, each layer was assigned the same number of nodes (2 to 5) as the number of input variables.
The first node layer receives as input a number of records of all input variables, and each subsequent layer of nodes receives and applies weighting and bias functions in turn until all nodal outputs are combined to return a final model estimate. In the case of supervised learning, each record used in training is also associated with a control estimate of the output parameter to be modelled. For all the following NN models, the modelled output parameter was either downwelling shortwave radiation or spectrally-integrated net radiation. The control estimates used for training the NNs were measured values of the corresponding variable.
Network training proceeds iteratively: the weights and biases of the nodes are adjusted, and the network is tested against a subset of the batched training records. For this study, all instances of training used 90% of the records for the training process (to estimate the output), with the remaining 10% assigned to the validation and testing phases, which assess the generality of the network model and help prevent overfitting [50]. Supervision refers to the comparison of the control estimate to the NN output, and optimization involves minimizing network output error by adjusting the nodes. Here, mean square error (MSE) was the learning statistic used in backpropagation, in which a random subselection of the training data is compared to the corresponding model output. Backpropagation is conducted repeatedly in epochs, with the random assignment of training records re-initialized before each epoch. Training proceeds by adjusting the weighting and bias vectors to minimize the difference between the control radiation measurement and the value predicted by the NN. Training epochs are conducted until any one of several criteria is reached: a minimized MSE, a maximum number of epochs, or an insignificant change or an increase in MSE in subsequent epochs. Each training procedure is randomly initialized, so that two networks trained on identical data can produce slightly different outputs. In data-limited conditions such as on-farm monitoring, we have previously proposed that variability in the model output can approximately describe the robustness and generality of the solution [51]. Simplified measures of model robustness (such as RMSE and R²) are appropriate descriptors of model reliability when the model uses simple input datasets (for example, [39]), as opposed to models which enable multi-linear regression with several independent factors, as demonstrated in [41].
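The training procedure described above can be illustrated with a small NumPy sketch. It is not the Matlab toolbox implementation used in this study: it assumes a single hidden layer with tanh transfer functions, plain full-batch gradient descent, a fixed 90%/10% split, and a simple patience-based stopping rule. All function names, default values, and the synthetic data are hypothetical, and inputs are assumed to be scaled to roughly unit range.

```python
import numpy as np


def predict(params, X):
    """Evaluate the network: tanh hidden layer followed by a linear output."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2


def train_mlp(X, y, n_hidden, epochs=3000, lr=0.1, patience=100, seed=0):
    """Fit the network by backpropagation on MSE, holding out 10% of records."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_train = int(0.9 * len(y))
    tr, va = order[:n_train], order[n_train:]

    # Random initialization: a different seed gives a slightly different model
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, n_hidden)
    b2 = 0.0

    best_mse, best_params, stale = np.inf, None, 0
    for _ in range(epochs):
        H = np.tanh(X[tr] @ W1 + b1)                  # hidden activations
        err = H @ W2 + b2 - y[tr]                     # output minus control value
        g_out = 2.0 * err / n_train                   # dMSE / d(output)
        g_hid = np.outer(g_out, W2) * (1.0 - H ** 2)  # backpropagate through tanh
        W2 = W2 - lr * (H.T @ g_out)
        b2 = b2 - lr * g_out.sum()
        W1 = W1 - lr * (X[tr].T @ g_hid)
        b1 = b1 - lr * g_hid.sum(axis=0)

        # Stop when the held-out MSE no longer improves
        val_mse = np.mean((predict((W1, b1, W2, b2), X[va]) - y[va]) ** 2)
        if val_mse < best_mse - 1e-9:
            best_mse, best_params, stale = val_mse, (W1, b1, W2, b2), 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_params, best_mse


# Synthetic stand-in for scaled sensor inputs and a control measurement
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (400, 2))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1]
params, val_mse = train_mlp(X, y, n_hidden=2)
print(f"held-out MSE: {val_mse:.4f}")
```

Repeating the call with several values of `seed` and comparing the outputs gives a rough picture of the run-to-run variability mentioned above.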
In both field experiments shown here, NNs were used to model solar radiation data. In the case of the 2015 experiment, the objective of modelling radiation was to synthesize the downwelling shortwave irradiation after one component of the net radiometer failed. NN models were trained on data collected over a three-week period from 20 June to 10 July prior to the failure of the uplooking pyranometer. Each model used a different configuration of inputs, including air temperature and humidity, uplooking pyrgeometer, and downlooking pyranometer data, and was trained to reproduce the (downwelling) shortwave irradiation measured by the uplooking pyranometer. The trained NNs then modelled shortwave irradiation for the entire 175 days of input data records, including during the 100 day period when the uplooking pyranometer ceased to operate.
For the 2017 field experiment, a four-component radiometer was co-located with other sensors for a 12-day period only (21 June to 3 July). The objective was to estimate site-specific net radiation with a minimum set of sensors, as accurately as or more accurately than the integrated thermopile radiometer, and to compare these estimates to a simplified net radiation model [7,22,52]. NNs were trained with various input configurations including data from the Q-7 net radiometer, a photosynthetically active radiation (PAR) sensor, air temperature/humidity, and wind speed sensors for all records during the 12-day period. The resulting NNs were then used to estimate net radiation over the entire 85-day experiment. Because the structure of these NNs treats each record as an independent instance and ignores time-dependent features such as ordering, input data for all NNs included the timestamp as a fraction of day. The different arrangements of input variables for each trained neural network are described in Table 3.
Unlike previous demonstrations that used NNs to estimate evapotranspiration [48,51], the control estimates of irradiance (2015) and net radiation (2017) were not measured for the duration of these experiments, so that another basis was needed to evaluate the NN output. For the 2015 experiment, additional solar radiation data obtained from a nearby RAWS weather station provided a local, secondary reference with hourly records. Although the distance between the two stations was only 13 km, the extreme topographic relief meant that the instantaneous records were often decoupled or delayed in time. Therefore, NN outputs were compared to experimental measurements (while the sensor was still operating) and to RAWS data, at both hourly intervals and as summed energy over 24 h periods (MJ·m⁻² received per day), without making any assumption of lagged correlations at other timescales.
Daily data and model outputs were also compared to clear-sky shortwave radiation following Chapter 4 and Appendix H of ASCE Manual 70 [16]. The last test calculated a synthetic surface albedo as the ratio of measured upwelling shortwave radiation to modelled downwelling shortwave radiation, and compared this synthetic albedo to the measured albedo (as determined in the period prior to failure of the net radiometer). The measured and synthetic albedo values were binned by time of day to show the daily development of surface reflectance corresponding to solar angle, angle of incidence, and other conditions affecting reflectance (such as wetness or ground cover).
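The albedo comparison can be sketched as follows. The function name is hypothetical, and the minimum-irradiance threshold used to exclude night-time and very low-sun records is an assumption of this sketch rather than a value taken from the study.

```python
import numpy as np


def hourly_albedo_stats(hour_of_day, sw_up, sw_down, min_down=50.0):
    """Return {hour: (q25, median, q75)} of albedo = upwelling / downwelling.

    min_down (W m^-2) is an assumed threshold that avoids dividing by
    near-zero irradiance at night or at very low sun angles.
    """
    hour_of_day = np.asarray(hour_of_day)
    sw_up = np.asarray(sw_up, dtype=float)
    sw_down = np.asarray(sw_down, dtype=float)

    ok = sw_down > min_down
    albedo = sw_up[ok] / sw_down[ok]
    hours = hour_of_day[ok]

    stats = {}
    for h in np.unique(hours):
        q25, q50, q75 = np.percentile(albedo[hours == h], [25, 50, 75])
        stats[int(h)] = (float(q25), float(q50), float(q75))
    return stats


# Three midday records and one night-time record (hypothetical values)
print(hourly_albedo_stats([12, 12, 12, 0], [20, 25, 30, 0], [100, 100, 100, 0]))
```

For a synthetic albedo, `sw_down` would be the NN-modelled irradiance and `sw_up` the measured upwelling shortwave radiation.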
For the 2017 experiment, a second net radiometer of a different type (integrated thermopile) also collected data for the entire duration of records, indicated throughout this text by the model name Q-7. Data from this second net radiometer were used as an independent basis for comparing against the NN output, which was trained to reproduce net radiation measured by the four-component radiometer. Additionally, the ASCE Manual 70 method was used to calculate the effective net radiation using the shortwave irradiation measured by the PAR sensor. Total daily irradiation estimates produced by each of the NN models and by the ASCE model were compared against that measured by the Q-7 net radiometer.
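Two of the data-preparation steps described in this section are simple enough to sketch directly: encoding the timestamp as a fraction of day, and summing interval-averaged irradiance into daily energy. The function names are hypothetical, and the default interval of 900 s corresponds to the 15 min averaging period of the 2015 records.

```python
from datetime import datetime, timedelta


def fraction_of_day(timestamp):
    """Encode time of day as a value in [0, 1) for use as an NN input."""
    seconds = timestamp.hour * 3600 + timestamp.minute * 60 + timestamp.second
    return seconds / 86400.0


def daily_energy_mj(timestamps, irradiance_wm2, interval_s=900):
    """Sum interval-mean irradiance (W m^-2) into daily totals (MJ m^-2)."""
    totals = {}
    for ts, watts in zip(timestamps, irradiance_wm2):
        day = ts.date()
        totals[day] = totals.get(day, 0.0) + watts * interval_s / 1.0e6
    return totals


# One day of 15 min records at a constant 100 W m^-2 (hypothetical values)
start = datetime(2015, 6, 1)
stamps = [start + timedelta(minutes=15 * i) for i in range(96)]
print(fraction_of_day(datetime(2015, 6, 1, 6, 0)))
print(daily_energy_mj(stamps, [100.0] * 96))
```

Days with missing records would yield low totals under this simple sum, so record counts per day are worth checking before comparing stations.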

Results
Implementing the training phase and subsequent estimation of radiation is straightforward using the interface of the Matlab toolbox. The resulting estimates of shortwave irradiation (for the 2015 experiment) and net radiation (for the 2017 experiment) followed expected diurnal patterns, and reproduced the incident radiation on clear and cloudy days, as well as on days in which sunlight was reduced because of wildfire smoke (such as 23 August, as shown in Figure 1). NN models were even able to reproduce the total solar eclipse observed on 21 August 2017, with varying degrees of fidelity (Figure 1). Beyond this superficial similarity across a broad range of sky conditions, the NN models are also comparable across the range of observed output values, as confirmed by independent measurements. There are differences between the different NN implementations; in the example above, NN3b and NN2b are not able to reproduce transients induced by cloudy conditions (Figure 1, 23 August). The results from this limited study demonstrate that the robustness of any particular NN implementation depends on the specific input parameters used. Previous investigations in which NNs were applied to interpret sensor data have highlighted the sensitivity of trained networks to the representativeness of conditions used for training [48]. However, due to the constraints imposed by the availability of observations, the same training period was used in all NN implementations, so that the robustness of the NN models was not evaluated with respect to the selected training periods.

Neural Network Results for 2015 Experiment
Among the different input configurations used from the 2015 data, only those which included a radiation measurement were able to reproduce the observed downwelling radiation with reasonable fidelity (Table 4). Those models included NN4 (using uplooking pyrgeometer data), NN4b (downlooking pyranometer and uplooking pyrgeometer), NN3 (uplooking pyrgeometer), and NN2b (uplooking pyrgeometer). The performance of the two NN models using only time of day, temperature, and/or relative humidity (NN3c and NN2) was considerably worse (Figure 2d,e).
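Agreement between measured and modelled series is summarized in this study by R², RMSE, and the slope of a linear regression. The sketch below shows one common way to compute these statistics; it assumes an ordinary least-squares fit with an intercept and R² taken as the squared Pearson correlation, which may differ from the exact options used to produce the tables. The function name and the example values are hypothetical.

```python
import numpy as np


def compare_series(observed, modelled):
    """Return (r_squared, rmse, slope) for modelled values against observations."""
    obs = np.asarray(observed, dtype=float)
    mod = np.asarray(modelled, dtype=float)

    # Use only records where both series are present
    ok = np.isfinite(obs) & np.isfinite(mod)
    obs, mod = obs[ok], mod[ok]

    rmse = float(np.sqrt(np.mean((mod - obs) ** 2)))
    slope, _intercept = np.polyfit(obs, mod, 1)   # least squares, with intercept
    r_squared = float(np.corrcoef(obs, mod)[0, 1] ** 2)
    return r_squared, rmse, float(slope)


# Hypothetical irradiance values in W m^-2
obs = np.array([0.0, 100.0, 200.0, 300.0])
print(compare_series(obs, 0.9 * obs + 10.0))
```

Note that a high R² can coexist with a biased slope, which is why the three statistics are reported together.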
In order to evaluate model performance beyond the period when the NR01 net radiometer was operating, the albedo was calculated as the ratio of upwelling to downwelling shortwave radiation, and the median and quartiles of 15 min albedo values were binned by hour of day. The measured albedo was tabulated for the entire period that all four components of the NR01 operated. The synthetic albedo was tabulated for NN output records which did not include downlooking pyranometer data. Figure 3 shows that the NN4 synthetic albedo exhibits behavior more similar to the measured albedo, as compared to the NN2 model. The NN4 model uses uplooking pyrgeometer data, while the NN2 model did not include any radiometer data and exhibits greater scatter in albedo values at all times of the day.
Finally, the NN modelled irradiance time series were used to tabulate the daily energy received at the surface, and these data were compared to the daily energy reported at the nearby RAWS meteorological station (RFEC2). Figure 4 shows the time series of the six-month daily irradiance in MJ·m⁻²·d⁻¹ measured by the NR01, modelled by NN4 and NN4b, and reported at RFEC2. The data were plotted with the clear-sky solar radiation envelope (dashed line), which was calculated on a daily interval for the stations' latitude and elevation.
The clear-sky envelope indicates that both sets of daily measurements may exceed physically realistic limits in April, and that the RFEC2 station may also report physically unrealistic values in August and September. Consequently, the models trained on these values also exceed the clear-sky radiation for some parts of the season. Nonetheless, the time series of the measured and modelled values generally correspond, as is evident during the many cloudy days observed throughout May and in the similar daily trends through July. Likewise, although daily irradiance (Rs) measured during the experiment appears to trend lower than that measured at RFEC2, the full range of output by NN4 and NN4b generally corresponds to RFEC2 throughout the season (Figure 5).

Neural Network Results for 2017 Experiment
As noted earlier, the NN models were able to reproduce the observed time series of net radiation under a wide range of conditions, including a total solar eclipse which occurred during the 2017 experiment (the experiment site was located within the path of totality). However, the accuracy of the five NN models did vary considerably, especially under these particularly unusual conditions (Figure 6). Model performance was evaluated under more normal operating conditions, using a co-located Q-7 thermopile net radiometer for the duration of the experiment as an independent comparator.
Model results for the 2017 data are shown in Table 5, which lists the R², root mean square error (RMSE), and slope of the linear regression between all combinations of the two observations (Q-7 and NR01 sensors) and the various NN models. Although the training period for the 2017 experiment was shorter, the strong correspondence between the NN outputs and the Q-7 net radiometer measurements indicates the robustness of the method to approximate net radiation from many different combinations of input data, even those which did not include any radiation sensor data (NN3b and NN2b). The slopes of the linear regression between the NR01 and the Q-7 sensors deviate from unity in opposite directions for the training period and the eclipse experiment (the Q-7 measured 90% of the NR01 values in the earlier period, and 107% of the NR01 values during the eclipse; Figure 7). This may be indicative of experimental error, as the Q-7 should have demonstrated a reduction in measured radiation due to degradation of the polyethylene wind shield; however, a post-season "sensor inversion" calibration did not indicate a significant difference between the uplooking and downlooking aspects of the Q-7.
The corresponding analysis of the NN models demonstrates similar trends compared to the NR01 data, in contrast with the comparison to Q-7 data (Figure 8). While the general trends indicate the robustness of the NN to reproduce observed five-minute records of net radiation, greater scatter is observed for higher flux magnitudes in estimates by NN3b and NN2b. Both of these NN models lacked radiation data as input, and only used temperature (NN2b) or temperature and humidity (NN3b).
Since temperature and humidity (and the sensors used to measure these parameters) vary more slowly, the lack of radiation sensor inputs in these NN models is likely why they are unable to reproduce short-term transients (usually observed during cloudy sky conditions), although they are able to reproduce phenomena that vary slowly in time (such as diurnal patterns or the eclipse).
In keeping with the intent to evaluate the ability of the model to estimate the energy available for ET and other surface fluxes, the modelled net radiation was compared to the net radiation estimates used by the ASCE and FAO56 standardized reference ET methods [16,22]. As part of the standardization, these methods use only the measured downwelling shortwave radiation, calculate a cloudiness function by comparison to the clear-sky radiation, and apply a standard albedo of 0.23 (based on a standardized grass reference surface). Terrestrial longwave and longwave backscatter from the sky are estimated using empirical functions that employ the measured air temperature and relative humidity. The standardized net radiation is the calibrated downwelling shortwave, minus the reflected shortwave, minus the emitted terrestrial longwave, plus the backscattered longwave. In this case, the standardized method was employed with an independent, calibrated sensor (the Apogee PAR sensor) to determine the five-minute and daily ASCE/FAO56 standardized net radiation. The standardized net radiation is plotted with the measured Q-7 and the five NN outputs, summed for daily periods, in Figures 9 and 10.
In Figure 9, note that the models were trained during the period 21 June to 3 July, so the closeness of the measured and modelled time series during this period does not by itself indicate good model performance.
Disregarding that period, there is still excellent correspondence between the Q-7 net radiation measurements and the NN models that utilized the PAR sensor (NN5, NN3, and NN2), with R² exceeding 0.97 in all three models, and an RMSE of 42.9, 35.5, and 28.9 W m⁻² in the NN5, NN3, and NN2 models, respectively. Later in the season, the two models which did not utilize the PAR sensor (NN3b and NN2b) overpredicted the net radiation, with R² less than 0.93, and RMSE of 74.6 and 78.9 W m⁻² in the NN3b and NN2b models, respectively. This is likely due to field conditions (temperature, humidity, albedo, etc.) which changed over the course of the season; late in the season, observed air temperature and humidity were different from those observed during the training period.
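For reference, the standardized net radiation described above can be sketched using the daily formulation published in FAO-56, in which the terrestrial emission and sky backscatter are combined into a single net outgoing longwave term. This is an illustration only: the five-minute calculation used in this study follows the same structure with a different time base, the function names are hypothetical, and the example inputs are illustrative rather than measurements from the experiment.

```python
import math

SIGMA = 4.903e-9  # Stefan-Boltzmann constant, MJ K^-4 m^-2 day^-1


def saturation_vapour_pressure(temp_c):
    """Saturation vapour pressure in kPa at air temperature temp_c (deg C)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def actual_vapour_pressure(t_max_c, t_min_c, rh_mean_percent):
    """Actual vapour pressure (kPa) from mean relative humidity."""
    es = 0.5 * (saturation_vapour_pressure(t_max_c)
                + saturation_vapour_pressure(t_min_c))
    return es * rh_mean_percent / 100.0


def standardized_net_radiation(rs, rso, t_max_c, t_min_c, ea, albedo=0.23):
    """Daily net radiation in MJ m^-2 day^-1 (FAO-56 daily form).

    rs  : measured downwelling shortwave, MJ m^-2 day^-1
    rso : clear-sky shortwave, MJ m^-2 day^-1
    ea  : actual vapour pressure, kPa
    """
    net_shortwave = (1.0 - albedo) * rs
    cloudiness = 1.35 * min(rs / rso, 1.0) - 0.35   # relative shortwave capped at 1
    mean_t4 = 0.5 * ((t_max_c + 273.16) ** 4 + (t_min_c + 273.16) ** 4)
    net_longwave = SIGMA * mean_t4 * (0.34 - 0.14 * math.sqrt(ea)) * cloudiness
    return net_shortwave - net_longwave


# Illustrative inputs (not from the experiment)
print(round(standardized_net_radiation(14.5, 18.8, 25.1, 19.1, 2.1), 1))
```

Because the longwave term depends only on air temperature, humidity, and the cloudiness ratio, site-specific surface conditions such as actual albedo or surface temperature do not enter this estimate.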

Discussion
While it is unsurprising that simple neural networks can successfully reproduce observed radiation measurements, there have been few attempts to use these methods to corroborate or bias-correct available data. Usually, the provisioning of radiation data from meteorological networks, especially those that serve agriculture or other applied purposes, relies on two costly and tenuous propositions. First, sensors and stations must be maintained and calibrated regularly to prevent the development of long-term bias. While maintenance and calibration of radiometers are standard practice for network weather stations and research equipment, farm operators will not typically be able to justify these costs or spare the required effort among other farm activities. Second, sensor outputs must be checked against standardized methods following regionally specific calibrations, but these calibrations may not reflect site-specific conditions. These site-specific environmental conditions certainly do affect measurements, but more importantly they affect the actual energy available at the surface, which drives all microclimatic and biological processes. Ultimately, a practical constraint on provisioning high-quality data will always be the availability of skilled technical work, both for sensor maintenance and data quality assurance, and this is especially apparent with large agricultural weather networks [53][54][55].
These results are a limited demonstration of simple neural networks which were used to recognize simple patterns and relationships in the data. This analysis has treated data as time-invariant, and more sophisticated ML methods such as long short-term memory (LSTM) networks may offer much better results and the ability to predict future trends [43,45] in ways that this simple approach did not. Earlier work with simple neural networks [48,51] indicates that the robustness of an NN model depends on providing the network with training data that are representative of a full range of conditions, and this is an area in which these results could be expanded. There is an indication that some actual measurement of radiation substantively improves the reliability of the radiation model. While it may not be obviously helpful to model that which is already being measured, it is likely that this supplementary sensor information will be helpful even if the absolute value of downwelling radiation cannot be measured. This suggests the possibility of using extremely low-cost LED sensors to obtain relative light level measurements, and utilizing these in lieu of more expensive calibrated sensors, provided that a calibrated sensor is available to train the low-cost sensor ensemble. These results also indicate that existing networks can serve as an information "backbone" that supports a much denser network of low-cost sensors, leveraging the existing monitoring resources to enable many more site-specific measurements across the landscape, as has been accomplished with other aspects of weather monitoring [56][57][58].
It can hardly be expected that a simple machine learning method will replace standardized methods such as those provided by ASCE or FAO, which represent many careers' worth of engineering and experimental effort, as well as many years of practical experience in field measurement. However, there will always remain a large portion of the landscape and conditions under which standard methods cannot be applied naively or without major technical expertise and effort. One simple, site-specific characteristic which dramatically affects energy availability is topographic aspect. Although most areas are not perfectly flat, the technical considerations of estimating radiation on sloping surfaces [59] preclude its inclusion in the simplest applied methods. Even from this one simple example, the heterogeneity of the landscape should motivate us to adopt tools such as machine learning that can leverage all available data. In this way, ML can offer new insights, not by replacing existing methods but by augmenting our ability to recognize patterns in the landscape and by extrapolating our highest quality measurements to a broader range of conditions and places.

Conclusions
The application of simple neural networks to estimate solar radiation was demonstrated in two case studies in which machine learning could complement standard estimating methods. In the first case, a field experiment conducted in 2015, neural networks accurately estimated missing data from a broken uplooking pyranometer, arguably the most important part of a four-component net radiometer. The replacement data were validated against the uplooking pyranometer data that were collected prior to the sensor failure, and also compared to a nearby RAWS meteorological station. The models that incorporated radiometer data (downwelling longwave and/or upwelling long- and shortwave) were able to reproduce the observed downwelling shortwave radiation with an R² greater than 0.9, and an RMSE less than 100 W m⁻². Models that relied only on other weather parameters (such as air temperature or humidity) did not perform as well, with an R² less than 0.75, and an RMSE greater than 140 W m⁻².
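For readers who wish to reproduce this kind of comparison, the two summary statistics can be computed as in the short sketch below. It assumes that R² denotes the coefficient of determination (one minus the ratio of residual to total sum of squares); if R² is instead taken as the squared correlation coefficient, the values will differ slightly. The `observed` and `predicted` lists are made-up illustrative numbers in W m⁻², not data from this study.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Illustrative values only (W m^-2): a reference pyranometer and a model estimate.
observed = [0.0, 200.0, 600.0, 900.0, 600.0, 200.0]
predicted = [10.0, 180.0, 640.0, 870.0, 620.0, 190.0]

print(f"RMSE = {rmse(observed, predicted):.1f} W m^-2")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

Because RMSE carries the units of the measurement while R² is dimensionless, reporting both, as is done here, separates the absolute size of the errors from the fraction of variance explained.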
In the second case, a field experiment conducted in 2017, neural networks were used to estimate site-specific evapotranspiration with low-cost sensors. The field experiment provided data for machine learning by co-locating higher quality sensors with lower-cost sensors typically used in on-farm monitoring and agro-meteorological networks. This experiment was also able to leverage data from simultaneous experiments conducted during a total solar eclipse that occurred at the site. Neural networks were trained and tested using a variety of input data configurations. All models that utilized some form of radiation data (downlooking pyrgeometers or PAR sensors) were able to reproduce downwelling shortwave radiation or net radiation with high fidelity over five- to fifteen-minute measurement periods, with R² ranging from 0.97 to 0.98, and RMSE values less than 40 W m⁻², as compared to observed values. Neural networks that used only air temperature, humidity, and timestamps estimated solar radiation with less fidelity, with R² ranging from 0.91 to 0.92, and RMSE values greater than 70 W m⁻². These models also appeared to exhibit a stronger dependence on seasonal variation in other environmental conditions such as humidity and air temperature. Nonetheless, even the less accurate models were able to approximate observed trends in measured radiation in many normal cases, and most models were able to reproduce the atypical and extreme drop in radiation observed during the total solar eclipse. These simple case studies suggest that further work should explore the range of valid conditions under which solar radiation could be modelled. Such neural network models would facilitate using low-cost sensors that have been trained, rather than calibrated, to report site-specific solar radiation.