A Wireless Underground Sensor Network Field Pilot for Agriculture and Ecology: Soil Moisture Mapping Using Signal Attenuation

Wireless Underground Sensor Networks (WUSNs) that collect geospatial in situ sensor data are a backbone of internet-of-things (IoT) applications for agriculture and terrestrial ecology. In this paper, we first show how WUSNs can operate reliably under field conditions year-round and at the same time be used for determining and mapping soil conditions from the buried sensor nodes. We demonstrate the design and deployment of a 23-node WUSN installed at an agricultural field site that covers an area with a 530 m radius. The WUSN has continuously operated since September 2019, enabling real-time monitoring of soil volumetric water content (VWC), soil temperature (ST), and soil electrical conductivity. Secondly, we present data collected over a nine-month period across three seasons. We evaluate the performance of a deep learning algorithm in predicting soil VWC using various combinations of the received signal strength indicator (RSSI) from each buried wireless node, above-ground pathloss, the distance between wireless node and receive antenna (D), ST, air temperature (AT), relative humidity (RH), and precipitation as input parameters to the model. The AT, RH, and precipitation were obtained from a nearby weather station. We find that a model with RSSI, D, AT, ST, and RH as inputs was able to predict soil VWC with an R² of 0.82 for test datasets, with a Root Mean Square Error of ±0.012 m³/m³. Hence, a combination of deep learning and other easily available soil and climatic parameters can be a viable candidate for replacing expensive soil VWC sensors in WUSNs.


Introduction
Wireless Underground Sensor Networks (WUSNs) have been increasingly studied over the past two decades for terrestrial, agricultural, and ecological applications [1][2][3][4][5][6][7][8][9], including the demonstration of a fully buried, spatially distributed sensor node network that will not disrupt agricultural and/or industrial processes [10][11][12][13][14][15][16][17]. Such sensor networks have been explored as the backbone for geospatial internet-of-things (IoT) for agricultural applications [18]. Data gathered from such networks, combined with data curation and artificial intelligence-based analysis, is anticipated to be a significant component of future digital farming and ecological practices provided challenges in cost and scalability are met [18][19][20][21][22]. There are two important areas of WUSN applicability in agriculture and land ecology that require further examination. First, there is the need to test WUSN performance under field conditions over an extended period to examine robustness and scalability. The second area relates to the potential use of the electromagnetic wireless signal, currently only used for data transmission, to become an indicator of the soil environmental conditions, as the signal strength has been shown to be sensitive to variations in soil water content and other soil factors [10,11].
Soil volumetric water content (VWC) is a crucial parameter of interest for many applications in ecology, soil science, and agriculture [23,24]. Methods available for capturing the soil-hydrological system's natural heterogeneity are unsatisfactory at the 1 to 1000 m² scale. For instance, remote sensing measurements are limited to low spatial and temporal resolutions [25], while point measurement sensors sample small volumes, are relatively expensive (typically a few hundred US dollars each), and individually do not reflect the heterogeneity of the soil. Thus, there is a need for soil VWC sensor systems that can operate at this intermediate spatial scale and at high temporal resolution.
The primary goal of the work reported in this paper is to show that a WUSN-based IoT system with soil data sensing, curation, and mapping can be operated successfully and continuously on farms under "real-life" conditions. We demonstrate this by presenting results from a nine-month pilot experiment in which we designed and operated a 23-sensor WUSN over a 130-acre site on an active soy/corn farm. An additional goal of our work has been to show that a combination of RSSI data and machine learning makes it possible to map soil VWC without the need for in-ground point moisture sensors, which can be cost prohibitive for large networks.
Soil moisture is an important input for flood and drought forecasting, in soil ecological studies, and in precision agriculture for effective nutrient management [51][52][53][54][55][56][57]. For this purpose, as noted earlier, it is important to be able to measure soil moisture at high resolutions (1-1000 m²), and efficient techniques for this are lacking.
Different machine learning-based models have been developed with varying degrees of success for soil moisture estimation using a variety of environmental inputs (such as air and soil temperature, relative humidity, etc.), with some requiring initial soil moisture measurements as inputs [25][58][59][60][61][62][63][64][65][66]. In addition, it has been pointed out that the Received Signal Strength Indicator (RSSI) parameter, which quantifies signal attenuation through soil, is an easily measurable variable for any WUSN and can be a powerful input for soil moisture measurement. Microwave propagation through soil is strongly affected by the soil's dielectric properties [67], which in turn are affected by the soil texture and by the amount of organic matter and water present in the soil. This latter effect can be strong, since water and soil have significantly different dielectric constants of ~80 and 3-5, respectively. Small changes in soil moisture can therefore affect the propagation of the WUSN signal [68] and can be detected via changes in RSSI.
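To give a feel for the size of this dielectric contrast, the following minimal Python sketch computes the wavelength inside a medium and the fraction of power reflected at an air/medium boundary at normal incidence. It assumes a lossless, non-magnetic medium, which real moist soil is not, so the numbers are only indicative. The permittivity of 4 for dry soil sits inside the 3-5 range quoted above, while the value of 25 for wet soil is an illustrative assumption rather than a measurement from this study.

```python
import math

C = 299_792_458.0   # speed of light in vacuum, m/s
FREQ_HZ = 901.2e6   # carrier frequency of the Sigfox link used in this work


def wavelength_m(freq_hz: float, eps_r: float) -> float:
    """Wavelength in a lossless, non-magnetic medium of relative permittivity eps_r."""
    return C / (freq_hz * math.sqrt(eps_r))


def power_reflectance(eps_r: float) -> float:
    """Fraction of incident power reflected at an air/medium boundary (normal incidence)."""
    n = math.sqrt(eps_r)  # refractive index of the lossless medium
    return ((1.0 - n) / (1.0 + n)) ** 2


# Permittivities: 1 (air), 4 (dry soil, within the 3-5 range in the text),
# 25 (wet soil, ASSUMED for illustration), 80 (water, from the text).
for label, eps_r in [("air", 1.0), ("dry soil", 4.0), ("wet soil (assumed)", 25.0), ("water", 80.0)]:
    print(f"{label:>20}: wavelength = {wavelength_m(FREQ_HZ, eps_r):.3f} m, "
          f"reflected power fraction = {power_reflectance(eps_r):.2f}")
```

Under these simplifying assumptions, wetter soil both shortens the wavelength and reflects more power at the soil surface, which is qualitatively consistent with RSSI responding to moisture. Conductive losses, which the sketch ignores, would add further attenuation.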
Several researchers have studied the relationship between attenuation of the electromagnetic signal and soil moisture [10,11,36,38,69]. Soil moisture determination via RSSI changes would enable high resolution measurements without relying upon expensive soil sensors. Aroca et al. [70] used a buried passive RFID network and calibrated a neural network model for soil moisture prediction using RSSI as an input. However, due to a limited transmission range, the technique needs proximal signal reception via a mobile robot and can therefore be limited in scalability and resolution. Rodic et al. [33] installed a low-power, LoRa-based, underground sensor node and two above-ground gateways, and optimized deep learning techniques to predict soil moisture from the RSSI at one buried location. All these experiments were carried out under limited and controlled conditions or did not explore the mapping of moisture using RSSI across large areas enabled by multiple-sensor WUSNs. Since signal attenuation can also depend upon the field's topology, seasonal conditions, various other field conditions (such as crop coverage), and the system's physical parameters [1,10,17], the relationship between RSSI and VWC requires field validation under realistic operating conditions. Such work has been lacking so far.

Contribution of This Work
In the present study we offer two contributions pursuant to our research goals noted in the introduction section. Firstly, we demonstrate the design and deployment of a WUSN and then examine the results of 9 months of continuous operation of a fully buried, non-intrusive WUSN at an agricultural field located at Fermi National Laboratory (Fermilab, Batavia, IL, USA). Through this demonstration, we show that a WUSN can operate under all-season, realistic field conditions. The wireless network is based on the increasingly used unlicensed ISM band, deployed around 900 MHz (in the USA) and made popular through low-power IoT wireless networks such as LoRa and Sigfox. The WUSN collects and maps soil VWC, electrical conductivity (EC), and soil temperature (ST) simultaneously, and the data can be visualized in near real-time through a web-based user interface. We discuss the performance of the WUSN through seasonal weather variations (fall, winter, and spring) and the impacts of climate, soil, and site characteristics affecting the sensor module transmissions.
Secondly, we ask the question: Can attenuation of the wireless signal in-ground be used as a robust predictor of soil VWC under real-life conditions in a field and over changing seasonal climatic conditions? In doing so, we examine whether the RSSI of the signal that carries the sensor measurements to the base antenna could also be used to predict and spatially map soil VWC under long-term field testing. We found that, while RSSI alone was not able to predict soil VWC accurately over time, the addition of site-specific physical, soil, and climatic factors as input parameters, along with the use of available, off-the-shelf machine learning algorithms, significantly improves predictive performance for VWC. We evaluate three well-known machine learning techniques: support vector regression (SVR) [59,60], extreme learning machine (ELM) [61], and artificial neural network-multilayer perceptron (ANN-MLP) [59,66]. Among the tested models, ANN-MLP greatly increased the predictive capability for VWC. We propose this machine learning-based approach as an alternative method for estimating soil VWC with WUSNs, thereby obviating the need for expensive point sensors for VWC measurement (which can limit geographical scalability). We note that the scope of the paper is limited to exploring the efficacy of readily available machine learning algorithms in extracting soil moisture from our sensor network data, and does not extend to detailed research on the algorithms themselves.
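Predictive performance in this kind of comparison is typically summarized with the coefficient of determination (R²) and the root mean square error (RMSE), the two metrics quoted in the abstract. The sketch below shows one common way to compute them in plain Python. It assumes R² is defined as 1 − SS_res/SS_tot, which may differ from the exact definition used by the authors' software, and the sample values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the measured values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted VWC values (m3/m3), for illustration only.
measured = [0.30, 0.34, 0.38, 0.41]
predicted = [0.31, 0.33, 0.39, 0.40]
print(f"RMSE = {rmse(measured, predicted):.4f} m3/m3, R2 = {r_squared(measured, predicted):.3f}")
```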
The rest of the paper is organized as follows. Section 2 explains the sensor node development, WUSN architecture, and the experimental setup used in the work. Section 3 shows the preliminary results from the WUSN, wireless transmission, and effects of soil moisture and elevation and describes the performance results of soil moisture prediction using different machine learning techniques. The paper's conclusion is in Section 4.

Sensor Node Development
Thoreau 2.0 is a modified version of an earlier system developed and operated at the University of Chicago campus described in Zhang et al. (2017) [11]. The wireless backbone of the Thoreau 2.0 system is a Sigfox 901.2 MHz low-power IoT wireless network, and the architecture contains three components: (1) the buried sensor nodes, (2) a base station, and (3) a user interface (Figure 1).

Each sensor node consists of a sealed carbonate casing (16 cm × 8 cm × 8.5 cm) that contains the electronics, power source, and a transmitting antenna (Figure 1 inset shows the electronic components and antenna, while the power rack is housed beneath the electronics and cannot be seen in the picture). Each node is connected to an external sensor that simultaneously measures soil VWC, ST, and EC (TEROS12, Meter Environment, Pullman, WA, USA) in the vicinity of the box. The mechanical design of the sensor node box is critical. Building upon our experience from the first-generation WUSN (Thoreau 1.0, [11]), a hermetically sealed box was designed (Windy City Lab, Chicago, IL, USA) that could withstand extreme water and temperature fluctuations. We used O-rings around the lid to prevent leaks and silicone sealant to waterproof the encasing. A watertight, nylon cable gland was used for cable connection to the external sensor. Additionally, a conformal coating was applied to all electronics inside the box. A humidity sensor inside the box reports the internal humidity of the sensor node. The nodes have been running for more than 29 months (6 September 2019 to the time of paper submission) and have withstood high heat and cold conditions in the field.
The sensor node electronics consist of a microcontroller unit (MCU) (Xenon, MCU V001) that manages the sensor node core functions, such as data acquisition, radio frequency (RF) transmission, and power use; and an RF transceiver (Sigfox Thinxtra board) that is integrated with an RF trace antenna and a power amplifier (transmission power ~22 dBm). Each sensor node is powered by a battery rack containing 4 AA lithium batteries. Power usage analysis indicates that each sensor node can be active for up to 4.5 years with one set of batteries.
In the sensor nodes, as a way to conserve power, a watchdog timer awakens the MCU from sleep mode every 30 min. The MCU subsequently wakes up the sensor, collects a measurement, and encodes the data prior to transmitting to the base station through the RF transceiver. By default, the RF transceiver initiates a transmission by sending three uplink packets in sequence on three random carrier frequencies to the base station. However, in order to conserve power, and based upon our experience with Thoreau 1.0, we modified the firmware to produce only two uplink packet transmissions for this project. A single base station (Sigfox Macro, SBS-T-902v3) is installed in the center of the field, with an antenna mounted at a height of 10 m from the ground. The base station receives transmissions from nodes located anywhere within an approximately 530 m radius of the antenna. The base station is powered via a photovoltaic array with battery backup. The locations of the base station antenna and the sensors are shown in Figure 2. The Sigfox IoT Network receives the signal from the base station. RSSI is then calculated by Sigfox based on the average radio signal intensity of the packets correctly received per transmission. The RSSI varies over time as soil parameters change, and it is these natural variations that our machine learning algorithms identify during training to determine soil moisture from RSSI values. The base station antenna receives the data packets from the sensor nodes and transmits them (via the internet) to a cloud-based backend operated by Sigfox. The data is then accessed (using a callback API), curated, and visualized on Thoreau (Thoreau-Home, uchicago.edu) in near real-time. Thoreau is a cloud-based, open web interface [71] where all data is easily accessible and available for download by the public.
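The duty-cycled operation described above (a wake-up every 30 min, with two instead of three uplink packets) can be reasoned about with a simple time-weighted current budget. The sketch below is only an illustration of that reasoning: apart from the 30 min wake period, every current, duration, and capacity value is a hypothetical placeholder, not a figure from the authors' power analysis, and it is not intended to reproduce the 4.5-year estimate. It also lumps sensing and transmission into a single active current, which is a simplification.

```python
HOURS_PER_YEAR = 24 * 365.25
WAKE_PERIOD_S = 30 * 60  # watchdog wake-up interval stated in the text


def active_seconds(n_uplinks: int, per_uplink_s: float, overhead_s: float) -> float:
    """Time spent awake per cycle: fixed sensing overhead plus one slot per uplink packet."""
    return overhead_s + n_uplinks * per_uplink_s


def average_current_ma(active_s: float, active_ma: float, sleep_ma: float,
                       period_s: float = WAKE_PERIOD_S) -> float:
    """Time-weighted mean current over one wake/sleep cycle."""
    if not 0 <= active_s <= period_s:
        raise ValueError("active time must lie within one period")
    sleep_s = period_s - active_s
    return (active_ma * active_s + sleep_ma * sleep_s) / period_s


def battery_life_years(capacity_mah: float, avg_ma: float) -> float:
    """Idealized lifetime, ignoring self-discharge and temperature effects."""
    return capacity_mah / avg_ma / HOURS_PER_YEAR


# HYPOTHETICAL placeholder values, chosen only to make the example run.
ACTIVE_MA, SLEEP_MA = 50.0, 0.01
PER_UPLINK_S, OVERHEAD_S = 2.0, 1.0
CAPACITY_MAH = 3000.0

for n_uplinks in (3, 2):
    avg = average_current_ma(active_seconds(n_uplinks, PER_UPLINK_S, OVERHEAD_S),
                             ACTIVE_MA, SLEEP_MA)
    print(f"{n_uplinks} uplinks: average current {avg:.3f} mA, "
          f"idealized life {battery_life_years(CAPACITY_MAH, avg):.1f} years")
```

Whatever the actual values are, the structure of the calculation shows why dropping one uplink packet per cycle helps: the active term dominates the average current when the sleep current is small.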

Experimental Setup
Thoreau 2.0 WUSN was installed in an agricultural field located at Fermilab. The field is cultivated utilizing corn-soybean crop rotation management, and the soil is tilled to a ~25 cm depth every other year after the corn is harvested. Twenty-three sensor nodes were deployed within a ~530 m radius area from the base station and were buried ~40 cm deep. The sensors deployed with each node were also buried at a ~40 cm depth, coplanar with the tops of the sensor node boxes. This depth was chosen to avoid interference with farming activities and to avoid sensor displacement and damage due to plowing (plowing depth is typically ~25 cm). Being a noninvasive system is an important consideration and requirement for many buried agricultural network applications. Soil motion can also occur due to "frost heave", which occurs when ice lenses form in the soil, usually in cold environments with fine-textured soils. As the ice lenses grow, the soil can move upward due to pressure. In our study, the majority of the soils were of the Markham series. The sensors were buried 40 cm deep in the "B horizon", which is classified as silty clay loam soil with 35-45% clay content and moderately textured. The amount of clay in our soils leads to lower hydraulic conductivity and small void volume. The resultant drop in capillary flow with clay content tends to reduce the severity of frost heave because it impedes ice lens formation [72]. Because of the moderate clay content in our soils, as well as the moderate texture, we do not expect significant frost heave over the time period of our experiments (~9 months). In addition, the antennas contained within the rugged sensor boxes are omnidirectional dipole antennas, and small motions in the soil cannot alter the RSSI readings significantly.
As mentioned in the previous section, each sensor node is integrated with sensors that simultaneously measure soil VWC, ST, and EC. Additionally, weather parameters, including air temperature (AT), relative humidity (RH), and precipitation (P), were obtained from a weather station located at Fermilab near the field site that collected measurements every 5 min (https://wwwesh.fnal.gov/pls/default/weather.html (accessed on 25 June 2020)) (Figure 2). We determined the soil type adjacent to each of the sensor nodes by using the Natural Resources Conservation Services Online Soil Survey (https://www.nrcs.usda.gov/wps/portal/nrcs/main/soils/survey/ (accessed on 15 December 2019)) to map the soil types within the field area. We superimposed this map and the sensors' geographical coordinates onto a Google Earth (https://www.google.com/earth/ (accessed on 10 January 2020)) image of the field site area (Figure 2). Sensor nodes were adjacent to five soil types: Elliott (146A), Drummer (152A), Peotone (330A), Mundelein (442A), and Markham (531B) (Figure 2).

Temporal Variation of Soil Properties
Measurements were collected over 289 days from 6 September 2019 through 20 June 2020. During this time, the WUSN operated continuously and produced 112,000 measurements after data curation and QA/QC. Descriptive statistics for soil VWC, EC, and ST, and weather variables AT, RH, and P averaged for the three seasons comprising the study time are shown in Table 1. Figure 3 shows daily means for these variables and precipitation events. The overall high average soil EC values of the field site indicate good soil fertility. The site is composed of silt loam and silty clay loam soils that are prime farmland soils. Nevertheless, the soils have a tendency of being too wet, potentially creating nutrient build up, as indicated by high maximum soil EC and VWC, particularly in the spring, as shown in Table 1. As expected, soil VWC increases after each precipitation event (Figure 3a). A positive correlation of soil VWC with EC was observed throughout the study time (R = 0.47; Pearson correlation analysis, MATLAB R2020b). Analyzing the data by season, the correlation coefficients of soil VWC with EC are 0.54, 0.45, and 0.29 in fall, spring, and winter, respectively (Pearson correlation analysis, MATLAB R2020b). This correlation is not surprising: the more water there is in the soil, the more cations are in solution, so the soil's capacity to conduct electricity increases, resulting in higher EC values. Weather variations in RH, ST, and AT occurred throughout the study time, with cold winter and warmer spring temperatures, and frequent fluctuations in air temperature and relative humidity typical of continental climates (Figure 3c,d).
Figure 3. Daily means ± standard deviation of soil volumetric water content (VWC) (a), electrical conductivity (EC) (b), relative humidity (c), and soil and air temperature (d). Relative humidity and air temperature measurements were collected from the weather station shown in Figure 2. Blue bars are cumulative daily precipitation events. Shaded bands indicate the ± standard deviation of the mean for VWC, EC, and RH.
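The Pearson correlation coefficients reported above were computed in MATLAB R2020b. For readers who want to repeat the analysis on the publicly available data, the same statistic can be sketched in a few lines of plain Python. The function name and the sample values below are illustrative assumptions and are not taken from the dataset.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired readings of soil VWC (m3/m3) and EC, for illustration only.
vwc = [0.30, 0.33, 0.36, 0.40, 0.42]
ec = [0.20, 0.26, 0.24, 0.31, 0.35]
print(f"R = {pearson_r(vwc, ec):.2f}")
```

Computing the coefficient separately for subsets of the records grouped by season would give the per-season values in the same way.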

System Performance
Electromagnetic wave transmissions can be attenuated by several factors, including distance between a sensor node and the base station [2,10]. Figure 4a,b shows the percentage of data packets received by the base station from the sensors during the 9 months of the study. Results indicate that there is a substantial loss of data packets from the sensor nodes that are farthest away from the base station (Figure 4a). In addition, we found that the amount of data received is not equal across sensors installed at the same distance from the base antenna. Analysis of the data packets received from each sensor superimposed onto a topographic map (Figure 4b,c) suggests that the percentage of data packets received is a function of sensor node distance to the base station, terrain elevation, and the localized concentration of soil moisture. For instance, sensors B3S20, B2S8, B2S11, B2S12, B2S13, and B2S21; and B3S19, B3S17, and B2S6 are installed approximately equidistant to the base station, but they differ in the number of data packets received by the base station. Sensor B2S8 (green pin in Figure 4b) is in a floodplain, leading to fewer data packets received compared to the nearest sensors B2S11 and B3S20, while sensors B2S7, B2S12, and B2S13 (red pins in Figure 4b) only differ in their topographic position and elevation yet have different data packet percentages (Figure 4b).
Figure 4. The average percentage of data packets received by the base station from the sensor nodes as a function of the horizontal distance from the base station (a) and from each sensor as they were located in the field (b). The size and color of the filled circles represent the percentage of data packets received. The flag indicates the base station location; yellow, red, and green pins represent sensors installed at 740 and 735 feet elevation and those installed in the flood plain zone of the study area, respectively (b). The soil topographic map (c) used to determine site elevation is retrieved from: https://www.usgs.gov/search-map?search=3DEP (accessed on 25 November 2020).
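A per-node reception percentage of the kind plotted in Figure 4 can be estimated by comparing the number of reports that arrived with the number that were scheduled. The sketch below assumes one scheduled report every 30 min per node, the study window of 6 September 2019 through 20 June 2020, and that each report counts as one "data packet". How the authors actually counted packets is not specified in the text, so this is only one plausible reading, and the received count used in the example is hypothetical.

```python
from datetime import datetime, timedelta

REPORT_INTERVAL = timedelta(minutes=30)  # wake-up interval of each sensor node


def expected_reports(start: datetime, end: datetime,
                     interval: timedelta = REPORT_INTERVAL) -> int:
    """Number of scheduled reports in the half-open window [start, end)."""
    if end <= start:
        return 0
    return (end - start) // interval  # timedelta // timedelta yields an int


def reception_percent(received: int, expected: int) -> float:
    """Share of scheduled reports that reached the base station, in percent."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return 100.0 * received / expected


# Study window from the text; the end bound is exclusive, so 21 June covers 20 June fully.
start = datetime(2019, 9, 6)
end = datetime(2020, 6, 21)
scheduled = expected_reports(start, end)

# HYPOTHETICAL received count for a single node, for illustration only.
received = 9000
print(f"scheduled = {scheduled}, received = {received}, "
      f"reception = {reception_percent(received, scheduled):.1f}%")
```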

Spatial Variation of Volumetric Water Content Before, During, and After a Precipitation Event
Soil chemistry attributes vary from site to site due to the soil's heterogeneous nature, and are influenced by the water-holding capacity of soils and water availability. For instance, some soils dry faster than others because of their high porosity. Soils in northern Illinois are generally high in silt and clay, somewhat poorly drained, and have high soil water holding capacity because of high organic matter content. Humid climate conditions prevail in this area, which receives an average of 900 mm of rainfall per year, plus a variable amount of snowmelt in winter and spring. For farmers in this area, one of the most important needs is to know when the soil conditions are adequate for planting, since spring planting in overly wet conditions can result in surface compaction, leading in turn to poor plant emergence and root development (and reduced yield). WUSNs have the capability to determine the spatial distribution of soil moisture at any given time. For example, the heatmaps in Figure 5 show the spatial distribution of the daily mean soil VWC before and after a rainfall event. The heatmaps were generated by linearly interpolating between measured datapoints, as in Harris et al. (2020) [73], and overlaying the results over a base map obtained from the OpenStreetMap API from Mapbox [74]. Rain began on 9 March 2020, resulting in a 24.6 mm precipitation event. Although the soil VWC was already high before it rained, this event elevated soil VWC from 0.36 to 0.41 m³/m³ (± 0.03 sensor accuracy) for the next 3-4 days, after which soil VWC levels dropped back to 0.37 m³/m³ (± in situ sensor accuracy). Interpolation of measurements and mapping of soil conditions and attributes in near real-time is important and can inform agricultural management decisions, one of the many new applications for WUSN technologies.
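The text states only that the heatmaps were produced by linear interpolation between measured datapoints. One standard way to interpolate linearly between scattered points is barycentric interpolation inside triangles formed by neighbouring sensors, and the sketch below illustrates that idea for a single triangle. It is not the authors' implementation: the sensor coordinates are hypothetical local positions in metres, and the VWC values merely echo the magnitudes quoted above.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: the three sensors are collinear")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def interpolate_vwc(p, sensors, tol=1e-9):
    """Linearly interpolate VWC at p from three (position, vwc) pairs.

    Returns None when p lies outside the triangle, since linear interpolation
    is only defined inside the convex hull of the measurements.
    """
    (a, va), (b, vb), (c, vc) = sensors
    w1, w2, w3 = barycentric_weights(p, a, b, c)
    if min(w1, w2, w3) < -tol:
        return None
    return w1 * va + w2 * vb + w3 * vc


# HYPOTHETICAL sensor positions (metres, local grid) with illustrative VWC values (m3/m3).
sensors = [((0.0, 0.0), 0.36), ((100.0, 0.0), 0.41), ((0.0, 100.0), 0.37)]
print(interpolate_vwc((25.0, 25.0), sensors))
```

A full field map would first triangulate all sensor positions (for example with a Delaunay triangulation) and then apply the same weights within each triangle. Library routines such as `scipy.interpolate.griddata` with `method="linear"` follow this approach.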

Estimation of VWC from Received Signal Strength Indicator (RSSI)
The RSSI value received by the base station is generally lower than the original sensor node transmission. This signal attenuation is the result of the electromagnetic wave traveling through the soil medium and the free-space path in the air and intervening vegetation. Variations in soil moisture also affect RSSI values because the presence of moisture alters the soil's dielectric response to the propagation of the signal [10]. Figure 6 denotes RSSI values typically received by the base station, plotted against the soil VWC and the amount of precipitation at that time. As Figure 6 indicates, soil VWC rises following precipitation, sometimes reaching the upper detection limit of the sensor, and then decreases within a few hours to days due to soil water drainage and evaporation (when the soil is bare) or evapotranspiration (when plants are present). Correlation of soil VWC with RSSI is not present in winter, but it appears over the course of spring and early fall ( Figure 6).

A correlation analysis using all data indicates a weak but statistically significant negative correlation (R = −0.25, p < 0.005) of soil VWC with RSSI. This analysis was conducted with a total of 92,000 datapoints, obtained after removing soil VWC values above the maximum detection limit of the sensors, 0.45 m3/m3 (4,000 datapoints), and incomplete records caused by a temporary malfunction of the weather station (another 12,000 datapoints).
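The filtering and correlation steps described above can be sketched in a few lines. This is a minimal illustration only, not the authors' analysis code: the record layout, the field names (`rssi`, `vwc`, `air_temp`), and the sample values are hypothetical, and only the 0.45 m3/m3 detection limit comes from the text.

```python
import math

VWC_MAX = 0.45  # upper detection limit of the VWC sensor (m3/m3), from the text


def clean(records):
    """Drop records with missing fields or with VWC above the detection limit."""
    return [
        r for r in records
        if all(v is not None for v in r.values()) and r["vwc"] <= VWC_MAX
    ]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up records for illustration (field names are assumptions).
records = [
    {"rssi": -90.0, "vwc": 0.30, "air_temp": 10.0},
    {"rssi": -92.0, "vwc": 0.34, "air_temp": 11.0},
    {"rssi": -95.0, "vwc": 0.40, "air_temp": None},  # weather gap: dropped
    {"rssi": -96.0, "vwc": 0.47, "air_temp": 12.0},  # above limit: dropped
    {"rssi": -94.0, "vwc": 0.38, "air_temp": 12.5},
    {"rssi": -91.0, "vwc": 0.36, "air_temp": 9.0},
]

kept = clean(records)
r = pearson_r([k["rssi"] for k in kept], [k["vwc"] for k in kept])
print(f"kept {len(kept)} of {len(records)} records, R = {r:.2f}")
```

With real field data the coefficient would be far weaker than in this toy sample; the point is only the order of operations (filter first, then correlate).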
Electromagnetic signal propagation is also affected by terrain elevation and by the distance of the sensor node to the base station, as shown in Figure 4, as well as by climatic and weather factors identified in the literature [75,76]. We therefore investigated whether a nonlinear approach using standard machine learning algorithms could improve the prediction of soil VWC by taking into consideration RSSI together with the various factors that might affect signal transmission from the sensors.
As noted earlier, we used an Artificial Neural Network with Multilayer Perceptron (ANN-MLP) algorithm (details in [58,77–80]) for constructing soil VWC predictive models. MATLAB R2020b was used for the neural network development, training, and simulations [81]. The total dataset was randomly split into 70% for training (64,000 datapoints), 15% for validation (13,500 datapoints), and 15% for testing (13,500 datapoints). Soil VWC in situ measurements (collected by the external soil sensor) were used as ground truth data and as the target parameter for the model. Available input parameters that could be used to train the model included WUSN site parameters (RSSI and the hypotenuse distance (D) between the buried sensor node and the base station antenna), a soil parameter (ST), and weather parameters (P, AT, and RH). Weather parameters were averaged over 30 min intervals to harmonize all data to that time interval for the model.
WUSN site parameters are those that can be obtained directly from the deployment of the WUSN in the field. Because the signal transmitted from a buried node first travels through the soil, the power (PT, in dB) at the soil surface depends on the moisture content of the soil. After traveling through the air with a free-space pathloss proportional to 20 × log(D), where D is the hypotenuse distance from the buried sensor to the antenna, the signal is received at the base station antenna with a certain RSSI. Hence, one can approximate PT as being proportional to the sum of 20 × log(D) and RSSI. The distance D appears as a separate parameter, in addition to the 20 × log(D) term in the estimate of PT, in order to capture propagation effects, such as multipath, that are not accounted for in the free-space pathloss. These site parameters, along with the measured weather and soil variables, were used as input parameters for evaluating the machine learning algorithm.
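As a rough illustration of how the two site inputs could be derived for one node, the sketch below computes D and the RSSI + 20 × log(D) term. Several details are assumptions rather than statements from the paper: the logarithm is taken as base 10 (the usual convention for pathloss in dB), RSSI is taken to be in dBm, D is obtained from a horizontal separation and a vertical offset, and the function names and sample values are hypothetical.

```python
import math


def hypotenuse_distance(horizontal_m, vertical_m):
    """Straight-line distance D (m) between buried node and antenna.

    Assumes D is the hypotenuse of the horizontal separation and the
    vertical offset between the node and the antenna.
    """
    return math.hypot(horizontal_m, vertical_m)


def site_features(rssi_dbm, d_m):
    """Return the two site inputs: (RSSI + 20*log10(D), D).

    The first term approximates, up to a constant, the power PT at the
    soil surface; base-10 log is an assumption.
    """
    return rssi_dbm + 20.0 * math.log10(d_m), d_m


# Made-up example: node 300 m away horizontally, 40 m vertical offset.
d = hypotenuse_distance(300.0, 40.0)
pt_proxy, d_feature = site_features(-95.0, d)
print(f"D = {d:.1f} m, RSSI + 20*log10(D) = {pt_proxy:.1f}")
```

Because the free-space term is added back to RSSI, the first feature varies mainly with the in-soil attenuation, while D on its own lets the model pick up distance-dependent effects that free-space pathloss does not describe.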
Trial and error was used to select the optimum number of hidden layers and number of neurons. Table A1 in Appendix A shows the performance statistics for different combinations of hidden layers and number of neurons. Ultimately, a five-layer feed-forward neural network comprising one input layer, three hidden layers, and one output layer, with each hidden layer containing 55 neurons, was selected based on its performance in VWC predictions. The Levenberg-Marquardt algorithm was used for network training. To avoid overtraining, we used an early stopping training method. Model performance was assessed by comparing the estimated soil VWC with the in situ, ground-truth measurements using the coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE). All input and target parameters were normalized to the range from 0.2 to 0.8, as suggested by Cigizoglu (2003) [82]. Table 2 shows the performance of this model in training, validation, and testing with various combinations of input parameters. The six-input parameter model, containing all site and climate input parameters, predicted soil VWC with very good accuracy and low RMSE and MAE (Table 2).
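The normalization and the three performance statistics can be sketched as follows. This is a minimal illustration under stated assumptions: the scaling is taken to be a linear min-max mapping onto [0.2, 0.8], R2 is computed as 1 − SS_res/SS_tot (one common definition, which may differ from what the MATLAB toolbox reports), and the function names and sample values are hypothetical.

```python
import math


def scale_02_08(values, lo=0.2, hi=0.8):
    """Linearly map values onto [lo, hi] (assumed min-max scaling)."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # Constant input: place every value at the middle of the range.
        return [(lo + hi) / 2.0] * len(values)
    return [lo + (hi - lo) * (v - vmin) / span for v in values]


def r2_rmse_mae(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired measured and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n), mae


# Made-up VWC values (m3/m3) for illustration only.
measured = [0.30, 0.35, 0.40, 0.45]
predicted = [0.31, 0.34, 0.41, 0.44]

scaled = scale_02_08(measured)
r2, rmse, mae = r2_rmse_mae(measured, predicted)
print(scaled)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```

In practice the minimum and maximum would be taken from the training set and reused for the validation and test sets, and predictions would be mapped back to physical units before RMSE and MAE are reported in m3/m3.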
In contrast, running the model with only site input parameters, i.e., using only RSSI + 20 × log(D) and D as input parameters, substantially reduced the performance of the model and produced unacceptable predictions (Table 2, two-parameter model). We determined that ST is an important variable for the model's predictive ability: any model was substantially improved when ST was included as an input parameter (Table 2, three-, four-, and five-parameter models). Similarly, but to a lesser extent, model performance could be further improved by adding AT and RH as input parameters (in addition to RSSI + 20 × log(D), D, and ST), leading to four- and five-parameter models (see Table 2). For example, Figure 7 shows measured and predicted soil VWC when using the ANN-MLP model with RSSI + 20 × log(D), D, ST, AT, and RH as input parameters. Including P in the model did not affect model performance (Table 2). Since precipitation is infrequent, there were few non-zero P inputs to the algorithm, which might have reduced its influence on model performance. It is also possible that the effect of P was already accounted for, indirectly, by variations in RSSI and RH.
While RSSI is expected to be, and under limited testing conditions has previously been shown to be, affected by soil VWC through changes in the soil's dielectric properties [10,36,38,69], our results summarized in Table 2 (and the seasonal examples highlighted in Figure 6) clearly show that RSSI alone is not a consistently reliable predictor of soil VWC. We therefore propose the use of this algorithm as an alternative method for estimating soil VWC with WUSNs. Indeed, dropping RH as an input parameter and using a four-parameter model also leads to good prediction of soil VWC (R2 = 0.82). The approach described above, while obviating the need for an expensive VWC sensor at every node, does require other soil and weather data. However, the many algorithms developed to predict ST from AT can make this approach feasible [83][84][85][86].
We compared the performance of the six-parameter ANN-MLP model with two other commonly used machine learning algorithms: Support Vector Regression (SVR) [87][88][89] and Extreme Learning Machines (ELMs) [90,91]. We used these algorithms to predict soil moisture in the same way as for our ANN-MLP model; the implementation details of the two models are described in Appendix B. The performance statistics of the two models are presented in Table 3, which shows the best results across a collection of SVR kernel functions and ELM activation functions. Grid searches for these two models yielded poorer results than the ANN-MLP model. The best SVR model used a radial basis function (RBF) kernel and resulted in R2 = 0.56 and RMSE = 0.053 m3/m3 for the testing stage; the best ELM model used a sigmoid activation function and resulted in R2 = 0.48 and RMSE = 0.06 m3/m3 for the testing stage. The reason for these models' poor performance in comparison to the ANN-MLP model is unclear and beyond the scope of the current paper. We can conclude, however, that our deep learning approach surpasses the other surveyed methods even without a thorough grid search, and that the ANN-MLP model is a useful tool for predicting geospatial VWC.
We now make a few comments regarding our analysis. Firstly, note that we did not include the polarization and antenna gain of the sensor node transmitters as input parameters. Polarization and antenna gain cannot be controlled precisely, especially in a real-world field deployment such as the one described in this paper. With ML, if measurements from each sensor are adequately represented in the training set, these factors, as well as other hidden factors such as differences in terrain, will be learned by the model during the training phase. In the experiment, 92,000 datapoints were collected from 23 sensor nodes over 9 months, ensuring that measurements from each sensor node were well represented in the training data. Furthermore, in estimating soil moisture, the ML model uses the relative difference in RSSI between wet soil and dry soil for each sensor; hence, the absolute effects of polarization, antenna gain, and terrain do not need to be modeled accurately, since these do not change with the level of moisture in the soil. Secondly, as noted in Section 3.2, some sites can have a lower number of successful transmissions due to factors such as distance and topography. To examine this effect, we tested the model separately with data from the sites that had >50% and <50% data packet transmission rates, and the difference in performance statistics during testing was not significant. However, this is a parameter that will likely need to be examined and tested in other site installations.

Conclusions
In this paper, we present the successful design and deployment of Thoreau 2.0, a scalable, low-power WUSN for subterranean sensing. The WUSN has been operating continuously in an agricultural field since September 2019, and we present data collected over 9 months (three seasons). High temporal and spatial resolutions were accomplished by measuring soil conditions at 30 min intervals and interpolating and mapping the sensor results over an area with a radius of approximately 530 m. We showcase the ability of our WUSN to monitor and map real-time variations in soil VWC, ST, and EC. Our results show that such WUSNs can be reliably operated under "real-world" conditions and are scalable. Using the RSSI signal from the WUSN along with other inputs as proxies, we developed a deep learning model that can accurately predict and map an important parameter for soil ecology and agriculture: volumetric water content (or soil moisture). This enables soil moisture determination without the need for expensive sensors for direct VWC measurement.

Acknowledgments: The authors thank the Windy City Lab for manufacturing the sensor nodes, Fermi National Laboratory for providing access to the field site and weather information, U. Rai for helping with heatmap coding, and T. Vugteveen, S. Hofmann, and D. Cook from Argonne National Laboratory for providing technical assistance in the field.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Table A1 shows the performance statistics of different neural network architectures tested for volumetric water content (VWC) predictions.

Appendix B
We implemented our SVR models using the out-of-the-box SVR function in Python's scikit-learn package [88,90]. This function implements epsilon-support vector regression, in which no penalty is associated with training loss within a certain distance ε from the target data. A randomized grid search was used for hyperparameter tuning [89], using scikit-learn's GridSearchCV function. The search was defined over the regularization parameter C = {0.001, 0.1, 1, 10, 100, 500, 1000, 2500}; the epsilon-support region size ε = {0.00001, 0.0001, 0.001, 0.005, 0.01, 0.1, 1}; and the kernel coefficient γ = {0.0001, 0.001, 0.01, 0.1, 1, 2, 10, 20}. γ was also allowed the values 'auto' and 'scale' according to scikit-learn's implementation, which use, respectively, the inverse of the number of input features and the inverse of the number of input features times the variance of the input data. Table A2 shows the optimal C, ε, γ, and degree values for the respective kernel functions for VWC prediction.
For soil moisture prediction, the associated ELM neural network consisted of an input layer with six neurons (one per input parameter), one hidden layer, and an output layer with one neuron. MATLAB was used to develop ELM models with a specified number of hidden neurons and a specified activation function (sigmoid, sine, tanh, triangular basis, hard limit, ReLU, and RBF were tested). Hidden-layer sizes from 5 to 1000 neurons, in increments of five, were tested. Table A3 shows the optimal number of hidden neurons for the different activation functions.
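The SVR search space above can be written down as a scikit-learn parameter grid. The sketch below covers the RBF kernel only and is not the authors' script: the 5-fold cross-validation, the RMSE-based scoring, and the names `PARAM_GRID`, `n_candidates`, and `run_search` are assumptions made for illustration. The scikit-learn import is kept inside `run_search` so that the grid itself can be inspected without the package installed.

```python
# Values of C, epsilon, and gamma are those listed in Appendix B;
# restricting the grid to the RBF kernel is an assumption of this sketch.
PARAM_GRID = {
    "kernel": ["rbf"],
    "C": [0.001, 0.1, 1, 10, 100, 500, 1000, 2500],
    "epsilon": [0.00001, 0.0001, 0.001, 0.005, 0.01, 0.1, 1],
    "gamma": [0.0001, 0.001, 0.01, 0.1, 1, 2, 10, 20, "auto", "scale"],
}


def n_candidates(grid):
    """Number of hyperparameter combinations in an exhaustive grid."""
    count = 1
    for values in grid.values():
        count *= len(values)
    return count


def run_search(X, y, cv=5):
    """Fit an exhaustive grid search over PARAM_GRID (requires scikit-learn).

    cv=5 and the scoring metric are illustrative choices, not values
    reported in the paper.
    """
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVR

    search = GridSearchCV(
        SVR(), PARAM_GRID, scoring="neg_root_mean_squared_error", cv=cv
    )
    search.fit(X, y)
    # best_score_ is the negated RMSE, so flip the sign for reporting.
    return search.best_params_, -search.best_score_


print(n_candidates(PARAM_GRID), "candidate settings for the RBF kernel")
```

With 560 candidates per kernel and tens of thousands of training points, an exhaustive search of this kind is computationally expensive, since SVR training time grows faster than linearly with the number of samples.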