A comparison between support vector machine and water cloud model for estimating crop leaf area index

Abstract: The water cloud model (WCM) can be inverted to estimate leaf area index (LAI) using the intensity of backscatter from synthetic aperture radar (SAR) sensors. Published studies have demonstrated that the WCM can accurately estimate LAI if the model is effectively calibrated. However, calibration of this model requires access to field measures of LAI as well as soil moisture. In contrast, machine learning (ML) algorithms can be trained to estimate LAI from satellite data, even if field moisture measures are not available. In this study, a support vector machine (SVM) was trained to estimate the LAI for corn, soybeans, rice, and wheat crops. These results were compared to LAI estimates from the WCM. To complete this comparison, in situ and satellite data were collected from seven Joint Experiment for Crop Assessment and Monitoring (JECAM) sites [...] with only slightly higher errors (RMSE of 0.75 m² m⁻² and MAE of 0.61 m² m⁻²) when estimating corn LAI. For rice, only the SVM model was tested, given the lack of soil moisture measures for this crop. In this case, both high correlations and low errors were observed in estimating the LAI of rice using SVM (R of 0.96, RMSE of 0.41 m² m⁻² and MAE of 0.30 m² m⁻²). However, the results demonstrated that when the calibration points were limited (in this case for soybeans), the WCM outperformed the SVM model. This study demonstrates the importance of testing different modeling approaches over diverse agro-ecosystems to increase confidence in model performance.


Introduction
Leaf area index (LAI) is one-half of the total green leaf area per unit ground surface area. It is a measure of crop canopy development and is a good indicator of productivity as crops develop through their growing seasons. In particular, it is important to monitor how well leaf development is progressing in the first half of the crop growth period, a period of rapid accumulation of leaves. Temporally dense measures of LAI are helpful in determining current conditions and predicted yields. Here, synthetic aperture radars (SARs) could make a contribution. Yet, implementation of these sensors for LAI monitoring requires extensive testing of models over diverse cropping systems and multiple cropping years.
Although optical satellites have been widely used for monitoring the condition of crops [1][2][3][4][5], the presence of clouds impedes data collection and as such, these cloudy conditions are a significant challenge when monitoring crop development, operationally. The longer wavelengths propagated by synthetic aperture radar (SAR) sensors are unaffected by clouds. Consequently, the use of SAR data for mapping and monitoring agricultural landscapes has helped agencies such as Agriculture and Agri-Food Canada (AAFC) to consistently deliver products of value to the agriculture sector [6].
In 2017, AAFC initiated a three-year project called the Joint Experiment for Crop Assessment and Monitoring (JECAM) SAR Inter-Comparison experiment. The objective of this research was to develop and test methods to map crops and estimate crop biophysical status, using SAR satellite data. The experiment challenged the performance of these methods in delivering robust results over diverse agro-ecosystems by engaging research partners from 11 countries. JECAM-SAR research sites are located in Argentina, Bangladesh, Brazil, Canada, Germany, India, Italy, Poland, Taiwan, Ukraine and the United States of America (USA). Of these sites, seven provided field data on crop LAI, an important indicator of crop development and productivity [7]. RADARSAT-2 and Sentinel-1 satellite data were collected over these seven sites (Argentina, Canada, Germany, India, Poland, Ukraine, USA). In this research, field data provided by JECAM-SAR partners were used with these C-Band satellite observations to develop SAR-based approaches for LAI estimation.
The intensity of microwave scattering is impacted by both soil (volumetric soil moisture and surface roughness) and vegetation (LAI, canopy water content and biomass) conditions. Several studies have used semi-empirical modeling to characterize the contribution of these soil and vegetation parameters to SAR backscatter, with promising results [8][9][10][11][12]. The water cloud model (WCM) [13] is a semi-empirical radiative transfer model which has been used successfully to estimate crop biophysical parameters such as LAI and biomass [14][15][16][17][18][19]. For example, Bériaux et al. (2011) used the WCM with C-Band satellites (ERS-1/2, ENVISAT/ASAR and RADARSAT-2) to estimate wheat LAI, later applying the WCM to corn LAI [11,16]. Bériaux et al. (2015) used a simulated dataset and demonstrated that the VV-HV polarization combination provided the highest LAI estimation accuracies for corn, with a correlation coefficient of 0.82 between ground-measured LAI and estimated LAI and RMSE of 0.85 m² m⁻² [16]. Hosseini et al. (2015) extended this approach, using the WCM to estimate LAI for corn and soybeans, successfully exploiting both RADARSAT-2 (C-Band) and airborne UAVSAR (L-Band) data [11]. They tested the WCM model over a JECAM site in Canada using data collected during the 2012 Soil Moisture Active Passive (SMAP) Validation Experiment (SMAPVEX) field campaign [20]. Hosseini et al. (2015) reported a correlation (R-value) of 0.81 between ground-measured and model-estimated LAI for corn when using C-Band co- and cross-polarizations (VV-HV) [11]. Correlations for soybeans were very similar (R-value of 0.80). These results were consistent with those from Jiao et al. (2009) and Hosseini et al. (2015), demonstrating that the WCM could estimate LAI over a broader range (LAI of 0.07-3.57 m² m⁻² for soybeans; 0.04-4.79 m² m⁻² for corn) [11,21]. More recently, co-polarized X-Band data have also been used to estimate LAI over a study region in the Philippines [19]. The authors exploited X-HH to estimate LAI of rice during vegetative and reproductive growth stages, with a correlation coefficient of 0.71 reported. However, X-HH backscatter was not significantly correlated with LAI for late-season rice nor for rice in its ripening stage. A study on co-polarized TerraSAR-X data for assessing the LAI of wheat, barley, and canola in the JECAM site in Germany using the WCM reported coefficients of determination between 0.69 and 0.79. For wheat, X-HH backscatter, and for barley and canola, X-VV backscatter provided the highest accuracies [12].
These studies demonstrate that SAR data can be exploited to estimate the LAI of crops. However, implementation of this modeling approach can be challenging given the requirement for in situ measures of LAI and soil moisture, collected nearly coincident with satellite acquisitions, to calibrate the WCM. Assessment of the WCM in these studies has been limited geographically and thus constrained to single agro-ecosystems under specific climatic conditions. As such, the robustness of this method to estimate crop condition over diverse cropping systems and climatologies is unknown. In this study, data from a network of international partners facilitate assessment of the WCM model using crop measurements provided for seven JECAM sites distributed over three continents. Except for rice, with data available for only the India JECAM site, field measurements for other crop types were available for multiple JECAM sites (five JECAM sites for corn, three JECAM sites for soybeans and two JECAM sites for wheat). With this diversity of in situ data, model performance is tested beyond individual research sites. This study challenges the applicability of these methods to broader regions of global importance to agricultural production.
Application of the WCM to estimate crop LAI necessitates the calibration of this semi-empirical model. Calibration requires ground measures of not only LAI but also soil moisture. The high temporal variability of surface soil moisture challenges efforts to collect SAR-coincident moisture measures and as a result, complicates calibration of the WCM. All seven JECAM-SAR sites were able to provide field-measured LAI, but only three (Canada, Germany and Poland) had available soil moisture measures at the time of SAR overpasses. In comparison, machine learning (ML) algorithms can theoretically be trained with available data. As such, models can be developed by training SAR intensities on crop LAI, without soil moisture data. However, these algorithms are highly dependent on the size of training data. Use of the support vector machine (SVM) algorithm for estimating the LAI of crops has drawn limited attention, although this algorithm has been widely and successfully used for crop type classification. Recently, Mandal et al. (2019a) compared SVM and Random Forest (RF) algorithms for crop LAI estimation [17]. They concluded that the RF approach, with RMSE of 0.723 m² m⁻², was more sensitive to the size of the training dataset while the SVM algorithm, with RMSE of 0.677 m² m⁻², was more robust and efficient with a smaller training dataset.
Considering the success of the WCM, yet acknowledging its limitations, this study compared the SVM algorithm and WCM for LAI estimation. This SVM-WCM evaluation was undertaken for four globally important crops (corn, soybeans, rice, and wheat) in seven geographically diverse sites. This research assessed the performance of these methods to deliver accurate estimates of crop conditions, with the goal to develop a robust SAR-based approach to crop monitoring applicable to different growing seasons and varying agricultural systems.

Study Sites and In Situ Data Collection
Seven JECAM-SAR partners participated in this experiment with study sites located in Argentina, Canada, Germany, India, Poland, Ukraine, and the USA (Figure 1). JECAM sites are typically 25 × 25 km in size, and most are long-term agricultural research sites. Sites are distributed among diverse cropping systems and varying climate regimes. In addition to the site descriptions, in situ collections of crop and soil measures are described in this section and summarized in Table 1.

Argentina
The Argentina JECAM site is located at San Antonio de Areco county, northwest of Buenos Aires province, Argentina. The topography of the area is gently rolling with less than 3% slopes. The main crops in this area are soybeans, corn, and spring wheat. The growing season starts around June-July and ends in April-May of the following year. Given the length of the growing season, farmers can accommodate one or two crops per growing season. Single crops are mostly maize or soybean sown in early October and November, respectively. Alternatively, these summer crops can be sown in December or early January following a winter crop. Wheat is sown around June or July. No-till systems (i.e., direct sowing) are common thus soils are often covered by crop residues during fallow periods. Soil textures vary from silt-loam to clay-loam and silt-clay-loam. Field sizes average around 20-30 ha. In situ LAI measurements were collected during the 2017-2018 growing season for ten soybeans fields (six early soybean and four late soybean fields) with five sample sites per field. At each sample site within each field, 14 hemispherical photos (Jonckheere et al., 2004) were taken downwards along two parallel transects, using a Samsung GEAR 360 camera which was equipped with two hemispherical fisheye lenses (180 degrees each) [22]. Measurement points were located five meters apart, creating a rectangular sampling pattern. LAI was determined through post-processing of digital photos using the CanEye software [23]. For each sample site, the in situ LAI was calculated as the average LAI of all 14 photos.

Canada
The Carman JECAM site is located south and west of Winnipeg, Manitoba, with crop production focused on forage, canola, flaxseed, sunflower, soybeans, corn, barley, spring wheat, winter wheat, rye, oat, canary seed, potato, and field pea. Carman has been a long-term site for SAR research for agriculture in Canada, with studies dating back to the 1994 Shuttle missions [24]. More recently, this site has been the location of both the 2012 and 2016 Soil Moisture Active Passive (SMAP) Validation Experiment (SMAPVEX) field campaigns [20,25]. Typically, producers use a rotation of cereals alternating with oilseed or pulse crops. Field sizes range from 20-30 to 50-60 ha. The topography is generally flat with very little relief. Soil texture varies greatly across this site from heavy clays (southeast) to sandy loams (northwest).
For SMAPVEX12, LAI and soil moisture were measured in 55 agricultural fields which included 9 corn, 18 soybeans and 9 wheat fields. Slightly fewer fields (50) were sampled during SMAPVEX16 which included 5 corn, 14 soybeans and 12 wheat fields. In both campaigns, 16 sample sites were established in each field, in two parallel transects of 8 sites each. The distance between the two transects was 200 m and the sampling sites along each transect were separated by 75 m. Three measures of surface soil moisture (0-5.6 cm) were taken at each of the 16 sample points using Stevens Hydra Probes. The average of the three replicates was considered as the soil moisture of each site. Soil moisture was measured the same day of each aircraft (SMAPVEX12) or each satellite (SMAPVEX16) overpass. Calibration of the Hydra Probe measures was completed for each site according to the methodology detailed in [26]. LAI was measured within 1-3 days of satellite overpasses, at three of the 16 soil moisture sampling sites. Comparable to the approach taken by JECAM-Argentina, here LAI was measured at each of the three sample sites per field, using hemispherical photos [22]. At each site, 14 photos were collected in two transects (5-metre spacing) and post-processed using the CanEye software [23]. The LAI for all 14 photos was averaged to give a single LAI per sample site.

Germany
The Durable Environmental Multidisciplinary Monitoring Information Network (DEMMIN) test site is located in the federal state of Mecklenburg-Western Pomerania in Northeast Germany. It is an intensively used agricultural ecosystem, where crops are grown under temperate climate conditions in the transition zone from maritime to continental climate. The area is a typical Pleistocene landscape with hilly, loamy to clayey moraines, sandy plains, and peaty floodplains. Ground surface varies from flat to undulating, up to a maximum elevation of 179 m above sea level. Soil substrates are dominated by loamy sands and sandy loams alternating with pure sand patches or clayey areas. Field sizes range between approximately 80 and 100 ha. The main crops are winter wheat, winter barley, winter rape, maize, potatoes, and sugar beet.
The test site was established in 1999 based on a cooperation between the German Aerospace Center (DLR) and farmers of the region. In 2015-2017, LAI was collected in field campaigns of the University of Würzburg, DLR, and the GFZ German Research Center for Geosciences with an SS1 SunScan Canopy Analysis System [27]. The system derives the LAI from Photosynthetically Active Radiation (PAR) measurements above and below the canopy surface with a 1 m probe, and a reference PAR sunshine sensor. There is no specific postprocessing procedure for the aforementioned system. Soil moisture was measured using a portable soil sensor (Thetakit) in the field.

India
The India JECAM site is in Vijayawada, Andhra Pradesh, India, a region dominated by agricultural production and characterized by flat topography. The predominant soils in this test site are loamy to clayey deep reddish-brown. Agricultural production focuses on three major annual crops: rice, sugarcane, and cotton. Rice is grown in two distinct seasons: monsoon or kharif (June-November) and winter or rabi (December-March). The test site is a super site, where numerous techniques have been tested to characterize crops [28], classification assessments [29], and vegetation condition monitoring by means of derived radar vegetation indices [30] using C-band SAR data in full and compact-pol modes. The data used in this study were limited to rice cultivation during the kharif season of 2014 and 2018. In situ measurements of rice included LAI, plant height, density, and phenology, which were collected from 17 fields over the test site. The nominal size of each rice field was approximately 100 m × 100 m. In each field, vegetation measures were conducted at two sample points. LAI was estimated from digital hemispherical photography using a wide-angle lens [22]. During each measurement day, ten photos were taken along two transects, separated by 2 m. All photos were post-processed using the CanEYE software [23], with all ten photos averaged to a single LAI per sample site.
A detailed description of sampling strategies at the JECAM Indian site can be found in the field campaign report [31].

Poland
The Poland JECAM site is located in Wielkopolska, Poland and is one of the main calibration/validation sites for the European Space Agency (ESA) optical missions (Proba-V, Sentinel-2 and Sentinel-3) [32]. Topography of the area is flat with field sizes varying between 10 and 120 ha. The landscape is dominated by large fields intermixed with smaller fields. Crop types in the area include wheat, corn, barley, triticale, alfalfa, rye, sugar beet, and potato. In situ LAI and soil moisture data were collected in 2016 and 2018 for corn and winter wheat two to three times a month. The TRIME-PICO64 sensor with internal TDR-electronics was used for the soil moisture measurements. LAI was measured using LAI-2200C Plant Canopy Analyzers. At each measurement location, one above-canopy reading and four below-canopy readings (repeated twice) were recorded and used to calculate a single LAI value. Above and below canopy measurements were taken with a fisheye optical sensor with 148° angle of view. An azimuth mask of 270° view cap was used on the LAI-2200C sensor to block the bright sky and thus eliminate the shadowing effect of the instrument operator. From each measurement location, three LAI measures were recorded.

Ukraine
The Ukraine JECAM site was established in 2011, located near the city of Kyiv. The area is mostly flat with less than 2% slope but 10% of the territory is hilly with slopes of about 2-5%. The climate in the region is humid continental with approximately 710 mm of annual precipitation. The main soil type is humus with fields ranging in size from 30 to 250 ha. The Kyiv region represents an agriculture-intensive area with the following major crop types: corn, wheat, soybeans, barley, sunflower, and sugar beet. The crop calendar is September-July for winter crops, and April-October for spring and summer crops. The research sites are nested around the village of Pshenichne, Vasilkiv district (NUTS3 administrative level), as well as across the entire Kyiv region (NUTS2 level). Both sites have been used to develop and test satellite-based technologies and to compare information products derived from satellite imagery to in situ data, including LAI maps. During the last ten years, this JECAM-Ukraine test site has provided data for a number of international initiatives, dedicated to the estimation of biophysical parameters and inter-comparison of global experiments, namely the SPOT-5 Take 5 Initiative, FP7 ImagineS [33], SIGMA projects [34] and SENSAGRI Horizon 2020.
Ground data collection for the Ukraine site followed the Validation of Land European Remote Sensing Instruments protocol, with measurements made per elementary sampling units (ESUs) [35]. Between 12 and 15 measures were taken within each ESU, with typically 10 to 15 ESUs sampled for each crop type. As with many of the other JECAM sites, LAI was measured using digital hemispherical photos. Photos were acquired using a CANON EOS 500d camera and a wide-angle lens [22], with post processing using the CanEYE software [23]. At each sampling date, 3 corn plants were harvested within a random 60 cm row segment per sampling site, yielding a total of 30 corn plants per field (3 plants × 10 sites). All soybean plants within a 60 cm row segment were harvested in each of the 10 random sampling sites. Row spacing for both corn and soybean fields was 75 cm. Plant samples were processed in the laboratory by cutting each leaf at its collar or petiole and measuring one side of the leaves with a leaf area meter (LI-3000, Li-Cor Inc., https://www.licor.com/env/products/leaf_area/LI-3100C/, accessed on 1 February 2021). Leaf area index was calculated as: LAI (dimensionless) = leaf area (m²)/sampling area (m²). Soil moisture was also measured at the sampling sites, but these measurements were not on the same days as satellite overpasses and as such, were not used in this study.
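The ratio used for the destructively sampled plants can be illustrated with a short calculation. The sketch below is only an illustration: it assumes that the sampling area is the harvested row segment (0.60 m) multiplied by the row spacing (0.75 m), which the text does not state explicitly, and it applies to the case where all plants in the segment are harvested. The function name and the example leaf area are hypothetical.

```python
def destructive_lai(leaf_area_m2, row_segment_m=0.60, row_spacing_m=0.75):
    """LAI = one-sided leaf area / ground sampling area.

    Assumption (not stated in the text): the ground area represented by a
    harvested row segment is segment length x row spacing.
    """
    sampling_area_m2 = row_segment_m * row_spacing_m
    return leaf_area_m2 / sampling_area_m2


# Hypothetical example: 0.9 m2 of one-sided leaf area from one row segment.
lai = destructive_lai(0.9)
```

Under these assumptions the ground area per segment is 0.45 m², so the hypothetical 0.9 m² of leaf area corresponds to an LAI of about 2.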

Satellite Data Acquisitions
This study used 41 RADARSAT-2 images and 81 Sentinel-1 images. The acquisition dates are listed in Table 2. Pre-processing of satellite images was accomplished using the Sentinel Application Platform (SNAP). For Sentinel-1 images, ground range detected high resolution (GRDH) products were used. First, an orbital file correction was applied to update the image metadata with the corresponding satellite orbital information. The pixel digital numbers were then converted to backscatter coefficients using the SNAP Radiometric Correction function. Next, a Gamma Map filter (5 × 5 kernel) was applied for speckle noise reduction. Finally, the images were orthorectified using the Range Doppler Terrain Correction algorithm available in SNAP.
SNAP was also used to pre-process the Single-Look Complex (SLC) RADARSAT-2 data. For these RADARSAT-2 products, the orbital files have already been applied. As such, the first step in preparing these data involved converting pixel digital numbers to backscatter values. A Gamma Map filter (5 × 5 kernel) was applied for speckle noise reduction, followed by orthorectification using the Range Doppler Terrain Correction algorithm. Given that the pixel size of Sentinel-1 was 10 m × 10 m, all RADARSAT-2 images were resampled to a 10 m × 10 m spatial resolution using bilinear interpolation.
For both Sentinel-1 and RADARSAT-2, like-polarization (VV) and cross-polarization (VH) backscatter coefficient values were extracted for each LAI sampling site using a 5 × 5 average window. The incidence angle was also extracted for each sampling site.
The WCM model was assessed in this study for LAI estimation. In this model, the total backscatter is modeled as the direct backscatter from vegetation and the backscatter from soil which is attenuated by vegetation. The general formula for the WCM model is as follows (Equations (1)-(3)) [13].
where σ⁰ is the total backscatter in power units, σ⁰_veg is the direct backscatter from the vegetation, τ²σ⁰_soil is the backscatter from the soil attenuated by the vegetation, expressed through a two-way attenuation factor (τ²), A and B are the coefficients of the model, θ is the incidence angle, and V₁ and V₂ are the vegetation indicators, expressed as follows [15].
where L is LAI and E₁ and E₂ are coefficients. σ⁰_soil is expressed by the following equation [36].
where M_v is soil moisture and C and D are parameters of the model.
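The equations referred to in this section are not reproduced in the text as it stands. A commonly used form of the WCM that is consistent with the symbol definitions given here is sketched below. The first three lines follow the standard water cloud formulation. The power-law form of the vegetation indicators and the linear soil term, including which of C and D multiplies M_v, are assumptions rather than a transcription of the original equations.

```latex
\begin{align*}
\sigma^0 &= \sigma^0_{veg} + \tau^2 \sigma^0_{soil} \tag{1}\\
\sigma^0_{veg} &= A\, V_1 \cos\theta \left(1 - \tau^2\right) \tag{2}\\
\tau^2 &= \exp\!\left(-\frac{2 B V_2}{\cos\theta}\right) \tag{3}\\
V_1 &= L^{E_1}, \qquad V_2 = L^{E_2} && \text{(assumed form)}\\
\sigma^0_{soil} &= C + D\, M_v && \text{(assumed form)}
\end{align*}
```

With these expressions, each polarization is described by six coefficients (A, B, C, D, E₁, E₂), and the two unknowns are L and M_v.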

The Support Vector Machine (SVM) Model
Machine learning involves adaptive mechanisms that enable computers to learn from experience, example, and analogy [37]. In this study, the SVM, a well-known machine learning algorithm, was assessed for LAI estimation.
The basis of SVM regression is to fit a function between the training data and the targets such that the maximum deviation between the estimated and observed values is less than a precision parameter (ε). This optimization problem is expressed by introducing the slack variables (ξ_i, ξ_i*) as follows [38]:

$$\min_{w,\,\xi,\,\xi^*} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^*\right)$$

subject to

$$y_i - f(x_i) \le \varepsilon + \xi_i, \qquad f(x_i) - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i,\ \xi_i^* \ge 0, \qquad i = 1, \dots, n$$

where w is the normal vector and ‖w‖ is its norm, C is the regularization constant, f() is the fitting function, x_i is the input training data, y_i is the model response and n is the number of training data. The performance of the SVM model has been robust in the presence of outliers, but the model can be prone to over-fitting [39,40]. In this study, the SVM model was trained on the two backscatter intensities of VV and VH and the incidence angle, with LAI set as the response.
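To make the role of the precision parameter and the slack variables concrete, the following sketch evaluates the standard ε-insensitive SVM regression objective for a given fit. It is an illustration only: it assumes a linear fitting function f(x) = w·x + b, whereas the kernel used in this study is not stated here, and the function name and the numbers in the example are hypothetical.

```python
def svr_primal_objective(w, b, X, y, C, epsilon):
    """Evaluate 0.5*||w||^2 + C * sum(xi_i + xi_i*) for a linear f(x) = w.x + b.

    xi_i and xi_i* measure how far a point lies above or below the
    epsilon-wide tube around f; points inside the tube cost nothing.
    """
    slack_sum = 0.0
    for x_i, y_i in zip(X, y):
        f_i = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        xi = max(0.0, y_i - f_i - epsilon)       # observation above the tube
        xi_star = max(0.0, f_i - y_i - epsilon)  # observation below the tube
        slack_sum += xi + xi_star
    return 0.5 * sum(w_j * w_j for w_j in w) + C * slack_sum


# Hypothetical one-feature example: one point inside the tube, two outside.
objective = svr_primal_objective(
    w=[1.0], b=0.0,
    X=[[1.0], [2.0], [3.0]], y=[1.05, 2.5, 2.0],
    C=10.0, epsilon=0.1,
)
```

Only deviations larger than ε contribute to the objective, which is one reason the fitted function is relatively insensitive to small measurement noise.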

Calibration and Inversion of the WCM Model
The WCM model is a semi-empirical model and has six coefficients (A, B, C, D, E1 and E2) and two unknown variables (in our study, LAI and soil moisture). Therefore, prior to inverting the WCM to estimate LAI and soil moisture, the model must first be calibrated, a process necessary to parameterize these six coefficients. The non-linear least squares method [41] was used to parameterize the model for each crop type and each polarization (i.e., VV and VH). Hosseini et al. (2015) demonstrated that calibration of the WCM model for different crop growth stages provided better results compared to generating one set of coefficients for the whole growing season. Therefore, in this study, the WCM model was calibrated for three ranges of LAI per crop (Table 3). The ranges of LAI and soil moisture associated with the data used in this study are shown in Table 3. Wheat has the largest range in LAI, followed by corn and soybeans, with the smallest LAI range reported for rice. The LAI range for soybeans, when considering sample points where soil moisture was also measured, is more limited (0.05-4.18 m² m⁻²). The thresholds for groupings of LAI were crop type-specific, given that development stages for each crop were the determining criteria for establishing these class limits (Table 3). After the model calibration, the Levenberg-Marquardt algorithm [42] was used to independently estimate LAI and soil moisture [11]. This algorithm needs initial values (first guesses) for LAI and soil moisture. LAI values of 1 m² m⁻² and soil moisture values of 0.20 m³ m⁻³ were used as initial values. Subsequently, the model improves these values based on an iterative algorithm. The iterations stop when the improvement of the values in two consecutive iterations is less than 10⁻⁶ or the number of iterations reaches 400. These steps are also shown in Figure 2.
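The inversion step can be sketched in a few lines. The example below is a simplified illustration, not the implementation used in this study: it assumes the WCM form with V1 = L^E1, V2 = L^E2 and a linear soil term C + D·Mv, uses a basic damped Gauss-Newton (Levenberg-Marquardt style) update with a finite-difference Jacobian, and all coefficient values are hypothetical. The first guesses (LAI of 1, soil moisture of 0.20), the 10⁻⁶ tolerance and the 400-iteration limit follow the text.

```python
import math


def wcm_backscatter(lai, mv, theta_deg, coef):
    """Forward WCM for one polarization; returns backscatter in power units."""
    a, b, c, d, e1, e2 = coef
    cos_t = math.cos(math.radians(theta_deg))
    tau2 = math.exp(-2.0 * b * lai ** e2 / cos_t)  # two-way attenuation
    veg = a * lai ** e1 * cos_t * (1.0 - tau2)     # direct vegetation term
    soil = c + d * mv                              # assumed linear soil term
    return veg + tau2 * soil


def invert_wcm(obs_vv, obs_vh, theta_deg, coef_vv, coef_vh,
               lai0=1.0, mv0=0.20, tol=1e-6, max_iter=400):
    """Estimate (LAI, soil moisture) from VV and VH backscatter."""

    def residuals(lai, mv):
        return (wcm_backscatter(lai, mv, theta_deg, coef_vv) - obs_vv,
                wcm_backscatter(lai, mv, theta_deg, coef_vh) - obs_vh)

    lai, mv = lai0, mv0
    lam, h = 1e-3, 1e-7  # damping factor and finite-difference step
    r = residuals(lai, mv)
    for _ in range(max_iter):
        r_l = residuals(lai + h, mv)
        r_m = residuals(lai, mv + h)
        j_l = [(r_l[k] - r[k]) / h for k in range(2)]  # d(residual)/d(LAI)
        j_m = [(r_m[k] - r[k]) / h for k in range(2)]  # d(residual)/d(Mv)
        # Damped normal equations: (J^T J + lam*I) step = -J^T r
        a11 = j_l[0] ** 2 + j_l[1] ** 2 + lam
        a12 = j_l[0] * j_m[0] + j_l[1] * j_m[1]
        a22 = j_m[0] ** 2 + j_m[1] ** 2 + lam
        g1 = j_l[0] * r[0] + j_l[1] * r[1]
        g2 = j_m[0] * r[0] + j_m[1] * r[1]
        det = a11 * a22 - a12 * a12
        step_lai = (-g1 * a22 + a12 * g2) / det
        step_mv = (-a11 * g2 + a12 * g1) / det
        new_lai = max(lai + step_lai, 1e-6)  # keep LAI positive
        new_mv = max(mv + step_mv, 0.0)      # keep moisture non-negative
        r_new = residuals(new_lai, new_mv)
        if r_new[0] ** 2 + r_new[1] ** 2 < r[0] ** 2 + r[1] ** 2:
            change = max(abs(new_lai - lai), abs(new_mv - mv))
            lai, mv, r = new_lai, new_mv, r_new
            lam = max(lam / 10.0, 1e-12)     # trust the model more
            if change < tol:
                break
        else:
            lam = min(lam * 10.0, 1e12)      # reject step, damp harder
    return lai, mv


# Hypothetical coefficients (A, B, C, D, E1, E2) for each polarization.
COEF_VV = (0.10, 0.10, 0.010, 0.30, 1.0, 1.0)
COEF_VH = (0.02, 0.30, 0.002, 0.05, 1.0, 1.0)

# Simulate an observation for LAI = 2.5 and Mv = 0.30, then invert it.
sim_vv = wcm_backscatter(2.5, 0.30, 35.0, COEF_VV)
sim_vh = wcm_backscatter(2.5, 0.30, 35.0, COEF_VH)
est_lai, est_mv = invert_wcm(sim_vv, sim_vh, 35.0, COEF_VV, COEF_VH)
```

In practice a separate coefficient set would be supplied for each crop type and LAI range, as described above, and a library least-squares solver could replace the hand-written update.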

Selection of Calibration and Validation Points
To compare the performance of the WCM and SVM models, the same calibration and validation datasets were used for both models. Calibration of the WCM model requires both LAI and soil moisture data. Soil moisture measurements must be completed the same day as satellite observations, given the dynamic nature of moisture in the surface soil layer. JECAM sites that only provided LAI measurements or did not measure soil moisture the same day as the satellite observations, were considered for validation only. JECAM sites that provided both LAI and soil moisture measurements were used for the calibration and validation. When considering crop parameters like LAI, a more tolerant temporal offset (up to 3 days) is often deemed acceptable. This offset is based on the observation that crop canopies are changing more slowly than soil moisture, and thus co-incident (same day) crop and satellite data are not required for model development [15]. Based on these criteria, data from three sites (Canada, Germany and Poland) were used for calibration and validation. For each crop type and LAI range, half of the in situ measurements were randomly selected for calibration data, a larger percentage than that used in the previous studies [17,19]. The remainder of the sample points were reserved for independent validation. The total number of calibration and validation points are shown in Table 4 for each crop type.
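The random half-and-half selection within each crop type and LAI range can be sketched as follows. This is a minimal illustration under assumptions: the record fields `crop` and `lai_class`, the fixed random seed, and the rounding down of odd-sized groups are all hypothetical choices, not details reported for this study.

```python
import random
from collections import defaultdict


def split_calibration_validation(samples, seed=0):
    """Randomly assign half of each (crop, LAI range) group to calibration."""
    groups = defaultdict(list)
    for sample in samples:
        groups[(sample["crop"], sample["lai_class"])].append(sample)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    calibration, validation = [], []
    for key in sorted(groups):
        members = list(groups[key])
        rng.shuffle(members)
        half = len(members) // 2
        calibration.extend(members[:half])
        validation.extend(members[half:])
    return calibration, validation


# Hypothetical records: 10 low-LAI and 6 high-LAI corn sample points.
demo_samples = (
    [{"id": i, "crop": "corn", "lai_class": "low"} for i in range(10)]
    + [{"id": 100 + i, "crop": "corn", "lai_class": "high"} for i in range(6)]
)
demo_cal, demo_val = split_calibration_validation(demo_samples)
```

Splitting within each LAI range, rather than over the pooled data, keeps every growth stage represented in both the calibration and the validation sets.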

Results
The following sections describe the results from calibrating the WCM and SVM models. These calibrated models are then applied to validation sites, and error statistics including correlation coefficient (R), root mean square error (RMSE) and mean absolute error (MAE) are generated to assess the performance of both models. Recall that JECAM sites in Canada, Germany and Poland collected not only LAI measures, but also measures of surface soil moisture.
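For reference, the three error statistics can be computed as in the sketch below. It uses the usual definitions (Pearson correlation coefficient, root mean square error, mean absolute error); the function name and the example values are hypothetical and are not data from this study.

```python
import math


def error_statistics(observed, estimated):
    """Return (R, RMSE, MAE) between observed and estimated LAI values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    cov = sum((o - mean_obs) * (e - mean_est)
              for o, e in zip(observed, estimated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    r = cov / math.sqrt(var_obs * var_est)
    rmse = math.sqrt(sum((e - o) ** 2
                         for o, e in zip(observed, estimated)) / n)
    mae = sum(abs(e - o) for o, e in zip(observed, estimated)) / n
    return r, rmse, mae


# Hypothetical LAI values (m2 m-2): every estimate is 0.5 too high.
r_value, rmse_value, mae_value = error_statistics(
    [1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]
)
```

In this example the estimates carry a constant bias, so R equals 1 while RMSE and MAE both equal 0.5, which shows why correlation alone is not sufficient to judge model performance.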

Corn
The in situ LAI measures for corn were collected at five JECAM sites, including Canada, Germany, Poland, Ukraine, and USA-ND. For Canada, Germany, and Poland, 130 sample sites had both LAI and soil moisture measures; 65 were selected randomly and used as calibration points, the rest were retained for validation. The Ukraine JECAM site only provided LAI measurements and soil moisture measured at the USA-ND sites were not coincident with SAR satellite acquisitions. Therefore, the data from Ukraine and USA-ND were used for validation only. In total, 237 validation points were available from these five JECAM sites.
The calibration and validation results are shown in Figure 3. High accuracies are observed from both WCM and SVM models. However, the validation accuracies from the SVM algorithm (R of 0.93, RMSE of 0.64 m² m⁻², MAE of 0.51 m² m⁻²) are higher than those from the WCM model (R of 0.89, RMSE of 0.75 m² m⁻², MAE of 0.61 m² m⁻²). It is interesting to observe that both models performed well even for those sites that were not used in the calibration of the WCM and SVM (i.e., Ukraine and USA-ND). Distribution of the points reveals greater scatter for WCM compared to SVM.

Soybeans
The JECAM sites in Argentina, Canada, and USA-ND gathered in situ data for soybeans. However, soil moisture measurements coincident with the SAR acquisitions were only available at the JECAM site in Canada. As such, the calibration of the WCM and SVM models was limited to the data from Canada. This calibration was accomplished by randomly selecting 86 points from JECAM Canada; the remaining 359 points from Argentina, Canada, and USA-ND were used for validation.
Results of calibration and validation of the WCM and SVM are shown in Figure 4. Both models performed well in estimating LAI on unseen data, with slightly higher accuracies for the WCM (R of 0.90, RMSE of 0.68 m² m⁻² and MAE of 0.52 m² m⁻²). Promisingly, both models were capable of providing good estimates of LAI even for JECAM sites (Argentina and USA-ND) not contributing in situ data to model calibration. In comparison, the SVM model resulted in greater scattering for the validation sites (R of 0.90, RMSE of 0.73 m² m⁻² and MAE of 0.57 m² m⁻²), more specifically in the mid to late growing season. Lower accuracies from SVM could be partly related to a greater dependency of this algorithm on calibration (training) data compared to the WCM model. For soybeans, the calibration was done using only one JECAM site while for corn, three JECAM sites contributed to the calibration.

Rice
For rice, in situ data were only available from the JECAM site in India. Given that only LAI was measured, with no field-measured soil moisture, it was not possible to calibrate the WCM. This limited calibration and validation to the SVM, but emphasized an important advantage of the SVM: the model can be implemented without ground measures of moisture. Half of the rice LAI measures were randomly selected for SVM calibration (119), leaving 120 points to validate model performance. Figure 5 provides interesting calibration-validation statistics. Correlations between estimated and observed LAI of rice are high (R of 0.96) with low errors of estimate (RMSE of 0.41 m² m⁻² and MAE of 0.30 m² m⁻²). Given that the data for rice were only available for one JECAM site, the validation accuracies are very close to the calibration accuracies (R of 0.96, RMSE of 0.47 m² m⁻² and MAE of 0.28 m² m⁻²). Greater scatter is observed during the early and mid growing periods, with LAI estimates later in the rice season fitting very well to the Y = X line (Figure 5). Again, very similar distributions are observed for the calibration and validation points at low, medium, and high LAI ranges.

Wheat
The results for wheat are shown in Figure 6. Both Canadian and German teams provided in situ data of LAI and soil moisture, lending data for calibration as well as validation. In total, 60 points were randomly selected and used for calibration of the WCM and SVM, leaving 59 points for validation.
For wheat, the SVM model generated higher accuracies (R of 0.86, RMSE of 1.03 m² m⁻², and MAE of 0.82 m² m⁻²) than the WCM (R of 0.66, RMSE of 1.63 m² m⁻², and MAE of 1.21 m² m⁻²). As data from both JECAM sites were used for the calibration, the SVM estimated LAI better than the WCM. In general, the results for wheat show more scatter compared to corn, soybean, and rice, and this scatter is more pronounced in the mid and late growing season. Additionally, both models produced similar estimates for some points with different observed LAI.

Discussion
From Figure 3, it is observed that for both WCM and SVM models, the distribution of validation points for corn generally mimics that of the calibration points. For example, at an LAI range of 1-2 m² m⁻², the USA-ND results were underestimated by the WCM, similar to the underestimation of LAI by this model for the calibration points. Better calibration of the SVM model provided more accurate LAI estimates for independent validation points. At the LAI range of 2-5 m² m⁻², the calibration results show a better fit for the SVM model compared to the WCM. For the LAI range of 5-6 m² m⁻², the number of calibration points is lower and, as such, the model fit for the WCM is poorer relative to that observed at lower LAI. Consequently, more scatter is observed at higher LAI, and for data collected in Ukraine, LAI is underestimated at this period of peak canopy development. In general, over- or underestimation during calibration resulted in a similar degree of scatter during validation, for both models. Increasing the number of calibration points and improving their distribution across the entire range of LAI will improve the calibration of both models. This improved calibration will lead to a reduction in errors of estimate, especially for the WCM, which showed more scatter.
For soybean, the in situ points from the Canada JECAM site are in the LAI range of 0-4.2 m² m⁻², yet some of the validation points from the Argentina JECAM site are outside this range, reaching a maximum LAI of 6.1 m² m⁻² (Figure 4). For these validation points, where LAI is beyond the LAI range used during model calibration, both the WCM and SVM underestimated LAI. However, it is encouraging to observe that both models performed well in estimating LAI within the range of leaf area used for calibration. As reported for corn, the distributions of the validation and calibration points are similar, which indicates robust calibration of both models. For example, for the LAI range of 2-3 m² m⁻², the WCM underestimates LAI for both the calibration and validation points. Errors could be reduced with inclusion of additional calibration data from a larger number of JECAM sites. For the LAI range of 3-6 m² m⁻², the underestimation is more pronounced. Soybean canopies and LAI ranges vary from region to region. This is due not only to climatic and soil variations, but also to differences in cultivars and seeding. Even within the Canadian site, farmers vary planting densities among fields, with some fields double row cropped and others planted in single rows. As such, the range of LAI for Canada does not necessarily reflect the LAI ranges for the Argentina and USA-ND sites. Lower LAI is measured at the site in Canada, with maximum LAI at around 4.5 m² m⁻². In contrast, in Argentina and USA-ND, soybean canopies reach LAIs in the range of about 3-6 m² m⁻². This example underscores the importance of multi-site collaborations to exploit pooled data over wide ranges of crop conditions. Continuing these joint initiatives will help in the development of robust modeling.
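The extrapolation issue described for soybeans can be screened for before a calibrated model is applied to a new site. The sketch below flags validation points whose observed LAI falls outside the calibration range. The function name and the individual LAI values are illustrative assumptions; only the range limits (0-4.2 m² m⁻² for calibration, a maximum of 6.1 m² m⁻² for validation) come from the text.

```python
def outside_calibration_range(calibration_lai, validation_lai):
    """Flag each validation LAI value lying outside the calibration LAI range."""
    lo, hi = min(calibration_lai), max(calibration_lai)
    return [not (lo <= value <= hi) for value in validation_lai]


# Illustrative values only: calibration spans 0-4.2, validation reaches 6.1.
calibration_lai = [0.0, 1.3, 2.8, 4.2]
validation_lai = [2.5, 3.9, 5.0, 6.1]

flags = outside_calibration_range(calibration_lai, validation_lai)
print(flags)  # [False, False, True, True]
```

Points flagged `True` are those for which either model is being asked to extrapolate, so their errors are best reported separately from errors within the calibrated range.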
The accuracies for wheat are lower compared to the other three crops (Figure 6). These lower accuracies and the difficulties in estimating LAI for wheat may be partly explained by the significant structural changes that happen in small grain canopies in later growth stages, and the impact of these structural changes on SAR scattering. Wiseman et al. (2014) described in detail the changes in C-HV backscatter for the SMAPVEX12 data set as wheat phenology changes. Backscatter increases through booting and until wheat heads emerge, then levels off as anthers are formed and wheat enters its milking stage [43]. However, as the crop begins its dough stage and the grain begins to fill, C-HV intensities increase significantly. The impact of these structural changes on backscatter complicates modeling of LAI for small grain crops like wheat. Using Single-Look Complex (SLC) data with phase information might better explain the different scattering mechanisms from wheat, and these data will be considered in a future study to evaluate their capability to estimate the biophysical parameters for wheat.
Despite the promising validation results derived for all four crops, our approach comes with some limitations when considering operational implementation. The developed models are crop-specific and therefore, in-season crop type maps should be generated in order to select the appropriate model for each field. Although in-season crop type mapping approaches have been developed with acceptable accuracies, these products are often not as accurate as end-of-season crop maps [44]. The developed models are also calibrated for three LAI ranges. As such, knowledge of crop phenology would aid in the application of this method. In a recent study, McNairn et al. (2018) successfully generated crop phenology maps from SAR satellite data, and these maps could be used for model selection [45]. For the WCM, in situ soil moisture data are required for calibration, and data on soil moisture are not always readily available. In this study, four sites (Argentina, India, Ukraine, and USA-ND) out of the seven study sites did not collect soil moisture data. Finally, as with all data-driven methods, the SVM model requires a wide range of in situ and satellite data to develop a robust model that performs well over space and time. Although this experiment gathered data from seven geographically dispersed sites and over multiple seasons, further testing will be valuable to continue the assessment of an SVM approach. This future analysis will aid in determining the boundaries of model performance over regions with differing cropping systems, soil characteristics, and meteorological conditions.
A single approach was not taken to measure LAI in situ for all JECAM sites. Other than the USA-ND team, which collected direct measures of LAI via destructive sampling, the teams determined LAI via indirect measures. Four of the sites, including the Argentina, Canada, India, and Ukraine JECAM sites, used hemispherical photos. At the Germany JECAM site, LAI was derived from PAR measurements, and the Ukraine team used LAI 2200C Plant Canopy Analyzers. Despite these different approaches, calibration and validation of the LAI models using these pooled data yielded promising results. This speaks to the care taken by field crews in measuring LAI, and the robustness of the model approach. As well, discrepancies in LAI among measurement approaches are lower for crops when compared to forest land covers [46]. Of note, models calibrated using indirect measures of LAI were able to accurately estimate LAI for the USA-ND site, where LAI had been measured via destructive sampling. Lending additional confidence, the accuracy of LAI estimated from this JECAM study, which pooled data from multiple sites with different measurement approaches, was comparable to the accuracies reported for previous studies which involved analysis for a single site with a single LAI measurement approach [11,15]. Nevertheless, there is merit in standardizing measurement protocols, although it can be challenging to do so given the diversity of sites and researchers involved in international science teams.

LAI Map
Exploiting the WCM developed in this study using C-band SAR, and in previous studies by this research team using optical and L-band SAR, an LAI mapping tool was developed (Figure 7). Inputs to the tool are multi-polarization SAR data (VV and VH backscatter and an incidence angle layer) and a crop map used to assign a crop type-specific model to each field. A subset of a RADARSAT-2 image over the Canada JECAM site in Carman, Manitoba (Figure 1), along with the corresponding crop type map, was used as input to the LAI mapping tool. The LAI map generated by the mapping tool is shown in Figure 7. The crops at this time of the year (mid-June) were at an early leaf development growth stage and LAI values were still low. Wheat is planted earlier than other crops, 2-4 weeks sooner than corn and soybeans. As such, the model and tool estimate higher LAI for wheat fields.
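The core logic of such a tool, selecting a crop-specific model for each pixel from the crop map and applying it to the VV, VH, and incidence angle layers, might be sketched as below. This is not the tool itself: `make_lai_map` is a hypothetical name, rasters are represented as nested lists for simplicity, and the two toy models are placeholders with invented coefficients that stand in for the calibrated models.

```python
def toy_wheat_model(vv_db, vh_db, theta_deg):
    # Placeholder relation with invented numbers, NOT a calibrated model.
    return max(0.0, 0.25 * (vh_db + 24.0))


def toy_corn_model(vv_db, vh_db, theta_deg):
    # Placeholder relation with invented numbers, NOT a calibrated model.
    return max(0.0, 0.20 * (vh_db + 26.0))


def make_lai_map(crop_map, vv, vh, angle, models, nodata=None):
    """Apply the model matching each pixel's crop type; unknown crops get nodata."""
    lai_map = []
    for crop_row, vv_row, vh_row, angle_row in zip(crop_map, vv, vh, angle):
        out_row = []
        for crop, s_vv, s_vh, theta in zip(crop_row, vv_row, vh_row, angle_row):
            model = models.get(crop)
            out_row.append(model(s_vv, s_vh, theta) if model else nodata)
        lai_map.append(out_row)
    return lai_map


models = {"wheat": toy_wheat_model, "corn": toy_corn_model}
crop_map = [["wheat", "corn"], ["water", "wheat"]]
vv = [[-10.0, -10.0], [-10.0, -10.0]]
vh = [[-20.0, -21.0], [-25.0, -30.0]]
angle = [[35.0, 35.0], [35.0, 35.0]]

lai_map = make_lai_map(crop_map, vv, vh, angle, models)
```

Keeping the models in a dictionary keyed by crop type means a newly calibrated crop can be added without changing the mapping loop, and pixels with no matching model are left as no-data rather than being assigned an LAI from the wrong crop.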

Conclusions
The objective of this study was to evaluate the performance of the semi-empirical water cloud model (WCM) and the machine learning support vector machine (SVM) approaches to estimate LAI from C-Band SAR backscatter (VV and VH polarizations). To test the robustness of these methods, multiple Joint Experiment for Crop Assessment and Monitoring (JECAM) site leads were engaged. These leads gathered in situ measures of soil moisture and LAI in seven countries (Argentina, Canada, Germany, India, Poland, Ukraine, and the USA-North Dakota) for four globally important crops (corn, soybeans, wheat, and rice). More than 1100 LAI samples were used in this study. In excess of 100 C-Band SAR acquisitions from RADARSAT-2 and Sentinel-1 were acquired from 2012 to 2019.
Separate WCM and SVM models were developed for each crop type and for different crop growth stages. Data from all sites were pooled by crop type and randomly separated to provide both calibration and independent validation data sets. The models were assessed by comparing estimates of LAI from either a calibrated WCM or calibrated SVM to measured LAI. Both models performed very well for corn, with slightly higher accuracies for the SVM model (R of 0.93, RMSE of 0.64 m² m⁻², and MAE of 0.51 m² m⁻²). However, when calibration points were not available from all the validation sites (e.g., soybeans), the SVM resulted in lower accuracies compared to the WCM. This demonstrates that the WCM is less affected by the number of calibration points compared to the SVM. However, the SVM model does not require both soil moisture and LAI calibration data, but can be trained exclusively using field-measured LAI. For the WCM, ground-measured LAI and soil moisture are required to parameterize all six coefficients in this semi-empirical model. For example, for rice, although no soil moisture data were collected, the SVM model could be trained on only LAI and still produce promising results (R of 0.96, RMSE of 0.41 m² m⁻², and MAE of 0.30 m² m⁻²).
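To illustrate why the WCM needs soil moisture whereas the SVM does not, the sketch below implements one commonly used six-coefficient form of the water cloud model and inverts it for LAI by a simple grid search. Several things here are assumptions: the coefficient names (A, B, C, D, E1, E2), the use of LAI as both canopy descriptors, the numerical values, and the single-channel inversion with known soil moisture are illustrative only and are not the calibrated values or the inversion procedure of this study.

```python
import math


def wcm_backscatter(lai, soil_moisture, theta_deg, coeffs):
    """Forward WCM: total backscatter in linear power units (assumed form)."""
    a, b, c, d, e1, e2 = coeffs
    cos_t = math.cos(math.radians(theta_deg))
    tau2 = math.exp(-2.0 * b * lai ** e2 / cos_t)   # two-way canopy attenuation
    sigma_veg = a * lai ** e1 * cos_t * (1.0 - tau2)  # vegetation contribution
    sigma_soil = c + d * soil_moisture                # soil term, linear in moisture
    return sigma_veg + tau2 * sigma_soil


def invert_wcm(sigma_obs, soil_moisture, theta_deg, coeffs, lai_max=8.0, step=0.01):
    """Grid search for the LAI whose modelled backscatter is closest to sigma_obs."""
    n_steps = int(round(lai_max / step))
    candidates = [i * step for i in range(n_steps + 1)]
    return min(
        candidates,
        key=lambda lai: abs(
            wcm_backscatter(lai, soil_moisture, theta_deg, coeffs) - sigma_obs
        ),
    )


# Invented coefficients (A, B, C, D, E1, E2), for illustration only.
coeffs = (0.05, 0.1, 0.01, 0.1, 1.0, 1.0)
sigma = wcm_backscatter(3.0, 0.2, 30.0, coeffs)
lai_estimate = invert_wcm(sigma, 0.2, 30.0, coeffs)
```

Because the soil term depends on soil moisture, the coefficients C and D cannot be fitted without moisture measurements, which is the constraint that prevented WCM calibration for rice. For some coefficient choices the forward model is not monotonic in LAI at low canopy cover, so a grid search of this kind can return an ambiguous solution there.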
In this study, we calibrated the models using a wide range of in situ data from seven JECAM sites distributed over seven countries, and promising validation results were derived from both models. Although the results demonstrated the potential of multi-polarization C-band SARs for crop LAI estimation, the models should be further tested over other regions, years, and crops. It is through such iterative and collaborative research that robust methods are produced and passed to agencies for implementation. In situ data collection of LAI and soil moisture is expensive, and therefore developing globally robust models will require substantial effort from the scientific community. Particular attention should be paid to increasing in situ and satellite data and to re-calibrating these models for wheat, which had fewer data in our study.