A Framework for Crop Yield Estimation and Change Detection Using Image Fusion of Microwave and Optical Satellite Dataset

Abstract: Crop yield prediction is one of the crucial components of agriculture that plays an important role in the decision-making process for sustainable agriculture. Remote sensing provides the most efficient and cost-effective solution for the measurement of important agricultural parameters such as soil moisture level, but retrieval of the soil moisture contents from coarse resolution datasets, especially microwave datasets, remains a challenging task. In the present work, a machine learning-based framework is proposed to generate the enhanced resolution soil moisture products, i


Introduction
In agriculture, crop yield estimation is essential to improving productivity, food security, and the decision-making process [1]. The growing world population is also a major concern for food security. As per the Food and Agriculture Organization (FAO) report, there will be a 60% increment in the food demand-to-supply ratio by 2050. As per the United Nations (UN) Sustainable Development Goals (SDGs), food security and the promotion of sustainable agriculture are important to end hunger [2]. Thus, there is an urgent need to enhance crop yield to meet the current as well as future requirements of the world [3,4]. Many factors are involved in crop yield prediction, and developing an accurate prediction model remains challenging [5,6].
As an important component of crop yield prediction, soil moisture plays a vital role in energy and water exchanges at the interface between the land surface and the atmosphere [7]. In addition to crop yield prediction, soil moisture is also used in many other prediction models, such as weather forecasting, soil erosion, drought warning, and flood estimation. Therefore, continuous and accurate monitoring of soil moisture at a global level is essential for many applications. However, obtaining continuous and reliable soil moisture measurements at the global or national level remains one of the most challenging tasks [8]. Some authors have highlighted the challenges in soil moisture retrieval over vegetated areas, which could be addressed by change detection or image fusion-based approaches [7,9].
In the past few decades, remote sensing technology has made a significant contribution to the monitoring and management of agricultural land on a larger scale [10]. Remote sensing allows the acquisition of earth surface information in multispectral bands via optical sensors and as backscattered coefficients via microwave sensors. Each sensor type has its own strengths and limitations in delivering reliable information. Optical sensors are very useful in the identification of crop diseases via the visible and infrared (IR) spectral bands because this region of the spectrum is very sensitive to crop vigor, damage, and stress. However, the major problem associated with optical bands is the impact of clouds on satellite imagery, because optical bands cannot penetrate clouds. In such situations, microwave sensors are very useful for acquiring earth surface information, with day-night capability under rainy and extreme weather conditions. To retrieve soil moisture, various microwave sensors have been reported in the literature, such as synthetic aperture radar (SAR) [11,12], Ku-band-based QuikSCAT [13,14], C-band-based advanced microwave instrument (AMI) [15-19], special sensor microwave imager (SSM/I) [20,21], L-band-based soil moisture and ocean salinity (SMOS) mission [22,23], advanced microwave scanning radiometer-2 (AMSR-2) [24-26], and soil moisture active and passive (SMAP) [27-29]. However, microwave sensors suffer from coarse resolution, in the range of 25-50 km, which limits the applicability of microwave imagery.
With advanced computing models, it is possible to develop high-resolution soil moisture products at the global level using different remote sensing datasets. Amongst the various scatterometers [30], the scatterometer satellite (SCATSAT-1) has made a significant contribution to agriculture applications [31], such as high-resolution soil moisture product development [7,32], leaf area index (LAI) estimation [33,34], paddy crop estimation [8,12,35-38], and jute crop estimation [39]. A summary of state-of-the-art approaches for crop phenology and soil moisture studies using SCATSAT-1 is shown in Table 1. SCATSAT-1 offers a variety of enhanced resolution (up to 2 km) operational products for different scientific domains such as agriculture, cryosphere, hydrology, and oceanography [33,40-46]. Some authors have described the technical details, preprocessing, and calibration/validation of the SCATSAT-1 dataset [47-49]. To compensate for the lack of high spatial resolution remote sensing images, the fusion of SCATSAT-1 with MODIS data via machine learning models allows soil moisture products to be produced at a finer resolution. The daily enhanced resolution products can be used in the identification of different crops, the assessment of crop conditions, and the estimation of crop yields. Accurate predictions of crop yields are essential to farmers' production plans and to policy decisions related to trading and food security.
The main focus of this article is to generate enhanced resolution soil moisture products and to generate change maps that capture the variations between soil moisture classified maps. The objectives are therefore: (a) fusing the optical-based MODIS dataset and the microwave-based SCATSAT-1 dataset; (b) developing a framework based on NNIF and ANN to generate the soil moisture classified maps; (c) generating post-classification-based change detection (PCCD) change maps to support accurate crop yield change products; (d) analyzing the impact of the proposed framework on different SCATSAT-1 parameters, i.e., σ°-HH (sigma-naught at horizontal-transmit and horizontal-receive polarization), σ°-VV (sigma-naught at vertical-transmit and vertical-receive polarization), γ°-HH (gamma-naught at horizontal-transmit and horizontal-receive polarization), and γ°-VV (gamma-naught at vertical-transmit and vertical-receive polarization); and (e) comparing the performance of the proposed framework with random forest post-classification-based change detection (RFPCD) using various performance metrics.

Table 1 (fragment). Regression Model Ratio: More external data is needed to estimate crop yield. The accuracy of crop yield is more than 95%.

Data Classification: No training data is required. More precision is required. Jute crop yield [39].

Table 1 footnotes: 1 Soil moisture estimation model-1; 2 sigma-naught at horizontal-transmit and horizontal-receive; 3 sigma-naught at vertical-transmit and vertical-receive; 4 modified water cloud model; 5 normalized difference vegetation index; 6 soil moisture estimation model-2; 7 vegetation temperature condition index; 8 water cloud model; 9 Oveisgharan model; 10 paddy crop estimation; 11 crop yield estimation; 12 moderate resolution imaging spectroradiometer; 13 water index; 14 rice grain yield estimation; 15 food and agriculture organization corporate statistical database; 16 jute crop estimation.

Study Location and Satellite Dataset
Punjab State, India, has been selected as the study location, with geographical coordinates of 29°0′0″-33°0′0″ N and 73°0′0″-77°0′0″ E (Figure 1). This Indian state has made a significant contribution to food grain production and agriculture development and was also the pioneer in India's "green revolution." The major crops of the region include barley, wheat, rice, maize, and sugarcane. As per the national statistics, Punjab state contributed 29% of rice and 38% of wheat during the year 2016-2017, making India self-reliant in food production. The satellite dataset was acquired on three different dates, i.e., 20 November 2019, 20 December 2019, and 20 January 2020, from two different satellite sensors. The optical-based NASA MODIS data were acquired from the Level-1 and Atmosphere Archive and Distribution System Distributed Active Archive Center (LAADS DAAC) online web portal (https://ladsweb.modaps.eosdis.nasa.gov/), accessed on 20 November 2022 (see Figure 2).
Additionally, the microwave-based ISRO SCATSAT-1 (Level-4) data were acquired from the Meteorological and Oceanographic Satellite Data Archival Centre (MOSDAC) online portal (https://www.mosdac.gov.in/), accessed on 20 November 2022 (see Figure 3).

Methodology
As shown in Figure 4, the methodology of the proposed framework includes: (a) preprocessing of the input dataset; (b) NNIF-based fusion of optical (MODIS) and microwave (SCATSAT-1) datasets; (c) PCCD using ANN; and (d) validation of classified and change maps using the SMAP dataset.

Preprocessing of Optical and Microwave Dataset
SCATSAT-1 measures the Earth's surface information in the form of backscattered coefficients, i.e., sigma-naught (σ°) and gamma-naught (γ°). Both backscattered coefficients are available in two polarization modes, i.e., HH and VV. The SCATSAT-1 Level-4 India product is available at an enhanced resolution of 2 km, while MODIS data are available at a spatial resolution of 500 m (bands 1-7). Therefore, both datasets need to be brought to the same resolution of 500 m before the fusion process. To resample the input dataset, the nearest-neighbor resampling method was used, in which each pixel in the resampled image takes the value of its nearest-neighbor pixel in the original image. It is noteworthy that in the present work, all the backscattered coefficients and both polarization modes have been considered to generate the soil moisture classification and change maps.
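The resampling step can be illustrated with a minimal sketch. It assumes an integer scale factor (2 km / 500 m = 4) and perfectly aligned grids, so that nearest-neighbor resampling reduces to replicating each coarse pixel; the function name `resample_nearest` and the sample values are hypothetical and not taken from the study.

```python
def resample_nearest(grid, factor):
    """Upsample a 2-D grid by an integer factor using nearest-neighbor replication."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows * factor):
        # Each fine row falls inside exactly one coarse row.
        src_row = grid[r // factor]
        out.append([src_row[c // factor] for c in range(cols * factor)])
    return out


# Hypothetical 2 x 2 tile of sigma-naught values (dB) on a 2 km grid.
coarse = [[-9.5, -10.2],
          [-11.0, -8.7]]

# 2 km -> 500 m corresponds to a factor of 4 in each direction.
fine = resample_nearest(coarse, factor=4)
print(len(fine), len(fine[0]))  # 8 8
```

Because no new values are interpolated, the resampled image keeps the original backscatter values unchanged, which is usually the reason this method is preferred before classification.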

Nearest-Neighbor-Based Image Fusion (NNIF)
After the preprocessing of optical and microwave datasets, both images were fused using the nearest-neighbor-based image fusion (NNIF) algorithm [50]. The fusion of microwave and optical datasets has two main objectives. The first is to enhance the resolution of the input dataset, and the second is to integrate the features of the microwave dataset with those of the optical dataset [51]. There are many challenges involved in the fusion of microwave and optical datasets, such as spectral distortion in the optical dataset and differences in atmospheric conditions and acquisition times between the two datasets [52]. Various methods have been reported in the literature to fuse optical and microwave datasets, such as Brovey transformation (BT) [53], Gram-Schmidt (GS) [54], principal component analysis (PCA) [55], intensity hue saturation (IHS) [56], Ehlers transformation (ET) [57], wavelet principal component analysis (WPCA) [58], and many more [59-65]. According to the previous literature, the NNIF algorithm is better suited to the fusion of scatterometer and MODIS datasets than well-established fusion techniques such as BT, GS, and ET. Therefore, in the present work, we have implemented NNIF to fuse the SCATSAT-1 and MODIS data for the retrieval of soil moisture maps. To implement NNIF, both datasets must be accurately geo-registered and resampled to the same resolution to avoid misalignment between the multisensor datasets. Once these prerequisites are met, the nearest-neighbor difference factor is estimated following [50], where Ω_j(x, y) represents the region of nearest-neighbor pixels (p, q) in the multispectral data, P(x, y) represents the region of the pixel in the microwave data, and b is the number of spectral bands.
Afterward, the datasets are fused following [50], where k(x, y) is the normalization factor, and σ and σ_s are the intensity smoothness and spatial smoothness factors, respectively. The term M(u, v; x, y, j) represents the spectrum vector of the nearest-neighbor pixels (u, v). The two exponential terms represent the similarity measure and the spatial closeness measure of the neighboring pixels, respectively. The term T is the spectral photometric contribution vector.
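For orientation, the two relations described above can be written in a form consistent with the listed symbols. This is a hedged sketch that assumes the commonly cited nearest-neighbor diffusion formulation; the symbols N_j and HM, the nine-neighbor index range, and the exact form of the spatial distance term are assumptions and should be checked against [50].

```latex
% Assumed form of the nearest-neighbor difference factor
% (N_j and the range j = 1..9 are assumptions, not taken from the text)
N_j(x, y) = \sum_{(p, q) \in \Omega_j(x, y)} \left| P(x, y) - P(p, q) \right|,
\qquad j = 1, \dots, 9

% Assumed form of the fusion step (HM denotes the fused image, an assumed symbol)
HM(x, y) = \frac{1}{k(x, y)} \sum_{j=1}^{9}
  \exp\!\left[ -\frac{N_j(x, y)}{\sigma^{2}} \right]
  \exp\!\left[ -\frac{\lVert (x, y) - (u, v) \rVert}{\sigma_s^{2}} \right]
  M(u, v; x, y, j)
```

Under this reading, the first exponential is the similarity measure controlled by σ and the second is the spatial closeness measure controlled by σ_s. The normalization factor k(x, y) is typically chosen so that the T-weighted combination of the b fused bands reproduces P(x, y), which is where the photometric contribution vector T enters.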

Post-Classification-Based Change Detection (PCCD) Using ANN
To generate the change maps, the post-classification change detection (PCCD) technique has been followed [66]. This approach is implemented in two steps, i.e., classification and change detection. Initially, the fused dataset was classified using an ANN-based classifier to separate the different levels of soil moisture in the satellite imagery [67]. The ANN parameters were selected as follows: logistic activation function, training threshold (0.82), training rate (0.20), training momentum (0.70), RMS exit criterion (0.1), six input nodes (MODIS bands 1-4, 6, and 7; the 5th band was removed due to strip error), three output nodes (low, mid, and high values of soil moisture), and 800 iterations. The ANN is well suited to complex pattern recognition and prediction problems because of its flexibility and its capacity to approximate complex nonlinear behaviors. Afterward, the classified datasets of the multitemporal dates are processed via post-classification comparison to generate the soil moisture change maps. The PCCD approach has the advantages of being straightforward and simple, and it avoids the need for strict radiometric correction. However, the technique may suffer from classification errors.
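The post-classification comparison step can be sketched as a per-pixel difference between two classified maps. This is a minimal illustration, not the authors' implementation: it assumes the three soil moisture classes are ordinally encoded as low = 0, mid = 1, high = 2, and the names `LEVELS`, `change_map`, and `summarize` as well as the sample maps are hypothetical.

```python
# Assumed ordinal encoding of the three soil moisture classes.
LEVELS = {"low": 0, "mid": 1, "high": 2}


def change_map(earlier, later):
    """Per-pixel class difference: > 0 increase, < 0 decrease, 0 no change."""
    if len(earlier) != len(later) or any(
        len(a) != len(b) for a, b in zip(earlier, later)
    ):
        raise ValueError("class maps must share the same shape")
    return [
        [LEVELS[b] - LEVELS[a] for a, b in zip(row_e, row_l)]
        for row_e, row_l in zip(earlier, later)
    ]


def summarize(changes):
    """Count pixels whose soil moisture class increased, decreased, or stayed."""
    flat = [v for row in changes for v in row]
    return {
        "increase": sum(v > 0 for v in flat),
        "decrease": sum(v < 0 for v in flat),
        "no_change": sum(v == 0 for v in flat),
    }


# Hypothetical 2 x 2 classified maps for two acquisition dates.
nov = [["low", "mid"], ["high", "mid"]]
dec = [["mid", "mid"], ["low", "high"]]

changes = change_map(nov, dec)
print(changes)  # [[1, 0], [-2, 1]]
print(summarize(changes))
```

Because the comparison operates on class labels rather than raw pixel values, any misclassification in either date propagates directly into the change map, which is the weakness noted above.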

Validation and Cross-Referencing
Once the classification and change maps are generated, the outcomes must be validated against existing data sources to establish the applicability of the generated data. Therefore, an accuracy assessment was computed for each class as well as for the change maps. The parameters of the accuracy assessment included: (a) producer's accuracy (PA); (b) user's accuracy (UA); (c) omission error (OE); (d) commission error (CE); (e) overall accuracy (OA); and (f) kappa coefficient [68]. In the accuracy assessment procedure, more than 250 samples were selected for each class category using a stratified random sampling procedure [69-71]. To validate the outcomes, SMAP-enhanced Level-2 radiometer surface soil moisture data (derived from SMAP Level-1B) were acquired at a resolution of 9 km from the online web portal (https://search.earthdata.nasa.gov/, accessed on 20 November 2022). SMAP delivers the soil moisture and freeze/thaw state from space for all non-liquid water surfaces globally within the top layer of the Earth.
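The accuracy measures listed above can all be derived from a confusion matrix. The following sketch uses the standard definitions (PA and UA as per-class ratios, OE = 1 - PA, CE = 1 - UA, and Cohen's kappa); the function name `accuracy_report` and the ten sample labels are hypothetical and far smaller than the 250 samples per class used in the study.

```python
def accuracy_report(reference, predicted, classes):
    """Compute OA, kappa, and per-class PA/UA/OE/CE from paired labels."""
    n = len(reference)
    idx = {c: i for i, c in enumerate(classes)}
    k = len(classes)
    # cm[i][j]: samples whose reference class is i and mapped class is j.
    cm = [[0] * k for _ in range(k)]
    for ref, pred in zip(reference, predicted):
        cm[idx[ref]][idx[pred]] += 1

    row_tot = [sum(cm[i]) for i in range(k)]                       # reference totals
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # map totals
    oa = sum(cm[i][i] for i in range(k)) / n
    # Expected chance agreement used by Cohen's kappa.
    pe = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (oa - pe) / (1 - pe) if pe < 1 else float("nan")

    per_class = {}
    for c, i in idx.items():
        pa = cm[i][i] / row_tot[i] if row_tot[i] else 0.0  # producer's accuracy
        ua = cm[i][i] / col_tot[i] if col_tot[i] else 0.0  # user's accuracy
        per_class[c] = {"PA": pa, "UA": ua, "OE": 1 - pa, "CE": 1 - ua}
    return {"OA": oa, "kappa": kappa, "per_class": per_class}


# Hypothetical validation samples: reference (e.g., SMAP-derived) vs. mapped class.
reference = ["low"] * 4 + ["mid"] * 3 + ["high"] * 3
predicted = ["low", "low", "low", "mid",
             "mid", "mid", "high",
             "high", "high", "high"]

report = accuracy_report(reference, predicted, ["low", "mid", "high"])
print(round(report["OA"], 4), round(report["kappa"], 4))  # 0.8 0.7015
print(report["per_class"]["low"])
```

In this toy case one "low" sample is mapped as "mid", so the "low" class has an omission error of 0.25 but no commission error, which shows why PA and UA are reported separately.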
Moreover, the outcomes of the proposed framework have also been compared with the well-known random forest post-classification-based change detection (RFPCD). The random forest, also known as the random decision forest, is a powerful and versatile supervised machine learning algorithm. It operates by constructing a multitude of decision trees on various subsets of the given dataset and aggregating their predictions (by majority vote for classification) to improve the predictive accuracy. The random forest-based classified maps of the multitemporal input datasets are compared with one another to generate the change maps. This method is very commonly used for handling complex or big data problems. Nonetheless, the major problem associated with RFPCD is that, with a large number of trees, the algorithm becomes slower and less efficient in handling real-time scenarios.

Quaternary 2023, 6, 28

Results and Discussion
To assess the performance of the proposed framework, both qualitative (visual) and quantitative analyses were performed. To explore the potential of the SCATSAT-1 dataset, all the parameters of SCATSAT-1 (Level-4) have been considered, i.e., σ°-HH, σ°-VV, γ°-HH, and γ°-VV. Moreover, the proposed framework has also been compared with the well-established RFPCD algorithm for each of these SCATSAT-1 parameters. The NNIF allows the fusion of microwave-based SCATSAT-1 (Level-4) and optical-based MODIS (MOD02) images, as shown in Figure 5.
To generate the change maps from PCCD, the fused datasets of multitemporal inputs are classified using ANN. The ANN classifier generally produces two types of outputs, i.e., rule maps and classified maps, as shown in Figures 6 and 7, respectively. The classified maps are the actual outcomes, but if they are not satisfactory, the rule maps can be used to regenerate the classified outcomes without rerunning the classification. Figure 6 represents the ANN rule maps generated from the fused dataset (SCATSAT-1 and MODIS) for (a-d) 20 November 2019; (e-h) 20 December 2019; and (i-l) 20 January 2020 using different parameters, i.e., σ°-HH, σ°-VV, γ°-HH, and γ°-VV. In the rule maps, each class category is represented in grayscale, and a multiclass image can be visualized by putting each class in a different RGB (red, green, and blue) plane. In Figure 6, the RGB planes carry different information (i.e., red: high level of soil moisture, blue: mid-level of soil moisture, green: high level of soil moisture). Figure 7 shows the corresponding classified maps.
Afterward, multitemporal change maps have been generated from the fused classified dataset using the PCCD approach, as shown in Figure 8. The multitemporal change maps represent the variations in moisture level either in the positive direction, i.e., an increment in soil moisture (represented in green), or in the negative direction, i.e., a decrement in soil moisture (represented in maroon). If the value is equal to zero, then no change has been observed between the two dates.
To confirm the effectiveness of PCCD, a comparative analysis has also been performed with the RFPCD algorithm. The RFPCD algorithm has been implemented on the fused dataset (SCATSAT-1 and MODIS) for (a-d) 20 November 2019; (e-h) 20 December 2019; and (i-l) 20 January 2020 using different parameters, i.e., σ°-HH, σ°-VV, γ°-HH, and γ°-VV, as shown in Figure 9. Afterward, multitemporal change maps have been generated from the fused classified dataset using the RFPCD approach, as shown in Figure 10.
The quantitative analysis allows us to judge the effectiveness of the proposed technique statistically, which is more objective than visual interpretation. Therefore, accuracy assessments have been computed for each classified map and change map. Tables 2 and 3 present the accuracy assessment of the classified maps and change maps, respectively, computed from ANN-PCCD. From Table 2, it can be seen that on all dates, the parameter σ°-HH achieved better accuracy (on 20 November 2019, OA = 94.92%, kappa = 0.9234; on 20 December 2019, OA = 92.97%, kappa = 0.8939; and on 20 January 2020, OA = 94.14%, kappa = 0.9116) as compared to the other SCATSAT-1 parameters (i.e., σ°-VV, γ°-HH, and γ°-VV). Moreover, for all parameters, more than 90.23% overall accuracy and less than 12.79% error have been observed. These outcomes may be satisfactory enough to generate the change maps.
From the outcomes of the ANN-PCCD change maps (Table 3), it can be seen that more than 88% accuracy has been achieved in the change maps computed using the different SCATSAT-1 parameters, i.e., σ°-HH, σ°-VV, γ°-HH, and γ°-VV. However, the parameter σ°-HH achieved marginally better accuracy (in November 2019-December 2019, OA = 91.80%, kappa = 0.8997; and in December 2019-January 2020, OA = 88.67%, kappa = 0.8597) as compared to the other SCATSAT-1 parameters (i.e., σ°-VV, γ°-HH, and γ°-VV).
For the comparative analysis, the accuracy assessment has also been computed for the RFPCD classified and change maps, as shown in Tables 4 and 5, respectively. From the outcomes of the classified maps (Table 4), more than 90% accuracy has been achieved with the SCATSAT-1 σ°-HH parameter as compared to the other SCATSAT-1 parameters, i.e., σ°-VV, γ°-HH, and γ°-VV. From the outcomes of the change maps (Table 5), marginally better accuracy (86.80-87.60%) has been achieved with the SCATSAT-1 σ°-HH parameter as compared to the other SCATSAT-1 parameters, i.e., σ°-VV, γ°-HH, and γ°-VV.
Climate variability is continuously reducing soil moisture and decreasing crop yield. As an essential factor in, and indicator of, crop yield, the soil moisture level must be monitored continuously and accurately at the global level for planned food production. The proposed framework allows the production of enhanced-resolution soil moisture products using a multisensor remote sensing dataset. Owing to the ability of microwaves to penetrate clouds and their sensitivity to the water content of the soil, the active microwave-based SCATSAT-1 is very useful in the real-time estimation of soil moisture. From the comparative analysis (Tables 2-5), it is apparent that ANN-PCCD performed better than the RFPCD algorithm not only in classification but also in generating change maps.
Moreover, ANN-PCCD has also controlled the error rate (OE and CE) to a great extent as compared to the RFPCD algorithm. However, it still needs to be improved through the incorporation of advanced methods or the fusion of high spatial resolution data on a larger scale. As far as the different parameters of SCATSAT-1 are concerned, γ°, as a normalized form of the radar backscattered coefficient (σ°), may overcome the range-dependency issues in SCATSAT-1. Nevertheless, in the present work, better outcomes have been obtained with σ°. As a major characteristic of an electromagnetic (EM) signal, polarization (HH or VV) highlights different features of the Earth's surface and is highly dependent on structural variation or surface roughness [72]. In the present work, the soil moisture products generated with HH polarization were marginally better than those generated with VV polarization.
With the proposed framework, an enhanced resolution of SCATSAT-1 soil moisture products can be achieved through the fusion of MODIS and SCATSAT-1 datasets. The major advantages of both datasets include: (a) free accessibility; (b) daily data delivery; and (c) global-level coverage [53]. Therefore, this multisensor fusion enhances the applicability of the proposed framework for the effective measurement of soil moisture content. Further, it also helps in delivering essential information on the growing crops and their environment, allowing farmers to gauge the irrigation that their crops require. The PCCD is straightforward in processing the fused dataset and also avoids radiometric errors. However, multisensor fusion creates many problems, such as spatial/spectral distortion, multiplicative speckle noise, and improper registration [52]. Therefore, future work may include the incorporation of deep learning, data mining, and big data processing.

Conclusions
This work presents a framework based on the integration of NNIF and ANN-PCCD for crop yield estimation using optical-based MODIS and microwave-based SCATSAT-1 data. In this study, various parameters of SCATSAT-1, i.e., σ°-HH, σ°-VV, γ°-HH, and γ°-VV, have been evaluated for the effective retrieval of soil moisture. Moreover, the outcomes of the proposed framework have also been compared with those of the well-established RFPCD. The experimental outcomes confirm the effectiveness of the proposed framework in the production of enhanced-resolution soil moisture classified maps (more than 90% overall accuracy) as well as change maps (more than 88% overall accuracy). However, the commission and omission errors are still high in the change map production, which may need to be addressed via more advanced models for feature extraction and data representation. The incorporation of deep neural networks with high spatial resolution datasets may allow these errors to be reduced. The daily enhanced-resolution soil moisture products allow farmers to address the emerging challenges in food security, particularly in crop yield prediction. This study also highlights the crucial role of multisensor remote sensing datasets for crop monitoring and yield prediction.