An OVR-FWP-RF Machine Learning Algorithm for Identification of Abandoned Farmland in Hilly Areas Using Multispectral Remote Sensing Data

Wang, Liangsong; Li, Qian; Wang, Youhan; Zeng, Kun; Wang, Haiying

doi:10.3390/su16156443

by Liangsong Wang ^1,2, Qian Li ^1,3, Youhan Wang ^1,4,*, Kun Zeng ^1,4 and Haiying Wang ^1,4

¹ The Engineering Laboratory of Land and Resources Utilization in Hilly Areas, China West Normal University, Nanchong 637009, China

² College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541004, China

³ Business School, China West Normal University, Nanchong 637009, China

⁴ School of Geographical Sciences, China West Normal University, Nanchong 637009, China

* Author to whom correspondence should be addressed.

Sustainability 2024, 16(15), 6443; https://doi.org/10.3390/su16156443

Submission received: 19 June 2024 / Revised: 14 July 2024 / Accepted: 23 July 2024 / Published: 27 July 2024

Abstract

Farmland abandonment is serious in hilly areas, and the resolution of commonly used satellite-borne remote sensing images is insufficient for identifying abandoned farmland in such regions. Identifying abandoned farmland in hilly areas with adequate accuracy is therefore a crucial issue in research on extracting abandoned farmland patches from remote sensing images. Taking a typical hilly village as an example, this study utilizes airborne multispectral remote sensing images and incorporates various feature factors such as spectral characteristics and texture features. A method for extracting abandoned farmland based on the OVR-FWP-RF algorithm is proposed, and two machine learning algorithms, Random Forest (RF) and XGBoost, are used for comparison. The results indicate that the overall accuracy (OA) of the OVR-FWP-RF, Random Forest, and XGBoost classification algorithms reached 92.66%, 90.55%, and 90.75%, respectively, with corresponding Kappa coefficients of 0.9064, 0.8796, and 0.8824. Thus, combining spectral features, texture features, and vegetation factors with machine learning methods can improve the accuracy of identifying ground objects, and the OVR-FWP-RF algorithm outperforms Random Forest and XGBoost. Specifically, when the OVR-FWP-RF algorithm is used to identify abandoned farmland, its producer accuracy (PA) is 3.22% and 0.71% higher than that of Random Forest and XGBoost, respectively, and its user accuracy (UA) is 5.27% and 6.68% higher, respectively. OVR-FWP-RF can therefore significantly improve the accuracy of abandoned farmland identification and other land use type recognition in hilly areas. It provides a new method for abandoned farmland identification and land type classification in hilly areas, as well as a useful reference for abandoned farmland identification research in similar areas.

Keywords:

hilly areas; abandoned farmland; airborne multispectral remote sensing imagery; machine learning

1. Introduction

Abandoned farmland is an important aspect of land use and land cover research [1]. Farmland abandonment is becoming increasingly severe in China due to the loss of the rural labor force, accelerated industrialization, and the intensification of natural factors [2]. This phenomenon not only threatens national food security but also undermines the stability of agriculture [3]. In hilly regions, the complexity of the terrain and the fragmentation and dispersion of land parcels lead to an uneven distribution of abandoned farmland and make it difficult to identify effectively, further compounding the issue. Therefore, accurately assessing the status of abandoned farmland is crucial for formulating effective land-use strategies and optimizing the allocation of agricultural resources.

Field surveys are the traditional method for identifying abandoned farmland, but this approach consumes considerable time and labor, making it difficult to apply on a large scale and to obtain complete survey data for a specific region. With the widespread adoption of remote sensing technology and notable advances in temporal, spatial, and spectral resolution, the interpretation and in-depth mining of remote sensing images has become a prevalent and effective approach for monitoring abandoned farmland [4]. Baumann et al. [5] utilized a support vector machine classification model to classify Landsat data from 1986 to 2008, extracting abandoned farmland in western Ukraine. Meanwhile, Kuemmerle et al. [6] employed Landsat TM/ETM+ imagery to produce a land use type map of southern Romania, identifying the spatial distribution of abandoned land from 1990 to 2005 and revealing an abandonment rate of 21.1%. Higher-resolution remote sensing imagery can offer faster and more powerful technical means for investigating and quantifying the extent and amount of abandoned farmland [7]. In work on extracting abandoned farmland with remote sensing technology, scholars such as Jiqiu Deng [8], Baumann [5], and Kuemmerle [6] have adopted satellite-borne remote sensing images combined with spectral features. Their research mainly focuses on plain areas or regions with large and concentrated plots. However, in hilly areas, where a large number of farmland plots are less than 2 m wide, the plots are fragmented, and the farmland patterns often take irregular shapes such as “stripes”, satellite-borne medium- and low-resolution imagery fails to meet the requirements.

In recent years, machine learning algorithms have been widely adopted in remote sensing target recognition and thematic information extraction owing to their ability to learn from limited training data and adapt to complex regions [9]. Wu et al. [10] utilized global coverage products and points of interest data to analyze the major driving factors of urban land use change based on the Random Forest algorithm; Rodriguez-Galiano et al. [11] studied the performance of the Random Forest (RF) algorithm in identifying land use types within complex regions; Ge et al. [12] evaluated the performance of four machine learning algorithms in land use classification in the mosaic landscape of oasis and desert; Xiao Xiangwen et al. [13] compared the accuracy of machine learning algorithms in the identification of Arctic icebergs. However, traditional machine learning algorithms have the limitation of relying solely on raw image data. How to effectively utilize and deeply mine the rich information carried by remote sensing images, so as to improve image classification accuracy and information extraction precision, remains an open question.

In summary, most research on abandoned farmland extraction has relied primarily on satellite-borne remote sensing imagery. However, given the shape and size of cultivated land in hilly areas, satellite-borne remote sensing imagery is insufficient for identifying abandoned farmland in these regions. Furthermore, many scholars have neglected the varying degrees of impact that different feature factors have on the identification of abandoned farmland, and some of these factors can even reduce classification accuracy. Therefore, this study proposes an OVR-FWP-RF algorithm, which builds upon the traditional Random Forest classification algorithm. The algorithm introduces OneVsRest to transform the multi-class classification problem into multiple binary classification problems and employs a weighted average algorithm to select the optimal feature factors for identifying abandoned farmland. Through multi-feature extraction and feature optimization, the algorithm improves the usefulness of features for machine learning while reducing potential interference caused by feature redundancy, achieving higher classifier performance and improved accuracy in land feature extraction. The algorithm also analyzes the degree of influence that different feature factors have on the identification of abandoned farmland and other land features. This study employed airborne multispectral remote sensing imagery combined with multiple feature factors such as spectral features, vegetation indices, and texture features. It applied the OVR-FWP-RF algorithm, as well as the Random Forest (RF) and XGBoost algorithms, to extract abandoned farmland and conducted a comparative analysis. The research findings are of significant importance to land monitoring, land management, and ecological protection.

2. Materials and Methods

The research first preprocesses the field-collected data, including image stitching, fusion, and cropping. Secondly, spectral features, texture features, and vegetation indices are extracted from the preprocessed images, and these feature factors are fused into a single remote sensing image with multiple feature factors. Then, three machine learning algorithms (OVR-FWP-RF, traditional Random Forest, and XGBoost) are used to extract abandoned farmland and other land types, and an accuracy evaluation is conducted. The research roadmap is shown in Figure 1.

2.1. Summary of the Research Area and Data Sources

2.1.1. Summary of the Research Area

The research area is located in Tanshanpu Village, which is under the jurisdiction of Xinfu Township, Shunqing District, Nanchong City. The region is located in the hilly areas of northeast Sichuan, representing a typical shallow hill landform type. The region has a width of 1.3 km from east to west and a length of 1.6 km from north to south, with a total area reaching 1.805 square kilometers. The terrain of the entire region is unique, surrounded by mountains on three sides, with the mountain ridge serving as a natural boundary. The land gradually descends from north to south. The geographical location of the study area is shown in Figure 2.

Based on field investigations, the research area boasts a diverse range of cultivated crops, with wheat, rapeseed, rice, and corn being the dominant ones. Most of these crops are planted around April and have a growth cycle from May to October. Rapeseed is planted from mid-September to mid-October and harvested from late April to mid-May of the following year. Wheat is planted from mid-October to mid-November and harvested from late April to mid-May of the following year. Rice is mainly planted from mid-April to mid-May and harvested from early October to mid-November of the same year. Corn is mainly planted around the Grain Rain period and harvested around the Cold Dew period.

2.1.2. Data Source

On 24 April 2024, a Feima drone carrying a 6-channel multispectral camera with a D-MSPC2000 payload was used to acquire multispectral image data of the study area. The payload supports the capture of red, green, blue, near-infrared (NIR) spectra, red edge 1, and red edge 2, providing rich spectral information. The details of each spectral band are shown in Table 1. The acquisition process of multispectral image data involves several key stages. Firstly, the pre-flight preparation stage encompasses equipment selection and inspection, along with the plotting of flight routes and the setting of flight parameters. Secondly, the actual flight data acquisition stage follows. Lastly, the data processing and analysis stage is where software such as Pix4D_V1.0 is employed to conduct pre-processing tasks like image denoising, correction, and stitching. Concurrently, a field survey is conducted to investigate the abandonment status, crop types planted in the farmland, and other relevant information. This study utilized a combination of visual interpretation and field surveys to select 1071 representative samples, including 222 woodland samples, 83 water samples, 242 construction land samples (including roads, houses, etc.), 279 non-abandoned farmland samples, and 245 abandoned farmland samples.

2.2. Feature Extraction

This study utilizes ENVI 5.3 software for principal component analysis and texture feature extraction. Python 3.8, with PyCharm as the development environment, is used for calculating vegetation indices, extracting sample features, constructing models, and performing classification. ArcGIS 10.7 software is then used for post-classification processing and mapping. After preprocessing, spectral feature factors and texture feature factors are extracted from the multispectral image data, and vegetation feature factors are calculated. Subsequently, these feature factors are integrated into the preprocessed images for further analysis. The initially selected feature factors are shown in Table 2.

2.2.1. Spectral Feature Factor

The spectral characteristics of an image are obtained through the measurement and analysis of radiant energy in different bands and are widely used in the analysis and evaluation of remote-sensing images as the physical basis for object interpretation and classification. In this research, red, green, blue, near-infrared (NIR), red edge 1, and red edge 2 bands of the multispectral imagery were selected as the initial feature factors. These bands have different sensitivities to different ground objects, providing rich spectral information that can be used to distinguish and classify ground objects. By analyzing and extracting these spectral features, more information about ground objects can be obtained, which can be further used for the interpretation and classification of ground objects.

2.2.2. Vegetation Index Feature Factor

There are significant differences in phenological characteristics between natural vegetation and crops. Vegetation index directly reflects phenological characteristics and can be used as an important basis for land cover classification. Therefore, this study utilizes the information from six bands of airborne multispectral imagery to calculate Normalized Difference Vegetation Index (NDVI) [14], Normalized Difference Red Edge Index (NDRE: NDRE_1, NDRE_2) [15], Green Normalized Difference Vegetation Index (GNDVI) [16], Enhanced Vegetation Index (EVI), and Ratio Vegetation Index (RVI) [17], and integrates the multispectral imagery. The calculation formulas are shown in Table 3.
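As a hedged illustration of how such indices can be derived from the six bands, the sketch below computes them for a single pixel. It uses the commonly published definitions, which are assumed (not confirmed) to match Table 3; the function name, argument names, the small `eps` guard, and the EVI coefficients (2.5, 6, 7.5, 1) are assumptions introduced for this example.

```python
def vegetation_indices(red, green, blue, nir, red_edge1, red_edge2, eps=1e-12):
    """Return common vegetation indices for one pixel (reflectance in [0, 1])."""

    def norm_diff(a, b):
        # Normalized difference (a - b) / (a + b); eps guards against 0 / 0.
        return (a - b) / (a + b + eps)

    return {
        "NDVI": norm_diff(nir, red),
        "NDRE_1": norm_diff(nir, red_edge1),
        "NDRE_2": norm_diff(nir, red_edge2),
        "GNDVI": norm_diff(nir, green),
        # EVI with the commonly used coefficients (assumed, not from the paper).
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        "RVI": nir / (red + eps),
    }


# Hypothetical reflectances for a vegetated pixel.
pixel = vegetation_indices(red=0.05, green=0.08, blue=0.04,
                           nir=0.45, red_edge1=0.20, red_edge2=0.35)
```

The same expressions apply element-wise when the band values are arrays rather than scalars, which is how a whole image would normally be processed.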

2.2.3. Texture Feature Factor

Texture features exploit the high spatial resolution of remote sensing images to enhance classification precision. The Gray-Level Co-occurrence Matrix (GLCM) is a powerful tool for image texture analysis: it reveals the texture features of an image by statistically analyzing the spatial relationships between pairs of pixels with different gray levels. Its advantages are particularly pronounced when dealing with the phenomena of “same object with different spectra” and “different objects with similar spectra” [18]. Therefore, this study combines Principal Component Analysis (PCA) with the GLCM to extract texture features. After PCA processing, the first principal component was found to contain over 94.65% of the information in the image (the PCA results are shown in Table 4), while the other principal components contain less information. Therefore, to streamline data processing and improve efficiency, this study extracts texture features only from the first principal component as modeling factors. The eight texture features of the first principal component are mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation [19].
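To make the GLCM idea concrete, the following minimal sketch builds a normalized co-occurrence matrix for one pixel offset and derives four of the texture measures named above. It is a plain-Python illustration, not the ENVI implementation used in the study; the function names, the horizontal offset, the 4-level quantization, and the toy image are all assumptions.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count the pair (gray level here, gray level at the offset).
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, homogeneity, entropy and second moment of a normalized GLCM."""
    feats = {"contrast": 0.0, "homogeneity": 0.0,
             "entropy": 0.0, "second_moment": 0.0}
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            feats["contrast"] += v * (i - j) ** 2
            feats["homogeneity"] += v / (1 + (i - j) ** 2)
            feats["second_moment"] += v * v
            if v > 0:
                feats["entropy"] -= v * math.log(v)
    return feats


# Hypothetical 4x4 image already quantized to 4 gray levels.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
features = texture_features(glcm(img, levels=4))
```

In practice the matrix is computed in a moving window over the first principal component, so each pixel receives its own set of texture values.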

2.3. Machine Learning Model Construction

Machine learning algorithms are resilient and inclusive, capable of handling strongly correlated variables. They do not impose strict requirements on the relationship between predictor variables and dependent variables, nor on the data distribution. There are no restrictions on the type and quantity of variables, and they can achieve relatively high estimation accuracy. The aim of this study is to use machine learning algorithms combined with feature factors to identify abandoned farmland. To optimize the classification results, we have grouped some smaller patches into larger categories using the Majority Analysis tool [20].

2.3.1. Random Forest Algorithm

The Random Forest classification algorithm was introduced by Leo Breiman in 2001. Its principle combines the idea of “bagging ensemble learning” with the random subspace method [21]. Random Forest constructs bagging ensembles based on decision trees as the base learners, and during the training process of the decision trees, it adds random attribute selection to increase the diversity among the classification models, thereby enhancing the generalization ability and predictive power of the model. In this study, the main parameter settings for the algorithm are n_estimators = 100 and max_depth = None.

2.3.2. Feature-Weighted Preference for OneVsRest-RF

The OneVsRest classifier handles multi-class classification problems by training multiple binary classifiers in a one-vs-all manner. During training, the samples of a given category are grouped into one class, while all remaining samples are grouped into another class. In this way, k Random Forest (RF) classifiers are constructed for k categories of samples. Feature importance analysis is an additional step applied to the OneVsRest classifier, used to evaluate the contribution of each feature to the model’s classification decisions. This article proposes a three-step implementation of the feature-weighted preference for the OneVsRest-RF algorithm. Firstly, the OneVsRest classifier is employed to transform the multi-class classification problem into binary classification problems, and the importance score of each feature factor for each land type is computed. Secondly, the importance scores are weighted: using Equation (1), the weighted average method is applied to the same feature factor across the different land types to obtain a weighted comprehensive importance score for each feature factor. The features are then ranked by this score and added to the model one by one until the optimal number of features is found, which determines the key feature factors used for classification. Finally, the fused imagery of the study area is classified with the Random Forest classifier, and abandoned farmland is extracted. This method faces potential limitations such as the risk of overfitting (especially with high-dimensional data and improper feature weight allocation) and challenges posed by class imbalance. Future research can focus on optimizing feature weight allocation algorithms, enhancing model interpretability, and exploring applications in different regions to further improve the performance and practicality of OVR-FWP-RF.
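The first step described above, obtaining one set of feature importance scores per land type, can be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the data are synthetic, the sample size, feature count, and class count are arbitrary, and only the `n_estimators = 100` and `max_depth = None` settings come from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the sample table: 300 samples, 4 features, 3 classes.
X = rng.normal(size=(300, 4))
y = rng.integers(0, 3, size=300)
X[:, 0] += 3 * y  # make feature 0 informative so importances are not uniform

ovr = OneVsRestClassifier(
    RandomForestClassifier(n_estimators=100, max_depth=None, random_state=0)
)
ovr.fit(X, y)

# One fitted binary forest per class; each exposes per-feature importances.
per_class_importance = np.array(
    [est.feature_importances_ for est in ovr.estimators_]
)  # shape: (n_classes, n_features)
```

Each row of `per_class_importance` corresponds to one "class versus rest" forest, which is the per-land-type score that the weighting step then combines.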

X_{i} = \frac{x_{1 i} f_{1} + x_{2 i} f_{2} + \dots + x_{j i} f_{j}}{N}

(1)

N = f_{1} + f_{2} + \dots + f_{j}

(2)

In the formula, X_{i} represents the weighted comprehensive importance score of feature factor i, x_{j i} represents the importance score of feature factor i for class j, and f_{j} represents the weight assigned to class j. Because the study focuses on identifying abandoned land, the weight for abandoned farmland is set to 2, while the weights for the other land types are set to 1. Additionally, the main parameter settings include n_estimators = 100 and max_depth = None. The principle of OVR-FWP-RF is illustrated in Figure 3.
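A small worked example of Equations (1) and (2) may help. The sketch below applies the class weights stated in the text (2 for abandoned farmland, 1 for the other land types) to one feature; the per-class importance scores, the dictionary keys, and the function name are hypothetical.

```python
def weighted_importance(scores_by_class, class_weights):
    """Weighted average of one feature's importance across classes (Eqs. 1-2)."""
    total_weight = sum(class_weights[c] for c in scores_by_class)          # N
    weighted_sum = sum(scores_by_class[c] * class_weights[c]
                       for c in scores_by_class)
    return weighted_sum / total_weight


# Class weights as stated in the text: abandoned farmland 2, all others 1.
weights = {"water": 1, "construction": 1, "woodland": 1,
           "non_abandoned": 1, "abandoned": 2}

# Hypothetical per-class importance scores (%) for a single feature.
scores = {"water": 12.0, "construction": 9.0, "woodland": 6.0,
          "non_abandoned": 9.0, "abandoned": 6.0}

combined = weighted_importance(scores, weights)
```

Here the weighted sum is 12 + 9 + 6 + 9 + 2 × 6 = 48 and N = 6, so the combined score is 8.0, slightly below the unweighted mean of 8.4 because the double-weighted abandoned-farmland score is comparatively low.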

Regarding sensitivity to data noise and outliers, Random Forest reduces the sensitivity of a single decision tree by integrating multiple decision tree models, each constructed from randomly selected samples and features; this randomness reduces the model’s dependence on specific noise or outliers [22]. OVR-FWP-RF introduces feature weighting and a One-Versus-Rest (OVR) classifier on top of Random Forest and thus inherits its resistance to noise and outliers. Furthermore, through feature weighting, OVR-FWP-RF is able to mitigate the impact of data redundancy.

Regarding robustness under different environmental conditions, the One-Vs-Rest strategy of OVR-FWP-RF enhances robustness in multi-class problems by decomposing them into multiple binary classification tasks. However, when the number of classes is extremely large or the discriminability between classes is low, the model’s performance may suffer. Proper feature weighting can potentially further improve robustness under varying environmental conditions, but inappropriate weights may cause the model to rely excessively on a few important features or to direct attention towards irrelevant ones. Additionally, the algorithm inherits Random Forest’s capabilities in managing high-dimensional data, resisting overfitting, and leveraging parallel computing power [22,23].

2.3.3. XGBoost Algorithm

The XGBoost algorithm is an improvement on GBDT (Gradient Boosting Decision Tree) [24]. It builds decision trees iteratively, with each new tree optimized to correct the errors of the current ensemble. XGBoost also has excellent parallel processing capabilities, which make full use of computing resources to improve training speed. Furthermore, the algorithm provides a wealth of tunable parameters that can be configured according to the dataset and task requirements. For these reasons, it is widely used in classification problems. In this study, the main parameter settings for the algorithm are n_estimators = 100, max_depth = 6, and objective = multi:softmax.

2.4. Accuracy Evaluation

The sample data were divided into two parts, with 80% used for training the model and the remaining 20% used for validating the model. When evaluating the performance of the model on the test set, accuracy is used as the measurement criterion: the number of test samples correctly classified by the model divided by the total number of test samples. The predicted and actual results for each class are compared using a confusion matrix. Evaluation metrics including the F1 score, overall accuracy (OA), recall, Kappa coefficient, precision, producer’s accuracy, and user’s accuracy are used to assess the land cover classification outcomes. Among these metrics, producer accuracy (PA) represents the proportion of the true ground reference data for a given class that is correctly classified by the model as belonging to that class [25], while user accuracy (UA) is the proportion of all verification points marked as belonging to a specific category on the classification map that actually belong to that category [26]. The calculation formulas for PA, UA, OA [27], and the Kappa coefficient [28,29] are shown in Equations (3)–(6). Recall is the overall recall rate of a model, while precision refers to the overall precision of the model [30]. The study calculates the overall F1 score of each model from recall and precision, and the F1 score for each land type from PA and UA [31]. The calculation formulas for recall, precision, and F1 score are shown in Equations (7), (8), and (9), respectively.

\mathrm{PA} = \frac{x_{ii}}{x_{+i}} \times 100\%

(3)

\mathrm{UA} = \frac{x_{ii}}{x_{i+}} \times 100\%

(4)

\mathrm{OA} = \frac{1}{N} \sum_{i=1}^{r} x_{ii} \times 100\%

(5)

\mathrm{Kappa} = \frac{N \cdot \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})}{N^{2} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})}

(6)

\mathrm{Recall} = \sum_{i=1}^{r} \frac{x_{ii}}{x_{i+}}

(7)

\mathrm{Precision} = \sum_{i=1}^{r} \frac{x_{ii}}{x_{+i}}

(8)

\mathrm{F1} = 2 \times \frac{(\text{Recall or UA}) \times (\text{Precision or PA})}{(\text{Recall or UA}) + (\text{Precision or PA})}

(9)

where x_{ii} represents the number of pixels of class i that have been accurately categorized, x_{+i} represents the number of pixels assigned to class i in the reference dataset, x_{i+} represents the number of pixels identified as belonging to class i in the land use type data product being validated, r represents the number of categories, and N represents the total number of pixels.
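The per-class and overall measures in Equations (3)–(6) can be computed directly from a confusion matrix, as in the minimal sketch below. The matrix values, the function name, and the orientation (rows are classified labels, columns are reference labels) are assumptions for illustration; results are returned as fractions rather than percentages.

```python
def accuracy_metrics(cm):
    """PA, UA, OA and Kappa from a square confusion matrix.

    cm[i][j] = number of pixels classified as class i whose reference class
    is j, so row sums are x_{i+} and column sums are x_{+i}.
    """
    r = len(cm)
    n = sum(map(sum, cm))
    row_sums = [sum(cm[i]) for i in range(r)]                        # x_{i+}
    col_sums = [sum(cm[i][j] for i in range(r)) for j in range(r)]   # x_{+i}
    diag = [cm[i][i] for i in range(r)]                              # x_{ii}

    pa = [diag[i] / col_sums[i] for i in range(r)]   # Equation (3)
    ua = [diag[i] / row_sums[i] for i in range(r)]   # Equation (4)
    oa = sum(diag) / n                               # Equation (5)
    chance = sum(row_sums[i] * col_sums[i] for i in range(r))
    kappa = (n * sum(diag) - chance) / (n * n - chance)  # Equation (6)
    return pa, ua, oa, kappa


# Hypothetical 3-class confusion matrix (rows: classified, columns: reference).
cm = [[40, 5, 5],
      [2, 45, 3],
      [3, 0, 47]]
pa, ua, oa, kappa = accuracy_metrics(cm)
```

For this toy matrix, 132 of 150 pixels lie on the diagonal, giving OA = 0.88 and Kappa = 0.82; the first class has UA = 40/50 but PA = 40/45, which shows how the two per-class measures can differ.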

3. Result Analysis

3.1. Operational Efficiency Comparison

The operational efficiency of the three machine learning models (OVR-FWP-RF, RF, and XGBoost) was assessed and compared in terms of runtime, CPU utilization, and resident set size (RSS) memory usage, as presented in Table 5. Significant differences were observed among the models. While XGBoost had the shortest runtime at 50.57 s, it also exhibited the highest CPU utilization (6.60%) and the largest RSS usage (12.62 MB). In contrast, the Random Forest (RF) model had the longest runtime of 60.52 s but a moderate level of CPU utilization (2.20%) and RSS usage (2.57 MB), reflecting a reasonable balance in resource utilization. The OVR-FWP-RF model achieved the second-shortest runtime of 58.60 s while maintaining the lowest CPU utilization (0.10%) and RSS usage (2.50 MB), indicating minimal impact on system resources. In conclusion, although XGBoost excels in speed, it does so at the cost of the highest resource consumption, whereas OVR-FWP-RF may be more suitable for resource-sensitive application scenarios.

3.2. Feature Importance Ranking and Feature Selection

The participation of all features in classification will inevitably lead to information redundancy. A large number of redundant features will increase the computational burden of the computer and cause the ‘Hughes phenomenon’. It is essential to eliminate redundant features through optimal feature selection. Therefore, the algorithm in this study determines the significance level of each feature variable and orders the feature variables from highest to lowest significance. The importance ranking of the initial features of each land type is shown in Figure 4.

The results of the feature importance ranking indicate that among the initially selected feature factors, NDRE_1 (18.25%), GNDVI (12.85%), and NDVI (12.50%) contribute significantly to the identification of water. The contribution rate for the identification of construction land is mainly derived from the vegetation indices GNDVI (15.91%) and EVI (14.80%), as well as the spectral feature of the blue band (13.55%). The identification of woodland is mainly influenced by the spectral features of red (13.97%) and blue band (12.00%), as well as the vegetation index NDVI (11.66%). Non-abandoned farmland is mainly influenced by vegetation features such as GNDVI (16.85%), RVI (11.37%), and EVI (10.84%). For the identification of abandoned farmland, vegetation features like NDRE_1 (8.47%), texture features like variance (7.40%), and spectral features like blue band (6.61%), all contribute to a certain extent to the identification of abandoned farmland. After calculating the importance scores of the initial feature factors for each land type, a weighted average method was used to obtain the weighted comprehensive feature factor importance scores. The results showed that the vegetation features GNDVI (10.44%), NDRE_1 (9.09%), and NDVI (8.79%) had the greatest influence on enhancing the classification performance of the OVR-FWP-RF algorithm.

A high value of importance indicates that a feature has a strong classification ability and can distinguish different types of ground objects effectively. On the other hand, a low value of importance suggests that a feature has less relevance to classification and can be considered redundant. Including too many such features in the classification process can increase the complexity of the model, leading to reduced classification efficiency and accuracy. Based on the ranking of feature importance, we can iteratively input the feature variables into the model one by one to explore the relationship between feature dimensionality and model accuracy. This allows us to investigate how the addition of each feature, in order of its importance, affects the performance of the model. Drawing from the research of Jianwen Huang [32], the number of decision trees is set to 100, and the square root of the total number of features is used as the number of features randomly selected at each node during the growth process of the decision trees.

Figure 5 demonstrates that as the number of features rises from 1 to 12, the classification performance of the feature subsets initially improves and subsequently plateaus. This is primarily attributed to the high importance of the first features added, which have low correlation and minimal information redundancy among them and thus enhance the performance of the classifier. When the number of features is 12, the classification accuracy of the feature subset reaches its peak value of 89.30%. Therefore, the algorithm in this paper adopts the 12 feature factors with the highest weighted comprehensive importance scores as the optimal model features for classification. These 12 features comprise six vegetation features (GNDVI, NDRE_1, NDVI, RVI, EVI, and NDRE_2), four texture features (variance, entropy, dissimilarity, and second moment), and two spectral features (the blue band and the red band).
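The incremental selection procedure described above can be sketched as a simple loop that adds features in ranked order and records the accuracy at each step. In the sketch below the `evaluate` callback stands in for training and scoring a classifier; the toy accuracy curve, the ordering of the five example features, and the function name are hypothetical.

```python
def best_feature_count(ranked_features, evaluate):
    """Add features in importance order; return the subset with peak accuracy."""
    best_k, best_acc, history = 0, float("-inf"), []
    for k in range(1, len(ranked_features) + 1):
        acc = evaluate(ranked_features[:k])
        history.append(acc)
        if acc > best_acc:  # strict '>' keeps the smallest subset on ties
            best_k, best_acc = k, acc
    return ranked_features[:best_k], best_acc, history


# Hypothetical ranking and accuracy curve standing in for model training.
ranking = ["GNDVI", "NDRE_1", "NDVI", "RVI", "EVI"]
toy_accuracy = {1: 0.70, 2: 0.80, 3: 0.86, 4: 0.88, 5: 0.88}

selected, peak, curve = best_feature_count(
    ranking, evaluate=lambda subset: toy_accuracy[len(subset)]
)
```

With this toy curve the loop stops improving at four features, so the fifth is treated as redundant; the recorded `curve` is the kind of series that a plot of accuracy against feature count would show.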

3.3. Machine Learning Classification Results and Evaluation

With the support of the Python language, this study employed the OVR-FWP-RF algorithm, Random Forest, and XGBoost algorithm to extract land use information in the experimental area. The accuracy served as a metric to assess the effectiveness of the models on the test set, yielding accuracy rates of 0.91, 0.87, and 0.89 for each model, respectively. Figure 6 presents the classification results.

Based on the confusion matrix (Figure 7), the classification accuracy is evaluated by calculating the overall accuracy, Kappa coefficient, precision, recall, and F1 score; the results are presented in Table 6. The traditional RF, XGBoost, and OVR-FWP-RF algorithms achieve overall accuracy scores of 90.55%, 90.75%, and 92.66%, respectively. The Kappa coefficients are 0.8796, 0.8824, and 0.9064, respectively. The precision scores are 0.9047, 0.9081, and 0.9247, while the recall scores are 0.9062, 0.9053, and 0.9259, respectively. Lastly, the F1 scores are 0.9055, 0.9067, and 0.9253, respectively. In absolute terms, the overall accuracy of the proposed method is 2.11 and 1.91 percentage points higher than that of RF and XGBoost, respectively. Expressed on the same percentage scale, the Kappa coefficient is 2.68 and 2.40 points higher, the precision is 2.00 and 1.66 points higher, the recall is 1.97 and 2.06 points higher, and the F1 score is 1.98 and 1.86 points higher than that of RF and XGBoost, respectively. The proposed method therefore shows a consistent improvement across all five metrics.
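The quoted improvements are simple differences between the reported scores, which the short check below reproduces. All values are copied from the paragraph above; the dictionary layout and the convention of scaling the fractional metrics by 100 are choices made for this sketch.

```python
# Values as reported in the paragraph above (order: RF, XGBoost, OVR-FWP-RF).
reported = {
    "OA (%)":    (90.55, 90.75, 92.66),
    "Kappa":     (0.8796, 0.8824, 0.9064),
    "Precision": (0.9047, 0.9081, 0.9247),
    "Recall":    (0.9062, 0.9053, 0.9259),
    "F1":        (0.9055, 0.9067, 0.9253),
}

gains = {}
for metric, (rf, xgb, ovr) in reported.items():
    # OA is already in percent; the other metrics are fractions in [0, 1].
    scale = 1.0 if metric.endswith("(%)") else 100.0
    gains[metric] = (round((ovr - rf) * scale, 2),
                     round((ovr - xgb) * scale, 2))
```

Each entry of `gains` holds the improvement of OVR-FWP-RF over RF and over XGBoost, in points on a 0 to 100 scale.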

3.4. Classification Information Extraction Results and Evaluation

To further investigate the differences in classification performance among different algorithms for specific ground objects, this study adopts producer accuracy and user accuracy to conduct a comparative analysis of different algorithms.

Table 7 shows that all three machine learning algorithms distinguish the different ground objects well. In the classification and extraction of water and construction land, the PA and UA of the three algorithms are not significantly different. In the extraction of woodland information, however, OVR-FWP-RF achieves the highest PA of 92.13%, which is 4.14% and 4.65% higher than RF and XGBoost, respectively. In the extraction of non-abandoned farmland and abandoned farmland, the PA and UA of the OVR-FWP-RF algorithm presented in this paper are higher than those of both the RF and XGBoost algorithms. Specifically, for non-abandoned farmland, the PA of OVR-FWP-RF is 2.22% and 1.69% higher than RF and XGBoost, respectively, while the UA is 1.84% and 1.04% higher. For abandoned farmland, the PA of OVR-FWP-RF is 3.22% and 0.71% higher than RF and XGBoost, respectively, and the UA is 5.27% and 6.68% higher. A partial comparison is shown in Figure 8.
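
The per-class measures compared above can be sketched as follows. This is an illustrative calculation on a hypothetical 3-class matrix with made-up class names; it assumes rows are reference labels and columns are predicted labels, so that PA is the diagonal divided by the row total and UA is the diagonal divided by the column total.

```python
def producer_user_accuracy(cm, class_names):
    """Return PA, UA and F1 per class from a square confusion matrix.

    Assumed layout: cm[i][j] counts samples of reference class i predicted as class j.
    """
    n = len(cm)
    results = {}
    for k in range(n):
        reference_total = sum(cm[k])                       # row sum
        predicted_total = sum(cm[i][k] for i in range(n))  # column sum
        pa = cm[k][k] / reference_total if reference_total else 0.0
        ua = cm[k][k] / predicted_total if predicted_total else 0.0
        f1 = 2 * pa * ua / (pa + ua) if (pa + ua) else 0.0
        results[class_names[k]] = {"PA": pa, "UA": ua, "F1": f1}
    return results


# Hypothetical counts and class names, for illustration only.
toy_cm = [
    [50, 2, 3],
    [4, 40, 6],
    [1, 5, 39],
]
toy_classes = ["water", "non-abandoned farmland", "abandoned farmland"]
per_class = producer_user_accuracy(toy_cm, toy_classes)
for name, scores in per_class.items():
    print(name, {metric: round(value, 4) for metric, value in scores.items()})
```

A low UA with a comparatively high PA, as in the hypothetical abandoned farmland row, means the class is rarely missed but often over-predicted.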

4. Discussion

The OVR-FWP-RF algorithm proposed in this article provides an effective machine-learning method for the identification of abandoned farmland in hilly areas as well as other ground objects. This research not only offers technical guidance for the rapid acquisition of the distribution of abandoned farmland in hilly areas but also offers robust informational backing for rural revitalization.

First, the OVR-FWP-RF algorithm employs the OneVsRest classifier framework, which decomposes a multi-class classification problem into multiple binary classification problems, one per class. This provides a strategy for handling imbalanced data and thus improves classification accuracy. Jiaxing Xu et al. [33] adopted an RF classification method based on a diverse feature integration classification framework for remote sensing images to extract land use information in agro-industrial complexes. Although this method achieved a high overall classification accuracy and Kappa coefficient, the recognition accuracy for forestland was relatively low. In contrast, the producer’s accuracy and the user’s accuracy for forestland in this study are both higher. Wang Hongyan et al. [34] utilized Sentinel-2 data sources, categorizing land use into “others” and “abandoned land”, and employed a decision tree for binary classification. Their producer’s (mapping) accuracy for abandoned land reached 86.8%, and the user’s accuracy was 84.1%. Compared to the algorithm proposed in this paper, there is still room for improvement in the recognition accuracy of abandoned land using decision trees. Zhang Gaoteng et al. [35] used random forest for land use classification in their study area, achieving an overall classification accuracy of 85.96% and a Kappa coefficient of 0.81. However, the recognition accuracy for buildings, crops, and bare land was relatively low. In summary, previous studies on land use type identification mainly relied on direct multi-class classification methods, often without considering the influence between different land types, resulting in lower recognition accuracy for certain land types.
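
The One-Vs-Rest decomposition described above can be sketched with scikit-learn, assuming that library is available. This is not the authors' implementation: the data are synthetic stand-ins generated with `make_classification` (20 features and 5 classes, mirroring the counts in Table 2 and Figure 7), and the hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in for labelled pixel samples: 20 features, 5 land-cover classes.
X, y = make_classification(
    n_samples=600,
    n_features=20,
    n_informative=8,
    n_redundant=4,
    n_classes=5,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# One binary random forest is trained per class ("this class" vs. "all others").
ovr = OneVsRestClassifier(RandomForestClassifier(n_estimators=100, random_state=0))
ovr.fit(X_train, y_train)

# Each binary forest exposes its own feature importances, one vector per class.
per_class_importance = {
    int(label): estimator.feature_importances_
    for label, estimator in zip(ovr.classes_, ovr.estimators_)
}
test_accuracy = ovr.score(X_test, y_test)
print(f"binary models: {len(ovr.estimators_)}, test accuracy: {test_accuracy:.3f}")
```

Because each class gets its own binary forest, feature importance can be read per class, which is what makes a class-specific ranking such as Figure 4 possible.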

Second, the OVR-FWP-RF algorithm applies a feature weighting method in feature selection. By averaging the weighted importance of each feature factor, it filters out the feature factors most suitable for abandoned farmland identification and reduces data redundancy. Feature selection has a significant impact on the accuracy and performance of remote sensing image classification. On one hand, too few feature parameters cannot comprehensively capture the effective information of all land types, which limits the ability to distinguish precisely between them. On the other hand, more feature parameters do not necessarily give better results: involving all features in the classification process inevitably introduces information redundancy. A large number of redundant features increases the computational burden and can result in the “Hughes phenomenon”, which complicates the classification process and may even decrease classification accuracy [36,37]. An appropriate feature selection method therefore both improves the performance of the classifier and enhances the accuracy of ground object extraction.
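
The weighting step can be illustrated with a short sketch. The importance values, class names, and the default of equal class weights below are all hypothetical; the exact weights used by the authors are not reproduced here, so this only shows the general idea of combining per-class importances and keeping the top-ranked features.

```python
def combined_importance(per_class, weights=None):
    """Weighted average of per-class feature importances.

    per_class: {class_name: {feature_name: importance}}
    weights:   {class_name: weight}; equal weights are assumed when omitted.
    """
    classes = list(per_class)
    if weights is None:
        weights = {c: 1.0 / len(classes) for c in classes}
    features = list(per_class[classes[0]])
    return {
        f: sum(weights[c] * per_class[c][f] for c in classes)
        for f in features
    }


def top_k(scores, k):
    """Names of the k features with the highest combined score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical per-class importances (illustrative numbers only).
toy_importance = {
    "abandoned farmland": {"NDVI": 0.40, "Variance": 0.30, "Blue": 0.20, "Correlation": 0.10},
    "woodland": {"NDVI": 0.30, "Variance": 0.40, "Blue": 0.10, "Correlation": 0.20},
    "water": {"NDVI": 0.50, "Variance": 0.10, "Blue": 0.30, "Correlation": 0.10},
}
scores = combined_importance(toy_importance)
selected = top_k(scores, 2)
print(selected)
```

Passing a `weights` dictionary that emphasises one class, for example abandoned farmland, would shift the ranking toward the features that matter most for that class.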

In addition, the data source used in this study is airborne multispectral remote sensing data, which contains rich spectral and textural information and provides strong data support for the identification of abandoned farmland. Previous studies have employed satellite-borne remote sensing data such as Sentinel-2 and Landsat for land-use identification [5,6,38,39], but their limited spatial resolution makes them better suited to plains and areas with large land parcels than to hilly areas. By comparison, airborne multispectral data offers finer spatial resolution alongside rich spectral information. However, different data sources and data quality may affect classification results, so further research can investigate the impact of varied data sources on the accuracy of abandoned farmland identification. Additionally, this study takes only one typical hilly village as an example; the relatively small study area may not fully reflect the complexity and diversity of abandoned farmland identification in hilly areas. Future research can therefore expand the study area to explore abandoned land identification methods and accuracy under different topographic, geomorphic, and climatic conditions.

5. Conclusions

Taking a typical hilly village as an example, this study utilizes airborne multispectral remote sensing image data as the data source to compare the capabilities and adaptability of OVR-FWP-RF, traditional RF, and XGBoost algorithms in identifying abandoned farmland in complex land surface environments in hilly areas. The following conclusions can be drawn:

(1): By adopting the OneVsRest classifier and ranking the importance of the initial features for each land cover type, it is found that for the identification of abandoned land, among the spectral features, vegetation features, and texture features, there are 10 feature factors with an importance of over 5%. Among them, there are 2 spectral feature factors, namely Blue and Red Band; 4 vegetation feature factors, namely NDRE_1, GNDVI, NDVI, and EVI; and 4 texture features, namely Variance, Entropy, Dissimilarity, and Contrast. Therefore, the identification of abandoned land is primarily influenced by vegetation features and texture features.
(2): The classification accuracy of the OVR-FWP-RF algorithm is higher than that of the RF and XGBoost algorithms, and the overall classification accuracy of all three machine learning algorithms is higher than 90%, with Kappa Coefficient values exceeding 0.85. Therefore, the utilization of machine learning methods and airborne multispectral data for land use classification in hilly areas achieves high classification accuracy.
(3): In the abandoned land identification results using the OVR-FWP-RF algorithm, the producer’s accuracy is 3.22% and 0.71% higher than that of RF and XGBoost respectively, while the user’s accuracy is 5.27% and 6.68% higher respectively. By employing the One-Vs-Rest classifier framework and feature weighting method, the OVR-FWP-RF algorithm is able to enhance the available features of a random forest while reducing interference caused by feature redundancy. This improves classifier performance and land cover extraction accuracy, providing a new approach for the identification of abandoned land and other land cover classification tasks in hilly areas.

In summary, this study not only provides an effective machine learning algorithm for identifying abandoned land in hilly areas but also offers a valuable reference for land use classification research in other similar areas.

Author Contributions

Conceptualization, L.W. and Y.W.; data curation, L.W., Q.L. and K.Z.; formal analysis, L.W. and Q.L.; funding acquisition, Q.L. and Y.W.; methodology, L.W. and H.W.; project administration, Q.L. and Y.W.; supervision, L.W. and Y.W.; Writing—Original Draft, L.W.; Writing—Review and Editing, L.W. and Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Social Science Fund of China (No. 19XJY008) and the Startup Project of Doctoral Research by China West Normal University (No. 20E034).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

For access to the relevant data and code, please contact the corresponding author.

Acknowledgments

We would like to express our sincere thanks to Hui Liu.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Huang, Y.; Li, F.; Xie, H. A Scientometrics Review on Farmland Abandonment Research. Land 2020, 9, 263.
2. Chen, R.; Ye, C.; Cai, Y.; Xing, X.; Chen, Q. The impact of rural out-migration on land use transition in China: Past, present and trend. Land Use Policy 2014, 40, 101–110.
3. Li, L.; Pan, Y.; Zheng, R.; Liu, X. Understanding the spatiotemporal patterns of seasonal, annual, and consecutive farmland abandonment in China with time-series MODIS images during the period 2005–2019. Land Degrad. Dev. 2022, 33, 1608–1625.
4. Liu, B.; Song, W.; Sun, Q. Status, Trend, and Prospect of Global Farmland Abandonment Research: A Bibliometric Analysis. Int. J. Environ. Res. Public Health 2022, 19, 16007.
5. Baumann, M.; Kuemmerle, T.; Elbakidze, M.; Ozdogan, M.; Radeloff, V.C.; Keuler, N.S.; Prishchepov, A.V.; Kruhlov, I.; Hostert, P. Patterns and drivers of post-socialist farmland abandonment in Western Ukraine. Land Use Policy 2011, 28, 552–562.
6. Kuemmerle, T.; Müller, D.; Griffiths, P.; Rusu, M. Land use change in Southern Romania after the collapse of socialism. Reg. Environ. Chang. 2009, 9, 1–12.
7. Wang, L.J.; Zhang, G.M.; Wang, Z.Y.; Liu, J.G.; Shang, J.L.; Liang, L. Bibliometric Analysis of Remote Sensing Research Trend in Crop Growth Monitoring: A Case Study in China. Remote Sens. 2019, 11, 809.
8. Deng, J.Q.; Guo, Y.W.; Chen, X.Y.; Liu, L.; Liu, W.Y. Abandoned Farmland Extraction and Feature Analysis Based on Multi-Sensor Fused Normalized Difference Vegetation Index Time Series-A Case Study in Western Mianchi County. Appl. Sci. 2024, 14, 2102.
9. Bangira, T.; Alfieri, S.M.; Menenti, M.; van Niekerk, A. Comparing Thresholding with Machine Learning Classifiers for Mapping Complex Water. Remote Sens. 2019, 11, 1351.
10. Wu, H.; Lin, A.Q.; Xing, X.D.; Song, D.X.; Li, Y. Identifying core driving factors of urban land use change from global land cover products and POI data using the random forest method. Int. J. Appl. Earth Obs. Geoinf. 2021, 103, 13.
11. Rodriguez-Galiano, V.F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J.P. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 2012, 67, 93–104.
12. Ge, G.; Shi, Z.; Zhu, Y.; Yang, X.; Hao, Y. Land use/cover classification in an arid desert-oasis mosaic landscape of China using remote sensed imagery: Performance assessment of four machine learning algorithms. Glob. Ecol. Conserv. 2020, 22, e00971.
13. Dong, J.; Ren, G.B.; Hu, Y.B.; Feng, J.Z.; Ma, Y. Construction and classification of coral reef geomorphological unit system based on high-resolution remote sensing: Taking 8-band Worldview-2 image as an example. J. Trop. Oceanogr. 2020, 39, 116–129.
14. Carlson, T.; Ripley, D.A.J. On the Relation between NDVI, Fractional Vegetation Cover, and Leaf Area Index. Remote Sens. Environ. 1997, 62, 241–252.
15. Schuster, C.; Forster, M.; Kleinschmit, B. Testing the red edge channel for improving land-use classifications based on high-resolution multi-spectral satellite data. Int. J. Remote Sens. 2012, 33, 5583–5599.
16. Ustuner, M.; Sanli, F.B.; Abdikan, S.; Esetlili, M.; Kurucu, Y. Crop Type Classification Using Vegetation Indices of RapidEye Imagery. ISPRS Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2014, XL-7, 195–198.
17. Navarro, G.; Caballero, I.; Silva, G.; Parra, P.C.; Vázquez, Á.; Caldeira, R. Evaluation of forest fire on Madeira Island using Sentinel-2A MSI imagery. Int. J. Appl. Earth Obs. Geoinf. 2017, 58, 97–106.
18. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621.
19. Jiang, Q.; Yan, X. Parallel PCA–KPCA for nonlinear process monitoring. Control. Eng. Pract. 2018, 80, 17–25.
20. Deng, S.B.; Chen, Q.J.; Du, H.J.; Xu, E.H. ENVI Remote Sensing Image Processing Method, 2nd ed.; Higher Education Press: Beijing, China, 2014.
21. Belgiu, M.; Dragut, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
22. Xia, J.S.; Yokoya, N.; Iwasaki, A. Classification of large-sized hyperspectral imagery using fast machine learning algorithms. J. Appl. Remote Sens. 2017, 11, 15.
23. Zhang, Y.Q.; Cao, G.; Li, X.S.; Wang, B.S. Cascaded Random Forest for Hyperspectral Image Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 1082–1094.
24. Alhassan, V.; Henry, C.; Ramanna, S.; Storie, C. A deep learning framework for land-use/land-cover mapping and analysis using multispectral satellite imagery. Neural Comput. Appl. 2020, 32, 8529–8544.
25. Yang, L.B.; Wang, L.M.; Huang, J.F.; Mansaray, L.R.; Mijiti, R. Monitoring policy-driven crop area adjustments in northeast China using Landsat-8 imagery. Int. J. Appl. Earth Obs. Geoinf. 2019, 82, 18.
26. Sahin, E.K. Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Appl. Sci. 2020, 2, 1308.
27. Zhang, D.Y.; Fang, S.M.; She, B.; Zhang, H.H.; Jin, N.; Xia, H.M.; Yang, Y.Y.; Ding, Y. Winter Wheat Mapping Based on Sentinel-2 Data in Heterogeneous Planting Conditions. Remote Sens. 2019, 11, 2647.
28. Wang, L.; Dong, T.; Zhang, G.; Niu, Z. LAI Retrieval Using PROSAIL Model and Optimal Angle Combination of Multi-Angular Data in Wheat. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 1730–1736.
29. Guo, J.; Zhu, L.; Jin, B. Crop classification based on data fusion of Sentinel-1 and Sentinel-2. Trans. Chin. Soc. Agric. Mach. 2018, 49, 192–198.
30. Zhang, X.L.; Feng, X.Z.; Xiao, P.F.; He, G.J.; Zhu, L.J. Segmentation quality evaluation using region-based precision and recall measures for remote sensing images. ISPRS J. Photogramm. Remote Sens. 2015, 102, 73–84.
31. Sorboni, N.G.; Wang, J.F.; Najafi, M.R. Fusion of Google Street View, LiDAR, and Orthophoto Classifications Using Ranking Classes Based on F1 Score for Building Land-Use Type Detection. Remote Sens. 2024, 16, 2011.
32. Huang, J.W.; Li, Z.Y.; Chen, E.X.; Zhao, L.; Mo, B.P. Classification of Plantation Types Using Wide-Swath Multispectral Data from GF-6 Satellite. J. Remote Sens. 2021, 25, 539–548.
33. Xu, J.; Chen, C.; Zhou, S.; Hu, W.; Zhang, W. Land use classification in mine-agriculture compound area based on multi-feature random forest: A case study of Peixian. Front. Sustain. Food Syst. 2024, 7, 1335292.
34. Wang, H.Y.; Wang, X.F.; Gao, L.; Li, Q.Z.; Zhao, L.C.; Du, X.; Zhang, Y. Research on Remote Sensing Extraction Method of Abandoned Farmland Based on Seasonal Variation Characteristics. Remote Sens. Technol. Appl. 2020, 35, 596–605.
35. Zhang, G.T.; Wang, H.Y.; Wang, C.; Wu, X.K. Land Use Classification Based on Fusion of Airborne LiDAR and Hyperspectral Images. Laser J. 2023, 44, 133–136.
36. Li, X.K.; Liu, K.; Tian, J. Variability, predictability, and uncertainty in global aerosols inferred from gap-filled satellite observations and an econometric modeling approach. Remote Sens. Environ. 2021, 261, 112501.
37. Yao, X.; Wang, X.; Zhang, Y.; Quan, W. Summary of feature selection algorithms. Control. Decis. 2012, 27, 161–313.
38. Puletti, N.; Chianucci, F.; Castaldi, C. Use of Sentinel-2 for forest classification in Mediterranean environments. Ann. Silvic. Res. 2017, 42, 32–38.
39. Georganos, S.; Grippa, T.; Gadiaga, A.; Vanhuysse, S.; Kalogirou, S.; Lennert, M.; Linard, C. An Application of Geographical Random Forests for Population Estimation in Dakar, Senegal Using Very-High-Resolution Satellite Imagery. In Joint Urban Remote Sensing Event; IEEE: Vannes, France, 2019.

Figure 1. Technology roadmap.

Figure 2. The geographical location map of the research area.

Figure 3. Diagram of the OVR-FWP-RF Principle.

Figure 4. Feature Importance Ranking: (a) water; (b) construction land; (c) woodland; (d) non-abandoned farmland; (e) abandoned farmland; (f) Weighted ranking of the importance of comprehensive feature factors.

Figure 5. The relationship between the feature dimensionality and validation accuracy of a random forest regression model.

Figure 6. Land Use Classification Maps: (a) OVR-FWP-RF; (b) RF; (c) XGBoost.

Figure 7. Confusion Matrix Graph: (a) OVR-FWP-RF; (b) RF; (c) XGBoost. Notes: Class1 = Water; Class2 = Construction Land; Class3 = Woodland; Class4 = Non-abandoned Farmland; Class5 = Abandoned Farmland.

Figure 8. Partial Comparison Chart.

Table 1. Spectral band information.

Band Name	Central Wavelength (in nm)	Band Name	Central Wavelength (in nm)
Blue	450	Red edge1	720
Green	555	Red edge2	750
Red	660	NIR	840

Table 2. Initially selected feature factors.

Feature Category	Feature Factors	Feature Category	Feature Factors
Spectral Feature	Blue	Vegetation Index	NDVI
	Red		NDRE_1
	Green		NDRE_2
	Red edge1		GNDVI
	Red edge2		EVI
	NIR		RVI
Texture Feature	Mean	Texture Feature	Dissimilarity
	Variance		Second moment
	Homogeneity		Correlation
	Contrast		Entropy

Table 3. Vegetation Index and its Calculation Formulas.

Vegetation Index	Formulas
Normalized Difference Vegetation Index	$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$
Normalized Difference Red-edge Index	$\mathrm{NDRE} = \frac{\mathrm{NIR} - \mathrm{Rededge}_{1\,\mathrm{or}\,2}}{\mathrm{NIR} + \mathrm{Rededge}_{1\,\mathrm{or}\,2}}$
Green Band Normalized Difference Vegetation Index	$\mathrm{GNDVI} = \frac{\mathrm{NIR} - \mathrm{Green}}{\mathrm{NIR} + \mathrm{Green}}$
Ratio Vegetation Index	$\mathrm{RVI} = \frac{\mathrm{NIR}}{\mathrm{Red}}$
Enhanced Vegetation Index	$\mathrm{EVI} = \frac{2.5 \times (\mathrm{NIR} - \mathrm{Red})}{\mathrm{NIR} + 6 \times \mathrm{Red} - 7.5 \times \mathrm{Blue} + 1}$
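
The formulas in Table 3 can be sketched as a single function. The function name and the reflectance values are hypothetical and chosen only for illustration; the sketch assumes band values are surface reflectances with non-zero denominators.

```python
def vegetation_indices(blue, green, red, red_edge1, red_edge2, nir):
    """Compute the vegetation indices of Table 3 for one pixel.

    Inputs are assumed to be reflectances; no guard against zero denominators.
    """
    return {
        "NDVI": (nir - red) / (nir + red),
        "NDRE_1": (nir - red_edge1) / (nir + red_edge1),
        "NDRE_2": (nir - red_edge2) / (nir + red_edge2),
        "GNDVI": (nir - green) / (nir + green),
        "RVI": nir / red,
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),
    }


# Hypothetical reflectances for a densely vegetated pixel.
indices = vegetation_indices(
    blue=0.04, green=0.08, red=0.05, red_edge1=0.20, red_edge2=0.35, nir=0.45
)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

The same expressions apply element-wise to whole band arrays if the scalar inputs are replaced with array types that support arithmetic operators.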

Table 4. The Principal Component Analysis Results.

PC	Eigenvalue	Cumulative Percent
1	3,647,376,287.97	94.65%
2	201,161,928.06	99.87%
3	2,726,339.96	99.94%
4	1,388,727.05	99.98%
5	625,448.10	100.00%
6	191,634.94	100.00%
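
The percentage column of Table 4 appears to be cumulative rather than per-component. As a quick check under that reading, the sketch below recomputes the running share of variance from the eigenvalues listed in the table; the function name is illustrative.

```python
def cumulative_percent(eigenvalues):
    """Running share (in percent) of total variance explained by the leading components."""
    total = sum(eigenvalues)
    running = 0.0
    shares = []
    for value in eigenvalues:
        running += value
        shares.append(100.0 * running / total)
    return shares


# Eigenvalues copied from Table 4.
table4_eigenvalues = [
    3647376287.97,
    201161928.06,
    2726339.96,
    1388727.05,
    625448.10,
    191634.94,
]
cumulative = cumulative_percent(table4_eigenvalues)
for pc, share in enumerate(cumulative, start=1):
    print(f"PC{pc}: {share:.2f}%")
```

Under this reading, the first principal component alone carries roughly 94.65% of the variance and the first two carry about 99.87%.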

Table 5. The running time and resource consumption of each model.

Metric	OVR-FWP-RF	RF	XGBoost
Runtime (s)	58.60	60.52	50.57
CPU utilization (%)	0.10	2.20	6.60
RSS utilization (MB)	2.50	2.57	12.62

Table 6. Accuracy evaluation results of each model.

Evaluation Type	OVR-FWP-RF	RF	XGBoost
Overall Accuracy	92.66%	90.55%	90.75%
Kappa Coefficient	0.9064	0.8796	0.8824
Precision	0.9247	0.9047	0.9081
Recall	0.9259	0.9062	0.9053
F1	0.9253	0.9055	0.9067

Table 7. Accuracy Statistics.

Class	Evaluation Types	OVR-FWP-RF	RF	XGBoost
Water	PA	96.46%	96.20%	96.58%
	UA	97.44%	97.69%	94.67%
	F1	0.9695	0.9694	0.9561
Construction Land	PA	97.12%	96.28%	95.78%
	UA	95.56%	94.24%	95.45%
	F1	0.9633	0.9525	0.9561
Woodland	PA	92.13%	87.99%	87.48%
	UA	94.43%	92.80%	94.76%
	F1	0.9326	0.9033	0.9097
Non-abandoned Farmland	PA	87.35%	85.13%	85.66%
	UA	89.01%	87.17%	87.97%
	F1	0.8817	0.8614	0.8680
Abandoned Farmland	PA	89.30%	86.80%	88.59%
	UA	86.52%	81.25%	79.84%
	F1	0.8789	0.8393	0.8399

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
