Article

Apple Orchard Mapping in China Based on an Automatic Sample Generation Algorithm and Random Forest

1
Department of Geographic Information Engineering, College of Land Science and Technology, China Agricultural University, No.2 West Old Summer Palace Road, Haidian District, Beijing 100193, China
2
Key Laboratory of Remote Sensing of Agricultural Disasters, Ministry of Agriculture and Rural Affairs, Beijing 100193, China
3
School of Artificial Intelligence, China University of Geosciences (Beijing), Beijing 100083, China
4
Hebei Key Laboratory of Geospatial Digital Twin and Collaborative Optimization, Beijing 100083, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(20), 3449; https://doi.org/10.3390/rs17203449
Submission received: 27 August 2025 / Revised: 12 October 2025 / Accepted: 14 October 2025 / Published: 16 October 2025
(This article belongs to the Special Issue Innovations in Remote Sensing Image Analysis)

Highlights

What are the main findings?
  • Development of a novel AMCI index for apple orchard identification.
  • Proposal of a knowledge-assisted apple orchard mapping framework.
What is the implication of the main finding?
  • Application of a hybrid approach combining thresholding and random forest for crop identification.
  • Production of the first 10 m-resolution apple orchard map of China.

Abstract

Accurate apple orchard mapping plays a vital role in managing agricultural resources. However, national-scale apple orchard mapping faces challenges such as the “same spectrum with different objects” phenomenon between apple trees and other crops, as well as difficulties in sample collection. To address the above issues, this study proposes a knowledge-assisted apple mapping framework that automatically generates samples using agronomic knowledge and employs a random forest algorithm for classification. Firstly, an apple mapping composite index (AMCI) was developed by integrating the chlorophyll content and leaf structural characteristics of apple trees. In a single Sentinel-2 image, a novel natural vegetation phenolic compounds index was applied to systematically exclude natural vegetation, and based on this, the AMCI was used to generate an initial apple distribution map. Using this initial map, apple samples were obtained through random point selection and visual interpretation, and other samples were constructed based on land cover products. Finally, a 10 m-resolution apple orchard map of China was generated with the random forest algorithm. The results show an overall accuracy of 90.7% and a kappa of 0.814. Moreover, the extracted area shows an 82.11% consistency with official statistical data, demonstrating the effectiveness of the proposed method. This simple and robust framework provides a valuable reference for large-scale crop mapping.

1. Introduction

As one of the world’s major economic crops, apples possess nutritional, economic, and ecological value. Apples can be consumed directly or processed into a wide range of products, including food, pharmaceuticals, and daily necessities, making them widely cultivated due to their diverse applications. In addition, apple orchards contribute significantly to ecological conservation by stabilizing soil through root systems, enriching soil fertility through litterfall, and reducing runoff via canopy interception. These functions enhance soil sustainability and help prevent erosion. The cultivation of apple trees thus plays a crucial role in reinforcing soil structure, increasing nutrient content, and conserving soil resources. As the world’s largest apple cultivation region, China requires accurate knowledge of the spatial distribution of apple orchards to provide a scientific basis for optimizing planting layouts and formulating industrial policies, thereby promoting efficient resource allocation and sustainable agricultural development [1,2].
Remote sensing technology, with its high temporal and spatial resolution, enables the efficient and accurate acquisition of large-scale crop spatial information. Traditional crop planting statistics rely heavily on field surveys conducted by researchers, which are time-consuming, costly, and typically lack detailed spatial distribution data [3]. Official statistics are often released in the following year, resulting in temporal delays and insufficient information. In contrast, remote sensing offers timely and accurate access to crop information over large areas, thereby enhancing the availability of spatial information for agricultural monitoring. However, crop growth is a dynamic and continuous process. While different crops exhibit unique phenological cycles, mature fruit trees, such as apple trees, follow a recurring annual growth and development pattern. Apple trees typically attain a stable growth phase by their third to fourth year, after which their spectral and phenological characteristics exhibit small interannual variation in remote sensing imagery [4]. Nevertheless, apple orchards exhibit a highly fragmented spatial distribution and limited spectral separability from vegetation with similar characteristics [5]. In addition, the complexity of China’s cropping systems further compounds the challenge of accurate identification [6]. Therefore, achieving precise apple orchard detection from remote sensing imagery presents significant technical challenges, making the selection of suitable classification methods particularly critical. Currently, remote sensing imagery is widely applied in crop mapping, with mainstream approaches including classifier-based and threshold-based extraction techniques.
Using classifiers for crop classification is a widely adopted and effective approach [7]; however, its performance is constrained by data quality and model selection. Conventional machine learning algorithms, including random forest (RF) and support vector machine (SVM), have been widely employed in crop identification [8] and have demonstrated satisfactory performance in large-scale identification studies [9,10,11,12]. However, their heavy reliance on features and samples can adversely affect algorithmic accuracy [13]. On the other hand, deep learning offers superior segmentation precision and enables crop classification without the need for manually designed features [14]. Nevertheless, it demands substantial computational resources and requires a sizable dataset for model training [15,16]. Consequently, in large-scale high-resolution crop classification tasks, machine learning and deep learning face limitations due to constraints in data availability and hardware, which may compromise their robustness and generalization. Moreover, achieving large-scale, high-precision crop mapping remains challenging in the absence of abundant manually annotated samples [17].
Threshold-based classification methods primarily enhance the spectral contrast between target crops and other land cover types by constructing vegetation indices, which evaluate crop growth conditions [18]. These methods achieve accurate crop identification by extracting spectral and agronomic information from remote sensing imagery [19]. A critical aspect of this approach lies in analyzing spectral variation trends within time-series imagery to extract phenological features, thereby improving the separability of target crops [20]. Due to the simplicity of the index computation formulas and the efficiency of rapid information extraction, threshold-based approaches are more convenient and computationally efficient than traditional classification methods [21]. Due to their independence from large amounts of manually labeled samples, threshold-based classification techniques have been widely applied in large-scale crop mapping, such as rice [22], soybean [23,24], corn [25], rapeseed [26], and tea plants [27]. However, to the best of our knowledge, there is no reported work on the development of specialized indices by mining the unique phenological and spectral characteristics of apple trees. Existing index designs often rely on the assumed linear relationships among variables. For apples, the presence of many spectrally similar crops poses a challenge, making it difficult for a single index to achieve effective discrimination. Therefore, relying solely on threshold-based methods for apple identification presents certain limitations.
Addressing these challenges, this research proposes a knowledge-assisted apple mapping framework. First, based on the spectral differences between apples and other land cover types, a novel index named the Apple Mapping Composite Index (AMCI) is designed to automatically generate an initial apple distribution map. Subsequently, random sampling and visual interpretation are performed on the initial map to construct an apple sample set. In parallel, additional sample sets for other land cover categories are generated by overlaying existing land cover products within the study area. Finally, the RF model is employed to produce the final apple distribution map. This knowledge-assisted mapping framework is designed to mitigate the uncertainty associated with single-method approaches and achieve accurate delineation of apple distribution. This method is concise and efficient, exhibits strong scalability, and can provide an important reference for global apple mapping and related studies.

2. Materials and Methods

2.1. Study Area

Apples, as a fruit of significant economic value, are widely cultivated across the globe. China, with an apple cultivation area of 1,955,800 hectares (Figure 1), far surpasses other countries and stands as the world’s largest apple producer [28]. The study area (32°11′ to 42°57′N, 92°13′ to 122°43′E) encompasses China’s primary apple-producing regions, spanning four provinces from west to east: Gansu, Shaanxi, Shanxi, and Shandong (Figure 1), and covers a total area of approximately 949,300 square kilometers. The study area is located within the globally recognized “golden zone” for apple cultivation (35–50° north and south latitude). In 2022, the apple planting area and production in the study area accounted for 63.7% and 67.4% of the national total, respectively [29]. The study area features diverse natural conditions, with annual average precipitation ranging from 387.2 mm in Gansu to 878 mm in Shandong [29], encompassing climate types such as temperate monsoon, subtropical monsoon, temperate continental, and plateau alpine. The topography comprises plains, hills, basins, plateaus, and mountains. This diversity of geographical conditions not only results in varied land cover types but also provides favorable environmental conditions for apple cultivation.

2.2. Data

2.2.1. Sentinel-2 Data

Sentinel-2 provides freely available multispectral imagery at a 10 m spatial resolution. The MSI data adopted in this study were preprocessed on the Google Earth Engine (GEE) platform, including atmospheric correction, so no further correction was necessary. To address the issue of image quality degradation caused by cloud and rain interference in optical imagery, cloud removal was performed using the quality assessment (QA) band, and images with cloud coverage below 10% were selected [30]. Given the spatial stability and limited interannual variation in mature apple orchards, imagery from the same period in 2023 and 2024 (see Section 2.3.2 for detailed time information) was used to fill in missing areas in 2022. Sentinel-2 imagery consists of 13 spectral bands, among which 10 bands (BLUE, GREEN, RED, RE1, RE2, RE3, RE4, NIR, SWIR1, and SWIR2) are commonly used for vegetation monitoring. To ensure a consistent spatial resolution, the RE1, RE2, RE3, RE4, SWIR1, and SWIR2 bands were resampled to 10 m.

2.2.2. Ground Truth Data

To ensure the accuracy of the reference data, field surveys were conducted in the study area in recent years to collect ground truth data. Specifically, two rounds of surveys were carried out in Shandong Province in October 2021 and August 2022. In addition, in August 2023, sampling was conducted along road networks in Shanxi, Shaanxi, and Gansu provinces using a handheld GPS device. All collected samples are point data with geographic coordinates, and all were acquired during the apple fruit-setting stage, with a temporal interval of less than five days between consecutive acquisitions. A total of 5280 samples were collected (Figure 1), including 1026 apple samples. The field samples were processed in ArcGIS for data format conversion and geographic coordinate assignment, and were then used for time-series spectral analysis and accuracy validation.

2.2.3. Land Cover Products

To automatically generate sample sets for other land cover categories, this study utilized publicly available land cover datasets: China Land Cover Dataset (CLCD) [31] and Globeland30 [32].

2.2.4. Topographic Data

Apple cultivation is influenced by terrain, making topographic data essential for identification. The Shuttle Radar Topography Mission (SRTM) generated global Digital Elevation Model (DEM) data at 30 m and 90 m resolution using radar technology [33]. In this study, 30 m SRTM data from GEE (ee.Image("USGS/SRTMGL1_003")) were used to extract elevation, slope, aspect, and hillshade as input variables for the RF model.

2.2.5. Statistical Data

Official statistical data play an important role in validating the results. In this study, apple orchard area statistics for 2022 published by the National Bureau of Statistics of China were used to obtain apple cultivation area within the study area and to support the validation of results. According to the statistics, the national apple orchard area in 2022 was 1,955,733 hectares [34]. Given that the study area accounts for 63.7% of the national planting area, a value of 1,245,802 hectares was used as the reference data for comparison with the model-derived apple cultivation area.

2.3. Methods

The knowledge-assisted apple mapping framework (KAMF) consists of four main steps (Figure 2) [35]: (1) Input Data: Data acquisition and preprocessing; (2) Knowledge-based Method: By collecting and analyzing agronomic knowledge related to apples, phenological and spectral characteristics were examined to construct the AMCI, which was used to generate an initial apple distribution map; (3) Apple Sample Generation: Based on the initial apple map, samples were generated through random point selection and visual interpretation; (4) Classification and Data Output: Apple identification and accuracy assessment were performed.

2.3.1. Phenological Phase Analysis

Based on field investigations, we found that peaches, pears, and cherries are the crops most likely to be confused with apples in the study area. Furthermore, considering the extensive cultivation of corn in the region, its phenological characteristics were also examined. Due to differences in natural conditions and crop varieties, the phenological phases of crops vary across regions. Therefore, the phenological information of crops in the study area was integrated by taking the union of the time ranges for the same phenological stage across different varieties to produce a comprehensive phenological calendar (Figure 3) [2,36].
The flowering period of apples typically occurs from April to early May, followed by the fruit setting stage in mid to late May. The maturation period extends from late August to late November, during which most apple harvesting is completed. Specifically, early-maturing cultivars begin to ripen in late August, mid-season cultivars mature in September, and late-maturing cultivars gradually ripen in October. After entering December, the trees enter the defoliation stage, during which most leaves fall off, usually following the first frost.
The phenological phases of peaches and pears are relatively similar, though the growth stages of peaches generally occur slightly earlier than those of pears. The maturation period of peaches begins in early June and harvesting is mostly completed before September. Pears typically mature around August, with large-scale harvesting completed by the end of October. In comparison, the defoliation period of peach trees is significantly longer than that of pear trees. Cherries exhibit a much earlier flowering phase than peaches, pears, and apples, with a notably shorter duration. Fruit setting usually begins in late April, and maturation starts in mid to late May, continuing through to mid or late June. As a grain crop, corn shows distinct phenological characteristics compared to the above-mentioned fruit crops. It generally germinates from late April to early May and is harvested on a large scale in September. In regions with double-cropping systems, other crops are usually planted shortly after the corn harvest to maximize the utilization of arable land resources.
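The calendar construction described in this section (taking the union of the time ranges reported for one phenological stage across varieties) can be sketched as a simple interval merge. This is a minimal illustration only: the function name and the day-of-year windows below are hypothetical and are not taken from the study.

```python
def merge_stage_windows(windows):
    """Return the union of (start_doy, end_doy) windows.

    Overlapping or directly adjacent windows are merged into one.
    """
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous window instead of opening a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical flowering windows (day of year) for three cultivars.
flowering = [(95, 115), (100, 125), (110, 128)]
calendar_window = merge_stage_windows(flowering)
print(calendar_window)
```

If the windows of different varieties do not touch, the function keeps them as separate entries, so gaps in the calendar remain visible instead of being silently bridged.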

2.3.2. Apple Orchard Mapping Algorithm Based on AMCI

(1) Natural Vegetation Phenolic Compounds Index (NVPCI)
Natural vegetation refers to types of vegetation that grow and develop under natural conditions without human intervention, such as natural forests and grasslands. In remote sensing imagery, natural vegetation is often confused with orchards and other types of managed vegetation (Figure 4) [4], making it necessary to effectively eliminate natural vegetation interference. Compared to natural vegetation, fruit tree leaves contain higher levels of flavonoids. Flavonoids are a type of phenolic compound that absorb light in the green spectral region, while the shortwave infrared (SWIR) bands are particularly sensitive to phenolic content. Therefore, based on the Phenolic Compound Index (PCI) [27], this study improves the PCI using field samples by setting the base of the logarithmic function to 0.5, thereby developing the NVPCI (Equation (1)).
NVPCI = ( l o g 0.5 G R E E N + 1 S W I R 2 2 )
Note: GREEN and SWIR2 represent the reflectance of the green and shortwave infrared-2 bands from Sentinel-2 imagery.
(2) Time-Series Analysis of Spectral Bands
The reflectance of spectral bands in remote sensing imagery captures surface responses to specific wavelengths, with different crops showing distinct reflectance characteristics due to physiological differences. Compared to similar crops such as pears and peaches, apples have a longer fruit-setting period (see Section 2.3.1) and darker leaf coloration (Figure 5), indicating a difference in chlorophyll content [37]. In addition, apple leaves are relatively smaller, resulting in a leaf structure that is markedly different from that of other crops (Figure 5). Among the original spectral bands, red-edge bands are particularly sensitive to variations in chlorophyll content and leaf structure, while the near-infrared (NIR) band shows higher reflectance in response to leaf structure [38]. As shown in Figure 6, around DOY 150, apples demonstrate a clear spectral separability in bands RE2, RE3, RE4, and NIR.
(3) Time-Series Analysis of Spectral Indices
Spectral indices can reflect the spectral characteristics of crops, thereby enabling effective discrimination among different crop types. During the fruit-setting stage, the time-series curves of spectral indices for apples exhibit varying degrees of separability from other land cover types (Figure 7). As shown in Figure 7a–i, the curves within each group display similar shapes. In terms of the trend of the apple index curves, indices (a) and (b) show a clear similarity. Regarding the separability between apples and other land cover types, the Enhanced Vegetation Index (EVI) (Equation (2)) performs best. This is because EVI effectively reduces atmospheric and soil background noise, exhibits higher sensitivity to vegetation variation, and better reflects vegetation growth conditions. Similarly, the apple time-series curves of the indices in Figure 7d–i show similar patterns, among which the Green Chlorophyll Vegetation Index (GCVI) (Equation (3)) is the most distinctive. GCVI enhances the reflectance difference between the green and near-infrared bands, thereby improving sensitivity to vegetation with high chlorophyll content. Therefore, compared with other commonly used spectral indices, EVI [39] and GCVI [40] demonstrate a stronger capability to distinguish apples from other land cover types.
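$$\mathrm{EVI} = 2.5 \times \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + 6 \times \mathrm{RED} - 7.5 \times \mathrm{BLUE} + 1} \tag{2}$$
$$\mathrm{GCVI} = \frac{\mathrm{NIR}}{\mathrm{GREEN}} - 1 \tag{3}$$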
Note: The variables denote Sentinel-2 band reflectance: NIR (near-infrared), RED (red), BLUE (blue), and GREEN (green). In addition, Equation (2) is the specific formula used in this study, and its constant terms can be adjusted as needed based on the requirements of different research scenarios.
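Equations (2) and (3) can be evaluated per pixel as in the following sketch. It assumes surface reflectance already scaled to the 0–1 range (the additive constant in EVI presumes this); the function names and the sample reflectance values are hypothetical.

```python
def evi(nir, red, blue):
    """Enhanced Vegetation Index, Equation (2), for reflectance in 0-1."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def gcvi(nir, green):
    """Green Chlorophyll Vegetation Index, Equation (3)."""
    return nir / green - 1.0


# Hypothetical reflectance of a vegetated pixel (not from the study).
nir, red, green, blue = 0.40, 0.05, 0.08, 0.03
print(round(evi(nir, red, blue), 4), round(gcvi(nir, green), 4))
```

Sentinel-2 products are often stored as scaled integers, so in that case the band values would need to be converted to reflectance before these functions are applied.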
(4) Apple Mapping Composite Index (AMCI)
Owing to their smaller leaf area, distinctive leaf structure, and chlorophyll content, apples show greater differences from other land cover types in RE2, RE3, RE4, and NIR, as well as in EVI and GCVI, during the fruit-setting stage. Therefore, based on a comprehensive analysis of the original spectral bands and spectral indices, this study combines the bands and indices for which apples differ most from other land cover types in order to enhance spectral separability [24]. Following the principle of avoiding repeated use of the same spectral band (NIR, which already enters EVI and GCVI) to minimize redundancy, this study proposes the Apple Mapping Composite Index (Equation (4)), specifically designed for apple identification.
AMCI = { R E 2 + R E 3 + R E 4 E V I G C V I } t
Note: RE2, RE3, and RE4 represent the reflectance of the red-edge bands from Sentinel-2 imagery, and t denotes the image acquisition date.
To evaluate the performance and regional robustness of AMCI, annual time-series curves were plotted for different land cover types across the study area (Figure 8). During the fruit-setting stage, apple AMCI values were consistently higher than those of other land covers, with slight regional differences in peak timing. By identifying the overlapping periods during which the difference in AMCI values between apples and other land cover types exceeds 0.5 across the four study areas, the time window from DOY 160 to 200 was selected for apple orchard mapping. During this period, the AMCI values for apples are most distinguishable from those of other land cover types, thereby maximizing the ability to highlight the spatial distribution of apple orchards.
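The window selection described above (keeping only the dates on which the apple AMCI exceeds that of other land covers by more than 0.5 in every region) amounts to intersecting per-region sets of separable dates. The sketch below illustrates that logic; the function names, region labels, and AMCI values are hypothetical and were constructed for illustration, not read from Figure 8.

```python
def separable_days(apple, other, min_gap=0.5):
    """Days of year on which apple AMCI exceeds the other-class AMCI by more than min_gap."""
    return {doy for doy in apple if doy in other and apple[doy] - other[doy] > min_gap}


def common_window(regions, min_gap=0.5):
    """Sorted days of year that are separable in every region."""
    day_sets = [separable_days(r["apple"], r["other"], min_gap) for r in regions.values()]
    return sorted(set.intersection(*day_sets)) if day_sets else []


# Hypothetical AMCI time series (day of year -> value) for two regions.
regions = {
    "region_a": {
        "apple": {140: 2.0, 160: 3.0, 180: 3.2, 200: 2.9, 220: 2.1},
        "other": {140: 1.8, 160: 2.2, 180: 2.3, 200: 2.2, 220: 1.9},
    },
    "region_b": {
        "apple": {140: 2.6, 160: 3.1, 180: 3.0, 200: 2.8, 220: 2.0},
        "other": {140: 1.9, 160: 2.3, 180: 2.2, 200: 2.1, 220: 1.8},
    },
}
print(common_window(regions))
```

In this toy input, region_b is separable from DOY 140 onward but region_a is not, so the common window starts later; the intersection is what makes the window valid across regions.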

2.3.3. Apple Orchard Mapping Algorithm Based on RF

(1) Sample Generation
The quality of training samples directly affects the classification performance and generalization ability of the model. If samples can be automatically acquired, it would significantly reduce manual labor costs and minimize subjective human bias. Therefore, this study aims to achieve automated sample generation by leveraging the initial apple distribution map derived from AMCI and land cover products.
In this study, the classification system is divided into two categories: apples and others (Table 1). Based on the initial map, points were randomly generated, visually verified, and retained as apple samples. Specifically, the AMCI raster results were imported into ArcGIS, followed by post-processing steps including projection transformation, raster-to-polygon conversion, and polygon merging. Based on the processed results, random sample points were generated and visually inspected using Google Maps. Other samples were generated by overlaying the 30 m CLCD and Globeland30 datasets, selecting random points from overlapping areas, and resampling them to 10 m resolution. A 1:1 ratio between apple and other samples was maintained to ensure class balance. Finally, all samples were split into training and validation sets at a ratio of 7:3 for model training and evaluation [41,42].
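The class balancing and 7:3 split described above can be sketched as follows. The text does not say whether the split was stratified by class or which random seed was used, so the per-class split, the seed, the function name, and the point identifiers here are all assumptions made for illustration.

```python
import random


def balance_and_split(apple_pts, other_pts, train_frac=0.7, seed=42):
    """Downsample to a 1:1 class ratio, then split each class into train/validation."""
    rng = random.Random(seed)
    n = min(len(apple_pts), len(other_pts))
    cut = round(train_frac * n)
    train, valid = [], []
    for label, pts in (("apple", apple_pts), ("other", other_pts)):
        chosen = rng.sample(pts, n)  # random subset of size n, in random order
        train += [(p, label) for p in chosen[:cut]]
        valid += [(p, label) for p in chosen[cut:]]
    return train, valid


# Hypothetical point identifiers standing in for verified sample locations.
apple_points = [f"apple_{i}" for i in range(100)]
other_points = [f"other_{i}" for i in range(250)]
train_set, valid_set = balance_and_split(apple_points, other_points)
print(len(train_set), len(valid_set))
```

Splitting within each class keeps the 1:1 ratio in both the training and validation sets, which a single global shuffle would only preserve approximately.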
(2) Feature Selection
Feature selection is essential for accurate representation of sample information, and in this study spectral and topographic features were used as model inputs [43]. Spectral features are capable of fully capturing key information from remote sensing imagery, reflecting the unique spectral reflectance characteristics of crops. As the direct observational source of the imagery, original bands carry essential spectral information, and Sentinel-2 data are particularly rich in red-edge bands. Therefore, ten bands were selected as original band features (see Section 2.2.1). Spectral indices enhance the ability to distinguish target classes by integrating information across multiple bands. Based on the spectral analysis of apples, this study selected 15 commonly used spectral indices (Table 2) that are closely related to the red-edge region and chlorophyll content, and additionally incorporated well-established spectral indices for typical non-vegetation land-cover types (e.g., building and water). Topographic features directly influence the growing environment of apple trees by regulating the distribution of sunlight, water, and nutrients, making them one of the key factors determining the spatial distribution pattern of apple orchards. Therefore, in this study, four topographic features (see Section 2.2.4) were extracted to support and enhance the accuracy of apple identification.
(3) Construction of RF Model
Random forest [58] is a non-parametric machine learning model composed of multiple decision trees. Each tree is constructed by performing bootstrap sampling of the training data and randomly selecting a subset of features at each split, resulting in diversity among trees. This ensemble structure effectively reduces the risk of overfitting compared to a single decision tree, and provides superior classification performance. In addition, random forests are capable of automatically evaluating feature importance, demonstrating strong capabilities in both feature selection and ensemble learning. In the task of apple orchard mapping using the RF model, this study employed the ee.Classifier.smileRandomForest method on the GEE platform to generate a nationwide apple distribution map at a spatial resolution of 10 m. The RF model has been widely validated and successfully applied in large-scale crop classification tasks due to its robustness and effectiveness [11,41,59].
Parameter settings are critical to the classification performance of the RF model. The key parameters include the number of trees in the forest (n_estimators) and the number of features considered at each split (max_features). Increasing the number of trees helps reduce model variance and improve generalization ability; however, an excessively large number of trees can lead to higher computational costs. Numerous studies have shown that when n_estimators is set to 200, the model accuracy tends to stabilize. Therefore, in this study, n_estimators was set to 200. The max_features parameter is typically set to the square root of the total number of features by default. As 29 features were used for classification in this study, max_features was set to approximately 6.
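The parameter choice above can be checked with a few lines of arithmetic. The sketch below assumes that n_estimators and max_features correspond to the `numberOfTrees` and `variablesPerSplit` arguments of `ee.Classifier.smileRandomForest`; the dictionary name is hypothetical, and the Earth Engine call is left as a comment because it needs an authenticated session.

```python
import math

# Feature counts stated in the text: 10 bands, 15 indices, 4 topographic features.
n_features = 10 + 15 + 4
sqrt_features = math.sqrt(n_features)  # roughly 5.39

# Settings reported in the study, keyed by the assumed Earth Engine argument names.
rf_params = {
    "numberOfTrees": 200,    # n_estimators in the text
    "variablesPerSplit": 6,  # max_features in the text
}

# With an initialized Earth Engine session, the classifier could be built as:
# import ee
# classifier = ee.Classifier.smileRandomForest(**rf_params)

print(n_features, round(sqrt_features, 2), math.ceil(sqrt_features))
```

The square root of 29 lies between 5 and 6, so the reported value of 6 corresponds to rounding up; rounding to the nearest integer would give 5.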

2.3.4. Accuracy Assessment Metrics

This study evaluated the classification accuracy of the initial map generated by the AMCI method and the final map produced by the KAMF method. To assess the performance of both approaches, a confusion matrix was constructed using ground truth data (a total of 2014 samples, including 1007 apple samples), and relevant evaluation metrics were calculated. The spatial mapping results of apple distribution were assessed using multiple metrics, including User’s Accuracy (UA), Producer’s Accuracy (PA), F1-score, Overall Accuracy (OA), and the Kappa coefficient (Equations (5)–(9)) [2,60].
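$$\mathrm{UA} = \frac{T_{ii}}{\sum_{j=1}^{n} T_{ij}} \tag{5}$$
$$\mathrm{PA} = \frac{T_{ii}}{\sum_{j=1}^{n} T_{ji}} \tag{6}$$
$$F1_{i} = \frac{2 \times \dfrac{T_{ii}}{\sum_{j=1}^{n} T_{ji}} \times \dfrac{T_{ii}}{\sum_{j=1}^{n} T_{ij}}}{\dfrac{T_{ii}}{\sum_{j=1}^{n} T_{ji}} + \dfrac{T_{ii}}{\sum_{j=1}^{n} T_{ij}}} \tag{7}$$
$$\mathrm{OA} = \frac{\sum_{i=1}^{n} T_{ii}}{N} \tag{8}$$
$$\mathrm{Kappa} = \frac{\dfrac{\sum_{i=1}^{n} T_{ii}}{N} - \dfrac{\sum_{i=1}^{n} \left(\sum_{j=1}^{n} T_{ij}\right)\left(\sum_{j=1}^{n} T_{ji}\right)}{N^{2}}}{1 - \dfrac{\sum_{i=1}^{n} \left(\sum_{j=1}^{n} T_{ij}\right)\left(\sum_{j=1}^{n} T_{ji}\right)}{N^{2}}} \tag{9}$$
Note: $n$ is the dimension of the confusion matrix, representing the total number of classes in the classification task. $N$ is the total number of pixels or samples involved in the accuracy assessment. $T_{ij}$ is the number of pixels in the confusion matrix whose true class is $i$ and predicted class is $j$. $T_{ji}$ is the number of pixels in the confusion matrix whose true class is $j$ and predicted class is $i$. $T_{ii}$ is the number of correctly predicted pixels in class $i$.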
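The metrics in Equations (5)–(9) can be computed from a confusion matrix as in the sketch below. It follows the equations as written, so UA divides by the total over the second index of T and PA by the total over the first index. The function name and the 2 × 2 matrix are hypothetical and are not results from the study, and classes with an empty row or column (which would cause a division by zero) are not handled.

```python
def accuracy_metrics(T):
    """UA, PA, F1 per class plus OA and Kappa, following Equations (5)-(9)."""
    n = len(T)
    N = sum(sum(row) for row in T)
    sum_ij = [sum(T[i][j] for j in range(n)) for i in range(n)]  # sum over j of T_ij
    sum_ji = [sum(T[j][i] for j in range(n)) for i in range(n)]  # sum over j of T_ji
    ua = [T[i][i] / sum_ij[i] for i in range(n)]
    pa = [T[i][i] / sum_ji[i] for i in range(n)]
    f1 = [2 * pa[i] * ua[i] / (pa[i] + ua[i]) for i in range(n)]
    oa = sum(T[i][i] for i in range(n)) / N
    expected = sum(sum_ij[i] * sum_ji[i] for i in range(n)) / N ** 2
    kappa = (oa - expected) / (1 - expected)
    return {"UA": ua, "PA": pa, "F1": f1, "OA": oa, "Kappa": kappa}


# Hypothetical two-class matrix (class 0 = apple, class 1 = other).
example = [[90, 10],
           [20, 80]]
metrics = accuracy_metrics(example)
print(metrics)
```

For a balanced two-class validation set the chance-agreement term is close to 0.5, so Kappa is roughly 2 × OA − 1; this is consistent with the reported pair of 90.7% overall accuracy and a Kappa of 0.814.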

3. Results

3.1. Apple Planting Map of China

During the generation of the initial apple distribution map, ground truth sample points within the study area were used to assist in threshold determination. Based on these samples, the final threshold values were set as θ = −37 and β = 1.5. After generating the initial distribution map, random sample points were selected within the study area and visually interpreted. The validated points were then used as apple category inputs for the RF model. The apple orchard mapping results for the study area in 2022, generated using the KAMF method, are shown in Figure 9. China’s major apple-producing regions are primarily located around 27–31°N, specifically including southeastern Gansu Province, central Shaanxi Province, southwestern Shanxi Province, and the northeastern and southeastern parts of Shandong Province. These areas broadly encompass the country’s major apple-producing regions, which are predominantly characterized by hilly terrain.

3.2. Accuracy Assessment

Based on ground-truth sample data, this study constructed confusion matrices and performed accuracy assessments for both the AMCI and KAMF methods (Table 3 and Table 4). The samples included both apple and non-apple points. As shown in Table 3, the number of samples correctly classified by KAMF varies across the study areas. However, compared with AMCI, KAMF shows an overall improvement in the correct classification of both apples and other categories. As shown in Table 4, the AMCI method achieved relatively high OA and Kappa across all four major producing regions. After incorporating the RF model for classification, both OA and Kappa improved to varying degrees in each region, indicating that the proposed method is robust overall. Among the four major production regions, Shaanxi Province showed the largest improvement in accuracy, mainly because some apple orchards in this area are located on plateau landforms of the Loess Plateau, which were formed by erosional processes. During the NVPCI-based masking process, some of these orchard areas may have been mistakenly excluded, and the integration of the RF model helped to correct such omission errors. In terms of UA, PA, and F1-score, the KAMF method also achieved consistently higher values. The results demonstrate that the KAMF method outperforms the standalone AMCI approach in overall apple identification performance and exhibits stable cross-regional transferability.
In addition, based on the final apple distribution map, the apple cultivation area within the study region was estimated and compared with official statistical data. Within China’s major apple-producing regions, the predicted apple cultivation area in 2022 was 1,022,926 hectares. Compared to the officially reported area of 1,245,802 hectares for the same year and region, the consistency reached 82.11%.
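The area comparison can be reproduced with the figures given in the text. The sketch assumes that "consistency" means the mapped area divided by the reference area, which the text does not state explicitly; the variable names are hypothetical.

```python
# Figures taken from the text (hectares and share of national planting area).
national_area_ha = 1_955_733
study_area_share = 0.637
mapped_area_ha = 1_022_926

# Reference area for the four provinces, derived from the national statistic.
reference_area_ha = round(national_area_ha * study_area_share)

# Assumed definition of consistency: mapped area relative to reference area.
consistency_pct = mapped_area_ha / reference_area_ha * 100

print(reference_area_ha, round(consistency_pct, 2))
```

Under this definition the mapped area falls short of the statistical reference by about 18%, meaning that the map underestimates the official figure.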

3.3. Feature Importance Analysis

Feature selection plays a critical role in determining the performance of classification models. In this study, a total of 29 features were used to support apple classification with the RF algorithm. Figure 10 illustrates the feature importance results for apple identification across different regions (a–d) and at the national scale (e). Due to variations in environmental conditions, crop types, and cultivation practices among regions, the ranking of feature importance differs to some extent. Nevertheless, certain features consistently appear among the top ranks across all regions. From the perspective of large-scale mapping, the overall feature importance ranking of the entire study area provides a valuable reference. Therefore, the feature importance results from the four primary apple-producing zones were weighted and averaged, yielding the top fifteen most important features in the following order: SWIR2, Elevation, SWIR1, DVI, EVI, RE4, NIRv, Cire, RE1, RE2, RE3, MTCI, SAVI, GCVI, and BSI, as shown in the red box in Figure 10e.

4. Discussion

4.1. The Significance of Nationwide Apple Orchard Mapping

As a major cash crop in China, apples not only contribute to agricultural income but also play a vital role in improving the environment and supporting the implementation of the rural revitalization strategy [1]. In recent years, the apple industry has shifted from traditional cultivation to diversified development, enhancing its socioeconomic value. However, challenges such as labor migration, the industrial transformation of orchards, and land-use competition with staple crops have negatively affected apple cultivation and may threaten food security. Against this backdrop, accurate and timely information on the spatial distribution and planting area of apple orchards is of great significance for the relevant authorities in planting management, agricultural zoning, and macro-level decision-making. Recent research on large-scale crop mapping has increasingly focused on orchard types [36,61,62]. However, studies specifically targeting high-resolution, large-scale mapping of individual orchard types remain relatively limited [2,5]. High-resolution mapping of the main apple-producing areas in China can therefore not only provide information support for the development of the apple industry but also offer a methodological reference for subsequent apple recognition research at national and even global scales. It also provides a scalable approach and a technical foundation for the fine-scale mapping of other orchard types.

4.2. Contribution of the Combined Thresholding and RF Method

Large-scale crop mapping is often constrained by regional environmental heterogeneity and the difficulty of acquiring extensive and high-quality samples [63]. Numerous studies have demonstrated that both threshold-based methods and machine learning models such as RF can achieve satisfactory performance in large-area crop identification [64]. Threshold-based approaches integrate key agronomic knowledge, such as field samples, crop growth characteristics, and management practices, with multisource remote sensing data, such as spectral and radar information, which facilitates the extraction of useful features from imagery. These approaches typically construct formulas or frameworks that combine such features to facilitate target crop identification in a relatively automated manner [23,24,65]. However, due to the subjective nature of threshold or rule design, such methods may be prone to omission errors in practical applications. Random forest models, known for their robust classification performance, have been widely adopted in crop mapping. Nonetheless, they rely heavily on the quality of input features and training samples. The processes of feature selection and sample construction can be labor-intensive and may introduce subjective bias. Therefore, the scientific and rational construction of high-quality sample sets and feature sets is critical for enhancing the accuracy of machine learning-based crop identification methods [35,63].
Based on spectral analyses using field sample data, this study found that apple orchards exhibit high reflectance characteristics during the fruit-setting period in red-edge bands and the near-infrared (NIR) band, as well as in the EVI and GCVI. Accordingly, the AMCI was constructed and combined with NVPCI to generate an initial apple distribution map. This approach demonstrated robust performance across regions with varying environmental conditions. Since mature apple orchards show stable spatial patterns with little interannual variation, the proposed AMCI method is suitable for apple identification across different years [2]. The automatic sample generation method based on the initial distribution map [66] enables efficient extraction of apple samples and holds potential for constructing larger-scale and multi-year datasets. The proposed approach integrates threshold-based methods, which make full use of agronomic knowledge, with the RF algorithm’s ability to automatically extract deep features from remote sensing imagery [67]. The apple distribution results derived from the proposed framework are generally consistent with previously published apple distribution maps in smaller study areas [5]. This integration not only enables the automatic acquisition of high-quality training samples for model input but also offers a scientific foundation for feature selection, thereby reducing the omission errors often associated with threshold-based methods. This method improves crop classification accuracy, reduces manual effort, and provides a transferable solution for large-scale mapping.
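The sample generation step described above (masking natural vegetation, thresholding the index map, and randomly selecting candidate points for visual interpretation) can be sketched in a few lines. This is a simplified illustration rather than the implementation used in the study: the function name, the toy grid, the threshold value, and the assumption that higher index values indicate apple orchards are all hypothetical.

```python
import random


def draw_candidate_samples(index_map, natural_mask, threshold, n_points, seed=0):
    """Randomly pick candidate apple pixels from a thresholded index map.

    index_map:    2D list of index values (standing in for an AMCI raster)
    natural_mask: 2D list of booleans, True where natural vegetation was masked
    threshold:    assumed cut-off; pixels at or above it count as candidates
    """
    candidates = [
        (r, c)
        for r, row in enumerate(index_map)
        for c, value in enumerate(row)
        if value >= threshold and not natural_mask[r][c]
    ]
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return rng.sample(candidates, min(n_points, len(candidates)))


# Toy 3 x 3 example with placeholder values.
index_map = [[0.20, 0.70, 0.90], [0.80, 0.10, 0.60], [0.95, 0.40, 0.75]]
natural_mask = [[False, False, True], [False, False, False], [False, False, False]]
points = draw_candidate_samples(index_map, natural_mask, threshold=0.6, n_points=3)
print(points)
```

In the workflow described in the text, points drawn in this way would still be checked by visual interpretation before being used to train the RF classifier.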

4.3. Potential Factors and Future Work

The large-scale apple orchard mapping method proposed in this study demonstrates significant advantages over traditional approaches, yet there remains room for improvement. First, although the study conducted a spectral analysis of the main land cover types within the study area, the classification may still be affected by land cover types not included in the sample dataset. Because natural conditions are diverse and land cover structures are complex across regions, it is impractical to collect and analyze samples of every possible land cover type. A promising direction for future research is therefore to further explore the unique growth patterns and orchard management characteristics of apples and to progressively mask out other land cover types in remote sensing imagery. Second, this study primarily focuses on mature apple orchards, whereas the official statistical data also include young orchards that have not yet entered full production. As a result, the predicted apple planting area is lower than the official statistics. If future research can distinguish between orchards of different age classes, it would provide more valuable references for agricultural management and precision cultivation [4]. Third, there are some phenological differences among apple varieties. In this study, the same phenological stages of different apple varieties were merged by taking their intersection. A more detailed classification and discussion of common apple varieties, tree density, and other related factors would be a meaningful direction for future research. In addition, the factors influencing apple identification from remote sensing imagery that were explored in this study could guide further refinement of the identification procedure.
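Merging the phenological stages of several varieties by intersection can be illustrated with date windows. The sketch below assumes that each stage is represented as a (start, end) date pair; the variety labels and dates are hypothetical and are not taken from the study.

```python
from datetime import date


def intersect_windows(windows):
    """Return the date range shared by all (start, end) windows, or None."""
    start = max(window_start for window_start, _ in windows)
    end = min(window_end for _, window_end in windows)
    return (start, end) if start <= end else None


# Hypothetical windows of one phenological stage for two varieties.
stage_windows = {
    "variety_a": (date(2022, 5, 1), date(2022, 6, 5)),
    "variety_b": (date(2022, 5, 10), date(2022, 6, 20)),
}
common_window = intersect_windows(list(stage_windows.values()))
print(common_window)
```

Using the shared window keeps only the dates on which all varieties are in the same stage, at the cost of a shorter usable period.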
Finally, in this study, training samples were constructed by randomly generating sample points from the initial apple distribution map and manually verifying them through visual interpretation, a process that, although effective, inevitably involves certain subjective elements. Introducing a more objective and accurate method for cross-validation with AMCI to jointly optimize the construction of the training sample set will be a key future direction for improving the scientific rigor and accuracy of apple orchard mapping.

5. Conclusions

This study presents the KAMF method for large-scale apple orchard identification, implemented on the GEE platform using Sentinel-2 imagery. By deeply integrating agronomic knowledge, the AMCI was constructed to generate an initial apple distribution map, enabling the automatic and accurate construction of training samples. Without the need for manually labeled data, this framework facilitates the large-scale classification of apple orchards via RF modeling. Building on this approach, we produced the first 10 m resolution distribution map of major apple-producing regions in China. The method achieved an average overall accuracy of 90.67% across diverse agroecological zones and demonstrated notable spatial robustness. The estimated planting area achieved an 82.11% consistency with official statistics, thereby supporting the reliability of the approach. The integration of threshold-based extraction and RF classification combines the interpretability of agronomic rules with the predictive strength of machine learning. The KAMF method not only improves the automation and scalability of apple orchard mapping but also provides a valuable reference for the high-resolution, large-scale identification of other economic crops. This work offers a transferable solution for crop inventory, precision agriculture, and land resource management at regional and national scales.

Author Contributions

Conceptualization, C.W. and J.Y.; methodology, C.W. and J.Y.; project administration, J.Y. and D.M.; software, C.W. and H.Z.; supervision, S.Z., X.X. and K.T.; validation, H.Z., X.Z. and N.Z.; visualization, C.W. and H.Z.; writing—original draft, C.W.; writing—review and editing, C.W., J.Y. and D.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key Research and Development Program of China (Grant No. 2022YFB3900025-4); and the National Natural Science Foundation of China (Grant No. 42371379).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Chen, R.; Wang, J.; You, C.; Bai, R.; Feng, P.; Li, Y.; Zhao, G.; Qu, Z.; Li, Q. Characterizing apple production in China’s apple planting region: Biophysical simulation and machine learning analysis of quality determinants. Eur. J. Agron. 2025, 170, 127761.
  2. Wu, C.; Liu, Y.; Yang, J.; Dai, A.; Zhou, H.; Tang, K.; Zhang, Y.; Wang, R.; Wei, B.; Wang, Y. Large-Scale Apple Orchard Identification from Multi-Temporal Sentinel-2 Imagery. Agronomy 2025, 15, 1487.
  3. Du, Z.; Yu, L.; Chen, X.; Gao, B.; Yang, J.; Fu, H.; Gong, P. Land use/cover and land degradation across the Eurasian steppe: Dynamics, patterns and driving factors. Sci. Total Environ. 2024, 909, 168593.
  4. Zhu, Y.; Yang, G.; Yang, H.; Wu, J.; Lei, L.; Zhao, F.; Fan, L.; Zhao, C. Identification of Apple Orchard Planting Year Based on Spatiotemporally Fused Satellite Images and Clustering Analysis of Foliage Phenophase. Remote Sens. 2020, 12, 1199.
  5. Zhang, T.; Hu, D.; Wu, C.; Liu, Y.; Yang, J.; Tang, K. Large-scale apple orchard mapping from multi-source data using the semantic segmentation model with image-to-image translation and transfer learning. Comput. Electron. Agric. 2023, 213, 108204.
  6. Xia, T.; He, Z.; Cai, Z.; Wang, C.; Wang, W.; Wang, J.; Hu, Q.; Song, Q. Exploring the potential of Chinese GF-6 images for crop mapping in regions with complex agricultural landscapes. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102702.
  7. Li, M.; Zhang, R.; Luo, H.; Gu, S.; Qin, Z. Crop Mapping in the Sanjiang Plain Using an Improved Object-Oriented Method Based on Google Earth Engine and Combined Growth Period Attributes. Remote Sens. 2022, 14, 273.
  8. Lary, D.J.; Alavi, A.H.; Gandomi, A.H.; Walker, A.L. Machine learning in geosciences and remote sensing. Geosci. Front. 2016, 7, 3–10.
  9. Adrah, E.; Wong, J.P.; Yin, H. Integrating GEDI, Sentinel-2, and Sentinel-1 imagery for tree crops mapping. Remote Sens. Environ. 2025, 319, 114644.
  10. Faheem, Z.; Kazmi, J.H.; Shaikh, S.; Arshad, S.; Noreena Mohammed, S. Random forest-based analysis of land cover/land use LCLU dynamics associated with meteorological droughts in the desert ecosystem of Pakistan. Ecol. Indic. 2024, 159, 111670.
  11. Phalke, A.R.; Özdoğan, M.; Thenkabail, P.S.; Erickson, T.; Gorelick, N.; Yadav, K.; Congalton, R.G. Mapping croplands of Europe, Middle East, Russia, and Central Asia using Landsat, Random Forest, and Google Earth Engine. ISPRS J. Photogramm. Remote Sens. 2020, 167, 104–122.
  12. Zhang, P.; Du, P.; Guo, S.; Zhang, W.; Tang, P.; Chen, J.; Zheng, H. A novel index for robust and large-scale mapping of plastic greenhouse from Sentinel-2 images. Remote Sens. Environ. 2022, 276, 113042.
  13. Liu, D.; Chen, N.; Zhang, X.; Wang, C.; Du, W. Annual large-scale urban land mapping based on Landsat time series in Google Earth Engine and OpenStreetMap data: A case study in the middle Yangtze River basin. ISPRS J. Photogramm. Remote Sens. 2020, 159, 337–351.
  14. Dias, C.R.G.; Neves, A.K.; Silva, J.M.N.; Ribeiro, N.S.; Pereira, J.M.C. A Landsat-based burned area atlas (2000–2023) for the Niassa Special Reserve, Mozambique using U-Net deep learning. ISPRS J. Photogramm. Remote Sens. 2025, 230, 147–169.
  15. Turkoglu, M.O.; D’Aronco, S.; Perich, G.; Liebisch, F.; Streit, C.; Schindler, K.; Wegner, J.D. Crop mapping from image time series: Deep learning with multi-scale label hierarchies. Remote Sens. Environ. 2021, 264, 112603.
  16. Wang, B.; Zhao, H.; Wang, X.; Lyu, G.; Chen, K.; Xu, J.; Cui, G.; Zhong, L.; Yu, L.; Huang, H.; et al. Bamboo classification based on GEDI, time-series Sentinel-2 images and whale-optimized, dual-channel DenseNet: A case study in Zhejiang province, China. ISPRS J. Photogramm. Remote Sens. 2024, 209, 312–323.
  17. Yang, J.; Hu, Q.; Li, W.; Song, Q.; Cai, Z.; Zhang, X.; Wei, H.; Wu, W. An automated sample generation method by integrating phenology domain optical-SAR features in rice cropping pattern mapping. Remote Sens. Environ. 2024, 314, 114387.
  18. Yang, J.; Dong, J.; Liu, L.; Zhao, M.; Zhang, X.; Li, X.; Dai, J.; Wang, H.; Wu, C.; You, N.; et al. A robust and unified land surface phenology algorithm for diverse biomes and growth cycles in China by using harmonized Landsat and Sentinel-2 imagery. ISPRS J. Photogramm. Remote Sens. 2023, 202, 610–636.
  19. Xu, X.; Fu, D.; Su, F.; Lyne, V.; Yu, H.; Tang, J.; Hong, X.; Wang, J. Global distribution and decline of mangrove coastal protection extends far beyond area loss. Nat. Commun. 2024, 15, 10267.
  20. Shen, Y.; Zhang, X.; Tran, K.H.; Ye, Y.; Gao, S.; Liu, Y.; An, S. Near real-time corn and soybean mapping at field-scale by blending crop phenometrics with growth magnitude from multiple temporal and spatial satellite observations. Remote Sens. Environ. 2025, 318, 114605.
  21. Zhang, C.; Dong, J.; Ge, Q. IrriMap_CN: Annual irrigation maps across China in 2000–2019 based on satellite observations, environmental variables, and machine learning. Remote Sens. Environ. 2022, 280, 113184.
  22. Yang, M.; Guo, B.; Wang, J. A novel and robust method for large-scale single-season rice mapping based on phenology and statistical data. ISPRS J. Photogramm. Remote Sens. 2024, 213, 14–32.
  23. Chen, H.; Li, H.; Liu, Z.; Zhang, C.; Zhang, S.; Atkinson, P.M. A novel Greenness and Water Content Composite Index (GWCCI) for soybean mapping from single remotely sensed multispectral images. Remote Sens. Environ. 2023, 295, 113679.
  24. Xiao, G.; Huang, J.; Song, J.; Li, X.; Du, K.; Huang, H.; Su, W.; Miao, S. A novel soybean mapping index within the global optimal time window. ISPRS J. Photogramm. Remote Sens. 2024, 217, 120–133.
  25. Huang, Y.; Qiu, B.; Yang, P.; Wu, W.; Chen, X.; Zhu, X.; Xu, S.; Wang, L.; Dong, Z.; Zhang, J.; et al. National-scale 10 m annual maize maps for China and the contiguous United States using a robust index from Sentinel-2 time series. Comput. Electron. Agric. 2024, 221, 109018.
  26. Zhang, H.; Liu, W.; Zhang, L. Seamless and automated rapeseed mapping for large cloudy regions using time-series optical satellite imagery. ISPRS J. Photogramm. Remote Sens. 2022, 184, 45–62.
  27. Peng, Y.; Qiu, B.; Tang, Z.; Xu, W.; Yang, P.; Wu, W.; Chen, X.; Zhu, X.; Zhu, P.; Zhang, X.; et al. Where is tea grown in the world: A robust mapping framework for agroforestry crop with knowledge graph and sentinels images. Remote Sens. Environ. 2024, 303, 114016.
  28. Food and Agriculture Organization of the United Nations. Available online: https://www.fao.org/home/zh (accessed on 5 October 2025).
  29. National Bureau of Statistics of China. Available online: https://www.stats.gov.cn/sj/ (accessed on 20 August 2025).
  30. Longjie, Y.; Qihao, W. A hybrid neural network for mangrove mapping considering tide states using Sentinel-2 imagery. Remote Sens. Environ. 2025, 329, 114917.
  31. Yang, J.; Huang, X. The 30 m annual land cover dataset and its dynamics in China from 1990 to 2019. Earth Syst. Sci. Data 2021, 13, 3907–3925.
  32. The 30-Meter Global Land cover Data GlobeLand30. Available online: http://www.webmap.cn/mapDataAction.do?method=globalLandCover (accessed on 20 August 2025).
  33. Huang, H.; Liu, C.; Wang, X.; Biging, G.S.; Chen, Y.; Yang, J.; Gong, P. Mapping vegetation heights in China using slope correction ICESat data, SRTM, MODIS-derived and climate data. ISPRS J. Photogramm. Remote Sens. 2017, 129, 189–199.
  34. National Data of National Bureau of Statistics of China. Available online: https://data.stats.gov.cn/easyquery.htm?cn=C01 (accessed on 6 October 2025).
  35. Zhang, C.; Zhang, H.; Tian, S. Phenology-assisted supervised paddy rice mapping with the Landsat imagery on Google Earth Engine: Experiments in Heilongjiang Province of China from 1990 to 2020. Comput. Electron. Agric. 2023, 212, 108105.
  36. Wu, C.; Jia, W.; Yang, J.; Zhang, T.; Dai, A.; Zhou, H. Economic Fruit Forest Classification Based on Improved U-Net Model in UAV Multispectral Imagery. Remote Sens. 2023, 15, 2500.
  37. Li, D.; Croft, H.; Duveiller, G.; Schreiner-McGraw, A.P.; Belwalkar, A.; Cheng, T.; Zhu, Y.; Cao, W.; Yu, K. Global retrieval of canopy chlorophyll content from Sentinel-3 OLCI TOA data using a two-step upscaling method integrating physical and machine learning models. Remote Sens. Environ. 2025, 328, 114845.
  38. Delalieux, S.; Somers, B.; Hereijgers, S.; Verstraeten, W.W.; Keulemans, W.; Coppin, P. A near-infrared narrow-waveband ratio to determine Leaf Area Index in orchards. Remote Sens. Environ. 2008, 112, 3762–3772.
  39. Liu, H.Q.; Huete, A. A feedback based modification of the NDVI to minimize canopy background and atmospheric noise. IEEE Trans. Geosci. Remote Sens. 1995, 33, 457–465.
  40. Gitelson, A.A.; Gritz, Y.; Merzlyak, M.N. Relationships between leaf chlorophyll content and spectral reflectance and algorithms for non-destructive chlorophyll assessment in higher plant leaves. J. Plant Physiol. 2003, 160, 271–282.
  41. Wang, M.; Mao, D.; Wang, Y.; Xiao, X.; Xiang, H.; Feng, K.; Luo, L.; Jia, M.; Song, K.; Wang, Z. Wetland mapping in East Asia by two-stage object-based Random Forest and hierarchical decision tree algorithms on Sentinel-1/2 images. Remote Sens. Environ. 2023, 297, 113793.
  42. Gibson, R.; Danaher, T.; Hehir, W.; Collins, L. A remote sensing approach to mapping fire severity in south-eastern Australia using sentinel 2 and random forest. Remote Sens. Environ. 2020, 240, 111702.
  43. Cheng, F.; Qiu, B.; Yang, P.; Wu, W.; Yu, Q.; Qian, J.; Wu, B.; Chen, J.; Chen, X.; Tubiello, F.N.; et al. Crop sample prediction and early mapping based on historical data: Exploration of an explainable FKAN framework. Comput. Electron. Agric. 2025, 237, 110689.
  44. Fitzgerald, G.; Rodriguez, D.; O’Leary, G. Measuring and predicting canopy nitrogen nutrition in wheat using a spectral index—The canopy chlorophyll content index (CCCI). Field Crops Res. 2010, 116, 318–324.
  45. Jordan, C.F. Derivation of Leaf-Area Index from Quality of Light on the Forest Floor. Ecology 1969, 50, 663–666.
  46. Rouse, J.; Haas, R.; Schell, J.; Deering, D. Monitoring Vegetation Systems in the Great Plains with ERTS. Proc. Earth Resour. Technol. Satell. Symp. 1973, 1, 309–317.
  47. Xiao, X.; Boles, S.; Liu, J.; Zhuang, D.; Frolking, S.; Li, C.; Salas, W.; Moore, B. Mapping paddy rice agriculture in southern China using multi-temporal MODIS images. Remote Sens. Environ. 2005, 95, 480–492.
  48. Buschmann, C.; Nagel, E. In vivo spectroscopy and internal optics of leaves as basis for remote sensing of vegetation. Int. J. Remote Sens. 1993, 14, 711–722.
  49. Merzlyak, M.N.; Gitelson, A.A.; Chivkunova, O.B.; Rakitin, V.Y. Non-destructive optical detection of pigment changes during leaf senescence and fruit ripening. Physiol. Plant. 1999, 106, 135–141.
  50. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
  51. Badgley, G.; Field, C.B.; Berry, J.A. Canopy near-infrared reflectance and terrestrial photosynthesis. Sci. Adv. 2017, 3, e1602244.
  52. Huete, A.; Justice, C.; Liu, H. Development of vegetation and soil indices for MODIS-EOS. Remote Sens. Environ. 1994, 49, 224–234.
  53. Dash, J.; Curran, P.J. The MERIS terrestrial chlorophyll index. Int. J. Remote Sens. 2004, 25, 5403–5413.
  54. Gitelson, A.A.; Viña, A.; Arkebauer, T.J.; Rundquist, D.C.; Keydan, G.; Leavitt, B. Remote estimation of leaf area index and green leaf biomass in maize canopies. Geophys. Res. Lett. 2003, 30, 1248.
  55. Penuelas, J.; Baret, F.; Filella, I. Semi-empirical indices to assess carotenoids/chlorophyll a ratio from leaf spectral reflectance. Photosynthetica 1995, 31, 221–230. Available online: https://cdk.lib.cas.cz/client/handle/uuid:0945cc3e-0232-49a4-b687-78524bdd60bf (accessed on 20 August 2025).
  56. Zha, Y.; Gao, J.; Ni, S. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int. J. Remote Sens. 2003, 24, 583–594.
  57. McFeeters, S.K. The use of the normalized difference water index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432.
  58. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  59. Mpakairi, K.S.; Dube, T.; Sibanda, M.; Mutanga, O. Fine-scale characterization of irrigated and rainfed croplands at national scale using multi-source data, random forest, and deep learning algorithms. ISPRS J. Photogramm. Remote Sens. 2023, 204, 117–130.
  60. Xie, Y.; Nhu, A.N.; Song, X.; Jia, X.; Skakun, S.; Li, H.; Wang, Z. Accounting for spatial variability with geo-aware random forest: A case study for US major crop mapping. Remote Sens. Environ. 2025, 319, 114585.
  61. Chen, R.; Yang, H.; Liu, W.; Liu, M.; Qi, N.; Feng, H.; Zhang, C.; Xu, H.; Yang, G. An orchard mapping index and mapping algorithm coupling orchard phenology and green-holding characteristics from time-series sentinel-2 images. Comput. Electron. Agric. 2024, 226, 109437.
  62. Han, L.; Wang, X.; Li, D.; Yu, W.; Feng, Z.; Lu, X.; Wang, S.; Zhang, Z.; Gao, X.; Fan, J. A Novel Approach to Mapping the Spatial Distribution of Fruit Trees Using Phenological Characteristics. Agronomy 2024, 14, 150.
  63. Li, Z.; Xuan, F.; Dong, Y.; Huang, X.; Liu, H.; Zeng, Y.; Su, W.; Huang, J.; Li, X. Performance of GEDI data combined with Sentinel-2 images for automatic labelling of wall-to-wall corn mapping. Int. J. Appl. Earth Obs. Geoinf. 2024, 127, 103643.
  64. Chen, Z.; Zhang, H.; Zhang, M.; Wu, Y.; Liu, Y. Mangrove mapping in China using Gaussian mixture model with a novel mangrove index (SSMI) derived from optical and SAR imagery. ISPRS J. Photogramm. Remote Sens. 2024, 218, 466–486.
  65. Tian, J.; Wang, L.; Diao, C.; Zhang, Y.; Jia, M.; Zhu, L.; Xu, M.; Li, X.; Gong, H. National scale sub-meter mangrove mapping using an augmented border training sample method. ISPRS J. Photogramm. Remote Sens. 2025, 220, 156–171.
  66. Zheng, Z.; Yu, J.; Zhang, X.; Du, S. Development of a 30m resolution global sand dune/sheet classification map (GSDS30) using multi-source remote sensing data. Remote Sens. Environ. 2024, 302, 113973.
  67. Miao, C.; Fu, S.; Sun, W.; Feng, S.; Hu, Y.; Liu, J.; Feng, Q.; Li, Y.; Liang, T. Large-scale mapping of the spatial distribution and cutting intensity of cultivated alfalfa based on a sample generation algorithm and random forest. Comput. Electron. Agric. 2025, 237, 110613.
Figure 1. Major apple-producing countries and study area (Data Source: Food and Agriculture Organization of the United Nations, FAO).
Figure 2. Flowchart of apple orchard mapping.
Figure 3. Phenological calendar of major crops.
Figure 4. Time-series NDVI curves of typical land cover types.
Figure 5. Comparison of fruit tree leaves. (a) Apple; (b) Pear; (c) Peach.
Figure 6. Time-series curves of spectral bands of typical land cover types.
Figure 7. Spectral index time-series curves of typical land cover types.
Figure 8. AMCI time-series curves for major land cover types.
Figure 9. Spatial distribution of apple plantation in major producing regions of China (2022).
Figure 10. Feature importance rankings for apple classification across different regions and the national scale. (a) Shandong province; (b) Shanxi province; (c) Shaanxi province; (d) Gansu province; (e) Entire study area.
Table 1. Classification system.

| Category | High-Resolution Remote Sensing Imagery |
| --- | --- |
| Apples | (image) |
| Others | (image) |
Table 2. Spectral indices used in the RF model.

| Index Name | Calculation |
| --- | --- |
| Enhanced Vegetation Index (EVI) [39] | $2.5 \times \frac{nir - red}{nir + 6\,red - 7.5\,blue + 1}$ |
| Ratio Vegetation Index (RVI) [44] | $\frac{nir}{red}$ |
| Difference Vegetation Index (DVI) [45] | $nir - red$ |
| Normalized Difference Vegetation Index (NDVI) [46] | $\frac{nir - red}{nir + red}$ |
| Land Surface Water Index (LSWI) [47] | $\frac{nir - swir1}{nir + swir1}$ |
| Green Normalized Difference Vegetation Index (GNDVI) [48] | $\frac{nir - green}{nir + green}$ |
| Green Chlorophyll Vegetation Index (GCVI) [40] | $\frac{nir}{green} - 1$ |
| Plant Senescence Reflectance Index (PSRI) [49] | $\frac{red - green}{nir}$ |
| Soil Adjusted Vegetation Index (SAVI) [50] | $1.5 \times \frac{nir - red}{nir + red + 0.5}$ |
| Near-Infrared Reflectance of Vegetation (NIRv) [51] | $\frac{nir - red}{nir + red} \times nir$ |
| Normalized Difference Red Edge Index (NDRE) [44] | $\frac{nir - re1}{nir + re1}$ |
| Bare Soil Index (BSI) [52] | $\frac{(red + swir1) - (nir + blue)}{(red + swir1) + (nir + blue)}$ |
| MERIS Terrestrial Chlorophyll Index (MTCI) [53] | $\frac{re2 - re1}{re1 - red}$ |
| Chlorophyll Index Red Edge (Cire) [54] | $\frac{nir}{re1} - 1$ |
| Structure Insensitive Pigment Index (SIPI) [55] | $\frac{nir - blue}{nir - red}$ |
| Normalized Difference Built-up Index (NDBI) [56] | $\frac{swir1 - nir}{swir1 + nir}$ |
| Normalized Difference Water Index (NDWI) [57] | $\frac{green - nir}{green + nir}$ |
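As an illustration of how several of the indices in Table 2 are computed from band reflectances, the following Python sketch evaluates a subset of them for a single pixel. It follows the standard published definitions of these indices; the function name, the dictionary keys, and the reflectance values are assumptions made for this example and do not come from the study.

```python
def spectral_indices(bands):
    """Compute a subset of the Table 2 indices from surface reflectances.

    bands: dict with keys nir, red, green, blue, re1, re2, swir1
           (key names are hypothetical; values are reflectances in 0-1).
    """
    nir, red = bands["nir"], bands["red"]
    green, blue = bands["green"], bands["blue"]
    re1, re2, swir1 = bands["re1"], bands["re2"], bands["swir1"]
    ndvi = (nir - red) / (nir + red)
    return {
        "NDVI": ndvi,
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),
        "GCVI": nir / green - 1,
        "SAVI": 1.5 * (nir - red) / (nir + red + 0.5),
        "NIRv": ndvi * nir,
        "Cire": nir / re1 - 1,
        "MTCI": (re2 - re1) / (re1 - red),
        "BSI": ((red + swir1) - (nir + blue)) / ((red + swir1) + (nir + blue)),
    }


# Placeholder reflectances for a vegetated pixel.
pixel = {"nir": 0.40, "red": 0.05, "green": 0.08, "blue": 0.04,
         "re1": 0.12, "re2": 0.30, "swir1": 0.20}
indices = spectral_indices(pixel)
print({name: round(value, 3) for name, value in indices.items()})
```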
Table 3. Apple orchard mapping confusion matrix in the study area.

| Area | Method | Category | Apples | Others |
| --- | --- | --- | --- | --- |
| Shandong Province | AMCI | Apples | 55 | 16 |
| | | Others | 31 | 70 |
| | KAMF | Apples | 78 | 5 |
| | | Others | 8 | 81 |
| Shanxi Province | AMCI | Apples | 296 | 91 |
| | | Others | 87 | 292 |
| | KAMF | Apples | 320 | 25 |
| | | Others | 63 | 358 |
| Shaanxi Province | AMCI | Apples | 216 | 97 |
| | | Others | 82 | 201 |
| | KAMF | Apples | 263 | 21 |
| | | Others | 35 | 277 |
| Gansu Province | AMCI | Apples | 145 | 17 |
| | | Others | 95 | 223 |
| | KAMF | Apples | 205 | 7 |
| | | Others | 35 | 233 |
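Table 4. Apple orchard mapping accuracy in the study area.

| Area | Method | Category | UA | PA | F1-Score | OA | Kappa |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Shandong Province | AMCI | Apples | 0.816 | 0.826 | 0.821 | 0.820 | 0.640 |
| | | Others | 0.824 | 0.814 | 0.819 | | |
| | KAMF | Apples | 0.907 | 0.940 | 0.923 | 0.924 | 0.849 |
| | | Others | 0.942 | 0.910 | 0.926 | | |
| Shanxi Province | AMCI | Apples | 0.764 | 0.772 | 0.768 | 0.768 | 0.535 |
| | | Others | 0.770 | 0.763 | 0.767 | | |
| | KAMF | Apples | 0.835 | 0.928 | 0.879 | 0.885 | 0.770 |
| | | Others | 0.935 | 0.850 | 0.890 | | |
| Shaanxi Province | AMCI | Apples | 0.690 | 0.725 | 0.707 | 0.700 | 0.399 |
| | | Others | 0.711 | 0.675 | 0.692 | | |
| | KAMF | Apples | 0.882 | 0.926 | 0.904 | 0.906 | 0.812 |
| | | Others | 0.930 | 0.888 | 0.908 | | |
| Gansu Province | AMCI | Apples | 0.895 | 0.604 | 0.722 | 0.767 | 0.533 |
| | | Others | 0.702 | 0.929 | 0.801 | | |
| | KAMF | Apples | 0.854 | 0.967 | 0.907 | 0.913 | 0.825 |
| | | Others | 0.971 | 0.869 | 0.917 | | |
| Average | AMCI | Apples | 0.791 | 0.732 | 0.755 | 0.764 | 0.527 |
| | | Others | 0.752 | 0.795 | 0.770 | | |
| | KAMF | Apples | 0.870 | 0.940 | 0.903 | 0.907 | 0.814 |
| | | Others | 0.945 | 0.879 | 0.910 | | |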
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wu, C.; Yang, J.; Zhou, H.; Zhang, S.; Xiao, X.; Tang, K.; Zhang, X.; Zhang, N.; Ming, D. Apple Orchard Mapping in China Based on an Automatic Sample Generation Algorithm and Random Forest. Remote Sens. 2025, 17, 3449. https://doi.org/10.3390/rs17203449
