Article

Automatic Extraction of Discolored Tree Crowns Based on an Improved Faster-RCNN Algorithm
Haoyang Ma, Banghui Yang, Ruirui Wang, Qiang Yu, Yaoyao Yang and Jiahao Wei

1 College of Forestry, Beijing Forestry University, Beijing 100083, China
2 Beijing Key Laboratory of Precision Forestry, Beijing Forestry University, Beijing 100083, China
3 National Engineering Research Center for Geoinformatics, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
* Authors to whom correspondence should be addressed.
Forests 2025, 16(3), 382; https://doi.org/10.3390/f16030382
Submission received: 11 January 2025 / Revised: 14 February 2025 / Accepted: 17 February 2025 / Published: 20 February 2025
(This article belongs to the Section Forest Health)

Abstract

The precise prevention and control of forest pests and diseases has long been a research focus in ecological environmental protection. With the continuous advancement of sensor technology, the fine-grained identification of discolored tree crowns based on UAV imagery has become increasingly important in forest monitoring. Existing deep learning models face challenges such as prolonged training time and low recognition accuracy when identifying discolored tree crowns caused by pests or diseases in airborne images. To address these issues, this study improves the Faster-RCNN model by using Inception-ResNet-V2 as the feature extractor, replacing the traditional VGG16 feature extractor, with the aim of enhancing the accuracy of discolored tree crown recognition. Experiments and analyses were conducted using UAV aerial imagery from Changbai Mountain, Jilin Province. The improved model effectively identified discolored tree crowns caused by pine wood nematodes, achieving a precision of 90.22%, a mean average precision (mAP) of 83.63%, and a recall rate of 92.33%. Compared with the traditional VGG16-based Faster-RCNN model, precision increased by 4.68%, recall by 10.11%, and mAP by 5.23%, significantly enhancing the recognition of discolored tree crowns. This method provides crucial technical support and a scientific basis for the prevention and control of forest pests and diseases, facilitating the early detection and precise management of forest pest outbreaks.

1. Introduction

The protection of forest ecological resources is a core task in forestry, with pest and disease control playing a critical role in safeguarding these resources. In recent years, forest pests, particularly pine wilt disease, have spread rapidly and caused severe damage, posing significant threats to global forest ecological security [1]. The pine wood nematode, Bursaphelenchus xylophilus (Steiner & Buhrer) Nickle, invades pine trees and causes crown discoloration, a key indicator for identifying areas affected by the disease [2].
In addition to these direct effects, recent studies have emphasized the importance of leveraging advanced deep learning techniques to enhance pest and disease identification [3]. The fine-grained identification of discolored tree crowns is crucial for controlling pine wilt disease. Timely detection can effectively identify areas of pest spread, offering scientific guidance for targeted pest control measures [4,5,6]. With the continuous development of remote sensing technologies, especially UAV-based imagery, coupled with advancements in deep learning models, efficient identification of discolored tree crowns has become a critical tool in pest management [7,8,9,10]. For instance, Li et al. demonstrated that convolutional neural networks can accurately detect pests in forestry by analyzing high-resolution aerial imagery, underscoring the potential of UAV-based approaches [11].
Significant progress has been made in applying various algorithms to detect and identify forest pests. Traditional methods based on feature extraction, such as color and texture analysis, have been extended with more sophisticated machine learning and deep learning models. For example, Wu Qiong et al. utilized color and texture features for identifying pest-affected areas, providing robust support for pest analysis [12]. Hu Gensheng et al. employed weighted support vector machines to extract pest-affected pine trees, thereby improving the accuracy of pest detection through machine learning techniques [13]. More recent approaches, such as Zhang Ruirui et al.’s use of U-Net for segmenting UAV high-resolution aerial imagery, have advanced the field by automating identification with improved precision [14]. Moreover, Zhang and Guo reported on the effective use of deep convolutional networks for both pest monitoring and disease classification in forests using UAV imagery [15].
Several deep learning algorithms have been successfully adapted to identify discolored tree crowns in UAV imagery. Huang Liming et al. applied the YOLO algorithm to automatically detect discolored crowns, enhancing YOLOv4 with depthwise separable convolution and inverted residual structures to improve both recognition accuracy and computational efficiency [16]. Similarly, convolutional neural networks (CNNs) and region-based CNNs like Faster-RCNN have shown strong performance in object detection tasks, including pest and disease identification in forestry [16,17,18]. However, limitations remain regarding computational efficiency and the full utilization of high-resolution UAV imagery. Models such as Faster-RCNN, despite their high accuracy, often struggle with complex network structures that lead to slow processing times, especially when using heavy backbone networks like VGG-16 [19,20,21,22,23].
To address these challenges, this study improves the Faster-RCNN model by adopting Inception-ResNet-V2 as the feature extractor to boost both computational efficiency and recognition accuracy [24]. Recent work by Rai and Jain also highlights the benefits of multi-scale convolutional neural networks for forest disease detection from high-resolution UAV images, further supporting the improvements proposed in this study [25]. The approach aims to enhance the model’s ability to represent detailed features while simplifying the network architecture and reducing feature quantization errors. Optimization techniques, such as kernel decomposition and delayed downsampling, are incorporated into the ResNet-50 backbone, resulting in a more efficient network with improved performance. Additionally, an improved ROI Align layer is introduced to refine the accuracy of region-of-interest extraction, and attention mechanisms are integrated across the feature extraction, region proposal, and classification stages to enhance focus on critical regions.
In addition to algorithmic improvements, a crucial factor in forest pest management is the labor cost involved in identifying affected plants. Traditional methods, such as manual inspection and field surveys, are labor-intensive and time-consuming, requiring significant human resources for large-scale monitoring. These approaches increase operational costs and limit the scalability of pest detection.
In contrast, automated systems using machine learning, particularly deep learning, can significantly reduce labor costs by processing UAV or satellite imagery faster than manual methods. Studies have shown that such systems minimize the need for fieldwork, thereby lowering costs while improving detection accuracy. For example, Reddy and Yang (2018) found that deep learning models for pest detection reduced labor requirements and enhanced efficiency [26], while Ghosal and Ghosh (2019) highlighted how automation improved both cost-effectiveness and accuracy in forestry pest management [27].
Experimental validation using UAV high-resolution aerial imagery from multiple self-constructed datasets demonstrates that the optimized model outperforms existing methods in both recognition accuracy and computational efficiency. This study provides a robust solution for the fine-grained identification of discolored tree crowns, offering valuable technical support for forest pest prevention and control.

2. Materials and Methods

2.1. Study Area and Data Acquisition

The experimental data for this study were collected from the Baihe Protection and Management Station, located in the northern part of the Changbai Mountain National Nature Reserve in Jilin Province, China. The study area is situated in the eastern region of the nature reserve, with an elevation range of 700 to 1000 m. The forest resources in the area primarily consist of mixed stands of Korean pine (Pinus koraiensis Sieb. et Zucc.) and broadleaf trees, as shown in Figure 1.
During the UAV aerial photography, the flight altitude was set at 500 m, with a side overlap of approximately 45% and a forward overlap of 65%, ensuring sufficient coverage and minimizing gaps in data capture. Over 2900 high-resolution images were obtained, covering a total area of 11.8 square kilometers at a ground resolution of approximately 0.03 m. Each image consists of three spectral bands (red, green, and blue; RGB true color composition), which provide the rich color and texture detail critical for analyzing tree crown features. The total size of the collected image data is 37.4 GB, as shown in Figure 2.
The high-resolution UAV imagery offers a detailed visual representation of the study area, enabling precise identification of discolored tree crowns associated with pests such as pine wilt nematodes. This dataset forms the foundation for applying and validating the proposed deep learning model for fine-grained pest and disease detection.

2.2. Data Preprocessing

To ensure the UAV imagery was suitable for analysis, several preprocessing steps were undertaken. First, the exterior orientation parameters of all images captured during the flight were obtained, followed by the determination of the interior orientation parameters of the Phase One camera. Using the bundle adjustment method in aerial triangulation, digital orthophoto images (DOIs) were generated based on the interior orientation, relative orientation, and epipolar image matching [28].
Subsequent radiometric calibration and atmospheric correction were performed to ensure the imagery was free from distortions or noise. These steps ensured that the images were accurately georeferenced and had consistent geometric shapes, reflecting the real-world conditions of the study area. The generation of digital orthophotos eliminated image tilt and distortion, making the images more suitable for analysis. Additionally, radiometric calibration and atmospheric correction reduced atmospheric interference and sensor noise, enhancing image clarity and contrast. The preprocessed images exhibited higher saturation and color accuracy, allowing for better discernment of features, such as tree crown details, as shown in Figure 3.
As shown in Figure 3, a comparison between the preprocessed images (a) and raw UAV images (b) illustrates the improvement in clarity and detail, particularly in the discernment of tree crown features after the preprocessing steps were applied.

2.3. Model Construction and Optimization

The core methods of this study involve five key stages: dataset construction, backbone network development, model optimization, model training, and accuracy validation. To accurately identify discolored tree crowns caused by pine wilt disease, a high-quality multispectral UAV imagery dataset was assembled. This dataset underwent rigorous preprocessing and annotation to ensure its quality and representativeness, with careful attention to the elimination of potential distortions and noise from the original imagery [29].
For the model development, we adopted the Faster-RCNN framework as the base model and developed a more efficient backbone network using Inception-ResNet-V2. Several critical optimizations were incorporated to enhance the model’s performance. These optimizations included modifications to the ROI Align layer, reducing quantization errors, as well as the introduction of attention mechanisms to enable the model to focus more effectively on key regions within the images, such as discolored tree crowns. Additionally, appropriate loss functions and optimizers were selected to ensure efficient model training and improve recognition accuracy.
Compared to traditional RCNN models, the optimized Faster-RCNN demonstrated superior accuracy and efficiency, especially in recognizing small targets and managing complex backgrounds. These enhancements significantly improved the model’s ability to detect discolored tree crowns, offering valuable technical support for forest pest management and early disease detection. The entire research methodology is outlined in Figure 4.

2.3.1. Dataset Construction

The primary task in constructing a deep learning object detection model is to create a high-quality dataset that meets the requirements of model training and validation. To address the identification needs of discolored tree crowns caused by pine wilt disease, the collected multispectral UAV images were carefully annotated. The annotation process involved marking the specific locations and bounding box information of discolored tree crowns, typically using rectangular bounding boxes to delineate the target areas. This step aimed to provide standardized training data, ensuring that the object detection model could effectively learn feature patterns.
The annotation process used LabelImg, a widely adopted tool for object detection tasks known for its efficiency and intuitive interface [30]. During this process, each UAV image was reviewed frame by frame, and the discolored tree crowns were individually marked, generating the corresponding annotation files. Each annotation file was saved in XML format, containing key information such as target categories and position coordinates. These files were compatible with the input requirements of deep learning object detection models. The dataset was then divided into three subsets: training set, validation set, and testing set.
A total of 200 high-resolution images were used in this study, divided as follows: a training set of 140 images (70% of the total), a validation set of 30 images (15%), and a testing set of 30 images (15%).
This split ensured that the model had sufficient data for learning while maintaining separate datasets for performance evaluation. The resulting dataset, consisting of annotated XML files and original images, formed a comprehensive foundation for model training and testing. The annotation process and a sample labeled image are shown in Figure 5.
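As an illustration of this annotation and splitting step, the sketch below parses one LabelImg (Pascal VOC) XML file and performs the 70/15/15 split; the file naming scheme and the "discolored_crown" label string are hypothetical placeholders, not taken from the paper.

```python
# Minimal sketch, assuming LabelImg's Pascal VOC XML output; file names and
# the "discolored_crown" label are hypothetical placeholders.
import random
import xml.etree.ElementTree as ET

def read_voc_boxes(xml_path: str):
    """Return (label, (xmin, ymin, xmax, ymax)) pairs from one annotation file."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        coords = tuple(int(bb.find(k).text) for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.find("name").text, coords))  # e.g., ("discolored_crown", ...)
    return boxes

# 70/15/15 split of the 200 annotated images, mirroring the counts above.
image_ids = [f"img_{i:03d}" for i in range(200)]
random.seed(42)                      # fixed seed for a reproducible split
random.shuffle(image_ids)
train_ids = image_ids[:140]          # 140 images (70%)
val_ids = image_ids[140:170]         # 30 images (15%)
test_ids = image_ids[170:]           # 30 images (15%)
```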

2.3.2. Model Architecture and Optimization

To construct a Faster-RCNN model capable of efficiently extracting discolored tree crowns, this study employed ResNet50 FPN V2 as the backbone feature extraction network. Several key optimizations were applied to enhance the model's performance, including adjustments to anchor box sizes and ratios, improvements to the ROI Align layer, and the introduction of attention mechanisms. Combined with carefully chosen activation and loss functions, these optimizations form a complete and efficient object detection framework tailored to identifying discolored tree crowns caused by pine wilt disease.
Compared with the traditional VGG-16 feature extractor, ResNet50 FPN V2 achieved higher accuracy and lower loss owing to its deeper architecture and feature pyramid design, which captures the multi-scale features that are critical for detecting crown discoloration against complex forest backgrounds.
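As a hedged illustration of this setup, the sketch below builds a detector with torchvision's off-the-shelf ResNet50 FPN V2 variant of Faster R-CNN; the two-class configuration (background plus one discolored-crown class) is an assumption based on the single target category in this study.

```python
# Minimal sketch using torchvision's built-in Faster R-CNN with the
# ResNet50 FPN V2 backbone (available in torchvision >= 0.13). The
# two-class setup is an assumption: 0 = background, 1 = discolored crown.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn_v2

model = fasterrcnn_resnet50_fpn_v2(weights=None, num_classes=2)
model.eval()

# One dummy 3-band (RGB) tile standing in for a real UAV image crop.
images = [torch.rand(3, 800, 800)]
with torch.no_grad():
    detections = model(images)  # list of dicts with "boxes", "labels", "scores"
print(detections[0]["boxes"].shape)
```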
The ResNet50 architecture is composed of residual blocks that let gradients flow directly through skip connections, alleviating the vanishing gradient problem in deeper networks. This structure is crucial for tasks like tree crown detection, where high-resolution features from deep layers are essential for accurate localization. The feature pyramid network (FPN) integrated into ResNet50 FPN V2 builds a feature pyramid from different network layers, enabling the detection of discolored tree crowns whose apparent size varies across images against the complex forest background. The model construction process involves the key stages outlined below.
(1) Input Data: High-resolution images of the study area, typically preprocessed for noise reduction and normalization, are fed into the network.
(2) Residual Blocks: These blocks allow the network to learn deep representations through skip connections. Each residual block applies a transformation F(x) whose result is added back to the input x:

y = F(x) + x

This design avoids the vanishing gradient problem, enabling more stable training in deeper networks (a minimal sketch of such a block follows this list).
(3) Feature Extraction with FPN: The FPN component aggregates features from various layers to capture both fine-grained details and coarse features. By upsampling the higher-level (coarser) features and merging them with lower-level features through lateral connections, the network creates a multi-scale representation, improving the detection of targets with varying scales, such as tree crowns in forest environments.
(4) Output Features: After passing through the residual blocks and FPN, the feature maps are used by the region proposal network (RPN) to generate candidate bounding boxes, which are further processed to refine the localization and classification of discolored tree crowns.
To adapt the model to the characteristics of discolored tree crowns, anchor box sizes (128, 256, 512) and aspect ratios (0.5, 1.0, 2.0) in the RPN were customized. The RPN generates candidate object boxes through convolutional operations, filters regions likely to contain discolored tree crowns, and applies non-maximum suppression (NMS) to eliminate highly overlapping candidate boxes, thereby generating precise proposals. This customization allowed the model to effectively focus on discolored tree crown regions while minimizing interference from irrelevant background elements, such as healthy foliage or bare ground.
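A sketch of this anchor customization with torchvision's AnchorGenerator follows; repeating the same sizes and ratios across all five FPN levels is an assumption, since the paper does not state how the settings were distributed over the pyramid.

```python
# Sketch of the customized RPN anchors: sizes (128, 256, 512) and aspect
# ratios (0.5, 1.0, 2.0). With an FPN backbone torchvision expects one
# tuple per pyramid level; duplicating the settings across the five levels
# is an assumption made for illustration.
from torchvision.models.detection import fasterrcnn_resnet50_fpn_v2
from torchvision.models.detection.anchor_utils import AnchorGenerator

anchor_generator = AnchorGenerator(
    sizes=tuple((128, 256, 512) for _ in range(5)),
    aspect_ratios=tuple((0.5, 1.0, 2.0) for _ in range(5)),
)
model = fasterrcnn_resnet50_fpn_v2(
    num_classes=2,
    rpn_anchor_generator=anchor_generator,  # passed through to the RPN
)
```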
To further improve the model’s performance in identifying discolored tree crowns, this study optimized two critical aspects: feature extraction precision and the model’s ability to focus on key regions. The specific improvements included refining the ROI Align layer and introducing attention mechanisms.
(1) Traditional ROI Pooling introduces quantization errors by forcing continuous coordinates to discrete values, reducing feature extraction precision and impairing detection performance. To address this issue, an improved ROI Align layer was incorporated. Unlike traditional ROI Pooling, ROI Align uses bilinear interpolation to map feature maps more accurately, ensuring continuity in coordinate information during mapping and avoiding quantization errors.
This improvement significantly enhanced the model’s ability to locate the boundaries of discolored tree crowns, especially when processing high-resolution UAV imagery. The extracted tree crown regions were clearer and more precise, improving the overall detection performance.
The improved ROI Align layer can be described by the following formula:

v(x, y) = Σ_{i=0}^{1} Σ_{j=0}^{1} w_{ij} · q(x_i, y_j)

where v(x, y) is the pixel value after bilinear interpolation, q(x_i, y_j) are the values of the four nearest sampling points on the input feature map, and w_{ij} are the bilinear interpolation weights, given by w_{ij} = (1 − |x − x_i|) · (1 − |y − y_j|).

Using the improved ROI Align layer, the average localization error in high-resolution UAV imagery decreased from 2.4 pixels (with traditional ROI Pooling) to 1.1 pixels, and the intersection over union (IoU) of candidate boxes increased by 4.6%.
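For context, torchvision ships both operators, so the difference can be inspected directly; the feature map and the region of interest below are synthetic examples.

```python
# Synthetic comparison of ROI Align (bilinear sampling, no rounding) with
# classic ROI Pooling on a box with fractional coordinates.
import torch
from torchvision.ops import roi_align, roi_pool

feature_map = torch.rand(1, 256, 50, 50)  # [batch, channels, H, W]
# One ROI as (batch_index, x1, y1, x2, y2); the fractional coordinates are
# exactly what ROI Pooling would have to quantize.
rois = torch.tensor([[0.0, 10.3, 12.7, 30.9, 34.2]])

aligned = roi_align(feature_map, rois, output_size=(7, 7),
                    spatial_scale=1.0, sampling_ratio=2, aligned=True)
pooled = roi_pool(feature_map, rois, output_size=(7, 7), spatial_scale=1.0)
print(aligned.shape, pooled.shape)  # both torch.Size([1, 256, 7, 7])
```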
(2) To enable the model to focus more effectively on key regions—specifically, discolored tree crown features—this study incorporated the squeeze-and-excitation network (SENet) module, a channel attention mechanism. SENet dynamically reweights input feature maps, highlighting important features and suppressing irrelevant information. The integration of attention mechanisms in the Faster-RCNN framework was applied in two stages:
Feature Extraction Stage: SENet modules enhanced the network’s sensitivity to discolored tree crown features, allowing the model to better capture details related to discolored tree crowns.
Classification and Region Proposal Stage: SENet helped the model focus on key regions during candidate region generation and target classification, reducing false positives and false negatives.
The SENet module operates in the two steps outlined below.

Squeeze Operation: Global average pooling is applied to the feature map to obtain a channel descriptor:

s_c = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} F_c(i, j)

Excitation Operation: Channel weights are learned through two fully connected layers and activation functions:

z = σ(W_2 · δ(W_1 · s))

where W_1 and W_2 are the weight matrices of the fully connected layers, δ is the ReLU activation function, and σ is the sigmoid activation function. Finally, the channel weights z are applied back to the original feature map, enhancing the contribution of the important channels.
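A compact implementation of these two operations is sketched below; the reduction ratio r = 16 is the common SENet default and is an assumption here, since the paper does not report it.

```python
# Minimal SE block: squeeze (global average pooling) followed by excitation
# (two fully connected layers, ReLU then sigmoid). Reduction ratio 16 is an
# assumed default, not a value reported in the paper.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)  # W1
        self.fc2 = nn.Linear(channels // reduction, channels)  # W2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))                                 # squeeze: s_c
        z = torch.sigmoid(self.fc2(torch.relu(self.fc1(s))))   # excitation
        return x * z.view(b, c, 1, 1)   # reweight channels of the feature map

out = SEBlock(256)(torch.rand(2, 256, 32, 32))  # same shape, channels reweighted
```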

2.3.3. Model Training

The optimized Faster-RCNN model, incorporating improvements such as the enhanced ROI Align layer and the introduction of attention mechanisms, demonstrated significant advancements in detecting discolored tree crowns. These modifications allowed the model to achieve higher precision in target localization and greater stability in complex environments, such as forest settings with diverse backgrounds. Experimental results indicated that the optimized model achieved notable improvements in key evaluation metrics, including precision, mean average precision (mAP), and recall, thereby providing strong technical support for the accurate and efficient detection of discolored tree crowns.
During training, an initial learning rate of 0.1 was set, and the batch size was configured to 10 to maintain both stability and convergence. The Adam optimizer was selected for updating model parameters, effectively minimizing the loss function and improving overall training efficiency. Additionally, the SENet module (squeeze-and-excitation network) was incorporated as an attention mechanism. This module enabled the model to focus on crucial regions, such as the core features of discolored tree crowns, thereby enhancing its feature recognition and target detection accuracy. The integration of this attention mechanism significantly improved the model’s ability to detect and accurately locate discolored tree crowns.
Compared to traditional machine learning approaches, the optimized Faster-RCNN model demonstrated substantial improvements in both accuracy and robustness, illustrating its strong adaptability to real-world scenarios involving complex forest environments.
The training process relied on a loss function to measure the discrepancy between the predicted and actual bounding boxes for discolored tree crowns. The Faster-RCNN framework evaluates performance through a weighted combination of regression loss and classification loss. During training, the backpropagation algorithm was employed to compute gradients and iteratively adjust model parameters, facilitating gradual convergence toward an optimal solution for detecting discolored tree crowns.
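Under the reported settings (Adam optimizer, initial learning rate 0.1, batch size 10), one training step might look like the sketch below; summing the loss dictionary without extra weights is an assumption, as the paper only states that a weighted combination of regression and classification losses was used.

```python
# Sketch of one training step. torchvision's Faster R-CNN returns a dict of
# classification and box-regression losses in train mode; the plain sum used
# here is an assumption (the paper reports a weighted combination).
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn_v2

model = fasterrcnn_resnet50_fpn_v2(num_classes=2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # settings from the paper

def train_step(images, targets):
    # images: list of [3, H, W] tensors (a batch of 10 in this study);
    # targets: list of dicts with "boxes" [N, 4] and "labels" [N] per image.
    model.train()
    loss_dict = model(images, targets)
    loss = sum(loss_dict.values())   # classification + regression terms
    optimizer.zero_grad()
    loss.backward()                  # backpropagation computes the gradients
    optimizer.step()                 # Adam updates the model parameters
    return loss.item()
```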
The learning rate curve during training, as illustrated in Figure 6, shows the model’s performance over time. While the model achieved its highest mAP during the training process, some fluctuations were observed, likely due to data instability. The training losses steadily decreased and eventually stabilized without any noticeable increase, indicating the absence of overfitting. A minor gap between validation and training losses was observed, which could suggest some inconsistencies in the dataset or challenges in generalizing to unseen data.
The training metrics, shown in Figure 6, reveal the progression of key performance indicators throughout the training epochs. The classification loss, which reflects the model’s ability to correctly classify targets, decreased from an initial value of 1.8 to approximately 1.0 before stabilizing, indicating successful learning in classification tasks. The bounding box loss, representing localization accuracy, decreased from 1.2 to around 0.7, demonstrating significant improvement in bounding box predictions. Similarly, the object detection loss dropped from 0.65 to 0.4 and remained stable, further highlighting enhanced detection performance. The mean average precision (mAP) across intersection over union (IoU) thresholds from 0.5 to 0.95, denoted as mAP_50_95, stabilized at approximately 0.25, signifying consistent detection performance across different overlap thresholds.
Through systematic improvements in architecture and robust training strategies, the optimized Faster-RCNN model demonstrated strong performance in detecting discolored tree crowns. This makes it an effective tool for precise and efficient monitoring of forest pest outbreaks, providing a valuable resource for forest management and pest control applications.

2.3.4. Model Accuracy Validation

The following performance evaluation metrics were used to assess and analyze the performance of the optimized Faster-RCNN algorithm [31]:
(1) mAP (mean average precision) is a standard metric for evaluating object detection performance. For each target category, the average precision (AP) is the area under the precision–recall curve; mAP is the mean of the AP values over all categories, with values closer to 100% indicating better performance:

mAP = (1 / |Q_R|) Σ_{q ∈ Q_R} AP(q)

where Q_R is the set of target categories and AP(q) is the average precision for category q.
(2) Precision is the ratio of correctly predicted positive samples to the total number of positive predictions:

Precision = TP / (TP + FP)
(3) Recall is the ratio of correctly predicted positive samples to the total number of actual positive samples:

Recall = TP / (TP + FN)
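These definitions translate directly into code, as in the sketch below; the TP/FP/FN counts in the example are hypothetical.

```python
# Direct implementation of the precision and recall formulas above; mAP
# would additionally average the area under the precision-recall curve
# over all target categories.
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# Hypothetical counts: 83 crowns detected correctly, 9 false detections,
# 7 crowns missed.
print(f"precision = {precision(83, 9):.2%}, recall = {recall(83, 7):.2%}")
```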

3. Results

3.1. Model Indices and Decision Values

In this study, precision, recall, and mean average precision (mAP) served as the primary metrics for assessing how accurately discolored tree crowns were identified from high-resolution UAV imagery in the Changbai Mountain region. These metrics, introduced formally in a preceding section, quantify different facets of model performance in detecting discolored tree crowns. Precision measures the proportion of correctly identified discolored tree crowns among all predicted instances, while recall indicates the proportion of correctly identified discolored tree crowns among all actual instances in the reference data. The mAP is derived by averaging the average precision for each target category under varying confidence thresholds, thereby offering an overall assessment of detection performance.
Throughout the experiments, the intersection over union (IoU) threshold for bounding box matching was set to 0.5, meaning that a predicted bounding box is considered correct only when its spatial overlap with the ground-truth bounding box reaches or exceeds 50%. This threshold ensures a reasonable trade-off between localization precision and tolerance for slight geometric discrepancies. The model also outputs a confidence score for each detected object, reflecting the likelihood that the detected region corresponds to a discolored tree crown. In this research, a confidence threshold of 70% was employed, primarily to reduce false positives in complex forest scenes while maintaining sufficient sensitivity to capture early discoloration. Objects with confidence scores below 70% were discarded to focus on high-probability detections, thereby improving the overall robustness of the system in identifying genuinely discolored tree crowns.
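The decision rule described here (keep detections scoring at least 0.70, count a detection as correct at IoU of at least 0.50) can be sketched as follows; the corner-coordinate box format is an assumption.

```python
# Sketch of the detection decision rule: confidence filtering at 0.70 and
# IoU-based matching at 0.50. Boxes are (x1, y1, x2, y2) corner coordinates.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def keep_confident(boxes, scores, conf_thresh=0.70):
    # Discard detections below the 70% confidence threshold.
    return [b for b, s in zip(boxes, scores) if s >= conf_thresh]

def is_correct(pred_box, gt_box, iou_thresh=0.50):
    # A prediction counts as correct when overlap with ground truth >= 50%.
    return iou(pred_box, gt_box) >= iou_thresh
```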
During practical detection, the improved Faster-RCNN model—enhanced by replacing the traditional VGG16 backbone with an Inception-ResNet-V2 feature extractor—produced bounding boxes and corresponding confidence scores for every region suspected of being discolored. Early-stage discolorations, often characterized by subtle color changes such as yellowish or brownish tinges, were accurately captured once their predicted confidence surpassed the 70% threshold. For advanced cases of pine wood nematode infestation, where the discoloration tends to be more pronounced, the model was particularly effective at correctly localizing and classifying the affected tree crowns due to the distinct spectral signatures and high-contrast appearance in the UAV images. Additionally, the detection of deadwood was similarly governed by the same set of thresholds, ensuring a consistent decision framework across different stages of tree degradation.
These chosen thresholds and metrics remained consistent from the initial validation stages through the final testing phase, guaranteeing that the observed improvements reflect the model’s genuine capacity for high-accuracy discolored crown extraction. In turn, this systematic evaluation process confirmed that replacing the VGG16 backbone with Inception-ResNet-V2 not only increases the speed of model convergence, but also bolsters classification precision in forest areas characterized by diverse canopy structures. By adhering to the IoU requirement and leveraging an appropriate confidence threshold, the proposed model demonstrated robust performance in accurately locating discolored tree crowns, thus providing vital technical support for timely forest health assessments and targeted intervention against pest and disease outbreaks.

3.2. The Improvement of the Model

This study employed both the improved Faster-RCNN model and the traditional VGG16-based Faster-RCNN model to detect discolored tree crowns in UAV aerial images of the Changbai Mountain experimental area. The training results, as presented in Table 1, indicate that the optimized model significantly outperformed the traditional model in all evaluated metrics. Specifically, the precision of the optimized model reached 90.22%, reflecting an improvement of 4.68%; the recall rate improved to 92.33%, representing a gain of 10.11%; and the mean average precision (mAP) increased to 83.63%, a rise of 5.23%.
The training convergence and stability of the optimized model were notably superior, as illustrated in Figure 7. Compared to the VGG16-based model, the ResNet50 FPN V2-based model demonstrated smaller fluctuations in the training and validation loss curves, with faster convergence of validation loss. These findings confirm the robustness and efficiency of the optimized model, particularly in handling complex datasets and high-resolution UAV imagery.

3.3. Application of the Model

With a confidence level threshold set at 70%, the optimized Faster-RCNN model effectively detected discolored tree crowns in the study area, as shown in Figure 8. The model demonstrated reliable detection capabilities, supporting the practical monitoring of discolored tree crowns in the Changbai Mountain forest. The detection results showed reasonable confidence distributions, accurate localization, and comprehensive coverage, achieving a high practical application level. The model successfully identified both early-stage discoloration caused by pine wilt disease and deadwood. These results highlight the model’s capacity to detect discolored tree crowns across various stages of degradation with high precision and recall rates.
For early-stage discoloration caused by pine wilt disease, the discolored areas typically appeared light yellow or yellow-brown, characterized by high spectral reflectance. The model accurately identified these regions in high-resolution multispectral UAV imagery, as demonstrated in Figure 9, where the purple boxes mark the extracted discolored crowns. This capability is critical for the early detection and prevention of disease spread.
For deadwood that had completely lost vitality, the model also achieved accurate identification. Deadwood targets were characterized by dark brown or gray-brown tones and exhibited clear drying characteristics. These detection results are shown in Figure 10, where the purple boxes mark the extracted discolored crowns. The ability to identify deadwood further supports the model's versatility in monitoring forest health.
The model’s test results were consistent with the data from the evaluation phase, verifying its predictive performance and practical stability. Through evaluation and calibration, the improved Faster-RCNN model was applied to detect discolored tree crowns across the entire dataset, enabling accurate counting of the affected trees within the study area. The experimental results confirmed the model’s capability to effectively identify discolored tree crowns in complex forest backgrounds, particularly in high-resolution UAV aerial images.
This study underscores the significant role of deep learning models in forest pest and disease monitoring. Compared with traditional methods, deep learning models demonstrated substantial advantages in reducing reliance on human and material resources while markedly improving detection efficiency and accuracy. The high efficiency and practicality of the optimized Faster-RCNN model in extracting discolored tree crown information from UAV imagery were evident throughout the study.

4. Discussion

Forest pests pose a significant threat to forest ecological security, making the development of real-time, precise, and effective detection methods an urgent requirement. This research established a novel detection approach for controlling pine wood nematode disease by automatically extracting discolored tree crowns from high-resolution UAV aerial imagery. The integration of advanced deep learning models with UAV technology represents a transformative step in forest pest management, offering both efficiency and scalability.
The improved Faster-RCNN model demonstrated outstanding performance in processing high-resolution UAV imagery for detecting discolored tree crowns. Compared with traditional methods, this model exhibited superior real-time detection capabilities and operational efficiency. UAV aerial imagery provides detailed, large-scale, high-resolution data, while the optimized Faster-RCNN model effectively processes and extracts relevant information in a short timeframe. This enables timely detection of pest infestations, offering a critical advantage in forest pest control. By enabling early detection and intervention, this method minimizes the ecological impact of pine wood nematode disease, thereby maintaining forest health, ecological balance, and supporting sustainable forest management.
However, the performance of deep learning-based pest detection models largely depends on the quality and quantity of the training datasets. For instance, studies in agricultural pest detection using Faster-RCNN models have highlighted several challenges, such as imbalanced datasets and the inability to detect low-frequency categories. In a recent study on farmland pest detection, a dataset of over 3000 pest images was used to train a Faster-RCNN model with VGG16 as the backbone network. Despite achieving high precision for classes with sufficient training samples (e.g., a detection accuracy of 0.99 for certain pest species), the model struggled with rare categories like Spodoptera exigua and Agrotis segetum, leading to poor detection results. This limitation was attributed to data scarcity and imbalanced class distributions, which are common challenges in pest detection tasks.
Similarly, in this study, although the improved Faster-RCNN model achieved high precision and recall rates for detecting discolored tree crowns, the potential limitations of the dataset, such as imbalanced samples of different tree species or stages of discoloration, could impact the generalizability of the results. Addressing these challenges by expanding the dataset and improving data augmentation techniques could further enhance the robustness of the model.
The successful application of the improved Faster-RCNN model in detecting discolored tree crowns offers a new paradigm for forest pest monitoring. Its scalability is particularly noteworthy, as the methodology is not limited to detecting pine wood nematode infestations but could be extended to identify other forest pest and disease types. Future research could focus on further optimizing the model to improve its adaptability and generalization to diverse pest types. For example, incorporating additional data modalities, such as multispectral or hyperspectral imagery, could enhance detection precision and robustness. Moreover, integrating advanced augmentation strategies, as observed in agricultural pest detection studies [32], might help mitigate the limitations of dataset size and class imbalance. Exploring lightweight model architectures could also enhance the system’s applicability in real-time monitoring scenarios with limited computational resources.
Overall, the improved Faster-RCNN model developed in this study provides vital support for achieving real-time, precise, and effective detection of forest pests. Its practical applications hold significant implications for the protection of national forest ecological security, offering a scalable, efficient, and highly accurate approach to pest monitoring. Furthermore, this research underscores the potential of deep learning technology in ecological monitoring and management, highlighting its ability to address pressing environmental challenges. Continued advancements in model optimization and integration with emerging technologies can further expand its utility and contribute to more resilient forest management practices.

5. Conclusions

This study presents an enhanced Faster-RCNN model for the automatic detection of discolored tree crowns caused by forest pests, specifically pine wilt disease. By incorporating the Inception-ResNet-V2 feature extractor into the Faster-RCNN framework, the model demonstrates significant improvements in both recognition accuracy and computational efficiency. With a precision of 90.22%, a mean average precision (mAP) of 83.63%, and a recall rate of 92.33%, the proposed model shows notable advancements over the traditional VGG16-based Faster-RCNN model. These results underscore the effectiveness of the optimized model in detecting early-stage discoloration in tree crowns, offering critical support for the early detection and management of pest outbreaks.
While the model’s performance is robust, several avenues for future research remain. First, expanding the training dataset to encompass a wider range of forest environments and pest species could further improve recognition accuracy and address potential class imbalances. This expansion could also include more diverse weather conditions and varying image quality, which would enhance the model’s robustness in real-world applications.
Second, integrating additional data modalities, such as LiDAR or multispectral imagery, could strengthen the model’s ability to distinguish between disease-induced discoloration and other natural variations in tree crowns. The fusion of these data modalities would provide richer inputs, thereby improving the model’s accuracy, particularly in complex forest environments.
Third, although the proposed model has shown significant improvements, real-time monitoring remains a challenge. Future work should focus on optimizing the model for real-time processing with limited computational resources, potentially through lightweight network architectures or model pruning techniques. Such improvements would enable field-based applications, such as deployment of UAVs with lower computational power, to detect pests in near-real-time.
In conclusion, this study highlights the considerable potential of deep learning, particularly the improved Faster-RCNN model, for the precise and efficient monitoring of forest health and pest outbreaks. The model not only contributes to forest pest management but also establishes a foundation for further advancements in UAV-based ecological monitoring. Future research focused on dataset expansion, data fusion, real-time processing, and economic analysis will enhance the model’s applicability and help establish automated systems as a vital tool for sustainable forest management.

Author Contributions

Conceptualization, B.Y.; Methodology, H.M., Q.Y., Y.Y. and J.W.; Software, H.M.; Data curation, R.W.; Writing—original draft, H.M.; Writing—review & editing, R.W.; Supervision, B.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (41971376) and the Beijing Natural Science Foundation (8212031): Study on the Red Line Division Mechanism of Water Conservation Ecological Protection in Beijing Based on the SWAT Model and Ecological Security Pattern.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ye, J. Current epidemic status, prevention and control technologies, and countermeasure analysis of pine wood nematode disease in China. Sci. Silvae Sin. 2019, 55, 1–10. [Google Scholar]
  2. Tao, H.; Li, C.; Cheng, C.; Jiang, L.; Hu, H. Research progress on remote sensing monitoring of discolored pine trees infected by pine wood nematode disease. For. Res. 2020, 33, 172–183. [Google Scholar]
  3. Hassan, M.I.; Ahmed, M.H. Deep Learning Techniques for Pests and Disease Identification in Forestry: A Survey. For. Ecol. Manag. 2020, 464, 118012. [Google Scholar]
  4. Wang, J. Extraction and detection of crop pest and disease information based on multi-source remote sensing technology. Mod. Agric. Sci. Technol. 2020, 170–171. [Google Scholar]
  5. Lv, X.; Wang, J.; Yu, W. Preliminary study on UAV monitoring of forestry pests. Hubei For. Sci. Technol. 2016, 30–33. [Google Scholar]
  6. Zeng, Q.; Sun, H.; Yang, Y.; Zhou, J.; Yang, C. Comparison of the accuracy of UAV monitoring for pine wood nematode disease. Sichuan For. Sci. Technol. 2019, 40, 92–95+114. [Google Scholar]
  7. Gu, J.; Congalton, R.G. Individual Tree Crown Delineation From UAS Imagery Based on Region Growing by Over-Segments with a Competitive Mechanism. IEEE Trans. GeoScience Remote Sens. 2021, 60, 4402411. [Google Scholar] [CrossRef]
  8. Li, H.; Xu, H.; Zheng, H.; Chen, X. Research on monitoring technology of pine wood nematode disease based on UAV remote sensing images. J. Chin. Agric. Mech. Chem. 2020, 41, 6. [Google Scholar]
  9. Liu, X.; Cheng, D.; Li, T.; Chen, X.; Gao, W. Preliminary study on the automatic monitoring technology for pine wood nematode disease affected trees based on UAV remote sensing images. Chin. J. For. Pathol. 2018, 37, 16–21. [Google Scholar]
  10. Zhai, Y. Research on Peach Pest Occurrence Prediction System Based on Internet of Things Technology. Master’s Thesis, Hebei Agricultural University, Baoding, China, 2014. [Google Scholar]
  11. Li, X.; Liu, Y.; Zhao, X. Aerial Pest Detection in Forestry Using Convolutional Neural Networks. Comput. Electron. Agric. 2021, 182, 105956. [Google Scholar]
  12. Wu, Q. Research on Regional Detection Algorithm of Pine Wood Nematode Disease Based on Remote Sensing Images. Master’s Thesis, Anhui University, Hefei, China, 2013. [Google Scholar]
  13. Hu, G.; Zhang, X.; Liang, D.; Huang, L. Identification of diseased pine trees in remote sensing images based on weighted support vector data description. Trans. Chin. Soc. Agric. Eng. 2013, 44, 258–263+287. [Google Scholar]
  14. Zhang, R.; Xia, L.; Chen, L.; Xie, C.; Chen, M.; Wang, W. Recognition of discolored pine wood infected by pine wood nematode disease based on U-Net network and UAV images. Trans. CSAE 2020, 36, 61–68. [Google Scholar]
  15. Zhang, Z.; Guo, Y. Pest Monitoring and Disease Classification in Forests Using UAV Imagery and Deep Convolutional Networks. Remote Sens. 2019, 11, 2283. [Google Scholar]
  16. Huang, L.; Wang, Y.; Xu, Q.; Liu, Q. Recognition of abnormal discolored pine wood infected by pine wood nematode disease using YOLO algorithm and UAV images. Trans. Chin. Soc. Agric. Eng. 2021, 37, 197–203. [Google Scholar]
  17. Safonova, A.; Tabik, S.; Alcaraz-Segura, D.; Rubtsov, A.; Maglinets, Y.; Herrera, F. Detection of Fir Trees (Abies sibirica) Damaged by the Bark Beetle in Unmanned Aerial Vehicle Images with Deep Learning. Remote Sens. 2019, 11, 643. [Google Scholar] [CrossRef]
  18. Chen, F.; Zhu, X.; Zhou, W.; Gu, M.; Zhao, Y. Spruce counting based on UAV aerial photography and improved YOLOv3 model. Trans. Chin. Soc. Agric. Eng. 2020, 36, 22–30. [Google Scholar]
  19. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the Conference on Computer Vision & Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; IEEE: Piscataway, NJ, USA, 2017. [Google Scholar]
  20. Xu, X.; Tao, H.; Li, C.; Cheng, C.; Guo, H.; Zhou, J. Recognition and location of pine wood nematode disease-affected wood based on Faster R-CNN. Trans. Chin. Soc. Agric. Mach. 2020, 51, 228–236. [Google Scholar]
  21. Huang, H.; Ma, X.; Hu, L.; Huang, Y.; Huang, H. Preliminary study on the application of Fast R-CNN deep learning and UAV remote sensing in pine wood nematode disease monitoring. J. Environ. Entomol. 2021, 43, 1295–1303. [Google Scholar]
  22. Sun, X.; Wu, P.; Hoi, S.C.H. Face Detection using Deep Learning: An Improved Faster RCNN Approach. arXiv 2017, arXiv:1701.08289. [Google Scholar] [CrossRef]
  23. Lee, C.; Kim, H.J.; Oh, K.W. Comparison of faster R-CNN models for object detection. In Proceedings of the 2016 16th International Conference on Control, Automation and Systems, Gyeongju, Republic of Korea, 16–19 October 2016. [Google Scholar]
  24. Kamal, K.; Hamid, E.Z. A Comparison Between the VGG16, VGG19 and ResNet50 Architecture Frameworks for Classification of Normal and CLAHE Processed Medical Images. Available online: https://www.researchsquare.com/article/rs-2863523/v1 (accessed on 16 February 2025).
  25. Rai, P.; Jain, A. Multi-Scale Convolutional Neural Network for Forest Disease Detection from High-Resolution UAV Images. ISPRS J. Photogramm. Remote Sens. 2022, 176, 122–134. [Google Scholar]
  26. Reddy, S.K.; Yang, C. Cost-benefit analysis of applying deep learning models for agricultural pest detection. Comput. Electron. Agric. 2018, 151, 104–114. [Google Scholar]
  27. Ghosal, S.; Ghosh, S. Economic analysis of pest detection in forestry using automated systems. J. Environ. Manag. 2019, 248, 109254. [Google Scholar]
  28. He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 2341–2353. [Google Scholar] [PubMed]
  29. Chen, S.; Liang, D.; Ying, B.; Zhu, W.; Zhou, G.; Wang, Y. Assessment of an improved individual tree detection method based on local-maximum algorithm from unmanned aerial vehicle RGB imagery in overlapping canopy mountain forests. Int. J. Remote Sens. 2021, 42, 106–125. [Google Scholar] [CrossRef]
  30. Wu, J.; Yang, G.; Yang, X.; Xu, B.; Han, L.; Zhu, Y. Automatic Counting of in situ Rice Seedlings from UAV Images Based on a Deep Fully Convolutional Neural Network. Remote Sens. 2019, 11, 691. [Google Scholar] [CrossRef]
  31. Yu, R.; Luo, Y.; Li, H.; Yang, L.; Huang, H.; Yu, L.; Ren, L. Three-Dimensional Convolutional Neural Network Model for Early Detection of Pine Wilt Disease Using UAV-Based Hyperspectral Images. Remote Sens. 2021, 13, 4065. [Google Scholar] [CrossRef]
  32. Huang, L. Image Detection and Localization of Farmland Pests Based on the Faster-RCNN Algorithm. Digit. Technol. Appl. 2023, 41, 49–52. [Google Scholar] [CrossRef]
Figure 1. Location map of the study area.
Figure 2. Study area data.
Figure 3. Comparison of data pretreatment results.
Figure 4. Research technology flowchart.
Figure 5. Labeling of discolored tree crowns.
Figure 6. Learning curve.
Figure 7. Comparison of training and validation loss: VGG16 vs. ResNet50 FPN V2 (with fluctuations).
Figure 8. Extraction results of discolored tree crowns.
Figure 9. Early detection results of discolored tree crowns.
Figure 10. Dry tree crown detection results.
Table 1. Model improvement.

Metric    | VGG16  | ResNet50 FPN V2 | Improvement
Precision | 85.54% | 90.22%          | +4.68%
Recall    | 82.22% | 92.33%          | +10.11%
mAP       | 78.40% | 83.63%          | +5.23%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
