The time series prediction model used in this study is the CNN-LSTM-Attention (CLA) model, as shown in Figure A2 in Appendix C. The CLA model has demonstrated success across various domains. In time series forecasting, it offers distinct advantages by effectively capturing both short-term fluctuations and long-term trends in the data [47,48]. In this study, the CNN component extracts key features from meteorological data, such as temperature, precipitation, and wind speed, to identify relevant patterns. The LSTM component captures temporal dependencies, learning from historical climate trends to predict how long-term changes, like rising temperatures, influence migration timing and intensity. The attention mechanism then highlights important interactions between meteorological factors, dynamically adjusting their impact to better represent the complex relationships between climate variables and bird migration. This integrated approach allows the model to accurately capture the dynamics of climate impacts on migration.
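The weighting step performed by an attention layer can be illustrated with a minimal sketch. It assumes scaled dot-product scoring against a single query vector, which is one common formulation and not necessarily the one used in the CLA model; the names `attention_pool`, `states`, and `query` are hypothetical, and in a trained network the query would be learned rather than fixed.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each time step's hidden state by its similarity to `query`.

    hidden_states: list of T vectors (e.g. LSTM outputs), each of length d.
    query: a length-d vector (assumed fixed here; learned in a real model).
    Returns (weights, context), where context is the weighted sum of states.
    """
    d = len(query)
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) / math.sqrt(d)
              for h in hidden_states]
    weights = softmax(scores)
    context = [sum(w * h[j] for w, h in zip(weights, hidden_states))
               for j in range(d)]
    return weights, context

# Three hypothetical time steps with 2-dimensional hidden states.
states = [[0.1, 0.0], [2.0, 1.0], [0.3, 0.2]]
weights, context = attention_pool(states, query=[1.0, 1.0])
```

The weights sum to one, so the context vector is a convex combination of the hidden states in which the time steps most aligned with the query dominate.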
The 48 h moving window used in the model is both scientifically grounded and pragmatically efficient. It captures both short-term and long-term dependencies, ensuring that the model can accurately forecast bird activity patterns while maintaining sensitivity to cyclical changes. This approach provides a solid framework for prediction, leveraging key temporal structures in the data while preserving the model’s overall predictive accuracy. Mao employed a GRU-based model with attention mechanisms to predict bird migration [49]. While this method captured temporal dependencies, it struggled with feature extraction, limiting its performance. Our study combined CNN and LSTM, improving feature extraction and handling long-term dependencies more effectively, leading to more accurate predictions. Additionally, Mao removed missing values, which could lead to data loss [49]. In contrast, our study used predictive interpolation to estimate missing values, preserving data integrity and enhancing the reliability of the results.
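As an illustration of how a 48 h moving window turns an hourly series into training samples, the following sketch builds (input window, target) pairs. It is a simplification under stated assumptions: a single variable is windowed, whereas the model in this study takes several meteorological inputs, and the one-hour forecast `horizon` and the function name `make_windows` are hypothetical choices.

```python
def make_windows(series, window=48, horizon=1):
    """Split an hourly series into (input window, target) pairs.

    series: list of hourly values, oldest first.
    window: number of past hours given to the model (48 in the text).
    horizon: how many hours ahead the target lies (assumed to be 1 here).
    Returns an empty list if the series is too short for one full pair.
    """
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs

# Three days of synthetic hourly values with a 24 h cycle.
hourly_values = [float(i % 24) for i in range(72)]
pairs = make_windows(hourly_values)
```

Because each window spans two full days, every sample contains two complete diurnal cycles, which is what lets a model compare the same hour across consecutive days.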
3.5.1. MTR Prediction
During the regression interpolation of missing data, the model may struggle to estimate low-intensity values accurately, particularly near zero, which can result in fluctuations and negative MTR predictions. In temporal prediction, the model also has difficulty fitting all data points when the data include extreme values or high volatility, and it can produce negative predictions if it does not adapt adequately to local patterns or abrupt changes. These issues are most noticeable when the data are sparse. To address them, we replaced each negative value in the interpolated data with the nearest non-negative value from a time point 24 h apart, which keeps the series consistent with expected migration patterns and removes unrealistic negative values. Additionally, we incorporated the rectified linear unit (ReLU) activation function [50] in the final stage of the nonlinear regression model. Rather than applying ReLU as a post-prediction correction, we included it during training. Because ReLU maps negative inputs to zero, the model’s outputs are non-negative by construction, and the loss is computed on these constrained outputs, so the parameters are learned under the non-negativity constraint and no post-prediction adjustment is needed. Together, these two measures ensure that the model generates non-negative predictions that align with real-world migration patterns while maintaining its adaptability and minimizing prediction errors.
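The two measures described above can be sketched as follows. This is an illustrative reading and not the authors' code: the text does not state how ties between an earlier and a later day are resolved or what happens when no valid counterpart exists, so the sketch assumes the earlier day is preferred and falls back to zero. The names `repair_negatives` and `relu` are hypothetical.

```python
def repair_negatives(values, period=24):
    """Replace each negative entry with the nearest non-negative entry
    located a multiple of `period` steps away (same hour, another day).

    values: hourly series that may contain negative interpolated values.
    Assumptions: the earlier day wins a tie; entries with no non-negative
    counterpart are set to 0.0.
    """
    n = len(values)
    repaired = list(values)
    for i, v in enumerate(values):
        if v >= 0:
            continue
        replacement = 0.0  # fallback when no valid counterpart exists
        k = 1
        while i - k * period >= 0 or i + k * period < n:
            candidates = [i - k * period, i + k * period]
            found = [values[j] for j in candidates
                     if 0 <= j < n and values[j] >= 0]
            if found:
                replacement = found[0]
                break
            k += 1
        repaired[i] = replacement
    return repaired

def relu(x):
    """Rectified linear unit: negative inputs map to zero."""
    return x if x > 0 else 0.0

# Three synthetic days with two negative interpolated values.
series = [5.0] * 24 + [3.0] * 24 + [7.0] * 24
series[24] = -2.0
series[1] = -0.5
repaired = repair_negatives(series)
```

Here `repaired[24]` takes the value from the same hour one day earlier, while `repaired[1]` has no earlier day and therefore takes the value one day later.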
Figure 6 and Figure 7 show the prediction performance of the CLA model on bird migration intensity (MTR) across five cities during the spring and autumn migration seasons. The model effectively captures the main trends and fluctuations in migration intensity, particularly during peak periods, where the predicted values closely match the actual data, demonstrating its accuracy in identifying peak migration changes. However, at lower intensity levels, especially when the intensity approaches zero, the model’s prediction accuracy decreases. The higher errors observed during low migration periods are mainly due to the ecological differences between resident and migratory birds. Resident birds, which stay in fixed areas year-round, are driven by food availability, territorial behavior, and social interactions, showing little response to short-term weather changes. In contrast, migratory birds depend heavily on weather conditions, adjusting their activity levels based on factors such as wind, radiation, and stable air pressure. During low migration periods, the contribution of resident birds to the MTR increases, leading to more background noise. This results in larger errors during interpolation and time-series prediction, affecting model accuracy. However, the impact of this error on the overall predictions remains minimal for migration safety applications. As algorithms for extracting bird activity-related data from weather radar continue to improve [51,52], data quality is expected to improve over time. From an aviation safety perspective, the primary objective of the model is to predict high-risk migration periods when bird strikes are more likely. Errors during low-activity periods have minimal impact on the model’s overall utility for aviation safety, as these periods present a lower risk for bird strikes. Future work will focus on enhancing the model’s performance during low-activity periods by incorporating additional factors, such as habitat changes and human behavior. This will help better account for the behavior of resident birds and mitigate the effects of interpolation errors [42].
To address the small dataset issue, we incorporated additional data from the 2024 spring migration season to enhance the model’s generalization. We applied a “pre-training and fine-tuning” strategy [53], where the CLA model, initially trained on the 2023 data, was fine-tuned using the 2024 dataset. This process involved freezing the CNN layers and adjusting the parameters of the LSTM and attention layers. The results for the 2024 dataset, with an average R² of 0.8954 (compared to the original model’s R² of 0.9274), demonstrate the model’s robustness despite the meteorological variations across the years. These results are presented in Figure 8. This strategy is particularly beneficial in regions with sparse radar data, as it enables the model to adapt to new regions or conditions with minimal additional data, reducing the need for full retraining and computational costs. Additionally, the validation strategy, which includes cities with diverse climatic and ecological conditions, ensures the model’s effectiveness and generalizability across a wide range of meteorological scenarios.
To further compare the performance of the CLA model with that of more advanced models, we selected the Transformer model for evaluation due to its ability to capture complex, nonlinear relationships within the data, facilitated by its attention mechanism. This mechanism allows the model to focus on the most relevant features at different stages of migration, which is expected to help especially during periods of high migration intensity. The experimental results, presented in Figure 6 and Figure A3, show that although the Transformer model exhibits a slightly higher Mean Absolute Error (MAE) than the CLA model, it captures the intricate patterns of migration intensity well, particularly during peak migration periods. Specifically, the CLA model achieved an average R² value of approximately 0.92, while the Transformer model achieved an average R² value of approximately 0.87. The comparison between Figure 6 and Figure A3 shows that the Transformer model captures high-intensity migration events, which are critical for bird strike prevention in aviation, effectively. However, the results also highlight areas for improvement, particularly the need to reduce MAE during periods of low migration activity, which is essential for improving the model’s overall performance. In conclusion, the attention mechanism integrated into the Transformer model enables nuanced predictions of migration intensity, particularly during high-risk periods, although its overall fit (R² of approximately 0.87) remains below that of the CLA model (R² of approximately 0.92). While challenges persist during low-activity phases, the Transformer model shows promise for real-time bird strike prediction systems. Future research will focus on refining Transformer variants based on the CLA model, incorporating additional features and optimization techniques to enhance performance across all migration phases, ultimately improving both robustness and predictive accuracy.
Overall, the model successfully meets the objective of predicting bird migration activity based on meteorological factors and achieves the expected outcomes. In comparison, a study by Xu et al., which used similar data to this research, employed the random forest model to predict migration activity. Their model achieved a prediction accuracy of around 75%, which increased to 82% after improving data quality [54]. In contrast, the model developed in our study consistently maintained a prediction accuracy of over 90% for MTR across different cities. This highlights the superior performance and robustness of our model compared to previous approaches. Our model predicts migration intensity based on current weather conditions, enabling airports to anticipate peak migration periods. This allows for early warning alerts to notify airport authorities of high-risk times, enabling timely adjustments to flight schedules or the implementation of additional safety measures. Additionally, the model can assist wildlife management by directing efforts to high-risk areas and can be integrated with existing bird detection systems, such as radar and tracking technologies. This integration provides a comprehensive view of bird activity, enabling more informed decision-making, such as adjusting flight paths or deploying deterrents, ultimately enhancing bird strike prevention. In summary, our model offers airports a proactive tool to anticipate bird strike risks, improve real-time decision-making, and ensure safer airport operations.
3.5.2. Ablation Experiments
To assess the contribution and necessity of each component within the overall model and further optimize its accuracy and robustness, a series of ablation experiments were conducted using spring migration data. By systematically removing or substituting certain structures within the model, the impact on model performance was evaluated, clarifying the role and importance of each module. Model performance was comprehensively evaluated using metrics such as Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²). The experimental results are summarized in Table 2.
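For reference, the four evaluation metrics can be computed as in the following sketch, which uses their standard definitions. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study's data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for paired observations and predictions.

    R2 is computed as 1 - SS_res / SS_tot and is undefined when all
    observed values are identical (SS_tot would be zero).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

# Hypothetical observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 2.0]
metrics = regression_metrics(observed, predicted)
```

In this example a single large error dominates MSE and R², while MAE is affected less, which is why the metrics are reported together.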
The results indicate that, compared to the benchmark model, the ability to handle long-distance dependencies is significantly reduced when the attention mechanism is removed. Removing the attention mechanism alone reduced the average prediction accuracy (R²) by 3.93%, while excluding both the CNN and attention mechanisms caused a more severe decline of 10.96%. The attention mechanism enhances the model’s capacity to process long sequence data by dynamically adjusting the focus on critical temporal intervals, such as nocturnal migration peaks. Although the CNN-LSTM model can effectively extract features and capture time series information, it performs worse than the benchmark model in handling long-distance dependencies. When the CNN component is removed, the model’s spatial feature extraction capability is notably diminished, reducing the average R² by 8.47%. While the LSTM with the attention mechanism can adaptively weight sequential data, the absence of CNN preprocessing degrades the initial feature representations, especially under rapidly changing meteorological conditions (e.g., sudden rainfall events). Removing both the CNN and attention mechanisms leaves the LSTM to process the raw data on its own, which produced the largest decline (10.96%). Although LSTM can capture temporal dependencies, its performance deteriorates significantly without the benefit of feature extraction and dynamic attention adjustment. The ablation experiments highlight the contributions of each component: CNNs effectively extract high-level features, LSTM captures time series dependencies, and the attention mechanism enhances the ability to handle long-distance dependencies. By fully integrating the strengths of these modules, the benchmark model outperforms models that lack any single component. These findings provide valuable insights for optimizing future model designs.
Our study demonstrates the effectiveness of the CLA model in predicting bird migration intensity. However, several avenues for future research could further enhance its performance and applicability. One promising direction is the integration of advanced Transformer variants, such as Informer [55] and Autoformer [56]. These models have shown considerable success in forecasting long-sequence time series [57,58,59]. Informer improves computational efficiency through its sparse attention mechanism and excels in long-term sequence prediction tasks. Compared to traditional Transformers, Informer is better suited to handle large-scale time series data. In contrast, Autoformer uses a seasonal decomposition strategy to manage different components of time series, enhancing its ability to model seasonality and trends. Another potential direction is the use of state space models, such as Mamba [60]. This model achieves linear time complexity in sequence modeling, offering high accuracy and low computational cost for long-sequence tasks. Although still in the early stages, Mamba shows great potential for real-time bird activity monitoring, especially in scenarios requiring rapid responses.
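To make the decomposition idea behind Autoformer concrete, the sketch below separates a series into a moving-average trend and a seasonal remainder. It is a simplified, framework-free illustration of that idea, not the Autoformer implementation; the function name `decompose`, the end-value padding, and the default `kernel` of 25 steps are assumptions chosen for the example.

```python
def decompose(series, kernel=25):
    """Split a series into a moving-average trend and a seasonal remainder.

    The series is padded by repeating its first and last values so that
    the trend has the same length as the input. `kernel` should be odd.
    """
    half = (kernel - 1) // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half
    trend = [sum(padded[i:i + kernel]) / kernel for i in range(len(series))]
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal

# A short linear ramp: away from the edges, the trend recovers the ramp itself.
ramp = [float(i) for i in range(10)]
trend, seasonal = decompose(ramp, kernel=3)
```

By construction the trend and seasonal parts add back to the original series, so the split changes how the signal is presented to a model without discarding information.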