1. Introduction
Airport site selection is a complex multidisciplinary decision-making process. Meteorological conditions, as a key influencing factor, directly affect the operational safety, efficiency, and environmental sustainability of an airport. According to statistics, the proportion of flight delays caused by weather in the aviation industry is as high as 60% [1]. For example, in August 2018, Macau International Airport experienced a major landing incident due to changes in the weather system [2]. Therefore, systematic and comprehensive assessment of meteorological conditions during the site selection stage is particularly important [3,4,5]. In recent years, with the development of artificial intelligence and machine learning methods, such as flight delay prediction based on deep learning [6], causal machine learning [7], and ensemble learning [8], new technologies have provided powerful tools for quantifying and addressing meteorological risks [9]. These advances further emphasize the need to systematically incorporate meteorological impact analysis into airport site selection and initial operation, thereby avoiding operational disruptions caused by weather at the source and providing a scientific basis for flight scheduling and airspace planning [10,11].
In terms of meteorological suitability assessment, various analytical methods have been proposed in previous studies. For instance, Zhang et al. [12] compared the data of a proposed site with those from surrounding meteorological stations to evaluate the differences in temperature, precipitation, wind, etc.; Wang et al. [13] pointed out that visibility, cloud base height, wind, and thunderstorms are the key factors affecting airport operations; Lang et al. [14] used historical climate data to analyze the spatiotemporal distribution characteristics of meteorological factors and disasters, providing climatic background support for site selection. At the methodological level, multi-criteria decision analysis techniques have been widely applied, such as by Ballis et al. [15], who comprehensively evaluated the meteorological and operational suitability of different site selection options; Chen et al. [16] used the AHP-TOPSIS model to quantitatively assess the meteorological indicators of water-based airports; Li et al. [17] generated a meteorological suitability index map based on GIS and weighted overlay analysis; Zheng et al. [18] sorted the candidate sites through correlation and consistency tests.
Although these studies have advanced the development of airport meteorological assessment indicators and processes, existing methods are primarily limited to the independent analysis of individual meteorological elements. This approach fails to fully capture the synergistic effects and joint probability distributions among multiple elements, thus hindering a comprehensive assessment of the systemic impact of meteorological conditions on airport operations. Case-based reasoning (CBR), an artificial intelligence-driven problem-solving method, is designed to tackle complex decision-making problems in novel situations by retrieving, reusing, adapting, and retaining experiences from historical cases. This method is particularly effective at capturing complex, nonlinear correlations among multiple factors and can offer interpretable empirical references for decision-makers operating in environments characterized by incomplete information or high uncertainty, ultimately enhancing decision-making efficiency and reducing risks [
19]. To date, CBR has demonstrated substantial applicability across various fields requiring comprehensive judgment, such as infrastructure planning, environmental impact assessment, and emergency management [
20]. For instance, in transportation planning, CBR can leverage past urban transportation planning cases to predict public reactions to new plans or assess the potential impacts of different transportation schemes [
21]. In the context of emergency decision-making for environmental incidents, the CBR method can offer decision support by analyzing multi-dimensional scenario spaces, thereby improving decision-making efficiency [
22]. These examples highlight that CBR excels in multi-dimensional scenario adaptation and case knowledge reuse, offering a viable technical solution for multi-criteria and multi-constraint decision-making problems.
Because meteorological factors interact and occur in many combinations, the impact of meteorological conditions on airport operations is extremely complex, which makes precise quantitative calculation through mechanism analysis difficult. Case-based reasoning compensates for the limitations of traditional methods in analyzing linked meteorological effects by reusing the combined multi-factor meteorological patterns observed in historical cases [23]. On this basis, this paper proposes an airport site selection evaluation framework that integrates multi-meteorological-element clustering and case-based reasoning. First, taking Zhengzhou Xinzheng International Airport as a typical case, K-means clustering is used to identify key weather scenarios, and the quantitative relationship between these scenarios and flight operation indicators is analyzed in depth in order to calibrate the model parameters and verify the effectiveness of the method. The calibrated model is then applied to the meteorological data of four candidate airports, and the potential operational efficiency of each site is inferred from the frequency of its weather scenarios, which yields a ranking of the candidate sites. This study aims to move meteorological analysis from qualitative description to quantitative decision support, providing a new data-driven approach for airport site selection.
2. Data Sources and Preprocessing
An analysis of aviation meteorological conditions should cover not only the general climate profile of the area where the airport is located but also the daily variation of meteorological elements. For site selection, the meteorological characteristics of every candidate site must be evaluated at different time scales, including short-term variations and long-term trends, to determine whether the site offers the conditions for safe and efficient operation. Understanding the daily variation of meteorological elements allows the meteorological adaptability of an airport to be evaluated systematically, supports reasonable flight schedule planning and the avoidance of adverse weather, and ultimately improves the reliability and economic efficiency of airport operations. To support the site selection decision, this paper selects Zhengzhou Xinzheng International Airport as a typical example and four candidate sites from Chengdu, Kunming, Jinan, and Changsha for comparative analysis. To keep the observation period consistent across sites, this study uses meteorological observation data from 1 January 2016 to 30 April 2016 for all airports. The temporal resolution differs: Zhengzhou Xinzheng International Airport uses hourly observation data, whereas the candidate airports use 3-hourly observation data. Together, these data enable comparison and site evaluation across multiple locations and multiple elements.
Flight activities are affected by various meteorological conditions. Among them, low visibility, especially fog, is a key meteorological condition that affects flight activities [
24]. For example, the fog analysis at Ankara Esenboga International Airport showed that visibility had a significant impact on flight safety [
25]. Unstable winds, especially turbulence on the runway glide path, can cause aircraft to deviate from the flight path or require a go-around during landing [
26]. Therefore, the layout of the airport runway needs to be aligned with the dominant wind direction to maximize the operating time within the limits of headwinds and crosswinds [
27]. Heavy precipitation can reduce the operational capacity of the airport, especially the capacity of taxiways and runways [
28]. Thunderstorm weather can significantly reduce the airport’s capacity and affect the normal operation of flights. According to the regulations of the International Civil Aviation Organization regarding flight safety, we selected six meteorological factors that directly affect the takeoff and landing of flights from the obtained meteorological observation data: horizontal visibility, cloud base height, average wind speed, rainfall, snowfall, and thunderstorm observation data. Since the obtained meteorological element data include numerical (horizontal visibility, cloud base height, wind speed) and categorical (rainfall, snowfall, thunderstorm) types, it is necessary to uniformly quantify these data. Among them, although horizontal visibility, cloud base height, and wind speed are all numerical, there are some threshold requirements in the actual operation process, which need to be further quantified in combination with relevant standards. Based on the research results of Shang et al. [
29], the meteorological data were quantified, and the results are shown in
Table 1.
In order to fully analyze the delay of flights under each meteorological factor, the flight information from Zhengzhou Xinzheng International Airport from 1 January 2016 to 30 April 2016 was collected from the Ctrip travel website, including departure airport, arrival airport, flight number, aircraft number, planned takeoff time, actual takeoff time, planned arrival time, actual arrival time, flight cancellation, and other information. At the same time, in order to clearly understand the corresponding weather conditions under the planned takeoff time of each aircraft, the obtained airport flight data and meteorological data were further processed. Firstly, based on the time of each meteorological data point of the existing airport, the airport flight data under the corresponding time were matched; secondly, if at a certain time no aircraft was scheduled to take off, the meteorological data information was removed. If there were multiple aircraft scheduled to take off at a certain time, the information of all aircraft would be retained. Finally, 1924 meteorological data points from 1 January 2016 to 30 April 2016 and the flight data for flights planned to take off from Zhengzhou Xinzheng International Airport at the corresponding time of each meteorological data point were obtained.
3. Methods
3.1. Technical Route
This paper introduces a method based on case-based reasoning and multi-meteorological-factor joint analysis to assist in airport site selection. The method is mainly divided into four parts: meteorological factor selection, K-means clustering analysis, typical weather scenario–operational efficiency analysis and calculation, and candidate airport ranking. The technical route is shown in
Figure 1.
3.2. Selection of Meteorological Elements
The appropriate selection of meteorological elements is the basis for constructing typical weather scenarios, which directly affects the accuracy of airport operational efficiency evaluation. Combined with the civil aviation air traffic management rules and the existing research [30,31,32], this paper selects six meteorological factors that directly affect the safety of flight takeoff and landing: horizontal visibility, cloud base height, average wind speed, rainfall, snowfall, and thunderstorms.
- (1)
Horizontal visibility
Low visibility can prevent pilots from seeing the runway or navigation signs, increasing operational risks, and also affect the operational efficiency of ground vehicles. When visibility is insufficient, the airport may suspend takeoffs and landings or reduce the frequency of flights, resulting in large-scale delays.
- (2)
Cloud base height
When the cloud base is too low, pilots may be unable to see the runway before reaching the decision altitude and must go around or set up for another approach. At the same time, cloud coverage may force the aircraft to detour, increasing fuel consumption and flight time.
- (3)
Average wind speed
When the crosswind exceeds the aircraft type limit, the aircraft may not be able to take off and land safely due to difficult control. Because the airport needs to dynamically adjust the use of the runway according to the wind direction, operating efficiency may be reduced. At the same time, high winds can also threaten the flight safety of aircraft, especially in the takeoff/landing phase.
- (4)
Rainfall
Heavy rainfall can cause aircraft to skid, requiring runway closure or extended takeoff and landing intervals. At the same time, heavy rain further deteriorates visibility, which may affect ground radar and communication systems. Continued rainfall can also create a backlog of flights, increasing ground waiting time.
- (5)
Snowfall
Snow reduces visibility and runway friction, and the frequent de-icing and snow clearing it requires occupy runway resources and narrow takeoff and landing windows. In extreme cases, the airport will be closed. Aircraft also need to be de-iced on snowy days, increasing ground support time and cost.
- (6)
Thunderstorm
A no-fly buffer zone should be set around the thunderstorm area, resulting in route detours or flow control. Lightning may interrupt ground operations, and aircraft on the tarmac need lightning protection, delaying the support process.
3.3. K-Means Clustering Analysis
3.3.1. K-Means Algorithm
Clustering analysis divides the objects in a dataset into groups so that objects in the same group are highly similar, while objects in different groups are dissimilar [33]. The K-means clustering algorithm is a commonly used unsupervised learning method that divides the dataset into $K$ clusters, with the goal of making the data points within each cluster as similar as possible and the data points in different clusters as different as possible. It is simple to implement and has a low time complexity that is linear in the number of data points [34]. Therefore, this paper uses the K-means clustering algorithm to cluster the 1924 meteorological data points obtained. The steps of the K-means clustering algorithm are as follows:
- ①
Initialization: Randomly select $K$ data points from the dataset as the initial cluster centers;
- ②
Assignment: For each data point in the dataset, the distance between the point and each center is calculated, and the point is assigned to the nearest center to form clusters;
- ③
Update: For each cluster, the mean value of all data in the cluster is calculated and the center is updated to the mean value;
- ④
Iteration: When the new cluster centers coincide with the previous centers or the change is below a certain threshold, the clustering result is stable and the clustering ends. Otherwise, repeat steps ② and ③.
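The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function name `kmeans`, the parameters `max_iter`, `tol`, and `seed`, and the handling of empty clusters are all choices made for the sketch.

```python
import math
import random

def kmeans(points, k, max_iter=100, tol=1e-9, seed=0):
    """Plain K-means on a list of equal-length numeric tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1 (initialization): pick k data points at random as starting centers.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 2 (assignment): attach every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Step 3 (update): move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Assumption of this sketch: an empty cluster keeps its previous center.
                new_centers.append(centers[c])
        # Step 4 (iteration): stop once no center moves by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels
```

Because the starting centers are random, results can differ between runs; fixing `seed` makes a run repeatable, and in practice several restarts are often compared.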
3.3.2. K Value Calculation
Selecting an appropriate value of $K$ is very important in the K-means clustering algorithm. The silhouette coefficient is an indicator used to evaluate the clustering effect; it measures how similar each data point is to its own cluster and how dissimilar it is to other clusters [35]. Its value lies in [−1, 1]. The closer the value is to 1, the more similar the data point is to the other points in its cluster and the more it differs from the points in other clusters, that is, the better the clustering effect; the closer the value is to −1, the worse the clustering effect. The sum of squared errors is the sum of the squared clustering errors over all data points and also indicates the quality of the clustering result. However, because the sum of squared errors equals 0 when the number of clusters equals the number of data points, it cannot be used alone to determine the number of clusters. To ensure the effectiveness of clustering, this paper combines the silhouette coefficient and the sum of squared errors to determine the optimal number of clusters.
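For reference, the two indicators are commonly written as follows. These are the standard textbook definitions rather than formulas printed in this paper, and the symbols $a(i)$, $b(i)$, $C_k$, and $\mu_k$ are notation assumed here: $a(i)$ is the mean distance from point $i$ to the other points in its own cluster, $b(i)$ is the smallest mean distance from point $i$ to the points of any other cluster, and $\mu_k$ is the center of cluster $C_k$.

```latex
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad
\mathrm{SSE} = \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - \mu_k \rVert^{2}
```

The overall silhouette coefficient of a clustering is the mean of $s(i)$ over all data points.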
The silhouette coefficients and sums of squared errors of the 1924 meteorological data points in this study under different numbers of clusters are shown in Figure 2. According to Figure 2, as the number of clusters increases, the sum of squared errors decreases and gradually stabilizes, while the silhouette coefficient increases and gradually stabilizes. When $K = 9$, the sum of squared errors is 2.89 and the silhouette coefficient is 0.9972, and both indicators have stabilized. Therefore, this paper chooses nine clusters, resulting in nine typical weather scenarios.
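To make the selection of the number of clusters concrete, the sketch below computes both indicators for a given clustering. It is a hedged illustration using the usual definitions of the two measures; the function names and the conventions for singleton clusters and identical points are assumptions of the sketch, not details taken from this study.

```python
import math

def sum_squared_errors(points, labels, centers):
    """SSE: squared distance from every point to the center of its cluster."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))

def mean_silhouette(points, labels):
    """Average silhouette coefficient; requires at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # Assumed convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

Evaluating both functions for a range of cluster counts and plotting the results gives curves of the kind described for Figure 2.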
3.4. Establishment of Quantitative Mapping Relationship Between Weather Scenarios and Flight Operation Indicators
In this section, the meteorological data and airport flight data of the existing airport are processed and analyzed. Based on the nine typical weather scenarios obtained by K-means clustering, the quantitative mapping relationship between weather scenarios and flight operation indicators is established, providing a data-driven basis for evaluating candidate airport efficiency. Firstly, taking the time recorded by each meteorological data point as the center, the planned takeoff flights are retrieved in the 30 min interval before and after that time. If there are no flights in the interval, the meteorological record is discarded. If there is only one flight, it is paired one-to-one. If there are multiple flights, the meteorological record is paired with all flights to form a “weather–flight” sample in nine typical weather scenarios. Secondly, in order to analyze the influence of different weather scenarios on takeoff, flights are divided into four categories: early takeoff, normal takeoff, delayed takeoff, and canceled takeoff, by calculating the delay time of flights. Finally, in order to further analyze the degree of flight delay under different weather scenarios, the distribution of flight delay degree under different weather scenarios was calculated and analyzed, and the delay degree was divided into six categories: <30 min, 30–60 min, 60–120 min, 120–180 min, >180 min, and cancellation. At the same time, the proportion of flight delays under different weather scenarios was analyzed, and the calculation formula is shown in Equation (2).
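The pairing rule described above (discard a weather record with no flights in its window, otherwise pair it with every flight in the window) can be sketched as follows. The dictionary keys `time`, `scenario`, and `planned_departure`, the function name, and the inclusive window boundaries are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def pair_weather_with_flights(weather_records, flights, window_min=30):
    """Return (weather_record, flight) pairs for flights planned within +/- window_min."""
    window = timedelta(minutes=window_min)
    pairs = []
    for record in weather_records:
        matched = [
            f for f in flights
            if abs(f["planned_departure"] - record["time"]) <= window
        ]
        # No flight in the window: nothing is added, so the record is discarded.
        # One or several flights: the record is paired with each of them.
        pairs.extend((record, f) for f in matched)
    return pairs
```

With hourly records and inclusive boundaries, a flight planned exactly on the half hour would match two records; whether such cases are assigned to one record only is not stated in the text and would need to be decided explicitly.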
- (1)
Flight delay time
Referring to the method used by Li et al. [36], this paper calculates flight delay time as the difference between the actual and planned departure times. The calculation formula is shown in Equation (1):
$$D_i = T_i^{\mathrm{act}} - T_i^{\mathrm{plan}} \qquad (1)$$
where $T_i^{\mathrm{act}}$ is the actual departure time of the $i$-th flight, $T_i^{\mathrm{plan}}$ is the planned departure time of the $i$-th flight, and $D_i$ is the delay time of the $i$-th flight. If $D_i < 0$, the takeoff is early; if $D_i = 0$, the takeoff is normal; if $D_i > 0$, the takeoff is delayed; if $T_i^{\mathrm{act}}$ is empty, the takeoff is canceled.
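The four-way classification can be expressed as a small function. This sketch assumes that a flight with no recorded actual departure time is canceled and that the boundary between normal and delayed takeoff is a delay of exactly zero; the `tolerance_min` parameter is hypothetical and only shows where a different on-time threshold would enter.

```python
from datetime import datetime

def classify_takeoff(planned, actual, tolerance_min=0.0):
    """Classify a flight as 'early', 'normal', 'delayed', or 'canceled'."""
    if actual is None:
        # Assumption: a missing actual departure time means the flight was canceled.
        return "canceled"
    # Delay time: actual minus planned departure time, in minutes.
    delay_min = (actual - planned).total_seconds() / 60.0
    if delay_min < -tolerance_min:
        return "early"
    if delay_min > tolerance_min:
        return "delayed"
    return "normal"
```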
- (2)
The proportion of flight delays under different weather scenarios
The proportion of flights at each delay level among the delayed flights under weather scenario $k$ is calculated as follows:
$$P_{k,j} = \frac{N_{k,j}}{N_k} \times 100\% \qquad (2)$$
where $N_{k,j}$ is the number of flights with delay level $j$ under scenario $k$, $j = 1, 2, \ldots, 6$, corresponding to the above six delay levels; $N_k$ is the total number of delayed flights under scenario $k$.
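A short sketch of this calculation is given below. The level labels, the function names, and the treatment of boundary values (for example, whether a delay of exactly 60 min falls in the 30–60 or the 60–120 level) are assumptions, since the text does not specify them; a delay of `None` stands for a cancellation.

```python
from collections import Counter

LEVELS = ["<30", "30-60", "60-120", "120-180", ">180", "canceled"]

def delay_level(delay_min):
    """Map a delay in minutes (None for a cancellation) to one of the six levels."""
    if delay_min is None:
        return "canceled"
    if delay_min < 30:
        return "<30"
    if delay_min < 60:
        return "30-60"
    if delay_min < 120:
        return "60-120"
    if delay_min <= 180:
        return "120-180"
    return ">180"

def level_proportions(delays_by_scenario):
    """Percentage of each delay level among the delayed flights of each scenario."""
    result = {}
    for scenario, delays in delays_by_scenario.items():
        counts = Counter(delay_level(d) for d in delays)
        total = len(delays)
        # Scenarios without delayed flights get an empty entry instead of a division by zero.
        result[scenario] = (
            {lvl: 100.0 * counts[lvl] / total for lvl in LEVELS} if total else {}
        )
    return result
```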
3.5. Candidate Airport Sorting
Based on the quantitative mapping relationship between typical weather scenarios and flight operation indicators established in
Section 3.4, this section proposes a method to evaluate the operational efficiency of candidate airports by counting the frequency of each type of weather scenario. The specific process is as follows:
- (1)
Data preparation: Prepare the historical meteorological data of the candidate airport (six meteorological elements at the same time intervals as the existing airport);
- (2)
Scenario matching: Classify each meteorological record of the candidate airport into one of nine typical weather scenarios based on the K-means clustering center distance;
- (3)
Frequency statistics: Calculate the frequency proportion of each type of weather scenario in each candidate airport;
- (4)
Airport sorting: Based on the quantitative mapping relationship between the established typical weather scenarios and flight operation indicators, sort the four candidate airports according to the frequency proportion of weather scenarios conducive to flight operation. The higher the frequency proportion of favorable weather scenarios, the higher the operational efficiency of the candidate airport. This indicator is then used to assist in airport site selection.
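Steps (2) to (4) can be sketched as a nearest-center assignment followed by a frequency count. The function names, the 0-based scenario indices, and the `favorable` set are hypothetical: which scenarios count as favorable is a judgment drawn from the calibration results and is not fixed by this sketch.

```python
import math
from collections import Counter

def scenario_frequencies(records, centers):
    """Assign each record to its nearest cluster center and return each scenario's share."""
    counts = Counter(
        min(range(len(centers)), key=lambda c: math.dist(r, centers[c]))
        for r in records
    )
    return {c: counts[c] / len(records) for c in range(len(centers))}

def rank_sites(site_records, centers, favorable):
    """Rank sites by the share of records that fall into favorable scenarios."""
    scores = {}
    for site, records in site_records.items():
        freqs = scenario_frequencies(records, centers)
        scores[site] = sum(share for c, share in freqs.items() if c in favorable)
    # Highest favorable share first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The records passed in must be quantified in the same way as the data used to obtain the cluster centers; otherwise the distances are not comparable.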
4. Experimental Results and Analysis
4.1. Extraction of Typical Weather Scenarios
To construct the core input of the site selection method, namely the typical weather scenarios, this study used Zhengzhou Airport as a calibration sample and performed K-means clustering on its meteorological data to extract generalizable weather scenario types. The results are shown in
Table 2.
According to
Table 2, among the 1924 weather data points included in this study, weather data belonging to category 2 account for the largest share (52.49%); the weather data belonging to category 9 account for the smallest share, with only six records.
This study defines category 2 weather as good weather, category 1 weather as moderate-visibility-and-rainfall weather, category 3 weather as moderate-visibility weather, category 4 weather as moderate-visibility, low-cloud-base weather, category 5 weather as moderate-visibility, medium-cloud-base weather, category 6 weather as moderate-visibility, snowfall weather, category 7 weather as high-wind-speed weather, category 8 weather as rainfall weather, and category 9 weather as snowfall weather. Based on the characteristics of each weather category, a distribution map of weather categories from January to April 2016 was drawn, as shown in
Figure 3.
According to
Figure 3, the weather from February to April 2016 was generally favorable. Among these months, good weather was most frequent in March 2016 and least frequent in April 2016. Moderate-visibility, snowfall weather, high-wind-speed weather, and snowfall weather appeared only in January, and rainfall weather appeared mainly in April.
4.2. Mapping the Relationship Between Weather Scenarios and Abnormal Flight Operation Rates
The key step in calibrating the method is to establish the correlation between weather scenarios and operational efficiency. This section uses flight data from Zhengzhou Airport to verify the impact of different weather scenarios on operations, providing an ‘impact weight’ basis for site selection evaluation. Analyzing how the weather categories affect flight departures also serves as a check on the weather data clustering and provides a basis for predicting flight delays from the clustered categories. To this end, the takeoff outcomes of flights under each weather type were counted. The results are shown in
Figure 4.
In this study, nine types of weather are defined, and the proportion of flights taking off early, on time, delayed, and cancelled under various weather conditions is counted. The influence of weather on flight departure is analyzed and the correctness of clustering is verified. It can be seen from
Table 2 and
Figure 4 that extreme weather has the most significant impact on flight departure from the perspective of specific weather types. For example, the delay + cancellation ratio of moderate-visibility and low-cloud-base weather (category 4) is 100%, of which the cancellation ratio is 33.33%, the highest among all categories, which directly reflects the important influence of low cloud base height on the visual reference of takeoff; flights can hardly take off normally in such weather. The delay rates of moderate-visibility, snowfall weather (category 6) and high-wind-speed weather (category 7) both exceed 85% (86.49% and 87.88%, respectively). Although the cancellation rate is relatively low, the problems of reduced runway friction coefficient caused by snowfall and excessive crosswind caused by high wind speed still seriously reduce takeoff efficiency. In contrast, the delay rate for general weather conditions (such as category 2 and category 8) is about 78–82%, which is greatly affected by non-weather factors, but overall meets the expectation of “less impact from good weather and moderate impact from rainfall”. The delay rate is the lowest in mild weather conditions such as moderate visibility (category 3) and snowfall (category 9) (68.37% and 66.67%, respectively), and the cancellation rate is extremely low (3.27%, 0%), indicating that flight operations are smoother in such weather conditions with only minor disruptions.
Overall, the worse the weather conditions, the greater the possibility of flight cancellations, and the proportion of delayed takeoffs is also affected by weather types. This not only verifies the rationality of weather clustering, but also provides a basis for predicting flight delays.
4.3. Mapping the Relationship Between Weather Scenarios and Severity of Flight Delays
To refine the dimensions of operational efficiency evaluation (not only looking at delay rate, but also delay duration), this section supplements the mapping relationship for ‘weather delay degree’ based on Zhengzhou Airport data, further improving the calibration depth of the method. To further analyze the degree of flight delays under different weather classifications, we calculated and analyzed the delay distribution for flights under each weather classification. The results are shown in
Figure 5. Additionally, we analyzed the proportion of each delay degree within the different weather categories, and the results are presented in
Figure 6 and
Table 3.
This study reveals the differential impact of weather on flight delays by counting the delay distribution of delayed flights under the nine weather types. Extreme weather types (categories 4 and 5) pose a serious threat to flight operations. Under category 4 weather (moderate-visibility, low-cloud-base weather), 40.00% of delayed flights were delayed by more than 180 min and the cancellation rate reached 33.33%, together exceeding 70%. Delays are therefore longer in this type of weather: a low cloud base seriously degrades the visual reference conditions at takeoff and is an important weather factor leading to long delays or cancellations. Under category 5 weather (moderate-visibility, medium-cloud-base weather), cancellations accounted for 30.00% of delayed flights and 10.00% of flights were delayed by more than 180 min, totaling 40%. A medium cloud base still significantly interferes with takeoff safety, resulting in a higher proportion of cancellations and long delays.
General weather and some specific weather types exhibit low-risk characteristics. Under category 2 weather (good weather) and category 3 weather (moderate-visibility weather), short delays (<30 min) account for more than 67% of delayed flights (67.69% in both cases), and the shares of flights delayed by more than 180 min (3.18% and 4.87%) and of cancelled flights (2.57% and 4.69%) are very low, indicating that these weather types interfere little with flight operations and mainly cause short delays. In category 7 weather (high-wind-speed weather), 68.97% of delayed flights have short delays, and in category 9 weather (snowfall weather) the share is 75.00%. Neither weather type produced delays of more than 180 min, indicating that high wind speeds are mostly short-lived and that light snowfall without low visibility has minimal impact on flights. Under category 6 weather (moderate-visibility, snowfall weather), although no flights have extremely long delays, medium to long delays (30–120 min) account for more than 50% of delayed flights, which makes this a “moderate risk” weather type.
Based on the above results and conclusions, the correlation between weather categories and flight delay levels has been verified: extreme weather conditions (such as low cloud base and medium cloud base) are more prone to causing flight delays or cancellations, while general weather conditions (such as good weather and moderate visibility) and specific conditions (such as short-term high wind speeds and light snowfall) are mainly characterized by short-term delays. This result provides data support for the construction of subsequent flight delay prediction models and the development of targeted response strategies.
Overall, there are significant differences in the impact of different types of weather on flight takeoff. Based on empirical data such as the proportion of abnormal takeoffs, flight delay rates, cancellation rates, and the proportion of extremely long delays (>180 min), the impact of various weather conditions on flight operations is ranked in descending order as follows:
- ①
Low-cloud-base weather (category 4): Flight delay and cancellation rate reach 100%. The proportion of ultra-long delayed flights among delayed flights is 40%, and the comprehensive impact is the most serious.
- ②
Moderate-visibility, snowfall weather (category 6): The proportion of flights with abnormal takeoff is as high as 91.9% (86.49% delayed + 5.41% cancelled), and 52.94% of delayed flights are medium to long delays (30–120 min). Although the cancellation rate is low, the delay range is wide and the degree is deep, second only to category 4 in terms of impact.
- ③
Moderate-visibility, medium-cloud-base weather (category 5): The flight cancellation rate is 16.67%, with 70% of delayed flights experiencing medium to long delays and 10% experiencing extremely long delays. Although the proportion of flights with abnormal takeoff is lower than that of category 6, the extreme impact is more prominent, ranking third.
- ④
Moderate-visibility, rainfall weather (category 1): The flight delay and cancellation rate is 80%, and 53.57% of delayed flights are medium to long delays. Although there are no ultra-long delays, the proportion of medium to long delays is high, and the impact is higher than that of high wind speeds (category 7).
- ⑤
High-wind-speed weather (category 7): The flight delay and cancellation rate is 90.91%, but 68.97% of delayed flights have short delays and no ultra-long delays, indicating that although high wind speeds cause flight delays, the impact time is short, ranking fifth.
- ⑥
Rainfall weather (category 8): The flight delay rate is 81.82%, but 66.67% of delayed flights have short delays, with no cancellations or long delays, so the impact is lower than that of high wind speeds.
- ⑦
Good weather (category 2): The flight delay and cancellation rate is 80.1%, but 67.69% of the delayed flights have short delays and only 3.18% have long delays, which is in line with the expectation of “little impact of good weather”.
- ⑧
Moderate-visibility weather (category 3): The flight delay and cancellation rate is 71.64%, with 67.69% of delayed flights experiencing short delays and 4.87% experiencing long delays, slightly lower than good weather.
- ⑨
Snowfall weather (category 9): The flight delay and cancellation rate is 66.67%, with 75% of delayed flights experiencing short delays and no cancellations or long delays.
It should be noted that the lowest-impact ranking of snowfall weather needs to be interpreted with caution. A key reason for this result may be that the sample of flights in this category is extremely small (only four), so the statistics may not accurately reflect the general pattern of the impact of snowfall weather on flight operations. The impact assessment of category 9 therefore requires further analysis and validation with larger samples in the future. The proportion of delay durations under each weather condition provides a quantitative standard for assessing the ‘extreme delay risk’ of candidate sites, making site selection recommendations more practically valuable.
4.4. Application of Method: Evaluation and Ranking of Candidate Airport Site Selection
Based on the established quantitative mapping relationship between typical weather scenarios and flight operation indicators, the frequency of occurrence of these typical weather scenarios is retrieved from the meteorological data of the candidate sites. This allows us to infer the future aviation operation efficiency of the candidate airports. The frequency distribution of each weather scenario at the candidate airports is shown in Table 4.
Based on the quantitative mapping relationship between weather scenarios and flight operation efficiency established in Section 4.2 and Section 4.3, we conduct the following inferential evaluation of the meteorological suitability of each candidate airport:
Candidate airport 4 requires close attention. Although none of the airports experienced the worst weather (category 4), this airport is the only site with medium-cloud-base weather (category 5), which accounts for 9.55% of its records. According to the calibration results in Section 4.2, the flight cancellation rate under category 5 weather is as high as 16.67%, and 10% of delayed flights are extremely long delays (>180 min). It can therefore be inferred that candidate airport 4, owing to its distinctive meteorological characteristics, may face higher risks of flight cancellations and extreme delays in future operations than the other sites, which directly leads to its lowest ranking in the evaluation system.
The operational efficiency of candidate airport 1 is expected to be better than that of airport 4 but weaker than that of airports 2 and 3. Although its proportion of good weather (category 2) is the lowest (37.99%), its main weather type is moderate-visibility weather (category 3, accounting for 56.75%). According to Figure 4 and the analysis in Section 4.3, category 3 weather has a relatively small impact on flight operations, with a low delay rate (71.64%) and mainly short delays (less than 30 min, accounting for 67.69% of delays). Therefore, although good weather does not dominate at this site, the dominant weather type carries lower risk, which supports relatively stable operational efficiency.
Candidate airports 2 and 3 exhibit the best meteorological suitability. The proportion of good weather (category 2) at both sites is extremely high (88.03% and 89.10%, respectively), and the combined proportion of weather categories that have a high impact on operations (such as categories 1, 4, 5, and 6) is extremely low (< 2.5% in total). According to the calibration results, flight operations are minimally affected under category 2 weather conditions. It can therefore be reasonably inferred that candidate airports 2 and 3 will have the highest and most reliable operational efficiency and are the optimal site selection schemes. The slight difference between the two may be due to the slightly higher proportion of high-wind-speed weather (category 7, 0.10%) at airport 2 and the small proportion of moderate-visibility, rainfall weather (category 1, 0.63%) at airport 3, but overall both can be considered ideal sites.
Based on the frequency distribution of various risk weather scenarios in the meteorological data of candidate sites and the previously established quantitative mapping relationship, the recommended priority order for the four candidate airports is: candidate airport 2 > candidate airport 3 > candidate airport 1 > candidate airport 4. This method provides objective and transparent data-driven support for site selection decisions by quantifying weather risks.
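The inference step used in this section, combining scenario frequencies at a site with the calibrated per-scenario indicators, can be sketched as a frequency-weighted sum. This is a minimal illustration under stated assumptions, not the procedure the study itself applied: the function `weighted_indicator` is hypothetical, and only the shares and rates quoted in the text are used (the full distribution is in Table 4).

```python
def weighted_indicator(freq, rates):
    """Return (weighted sum, covered share) over categories present in both dicts.

    freq  : weather category -> share of records at a candidate site
    rates : weather category -> calibrated operational indicator (0 to 1)
    """
    common = [cat for cat in freq if cat in rates]
    weighted = sum(freq[cat] * rates[cat] for cat in common)
    covered = sum(freq[cat] for cat in common)
    return weighted, covered


# Candidate airport 4: category 5 occurs 9.55% of the time, and the
# cancellation rate under category 5 is 16.67% (both quoted in the text).
cancel_contrib, _ = weighted_indicator({5: 0.0955}, {5: 0.1667})

# Candidate airport 1: only the two shares quoted in the text are used,
# with the delay-and-cancellation rates of categories 2 and 3.
a1_sum, a1_covered = weighted_indicator(
    {2: 0.3799, 3: 0.5675}, {2: 0.801, 3: 0.7164}
)
a1_rate = a1_sum / a1_covered  # average rate over the covered share only

print(f"airport 4, category 5 cancellation contribution: {cancel_contrib:.4f}")
print(f"airport 1, rate over {a1_covered:.2%} of records: {a1_rate:.4f}")
```

A single weighted delay rate is not enough to reproduce the ranking, because good weather itself has a delay rate of 80.1%; the ranking above also depends on delay duration and cancellations, so the same function would need to be applied to those indicators as well.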
5. Conclusions
This study applies the K-means clustering algorithm to the weather data of Zhengzhou Xinzheng International Airport from 1 January 2016 to 30 April 2016 and analyzes flight takeoff under the resulting weather conditions, including the degree of flight delays and the proportion of delayed flights. The results are as follows:
- (1)
A site selection assistance method was developed and calibrated: nine typical weather scenarios were extracted from the Zhengzhou Airport meteorological data, and a quantitative mapping relationship was established between these scenarios and flight operation efficiency.
- (2)
The weather showed clear seasonal characteristics: good weather was predominant from February to April 2016, snowfall occurred only in January, and rainfall was concentrated in April.
- (3)
The weight of weather impact was quantified. The degree of impact of different types of weather on flight takeoff, from high to low, is as follows: low-cloud-base weather; moderate-visibility, snowfall weather; moderate-visibility, medium-cloud-base weather; moderate-visibility, rainfall weather; high-wind-speed weather; rainfall weather; good weather; moderate-visibility weather; and snowfall weather.
- (4)
The candidate airports were ranked in the following order of priority: candidate airport 2, candidate airport 3, candidate airport 1, candidate airport 4.
This study has several limitations. First, it used meteorological and flight data only from January to April 2016, a short time span that does not cover seasonal changes throughout the year. This may bias the clustering results and the scenario frequency statistics, and it cannot fully reflect the long-term meteorological characteristics of the candidate airports. Second, the mapping relationship between weather scenarios and operational efficiency was established mainly through historical scenario frequency statistics and descriptive analysis. Although this approach can reveal the strength of the correlation and provide a relative ranking for site selection, formal statistical modeling methods such as logistic regression were not used to quantify the probability of flight delays or cancellations under each typical weather scenario. Third, the lowest-impact result for snowfall weather (category 9) needs to be interpreted with caution: because the sample size is extremely small, the statistics may not reflect general patterns, and further validation based on larger samples is needed. Future research will therefore focus on two aspects: obtaining multi-year meteorological and flight data to study the relationship between meteorological elements and flight operation efficiency and to analyze the long-term trends of meteorological elements and the frequency of extreme events, thereby improving the robustness of site selection evaluation; and introducing models such as logistic regression and random forest to construct a predictive evaluation framework that outputs delay probabilities under different weather conditions, thereby providing more accurate and quantitative decision support.