Application of Fuzzy-Based Support Vector Regression to Forecast of International Airport Freight Volumes

Abstract: As freight volumes increase, airports are likely to require additional infrastructure development, increased air services, and expanded facilities. Prediction of freight volumes could ensure effective investment. Among the computational intelligence models, support vector regression (SVR) has become the dominant modeling paradigm. In this study, a fuzzy-based SVR (FSVR) model was used to solve the freight volume prediction problem in international airports. The FSVR model can use a fuzzy time series of historical traffic changes for predictions. A fuzzy classification algorithm was used for elements of similar levels in the time series to appropriately divide traffic changes into fuzzy sets, generate membership function values, and establish a fuzzy relationship to produce a fuzzy interpolation with a minimal error. A comparison of the FSVR model with other models revealed that the FSVR model had the lowest mean absolute percentage error (all < 2.5%), mean absolute error, and root mean square error for all types of traffic at all the analyzed airports. Fuzzy sets can handle uncertainty and imprecision in time series. Therefore, the prediction accuracy of the entire time series model is improved by taking advantage of SVR and fuzzy sets. By using the highly accurate FSVR model to predict the future growth of air freight volume, airport management could analyze their existing facilities and service capacity to identify operational bottlenecks and plan future development. Among the models compared, the FSVR model was the most accurate for air traffic forecasting.


Introduction
Globalization has changed people's lives. According to the International Monetary Fund [1], the four fundamental aspects of globalization are "trade and international exchanges" [2], "capital and investment flows" [3], "population flows", and "knowledge spreading". Increasing international trade and investment have integrated the global markets into a "global village" [4]. Air transportation, which facilitates international interactions, is a global industry. Transportation for any purpose, including tourism, entertainment, business, and freight delivery, deepens the connections among international terminals worldwide; these terminals are closely related and affect each other. The 20-Year Passenger Forecast released by the International Air Transport Association (IATA) in October 2018 indicated that air passenger and freight transportation demand has remained strong and that the center of the industry has been shifting eastward at an increasing pace. The IATA further stated that, compared with the current level, air transport would double over the coming 20 years [5]. The compound annual growth rate of air transport will reach up to 3.5%. By 2037, the number of air passengers is expected to double, reaching 8.2 billion [6]. Furthermore, according to the global air transport periodic report published on 31 August 2021, air freight demand (measured in freight ton-kilometers) remained strong in July 2021: global air freight demand was 8.6% higher than in July 2019, a substantial increase in comparison with the long-term average increase of 4.7% [7].
Mathematics 2022, 10, 2399
According to the IATA, global air traffic reached only 43% of its pre-pandemic level in 2021 [8], a downward revision from the estimation of 51% at the end of 2020. However, freight demand is increasing rapidly and may continue to exceed expectations. The IATA expected air freight volumes to increase by 13.1% in 2021 to 63.1 million tons, surpassing 2019 levels and approaching the peak in 2018. Air transport is ripe for development in numerous countries and the future of the aviation industry is bright. Aviation will flourish and remain a vital component of the world economy due to the benefits of increased international connectivity. However, the IATA has also suggested that airports and air traffic control may not be able to handle passenger demand if this trend continues. Governments and infrastructure management authorities must strategically plan the future development of aviation infrastructure, and their decisions affect the value created by aviation in their corresponding regions [9].
Southwest Airlines instigated the age of low-cost aviation by entering the market as the earliest low-cost carrier (LCC) in 1971 [10]. In the last decade, LCCs have already increased the transportation demand. They provided approximately 3.6 billion seats in 2008 and this number had increased to approximately 5.3 billion by 2017. Likewise, the LCC market share grew from 21% in 2007 to 29% in 2017. Moreover, LCCs grew from having 4.4% of the intercontinental market share in 2008 to 11.4% in 2017; LCCs' regional market share grew from 23.6% to 31.4% in the same time period. Thus, both intercontinental and regional LCCs have grown rapidly.
As air traffic increases, the aviation industry continues to develop. Governments worldwide must devise methods for reducing airline bottlenecks and developing domestic air transport infrastructure [11]. Existing infrastructure and air services must be improved in response to the increased demand and any expansion of these services must be sustainable. Forecasting airport operations is necessary to ensure that investments are well-targeted and obtain a return. Governments must also understand that globalization has increased the social and economic prosperity of the world and that the use of protectionism to curb globalization will reduce development opportunities. Therefore, governments worldwide must perform appropriate traffic forecast planning for the aviation industry [12], construct efficient infrastructure and services, and deploy programs to meet national economic development goals [13].
Wang et al. explored the relative efficiency of three combined forecasts in forecasting the tourism demand for Hong Kong. The two-, three-, and four-model combinations were built from four modeling techniques: the autoregressive integrated moving average (ARIMA) model, the autoregressive distributed lag model, the error correction model, and the vector autoregressive model. The results showed that the combined forecast outperformed the best single forecasting model and avoided the risk of complete forecast failure [14]. Therefore, if several predictive models are available but it is unknown which of them would generate the best predictions, combining the predictions of the alternative models is an optimal low-risk approach. Saayman used naive forecasting models, the Holt-Winters exponential smoothing model (ETS) [15], ARIMA [16], and seasonal ARIMA (SARIMA) [17] to model and forecast tourism in South Africa's major intercontinental tourism markets, namely the United Kingdom, Germany, the Netherlands, the United States, and France [18]. The results revealed that the SARIMA model had the most accurate arrival predictions on three time horizons: 3, 6, and 12 months. Univariate forecasts were concluded to provide relatively accurate forecasts of tourist arrivals in South Africa, especially in the short term. However, the model did not assess the impact of external events; thus, its policy applications are limited [18]. Hassani used the root mean square error (RMSE) and direction of change (DC) to evaluate models; singular spectrum analysis (SSA)-R, SSA-V [19], ARIMA, and the trigonometric Box-Cox autoregression (AR)-moving average (MA) trend seasonal model [20] outperformed other models for forecasting the number of tourists arriving in Europe. The least accurate models were a neural network [15], the ETS, and AR fractionally integrated MA (ARFIMA), MA, and weighted MA (WMA) models [23].
Suryani developed a system dynamics model to forecast future air freight demand and thereby determine the terminal capacity required to support long-term growth at Taiwan Taoyuan International Airport [21]. Alexander developed a gravity model to predict air freight demand and evaluated the ability of gravity models to predict and accurately explain major economic shocks such as the global financial crisis [22].
Cao et al. used a hybrid of ARIMA and support vector machines (SVMs) to analyze the diversion patterns of subway passenger transportation during holidays [24]. Passenger flow data from China's Xi'an Line 3 subway station on National Day were used with the ARIMA model, the SVM model, and a hybrid of these models to predict the hourly passenger flow on the subway during the holiday. The results revealed that the relative error between the predicted and actual results was smaller for the hybrid model than for the other models. The hybrid model (ARIMA+SVM) is practical, generalizable, and suitable for predicting subway passenger diversion; thus, the hybrid model provides a quantitative basis for rationally developing the organization and management of passenger transport in subway stations during holidays or festivals [25]. Moreover, Bildirici proposed modeling strategies that benefit from neural network-based GARCH models and SVR-GARCH models to augment commonly used volatility models with support vector machines and neural networks [26]. Sharifian proposed a hybrid ensemble of SVR/GARCH predictors in each subscale and trained it to predict the refactored components of workload for dynamic resource allocation in mobile cloud computing environments [27].
In summary, hybrid models can obtain better forecasting results than single models. No single model is consistently superior in terms of prediction accuracy. Traditional time-series analysis models are not entirely inferior to machine-learning prediction models; a single traditional time-series analysis and forecasting model can be the best forecasting choice for a specific time series. However, all these models still have many disadvantages when forecasting in real situations. The traditional methods cannot handle forecasting problems in which the historical data are presented as approximate numerical values [28]. A comparison of the SVR-based and ARIMA-based methods, each having their own merits and weaknesses, has not been undertaken for forecasting from such approximate numerical values of historical data.
The fuzzy time series (FTS) model has been shown to effectively address the limitation of forecasting when the historical data are given as approximate numerical values [29]. Tai proposed an improved fuzzy time-series (IFTS) forecasting model that uses variations of data to effectively forecast from such approximate numerical values. Moreover, IFTS has produced more accurate predictions than other fuzzy SVRs [30]. Our objective is to accurately forecast international airport freight volumes using fuzzy-based SVR (FSVR). Our FSVR used IFTS as the primary mechanism of the forecasting system, which employed SVR with an improved fuzzy set to approximate fuzzy upper and lower bounds and then approximate numerical forecast values. The parameters in the models were investigated to find the optimal values for each data set because no model was considered to be optimal in all situations. All parameters were investigated and determined by appropriate methods. The proposed model was applied to many datasets with different characteristics and compared with the existing models. Freight volume period data for membership functions are fuzzy values that can be used to simulate processes that require economic expertise and knowledge. Our forecasting system is suitable for seasonal time series; it can interpolate historical data and predict the future. Furthermore, FSVR can efficiently handle time series and non-linear problems, resulting in a better performance. The contributions of this study are as follows: (1) development and optimization of an FSVR model for forecasting international airport freight volumes; (2) validation of the ability of the FSVR model to generate forecasts under multiple forecast parameters; (3) assessment of the model's accuracy over 1 period and 12 periods by using robust statistical indicators to compare the forecasts with observational data from the international airport association and with data published on the airport authorities' official websites.

Dataset Description
The object of this study was a time series of air traffic data. The selected airports were the 10 airports with the highest global airport passenger traffic in 2018 according to the statistics of Airports Council International; these 10 airports were Hartsfield-Jackson Atlanta International Airport (ATL), Beijing Capital International Airport (PEK), Dubai International Airport (DXB), Los Angeles International Airport (LAX), Tokyo International Airport (HND), Chicago O'Hare International Airport (ORD), London Heathrow Airport (LHR), Hong Kong International Airport (HKG), Shanghai Pudong International Airport (PVG), and Paris Charles de Gaulle International Airport (CDG) [31]. Data were collected for the period from August 2014 to December 2019. Data were obtained from the statistical reports of the International Airports Association and data published on the official websites of the airports' management authorities [32]. For the training data set, 1590 observations among 10 airports were extracted from August 2014 to December 2018. The data used the original values of freight volume, in units of 10,000 tons. For the test data set, 360 observations among 10 airports were extracted from January to December 2019. The time-series data used monthly intervals, summing the values of the observations within the same month at an airport. The training data set was used to analyze and build various predictive models. The established models were used to generate forecast values for the period of January to December 2019, and these forecast values were verified against the test data set. The objective of this study was to provide a forecast time series of freight volume data.
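The sample sizes quoted above can be checked with a few lines of arithmetic. The sketch below counts the monthly periods in the training and test windows; it assumes that the 1590 and 360 observations cover three traffic series per airport (passenger, aircraft movements, and freight), which is an inference from the totals rather than something stated in this section. The names `months_between` and `N_SERIES` are illustrative only.

```python
def months_between(start, end):
    """Number of monthly periods from start to end, inclusive; dates are (year, month)."""
    (y0, m0), (y1, m1) = start, end
    return (y1 - y0) * 12 + (m1 - m0) + 1

N_AIRPORTS = 10
N_SERIES = 3  # assumption: passenger, aircraft movements, and freight series

train_months = months_between((2014, 8), (2018, 12))   # August 2014 to December 2018
test_months = months_between((2019, 1), (2019, 12))    # January to December 2019

train_obs = train_months * N_AIRPORTS * N_SERIES
test_obs = test_months * N_AIRPORTS * N_SERIES
print(train_months, train_obs, test_months, test_obs)
```

Under this assumption, the counts agree with the 1590 training and 360 test observations reported in the text.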

Support Vector Regression
Support vector regression (SVR) extends the conventional SVM algorithm [33]. SVR is a supervised learning model based on regression analysis and is characterized by an ε-insensitive loss function for the training data; SVR can handle predictions of continuous data [34]. In SVR, a hyperplane is produced and the distance from the farthest sample point to the hyperplane is minimized. A nonlinear problem can thus be transformed into a linear problem by mapping the training data to a high-dimensional feature space. The training dataset is represented as $\{(x_i, y_i);\ i = 1, 2, \ldots, N;\ x_i \in \mathbb{R}^n;\ y_i \in \mathbb{R}\}$, where $x_i$ is the $i$th $n$-dimensional input vector, $y_i$ is the actual output value, and $N$ is the dataset size. The SVR function takes the standard form

$$f(x_i) = \omega^{\top}\phi(x_i) + b,$$

where $f(x_i)$ is the predicted value and $\phi(x_i)$ is the feature function of the input; $\omega$ and $b$ are adjustment factors, which are estimated using a penalty function as follows:

$$R(\omega) = \frac{1}{2}\lVert\omega\rVert^{2} + C\sum_{i=1}^{N} L_{\varepsilon}\bigl(y_i, f(x_i)\bigr), \qquad L_{\varepsilon}\bigl(y_i, f(x_i)\bigr) = \max\bigl(0,\ \lvert y_i - f(x_i)\rvert - \varepsilon\bigr),$$

where $C$ is the penalty coefficient and $\varepsilon$ is the maximum tolerable error. Two slack variables, $\xi_i$ and $\xi_i^{*}$, are introduced to handle the infeasible constraints of the optimization problem; these can be expressed as follows:

$$\min_{\omega,\, b,\, \xi,\, \xi^{*}}\ \frac{1}{2}\lVert\omega\rVert^{2} + C\sum_{i=1}^{N}\bigl(\xi_i + \xi_i^{*}\bigr) \quad \text{subject to} \quad \begin{cases} y_i - \omega^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \\ \omega^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\ \xi_i,\ \xi_i^{*} \ge 0, \end{cases}$$

where $\xi_i^{(*)}$ ensures that the constraints are satisfied, $C$ controls the balance between model complexity and the training error rate, and $\varepsilon$ is a constant that controls the error tolerance.
If ε is too small, overfitting can occur; if it is too large, underfitting can occur. The dual optimization problem based on the Lagrangian is obtained as follows:

$$\max_{\alpha,\, \alpha^{*}}\ -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\bigl(\alpha_i - \alpha_i^{*}\bigr)\bigl(\alpha_j - \alpha_j^{*}\bigr)k(x_i, x_j) - \varepsilon\sum_{i=1}^{N}\bigl(\alpha_i + \alpha_i^{*}\bigr) + \sum_{i=1}^{N} y_i\bigl(\alpha_i - \alpha_i^{*}\bigr) \quad \text{subject to} \quad \sum_{i=1}^{N}\bigl(\alpha_i - \alpha_i^{*}\bigr) = 0,\quad 0 \le \alpha_i,\ \alpha_i^{*} \le C.$$

Therefore, an SVR function can be obtained as follows:

$$f(x) = \sum_{i=1}^{N}\bigl(\alpha_i - \alpha_i^{*}\bigr)k(x_i, x) + b,$$

where $\alpha_i$ and $\alpha_i^{*}$ are Lagrangian multipliers and $k(x_i, x)$ is the kernel function. A multivariate model is built by additive decomposition of a univariate time-series model and the kernel function class is closed under additive decomposition for the SVR model. Five kernels (spline kernel, Gaussian radial basis function (RBF) kernel, linear kernel, polynomial kernel, and pair Hidden Markov Model (PHMM) kernel) are commonly used for SVR models. Gaussian RBF kernels are widely used for nonlinear mapping between an input and a high-dimensional space. The Gaussian RBF kernel works well with additive decomposition, especially when the interaction between the individual time series is considered [35]. The Gaussian RBF kernel assumes full interaction between a single time series and all time points in a window, which can be as error prone as the assumption of no interaction. Rohmah has shown that the Gaussian RBF kernel yields better results than other common SVR kernel functions in time-series analysis [36]. The Gaussian RBF kernel function constructs a nonlinear decision hyperplane in the input space. The formula is as follows:

$$k(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j\rVert^{2}}{2\sigma^{2}}\right),$$

where $\sigma$ is the scaling factor of the Gaussian RBF kernel.
The accuracy and stability of SVR are closely related to three parameters: the penalty coefficient C, the kernel parameter σ, and the width ε of the insensitive loss function. C is the penalty coefficient and is used to balance the complexity of the model and the training error; ε is the width of the insensitive loss function, which controls the width of the SVR insensitive band and affects the number of support vectors; finally, σ is the kernel parameter for the Gaussian RBF. The kernel parameter affects the distribution and range characteristics of the training sample data, which determine the width of the local neighborhood.
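To make the roles of σ and ε concrete, the following minimal sketch implements the Gaussian RBF kernel, the ε-insensitive loss, and the kernel expansion of a trained SVR. It is an illustration rather than the implementation used in this study: the kernel is written as exp(-||x - z||² / (2σ²)), which is one common parameterization (some libraries use an equivalent `gamma` parameter instead), and the support vectors, dual coefficients, and bias below are made-up toy values.

```python
import math

def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel exp(-||x - z||^2 / (2 * sigma^2)) for equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the eps-tube, linear in the absolute error outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Kernel expansion f(x) = sum_i coef_i * k(x_i, x) + b, with coef_i = alpha_i - alpha_i*."""
    return b + sum(coef * rbf_kernel(sv, x, sigma)
                   for sv, coef in zip(support_vectors, dual_coefs))

# Toy values (hypothetical, for illustration only).
support_vectors = [[1.0], [2.0]]
dual_coefs = [0.5, -0.25]
prediction = svr_predict([1.0], support_vectors, dual_coefs, b=0.1, sigma=1.0)

inside_tube = eps_insensitive_loss(1.0, 1.05, eps=0.1)   # error smaller than eps
outside_tube = eps_insensitive_loss(1.0, 1.5, eps=0.1)   # error larger than eps
```

A smaller σ makes the kernel decay faster with distance, so each support vector influences a narrower neighborhood; a larger ε widens the tube inside which errors are ignored, which typically reduces the number of support vectors.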

Fuzzy SVR
The FSVR model is based on the IFTS model [30]. The fuzzy system provides a dynamic, probable, and intensive rule base for the system to overcome uncertainties in the raw data [37,38]. The IFTS model is based on changes in data between two consecutive periods and the fuzzy relationship between elements in the series. The IFTS model can be used for fuzzy historical interpolation and prediction of the future; it can be applied to nonseasonal time series. In the proposed model, all parameters are calculated with appropriate methods to enable operations on data sets with different characteristics. The IFTS model is more effective for predictions than other models. The IFTS model is available as a function in an R package, which facilitates its application. The steps of the IFTS are introduced in Algorithm 1, and further details follow the algorithm [30].
Algorithm 1: Fuzzy time series using variations of data

Definition: Universal set $U$ contains the interval between the least and greatest variations in the dataset, $U$ = [Min [39], Max [39]].

Input: Air traffic (passenger, aircraft movements, and freight) data set $X_i$ corresponding to time $t_i$, $i = 1, 2, \ldots, n$.

Output: Fuzzy model of the air traffic volume time series with the smallest RMSE value.

Divide $U$ into $m$ equal intervals (fuzzy sets) $u_i$, $i = 1, 2, \ldots, m$.
Find the midpoints $u_i^{0}$, $i = 1, \ldots, m$, of the intervals, with initial values $m = 5, 6, 7, \ldots, 11$.
Calculate the $C$ value of each interval.
Compute and define $F(t)$, the fuzzy forecast of the variations at time $t$.
Forecast 7 sets of fuzzy model data ($m$ (7) × $w$ (1)) for the time series; the forecast value for $t = w$ is calculated from the variations at the prior times $(t-1, \ldots, t-w)$.
Compare each set of fuzzy model data with the real data using the RMSE, which is also used as the evaluation criterion for comparison with the listed models.

Definition 1. Let $U$ be the whole object to be discussed, called the universe, $U = \{u_1, u_2, \ldots, u_m\}$, where $u_i$ represents each element in the universe. The fuzzy set $A$ of $U$ is defined as follows:

$$A = \frac{\mu_A(u_1)}{u_1} + \frac{\mu_A(u_2)}{u_2} + \cdots + \frac{\mu_A(u_m)}{u_m},$$

where $\mu_A$ is the membership function, $\mu_A: U \to [0, 1]$, and $\mu_A(u_i)$ indicates the degree of membership of $u_i$ in $A$, with $\mu_A(u_i) \in [0, 1]$ and $1 \le i \le m$.
Suppose the data set $X_i$ corresponds to time $t_i$, $i = 1, 2, \ldots, n$. Based on the fuzzy set of the variation between $X_{i+1}$ and $X_i$, the five steps of establishing an improved fuzzy time-series model [30] are as follows:
Step 1: calculate the data change between two consecutive periods and find the minimum value Min [39] and maximum value Max [39] that bound the universe $U$.
Step 2: divide the universe $U$ into $m$ equal-length intervals $u_i$, $i = 1, 2, \ldots, m$. Each interval $u_i$ contains different growth rate values for different data. The midpoint of each interval ($u_i^{0}$, $i = 1, 2, \ldots, m$) can be obtained.
Step 3: determine the corresponding value of the fuzzy set $A_i$ of $F(t)$. The fuzzy sets $A_1, A_2, \ldots, A_m$ are defined through a membership function that depends on a constant $C \in (0, 1)$, on $U_i$, the data change between two consecutive periods determined in Step 1, and on $u_i^{0}$, the midpoint of each interval determined in Step 2.
Step 4: select the corresponding interval base value $w$ ($1 < w < n$). In accordance with the value of $w$, the fuzzy relation matrix $R^{w}(t)$ is calculated. Once $w$ has been selected, an $i \times j$ operation matrix $O^{w}(t)$ is obtained, where $i$ is the number of rows and $j$ is the number of columns. The operation matrix corresponds to the data at $t-2, t-3, \ldots, t-w$, and its number of columns is consistent with the number of change intervals. A $1 \times j$ matrix $K(t)$ (the row matrix corresponding to the fuzzy change in the data at time $t-1$) is also obtained. The relation matrix $R(t)$ is calculated from $O^{w}(t)$ and $K(t)$, and the fuzzy time series $F(t)$ is then derived from $R(t)$.
Step 5: predict the data at time $t$ from $F(t)$, where $\mu_t(u_i)$ is an element of $F(t)$, $X(t-1)$ is the value at time $t-1$, and $X(t)$ is the predicted value at time $t$. The predicted value $X(t)$ depends on the actual value of $X(t-1)$ and the value of $V(t)$. The value of $V(t)$ is determined in accordance with the change in data in the entire time series and its previous values.
The algorithm can be summarized as follows. Data changes between two consecutive periods are divided into appropriate groups; larger data changes correspond to a greater number of clusters. A fuzzy relationship between the universe and the fuzzy set can then be established. The forecast for $t = w$ is calculated from the variations at the preceding times $t-1, t-2, \ldots, t-w$. The obtained results are compared with the actual values and the error is estimated to evaluate the model's validity.
The constant $C$ affects the result of $\mu_{A_i}(u_i)$. The criterion for evaluating the forecasting model (CEF model) is used to select the optimal $C$ value; this is achieved through the following steps:
Step 1: import the values of $k$ and ε, where $k$ is the number of parts into which the search range is divided in each iteration and ε is the error tolerance of $C$. The smaller the value of ε, the longer the computation time required.
Step 2: when $t = 0$, allocate the initial values.
Step 3: when $t = i$, $i \ge 1$, calculate the values for the current iteration, with a special case when $a = 0$ and $b = 1$.
Step 4: calculate the IFTS using $C$.
In the IFTS model, the dividing intervals for the universal set (DIU) algorithm includes the following three steps:
Step 1: set $t = 0$ and choose a small positive number $\varepsilon > 0$. The initial sequence of clustering elements is represented as $z^{(0)} = \bigl(z_1^{(0)}, z_2^{(0)}, \ldots, z_n^{(0)}\bigr) = (x_1, x_2, \ldots, x_n)$.
Step 2: update each fuzzy data point $z_i^{(t)}$ using a weighting function $f$, where $f$ is a truncated Gaussian kernel, $d$ is the distance between two elements, $d_s$ is the average value of the distance between all pairs of data elements, $n$ is the number of data points, and λ depends on $d_s$. When λ→0, the data have $n$ intervals, and when λ→∞, the data have one interval.
Step 3: repeat Step 2 until $\max_i d\bigl(z_i^{(t+1)}, z_i^{(t)}\bigr) < \varepsilon$ is met. Each element in the data set converges to a representative element $z_i^{(t)}$, $i = 1, 2, \ldots, m$. When the calculation is finished, a sequence of $m$ representative elements is obtained, and $m$ is the number of intervals dividing the universe.
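Steps 1 to 5 of the IFTS model described above can be sketched in a few functions. The sketch rests on several assumptions that are not confirmed by this section: the membership function is taken as 1 / (1 + (C (V - u⁰))²), the relation matrix combines each row of the operation matrix with K(t) by an element-wise minimum, F(t) is the column-wise maximum of R(t), and V(t) is a membership-weighted average of the interval midpoints. These are common choices in variation-based fuzzy time-series models. All function names and the parameter values in the usage lines are hypothetical.

```python
def variations(series):
    """Step 1: changes between consecutive periods."""
    return [b - a for a, b in zip(series, series[1:])]

def midpoints(lo, hi, m):
    """Step 2: midpoints of m equal-length intervals covering [lo, hi]."""
    width = (hi - lo) / m
    return [lo + (k + 0.5) * width for k in range(m)]

def fuzzify(v, mids, c):
    """Step 3 (assumed form): membership of variation v in each interval."""
    return [1.0 / (1.0 + (c * (v - u0)) ** 2) for u0 in mids]

def forecast_next(series, m=5, w=3, c=0.5):
    """Steps 4 and 5: one-step-ahead forecast X(t) = X(t-1) + V(t)."""
    if w < 2:
        raise ValueError("w must be at least 2")
    var = variations(series)
    if len(var) < w:
        raise ValueError("need at least w variations")
    mids = midpoints(min(var), max(var), m)
    fuzzy = [fuzzify(v, mids, c) for v in var]
    k_row = fuzzy[-1]        # K(t): fuzzy variation at time t-1
    o_rows = fuzzy[-w:-1]    # O^w(t): fuzzy variations at times t-w, ..., t-2
    # Assumed composition: R[i][j] = min(O[i][j], K[j]); F(t)[j] = max over i of R[i][j].
    f_t = [max(min(row[j], k_row[j]) for row in o_rows) for j in range(m)]
    # Assumed defuzzification: membership-weighted average of the midpoints.
    v_t = sum(mu * u0 for mu, u0 in zip(f_t, mids)) / sum(f_t)
    return series[-1] + v_t

# Hypothetical monthly values, for illustration only.
steady = forecast_next([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
irregular = forecast_next([10.0, 12.0, 11.0, 15.0, 14.0, 18.0, 17.0])
```

Because V(t) is a weighted average of interval midpoints, the forecast always lies between the last observation plus the smallest historical variation and the last observation plus the largest one, which keeps the sketch from extrapolating beyond the observed range of changes.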
In this study, the fuzzy time series of the aforementioned model was used to fuzzify an original air traffic time series. A set of corresponding fuzzy data was output through fuzzy reasoning, and the fuzzy time-series data were taken as the independent regression variable for SVR. The independent regression variable extracted the base number corresponding to a 12-period interval (due to the seasonal cycle) combined with the results from the machine-learning SVR model. Thus, an FSVR model was developed for predicting air traffic volume. In the process of fuzzification, the factors affecting the fuzzy set segmentation were as follows:
1. The schedule in winter and summer in accordance with the time zone of each airport;
2. The role of each airport in the global air transport network in terms of its unique function due to its geographical location;
3. The continuous holidays of countries in each region;
4. Demand for tourism in the low and peak seasons or the impact of significant activities, such as the Olympic Games or World Expo.
Two seasonal flight schedules were adopted. Winter flights were defined as those from November 1 to March 31 of the following year and summer flights were those from April 1 to October 31. The fuzzy set was established in accordance with the air traffic domain data based on the flight schedule factor and could be divided into the winter peak, summer peak, low peaks, and an intermediate conversion value. The basic fuzzy set segmentation was m = 5 (summer peak, winter peak, intermediate transformation, summer low peak, and winter low peak) in addition to other factors affecting traffic, such as local tourism demand in low and peak seasons, national holidays, and demand for goods or trade. The membership function of the fuzzy features could be adjusted to obtain additional fuzzy sets. In this study, the fuzzy set of air traffic was divided into at least 5 fuzzy sets and at most 11 fuzzy sets; the number of fuzzy sets was thus 5 ≤ m ≤ 11. The previous time interval was set to w = 12 and the seasonal factor was 12 months. By using the IFTS model, seven groups of fuzzy data sets (one for each value of m from 5 to 11) were generated for w = 12. For each group of fuzzy data sets, the RMSE was calculated and the group of fuzzy data sets with the smallest RMSE was selected as the input data for the autoregressive independent variables in the next step. The fuzzy extraction parameter w was the interval cardinality corresponding to the 12 cycles and the fuzzy relationship matrix was calculated prior to these 12 cycles. Therefore, the amount of data in each group was divided by 12. The original data were the dependent variable for the SVR regression calculation; the independent variable was the fuzzy data. The data were then divided into training and testing sets to identify the optimal parameters for SVR and to establish the prediction model.
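The selection rule in the paragraph above (keep the group of fuzzy data whose RMSE against the original series is smallest, then pair it with the original data for regression) can be sketched as follows. The function names and the toy numbers are illustrative assumptions; in the study, the candidate series would come from the IFTS model run with m = 5, ..., 11 and w = 12.

```python
import math

def rmse(actual, fitted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))

def select_fuzzy_model(actual, fitted_by_m):
    """Return (m, rmse) for the candidate fuzzy series with the smallest RMSE."""
    scores = {m: rmse(actual, fitted) for m, fitted in fitted_by_m.items()}
    best_m = min(scores, key=scores.get)
    return best_m, scores[best_m]

def regression_pairs(fuzzy_series, original_series):
    """Pair each fuzzy value (independent variable) with the original value (dependent variable)."""
    return list(zip(fuzzy_series, original_series))

# Toy data (hypothetical): original values and three candidate fuzzy series.
actual = [10.0, 12.0, 11.0, 13.0]
candidates = {
    5: [10.5, 11.5, 11.5, 12.5],
    7: [10.1, 12.1, 10.9, 13.1],
    11: [9.0, 13.0, 12.0, 12.0],
}
best_m, best_score = select_fuzzy_model(actual, candidates)
pairs = regression_pairs(candidates[best_m], actual)
```

Keeping the selection separate from the regression step makes it straightforward to rerun the comparison when the range of m or the window w is changed.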

Evaluation Criteria
In this study, the mean absolute percentage error (MAPE), mean absolute error (MAE), and RMSE were selected as the measurement indexes to evaluate which models had high predictive performance. The MAPE, RMSE, and MAE are calculated as follows:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left\lvert\frac{Y_i - \hat{Y}_i}{Y_i}\right\rvert, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(Y_i - \hat{Y}_i\bigr)^{2}}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\bigl\lvert Y_i - \hat{Y}_i\bigr\rvert,$$

where $Y_i$ represents the actual value, $\hat{Y}_i$ represents a predicted value, and $n$ represents the number of prediction periods. The MAPE is a relative index and is unaffected by units or the magnitude of the actual and predicted values. The MAPE can be used to evaluate the relative size of the difference between predicted and actual values. The RMSE is the square root of the mean of the squared deviations between the predicted and actual values; it is thus a measurement of the deviation between predicted and actual values that penalizes large errors more heavily. The MAE indicates the precision of the predictions: it is the mean of the absolute residuals between the predicted and actual values over all observations; thus, it can be used to evaluate differences between predicted and actual values.
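The three indicators are simple to compute directly. The sketch below uses the standard definitions of MAPE, MAE, and RMSE; the function names and the example values are illustrative and are not taken from the study's tables.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 / len(actual) * sum(abs((a - p) / a) for a, p in zip(actual, predicted))

def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical actual and predicted values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
scores = (mape(actual, predicted), mae(actual, predicted), rmse(actual, predicted))
```

In this toy example the two nonzero errors have the same absolute size (10 units), so they contribute equally to the MAE and RMSE, but the error on the smaller actual value contributes twice as much to the MAPE.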

Results and Discussion
The FSVR model predictions were compared with those of five models, namely the Holt-Winters, ETS, ARIMA, SARIMA, and SVR models. The training data set was extracted from August 2014 to December 2018; the test data set was extracted from January to December 2019. The training set was used to train each predictive model; the established models were then used to predict values from January to December 2019; these predicted values for 2019 were verified with the test set values. Tables 1 and 2 provide data on the freight volume of each airport. Based on the study by Zhang (2021) [40], we assumed that additive decomposition decomposes the time series into three components as $y_t = T_t + S_t + R_t$, where $T_t$ is the trend item, $S_t$ is the seasonal item, and $R_t$ is the residual item.
The trends and seasonality of the three air traffic time series (with 3-, 6-, or 12-month time horizons) were decomposed; the trend strength is defined as in Formula (27) and is between 0 and 1. The seasonal intensity is defined as in Formula (28), which is computed using detrended data and is also between 0 and 1. Seasonal intensity close to 0 indicates that the series has little seasonality [15]. The seasonality and trend strength of the three air traffic time series are presented in Table 2. In terms of freight volume, the airports with seasonality exceeding 0.9 were PEK, HND, HKG, and PVG, which had values ranging from 0.92 to 0.96; DXB had the weakest seasonality, at 0.69. ATL, ORD, and CDG had seasonal intensity of 0.73-0.79, and LAX and LHR had seasonal intensity of 0.86-0.88. However, most airports had relatively low seasonality for freight traffic. The trend strength for freight volume was lowest for DXB at 0.48; CDG also had a weak trend, with a strength of 0.53. LHR had the strongest trend, with a strength of 0.90. The other seven airports had trends of strength 0.71-0.89, indicating relatively weak trends. The trend in freight volume at each airport was relatively weak.
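Formulas (27) and (28) are not reproduced in this section. As a hedged illustration, the sketch below assumes the widely used variance-ratio definitions, in which the strength of a component is max(0, 1 - Var(R) / Var(component + R)) with R the residual of an additive decomposition. The function name and the toy components are hypothetical.

```python
from statistics import pvariance

def strength(component, remainder):
    """Assumed definition: max(0, 1 - Var(R) / Var(component + R)), bounded in [0, 1]."""
    combined = [c + r for c, r in zip(component, remainder)]
    denominator = pvariance(combined)
    if denominator == 0:
        return 0.0  # a flat series has neither trend nor seasonality
    return max(0.0, 1.0 - pvariance(remainder) / denominator)

# Toy additive components over 24 months (hypothetical values).
trend = [float(t) for t in range(24)]
seasonal = [1.0, -1.0] * 12
no_noise = [0.0] * 24
pure_noise = [0.5, -0.5] * 12

trend_strength = strength(trend, no_noise)            # residual is zero: strength is 1
seasonal_strength = strength(seasonal, no_noise)      # residual is zero: strength is 1
absent_component = strength([0.0] * 24, pure_noise)   # only residual present: strength is 0
```

Under this definition, a value near 1 means that the residual explains almost none of the variation left after removing the other component, which matches the interpretation in the text that values near 0 indicate little seasonality.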
Among the five forecasting models, the SVR model resulted in the lowest MAE value for five airports, namely ATL, LAX, ORD, DXB, and CDG (Table 3). The Holt-Winters additive model yielded the smallest MAE for LHR and HKG. ETS had the lowest MAE for PEK and HND; SARIMA had the lowest MAE for only PVG. However, the FSVR model achieved a lower MAE than all of the other models for every airport. The average MAE of the FSVR model (0.209) was 82% smaller than the next best MAE, that of the SVR model (1.215).
The MAPEs of the five prediction models (Table 3) were <10% for only two airports: ATL (8.601%) and PEK (9.925%). The Holt-Winters additive model had MAPE >10% for ORD and PVG (good predictions) and <10% for all other airports (highly accurate predictions). However, the FSVR model yielded a lower MAPE (<1.6%) than the other models for all airports, indicating that its predictions were highly accurate. The overall average MAPE of the FSVR model (1.019%) was approximately 84% lower than that of SVR, the next best model (6.625%).
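The relative improvements quoted for the average MAE and MAPE follow from a one-line calculation, (next best - FSVR) / next best. The sketch below reproduces it from the averages given in the text; the function name is illustrative.

```python
def relative_reduction(best, next_best):
    """Percentage by which `best` is lower than `next_best`."""
    return 100.0 * (next_best - best) / next_best

mae_reduction = relative_reduction(0.209, 1.215)    # average MAE: FSVR vs. SVR
mape_reduction = relative_reduction(1.019, 6.625)   # average MAPE: FSVR vs. SVR
print(f"MAE reduction: {mae_reduction:.1f}%, MAPE reduction: {mape_reduction:.1f}%")
```

Both results fall between the quoted whole-number percentage and the next one up, so the figures in the text appear to be rounded down.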
The FSVR model was thus the best model for forecasting freight volumes in international airports according to all three of the indicators. The relevant parameters of each prediction model are listed in Table 3. Figure 1 presents the actual and predicted freight volumes for 2019. The RMSEs of the five prediction models are also presented in Table 4. The ARIMA model resulted in the highest RMSE and was thus the least accurate prediction model. The Holt-Winters additive model yielded the smallest RMSE for LAX, LHR, and HKG, the ETS model had the lowest RMSE for PEK and HND, and the SARIMA model had the lowest value for PVG. The SVR model returned the lowest RMSE for three airports: ATL, DXB, and CDG. However, the FSVR model had a smaller RMSE than the other five models for all airports. The mean RMSE for the FSVR model was 0.304, approximately 81% lower than that for the next-best model (SVR), which was 1.681. The hyperparameters of the SVR algorithm, namely the penalty coefficient C and the kernel parameter σ of the Gaussian RBF, should be optimized to find the optimal values for each data set, because no model is considered to be optimal in all situations. For most of the models, the width of the insensitive loss function ε was set to 0.1; however, ε is the allowable error of the "ε-tube", which represents the approximation accuracy of the training data points. Many studies have demonstrated that adjusting ε can improve the SVR accuracy [41][42][43]. In our results, ε was also found to influence the accuracy of both the SVR and FSVR algorithms.
Historical air traffic data were used as the autoregressive term in the regression term of the SVR model. The functional relationship between the current data (the dependent variable) and the historical traffic data (the independent variable) was used to calculate the results. Global air transport is primarily passenger transport, and airlines estimate passenger travel demand for each season. Annual applications for allocations are divided into two fixed seasons, winter and summer, as determined by an international conference. Airline transportation operates in accordance with these winter and summer schedules; thus, the schedule has a 1-year (12-month) cycle. According to the seasonal analysis of various types of air traffic volume presented in Table 2, the seasonal pattern of air freight volume was inconsistent across airports because of the relatively small volume of all-freight aircraft service traffic, and the seasonal intensity varied from strong to weak. Therefore, the current traffic volume was assumed to be related to the value of the independent variable x either 1 or 12 periods behind the original traffic data. Through SVR fitting of data from 1 or 12 periods in the past, the functional relationships of the data were calculated. The number of autoregressive lag periods that produced the optimal dependent variable y could be determined using the RMSE and MAPE values. For example, the traffic volume at ATL was determined through the following procedure. The analysis of the SVR autoregressive lag period for ATL air traffic volume is presented in Table 5. "One period behind" indicates that the predicted variable is modeled with the first lagged observation, in the form $y_t = f(y_{t-1})$. For the freight volume item, the SVR model performed better when the independent variable was traffic data from one period behind. Therefore, the SVR model predictions of the freight volume for each airport in this paper are based on data from one period behind.
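The lag comparison described above can be sketched as follows. This is only an illustration under stated assumptions: an ordinary least-squares line stands in for the SVR fit, the series is synthetic with a pure 12-month cycle, and all function names are hypothetical.

```python
import math


def lagged_pairs(series, lag):
    """Pair each observation y_t with its predecessor y_{t-lag}."""
    return series[:-lag], series[lag:]


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x (stand-in for the SVR fit)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def in_sample_rmse(series, lag):
    """RMSE of the fitted relationship y_t = f(y_{t-lag}) on the series itself."""
    x, y = lagged_pairs(series, lag)
    slope, intercept = fit_line(x, y)
    squared = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(squared / len(y))


# Synthetic monthly series with a pure 12-month cycle (not airport data).
series = [100.0 + 10.0 * math.sin(2 * math.pi * t / 12) for t in range(60)]

# Compare the two candidate lags and keep the one with the lower RMSE.
rmse_by_lag = {lag: in_sample_rmse(series, lag) for lag in (1, 12)}
best_lag = min(rmse_by_lag, key=rmse_by_lag.get)
```

Because this synthetic series is perfectly periodic, lag 12 wins by construction. The freight series in the study behaved differently, which is why a lag of one period was selected there.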
For the SVM function in the R software, the three key parameters of SVR were set to their default values: penalty coefficient C = 1, kernel parameter of the Gaussian RBF σ = 1, and width of the insensitive loss function ε = 0.1. The parameters C, σ, and ε were adjusted in three stages using the grid search method in R. The default values of the three parameters served as the starting values for the optimization. A corresponding model was trained for each parameter combination, the performance of each model was determined, and the model with the highest performance was selected. During the training process, the built-in tune function in the R e1071 package was used to adjust the parameters and automatically cross-validate the model to ensure that it was reliable.

The fuzzification of different airports and traffic categories forms different universes of discourse. In fuzzy theory, each universe of discourse is divided into different numbers of fuzzy sets in accordance with the universe's characteristics, and membership functions are calculated. The FSVR model used the IFTS model combined with SVR to fuzzify air traffic volume; the fuzzy traffic volume was then used as the AR independent variable for SVR. In the fuzzification process, the universe of data for the time series of traffic volume was defined and divided into appropriate fuzzy sets with different increments of change; this process produced fuzzy data with a lower RMSE. As described near the end of the Methods section, seven fuzzy groups defined as m = 5, 6, 7, 8, 9, 10, or 11 were selected, and w was 12. Using the data from these 12 periods, the series was fuzzified under each of the seven groups. The fuzzified data with the smallest RMSE were used as the input for the optimal FSVR AR term in the model.
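The selection of the number of fuzzy sets can be sketched as a loop over the candidate values m = 5, ..., 11 that keeps the m with the smallest RMSE between the original and the fuzzified series. The fuzzifier below is a deliberately simple stand-in and not the IFTS procedure of [30]: it assumes evenly spaced triangular membership functions and defuzzifies each value to the center of its maximum-membership set. The function names and the sample series are hypothetical.

```python
import math


def triangular_memberships(value, centers, step):
    """Membership of `value` in each triangular fuzzy set centered at `centers`."""
    return [max(0.0, 1.0 - abs(value - c) / step) for c in centers]


def fuzzify_defuzzify(series, m):
    """Stand-in fuzzifier: map each value to the center of its best-matching set."""
    lo, hi = min(series), max(series)
    step = (hi - lo) / (m - 1)          # spacing between neighboring centers
    centers = [lo + k * step for k in range(m)]
    result = []
    for value in series:
        mu = triangular_memberships(value, centers, step)
        best = max(range(m), key=mu.__getitem__)   # maximum-membership set
        result.append(centers[best])
    return result


def rmse(actual, approximated):
    squared = sum((a - b) ** 2 for a, b in zip(actual, approximated))
    return math.sqrt(squared / len(actual))


def select_num_sets(series, candidates=range(5, 12)):
    """Return the candidate m with the smallest RMSE, plus all errors."""
    errors = {m: rmse(series, fuzzify_defuzzify(series, m)) for m in candidates}
    return min(errors, key=errors.get), errors


# Hypothetical 12-period series (w = 12), not airport data.
series = [12.0, 15.5, 14.2, 18.9, 21.3, 19.7, 23.4, 22.1, 25.0, 24.3, 27.8, 26.5]
best_m, errors = select_num_sets(series)
```

With this stand-in, the error tends to shrink as m grows, so the sketch illustrates only the selection loop. The IFTS fuzzification in the study is based on historical traffic changes and can favor a smaller number of sets.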
The top ten airports were divided into three regions in accordance with their geographic location: (1) North America; (2) the Middle East and Europe; and (3) Asia. Table 6 presents the optimal number of fuzzy sets for each airport, with these numbers obtained through experimentation. The fuzzy segmentations for airports in the North American region (such as passenger traffic, aircraft takeoffs and landings, and freight volume) were divided into five to seven sets. In the Middle East and Europe, the segmentations for freight volume differed substantially between airports, ranging from six to ten sets. In the Asian region, the segmentations also varied substantially between airports, ranging from six to nine sets. Bold in Table 6 indicates that the fuzzy segmentation has more fuzzy sets.
The results presented in Table 6 reveal that most of the air traffic time series in the IFTS were divided into five to seven fuzzy sets; for some airports, a finer division into 8 to 11 sets was required to minimize the RMSE. In the IFTS model, the universe of discourse was established and fuzzy relationships were determined in accordance with changes in historical data. With the fuzzy classification algorithm, the data were appropriately divided into fuzzy sets to establish membership function values for elements of similar levels in a time series. Corresponding fuzzy membership values were then established. The fuzzy relationship matrix was calculated for the period before the relevant time series, facilitating fuzzification of the fuzzy interpolation time series with fewer errors. The fuzzified data set could achieve better predictions than the original data, as indicated by Table 7. The IFTS fuzzification was thus advantageous.

DXB expanded rapidly, but its rate of expansion declined. The decline in passenger traffic at DXB in 2019 was due to temporary runway closures and the bankruptcy of India's Jet Airways (for which Dubai was a major destination). Moreover, Flydubai, the second-largest airline at DXB after Emirates, was affected by non-delivery of the Boeing 737 MAX. In 2019, PEK continued to be affected by intensifying economic and trade friction between China and the United States, geopolitical conflict, financial market turmoil, and other problems. With the completion and operation of Beijing Daxing Airport (PKX) on 25 September 2019, some flights were transferred from PEK. Capacity at Daxing Airport has declined under the new pattern of "one city, two airports". The freight and mail throughput of Beijing Capital Airport in 2019 reached 1,955,326 tons, 5.7% lower than the previous year.
HKG is another airport affected by Sino-US economic and trade issues. Many vital industries in Hong Kong, such as trade, finance, and tourism, rely on convenient and reliable air travel. The airport's convenient transportation has made Hong Kong an essential gateway between the world and China. From the second half of 2019, the air traffic volume of Hong Kong International Airport was affected by intense political turmoil lasting several months and by geopolitical tensions between China and the United States. Actions related to the Hong Kong protests, such as the blockage of transit facilities and widespread violence between protestors and police, affected Hong Kong's export-oriented economy and international image. Due to the violence and rapidly changing political situation, HKG was closed, and approximately 40 countries issued a travel warning for Hong Kong. The overall freight volume of HKG decreased by 6.1% year-on-year to 4.8 million tons. International tourism in Hong Kong remained low; the decline in passenger traffic to and from mainland China and Southeast Asia also significantly affected the operation of the airport.
The passenger volume of DXB declined due to temporary shocks. PEK had reduced traffic volume due to competition from PKX. HKG was closed due to protests, violence, and political instability, resulting in a decline in its role as a traffic hub. Thus, various nonseasonal factors can lead to a decline in traffic at airports. The use of Tai's IFTS model [30] enabled FSVR to be applied to nonseasonal time series. Historical traffic changes could be used to calculate the fuzzified traffic volume and improve fuzzy historical interpolation. In contrast to the other models, the FSVR model still accurately predicted the air traffic volume of each airport in 2019. A comparison of the actual and predicted volumes for these three airports is presented in Figure 2.

Conclusions
Forecast transportation volumes are the primary basis for the planning of transportation capacity increases by authorities in the transportation industry. Traffic volumes and trends must be accurately forecast as early as possible to predict bottlenecks and plan equipment updates and capacity expansion. Several years' or decades' worth of air freight volume data for airports were collected to form a time series. A reasonable and accurate forecast model was used to predict the future traffic volume of the airports. These predictions can be used for timely reviews of airport capacity to enable early planning of necessary construction projects.
A predictive analysis of the freight volume for the 10 international airports with the highest global airport passenger traffic (ATL, PEK, DXB, LAX, HND, ORD, LHR, HKG, PVG, and CDG) from August 2014 to December 2019 was performed. The input data for the produced FSVR model were air traffic time-series data fuzzified by the IFTS model proposed by Tai [30]. As IFTS can be applied to nonseasonal time series and historical traffic changes were considered when fuzzifying the time series, a fuzzy classification algorithm was used to appropriately classify traffic changes for elements of similar time series levels. A suitable fuzzy set was formed, a membership function was generated, and a fuzzy relationship was established, resulting in fuzzy interpolation with minimal errors. The fuzzy data set with the minimal RMSE was used as the input for SVR for predicting the time-series data. However, fuzzy uncertainty treats noise on different data entries as independent disturbances, which may limit the applicability of FSVR. The air freight volume forecasts made by the FSVR model had the lowest MAE, RMSE, and MAPE of all compared models for all airports. In particular, the MAPE of the FSVR model was <2.5% for all airports, indicating highly accurate forecasts that are superior to those of the other forecasting models. Thus, the FSVR model is the best model for air traffic forecasting among the models analyzed. In addition, the hybrid FSVR model provides a significant improvement over the SVR model.

to optimize CEF model. Step 5: repeat Steps 3 and 4 to find $C = C_l^{(m)}$ until $b^{(m)} - a^{(m)} < \varepsilon$.

Figure 2. Comparison of air traffic trends in the past five years and FSVR forecast in 2019.

Definition 2. For $X(t)$ $(t = 1, 2, \ldots)$ and any $X \in U$, if a real number $f_i(t) \in [0, 1]$ is specified, then $f_i(t)$ is defined as a fuzzy subset. If $F(t)$ is the set of $f_1(t), f_2(t), \ldots, f_i(t)$, then $F(t)$ is called the fuzzy time series of $X(t)$.

Evaluate the difference between the original forecast and the estimated data, $\{X_i\}$, $i = 1, 2, \ldots, n$, using MSE, MAE, MAPE, symmetric MAPE (SMAPE), and MASE. A smaller difference indicates better predictive ability of the model.

Table 1. Statistics on the freight volume of the top ten airports.

Table 2. Seasonality and trend strength of three major categories of air freight traffic at the top ten airports.

Table 3. Parameters related to each forecasting model of freight volume.

Table 4. Experimental results of the freight volume forecast models: MAPE, MAE, and RMSE values.
Bold indicates the lowest value. Underline indicates a value >10.

Table 5. Analysis of SVR autoregressive lag periods for air traffic at Atlanta International Airport.

Table 6. The optimal number of fuzzy sets for the time series of fuzzy air traffic.

Table 7. Fuzzy air freight traffic time series errors.