1. Introduction
With the development of the airline business, ancillary services [1,2,3,4,5,6,7] that satisfy passengers’ personal requirements are becoming increasingly important for airlines. Ancillary revenue already plays a vital role in airline profit and provides a substantial amount of extra revenue. By improving the quality of ancillary services, airlines increase user satisfaction [8,9] and customer loyalty [2,7], which enhances their competitiveness and differentiates them from competitors offering homogeneous products. Due to the worldwide spread of COVID-19, the global airline market has shrunk dramatically [10,11,12,13,14,15,16,17]. Airlines are, thus, urgently seeking extra profit to reduce fiscal pressure, leading to more intense competition based on ancillary services.
Airline ancillary revenue refers to income beyond the ticket fare, earned through directly recommended services or implicit travel experiences. Ancillary services have grown rapidly, driven by the fast-growing airline market (2007∼2018) and the impact of COVID-19 (2019∼2021). Ancillary revenue [18] for the top 10 airlines greatly increased from $2.1 billion to $35.2 billion within 12 years (2007∼2018). The significant growth in the airline business over these years has created a great potential market for ancillary services.
Owing to the impact of the pandemic, the airline market faced a dramatic regression (2019∼2021), compelling airlines to seek revenue beyond flight tickets [12,14]. Establishing ancillary services is therefore significantly important for airlines because of their ability to increase revenue. It also serves as an approach to reducing customer churn and ensuring that resources are adequately utilized.
According to [19], airline ancillary services are divided into five categories: (1) a la carte features, (2) commission-based products, (3) frequent-flier activities, (4) advertising sold by the airline, and (5) a la carte components associated with a fare or product bundle. Among these, a la carte features are the most general services and increase airline revenue dramatically. A la carte features consist of multiple services that improve the travel experience, including onboard sales, extra baggage allowance, onboard Wi-Fi, and seat selection.
Seat selection [2,5,20,21] is one of the most common ancillary services chosen by passengers. This service allows passengers to choose their preferred seats and pay for them willingly. For example, passengers who want to sit in the first row of economy class for more legroom can pay extra to reserve those seats in advance. Many reputable airlines around the world already provide this service and earn revenue from it.
However, it is difficult for airlines to identify which passengers are willing to pay for seat selection, since such passengers make up only a small share of all passengers. If the service is recommended to all customers, not only will advertising resources be wasted, but passengers may also tire of useless advertisements, negatively affecting customer satisfaction. Thus, precisely predicting passengers’ willingness to pay for seat selection is an urgent problem for airlines seeking to save advertising resources and increase their ancillary profits.
In this paper, we apply the air passenger seat selection dataset provided by Neusoft as the research object to analyse the important factors in the willingness to pay for a seat selection service. In particular, we propose a machine-learning-based model named Bagging in Certain Ratio Light Gradient Boosting Machine (BCR-LightGBM) to predict the willingness of passengers to pay for such services. We conduct extensive experiments to demonstrate the effectiveness of BCR-LightGBM and its ability to capture rules between features. Then, we visualize a decision tree in BCR-LightGBM and study two typical samples based on the visualization. To further enhance interpretability, we use a Shapley additive explanations (SHAP) [22] model to analyse feature importance and give our recommendations. Our contributions are summarized as follows:
We study the seat selection service from the perspective of passengers’ willingness to pay, create new features from the original dataset, and propose an ensemble model, named BCR-LightGBM, to predict the willingness of passengers to pay for seat selection.
The experimental results show that BCR-LightGBM outperformed all 12 comparison models in terms of the AUC and F1-score.
We demonstrate the rules learned by BCR-LightGBM by visualizing the decision-making phase of two typical samples and analyse the important factors based on the SHAP model.
3. Methodology
How to predict the willingness to pay for airline seat selection is an urgent problem that needs to be solved. We utilize real air passenger histories provided by Neusoft (described in Section 4.1), since such data are difficult to collect from individual passengers rather than from airlines. First, we construct and select new features to overcome data sparsity and the curse of dimensionality. Then, we propose Bagging in Certain Ratio LightGBM (BCR-LightGBM) to address the issue of class imbalance.
3.1. Feature Construction
To overcome the problem of data sparsity, we construct new features on the basis of the original dataset. There are three types of data in the dataset, i.e., date, numerical, and categorical, and we apply a different transformation to each. For the date and time, e.g., “16 December 2018 20:00”, we construct two features: a season feature derived month-wise from the flight date and a time-period feature derived hour-wise from the departure time.
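The exact month-to-season and hour-to-period mappings are given as equations in the original article; as an illustration only, a plausible bucketing (the boundaries and column name "seg_dep_time" are our assumptions) could be implemented as follows:

```python
import pandas as pd

def season_of(month: int) -> int:
    # Assumed bucketing: 1 = spring (Mar-May), 2 = summer (Jun-Aug),
    # 3 = autumn (Sep-Nov), 4 = winter (Dec-Feb).
    return {3: 1, 4: 1, 5: 1, 6: 2, 7: 2, 8: 2,
            9: 3, 10: 3, 11: 3, 12: 4, 1: 4, 2: 4}[month]

def time_period_of(hour: int) -> int:
    # Assumed bucketing: 1 = morning, 2 = afternoon, 3 = evening, 4 = night.
    if 6 <= hour < 12:
        return 1
    if 12 <= hour < 18:
        return 2
    if 18 <= hour < 24:
        return 3
    return 4

# Example: derive the two features from a departure-time column.
df = pd.DataFrame({"seg_dep_time": ["2018-12-16 20:00", "2018-07-03 09:30"]})
dep = pd.to_datetime(df["seg_dep_time"])
df["seg_dep_season"] = dep.dt.month.map(season_of)
df["seg_dep_period"] = dep.dt.hour.map(time_period_of)
print(df)
```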
For numerical and categorical features, the names of the features are divided into two parts, i.e., characteristic (prefix) and time interval (suffix). The characteristic denotes the history of the passenger or the inherent property of the flight. For instance, “dist_all_cnt” indicates the total mileage of a passenger and “pax_fcny” indicates the fare of the flight ticket. The time interval denotes the time scope of the characteristic, e.g., “dist_all_cnt_m3” represents the total mileage of a specific passenger collected from three months ago to the current time. The time interval includes five scopes, i.e., 3 months, 6 months, 1 year, 2 years, and 3 years. For simplicity, we call the characteristics prefixes and time intervals suffixes in the following.
We observe that the issue of sparsity is severe for both numerical and categorical features. To overcome the sparsity of the numerical features, we directly apply statistics-based transformations to the original numerical features, i.e., maximum, minimum, mean, and variance. On the basis of this transformation, we improve the interpretability of each numerical feature. The newly formed features are named “prefix_max”, “prefix_min”, “prefix_mean”, and “prefix_std” for each prefix.
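As a sketch of this aggregation (assuming a pandas DataFrame whose suffixed columns share the prefixes described above; the toy column values are illustrative):

```python
import pandas as pd

# Toy frame with one characteristic ("dist_all_cnt") over five time-interval suffixes.
df = pd.DataFrame({
    "dist_all_cnt_m3": [120, 0, 40],
    "dist_all_cnt_m6": [260, 0, 90],
    "dist_all_cnt_y1": [510, 30, 150],
    "dist_all_cnt_y2": [900, 60, 150],
    "dist_all_cnt_y3": [1400, 60, 150],
})

suffixes = ("_m3", "_m6", "_y1", "_y2", "_y3")

# Group the suffixed columns by their shared prefix.
cols_by_prefix = {}
for col in df.columns:
    for s in suffixes:
        if col.endswith(s):
            cols_by_prefix.setdefault(col[: -len(s)], []).append(col)

# Aggregate each prefix across its five time intervals.
for prefix, cols in cols_by_prefix.items():
    block = df[cols]
    df[f"{prefix}_max"] = block.max(axis=1)
    df[f"{prefix}_min"] = block.min(axis=1)
    df[f"{prefix}_mean"] = block.mean(axis=1)
    df[f"{prefix}_std"] = block.std(axis=1)

print(df)
```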
For each categorical feature, we define two sub-flags, named secondary indexes, to indicate the relationship between the feature and the target. The secondary indexes are represented as “prefix_T” and “prefix_F”, where “T” denotes passengers who pay for the seat selection service and “F” denotes passengers who do not. The construction rule is as follows (see the sketch after this list for one plausible realization):
According to the target, the values of each categorical feature are divided into two sets, denoted $V_T$ and $V_F$.
If $V_T$ or $V_F$ contains the value 0, delete it from the set.
Construct two new features, “prefix_T” and “prefix_F”, by applying the transformation rule to the values $x$ in $V_T$ and $V_F$, respectively, where $x$ represents the values of a sub-label and “prefix” is the prefix of the original feature.
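The exact transformation rule appears as an equation in the original article and is not reproduced here. The following sketch shows only one plausible realization, in which “prefix_T” and “prefix_F” count how often a sample’s category value occurs among paying and non-paying passengers; the counting rule and the column names (e.g., "emd_lable") are illustrative assumptions:

```python
import pandas as pd

def add_secondary_indexes(df: pd.DataFrame, cat_col: str, target_col: str) -> pd.DataFrame:
    """Add <prefix>_T / <prefix>_F secondary indexes for one categorical column."""
    # Split the feature's values by the target and count their occurrences.
    counts_t = df.loc[df[target_col] == 1, cat_col].value_counts()
    counts_f = df.loc[df[target_col] == 0, cat_col].value_counts()
    # Drop the value 0 from either set, as described in the construction rule.
    counts_t = counts_t.drop(index=0, errors="ignore")
    counts_f = counts_f.drop(index=0, errors="ignore")
    df[f"{cat_col}_T"] = df[cat_col].map(counts_t).fillna(0).astype(int)
    df[f"{cat_col}_F"] = df[cat_col].map(counts_f).fillna(0).astype(int)
    return df

# Toy example: "seg_cabin" is a categorical cabin code, "emd_lable" the binary target.
toy = pd.DataFrame({"seg_cabin": [1, 2, 1, 3, 2, 1],
                    "emd_lable": [1, 0, 1, 0, 0, 1]})
print(add_secondary_indexes(toy, "seg_cabin", "emd_lable"))
```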
3.2. Feature Selection
In addition to sparsity, the dataset suffers from high dimensionality. In this section, we apply Pearson correlation coefficient-based [33] and chi-square test-based [34] feature selection techniques to reduce the dimensionality of the numerical and categorical features, respectively. To select the numerical features, we perform the following process:
Calculate the Pearson correlation coefficient between every two features and sort the coefficients in descending order.
Set a preserve threshold and a delete threshold to indicate the state of the features.
For each feature pair (a, b), if the correlation coefficient is less than the preserve threshold, we set a to the preserve state; if the correlation coefficient is greater than the delete threshold, we set a to the delete state.
If feature a is in the delete state, delete it unless b is also in the delete state.
If feature a is in the preserve state, delete feature b unless b is also in the preserve state.
If features a and b are in neither the delete state nor the preserve state, delete the feature with the smaller variance.
For each categorical feature, we validate its independence from the target through a chi-square test; if the feature is independent of the target, we directly delete it. A sketch of both selection steps is given below.
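A simplified sketch of the two selection steps, assuming a pandas DataFrame `X_num` of numerical features, a DataFrame `X_cat` of categorical features, and a binary target `y` (the threshold values are illustrative, and the pairwise bookkeeping is condensed relative to the rules above):

```python
import pandas as pd
from scipy.stats import chi2_contingency

def select_numerical(X_num: pd.DataFrame, preserve_thr=0.3, delete_thr=0.9):
    """Drop one feature of each strongly correlated pair (Pearson-based)."""
    corr = X_num.corr().abs()
    to_drop = set()
    cols = list(X_num.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if a in to_drop or b in to_drop:
                continue
            c = corr.loc[a, b]
            if c < preserve_thr:
                continue  # weakly correlated pair: preserve both
            # Redundant or intermediate correlation: keep the larger-variance feature.
            to_drop.add(a if X_num[a].var() < X_num[b].var() else b)
    return X_num.drop(columns=sorted(to_drop))

def select_categorical(X_cat: pd.DataFrame, y: pd.Series, alpha=0.05):
    """Keep only categorical features that are dependent on the target."""
    kept = []
    for col in X_cat.columns:
        table = pd.crosstab(X_cat[col], y)
        _, p_value, _, _ = chi2_contingency(table)
        if p_value < alpha:  # reject independence -> keep the feature
            kept.append(col)
    return X_cat[kept]
```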
3.3. BCR-LightGBM
Predicting whether air passengers are willing to pay for seat selection is a binary classification task. In this section, we illustrate the structure of Bagging in Certain Ratio LightGBM (BCR-LightGBM). To ensure robustness, multiple LightGBMs are assembled through a bagging method [35]. Bagging ensembles multiple models, each trained on a subset extracted from the original dataset through bootstrap sampling, and the final result is obtained by averaging or voting. By utilizing bagging, the effectiveness and stability are improved and the variance of the model is lowered. However, bootstrap sampling does not change the data distribution; thus, it cannot overcome the data imbalance in the original dataset.
To mitigate the imbalance of the original dataset, we only sample the negative samples in the sampling phase and then combine them with the positive samples to create the subset. Note that the ratio between positive and negative needs to be pre-assigned since different ratios lead to different results. Then, each LightGBM is trained through the subsets sampled, and the prediction is the average of their results. The training process of BCR-LightGBM is shown in Algorithm 1.
Algorithm 1: The training process of BCR-LightGBM.
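A minimal Python sketch of this training procedure (not the authors’ exact Algorithm 1; the ratio, number of base learners, helper names, and the assumption that X and y are NumPy arrays are ours) is shown below. Each base LightGBM is trained on all positive samples plus a randomly drawn set of negative samples at the pre-assigned negative-to-positive ratio, and the predictions are averaged:

```python
import numpy as np
import lightgbm as lgb

def train_bcr_lightgbm(X, y, n_models=100, neg_pos_ratio=3, seed=0, **lgb_params):
    """Train a bag of LightGBMs on ratio-controlled subsets (a sketch)."""
    rng = np.random.default_rng(seed)
    pos_idx = np.where(y == 1)[0]
    neg_idx = np.where(y == 0)[0]
    n_neg = min(len(neg_idx), neg_pos_ratio * len(pos_idx))
    models = []
    for _ in range(n_models):
        # Keep all positives; under-sample negatives to the pre-assigned ratio.
        sampled_neg = rng.choice(neg_idx, size=n_neg, replace=False)
        idx = np.concatenate([pos_idx, sampled_neg])
        model = lgb.LGBMClassifier(**lgb_params)
        model.fit(X[idx], y[idx])
        models.append(model)
    return models

def predict_bcr_lightgbm(models, X):
    """Average the positive-class probabilities of all base learners."""
    return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
```

At inference time, a threshold on the averaged probability yields the binary prediction.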
4. Experimental Results
In this section, we compare the proposed BCR-LightGBM against various machine-learning algorithms and sampling-based methods. Then, we illustrate the decision-making procedure of a decision tree in BCR-LightGBM on two real samples to demonstrate the rules learned by the model. Furthermore, we analyse feature importance through a SHAP model to improve interpretability. All experiments are conducted on a 64-bit Ubuntu 16.04 operating system with the following environment: CPU: Intel(R) Xeon(R) Silver 4114 @ 2.20 GHz, memory: 64 GB RAM, and graphics: GeForce GTX 1080 Ti.
4.1. Data Description
This paper uses the dataset of air passenger willingness to pay for seat selection provided by Neusoft (http://fwwb.org.cn/attached/file/20201211/20201211132638_47.zip, accessed on 29 November 2021), consisting of flight information, passenger history, and customer characteristics, as shown in Table 1. The dataset comprises 23,432 samples with a feature dimension of 657, which increases the risk of the curse of dimensionality [36]. Note that the dataset contains features with the same prefix.
36]. Note that the dataset contains features with the same prefix.
For example, there are five features prefixed with “cabin_f_cnt”, i.e., “cabin_f_cnt_m3”, “cabin_f_cnt_m6”, “cabin_f_cnt_y1”, “cabin_f_cnt_y2”, and “cabin_f_cnt_y3”, which represent the number of first-class flights taken by the passenger in the last x months/years, where the suffixes m3, m6, y1, y2, and y3 represent 3 months, 6 months, 1 year, 2 years, and 3 years, respectively. Moreover, the dataset is especially sparse, with nearly 70% of the values being zero.
Positive samples in the dataset indicate people who paid for seat selection, and negative samples represent people who did not. Note that the dataset is extremely imbalanced [37,38], with negative samples greatly outnumbering positive ones. This situation is common for ancillary services, since the majority of people do not choose extra services even though ancillary profits dramatically aid airlines.
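As a sketch of how the sparsity and imbalance described above can be checked (the file name and target column "emd_lable" are assumptions; the archive linked above contains the raw data):

```python
import pandas as pd

# Hypothetical path and column names for illustration.
df = pd.read_csv("air_passenger_seat_selection.csv")
target = "emd_lable"

# Fraction of zero entries among the numerical feature values (sparsity).
features = df.drop(columns=[target]).select_dtypes(include="number")
zero_ratio = (features == 0).to_numpy().mean()
print(f"zero-valued entries: {zero_ratio:.1%}")

# Class imbalance between positive (paid) and negative samples.
print(df[target].value_counts(normalize=True))
```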
4.2. Metrics
In this section, we introduce the five metrics, i.e., Accuracy, Precision, Recall, F1-score, and ROC-AUC, used to evaluate the performance of the model. Accuracy is the simplest metric, defined as the number of correct predictions divided by the total number of predictions, indicating the proportion of correct predictions. Precision and Recall are two mutually influencing indicators: Precision indicates the correctness of the positive predictions, and Recall indicates how well the model identifies users who are willing to pay for seats. However, there are many cases in which these metrics alone are not sufficient to reflect the model performance.
One such scenario is when the class distribution is imbalanced, as in our experiment. In this case, even a model that predicts all samples as the most frequent class obtains a high accuracy rate, although it learns nothing and simply predicts every sample as the majority class. For the dataset used in the experiment, where the negative class accounts for the vast majority of samples, a model that predicts all instances as negative would still achieve a deceptively high accuracy.
To cover the shortcomings of these metrics and better reflect model performance, we further introduce the F1-score and ROC-AUC (area under the receiver operating characteristic curve). The F1-score combines Precision and Recall into a single metric for cases in which both are important. The indicator is the harmonic mean of Precision and Recall and always achieves a trade-off between them; it is generally applied to indicate the overall performance of a model when the dataset is imbalanced.
ROC-AUC is the area under the ROC (receiver operating characteristic) curve, which shows the performance of a binary classifier. Specifically, ROC-AUC is an aggregated measure of the performance of a binary classifier over all possible threshold values; thus, the indicator is not sensitive to the threshold. When the ratio between positive and negative samples changes, the ROC-AUC value does not change dramatically.
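For reference, the standard definitions of these metrics in terms of the confusion-matrix counts are:

$\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$, $\quad \text{Precision} = \dfrac{TP}{TP + FP}$, $\quad \text{Recall} = \dfrac{TP}{TP + FN}$, $\quad \text{F1-score} = \dfrac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$,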
where TP is the number of instances correctly classified as positive, TN is the number of instances correctly classified as negative, FP is the number of instances incorrectly classified as positive, and FN is the number of instances incorrectly classified as negative.
4.3. Comparison Models
In this section, we introduce various comparison models used in the experiments, including machine-learning methods and sampling-based methods.
LR (Logistic Regression) is a simple linear model that can be easily interpreted, where the performance greatly relies on feature engineering.
KNN (K-Nearest Neighbours) is a learning-free model that classifies a sample based on the k-nearest samples in the feature space.
SVM (Support Vector Machine) is not sensitive to outliers due to the inherent properties of support vectors. However, the kernel function must be carefully designed to fit the input space.
AdaBoost (Adaptive Boosting) is a boosting method that dynamically adjusts the weights of its base learners to improve robustness.
GBC (Gradient Boosting) is a boosting method in which the objective is to find the optimal solution in the parameter space by fitting the residual error of a previous learner.
RF (Random Forest) adopts bagging to improve robustness, using decision trees as base learners; it has been widely used in various fields.
XGBoost (eXtreme Gradient Boosting) [39] is an extension of GBC that achieves better performance and scalability.
LightGBM (Light Gradient Boosting Machine) [31] is an extension of GBC. Compared with XGBoost, LightGBM is faster and lighter. Note that LightGBM is the base learner in BCR-LightGBM.
RUS (Random Under Sampling) randomly samples negative instances until the number is the same as that of positive instances.
ROS (Random Over Sampling) randomly resamples positive instances until their number is the same as that of negative instances.
SMOTE (Synthetic Minority Over-sampling Technique) [40] is an over-sampling method that creates synthetic instances for the minority class based on its nearest neighbours.
SMOTE-ENN (Synthetic Minority Over-Sampling and Edited Nearest Neighbours) [41] is the combination of SMOTE and ENN, which applies ENN to clean the samples created by SMOTE.
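The sampling-based baselines can be reproduced with the imbalanced-learn package; a brief sketch on synthetic imbalanced data (the data generation and variable names are illustrative, not the paper's setup) follows:

```python
from imblearn.over_sampling import RandomOverSampler, SMOTE
from imblearn.under_sampling import RandomUnderSampler
from imblearn.combine import SMOTEENN
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, roc_auc_score

# Synthetic imbalanced data as a stand-in for the real dataset.
X, y = make_classification(n_samples=5000, weights=[0.95], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

samplers = {
    "RUS": RandomUnderSampler(random_state=0),
    "ROS": RandomOverSampler(random_state=0),
    "SMOTE": SMOTE(random_state=0),
    "SMOTE-ENN": SMOTEENN(random_state=0),
}

for name, sampler in samplers.items():
    X_res, y_res = sampler.fit_resample(X_train, y_train)   # resample training set only
    clf = LGBMClassifier(random_state=0).fit(X_res, y_res)
    prob = clf.predict_proba(X_test)[:, 1]
    pred = (prob >= 0.5).astype(int)
    print(name, f1_score(y_test, pred), roc_auc_score(y_test, prob))
```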
4.4. Comparative Analysis
To demonstrate the superiority of BCR-LightGBM, we compare the performances of the model against existing machine-learning methods and sampling methods in this section. For the proposed BCR-LightGBM, we set the ratio between positive samples and negative samples to 1:3.
Table 2 shows the performance comparison between the proposed BCR-LightGBM and the machine-learning models without sampling. Note that BCR-LightGBM outperforms all methods in terms of the F1-score and AUC, which are two widely used indicators when the dataset is imbalanced. Furthermore, BCR-LightGBM achieves the narrowest gap between Precision and Recall. For the other compared methods, the Precision is much higher than the Recall, indicating that these models identify only a small set of passengers willing to pay for seat selection in order to keep the error rate low.
In other words, these models can only identify the people who are most likely to pay for seat selection. However, such a restriction is unnecessary for airlines, since the cost of advertising is low enough that the service can be promoted to a relatively large set of people. BCR-LightGBM achieves a desirable Recall and an acceptable Precision, satisfying the requirements of airlines.
Table 3 shows the comparative results between BCR-LightGBM and the sampling-based methods. The proposed model achieves the best score in terms of Accuracy, Precision, F1-score, and AUC. Note that the sampling-based methods are generally better than the machine-learning models without sampling, which is reflected in the narrower gap between Precision and Recall. Furthermore, the performance of the under-sampling-based methods (RUS and SMOTE-ENN) is worse than that of the over-sampling-based methods (ROS and SMOTE), because under-sampling drops a large number of instances from the original dataset, preventing the model from learning the characteristics of the discarded samples and increasing the possibility of under-fitting.
The over-sampling-based methods create new samples to mitigate the issue of imbalance, improving the performance even though noise is introduced. Although BCR-LightGBM applies an under-sampling method, its performance is better than that of over-sampling-based models since the ensemble strategy is utilized.
To further illustrate the performance of BCR-LightGBM, we plot the ROC (receiver operating characteristic) curves of the machine-learning methods and the sampling-based methods, as shown in Figure 1. In Figure 1a, we compare the proposed model against various machine-learning methods, and in Figure 1b, we compare BCR-LightGBM with LightGBM under different sampling strategies. The ROC curve demonstrates the trade-off between the true positive rate and the false positive rate. We note that the curve of BCR-LightGBM envelops all other curves, indicating that the proposed model surpasses all other methods.
The superiority of BCR-LightGBM derives from its ability to correctly learn the relationships between important factors, relationships that the other models cannot learn. We analyse this from the perspective of model capability. A simple linear model, i.e., LR, cannot perform feature crossing, which limits its capability to learn relationships between features. KNN classifies samples through the k-nearest neighbours in the original feature space, so correlations between features cannot be identified.
Although SVM is not sensitive to outliers, it also cannot perform feature crossing, and its kernel function must be carefully designed. AdaBoost dynamically changes the weights of its base learners; however, the weights are sensitive to the data distribution. GBC finds the optimal solution through gradient descent, which is dramatically influenced by imbalance in the dataset. Although RF, XGBoost, and LightGBM achieve great performance in various fields, they are still weak when dealing with data imbalance.
In summary, these models either cannot handle or struggle with data imbalance, as the abundant negative samples dominate training, which prevents them from correctly finding the relationships between important factors. In contrast, the proposed BCR-LightGBM sets the ratio between positive and negative samples to a certain value, mitigating the dominance of negative samples and achieving the best performance.
For the sampling-based methods, RUS causes information loss by dropping existing samples, and ROS magnifies the impact of outliers among the positive samples. SMOTE creates new samples based on the samples in the dataset but introduces noise. Although SMOTE-ENN leverages ENN to clean the samples generated by SMOTE, the data distribution may be further distorted due to the lack of prior knowledge. To reduce the impact of noise and avoid information loss, BCR-LightGBM applies random under-sampling to avoid extra noise and uses an ensemble approach to learn all of the information in the dataset, thereby improving robustness.
4.5. Hyperparameter Analysis
To further analyse the performance of BCR-LightGBM, we conducted experiments to analyse two important hyperparameters, i.e., the ratio between negative samples and positive samples, and the number of LightGBMs in the model.
Figure 2 shows the performance of BCR-LightGBM under different ratios of negative to positive samples; a ratio of 1 means that the number of negative samples equals the number of positive samples. For a fair comparison, the number of LightGBMs is set to 100. Note that if the ratio is too large, the model does not learn the correct relationships between features, causing serious model degradation.
We observe that when the ratio of negative to positive samples is 3:1, the model achieves the best performance in terms of the F1-score and AUC. As the ratio increases further, the F1-score and AUC decrease dramatically and the gap between Precision and Recall widens. We do not use Accuracy to judge the model here because, when the data distribution is imbalanced, this indicator remains high even if all samples are predicted to be negative.
Figure 3 shows the impact of the number of LightGBMs on the model performance. For simplicity, we select ROC-AUC from the five metrics as the indicator. To fully demonstrate the impact of the number of base classifiers, i.e., LightGBMs, on BCR-LightGBM, we conduct experiments under several ratios of negative to positive samples.
From the figure, we observe that increasing the number of LightGBMs can greatly improve the performance of the model, and the model obtains a desirable performance gain even when the number of LightGBMs is relatively small. Moreover, as the number increases, the model shows strong robustness, converging rapidly despite slight fluctuations. Note that the performance of BCR-LightGBM generally converges when the number of LightGBMs is below 50, demonstrating the robustness and stability of the model.
4.6. Importance Analysis of Influencing Factors
In addition to comparing the performances between BCR-LightGBM and existing models, we also attempt to explain the rules learned by the model.
Figure 4 illustrates one decision tree in the proposed model. For simplicity, we only visualize the top four layers; the number of layers of the tree is set to seven to improve the model’s capability. We note that the flight cabin (‘seg_cabin’) is located at the top of the tree, indicating that the cabin is the most discriminative factor for seat selection in terms of the Gini index. Moreover, we find that other flight information, e.g., the tax of the ticket (pax_tax) and the month of travel (seg_dep_month), also influences the willingness to pay.
In addition, the flight history, which reflects the passenger’s customer value, also contributes to the willingness to pay, e.g., the number of paid seat selections (select_seat_cnt_max), the number of window seats (select_window_cnt_var), the number of international tickets (tkt_i_amt_max), the points added from airline mile accumulation (pit_add_air_cnt_y1), the number of economy-class flights (cabin_y_cnt_max), the number of first-class flights (cabin_f_cnt_max), and the member level (member_level).
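Such a tree can be rendered directly from a trained LightGBM model with its built-in plotting utility (a sketch; it assumes Graphviz is installed and that `model` is one fitted LGBMClassifier from the training sketch above):

```python
import lightgbm as lgb
import matplotlib.pyplot as plt

# Plot the first tree of one trained base learner, annotated with split gains.
ax = lgb.plot_tree(model, tree_index=0, figsize=(20, 10), show_info=["split_gain"])
plt.show()
```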
To illustrate the rules learned by the model, we selected two typical samples from the dataset to simulate the decision-making of the model, as shown in Table 4. The positive sample is a person who pays to select a seat, and the negative sample is one who does not. The features in the table match the corresponding split points in Figure 4. For the positive sample, we observe that the passenger is a frequent flyer, since the values of the corresponding factors are high, e.g., the number of economy-class flights (cabin_y_cnt_var), the number of first-class flights (cabin_f_cnt_max), and the total amount of international flight mileage accumulated (tkt_i_amt_max, tkt_i_amt_min).
The passenger always pays to select a seat (select_seat_cnt_max) and prefers to sit by the window (seat_window_cnt_var). Thus, the positive sample has a high customer value. We assume that the passenger generally travels on business, given the flight frequency and the willingness to pay for seat selection. Intuitively, a business traveller is generally willing to pay for seat selection to acquire seats that allow better rest. The model learns this pattern following the orange arrows in Figure 4.
For the negative sample, we observe that the passenger rarely flies, as reflected in the low numbers of economy-class flights (cabin_y_cnt_var) and first-class flights (cabin_f_cnt_max), the slow accumulation of points (pit_add_air_cnt_y1, pit_income_avg_amt_var), and the lack of paid seat selections (select_seat_cnt_max). Intuitively, such passengers are not willing to pay for seat selection. The model learns this pattern following the blue arrows in Figure 4.
We further utilize a SHAP [22] model to enhance the interpretability of BCR-LightGBM. SHAP [22] explains the predictions of ensemble models using the contribution-allocation method from cooperative game theory. The model considers the contribution of each feature to the prediction and calculates feature importance on that basis. Compared with the feature importance derived from LightGBM, which indicates the number of times a feature is used to create a split point, the SHAP model explains the influence of each feature on the prediction for each sample and indicates whether that effect is positive or negative. To obtain the influence of each feature, SHAP calculates the Shapley value [42] of each feature.
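SHAP values for a tree ensemble can be computed with the shap package’s TreeExplainer; a brief sketch (it assumes `model` is one fitted LightGBM base learner and `X_test` a feature matrix, as in the earlier sketches):

```python
import shap

# TreeExplainer computes Shapley values efficiently for tree-based models.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global feature-importance view, similar in spirit to Figure 5.
shap.summary_plot(shap_values, X_test)
```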
Figure 5 shows the feature importance based on the Shapley values. Note that the aircraft cabin (seg_cabin), ticket tax (pax_tax), ticket fare (pax_fcny), the gap between the current and most recent travel dates (recent_gap_day), and the total international flight mileage (dist_i_cnt_max) have the greatest impact on the willingness of passengers to pay for seat selection. According to the SHAP values, flight information has a great influence on the prediction, since several of the important features presented in Figure 5 describe the flight itself, i.e., the aircraft cabin (seg_cabin), ticket fare (pax_fcny), ticket tax (pax_tax), and the month of the flight (seg_dep_month).
We conclude that passengers who pay for better aircraft cabins with higher ticket fares and taxes are more likely to choose the extra seat selection service and that those who travel in fall or winter are more likely to pay for these services. Furthermore, passenger history, which denotes the customer value, also greatly influences the prediction result, i.e., the gap from the most recent flight (recent_gap_day), total mileage and international mileage (dist_all_cnt_mean, dist_i_cnt_max), and average ticket fare and international ticket fare (tkt_avg_amt_max, tkt_i_amt_max). In general, the higher the total mileage and international mileage, the higher the average and international ticket fare, and the more frequently a passenger travels, the more likely the passenger is to pay for a seat selection service.
According to the sample analysis based on a visualization of the decision tree and SHAP-based feature importance analysis mentioned above, we conclude the following:
- (1)
Passengers who have a high customer value, reflected in the total fares of their tickets, the total mileage accumulated from flights, and the frequency of their flights, are likely to pay for seat selection. Thus, airlines can recommend seat selection services to them, since they may pay more attention to their comfort on the plane.
- (2)
Passengers who choose flights with higher ticket fares are more willing to pay for seat selection. The fare of the ticket also indicates their customer value when the customer history is difficult to acquire. Airlines can thus identify a passenger’s willingness to pay for seat selection based on information from a single flight and recommend seat selection services accordingly.
- (3)
Passengers who take international flights are more likely to pay for seat selection. We assume that this is because passengers want a more comfortable experience on long-haul flights. Therefore, airlines can recommend these services to passengers on long-haul flights.
5. Conclusions
Ancillary service revenue has become important for airlines in recent years. Under the impact of COVID-19, precisely providing personalized ancillary services to passengers to increase revenue and mitigate capital shortages is a problem that urgently needs to be solved. In this paper, we analysed the seat selection service from the perspective of predicting passengers’ willingness to pay for it and proposed a machine-learning-based model, named BCR-LightGBM, to identify passengers who are willing to pay for seat selection as the basis of recommendation.
We first preprocessed the original dataset to overcome the data sparsity and the curse of dimensionality inherent in the dataset. Then, a bagging method was applied in which positive and negative samples were combined at a specific ratio to form multiple subsets, addressing the problem of data imbalance. The experimental results demonstrated that the proposed model achieved 0.28 and 0.77 in terms of the F1-score and AUC, respectively, outperforming all compared machine-learning models and sampling-based methods.
Finally, we analysed two typical samples based on the visualization of a decision tree in BCR-LightGBM and applied a SHAP model to further enhance the interpretability by analysing feature importance. We note that customer value, ticket fare, and flight length had positive influences on the willingness to pay for seat selection. Based on this rule, airlines can recommend seat selection services to the corresponding passengers to increase their revenue.
A limitation of this research is that the number of samples is relatively small and does not cover all seat selection situations around the world. Thus, our conclusions may only be appropriate in cases similar to those contained in the dataset. In future research, we will collect more samples from different airlines to make the conclusions more convincing. We will further study the intrinsic properties of these important factors and mine knowledge from the dataset to guide the recommendation policies of airlines to increase revenue from other ancillary services, e.g., priority boarding, checked baggage, and onboard Wi-Fi.