Article

Efficiency Analysis and Classification of an Airline’s Email Campaigns Using DEA and Decision Trees

Department of Industrial Engineering, Istanbul Technical University, Macka, Istanbul 34367, Türkiye
* Author to whom correspondence should be addressed.
Information 2025, 16(11), 969; https://doi.org/10.3390/info16110969
Submission received: 12 July 2025 / Revised: 7 October 2025 / Accepted: 22 October 2025 / Published: 10 November 2025

Abstract

Campaigns significantly impact overall company performance, making the measurement and prediction of campaign efficiency essential. This study proposes an integrated methodology that combines efficiency measurement with efficiency prediction for airline email campaigns. In the first part of the methodology, Data Envelopment Analysis (DEA) was applied to real airline campaign data to evaluate efficiency; this is the first study to analyze email campaign efficiency in this context. In the second part of the methodology, decision tree algorithms were employed to classify historical campaign data based on DEA scores, with the aim of predicting the efficiency of future campaigns—a novel approach in this context. A core dataset of 76 airline email campaigns with six inputs and two outputs was analyzed using output-oriented CCR (Charnes, Cooper, Rhodes) and BCC (Banker, Charnes, Cooper) models; 26 and 46 campaigns were identified as efficient, respectively. The analysis was further segmented by group size, seasonality, and route type. Efficient campaigns were then ranked via super-efficiency, and sensitivity analysis assessed variable and campaign effects. For prediction, decision tree algorithms (J48 (C4.5), C5.0, and CART (Classification and Regression Trees)) were employed to classify campaigns as efficient or inefficient, using DEA efficiency scores as the target variable and DEA inputs as attributes, with classification performed for both BCC and CCR core models. Class imbalance was addressed with SMOTE, and models were evaluated under stratified 10-fold cross-validation. After balancing, the BCC core model (BCC_C) yielded the most reliable predictions (overall accuracy 76.3%), with J48 providing the most balanced results, whereas the CCR core model (CCR_C) remained weak across algorithms.

1. Introduction

In today’s competitive and globalized economic environment, the efficient use of resources has become a fundamental strategic element for businesses seeking to increase profitability, gain a competitive advantage, and ensure long-term sustainability. To remain profitable, companies must prioritize both production efficiency and the performance of their marketing activities [1]. Among these efforts, marketing communication stands out as a key strategic tool for raising awareness and persuading target audiences. However, evaluating and measuring the efficiency of such tools remains one of the most critical challenges for decision-makers. Whether the allocated budget is being spent efficiently and the communication strategy is reaching the intended audience are questions that require clear answers [2,3]. In recent years, market-oriented companies have increasingly adopted direct marketing approaches that focus on specific customer groups rather than mass marketing. Advances in database technologies, the spread of the Internet and e-commerce, and the personalization of websites have significantly contributed to this trend, alongside the strengthening of Customer Relationship Management (CRM) practices [4].
Online marketing is now an integral part of marketing strategies carried out through digital platforms and includes methods such as email marketing, content marketing, and social media. Email marketing, in particular, is an efficient digital marketing tool due to its low cost, broad reach, and data-driven targeting capabilities. It serves various marketing goals, including increasing website traffic and supporting sales [5]. According to the Litmus 2023 State of Email report, email marketing offers a 36:1 ROI and continues to grow in importance as market needs evolve [6]. Email campaigns are widely used across industries such as finance, travel, tourism, retail, and e-commerce. They are frequently employed to strengthen customer loyalty and to leverage cross-selling and upselling opportunities. However, measuring the efficiency of email campaigns remains a major challenge for companies [7,8]. In this context, knowing the efficiency of campaigns provides a benchmark for comparing the results of specific campaigns, and ensures that the evaluation process is designed in a practical way. Therefore, accurately measuring campaign efficiency is of great importance. In addition, predicting the efficiency of a campaign before its implementation is critically important for designing efficient campaigns. This paper proposes a methodology to measure and predict the efficiency of email campaigns.
There are two main approaches to efficiency evaluation and comparison in the literature: non-parametric and parametric. Data Envelopment Analysis (DEA), a non-parametric approach, and Stochastic Frontier Analysis (SFA), a parametric approach, are the most commonly used methods [9].
DEA is an appropriate method for evaluating campaign efficiency, as it provides a single efficiency score (ES) in settings with multiple inputs and outputs. It has been applied in earlier campaign studies across different areas, including marketing, political, and health-related campaigns, with particular applications in marketing and digital advertising. However, the literature to date has not applied DEA to airline email campaigns. This study contributes by employing DEA in this context to calculate efficiency scores, where the inputs are campaign-related, market-related, and customer-related variables, and the outputs are engagement (click-to-open rate (CTOR)) and sales (tickets sold).
In order to design efficient campaigns, it is necessary to predict their efficiency in advance, allowing adjustments to content components (e.g., booking period) according to the predicted efficiency. From a traditional perspective, the relative efficiency of a campaign can only be determined after the campaign has been implemented, the results have been obtained, and the DEA model has been run with previous campaigns. In this approach, no changes can be made to the design of the campaign. A prediction model is required to estimate the efficiency of a campaign before it is implemented. Machine learning (ML) can contribute to this, and decision tree classification algorithms in particular are advantageous because of their high interpretability. A review of the literature reveals widespread DEA-ML applications in various fields. However, most of these studies attempt to predict the efficiency of the decision-making unit from DEA applications when both inputs and outputs are known. One study in the literature, similar to this study, implemented a prediction model for a new decision-making unit where only the inputs are known. A review of the literature indicates that the DEA-ML approach, which has been applied in various fields, can likewise be extended to airline email marketing campaigns. Therefore, the results of the DEA model based on the dataset of implemented campaigns can be used to determine the efficiency before implementing a new campaign. The general methodological framework of the study is summarized in Figure 1.
The remainder of this paper is structured as follows. Section 2 provides a literature review, first focusing on DEA applications in campaign efficiency and then on studies that combine DEA with ML. Section 3 presents the methodological framework, including the DEA models for efficiency measurement and the decision tree classification algorithms used for efficiency prediction. Section 4 reports the empirical results, including DEA efficiency scores, statistical tests, sensitivity analyses, and the results of the decision tree classification models, as well as a comparison with related DEA–ML studies in the literature. Finally, Section 5 concludes the paper by highlighting the study’s contributions to the literature, limitations, and directions for future research.

2. Literature Review

2.1. DEA Studies on Campaign Efficiency

DEA stands out for its ability to determine a single efficiency score in systems with multiple inputs and outputs, making it suitable for measuring campaign efficiency. It has been applied to evaluate the efficiency of various types of campaigns, which can broadly be categorized into profit-oriented and non-profit-oriented contexts. Non-profit-oriented applications include political election campaigns and public health vaccination programs. For example, DEA has been used to measure how parliamentary candidates in France converted campaign resources into votes [10]. Another study developed a DEA model under head-to-head competition to evaluate the efficiency of U.S. congressional campaigns [11]. In the health domain, DEA has been applied to assess the efficiency of COVID-19 vaccination campaigns across German federal states, using vaccine doses as inputs and administered vaccinations as outputs [12]. While these examples demonstrate DEA’s versatility in non-profit settings, they fall outside the scope of our literature review. In the broader literature, DEA has also been applied in several other studies beyond these examples, which underscores its adaptability. From this perspective, applying DEA to assess the efficiency of airline email campaigns in our study represents an extension of this approach to a new digital marketing domain. Accordingly, the literature review focuses on the use of DEA in evaluating marketing campaign efficiency and, more specifically, digital marketing campaign efficiency. DEA studies that compare the relative efficiency of different advertising media (e.g., television, radio, internet, print) were not included within the scope of this study.
DEA has been applied across a wide range of profit-oriented marketing campaign contexts, demonstrating its flexibility in evaluating campaign efficiency. In this study, each campaign is treated as a decision-making unit (DMU). Table 1 presents information on studies in the literature that use DEA to evaluate the efficiency of marketing campaigns (digital and non-digital). In the digital domain, a study assessed 37 banner ads using content-related inputs such as color, animation, and message length, and outputs including click-through rate (CTR), recall, and attitude toward the ad. This study is conceptually related to ours through the use of engagement indicators, with CTR being analogous to our CTOR, although our framework extends beyond design features to incorporate market-related and customer-related inputs and sales-oriented outputs such as tickets sold [13]. Similarly, another study evaluated search advertising for 200 online retailers using an output-oriented BCC model. Their inputs reflected campaign content and cost characteristics, while outputs included both sales and CTR, closely aligned with our tickets sold and CTOR. In contrast to their cross-firm perspective, our analysis focuses on multiple campaigns of a single firm, broadening the model by integrating additional market-level and customer-level inputs [9].
Social media advertising has also been an important context for DEA applications. A study examined 60 Facebook campaigns for a franchised hotel using input-oriented CCR and BCC models, treating each campaign as a DMU—an approach consistent with our treatment of individual email campaigns [14]. Their inputs captured design-related characteristics, while outputs included post clicks and reach, which are conceptually related to our CTOR. Unlike this engagement-only perspective, our study combines both engagement and sales outputs and incorporates market-level (market share, market size) and customer-level (number of flights taken) variables as inputs. Previous work analyzed 45 Facebook ads using DEA, employing cost and ad duration (in days) as inputs—conceptually parallel to our booking and travel periods—and engagement measures such as reach, impressions, and clicks as outputs [15]. In comparison, our study expands the efficiency framework by combining engagement and ticket sales and accounting for market- and customer-related variables. A study on the retail sector evaluated the efficiency of 43 U.S. furniture retailers using a constant returns to scale (CRS) DEA model, also known as the CCR model, with benevolent cross-efficiency ranking. Alongside a business-only model with inputs such as the number of employees and total assets, and sales as the single output, the study also estimated a social-media-augmented model that added tweet count as an input and engagement metrics such as likes, followers, and friends as outputs [16]. This illustrates how DEA can be adapted to integrate both conventional business measures and digital engagement indicators, a perspective that resonates with our combination of engagement and sales outputs.
Beyond digital media, DEA has also been applied to traditional advertising contexts. A study evaluated 23 outdoor billboard campaigns, using content-based attributes such as the number of concepts and graphics as inputs, and outputs including consumer recall and expert-rated ad quality [2]. This non-digital application parallels our study in that it does not primarily rely on campaign costs but instead emphasizes non-financial drivers of efficiency. Similarly, another study examined 14 real-estate print advertisements with an output-oriented variable returns to scale (VRS) DEA model, also known as the BCC model, treating each advertisement as a DMU and considering consumer responses such as joy, engagement, and positive attention [17]. This demonstrates the broader scope of DEA, which extends from traditional print contexts to digital settings such as email campaigns. In a similar vein, DEA was applied to evaluate the advertising efficiency of 15 Iranian food brands. Using budget and campaign duration (campaign length in days) as inputs, and sales, brand familiarity, and implementation attractiveness as outputs, the study, although not explicitly stated, can be interpreted as focusing on non-digital advertising efficiency [18]. This framework is consistent with our study, where campaign duration is likewise included as an input, while sales are captured through the number of tickets sold as a business-related output.
When the studies in Table 1 are considered collectively, the following observations can be made. Five of the campaigns whose efficiency was examined using DEA involved digital platforms, while three were conducted through non-digital platforms. Among the digital campaigns, two were implemented on Facebook, one on Twitter (currently X), one on search engines, and one on a website. The studies can be classified into two groups based on the types of inputs used to calculate campaign efficiency. The first group uses campaign characteristics (e.g., message features such as word count, color, etc.) as inputs, while the second group focuses on campaign attributes and/or campaign cost. Similarly, the studies can also be divided into two groups according to the types of outputs considered. One group uses revenue or ticket sales as outputs to measure campaign impact, whereas the other relies on engagement metrics such as clicks and reactions, which capture customer interaction with the campaign prior to purchase rather than sales outputs. Two studies analyze different campaigns of the same company, while the other studies analyze campaigns of different companies. All studies conduct an efficiency analysis of campaigns or brands using conventional DEA.
Overall, these studies underscore the versatility of DEA in marketing studies, demonstrating its adaptability across digital and non-digital contexts, varied input–output structures, and units of analysis. Building on this literature, our study applies conventional DEA to airline email campaigns from a single company, integrating inputs related to campaigns, markets, and customers, and evaluating efficiency through both engagement (CTOR) and sales (tickets sold).

2.2. DEA–Machine Learning Studies

Since the second part of the methodological framework relates to combining DEA and ML, a literature survey of studies that integrate DEA with ML was conducted. A summary of the review, focusing on studies that combine DEA with decision tree classification algorithms, is provided in Table A1. The studies in the table can be grouped into three categories based on the types of factors used as attributes in the decision tree classification model. (i) Studies that use both the inputs and outputs of the DEA model as attributes. (ii) Studies that use only the inputs of the DEA model. (iii) Studies that use factors other than the inputs and outputs of the DEA model. In the first type of study, the efficiency of new DMUs can be estimated without using DEA when data on both inputs and outputs are available. Seven of the studies detailed above fall into this group. DEA cannot determine the efficiency of new DMUs with only known input values; in this case, the efficiency of new DMUs can be estimated using the ML algorithm by defining only the input values as attributes. These types of studies are referred to here as the second group. This issue is addressed in a study that applied the ML algorithm with both inputs alone and with both inputs and outputs as attributes [19]. The third group consists of studies that predicted the efficiency of new DMUs using internal and/or external factors rather than DEA inputs and outputs [20,21,22,23]. The studies in the second and third groups are similar in approach because neither uses the DMU’s outputs as attributes in the ML algorithm. In many real-world situations, the output values of a new DMU are unknown, while the input values are available. Therefore, the approach in the second group is particularly useful in practice. The prediction model in this study uses only the DMU’s inputs as attributes because the outputs are unknown before a campaign is implemented. 
However, campaign designers want to know whether their designed campaign will be efficient; if not, they can redesign it by modifying its components (e.g., booking period). In this respect, the second group of approaches has the potential to make significant contributions to campaign design.
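The second-group approach described above can be sketched in a few lines: train a classifier on the DEA inputs of past campaigns, with the DEA efficiency class as the target, and score a new campaign from its inputs alone. This is a minimal illustration with randomly generated stand-in data and scikit-learn's CART-style tree (standing in for J48/C5.0/CART); the variable names and the synthetic labeling rule are assumptions, not the study's actual data.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in data: six DEA inputs per historical campaign
# (booking period, travel period, emails sent, flights taken,
#  market share, market size) and a DEA-derived efficiency label.
X_hist = rng.random((76, 6))
y_hist = (X_hist[:, 0] + X_hist[:, 4] > 1.0).astype(int)  # 1 = efficient

# Train on past campaigns: attributes are the DEA inputs only,
# because a new campaign's outputs are unknown before launch.
clf = DecisionTreeClassifier(max_depth=4, random_state=0)
clf.fit(X_hist, y_hist)

# A newly designed campaign: only its input values are known pre-launch.
new_campaign = rng.random((1, 6))
verdict = clf.predict(new_campaign)[0]  # 1 -> predicted efficient
```

If the predicted class is inefficient, the designer can adjust controllable inputs (e.g., booking period) and re-score the design before launch.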
Five of these studies used WEKA (Waikato Environment for Knowledge Analysis) software. In one study, the accuracies of three different ML algorithms were evaluated using WEKA, RapidMiner, and Tanagra, and the highest accuracy rate in the decision tree classification algorithm was 91.23%, achieved with WEKA [24].
C4.5/J48 was used as the decision tree classification algorithm in six studies, C5.0 in one study, CART in four studies, and CIT (Classification and Interaction Trees) together with CART in one study. The following results were obtained in studies comparing decision tree classification algorithms with other types of classification algorithms in terms of accuracy. In one study, the accuracy of the decision tree classification algorithm C5.0 was compared with Random Forest (RF) and Neural Network (NN), and the following results were reported: C5.0 (100), RF (98.5), and NN (86) [25]. The corresponding Kappa values were 1, 0.95, and –0.0141. Another study, which belongs to the second group of approaches (using only DEA inputs as attributes), showed that J48 outperformed Naive Bayes and Support Vector Machine (SVM), both with and without balancing [19]. When both inputs and outputs were used, J48 was more accurate than Naive Bayes and SVM without balancing; however, with balancing, J48 was more accurate only compared to Naive Bayes. In another work, CCR and BCC efficiency scores were compared with the accuracy of C4.5 and NN, yielding the following results: C4.5—CCR: 90.91, BCC: 81.82; NN—CCR: 72.3, BCC: 100 [26]. In one study, three classification algorithms, including a decision tree, were compared in WEKA, with the following results: Decision tree (91.23), K-Nearest Neighbor (89.51), and Naive Bayes (79.25) [24]. While the decision tree produced the highest accuracy in WEKA, other algorithms achieved better results in different software. Another study compared decision trees with SVM, K-Nearest Neighbors (K-NN), Linear Discriminant Analysis, and RF, concluding that the best algorithms overall were RF and SVM. For two classes, decision trees ranked second after RF [27]. 
Finally, one study compared CART with CIT, RF-CART, RF-CIT, Artificial Neural Network (ANN), and Bagging, reporting the following accuracy rates: CART (75.50), CIT (67.55), RF-CART (82.78), RF-CIT (75.50), ANN (68.21), and Bagging (84.11) [22]. The corresponding Area Under the Curve (AUC) values were 0.7466, 0.7077, 0.9293, 0.8516, 0.6951, and 0.9221, respectively. CART performed better than CIT but was outperformed by Bagging, RF-CART, and RF-CIT (although CART and RF-CIT had the same accuracy, the AUC of CART was lower). Among these algorithms, CIT performed worst.
Differences in the accuracy rates of predictions made using ML algorithms based on the efficiency results of different DEA models have been examined in several studies. The accuracy rates of CCR and BCC models were compared [26]. The differences between CRS-Tier, VRS-Tier, conventional CRS, and conventional VRS models were also analyzed [27]. Predictions based on the efficiency results of the CCR model (90.91) were reported to be more accurate than those based on the BCC model (81.82) [26]. Conventional CRS and VRS models were found to produce more accurate predictions than Tier-DEA models. Moreover, CRS models yielded lower accuracy rates than VRS models, with the difference being very small for two classes but much larger for ten classes (CRS (2): 98.24, VRS (2): 98.85; CRS (5): 86.29, VRS (5): 91.55; CRS (10): 69.94, VRS (10): 82.62) [27].
In these studies, the number of classes used for the target variable in decision tree classification algorithms is generally two. One study used 10 classes [28], while another examined accuracy rates using 2, 3, 5, and 10 classes and found that accuracy decreased as the number of classes increased [27]. In DEA–Decision Tree classification models, the efficiency scores generated by the DEA model are used to determine the classes of the target variable. When efficiency scores are measured in the 0–1 range, several studies classify efficiency scores equal to 1 as efficient and those below 1 as inefficient. Contrary to this common practice, a threshold value of 0.8 was used to define classes in one study [25]. In studies where efficiency was measured using the Malmquist Productivity Index (MPI), which can yield efficiency values greater than 1, efficiency scores of 1 and above were classified as efficient and those below 1 as inefficient [24,29]. In another MPI-based study, three classes were used [21].
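The labeling conventions discussed above reduce to a simple thresholding rule. A minimal sketch (the function name is illustrative, not from any of the cited studies):

```python
def dea_class(score, threshold=1.0):
    """Map a DEA efficiency score to the binary target-variable class.

    The common convention labels a DMU efficient only at score == 1
    (on the 0-1 scale); some studies relax this to a lower cut-off
    such as 0.8, and MPI-based studies apply the 1.0 cut-off on an
    unbounded scale where scores above 1 are possible.
    """
    return "efficient" if score >= threshold else "inefficient"
```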
Decision tree classification algorithms are among the most widely used classification methods in business and marketing applications [30]. The method is easy to interpret and allows complex relationships between inputs and outputs to be simplified into understandable rules [31]. Moreover, in the campaigns analyzed in this study, managers make a series of decisions—such as determining the booking period, the travel period, and the number of emails to be sent. Since campaigns are inherently decision-making processes, the use of decision tree classification algorithms is conceptually consistent with the nature of the problem. Classifying campaigns as efficient or inefficient using decision trees helps transform efficiency estimation into actionable insights that can directly support managerial decision-making. For these reasons, decision tree classification algorithms were selected for this study.

3. Methodology

This paper is intended as a methodological contribution. The aim of the study is to develop and demonstrate an approach for measuring the efficiency of airline email campaigns and for predicting the efficiency of a new campaign at the design stage, before it is implemented. The proposed methodology combines two elements: (i) an efficiency measurement model with DEA, and (ii) a campaign efficiency prediction model using ML, specifically decision tree classification algorithms. Three algorithms were applied: J48 (C4.5 implementation in WEKA (version 3.8.6; University of Waikato, Hamilton, New Zealand)), SimpleCART (CART implementation in WEKA, version 3.8.6; hereafter referred to as CART), and C5.0 (implemented in R 4.3.1 using the C5.0 package). These were chosen due to their interpretability and widespread use in the literature. In doing so, the study not only evaluates past campaigns but also provides a framework that allows practitioners to estimate whether a newly designed campaign is likely to be efficient, and to use this information during the design process. This contributes to the proactive management of inefficiencies before campaigns are launched.
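The evaluation protocol mentioned in the abstract (SMOTE for class imbalance plus stratified 10-fold cross-validation) can be sketched as follows. This is a hand-rolled, minimal SMOTE using only NumPy and scikit-learn, shown on randomly generated stand-in data; the actual study used WEKA/R implementations, and the data and parameter choices here are assumptions. The key design point illustrated is that oversampling is applied only to each training fold, so synthetic samples never leak into the evaluation fold.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

def smote(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE: synthesize n_new minority samples by interpolating
    between each sample and one of its k nearest minority neighbors."""
    rng = rng or np.random.default_rng(0)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)           # idx[:, 0] is the point itself
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = idx[i, rng.integers(1, k + 1)]  # a random true neighbor
        gap = rng.random()
        out.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(out)

# Hypothetical stand-in data: 76 campaigns, six inputs, imbalanced labels.
rng = np.random.default_rng(1)
X = rng.random((76, 6))
y = np.array([1] * 26 + [0] * 50)   # 1 = efficient (minority class)

# Stratified 10-fold CV; SMOTE balances the training fold only.
accs = []
for tr, te in StratifiedKFold(n_splits=10, shuffle=True,
                              random_state=0).split(X, y):
    X_tr, y_tr = X[tr], y[tr]
    deficit = (y_tr == 0).sum() - (y_tr == 1).sum()
    X_syn = smote(X_tr[y_tr == 1], deficit, rng=rng)
    X_bal = np.vstack([X_tr, X_syn])
    y_bal = np.concatenate([y_tr, np.ones(deficit, dtype=int)])
    clf = DecisionTreeClassifier(random_state=0).fit(X_bal, y_bal)
    accs.append(clf.score(X[te], y[te]))
```

In practice one would use a maintained implementation (e.g., imbalanced-learn's `SMOTE` inside a pipeline) rather than this sketch, but the fold-wise application shown here is the point that matters for unbiased accuracy estimates.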
A methodology, which is an expanded form of the general methodological framework shown in Figure 1, was applied to email campaigns from an airline company (see Figure 2). A total of 76 previously executed email campaigns from a one-year period were selected. These campaigns were sales-oriented and targeted specific customer segments through email communication. The dataset includes both the structural characteristics of the campaigns and their corresponding outputs.

3.1. Efficiency Measurement with DEA

This subsection presents the methodological background of DEA and explains its rationale, model specifications, DMU definitions, methodological choices, and input–output structure in the context of this study.
Various techniques are employed in the evaluation of efficiency, including ratio analysis, regression analysis, the Stochastic Frontier Approach, and DEA. In this study, DEA was employed owing to its several advantages, such as the capacity to simultaneously manage multiple input and output variables, to identify reference units for inefficient DMUs (campaigns), and to determine the sources of inefficiency within these campaigns.
The DEA methodology is rooted in the seminal work of Farrell in 1957, which laid the foundation for efficiency measurement. The CCR model, introduced by Charnes, Cooper, and Rhodes in 1978, assumes CRS, while the BCC model, developed by Banker, Charnes, and Cooper in 1984, is based on VRS. Both models can be formulated as either output-oriented or input-oriented, depending on the objective of the analysis [32]. In addition to these, other DEA models include the Additive Model (AM), Multiplicative Model, Slack-Based Model, Hybrid Model, Super-Efficiency Model, and Dynamic DEA Model.
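To make the model structure concrete, the output-oriented CCR envelopment problem for DMU o is: maximize φ subject to Σ_j λ_j x_ij ≤ x_io for each input i, Σ_j λ_j y_rj ≥ φ y_ro for each output r, and λ_j ≥ 0. A minimal sketch of solving this with `scipy.optimize.linprog` is below; this is an illustration of the model, not the DEA software actually used in the study.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_output_oriented(X, Y):
    """Output-oriented CCR (CRS) expansion factor phi for each DMU.

    X: (n, m) inputs, Y: (n, s) outputs. phi = 1 means the DMU is
    efficient; the efficiency score is 1 / phi.
    """
    n, m = X.shape
    s = Y.shape[1]
    phis = []
    for o in range(n):
        # Decision variables: [phi, lambda_1, ..., lambda_n].
        c = np.zeros(n + 1)
        c[0] = -1.0                        # maximize phi -> minimize -phi
        A_ub, b_ub = [], []
        for i in range(m):                 # sum_j lambda_j x_ij <= x_io
            A_ub.append(np.concatenate(([0.0], X[:, i])))
            b_ub.append(X[o, i])
        for r in range(s):                 # phi*y_ro <= sum_j lambda_j y_rj
            A_ub.append(np.concatenate(([Y[o, r]], -Y[:, r])))
            b_ub.append(0.0)
        res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      bounds=[(0, None)] * (n + 1), method="highs")
        phis.append(res.x[0])
    return np.array(phis)
```

The output-oriented BCC (VRS) variant adds the convexity constraint Σ_j λ_j = 1 (via `A_eq`/`b_eq`), which is why BCC classifies at least as many DMUs efficient as CCR, consistent with the 46 versus 26 efficient campaigns reported in this study.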

3.1.1. DMUs Definitions

In this study, each email campaign of an airline company carried out within a one-year period was considered a DMU. A total of 76 campaigns were included in the DEA.

3.1.2. Inputs and Outputs

Inputs and outputs for DEA can vary across studies. In addition to the DEA-based campaign efficiency studies summarized in Table 1, indicators frequently used in previous studies in the literature are also reviewed, with the aim of clarifying the rationale for the selection of inputs and outputs in this study.
Literature survey on campaign evaluation indicators
To reflect the diversity of indicators adopted in prior studies, the following literature review summarizes a broad range of indicators used in airline marketing campaigns and in email marketing efficiency studies, regardless of whether these studies were conducted within a DEA framework.
Airline marketing campaign efficiency has been examined from multiple perspectives, with each study employing distinct evaluation indicators (see Table 2). Airlines’ social media presence has been assessed by identifying which platforms and content types generate stronger engagement [33]. The use of third-party seals in digital advertising has been experimentally tested, showing that recognized awards embedded in messages increased click-throughs when targeted appropriately [34]. The impact of paid advertising intensity on airline website performance has also been analyzed, emphasizing user engagement measures [35]. From a collaborative stakeholder perspective, a framework including destination awareness, emotional proximity, intention to visit, incremental spending, and ROI has been proposed [36]. A subsequent study expanded this framework in the Athens case using additional indicators to examine determinants of visit intention and conversion [37].
The literature on email marketing efficiency highlights a variety of indicators, which are summarized in Table 3. One study adapts the AIDA framework, where Attention is linked to open rate, Interest to click-through behaviors such as CTR or CTOR, and Action to conversion outputs, while unsubscribe is considered as a retention measure [38]. Another study proposes a real-time testing methodology to predict open and click metrics within hours, drawing on metrics such as open rate, CTOR, CTR, and timing variables (time-to-first-open, time-to-first-click, doubling time), and further emphasizes that clicks should be tied to purchases [39]. Other study highlights the need to account for past purchasing behavior [40]. A study demonstrates how personalization, even when peripheral, can enhance engagement and reduce unsubscribes [41]. Finally, recent work offers a systematic categorization of metrics across six dimensions, underlining the multidimensional nature of email efficiency evaluation [42].
Selection of inputs and outputs for this study
The inputs and outputs for the 76 email campaigns analyzed in this study were determined based on a combination of insights from the literature, expert opinions, and the availability of data. Two streams of prior studies in the literature were particularly relevant in this process: studies applying DEA to marketing campaigns and studies on email marketing indicators.
The literature on DEA applications to campaign efficiency shows that some inputs and outputs are highly specific to the platform through which the campaign is conducted (e.g., Facebook, website). Therefore, such platform-dependent variables were not considered in this study. Operational costs associated with preparing and sending emails (such as content creation and layout design) were also excluded, since these costs are negligible in email campaigns and, as all campaigns in this study belong to the same organization, they can be assumed to be identical across campaigns. Previous studies in the literature have also relied on stylistic characteristics of campaign messages (e.g., length, color). In addition, inputs such as campaign duration or ad duration have been employed in prior studies, which are conceptually related to the booking period and travel period variables used in this study. In contrast, the present study incorporates content-oriented factors, such as booking period, to capture the substantive aspects of the campaigns.
In terms of outputs, prior studies have examined both engagement metrics (e.g., CTR, likes, shares) and business metrics (e.g., sales, conversions). Consistent with this approach, the present study considers both dimensions by including CTOR as an engagement indicator and tickets sold as a business-related outcome.
The literature on email campaigns further shows that most indicators can be linked to the AIDA framework, which is also adopted in this study [38]. Accordingly, CTOR is used to represent Interest and tickets sold to represent Action, while unsubscribe is excluded as it is primarily influenced by external factors. In addition, findings on the role of past purchasing behavior are reflected in this study through the inclusion of customer flight history (number of flights taken) as an input. In this way, the specification of inputs and outputs in this study adapts the indicators identified in previous research while tailoring them to the context of airline email campaigns.
Other possible inputs were deliberately excluded from the analysis. Discount rate was not considered, as it was very similar across campaigns and thus not a distinguishing factor. Seat allocation was excluded because data on this variable could not be obtained. Ticket price was excluded due to heterogeneity across campaigns stemming from differences in departure and arrival destinations, which made it unsuitable for meaningful comparison.
Among other potential outputs, unsubscribe rate was omitted because it is considered to be largely influenced by external factors. Loyalty program variables (e.g., new membership counts) were not considered, as none of the campaigns in this study directly targeted loyalty program participation. Bounce rate and spam complaint rate were also excluded, as they mainly capture technical issues of email delivery and user-specific filtering behaviors rather than the design or effectiveness of the campaign.
Based on these considerations, six input variables were included in the analysis: booking period, travel period, number of emails sent, number of flights taken (customer history), market share, and market size. The two output variables are CTOR, which serves as a digital performance metric for email campaigns [43], and tickets sold, which represents a business-related outcome reflecting purchasing behavior. The literature emphasizes that the purchasing process should not be viewed solely as a final sale, but as a multi-stage journey involving cognitive (awareness), emotional (evaluation), and volitional (purchase) stages [44]. The input and output variables used in the analysis are presented in Table 4 and Table 5, respectively. All data in these tables are numeric.
To prevent potential issues arising from differences in scale and to ensure comparability among datasets, mean normalization was applied to the numerical values [45].
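As an illustration, the scaling step can be sketched in Python; this sketch assumes the common mean-normalization form (x − mean)/(max − min), and the input values shown are hypothetical:

```python
def mean_normalize(values):
    """Mean normalization: (x - mean) / (max - min).

    Illustrative sketch of the scaling step; the exact variant applied
    in the study follows [45] and may differ in detail.
    """
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:  # constant column: map every value to 0
        return [0.0 for _ in values]
    return [(v - mean) / spread for v in values]

# Hypothetical booking periods (days) of four campaigns
normalized = mean_normalize([9, 30, 90, 180])
```

After this transformation each column is centered near zero and bounded by its range, so inputs measured in days, counts, and percentages become directly comparable.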
When determining the inputs and outputs, the number of DMUs was also taken into account. The literature proposes several heuristics for ensuring an adequate sample size: (i) n ≥ m + p + 1 [46], (ii) n ≥ 2(m + p) [47], (iii) n ≥ max{3(m + p), m·p} [48,49], (iv) n ≥ 2·m·p [50].
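These heuristics can be checked numerically. The sketch below encodes the four rules as reconstructed above (the function name and return format are illustrative):

```python
def dea_sample_size_thresholds(m, p):
    """Minimum-DMU heuristics from the DEA literature for m inputs
    and p outputs:
      (i)   n >= m + p + 1
      (ii)  n >= 2 * (m + p)
      (iii) n >= max(3 * (m + p), m * p)
      (iv)  n >= 2 * m * p
    """
    return {
        "i": m + p + 1,
        "ii": 2 * (m + p),
        "iii": max(3 * (m + p), m * p),
        "iv": 2 * m * p,
    }

# With this study's m = 6 inputs and p = 2 outputs:
thresholds = dea_sample_size_thresholds(6, 2)
```

For m = 6 and p = 2 the rules yield 9, 16, 24, and 24 DMUs, which is why the segment sizes in Section 3.1.7 are judged against the values 9, 16, and 24.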

3.1.3. Choice of DEA Methodology

Output-oriented CCR and BCC models were applied, and super-efficiency models were used to further discriminate among efficient campaigns. Dynamic DEA, which incorporates carry-over variables linking consecutive periods [51], was not employed, because the email campaigns identified along the time axis were treated as independent DMUs. Inputs such as the booking periods of successive campaigns were determined independently; for example, two successive campaigns had booking periods of 9 and 180 days, respectively, indicating independence. Therefore, it was appropriate to treat campaigns as separate DMUs and to apply the conventional (i.e., cross-sectional) DEA methodology.

3.1.4. Orientation Choice

There is no strict rule for determining the orientation of a DEA model; in general, it depends on the context [52]. In this study, two factors were considered in choosing the orientation: (i) the purpose of the analysis [53], and (ii) the level of competition in the sector in which the DMUs operate [54]. In competitive markets, inputs are typically under the control of DMUs, whereas outputs are subject to market demand; therefore, DMUs seek to maximize their outputs [54]. In the context of this study, the campaigns were specifically designed to increase the outputs of CTOR and tickets sold. For this reason, the DEA models used here were specified as output-oriented. This choice is further supported by the highly competitive nature of the airline industry.
Moreover, Table 1 shows that, among eight studies on DEA and marketing campaigns, two explicitly specify their orientation as output-oriented and two explicitly as input-oriented. Of the remaining four, two implicitly indicate an output orientation, while the other two do not specify their orientation. In addition, three non-profit campaigns not included in our main literature review but similar in structure to the present study (two election campaigns and one vaccination campaign) also reported output orientation, either implicitly or explicitly, since their objective was to increase votes or vaccinations [10,11,12]. These findings further support the use of output-oriented DEA models in campaign efficiency analysis.
Another practical reason for adopting output orientation is that when predicting the efficiency of a new campaign using ML algorithms, only the inputs are known ex ante, while the outputs are not. An input-oriented DEA would therefore generate efficiency scores inconsistent with the very purpose of campaign design. In addition, the orientation that yields lower efficiency values under the VRS (BCC) assumption should be preferred [52]. The average efficiency score of these campaigns was higher for the input-oriented BCC model (87.06) than for the output-oriented BCC model (80.81), indicating that output orientation provides stronger discriminatory power. All efficiency results are therefore based on output-oriented DEA models. Moreover, in an output-oriented DEA model, input targets can also be derived together with output targets [55].

3.1.5. Scale Type: CCR and BCC

CCR Model
The CCR model assumes constant returns to scale (CRS), under which an increase in all inputs by a given proportion results in an equal proportional increase in outputs. The output-oriented model, which is the focus of this study, is presented first and then related to the input-oriented form in order to demonstrate the mathematical equivalence between the two orientations. The input-oriented model aims to minimize input usage while producing at least the observed levels of outputs. In contrast, the output-oriented model aims to maximize outputs without exceeding the observed input levels. These models can be formulated as follows [56]:
(DLPO₀)   max η
subject to   x₀ − Xμ ≥ 0
η y₀ − Yμ ≤ 0
μ ≥ 0
An optimal solution of (DLPO₀) can be derived from the optimal solution of the input-oriented CCR model as follows. By defining λ = μ/η and θ = 1/η, (DLPO₀) can be expressed as:
(DLP₀)   min θ
subject to   θx₀ − Xλ ≥ 0
y₀ − Yλ ≤ 0
λ ≥ 0
This model is an input-oriented CCR model. Accordingly, the optimal solution of the output-oriented model can be derived from that of the input-oriented model as follows:
η* = 1/θ*,   μ* = λ*/θ*
The slacks of the output-oriented model (t⁻, t⁺) are defined by:
Xμ + t⁻ = x₀
Yμ − t⁺ = η y₀
These values are related to the slacks of the input-oriented model as follows:
t⁻* = s⁻*/θ*,   t⁺* = s⁺*/θ*
Since θ* ≤ 1, η* satisfies the condition η* ≥ 1.
The higher the value of η * , the less efficient the DMU. θ * represents the input reduction rate, whereas η * describes the output enlargement rate. Based on these relationships, a DMU is considered efficient under the input-oriented CCR model if and only if it also maintains efficiency under the output-oriented CCR model.
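The θ*–η* relationship can be illustrated numerically. In the reduced one-input, one-output CRS case (unlike the study's six-input, two-output models, which require a full LP formulation), the CCR score collapses to a ratio comparison, so η* = 1/θ* can be verified directly; all figures below are hypothetical:

```python
def ccr_scores_single(xs, ys):
    """Input-oriented theta* and output-oriented eta* for the
    one-input/one-output CRS case: theta* is each DMU's output/input
    ratio relative to the best observed ratio, and eta* = 1/theta*."""
    best = max(y / x for x, y in zip(xs, ys))
    thetas = [(y / x) / best for x, y in zip(xs, ys)]
    etas = [1.0 / t for t in thetas]
    return thetas, etas

# Hypothetical campaigns: x = emails sent (thousands), y = tickets sold
thetas, etas = ccr_scores_single([10, 20, 40], [5, 16, 20])
```

The efficient unit gets θ* = η* = 1, while inefficient units show θ* < 1 and η* > 1, matching the statement that the two orientations identify the same efficient DMUs.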
The dual problem of (DLPO₀) is expressed in the following model, where the components of the vectors p and q are used as variables.
(LPO₀)   min px₀
subject to   qy₀ = 1
−pX + qY ≤ 0
p ≥ 0, q ≥ 0
On the multiplier side, let an optimal solution of the input-oriented multiplier model be denoted by (v*, u*). Then, the optimal solution of the output-oriented model (LPO₀) can be obtained as follows:
p* = v*/θ*,   q* = u*/θ*
It is clear that (p*, q*) is feasible for (LPO₀). Its optimality follows from the equation below, using the normalization v*x₀ = 1:
p*x₀ = v*x₀/θ* = 1/θ* = η*
Therefore, the output-oriented CCR model can be solved via the input-oriented solution, and the efficient projection of the evaluated DMU is given by:
x̂₀ = x₀ − t⁻*
ŷ₀ = η* y₀ + t⁺*
Further analysis reveals that (LPO₀) is mathematically equivalent to the following fractional programming problem:
min   πx₀ / ρy₀
subject to   πxⱼ / ρyⱼ ≥ 1   (j = 1, …, n)
π ≥ 0,   ρ ≥ 0
BCC Model
In the BCC model, the upper limit of production is defined by the concave envelopment surface formed by the existing decision-making units (DMUs). This structure is piecewise linear and exhibits the characteristic of Variable Returns to Scale (VRS) [56]. A version of the BCC model that aims to maximize outputs without exceeding the observed input levels is presented below [57]:
(BCC₀)   max η_B
subject to   Xλ ≤ x₀
η_B y₀ − Yλ ≤ 0
eλ = 1
λ ≥ 0
Dual (multiplier) form:
min   z = vx₀ − v₀
subject to   uy₀ = 1
vX − uY − v₀e ≥ 0
v ≥ 0,   u ≥ 0,   v₀ free in sign
Here, v₀ is the dual variable associated with the convexity constraint eλ = 1 and captures returns to scale. The equivalent fractional programming formulation of the BCC dual model is as follows:
min   (vx₀ − v₀) / uy₀
subject to   (vxⱼ − v₀) / uyⱼ ≥ 1   (j = 1, …, n)
v ≥ 0,   u ≥ 0,   v₀ free in sign

3.1.6. Super Efficiency DEA Model

The super-efficiency model (Andersen and Petersen, 1993) was also employed in this study to rank efficient DMUs. In DEA's basic models, CCR and BCC, the efficiency scores of fully efficient units are equal to 1. However, these models are insufficient for ranking the efficient decision-making units (DMUs). To address this limitation, Andersen and Petersen (1993) proposed a ranking approach known as the Super-Efficiency Model [58]. In this model, the DMU under evaluation is excluded from the reference set and compared with a production frontier formed by the linear combinations of the remaining units. This exclusion allows the evaluated DMU to obtain an efficiency score greater than 1, thereby enabling the ranking of units that are otherwise all considered efficient in the standard CCR and BCC models.
The mathematical formulation of the input-oriented BCC Super-Efficiency model is presented below [59]:
min   Eⱼ − δ(es⁻ + es⁺)
subject to   EⱼXⱼ = Σₖ≠ⱼ λₖXₖ + s⁻
Yⱼ = Σₖ≠ⱼ λₖYₖ − s⁺
Σₖ≠ⱼ λₖ = 1
λ, s⁻, s⁺ ≥ 0
where δ is a small positive (non-Archimedean) constant, e is a row vector of ones, and the evaluated unit j is excluded from the reference set.
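The ranking logic can be illustrated, again for the reduced one-input, one-output CRS case with hypothetical data, by excluding each unit from its own reference set:

```python
def crs_super_efficiency(xs, ys):
    """Super-efficiency for the one-input/one-output CRS case:
    each DMU's output/input ratio divided by the best ratio among
    the *other* DMUs, so efficient units can score above 1."""
    ratios = [y / x for x, y in zip(xs, ys)]
    scores = []
    for j, r in enumerate(ratios):
        best_other = max(r2 for k, r2 in enumerate(ratios) if k != j)
        scores.append(r / best_other)
    return scores

# Hypothetical campaigns: x = emails sent (thousands), y = tickets sold
scores = crs_super_efficiency([10, 20, 40], [5, 16, 20])
```

The single efficient unit now scores above 1 (here 1.6), while inefficient units keep their ordinary scores, which is exactly what permits a ranking among units that are tied at 1 under the standard models.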

3.1.7. DEA Models Applied in the Study

In this study, the conventional CCR and BCC models were employed under an output-oriented approach. The CCR model assumes CRS, while the BCC model incorporates a convexity constraint to allow for VRS. To rank efficient campaigns beyond the efficiency frontier, super-efficiency models were also applied. In addition, segment-based models (e.g., CCR_G_G, CCR_R_O, BCC_R_B) represent campaign subsets defined by different marketing conditions such as group size, seasonality, or route type, as detailed in Table 6.
The CCR version of the core model is referred to as CCR_C, while the BCC version is denoted as BCC_C. CCR-G is the group-based version of the core model: CCR-G-I refers to campaigns available for individual bookings, while CCR-G-G refers to campaigns available for group bookings (at least two customers). CCR-S is the seasonality-based version of the core model: CCR-S-L refers to campaigns with travel periods in the low season, while CCR-S-H refers to those with travel periods in the high season. CCR-R is the route type-based version of the core model: CCR-R-O refers to one-way campaigns, CCR-R-R to round-trip campaigns, and CCR-R-B to campaigns that include both one-way and round-trip options. The same segmentation structure was also applied to the BCC models. While the input and output variables remain consistent across all models, all models other than the core model are based on datasets segmented according to categorical variables.
As noted in the formulas provided under Section 3.1.2, the number of DMUs was considered when determining the variables, and a set of heuristic rules was applied to ensure an adequate sample size. In our study, m = 6 and p = 2, implying thresholds of 9, 16, and 24, which correspond to formulas (i), (ii), and (iii)–(iv), respectively. Accordingly, CCR_G_G and BCC_G_G (n = 9) and CCR_R_O and BCC_R_O (n = 11) satisfy only the first rule; CCR_R_B and BCC_R_B (n = 17) and CCR_S_H and BCC_S_H (n = 21) satisfy the first two rules; while the remaining segments exceed n = 24 and thus satisfy all four criteria. Moreover, if the number of DMUs is less than the combined number of inputs and outputs (m + p), many DMUs will tend to appear efficient and the discriminatory power of the model is reduced; therefore, it is desirable that n exceed (m + p) by several times [49]. Findings for segments not meeting the stricter rules should therefore be interpreted with appropriate caution.
There are numerous commercial and non-commercial software options available for conducting DEA. In this study, Frontier Analyst 4.0 was selected due to its advanced graphical user interface, compatibility with other applications, ability to efficiently handle large datasets, and comprehensive modeling capabilities [60].

3.2. Efficiency Prediction with Decision Trees

The decision tree method, selected in this study for the reasons outlined in Section 2.2, is a hierarchical classification strategy that operates from top to bottom. During the classification process, each level of the tree evaluates an attribute, progressing from the root node to the leaf nodes to make predictions on new data [61]. The most common decision-tree algorithms include ID3, CART, CHAID, and C4.5 with its extension C5.0. In this study, two WEKA implementations—J48 (C4.5) and CART—were adopted, and C5.0 was additionally implemented in R 4.3.1 using the C5.0 package.
C4.5 is a widely used decision tree classification algorithm developed by Quinlan [62]. J48, its open-source Java implementation available in the WEKA software package, builds on C4.5 and the earlier ID3 algorithm by introducing improvements such as pruning and the use of the gain ratio. The algorithm selects the attribute that best separates the data based on entropy and information gain, recursively partitioning the dataset into increasingly homogeneous subsets. This process continues until no further splits are possible, and pruning is applied to reduce overfitting and improve generalization [63].
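The gain-ratio criterion used by C4.5/J48 can be sketched with a hypothetical "season" attribute and DEA-based class labels (all names and data below are illustrative, not taken from the study's dataset):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a class distribution."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, attribute_index):
    """C4.5's split criterion: information gain divided by the split
    info (the entropy of the attribute itself), which penalizes
    attributes with many distinct values."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute_index], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy([row[attribute_index] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical campaigns: attribute 0 = season (H/L), class = DEA label
rows = [("H",), ("H",), ("H",), ("L",), ("L",), ("L",)]
labels = ["Efficient", "Efficient", "Efficient",
          "Inefficient", "Inefficient", "Efficient"]
ratio = gain_ratio(rows, labels, 0)
```

At each node the algorithm computes this quantity for every candidate attribute and splits on the one with the highest gain ratio, then recurses on the resulting subsets.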
The C5.0 algorithm is a new generation of decision tree-based ML methods developed as an improved version of the widely used C4.5 classifier [64]. Compared to its predecessor, C5.0 often generates more accurate rules, produces smaller trees, and runs significantly faster. It introduces several enhancements, including boosting (combining multiple trees to improve predictions), support for variable misclassification costs, the ability to handle new attribute types (such as dates, times, and ordered discrete features), and robust handling of missing values [65].
CART (Classification and Regression Trees) is a classification technique that produces binary decision trees, where each node splits into two child nodes. It can generate either classification or regression trees depending on whether the target variable is categorical or numeric. The method is widely known and frequently used, relying on cross-validation or independent test samples to select the optimal tree during pruning. The CART algorithm follows a greedy approach by choosing the best feature locally at each step, which is computationally efficient though not globally optimal. The process recursively divides the dataset into subgroups until a minimum size is reached, making CART a practical alternative to traditional prediction methods [66].
The selection of decision-tree classification algorithms was motivated not only by their suitability for post-DEA binary classification of “Efficient” vs. “Inefficient” campaigns, but also by their interpretability, which provides actionable insights for campaign design and managerial decision-making. For prediction, a software environment was employed for practicality and reproducibility. This choice is consistent with the literature, as several studies combining DEA modeling with decision tree classification algorithms (5 out of 13 reviewed) also relied on software implementations (see Table A1). WEKA was selected due to its widespread use, accessibility, and user-friendly interface [67]. This preference is further supported by studies reporting that WEKA achieved the highest accuracy among three software programs in decision tree classification tasks [24]. WEKA does not include C5.0 as a decision tree classification algorithm; instead, it provides J48 and CART. Previous studies indicate that the accuracy of J48 and CART is generally comparable, with no consistent superiority of one algorithm over the other. For this reason, and to ensure robustness, J48, C5.0, and CART were employed in this study.
In summary, the methodological choices made in this study are consistent with the general findings reported in the literature. This study follows the second group of approaches discussed in the DEA–ML literature, which rely solely on DEA inputs as attributes and are particularly relevant when outputs are not yet available. WEKA was selected as the primary software due to its reported superior performance, and three decision tree classification algorithms—J48, CART, and C5.0—were employed to ensure a robust comparison, with J48 and CART being the most frequently used in prior studies and C5.0 having demonstrated the highest accuracy despite its limited application. Furthermore, conventional CCR and BCC models were adopted, as they have been shown to outperform Tier-based alternatives in prediction accuracy. Finally, consistent with the prevailing practice in the literature, the target variable was defined with two classes, which has been found to yield higher predictive accuracy. Taken together, these methodological choices reflect both the gaps and the recommendations highlighted in Section 2.2, while tailoring the analysis to the airline email campaign context.

4. Results

4.1. Results of the DEA Models

The CCR and BCC DEA models described above were applied to models based on historical real-world data from an airline, using Frontier Analyst 4.0. During the data loading stage, each variable was assigned to one of three categories based on its data type: controlled input, uncontrolled input, and output. The variables I4, I5, and I6 were classified as uncontrolled inputs, as they represent factors determined by external conditions beyond the direct control of the DMUs (campaigns). These variables reflect environmental or structural characteristics that cannot be altered through managerial intervention but still influence efficiency outcomes [68,69].
In this study, the core model, based on the output variables CTOR and the number of tickets sold, was analyzed alongside segmented models developed according to group size, seasonality, and route type. All models are based on a dataset that includes 6 input and 2 output variables. After applying DEA, the number of efficient and inefficient campaigns for these models and their descriptive statistics are presented in Table 7. Since the maximum value for all models is 100, it is not specified.
Due to its VRS assumption, the BCC model identifies more campaigns as efficient than the CCR model. This is an expected result because the BCC model evaluates each campaign against a more flexible reference set and isolates pure technical efficiency by eliminating the effect of scale efficiency. In contrast, the CCR model assumes CRS and measures overall efficiency, which includes both technical and scale efficiency. As a result, the efficiency scores obtained from the BCC model are generally higher than those from the CCR model. Additionally, the lower standard deviations observed in the BCC models indicate that the BCC model assesses campaign efficiency in a more stable and consistent manner. However, standard deviations are still relatively high, pointing to substantial differences in efficiency between campaigns and suggesting that the model has strong discriminating power, which aligns with the heterogeneous nature of actual campaign data.
When CCR_G_I and CCR_G_G, segmented by group size, were examined separately, it was observed that all campaigns requiring purchases by at least two customers were evaluated as efficient in both CCR and BCC models, while those available for individual purchases received lower average efficiency scores compared to group campaigns. This may be due to the small number of campaigns in the group (G) segment. In models CCR_S and BCC_S, which incorporate the seasonality variable, the average efficiency scores for the high season (H) are found to be higher than those for the low season (L). This is an expected result, as demand tends to be higher during the high season. Regarding the route type variable in CCR_R and BCC_R, the round-trip (R) segment yields the lowest average efficiency scores among the route-type segments. This situation may stem from the difficulty of planning return travel simultaneously. Conversely, one-way (O) campaigns and those offering both one-way and round-trip options (B) show higher average efficiency scores, possibly because they offer greater flexibility to customers. Had the number of campaigns across route types been more balanced, the findings could have led to stronger and more generalizable interpretations. In the CCR_C model, the ratio of efficient campaigns to the total number of campaigns is 34%. In the BCC_C model, the number of efficient campaigns is 61% of the total number of campaigns.
While both the CCR and BCC models provide measures of technical efficiency, they differ in their underlying assumptions about returns to scale. The CCR model assumes CRS, whereas the BCC model allows for VRS. By comparing the efficiency scores obtained from these two models, it is possible to isolate the effect of scale inefficiency. This comparison forms the basis for calculating scale efficiency (SE), which indicates whether a DMU (campaign) is operating at an optimal production scale. Specifically, scale efficiency is computed as the ratio of the CCR model to the BCC model; values less than one signal the presence of scale inefficiency [70].
Scale efficiency (SE) values, calculated as the ratio of the CCR score to the BCC score (SE = CCR/BCC), are presented for the models in Table 8. The results show that campaign efficiency varies depending on distinctions based on campaign features such as group size, seasonality, and route type. In the core model, all campaigns were evaluated together, and the average scale efficiency was found to be 0.87, indicating that campaigns, on average, could improve their efficiency by 13% through scale adjustments. Also, when campaigns are segmented based on characteristics such as group size, seasonality, and route type, individual campaigns (I), high-season campaigns (H), one-way campaigns (O), and campaigns valid for both one-way and round-trip travel (B) exhibit higher average scale efficiency values. This indicates that segmenting and analyzing campaigns based on specific structural characteristics reveals which campaign types operate closer to optimal scale conditions.
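The SE computation itself is a simple elementwise ratio; a sketch with hypothetical scores (Frontier Analyst reports scores on a 0–100 scale, but the ratio is scale-invariant):

```python
def scale_efficiency(ccr_scores, bcc_scores):
    """SE = CCR / BCC for each campaign; SE < 1 signals scale
    inefficiency, i.e., the campaign is not operating at its
    optimal production scale."""
    return [c / b for c, b in zip(ccr_scores, bcc_scores)]

# Hypothetical CCR and BCC efficiency scores for three campaigns
se = scale_efficiency([70.0, 100.0, 50.0], [100.0, 100.0, 80.0])
```

A campaign with SE = 1 is scale-efficient; for SE < 1, the gap 1 − SE is the share of overall inefficiency attributable to scale rather than to pure technical inefficiency.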
As mentioned above, DEA was applied to a model consisting of six input and two output variables in this study. Subsequently, group size, seasonality, and route type were incorporated separately, and efficiency scores were recalculated for each case. However, it is important to determine whether the efficiency scores obtained from these models and corresponding categories differ significantly in order to identify the most appropriate model structure. At this stage, the normality of the efficiency scores was assessed using the Shapiro–Wilk test, while comparisons between models were performed using the Kruskal–Wallis test followed by the Dunn post hoc test. All statistical analyses were conducted using R (version 4.3.1). The Shapiro–Wilk test resulted in p-values below 0.05 for both the CCR and BCC models, indicating that the assumption of normality was not met. Therefore, nonparametric tests were employed for comparing efficiency across models.
The Kruskal–Wallis test is a nonparametric alternative to analysis of variance, designed to compare the median values of three or more independent groups when the assumption of normality is not met. While it reveals an overall difference (omnibus effect) among groups, post hoc analysis is required to identify which specific group pairs exhibit statistically significant differences [71].
According to the results in Table 9, the Kruskal–Wallis test indicated a statistically significant difference among the CCR models. This indicates that there is a significant difference in campaign efficiency scores between at least one pair of models. To identify which specific model pairs exhibited this difference, the Dunn post hoc test with Bonferroni correction was applied, and the results are presented in Table 10. As an example, a significant difference was observed between CCR_G_G and CCR_C. In addition, a significant difference was found between CCR_S_H, which represents high-season campaigns, and the core model CCR_C. Differences were also identified between CCR_S_H and CCR_R_R, which covers round-trip campaigns.
To evaluate the strength of these differences, the rank-biserial correlation coefficient (r) was calculated as a measure of effect size. This coefficient ranges from −1 to +1, where values closer to ±1 indicate stronger differences between groups. Positive values imply that the first group ranks higher than the second, while negative values indicate the opposite. As shown in Table 11, the difference between CCR_R_R and CCR_R_O demonstrated a greater effect size than the differences between CCR_C and CCR_R_O or CCR_R_R and CCR_S_H. An r value between 0.10 and 0.30 is interpreted as a small effect, 0.30 to 0.50 as a moderate effect, and above 0.50 as a large effect. In addition, the small number of campaigns in the CCR_G_G segment may have caused all campaigns in this model to be efficient, so this limitation should be taken into account when interpreting the results.
Although the Kruskal–Wallis test for the BCC models indicated a significant overall difference (p < 0.05), Dunn’s post hoc test with Bonferroni correction revealed no significant pairwise differences. Therefore, detailed post hoc results were not presented.
The super-efficiency model was applied using the Frontier Analyst program to rank the campaigns identified as efficient by the CCR and BCC models. In order to facilitate the interpretation of the results and to observe the distribution of super-efficiency scores more clearly, the values were grouped into intervals. The number of groups was determined according to Sturges’ rule, expressed as k = 1 + 3.3·log₁₀(n), where k is the number of classes and n is the number of items [72]. In this study, n was taken as 46 (the number of campaigns in the segment with the largest sample size), which yields approximately k = 6 intervals. Accordingly, the super-efficiency models are grouped into six intervals in Table 12 and Table 13 to enhance the interpretability of the results. CCR_C, which evaluates all campaigns collectively without any segmentation based on campaign characteristics, shows that 69% (18 out of 26) of the super-efficient campaigns fall within the [100–250) range. This suggests that most efficient campaigns are near the efficiency threshold, indicating relatively small efficiency differences among them. CCR_G_G, which represents group campaigns, includes a small number of observations, yet clearly distinguishes high-performing campaigns. Among the more numerous individual campaigns in CCR_G_I, 64% (14 out of 22) also fall within the [100–250) range. Other super-efficient campaigns demonstrate a more homogeneous distribution across efficiency intervals compared to earlier models. The model (CCR_R_B) with the highest proportion, 27% (3 out of 11), of super-efficient campaigns is the one in which customers could purchase either one-way or round-trip tickets. Notably, most of the campaigns in this group fall within the [850–1000] efficiency score range, indicating a concentration of highly distinguished campaigns in this segment.
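The interval count from Sturges’ rule can be verified numerically (rounding down is assumed here; the exact rounding convention is not critical, since the result is 6 either way for n = 46):

```python
from math import floor, log10

def sturges_classes(n):
    """Sturges' rule for the number of class intervals:
    k = 1 + 3.3 * log10(n), rounded down (assumed convention)."""
    return floor(1 + 3.3 * log10(n))

# Largest segment in the study has 46 campaigns
k = sturges_classes(46)
```

For n = 46 this gives k = 6, matching the six super-efficiency intervals used in Tables 12 and 13.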
On the other hand, the super-efficiency evaluation based on the BCC model, as presented in Table 13, reveals that campaigns included in the models tend to exhibit high levels of super efficiency, with a substantial concentration in the [850–1000] range. Compared to the CCR model, the BCC results demonstrate a sharper distinction among efficient campaigns. In BCC_S_L, which represents campaigns conducted during the low season, 88% of the efficient campaigns were found to be super-efficient within the highest interval, indicating a strong efficiency concentration. Conversely, BCC_S_H, covering high-season campaigns, shows a more dispersed distribution of efficiency scores, suggesting greater variability in campaign efficiency during peak travel periods. In BCC_R_R, which includes round-trip campaigns, 97% of the campaigns fall within the [850–1000] range, signifying an exceptionally high level of super efficiency. Similarly, BCC_R_O for one-way campaigns and BCC_R_B for campaigns covering both one-way and round-trip options also exhibit a strong super-efficiency profile, reinforcing the notion that route flexibility may contribute positively to campaign efficiency.
In this study, quantitative sensitivity analysis was conducted on the CCR models CCR_C and CCR_S_H, which showed statistically significant differences based on the Kruskal–Wallis and Dunn tests and had the majority of their campaigns concentrated near the lower bound of the super-efficiency range. These models were selected to evaluate the robustness of the model structure with respect to sample characteristics and to assess the stability of the DEA results. For both models, ±5% and ±10% changes were applied to the output variables O1 and O2 in order to analyze the sensitivity of the efficiency scores to small and moderate variations in output data.
CCR_C is constructed based on two output variables: O1 (CTOR) and O2 (number of tickets sold). To assess the robustness of this model, both output variables were increased and decreased by 5% and 10%, and the DEA models were recalculated accordingly. As presented in Table 14, the average difference in efficiency scores across all scenarios is either zero or negligible. The Spearman rank correlation coefficient was found to be 1.00 in all cases, indicating that the ranking of campaigns remained completely unchanged. Moreover, the number of campaigns classified as efficient consistently remained at 26 in all variations. These results demonstrate that CCR_C is not sensitive to small or moderate changes in the output values, suggesting a robust and stable model structure. The consistency of both the efficiency rankings and the number of efficient campaigns underlines the reliability of the findings. CCR_S_L and CCR_S_H represent the seasonal segmentation of CCR_C, corresponding to low season and high season campaigns, respectively. As shown in Table 14, when the output variables were altered by ±5% and ±10%, the average change in efficiency scores was minimal, and the ranking order remained constant. These findings are consistent with those of Model CCR_C.
Following the quantitative sensitivity analysis that examined the impact of small- and medium-scale changes in output variables, a structural robustness analysis was conducted to evaluate the influence of each input variable on the defined models. In this analysis, each input variable (I1–I6) was individually removed from the model, and the resulting efficiency scores were compared with those of the original model. The evaluation was based on three key metrics: the mean difference in efficiency scores, the Spearman rank correlation coefficient (Spearman’s ρ), and the number of efficient campaigns.
In Model CCR_C, the original mean efficiency score was 69.5, and the number of efficient campaigns was 26. As presented in Table 15, the structural sensitivity analysis shows that the removal of I3 (the number of emails sent) had the least impact on the model across all metrics. The average efficiency score difference was only −0.56, the Spearman rank correlation remained high at 0.99, and the number of efficient campaigns slightly decreased to 25. These findings indicate that I3 has a relatively limited influence on model results. In contrast, the exclusion of I6 significantly disrupted the model: the average model difference reached −13.55, the rank correlation dropped to 0.65, and the number of efficient campaigns declined to 16. These results suggest that I6 is a critical variable and should be retained in the model. The removal of I1, I2, I4, and I5 produced moderate effects, indicating that while these variables contribute to the model, their individual impact is less pronounced compared to I6.
Table 16 and Table 17 present the results of the input variable sensitivity analysis conducted on two different datasets segmented by travel period: low season and high season. In this analysis, each input variable was removed one at a time, and the resulting efficiency scores were compared with the original DEA results. Before sensitivity analysis, CCR_S_L had 20 efficient campaigns with an average efficiency score of 76.5, while CCR_S_H had 16 efficient campaigns with an average efficiency score of 94. For the low season group, I3 was found to have the least impact on model efficiency. In contrast, the removal of I6, I1, and I5 caused significant changes. For example, when I6 (market size) was excluded, the Spearman rank correlation dropped to 0.64, the average efficiency score difference was −9.60, and the number of efficient campaigns decreased by 7.
In the analysis of high season campaigns presented in Table 17, the I4 variable was found to have no impact on the efficiency scores. In contrast, I6 emerged as the most critical input. When I6 was excluded from the model, the mean efficiency score difference reached −28.05, the Spearman rank correlation dropped to 0.62, and the number of efficient campaigns declined to 8. These results suggest that I6 is an indispensable variable for evaluating campaigns conducted during the high season. Likewise, the removal of I5 also led to a substantial impact on model results.
Table 18 presents the results of the sensitivity analysis conducted by removing the super-efficient campaigns from CCR_C, CCR_S_L, and CCR_S_H. In the analysis for CCR_C, the removal of three super-efficient campaigns (35th, 41st, and 49th) resulted in an increase in the number of efficient campaigns from 23 to 26 among the remaining 73. This suggests that eliminating dominant campaigns reduced their influence within the reference set, allowing other campaigns to approach or surpass the efficiency threshold. After excluding the three campaigns with a super-efficiency score of 1000, the mean efficiency score increased by 6.12 points, indicating that more campaigns attained higher efficiency. The Spearman rank correlation coefficient of 0.88 implies that the overall ranking was largely preserved, though some shifts in position occurred. In CCR_S_L, which includes campaigns conducted during the low season, the mean efficiency score increased by 4.4 points, with rankings remaining relatively stable. Although the number of efficient campaigns was initially expected to decrease from 20 to 17 in the new scenario, it instead increased to 19, suggesting that certain campaigns gained relative advantage and exceeded the threshold due to the change in the reference set. In CCR_S_H, after the removal of a single super-efficient campaign from the original model, no changes were observed in the ranking order. As expected, the number of efficient campaigns decreased by one.
In addition, DEA was used to calculate the percentage improvements required in input and output variables to transform inefficient campaigns into efficient ones. As an example of an inefficient campaign, Campaign 15 in CCR_C has an efficiency score of 69.45%, with five campaigns—Campaigns 3, 31, 35, 41, and 42—identified as peers or reference points for efficiency. As shown in Table 19, to make Campaign 15 efficient, the output variables should be increased by approximately 44%, the I2 variable (booking period) should be reduced by 34%, and the I3 variable (number of emails sent) should be reduced by 26%. The reported values correspond to slack-based adjustments, i.e., the specific input reductions or output increases needed for this campaign to reach the efficient frontier. Similar analyses can be performed for the other campaigns; Campaign 15 is shown here as an illustrative case.
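The arithmetic behind such improvement targets is a weighted combination of the peer campaigns. The sketch below uses hypothetical peer data and weights (the real lambda values come from the DEA envelopment solution) to show how the percentage adjustments are derived.

```python
import numpy as np

def projection_targets(x_o, y_o, peer_X, peer_Y, lambdas):
    """Composite target for an inefficient DMU from its DEA peers.

    The target point is the lambda-weighted combination of the peers;
    changes are reported as percentages of the DMU's own values
    (negative = required input reduction, positive = required output
    increase).
    """
    lambdas = np.asarray(lambdas, dtype=float)
    x_t = lambdas @ np.asarray(peer_X, dtype=float)
    y_t = lambdas @ np.asarray(peer_Y, dtype=float)
    return (x_t, y_t,
            100.0 * (x_t - x_o) / x_o,
            100.0 * (y_t - y_o) / y_o)

# Hypothetical numbers: two inputs, one output, two peer campaigns with
# equal weights (illustrative only, not Campaign 15's actual solution).
x_o, y_o = np.array([100.0, 50.0]), np.array([10.0])
peer_X = np.array([[80.0, 40.0], [60.0, 30.0]])
peer_Y = np.array([[12.0], [18.0]])
x_target, y_target, input_change, output_change = projection_targets(
    x_o, y_o, peer_X, peer_Y, [0.5, 0.5])
print(input_change, output_change)  # [-30. -30.] [50.]
```

Here the composite peer uses 30% less of each input while producing 50% more output, which is the same form of reading as the Table 19 entries for Campaign 15.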
Campaign 42 was selected as an example from among the efficient campaigns in the reference set of Campaign 15. As shown in Table 20, it contributes 36% to output variable O1 and 91% to output variable O2. Similarly, it makes substantial contributions to I6 and I3. By examining the remaining campaigns in the reference set, it is possible to identify which campaign contributes most significantly to each input or output variable.
Table 21, which presents input and output contributions, was analyzed to identify the variables that most significantly contributed to the overall efficiency of Campaign 15. O2 emerged as the most influential. On the input side, I6 contributed 77% and I1 contributed 21%, indicating that these two variables played the most critical roles in the campaign’s efficiency. These findings offer strategic guidance to decision-makers by highlighting which variables should be prioritized to enhance campaign efficiency.
The relatively high standard deviations observed in campaign efficiency scores indicate significant differences between campaigns. It was found that all campaigns requiring the participation of at least two customers were efficient. This result may be attributed to the limited number of group campaigns in the dataset. When campaign efficiency scores were analyzed according to seasonality, campaigns associated with high season travel periods tended to achieve higher efficiency scores than those associated with low season travel. This may be attributed to increased customer travel activity during periods of high demand for the destinations targeted by the campaign. In models segmented by route type, campaigns offering one-way or one-way and round-trip options were found to be more efficient than campaigns offering only round-trip travel. This indicated that offering customers more flexibility tended to increase campaign efficiency. Having a more balanced and larger number of campaigns in each categorical segment could have increased the robustness of the findings and the reliability of the comparisons. In terms of scale efficiency, high season campaigns and campaigns offering one-way or combined one-way and round-trip travel options show relatively higher efficiency. When examining whether there are significant differences between models, group campaigns were excluded from the evaluation due to the small sample size. It was observed that the difference between round-trip and one-way campaigns was greater than the difference between the core model and one-way campaigns, as well as the difference between round-trip and high-season campaigns. In the analysis of super-efficient models, it was observed that the most efficient campaigns in CCR models clustered near the efficiency frontier, with relatively small differences in their efficiency scores. 
In contrast, in BCC models, super-efficient campaigns were concentrated near the upper bound, indicating a sharper distinction among high-performing campaigns. Sensitivity analysis conducted on the core model and seasonality-based segments revealed that the models are robust to small and medium-scale changes in output variables. However, when input variables were removed individually, the removal of the market size variable led to significant changes in the efficiency scores, highlighting the critical role of this variable in all models. The market share variable played an important role in the high season, while it was relatively less influential in the low season. The number of travel days variable was more influential in the low season, while its removal from the model caused no change in the high season. This can be interpreted as seasonal differences having a significant effect on model sensitivity. These findings increase the interpretability of the model and highlight the potential of season-specific planning.

4.2. Results of the Decision Tree Classification Algorithms

Building on these results, DEA efficiency scores were used as class labels to train decision-tree models and identify the key variables influencing campaign classification, with the aim of generating interpretable rules for evaluating the efficiency of new campaigns. J48 (C4.5) and CART were implemented in WEKA with confidence factor set to 0.25 (C = 0.25), minimum number of instances per leaf set to 2 (M = 2), subtree raising enabled, reduced-error pruning disabled, and stratified 10-fold cross-validation with random seed 1; all other parameters were left at their default values. Since WEKA does not provide C5.0, this algorithm was implemented in R 4.3.1 using the C50 package (function C5.0). The R implementation used a single tree (trials = 1), a minimum of two cases per leaf (minCases = 2), and a pruning confidence factor of 0.25, evaluated through stratified 10-fold cross-validation with random seed 123.
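For readers without WEKA or R, an approximate replication of this setup is possible in scikit-learn, whose DecisionTreeClassifier implements CART; there is no direct J48/C5.0 equivalent, and C4.5's confidence-factor pruning has no exact counterpart (cost-complexity pruning via ccp_alpha is the nearest analogue). The sketch below uses synthetic stand-in data in place of the confidential campaign set.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: 76 "campaigns", 6 attributes, binary efficiency class.
X, y = make_classification(n_samples=76, n_features=6, n_informative=4,
                           n_redundant=2, random_state=1)

# min_samples_leaf=2 mirrors WEKA's M=2 / C5.0's minCases=2; C4.5's
# confidence factor (C=0.25) has no direct equivalent, so cost-complexity
# pruning (ccp_alpha, value chosen arbitrarily here) stands in for it.
tree = DecisionTreeClassifier(min_samples_leaf=2, ccp_alpha=0.01,
                              random_state=1)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
scores = cross_val_score(tree, X, y, cv=cv, scoring="accuracy")
print(f"stratified 10-fold CV accuracy: {scores.mean():.3f}")
```

Because the pruning mechanisms differ, accuracy figures from such a replication should be expected to deviate somewhat from the WEKA and R results reported here.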
These decision tree-based classification methods categorize campaigns as efficient or inefficient based on the independent variables in the dataset. Through their hierarchical structures, the algorithms identify the variables that are most influential in the classification process.
In the classification model, campaigns were divided into two classes—Efficient and Inefficient—based on the efficiency scores derived from DEA, which served as the target class. The six attributes (predictors) used in the classification algorithms correspond to the six input variables of the DEA model, as presented in Table 4. The models were evaluated using a dataset of 76 campaigns and validated through 10-fold cross-validation. In this analysis, three algorithms were employed: J48 (C4.5), CART, and C5.0. The results for the core models (CCR_C and BCC_C) are summarized in Table 22.
For the CCR_C model, J48 achieved an accuracy of 76.3% (58 correctly classified campaigns), C5.0 reached 73.6% (56 campaigns), while CART lagged behind with 67.1% (51 campaigns). For the BCC_C model, C5.0 delivered the highest classification accuracy with 78.9% (60 campaigns), followed by J48 with 71.1% (54 campaigns) and CART with 68.4% (52 campaigns).
The confusion matrices in Table 23 reveal differences in error distribution across the algorithms. CART shows a tendency to misclassify inefficient campaigns as efficient, leading to imbalanced results, whereas J48 is more balanced but still produces some asymmetry. By contrast, C5.0 demonstrates the most consistent results, with relatively fewer misclassifications across both classes, which explains its superior accuracy in the BCC_C model. These findings confirm that class imbalance affects the classification results of certain algorithms. To address this issue, the SMOTE technique was subsequently applied to balance the dataset and improve classification robustness. Based on these results, the subsequent analyses in this study focus exclusively on the BCC_C model, which yielded the best overall classification results.
SMOTE was applied in WEKA and in R with equivalent settings (K = 5, 100% minority oversampling) under stratified 10-fold cross-validation. In WEKA, the built-in SMOTE filter was used, while in R the implementation from the smotefamily package (function SMOTE, seed = 123) was employed. For the CCR_C model, SMOTE yielded relatively low accuracy rates (60.5% for J48, 55.3% for CART, and 59.2% for C5.0); therefore, detailed results are not reported, and the focus remains on the BCC_C model, where SMOTE produced higher accuracy rates than for the CCR_C model and led to clear improvements for J48 and CART. For the core BCC model (BCC_C), the SMOTE-based results are summarized in Table 24. Accuracy rates were 76.3% for J48, 72.3% for CART, and 75.0% for C5.0. The corresponding confusion matrices (Table 25) show that J48 achieves a relatively balanced classification between Efficient (36/46) and Inefficient (22/30) campaigns; CART is slightly lower yet still balanced (Efficient 34/46, Inefficient 21/30); while C5.0 emphasizes the Efficient class (40/46) at the expense of the Inefficient class (17/30), yielding higher Efficient recall but more false positives for Inefficient. Overall, under SMOTE, J48 produced the highest accuracy on BCC_C, followed closely by C5.0, which more aggressively detected Efficient cases, whereas CART lagged behind. These patterns indicate that, although SMOTE reduces class imbalance during training, residual asymmetry can persist. If needed, further rebalancing (e.g., alternative oversampling ratios) or threshold tuning may be considered to adjust the recall–specificity trade-off for the Inefficient class.
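The core of the SMOTE procedure used here (K = 5 neighbours, 100% minority oversampling) can be sketched in plain NumPy. This follows Chawla et al.'s original interpolation scheme; WEKA's filter and R's smotefamily differ in implementation details, so this is an illustration rather than a reproduction.

```python
import numpy as np

def smote(X_min, k=5, pct=100, rng=None):
    """Minimal SMOTE sketch: oversample the minority class.

    For each synthetic point, pick a minority sample, choose one of its
    k nearest minority neighbours, and interpolate at a random position
    on the segment between them. pct=100 doubles the minority class.
    """
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    n_new = n * pct // 100
    # Pairwise distances among minority samples, self excluded.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbours = np.argsort(d, axis=1)[:, :k]
    synthetic = []
    for idx in rng.choice(n, size=n_new, replace=True):
        nb = X_min[rng.choice(neighbours[idx])]
        synthetic.append(X_min[idx] + rng.random() * (nb - X_min[idx]))
    return np.vstack(synthetic)

# Demo on random points (stand-in for the 30 inefficient campaigns).
X_min = np.random.default_rng(0).normal(size=(10, 3))
synthetic_pts = smote(X_min, k=3, rng=1)
print(synthetic_pts.shape)  # (10, 3): minority class doubled at pct=100
```

Each synthetic point is a convex combination of two existing minority samples, so the oversampled class stays inside the original feature ranges rather than duplicating records outright.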
After applying SMOTE, the classification results of the three algorithms improved for the BCC_C model, summarized in Table 26. J48 achieved the highest overall accuracy at 76.3% with a Kappa statistic of 0.51, reflecting a moderate level of agreement beyond chance. For the Efficient class, J48 reached a True Positive Rate of 78.3%, a Precision of 81.8%, and an F-Measure (F-M) of 0.80, indicating balanced accuracy results in identifying efficient campaigns. CART followed with an overall accuracy of 72.3% and a Kappa of 0.432, producing slightly lower results for the Efficient class (TP Rate 73.9%, Precision 79.1%, F-M 0.764) but still comparable to J48. C5.0 produced an overall accuracy of 75.0% and a Kappa of 0.455, with particularly strong results for the Efficient class (TP Rate 86.9%, F-M 0.808) but relatively weaker results in classifying Inefficient campaigns (TP Rate 56.7%). This imbalance is also reflected in its lower ROC Area (0.684) compared with J48 (0.763) and CART (0.758). Overall, under SMOTE, J48 yielded the most balanced results, CART performed moderately, and C5.0 showed the highest sensitivity to Efficient campaigns but misclassified a considerable share of Inefficient ones.
In addition, error-based metrics provide further insights into the classifiers’ results. These measures indicate that J48 outperformed the other classifiers after SMOTE balancing, with the lowest Mean Absolute Error (MAE = 0.251) and Root Mean Squared Error (RMSE = 0.475), as well as the highest Matthews Correlation Coefficient (MCC = 0.511). C5.0 achieved moderate results (MAE = 0.297, RMSE = 0.482, MCC = 0.464), while CART performed slightly weaker (MAE = 0.299, RMSE = 0.478, MCC = 0.433). These findings suggest that J48 not only yielded higher accuracy but also provided more reliable and stable classification compared with C5.0 and CART. Thus, J48 was selected as the most suitable algorithm for subsequent interpretations.
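These summary statistics follow directly from J48's confusion matrix in Table 25 (36/46 Efficient and 22/30 Inefficient campaigns correct); the short calculation below reproduces the reported accuracy, Kappa, and MCC.

```python
import math

# J48 on BCC_C after SMOTE (Table 25): 36/46 Efficient and
# 22/30 Inefficient campaigns classified correctly.
tp, fn = 36, 10   # Efficient class
tn, fp = 22, 8    # Inefficient class
n = tp + fn + tn + fp

accuracy = (tp + tn) / n
# Cohen's kappa compares observed agreement with chance agreement.
p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
kappa = (accuracy - p_chance) / (1 - p_chance)
# Matthews correlation coefficient from the four cells.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

print(f"accuracy={accuracy:.3f}, kappa={kappa:.2f}, MCC={mcc:.3f}")
# Matches the reported 76.3% accuracy, Kappa 0.51, and MCC 0.511.
```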
As an illustrative example, the decision tree of the J48 algorithm with SMOTE applied to the BCC_C model is presented, since J48 achieved the most balanced classification results across classes (Accuracy = 76.3%, Kappa = 0.51, ROC Area = 0.763). As shown in Figure 3, which is based on the normalized dataset, A1(I1) appears as the most influential variable in separating efficient from inefficient campaigns, with campaigns below the threshold of 147.44 classified directly as efficient. High values of A2(I2) are considered to indicate inefficiency (43 campaigns), while A6(I6) and A4(I4) provide additional discrimination among campaigns. These findings suggest that both input-related thresholds (e.g., A1(I1) and A6(I6)) and contextual factors (e.g., A2(I2)) play a decisive role in determining campaign efficiency.
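In tools such as scikit-learn, the same kind of threshold rules can be printed programmatically. The sketch below uses synthetic stand-in data (not the campaign set or the paper's fitted J48 tree) purely to show the rule-extraction step.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in data with six attributes named A1..A6 like the paper's inputs.
X, y = make_classification(n_samples=76, n_features=6, n_informative=4,
                           n_redundant=2, random_state=1)
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=2,
                              random_state=1).fit(X, y)
rules = export_text(tree, feature_names=[f"A{i}" for i in range(1, 7)])
print(rules)  # threshold rules of the same form as those read off Figure 3
```

Printed rules of the form "A1 <= 147.44: Efficient" are what make tree classifiers directly usable by campaign planners without rerunning DEA.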
To situate the findings of this study within the broader literature, Table 27 compares the results of the DEA–ML-based prediction model in this study with those reported in previous studies discussed in Section 2.2. Because DMUs have different characteristics, the comparison is made based on prediction accuracy.
When the studies are examined in general (regardless of the number of DMUs or the prediction model used), the prediction accuracy rates of all but one study are higher than those of the prediction model in this study. The accuracy rates obtained for CART (75.5%, with an AUC of 0.746) and CIT (67.5%, with an AUC of 0.708), based on the efficiency scores of the VRS DEA model in a previous study [22], are slightly lower than the accuracy result (76.3%, AUC = 0.763) of the J48 prediction model based on the BCC_C core model in this study. The 72.3% accuracy (AUC = 0.758) obtained when CART was applied instead of J48 in this study is also slightly lower than the CART value reported in the same study [22].
When the accuracy results of the studies employing prediction models more similar to the model in this paper are compared, the following conclusions can be drawn. As mentioned previously, a study that used only the inputs of the DEA model as attributes in the decision tree classification algorithm reported an accuracy rate of 76.2% before balancing. After balancing, the accuracy of its prediction model increased to 88.9% with an F-M of 0.875 [19]. In comparison, the accuracy of the BCC_C-based prediction model in this study improved to 76.3% with an F-M of 0.764.
In terms of the number of DMUs, there is no study that is highly comparable to the present one. Two studies, one including 131 DMUs and another including 53 DMUs, can be considered relatively more similar to this study than the others. However, since the accuracy rates were not specified in the study with 131 DMUs, a direct comparison could not be made [20]. In a study where efficiency values were derived from the combined CRS and VRS DEA models and the number of DMUs was 53, the accuracy rate was 79.9% (F-M = 0.804), which increased to 81.6% (F-M = 0.818) when DECORATE was applied [73]. In this study, the accuracy rate of the BCC_C-based prediction model was 71.1% (F-M = 0.714) before balancing and 76.3% (F-M = 0.764) after balancing. As for the CCR_C core model, the accuracy rate was 76.3% (F-M = 0.746) before balancing and 60.5% (F-M = 0.615) after balancing. Both of these values are lower than the accuracy rates reported in most of the studies listed in Table 27.

5. Conclusions

Making campaigns efficient is crucial for businesses seeking to gain a competitive advantage, retain customers, and strengthen customer engagement. To achieve these goals, companies need to measure the efficiency of their campaigns and predict the efficiency of new ones. This study proposes an integrated methodology that combines efficiency measurement and efficiency prediction. For measurement, a DEA model was applied, while for prediction, an ML model—specifically a decision tree classification algorithm—was developed. Although the literature provides some insights into campaign efficiency, no study has specifically examined the efficiency of email campaigns targeting specific audiences in the airline industry. This study contributes to the literature by providing a comprehensive framework for evaluating campaign efficiency and aims to support decision-makers in making data-driven choices in a competitive marketing environment. The analysis is based on a core DEA model using 76 campaigns with six input variables and two output variables, and extends to additional models segmented by group size, seasonality, and route type. An output-oriented DEA approach was applied, and both CCR and BCC results were examined.
The DEA indicates that campaign efficiency is influenced by seasonality, route flexibility, and market-related factors. High-season campaigns and those offering more flexible travel options tended to perform better, while market size and market share emerged as important inputs across the models. Sensitivity analysis further showed that the relative importance of variables such as market share and travel period varied by season, underlining the value of season-specific planning. However, the limited number of campaigns in certain categories, particularly group campaigns, reduces the robustness of the comparisons. Future research with larger and more balanced datasets would allow these findings to be tested and generalized more confidently.
This study applies a DEA–ML framework in a setting where only input variables are available prior to campaign implementation, reflecting practical design conditions. The comparative analysis of J48, CART, and C5.0 demonstrated that J48 achieved the most stable results, while CART and C5.0 performed at lower but comparable levels. The BCC_C model produced more reliable predictions than the CCR_C model, indicating the suitability of variable-returns-to-scale assumptions for efficiency prediction; after balancing with SMOTE, J48 again yielded the most stable results on BCC_C, whereas CCR_C remained weak across all algorithms. The decision trees obtained can serve as practical tools for decision-makers, providing guidance for managerial actions based on campaign characteristics and assisting in predicting the efficiency of future campaigns.

Limitations and Future Research

The study’s limitations are discussed under two headings: the DEA stage and the prediction stage.
Regarding the DEA part, DEA results are sensitive to the chosen input–output specification. In this study, several campaign-specific factors were not included as inputs—namely, the number of seats allocated within the campaign, ticket price, and discount rate (promotion size)—as explained in Section 3.1.2. Their omission may affect measured efficiency. These factors can be incorporated as follows: (i) seat allocation can be added as an input once it is systematically recorded; (ii) when campaigns with different discount levels are available, the discount rate can be included as an input; and (iii) to make price usable, campaigns covering destinations with markedly different prices can be partitioned into price-homogeneous sub-campaigns and treated as separate DMUs, thereby allowing price to enter the model. Similarly, some potential outputs were deliberately excluded based on expert opinion. For example, unsubscribe rate was not included, as it is considered to be influenced by factors outside the campaign itself, and the analysis aimed to focus on variables under campaign control. Revenue was also excluded, because the wide variety of departure and arrival destinations across campaigns made revenue an inconsistent and potentially misleading measure; instead, tickets sold was selected as a comparable and consistently measurable output. In addition, loyalty program indicators (e.g., new membership counts) were not used, since none of the campaigns in the dataset were designed to increase loyalty program participation. Other email metrics such as bounce rate and spam complaint rate were also omitted, as they are thought to largely reflect external factors not directly attributable to campaign design.
As future work, if campaign-level data on ticket price, discount rate, seat allocation, revenue, or loyalty program variables are systematically collected, a revised DEA model including these variables can be estimated, and the resulting efficiency scores can then be used in the prediction stage.
Following the DEA specification, each email campaign was treated as a separate DMU with independently defined inputs; for example, the booking periods of two consecutive campaigns were 9 and 180 days, respectively. It is therefore reasonable to treat campaigns as separate DMUs and to apply the standard CCR/BCC DEA methodology. However, modeling campaigns as independent DMUs does not capture potential intertemporal interactions (e.g., carryover/spillover effects) between campaigns, which may bias the efficiency estimates. Further, the study covers only 76 ticket sales campaigns offering similar discounts, conducted by a specific airline company over a one-year period, and it is limited to campaigns that communicated via email, so the results cannot be generalized to other sales channels used in the airline industry.
Regarding the prediction part, the target variable of the prediction model was derived from the DEA model’s efficiency scores. Accordingly, any limitations of the DEA-based efficiency estimates propagate to the prediction results. Therefore, studies that mitigate the limitations of DEA models are expected to reduce the limitations of prediction models that rely on DEA efficiency scores.
In future research, alternative efficiency measurement methods—such as fuzzy DEA or SFA—may be applied to validate or enhance the current DEA findings. This study focuses solely on ticket sales campaigns; other types of campaigns, such as loyalty program campaigns or ancillary service campaigns, could be examined in future studies on campaign efficiency. In addition, the dataset could be expanded to cover a longer time period, allowing the study to be replicated and changes in results to be observed over the years.

Author Contributions

Conceptualization, G.I. and S.P.; methodology, G.I. and S.P.; software, G.I.; validation, G.I. and S.P.; formal analysis, G.I. and S.P.; investigation, G.I.; resources, G.I.; data curation, G.I.; writing—original draft preparation, G.I.; writing—review and editing, G.I. and S.P.; visualization, G.I.; supervision, S.P.; project administration, G.I. and S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are not available due to commercial confidentiality restrictions required by the data provider.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
DEA: Data Envelopment Analysis
CCR: Charnes, Cooper and Rhodes
BCC: Banker, Charnes and Cooper
CART: Classification and Regression Trees
CRM: Customer Relationship Management
SFA: Stochastic Frontier Analysis
ES: Efficiency Score
CTOR: Click-to-Open Rate
ML: Machine Learning
DMU: Decision-Making Unit
CTR: Click-Through Rate
CRS: Constant Returns to Scale
VRS: Variable Returns to Scale
WEKA: Waikato Environment for Knowledge Analysis
NN: Neural Network
CIT: Classification and Interaction Trees
SVM: Support Vector Machine
K-NN: K-Nearest Neighbors
RF: Random Forest
ANN: Artificial Neural Network
AUC: Area Under the Curve
AM: Additive Model
F-M: F-Measure
MAE: Mean Absolute Error
RMSE: Root Mean Squared Error
MCC: Matthews Correlation Coefficient

Appendix A

Table A1. Overview of DEA–Decision Tree Applications in the Literature.
Columns: Study; DEA model (DMUs, orientation, inputs/outputs, data period); Purpose; Type of decision tree (DT) classification algorithm; Attributes of the classification algorithms; Target variable and number of classes; Accuracy and other evaluation metrics of the main algorithm(s); Compared algorithms * and their evaluation metrics.

Study [20]. DEA model: 131 IT development projects; BCC, output oriented; 6 inputs, 3 outputs. Purpose: to predict the efficiency of new IT development projects. DT algorithm: C4.5. Attributes: 9 environmental factors for technology commercialization. Target: efficiency class, 2 classes, based on DEA efficiency scores (Efficient: ES = 1; Inefficient: ES < 1). Accuracy: not specified. Comparison: none.

Study [25]. DEA model: 444 bank branches in Ghana; CCR (2-stage DEA); 4 inputs, 2 outputs. Purpose: to predict the efficiencies of bank branches. DT algorithm: C5.0. Attributes: inputs and outputs of the DEA model. Target: efficiency class, 2 classes, based on DEA efficiency scores (Efficient: ES ≥ 0.8; Inefficient: ES < 0.8). Accuracy: 100, Kappa = 1. Comparison: RF (98.5, Kappa = 0.95); NN (86, Kappa = −0.014).

Study [19]. DEA model: 21 health centers in Jordan; 3 inputs, 2 outputs. Purpose: to predict the efficiencies of new health centers. DT algorithm: J48 (with WEKA). Target: efficiency class, 2 classes, based on DEA efficiency scores (Efficient: ES = 1; Inefficient: ES < 1). Attributes (a): only the inputs of the DEA model. Accuracy (a): imbalanced 76.19, F-M = 0.747; balanced 88.09, F-M = 0.875. Comparison (a): imbalanced NB (52.38), SVM (52.38); balanced NB (86.9, F-M = 0.87), SVM (87.47, F-M = 0.835). Attributes (b): both inputs and outputs of the DEA model. Accuracy (b): imbalanced 71.42, F-M = 0.714; balanced 89.29, F-M = 0.888. Comparison (b): imbalanced NB (52.38), SVM (61.9); balanced NB (69.04, F-M = 0.692), SVM (91.66, F-M = 0.893).

Study [26]. DEA model: 23 suppliers of a firm; CCR and BCC; data taken from an earlier paper, originally 6 inputs and 5 outputs, with 2 new factors added in this study. Purpose: to predict the efficiencies of potential suppliers and support selection. DT algorithm: C4.5 (with WEKA). Attributes: inputs and outputs of the DEA model. Target: efficiency class, 2 classes, based on DEA efficiency scores. Accuracy: CCR 90.91; BCC 81.82. Comparison: NN (CCR 72.3; BCC 100).

Study [29]. DEA model: 15 manufacturing firms of automotive parts in Iran; MPI-based DEA, output oriented; 3 inputs, 2 outputs; data from 2013–2016. Purpose: to assess the green supply chain performance of auto parts manufacturers with higher accuracy and determine the rules behind high performance. DT algorithm: J48 (with WEKA). Attributes: inputs and outputs of the DEA model. Target: efficiency class, 2 classes, based on MPI-based DEA scores (Inefficient: MPI < 1; Efficient: MPI > 1). Accuracy: not specified. Comparison: none.

Study [73]. DEA model: 53 industrial companies listed on the stock exchange in Amman; input and output orientation for both VRS and CRS models; 11 inputs, 11 outputs; data from 2012–2015. Purpose: predicting the performance of companies. DT algorithm: J48 (with WEKA). Attributes: inputs and outputs of the DEA model; WEKA's variable importance ranker was applied to determine the final attributes for J48. Target: efficiency class, 2 classes, based on aggregated DEA efficiency scores (Efficient: ES = 1; Inefficient: ES < 1). Accuracy: 79.9, F-M = 0.804; 81.6 (F-M = 0.818) after application of DECORATE. Comparison: none.

Study [24]. DEA model: 22 cement producers in Iran; 2-stage MPI-based CCR, BCC, and AM; 2 inputs, 4 outputs for the single stage; data from 2015–2019. Purpose: to develop a method to predict and analyze the eco-efficiency values of cement companies and the factors affecting them. DT algorithm: not specified (with WEKA, RapidMiner, and Tanagra). Attributes: inputs and outputs of the single-stage DEA models. Target: efficiency class, 2 classes, based on single-stage MPI-based DEA scores (Inefficient: MPI < 1; Efficient: MPI > 1). Accuracy: differs across WEKA, RapidMiner, and Tanagra; the highest, 91.23, was obtained with WEKA. Comparison: with WEKA, K-NN (89.51) and NB (79.25); with the other software, K-NN and NB performed better.

Study [28]. DEA model: 200 bank branches in Iran; CCR, input oriented; 3 inputs, 5 outputs. Purpose: to predict the efficiency of a new branch without running the DEA model. DT algorithm: C4.5. Attributes: inputs and outputs of the DEA model. Target: efficiency class, 10 classes with equal intervals, based on DEA efficiency scores. Accuracy: 86.5. Comparison: none.

Study [27]. DEA model: energy assessment projects offered to 7548 medium- and small-scale manufacturing companies in the USA; projects from 1981–2020; CCR, BCC, CCR-Tier, and BCC-Tier models; 6 inputs, 1 output. Purpose: to predict the efficiencies of energy assessment projects. DT algorithm: not specified. Attributes: inputs and outputs of the DEA model. Target: efficiency class with 2, 3, 5, or 10 classes, based on DEA efficiency scores. Accuracy: decreases as the number of classes increases; conventional DEA modeling provides more accurate predictions than DEA-Tier modeling; for conventional models, CCR (2 classes): 98.24, CCR (5): 86.29, CCR (10): 69.94, BCC (2): 98.85, BCC (5): 91.55, BCC (10): 82.62. Comparison: SVM, K-NN, linear discriminant analysis, and RF; overall, RF and SVM performed best; for 2 classes, DT ranked second after RF.

Study [21]. DEA model: 18 insurance branches of an insurance company in Iran; MPI-based DEA (MPI-based latent variable VRS model); 3 inputs, 3 outputs; data from 2008–2010. Purpose: to explore the rules behind productivity using the CART algorithm with 8 internal and external factors. DT algorithm: CART. Attributes: 8 internal and external factors. Target: efficiency class, 3 classes, based on MPI-based DEA scores (Inefficient: MPI < 1; Constant: MPI = 1; Efficient: MPI > 1). Accuracy: 98.02 for period I; 100 for period II (with bootstrap). Comparison: none.

Study [22]. DEA model: 151 banks in MENA countries; VRS, input and output oriented; 5 inputs, 4 outputs; data from 2008–2010. Purpose: to assess the impact of environmental factors on banking performance and predict banking performance based on environmental factors. DT algorithms: CART and CIT. Attributes: 15 environmental factors. Target: efficiency class, 2 classes, based on DEA efficiency scores (Efficient: ES = 1; Inefficient: ES < 1). Accuracy: CART 75.50 (AUC = 0.7466); CIT 67.55 (AUC = 0.7077). Comparison: RF-CART 82.78 (AUC = 0.9293); RF-CIT 75.50 (AUC = 0.8516); ANN 68.21 (AUC = 0.6951); Bagging 84.11 (AUC = 0.9221); bootstrap applied to all except ANN.

Study [23]. DEA model: 36 banks in Gulf Cooperation Council countries; VRS, output oriented; 3 inputs, 2 outputs; use of bootstrap. Purpose: to predict bank efficiency based on internal and external factors and discover the reasons behind inefficiencies. DT algorithm: CART. Attributes: 12 internal and external factors. Target: efficiency class, 2 classes, based on DEA efficiency scores (Efficient: ES = 1; Inefficient: ES < 1). Accuracy: 97.93. Comparison: none.

Study [74]. DEA model: 200 service units of a hypothetical firm; pure-outputs DEA model; 11 outputs. Purpose: to find inefficient service units and inefficient processes in inefficient service units. DT algorithm: CART. Attributes: efficiency scores of the processes in the service units. Target: efficiency class, 2 classes, based on DEA efficiency scores (Efficient: ES = 1; Inefficient: ES < 1). Accuracy: not specified. Comparison: none.

* Non-decision-tree algorithms.

References and Note

1. Shafiee Roodposhti, M.; Behrang, K.; Kamali, H.; Rezadoost, B. An Evaluation of the Advertising Media Function Using DEA and DEMATEL. J. Promot. Manag. 2022, 28, 923–943.
2. Luo, X.; Donthu, N. Benchmarking Advertising Efficiency. J. Advert. Res. 2001, 41, 7–18.
3. Luo, X.; Donthu, N. Assessing advertising media spending inefficiencies in generating sales. J. Bus. Res. 2005, 58, 28–36.
4. Guido, G.; Prete, M.I.; Miraglia, S.; De Mare, I. Targeting direct marketing campaigns by neural networks. J. Mark. Manag. 2011, 27, 992–1006.
5. Hudák, M.; Kianičková, E.; Madleňák, R. The importance of e-mail marketing in e-commerce. Procedia Eng. 2017, 192, 342–347.
6. Litmus Software. The 2023 State of E-Mail Report; Litmus: Boston, MA, USA, 2023.
7. Păvăloaia, V.-D.; Anastasiei, I.-D.; Fotache, D. Social Media and E-mail Marketing Campaigns: Symmetry versus Convergence. Symmetry 2020, 12, 1940.
8. Qabbaah, H.; Sammour, G.; Vanhoof, K. Decision Tree Analysis to Improve e-mail Marketing Campaigns. Int. J. Inf. Theor. Appl. 2019, 26, 3–36.
9. Ayanso, A.; Mokaya, B. Efficiency Evaluation in Search Advertising. Decis. Sci. 2013, 44, 877–913.
10. Farvaque, E.; Foucault, M.; Vigeant, S. The politician and the vote factory: Candidates’ resource management skills and electoral returns. J. Policy Model. 2020, 42, 38–55.
11. Sexton, T.R.; Lewis, H.F. Measuring efficiency in the presence of head-to-head competition. J. Product. Anal. 2012, 38, 183–197.
12. Götz, G.; Herold, D.; Klotz, P.-A.; Schäfer, J.T. Efficiency in COVID-19 Vaccination Campaigns—A Comparison across Germany’s Federal States. Vaccines 2021, 9, 788.
13. Lohtia, R.; Donthu, N.; Yaveroglu, I. Evaluating the efficiency of Internet banner advertisements. J. Bus. Res. 2007, 60, 365–370.
14. Lo, Y.C.; Fang, C.-Y. Facebook marketing campaign benchmarking for a franchised hotel. Int. J. Contemp. Hosp. Manag. 2018, 30, 1705–1723.
15. Cordero-Gutiérrez, R.; Lahuerta-Otero, E. Social media advertising efficiency on higher education programs. Span. J. Mark.—ESIC 2020, 24, 247–262.
16. Kongar, E.; Adebayo, O. Impact of Social Media Marketing on Business Performance: A Hybrid Performance Measurement Approach Using Data Analytics and Machine Learning. IEEE Eng. Manag. Rev. 2021, 49, 133–147.
17. Hamelin, N.; Al-Shihabi, S.; Quach, S.; Thaichon, P. Forecasting Advertisement Effectiveness: Neuroscience and Data Envelopment Analysis. Australas. Mark. J. 2022, 30, 313–330.
18. Ejlal, A.; Roodposhti, M.S. Providing a framework for evaluating the advertising efficiency using data envelopment analysis technique. Middle East J. Manag. 2019, 6, 451.
19. Najadat, H.; Najadat, H.; Althebyan, Q.; Khamaiseh, A.; Al-Saad, M.; Rawashdeh, A.A. The Society of Digital Information and Wireless Communication Efficiency Analysis of Health Care Centers Using Data Envelopment Analysis. Int. J. E-Learn. Educ. Technol. Digit. Media 2018, 4, 34–38.
20. Sohn, S.Y.; Moon, T.H. Decision Tree based on data envelopment analysis for effective technology commercialization. Expert Syst. Appl. 2004, 26, 279–284.
21. Alinezhad, A. An Integrated DEA and Data Mining Approach for Performance Assessment. Iran. J. Optim. 2016, 8, 968–987.
22. Anouze, A.L.M.; Bou-Hamad, I. Data envelopment analysis and data mining to efficiency estimation and evaluation. Int. J. Islam. Middle East. Financ. Manag. 2019, 12, 169–190.
23. Emrouznejad, A.; Anouze, A.L. Data envelopment analysis with classification and regression tree—A case of banking efficiency. Expert Syst. 2010, 27, 231–246.
24. Mirmozaffari, M.; Shadkam, E.; Khalili, S.M.; Kabirifar, K.; Yazdani, R.; Asgari Gashteroodkhani, T. A novel artificial intelligent approach: Comparison of machine learning tools and algorithms based on optimization DEA Malmquist productivity index for eco-efficiency evaluation. Int. J. Energy Sect. Manag. 2021, 15, 523–550.
25. Appiahene, P.; Missah, Y.M.; Najim, U. Predicting Bank Operational Efficiency Using Machine Learning Algorithm: Comparative Study of Decision Tree, Random Forest, and Neural Networks. Adv. Fuzzy Syst. 2020, 2020, 8581202.
26. Wu, D. Supplier selection: A hybrid model using DEA, decision tree and neural network. Expert Syst. Appl. 2009, 36, 9105–9112.
27. Perroni, M.G.; Veiga, C.P.D.; Forteski, E.; Marconatto, D.A.B.; Da Silva, W.V.; Senff, C.O.; Su, Z. Integrating Relative Efficiency Models with Machine Learning Algorithms for Performance Prediction. Sage Open 2024, 14, 21582440241257800.
28. Dalvand, B.; Jahanshahloo, G.; Lotfi, F.H.; Rostami, M. Using C4.5 Algorithm for Predicting Efficiency Score of DMUs in DEA. Adv. Environ. Biol. 2014, 8, 473–477.
29. Khalili, J.; Alinezhad, A. Performance Evaluation in Green Supply Chain Using BSC, DEA and Data Mining. Int. J. Supply Oper. Manag. 2018, 5, 182–191.
30. Patel, B.R.; Rana, K.K. A Survey on Decision Tree Algorithm For Classification. Int. J. Eng. Dev. Res. 2014, 2, 1–5.
31. Song, Y.; Lu, Y. Decision tree methods: Applications for classification and prediction. Shanghai Arch. Psychiatry 2015, 27, 130–135.
32. Isbilen-Yucel, L. Veri Zarflama Analizi, 1st ed.; Der Yayinlari: Istanbul, Türkiye, 2017; ISBN 978-975-353-484-0.
33. Heiets, I.; Ng, S.; Singh, N.; Farrell, J.; Kumar, A. Social media activities of airlines: What makes them successful? J. Air Transp. Res. Soc. 2024, 2, 100017.
34. Murphy, D. Increasing clicks through advanced targeting: Applying the third-party seal model to airline advertising. J. Tour. Herit. Serv. Mark. 2019, 5, 24–30.
35. Sakas, D.P.; Reklitis, D.P. The Impact of Organic Traffic of Crowdsourcing Platforms on Airlines’ Website Traffic and User Engagement. Sustainability 2021, 13, 8850.
36. Vlassi, E.; Papatheodorou, A. Towards a Method to Assess the Role of Online Marketing Campaigns in the Airline–Airport–Destination Authority Triangular Business Relationship: The Case of Athens Tourism Partnership. In Air Transport and Regional Development Policies; Routledge: London, UK, 2020; pp. 227–239.
37. Vlassi, E.; Papatheodorou, A.; Karachalis, N. Evaluating the Effectiveness of Online Destination Marketing Campaigns from a Sustainability and Resilience Viewpoint: The Case of “This Is Athens & Partners” in Greece. Sustainability 2024, 16, 7649.
38. Lorente-Páramo, Á.-J.; Hernández-García, Á.; Chaparro-Peláez, J. Modelling e-mail marketing effectiveness—An approach based on the theory of hierarchy-of-effects. Manag. Lett. 2021, 21, 19–27.
39. Bonfrer, A.; Dréze, X. Real-Time Evaluation of E-mail Campaign Performance. Mark. Sci. 2009, 28, 251–263.
40. Smart, K.L.; Cappel, J. Assessing the Response to and Success of Email Marketing Promotions. Issues Inf. Syst. 2003, 4, 309–315.
41. Sahni, N.S.; Wheeler, S.C.; Chintagunta, P. Personalization in Email Marketing: The Role of Non-Informative Advertising Content. Mark. Sci. 2018, 37, 177–331.
42. Skačkauskienė, I.; Nekrošienė, J.; Szarucki, M. A Review on Marketing Activities Effectiveness Evaluation Metrics. In Proceedings of the 13th International Scientific Conference “Business and Management 2023”, Vilnius, Lithuania, 11–13 May 2023.
43. Hartemo, M. Conversions on the rise—Modernizing e-mail marketing practices by utilizing volunteered data. J. Res. Interact. Mark. 2022, 16, 585–600.
44. Wiesel, T.; Pauwels, K.; Arts, J. Marketing’s Profit Impact: Quantifying Online and Off-line Funnel Progression. Mark. Sci. 2011, 30, 604–611.
45. Sarkis, J. Preparing Your Data for DEA. In Modeling Data Irregularities and Structural Complexities in Data Envelopment Analysis; Zhu, J., Cook, W.D., Eds.; Springer US: Boston, MA, USA, 2007; pp. 305–320. ISBN 978-0-387-71606-0.
46. Boussofiane, A.; Dyson, R.G.; Thanassoulis, E. Applied data envelopment analysis. Eur. J. Oper. Res. 1991, 52, 1–15.
47. Golany, B.; Roll, Y. An application procedure for DEA. Omega 1989, 17, 237–250.
48. Friedman, L.; Sinuany-Stern, Z. Combining ranking scales and selecting variables in the DEA context: The case of industrial branches. Comput. Oper. Res. 1998, 25, 781–791.
49. Cooper, W.W.; Seiford, L.M.; Tone, K. (Eds.) Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software, 2nd ed.; Springer Science & Business Media, LLC: Boston, MA, USA, 2007; ISBN 978-0-387-45281-4.
50. Dyson, R.G.; Allen, R.; Camanho, A.S.; Podinovski, V.V.; Sarrico, C.S.; Shale, E.A. Pitfalls and protocols in DEA. Eur. J. Oper. Res. 2001, 132, 245–259.
51. Tone, K.; Tsutsui, M. Dynamic DEA: A slacks-based measure approach. Omega 2010, 38, 145–156.
52. Ray, S.C. Data Envelopment Analysis: An Overview; University of Connecticut: Mansfield, CT, USA, 2014.
53. Cook, W.D.; Tone, K.; Zhu, J. Data envelopment analysis: Prior to choosing a model. Omega 2014, 44, 1–4.
54. Barros, C.P.; Athanassiou, M. Efficiency in European Seaports with DEA: Evidence from Greece and Portugal. Marit. Econ. Logist. 2004, 6, 122–140.
55. Cooper, W.W.; Seiford, L.M.; Tone, K. Introduction to Data Envelopment Analysis and Its Uses: With DEA-Solver Software and References; Springer US: Boston, MA, USA, 2006; ISBN 978-0-387-28580-1.
56. Cooper, W.W.; Seiford, L.M.; Tone, K. Data Envelopment Analysis, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2007; ISBN 0-387-45283-4.
57. Kutlar, A.; Bakirci, F. Veri Zarflama Analizi Teori ve Uygulama; Orion: Ankara, Türkiye, 2018; ISBN 978-605-9524-22-3.
58. Yang, L.; Ouyang, H.; Fang, K.; Ye, L.; Zhang, J. Evaluation of regional environmental efficiencies in China based on super-efficiency-DEA. Ecol. Indic. 2015, 51, 13–19.
59. Andersen, P.; Petersen, N.C. A Procedure for Ranking Efficient Units in Data Envelopment Analysis. Manag. Sci. 1993, 39, 1261–1264.
60. Frontier Analyst, version 4.0; Commercial computer software; Banxia Software Ltd.: Kendal, UK, 2025.
61. Tanza, A.; Utari, D.T. Comparison of the Naïve Bayes Classifier and Decision Tree J48 for Credit Classification of Bank Customers. EKSAKTA J. Sci. Data Anal. 2022, 3, 70–77.
62. Quinlan, J.R. Improved Use of Continuous Attributes in C4.5. J. Artif. Intell. Res. 1996, 4, 77–90.
63. Mahboob, T.; Irfan, S.; Karamat, A. A machine learning approach for student assessment in E-learning using Quinlan’s C4.5, Naive Bayes and Random Forest algorithms. In Proceedings of the 2016 19th International Multi-Topic Conference (INMIC), Islamabad, Pakistan, 5–6 December 2016; pp. 1–8.
64. Bujlow, T.; Riaz, T.; Pedersen, J.M. A method for classification of network traffic based on C5.0 Machine Learning Algorithm. In Proceedings of the 2012 International Conference on Computing, Networking and Communications (ICNC), Maui, HI, USA, 30 January–2 February 2012; pp. 237–241.
65. Montazeri, M.; Montazeri, M.; Beygzadeh, A.; Javad Zahedi, M. Identifying efficient features in diagnose of liver disease by decision tree models. HealthMED 2014, 8, 1115–1124.
66. Kalmegh, S. Analysis of WEKA Data Mining Algorithm Reptree, Simple Cart and RandomTree for Classification of Indian News. Int. J. Innov. Sci. Eng. Technol. 2015, 2, 438–446.
67. Weka Homepage. Available online: https://ml.cms.waikato.ac.nz/weka/ (accessed on 1 May 2025).
68. Banker, R.D.; Morey, R.C. The Use of Categorical Variables in Data Envelopment Analysis. Manag. Sci. 1986, 32, 1613–1627.
69. Banker, R.D.; Morey, R.C. Efficiency Analysis for Exogenously Fixed Inputs and Outputs. Oper. Res. 1986, 34, 513–521.
70. Chen, K.; Zhu, J. Scale efficiency in two-stage network DEA. J. Oper. Res. Soc. 2019, 70, 101–110.
71. Elliott, A.C.; Hynan, L.S. A SAS® macro implementation of a multiple comparison post hoc test for a Kruskal–Wallis analysis. Comput. Methods Programs Biomed. 2011, 102, 75–80.
72. Dogan, N.; Dogan, I. Determination of the number of bins/classes used in histograms and frequency tables: A short bibliography. J. Stat. Res. 2010, 7, 77–86.
73. Najadat, H.; Al-Daher, I.; Alkhatib, K. Performance Evaluation of Industrial Firms Using DEA and DECORATE Ensemble Method. Int. Arab J. Inf. Technol. 2020, 17, 750–757.
74. Seol, H.; Choi, J.; Park, G.; Park, Y. A framework for benchmarking service process using data envelopment analysis and decision tree. Expert Syst. Appl. 2007, 32, 432–440.
Figure 1. General Methodological Framework.
Figure 2. A summary of the methodology.
Figure 3. J48 decision tree for the BCC_C model with SMOTE. “A” denotes “Attribute” in the decision tree, corresponding to DEA inputs (e.g., A1(I1)).
Table 1. Overview of DEA Applications in Marketing Campaign Efficiency Studies.
| Study | DMU(s)/Unit of Analysis | Orientation and DEA Model | Input(s) | Output(s) |
| [13] | 37 internet banner ads, multiple firms | orientation not specified (implicitly output-oriented), DEA model not specified | color level; presence of emotion; presence of incentive; presence of interactivity; presence of animation; message length | CTR; attitude toward the ad; recall |
| [9] | 200 online retailers in search advertising, multiple firms | output-oriented, BCC | number of paid keywords; number of organic keywords; keyword length; cost per click (CPC); cost per day; number of ad copies | online sales; impressions; CTR; conversion rate; ad-rank percentile for sponsored links; organic ranking |
| [14] | 60 Facebook marketing campaigns, single firm | input-oriented, CCR, BCC | text length; number of pictures; number of colors | people reached; reactions, comments and shares; post clicks |
| [15] | 45 Facebook ads, single institution | not specified | ad duration (in days); amount spent | reach; impressions; clicks; reactions; post engagement |
| [16] | 43 U.S. furniture retailers, multiple firms | CCR with benevolent cross-efficiency ranking, orientation not specified (implicitly output-oriented) | number of employees; total assets; annual sales; number of employees; total assets; tweets | annual sales; likes; followers; friends; list count |
| [2] | 23 outdoor billboard campaigns, multiple firms | not specified | number of large words; number of concepts; color vs. black-and-white level of graphics | consumer recall; expert-rated ad quality |
| [17] | 14 real-estate print ads, single firm | output-oriented, BCC | no inputs used | joy; engagement; positive attention |
| [18] | 15 Iranian food brands, multiple firms | input-oriented, BCC | advertising budget; campaign duration | sales; brand familiarity; attractiveness of implementation |
Table 2. Summary of Indicators Used in Airline Marketing Campaign Studies.
| Study | Campaign Evaluation Indicators |
| [33] | followers, post frequency, total interactions (comments, likes), content type (advertising, informational, customer care, interactive, entertainment, promotional), platform efficiencies (Facebook, Instagram, X, LinkedIn, TikTok) |
| [34] | impressions, reach, clicks, CTR, cost per click (CPC) |
| [35] | organic traffic, paid keywords, paid traffic cost, average visit duration, unique visitors, user engagement, global rank |
| [36] | destination awareness, emotional proximity, intention to visit, incremental spending, ROI |
| [37] | destination awareness, campaign awareness, campaign engagement, intention to visit, conversion, average cost per visitor, ROI, perception, emotional proximity, additional spending |
Table 3. Summary of Evaluation Indicators Used in Email Marketing Studies.
| Study | Evaluation Indicators |
| [38] | open rate (Attention), CTR (Interest), unsubscribe rate (Retention), conversion (Action), (Desire excluded) |
| [39] | open rate, CTOR, CTR, emails sent, opens per campaign, clicks per campaign, time-to-first-open, time-to-first-click, doubling time |
| [40] | CTR, conversion rate |
| [41] | open rate, sales leads, unsubscribe rate |
| [42] | delivery metrics: delivery rate, bounce rate, spam complaint rate; open metrics: open rate, unique open rate; click metrics: CTR, unique click rate; conversion metrics: conversion rate, revenue per email, ROI; engagement metrics: forward rate, sharing rate, reply rate; list growth metrics: new subscribers, list growth rate, unsubscribe rate |
Table 4. Input variables used in DEA models.
| Variable Code | Variable Name | Variable Definition |
| I1 | booking period | time period during which tickets can be purchased as part of the campaign |
| I2 | travel period | time period during which travel is possible as part of the campaign |
| I3 | number of emails sent | the number of campaign emails sent to customers |
| I4 | number of flights taken | average number of flights taken by customers to whom campaign emails were sent in the 18 months prior to the campaign |
| I5 | market share | the market share of the company within the routes and travel periods targeted by each campaign |
| I6 | market size | total customers carried by all airlines on the campaign’s relevant routes and dates |
Table 5. Output variables used in DEA models.
| Variable Code | Variable Name | Variable Definition |
| O1 | CTOR (click-to-open rate) | the ratio of recipients who clicked on a link after opening the campaign email |
| O2 | tickets sold | the total number of tickets sold during the campaign period |
Table 6. Summary of DEA models.
| Model No. | Categorical Variable | Segment | Inputs | Outputs |
| CCR_C * | — | — | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_G | G (group size) | all | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_G_I | G (group size) | I (individual) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_G_G | G (group size) | G (group) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_S | S (seasonality) | all | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_S_L | S (seasonality) | L (low) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_S_H | S (seasonality) | H (high) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_R | R (route type) | all | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_R_O | R (route type) | O (one-way) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_R_R | R (route type) | R (round-trip) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| CCR_R_B | R (route type) | B (both) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_C * | — | — | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_G | G (group size) | all | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_G_I | G (group size) | I (individual) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_G_G | G (group size) | G (group) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_S | S (seasonality) | all | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_S_L | S (seasonality) | L (low) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_S_H | S (seasonality) | H (high) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_R | R (route type) | all | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_R_O | R (route type) | O (one-way) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_R_R | R (route type) | R (round-trip) | I1, I2, I3, I4, I5, I6 | O1, O2 |
| BCC_R_B | R (route type) | B (both) | I1, I2, I3, I4, I5, I6 | O1, O2 |
* “_C” represents the core model without categorical segmentation.
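As an illustrative sketch only (the study itself used the Frontier Analyst software, not this code), the output-oriented CCR envelopment model behind each score in the tables above can be solved as one linear program per campaign. The toy data below are hypothetical; `scipy.optimize.linprog` maximizes the output-expansion factor η, and the reported score is 100/η (100 = efficient).

```python
import numpy as np
from scipy.optimize import linprog

def ccr_output_oriented(X, Y, j0):
    """Envelopment form of the output-oriented CCR model for DMU j0.

    X: (n_dmus, n_inputs), Y: (n_dmus, n_outputs).
    Returns eta >= 1; the efficiency score is 100 / eta (100 = efficient).
    """
    n, m = X.shape
    s = Y.shape[1]
    # Decision variables: [eta, lambda_1 .. lambda_n]; maximize eta.
    c = np.zeros(n + 1)
    c[0] = -1.0  # linprog minimizes, so minimize -eta
    A_ub, b_ub = [], []
    for i in range(m):  # reference inputs must not exceed DMU j0's inputs
        A_ub.append(np.concatenate(([0.0], X[:, i])))
        b_ub.append(X[j0, i])
    for r in range(s):  # eta * y_r,j0 <= reference outputs
        A_ub.append(np.concatenate(([Y[j0, r]], -Y[:, r])))
        b_ub.append(0.0)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, method="highs")
    return res.x[0]

# Toy data (hypothetical): 3 campaigns, 1 input, 1 output.
X = np.array([[1.0], [2.0], [3.0]])
Y = np.array([[1.0], [4.0], [3.0]])
scores = [100.0 / ccr_output_oriented(X, Y, j) for j in range(3)]
print([round(s, 1) for s in scores])  # → [50.0, 100.0, 50.0]
```

The BCC (VRS) variant adds the convexity constraint Σλⱼ = 1 as an equality row; everything else is unchanged.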
Table 7. Efficiency scores of models.
| Model No. | Efficient Campaigns (Score = 100) | Inefficient Campaigns (Score < 100) | Mean | Median | Std | Min |
| CCR_C | 26 | 50 | 69.5 | 74.6 | 28.6 | 12.7 |
| CCR_G_I | 22 | 45 | 70.2 | 75.6 | 27.6 | 12.7 |
| CCR_G_G | 9 | 0 | 100 | 100 | 0 | 100 |
| CCR_S_L | 20 | 35 | 76.5 | 82.9 | 23.9 | 19.7 |
| CCR_S_H | 16 | 5 | 94 | 100 | 16.6 | 34.2 |
| CCR_R_O | 10 | 1 | 98.8 | 100 | 4 | 86.9 |
| CCR_R_R | 15 | 33 | 71.3 | 77.1 | 27.2 | 12.8 |
| CCR_R_B | 11 | 6 | 84.5 | 100 | 27.8 | 28.2 |
| BCC_C | 46 | 30 | 80.8 | 100 | 27.7 | 13.3 |
| BCC_G_I | 41 | 26 | 81.4 | 100 | 26.9 | 13.3 |
| BCC_G_G | 9 | 0 | 100 | 100 | 0 | 100 |
| BCC_S_L | 41 | 14 | 90.4 | 100 | 20.2 | 20.7 |
| BCC_S_H | 20 | 1 | 98.6 | 100 | 6.6 | 69.6 |
| BCC_R_O | 11 | 0 | 100 | 100 | 0 | 100 |
| BCC_R_R | 31 | 17 | 84 | 100 | 25.9 | 13.3 |
| BCC_R_B | 14 | 3 | 87.8 | 100 | 27.2 | 30.4 |
Table 8. Scale efficiency (SE) of models.
| Model Pair | Categorical Variable | Mean | Median | Std | Min |
| CCR_C—BCC_C | — | 0.87 | 0.97 | 0.19 | 0.32 |
| CCR_G—BCC_G | G | 1 | 1 | 0 | 1 |
| CCR_G—BCC_G | I | 0.87 | 0.96 | 0.17 | 0.38 |
| CCR_S—BCC_S | L | 0.85 | 0.94 | 0.18 | 0.36 |
| CCR_S—BCC_S | H | 0.95 | 1 | 0.15 | 0.34 |
| CCR_R—BCC_R | R | 0.86 | 0.96 | 0.19 | 0.31 |
| CCR_R—BCC_R | O | 0.99 | 1 | 0.04 | 0.87 |
| CCR_R—BCC_R | B | 0.96 | 1 | 0.1 | 0.61 |
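Scale efficiency is simply the ratio of a campaign's CCR (constant-returns) score to its BCC (variable-returns) score, computed per campaign. A minimal sketch, using the CCR_C and BCC_C mean scores from Table 7 purely as hypothetical example values (in the study SE is taken per campaign, not from means):

```python
def scale_efficiency(ccr_score, bcc_score):
    """SE = CCR (CRS) efficiency divided by BCC (VRS) efficiency.

    SE = 1 means the campaign operates at its most productive scale size;
    SE < 1 means part of its inefficiency is attributable to scale.
    """
    return ccr_score / bcc_score

# Hypothetical example values (Table 7 model means, for illustration only).
print(round(scale_efficiency(69.5, 80.8), 2))  # → 0.86
```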
Table 9. Kruskal–Wallis test results for CCR and BCC model groups.
| Models | Chi-Square (χ²) | Degrees of Freedom (df) | p-Value |
| CCR | 41.12 | 7 | 1 × 10⁻⁶ |
| BCC | 22.68 | 7 | 1.934 × 10⁻³ |
Significance threshold: p < 0.05.
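The Table 9 comparison can be reproduced with SciPy's `kruskal`, which takes one score vector per model; the three small samples below are hypothetical stand-ins for the per-model efficiency-score vectors (the study compares all eight CCR, and all eight BCC, variants at once):

```python
from scipy.stats import kruskal

# Hypothetical efficiency-score samples for three DEA model variants.
ccr_c   = [12.7, 45.0, 69.5, 74.6, 88.0, 100.0]
ccr_g_g = [100.0, 100.0, 100.0, 100.0]
ccr_s_h = [34.2, 94.0, 100.0, 100.0, 100.0]

# H statistic and p-value; kruskal handles unequal sample sizes and ties.
h, p = kruskal(ccr_c, ccr_g_g, ccr_s_h)
print(f"H = {h:.2f}, p = {p:.4f}")
if p < 0.05:  # same threshold as Table 9
    print("At least one model's score distribution differs")
```

A significant result only says that some pair of models differs, which is why Table 10 follows up with Dunn's pairwise test under a Bonferroni correction.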
Table 10. Z-Statistics and adjusted p-values for significantly different model pairs based on Dunn’s test with Bonferroni correction (p < 0.05).
| Model Comparison | Z-Statistic | Adjusted p-Value (Bonferroni) |
| CCR_G_G—CCR_C | 3.56 | 0.010 |
| CCR_S_H—CCR_C | 3.78 | 0.004 |
| CCR_R_O—CCR_C | 3.61 | 0.008 |
| CCR_G_G—CCR_G_I | 3.50 | 0.013 |
| CCR_S_H—CCR_G_I | 3.69 | 0.006 |
| CCR_R_O—CCR_G_I | 3.55 | 0.011 |
| CCR_G_G—CCR_R_R | 3.38 | 0.020 |
| CCR_S_H—CCR_R_R | 3.47 | 0.015 |
| CCR_R_O—CCR_R_R | 3.41 | 0.018 |
Significance threshold: p < 0.05.
Table 11. Pairwise rank-biserial correlations (r) between CCR models.
| Group 1 | Group 2 | Rank-Biserial r |
| CCR_C | CCR_G_G | 0.65 |
| CCR_C | CCR_S_H | 0.49 |
| CCR_C | CCR_R_O | 0.62 |
| CCR_G_G | CCR_G_I | −0.67 |
| CCR_G_I | CCR_S_H | 0.51 |
| CCR_G_G | CCR_R_R | −0.69 |
| CCR_G_I | CCR_R_O | 0.64 |
| CCR_R_R | CCR_S_H | 0.52 |
| CCR_R_R | CCR_R_O | 0.64 |
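The rank-biserial correlation in Table 11 is an effect size derived from the Mann–Whitney U statistic as r = 1 − 2U/(n₁n₂). A short sketch on hypothetical score samples (the sign convention depends on which group's U is reported, so |r| is the safer quantity to compare across tools):

```python
from scipy.stats import mannwhitneyu

def rank_biserial(x, y):
    """Rank-biserial correlation for a Mann-Whitney U comparison:
    r = 1 - 2U / (n1 * n2), in [-1, 1]; |r| near 1 means the two
    score distributions barely overlap."""
    u, _ = mannwhitneyu(x, y, alternative="two-sided")
    return 1.0 - 2.0 * u / (len(x) * len(y))

# Hypothetical efficiency-score samples for two DEA model variants.
a = [10, 20, 30, 40]
b = [35, 45, 55, 65]
print(round(rank_biserial(a, b), 2))  # → 0.88
```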
Table 12. Super-efficiency CCR models.
| Range | CCR_C | CCR_G_I | CCR_G_G | CCR_S_L | CCR_S_H | CCR_R_O | CCR_R_R | CCR_R_B |
| [100–250) | 18 | 14 | 5 | 13 | 12 | 7 | 8 | 5 |
| [250–400) | 4 | 4 | 1 | 3 | 2 | 0 | 5 | 3 |
| [400–550) | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| [550–700) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| [700–850) | 1 | 2 | 0 | 1 | 1 | 0 | 1 | 0 |
| [850–1000] | 3 | 2 | 2 | 3 | 1 | 3 | 1 | 3 |
Table 13. Super-efficiency BCC models.
| Range | BCC_C | BCC_G_I | BCC_G_G | BCC_S_L | BCC_S_H | BCC_R_O | BCC_R_R | BCC_R_B |
| [100–250) | 7 | 7 | 0 | 4 | 3 | 2 | 0 | 1 |
| [250–400) | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| [400–550) | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| [550–700) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| [700–850) | 2 | 2 | 0 | 1 | 0 | 0 | 1 | 0 |
| [850–1000] | 36 | 32 | 9 | 36 | 16 | 9 | 30 | 13 |
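The super-efficiency scores binned in Tables 12 and 13 follow the Andersen–Petersen procedure [59]: each efficient campaign is re-evaluated against a frontier built from all other campaigns, so its score can exceed 100 and efficient units become rankable. An illustrative CRS sketch on hypothetical toy data (not the Frontier Analyst implementation used in the study):

```python
import numpy as np
from scipy.optimize import linprog

def super_efficiency(X, Y, j0):
    """Andersen-Petersen super-efficiency (output-oriented, CRS sketch):
    the same envelopment LP as CCR, but DMU j0 is removed from its own
    reference set, so efficient DMUs can score above 100."""
    keep = [j for j in range(len(X)) if j != j0]
    Xr, Yr = X[keep], Y[keep]
    n = len(keep)
    c = np.zeros(n + 1)
    c[0] = -1.0                                   # maximize eta
    A_ub, b_ub = [], []
    for i in range(X.shape[1]):                   # peer inputs <= own inputs
        A_ub.append(np.concatenate(([0.0], Xr[:, i])))
        b_ub.append(X[j0, i])
    for r in range(Y.shape[1]):                   # eta * own outputs <= peer outputs
        A_ub.append(np.concatenate(([Y[j0, r]], -Yr[:, r])))
        b_ub.append(0.0)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub, method="highs")
    return 100.0 / res.x[0]                       # score > 100 => super-efficient

# Toy data (hypothetical): DMU 1 dominates, so its score exceeds 100.
X = np.array([[1.0], [2.0], [3.0]])
Y = np.array([[1.0], [4.0], [3.0]])
print(round(super_efficiency(X, Y, 1), 1))  # → 200.0
```

For inefficient units the reference-set exclusion changes nothing, so their scores are unaffected.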
Table 14. Summary of quantitative sensitivity analysis results for models.
| Model | Output Variable | Variation (%) | Range of Mean Model Differences | Spearman’s ρ | Number of Efficient Campaigns |
| CCR_C | O1 | ±5, ±10 | 0 to 2.63 × 10⁻⁴ | 1 | 26 |
| | O2 | ±5, ±10 | 0 | | |
| CCR_S_L | O1 | ±5, ±10 | −1.82 × 10⁻⁴ to 0 | 1 | 20 |
| | O2 | ±5, ±10 | 0 | | |
| CCR_S_H | O1 | ±5, ±10 | 0 | 1 | 16 |
| | O2 | ±5, ±10 | 0 | | |
Table 15. Structural sensitivity of model CCR_C (impact of removing each input variable).
| Input Removed | Δ Mean Model | Spearman’s ρ | Number of Efficient Campaigns |
| I1 | −7.83 | 0.88 | 21 |
| I2 | −5.29 | 0.95 | 22 |
| I3 | −0.56 | 0.99 | 25 |
| I4 | −4.93 | 0.93 | 22 |
| I5 | −7.42 | 0.94 | 18 |
| I6 | −13.55 | 0.65 | 16 |
Table 16. Structural sensitivity of model CCR_S_L (impact of removing each input variable).
| Input Removed | Δ Mean Model | Spearman’s ρ | Efficient Campaigns |
| I1 | −10.89 | 0.80 | 17 |
| I2 | −7.17 | 0.89 | 16 |
| I3 | −0.50 | 0.98 | 19 |
| I4 | −1.79 | 0.97 | 19 |
| I5 | −10.02 | 0.91 | 14 |
| I6 | −9.60 | 0.64 | 13 |
Table 17. Structural sensitivity of model CCR_S_H (impact of removing each input variable).
| Input Removed | Δ Mean Model | Spearman’s ρ | Efficient Campaigns |
| I1 | −1.27 | 0.90 | 15 |
| I2 | −1.74 | 1.00 | 16 |
| I3 | −0.29 | 1.00 | 16 |
| I4 | 0.00 | 1.00 | 16 |
| I5 | −6.24 | 0.64 | 13 |
| I6 | −28.05 | 0.62 | 8 |
Table 18. Structural sensitivity analysis of models (effect of removing super-efficient campaigns).
| Model | Scenario | Δ Mean Model | Spearman’s ρ | Efficient Campaigns |
| CCR_C | 3 campaigns removed | 6.12 | 0.88 | 26 |
| CCR_S_L | 3 campaigns removed | 4.40 | 0.92 | 19 |
| CCR_S_H | 1 campaign removed | 0 | 1 | 15 |
Table 19. Potential improvements of Campaign 15.
| Variable | Potential Improvement (%) |
| O1 | 44 |
| O2 | 44 |
| I1 | 0 |
| I2 | −34 |
| I3 | −26 |
| I4 | 0 |
| I5 | 0 |
| I6 | 0 |
Table 20. Example of peer contributions: Campaign 42’s role in improving the efficiency of Campaign 15.
| Campaign | Variable | Contribution (%) |
| 42 | O1 | 36 |
| | O2 | 91 |
| | I1 | 39 |
| | I2 | 55 |
| | I3 | 83 |
| | I4 | 23 |
| | I5 | 17 |
| | I6 | 99 |
Table 21. Contribution of inputs and outputs for Campaign 15.
| Variable | Contribution (%) | Input/Output |
| I1 | 21 | Input |
| I2 | 0 | Input |
| I3 | 0 | Input |
| I4 | 1.52 | Input |
| I5 | 0.69 | Input |
| I6 | 77 | Input |
| O1 | 12 | Output |
| O2 | 88 | Output |
Table 22. Accuracy rates and number of correctly classified campaigns for the CCR_C and BCC_C models.
| Model | Metric | J48 | CART | C5.0 |
| CCR_C | accuracy rate (%) | 76.3 | 67.1 | 73.7 |
| | correctly classified instances | 58 | 51 | 56 |
| BCC_C | accuracy rate (%) | 71.1 | 68.4 | 78.9 |
| | correctly classified instances | 54 | 52 | 60 |
Table 23. Confusion matrices for the CCR_C and BCC_C models.
| Algorithm | Model | Efficiency | Classified as Efficient | Classified as Inefficient |
| J48 | CCR_C | efficient | 12 | 14 |
| | | inefficient | 4 | 46 |
| | BCC_C | efficient | 31 | 15 |
| | | inefficient | 7 | 23 |
| CART | CCR_C | efficient | 7 | 19 |
| | | inefficient | 6 | 44 |
| | BCC_C | efficient | 36 | 10 |
| | | inefficient | 14 | 16 |
| C5.0 | CCR_C | efficient | 8 | 18 |
| | | inefficient | 2 | 48 |
| | BCC_C | efficient | 39 | 7 |
| | | inefficient | 9 | 21 |
Table 24. Accuracy rates and number of correctly classified campaigns for the BCC_C model (SMOTE).
| Metric | J48 | CART | C5.0 |
| accuracy rate (%) | 76.3 | 72.3 | 75.0 |
| correctly classified instances | 58 | 55 | 57 |
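SMOTE balances the classes by synthesizing new minority-class samples, interpolating between each minority sample and one of its k nearest minority neighbours. The study used an existing implementation; the numpy function below is only a minimal sketch of that core idea, applied to hypothetical data with the BCC_C class sizes (46 efficient vs. 30 inefficient, so 16 synthetic samples even the classes out):

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, rng=None):
    """Minimal sketch of SMOTE's core idea: synthesize minority samples by
    interpolating between a minority sample and one of its k nearest
    minority neighbours. Illustrative only -- real use would rely on a
    library implementation such as imbalanced-learn's SMOTE."""
    rng = np.random.default_rng(rng)
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]   # skip the sample itself
        j = rng.choice(neighbours)
        gap = rng.random()                    # interpolation factor in [0, 1)
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synth)

# Hypothetical: 30 minority (inefficient) campaigns described by 6 DEA inputs,
# oversampled by 16 to match the 46 efficient ones, as in the BCC_C setting.
X_min = np.random.default_rng(0).random((30, 6))
X_new = smote_sketch(X_min, n_new=16, rng=1)
print(X_new.shape)  # → (16, 6)
```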
Table 25. Confusion matrices for the BCC_C model (SMOTE).
| Algorithm | Efficiency | Classified as Efficient | Classified as Inefficient |
| J48 | efficient | 36 | 10 |
| | inefficient | 8 | 22 |
| CART | efficient | 34 | 12 |
| | inefficient | 9 | 21 |
| C5.0 | efficient | 40 | 6 |
| | inefficient | 13 | 17 |
Table 26. Classification results by class for the BCC_C model (SMOTE).
| Model | Efficiency | TP Rate | FP Rate | Precision | Recall | F-Measure | Kappa | ROC Area |
| J48 | efficient | 0.783 | 0.267 | 0.818 | 0.783 | 0.800 | 0.510 | 0.763 |
| | inefficient | 0.733 | 0.217 | 0.688 | 0.733 | 0.710 | | |
| | weighted average | 0.763 | 0.247 | 0.767 | 0.763 | 0.764 | | |
| CART | efficient | 0.739 | 0.300 | 0.791 | 0.739 | 0.764 | 0.432 | 0.758 |
| | inefficient | 0.700 | 0.261 | 0.636 | 0.700 | 0.667 | | |
| | weighted average | 0.724 | 0.285 | 0.730 | 0.724 | 0.726 | | |
| C5.0 | efficient | 0.870 | 0.433 | 0.755 | 0.870 | 0.808 | 0.455 | 0.684 |
| | inefficient | 0.567 | 0.130 | 0.739 | 0.567 | 0.642 | | |
| | weighted average | 0.750 | 0.314 | 0.749 | 0.750 | 0.749 | | |
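Each per-class figure in Table 26 follows directly from the corresponding confusion matrix in Table 25. As a check, the short helper below recomputes the "efficient" row for J48 (36 true positives, 10 false negatives, 8 false positives, 22 true negatives):

```python
def class_metrics(tp, fn, fp, tn):
    """Per-class TP rate (recall), FP rate, precision and F-measure
    from the cells of a binary confusion matrix."""
    tpr = tp / (tp + fn)               # recall for this class
    fpr = fp / (fp + tn)               # other-class samples wrongly labelled
    prec = tp / (tp + fp)
    f1 = 2 * prec * tpr / (prec + tpr)
    return tpr, fpr, prec, f1

# J48 / BCC_C with SMOTE, "efficient" class (counts from Table 25).
tpr, fpr, prec, f1 = class_metrics(tp=36, fn=10, fp=8, tn=22)
print(f"TP rate {tpr:.3f}, FP rate {fpr:.3f}, precision {prec:.3f}, F {f1:.3f}")
# → TP rate 0.783, FP rate 0.267, precision 0.818, F 0.800
```

These match the J48 "efficient" row of Table 26 exactly.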
Table 27. Summary of studies combining DEA and ML in the literature.
| Study | Number of DMUs | Accuracy Rates (%) |
| [26] | 23 | using C4.5; CCR (90.9), BCC (81.8) |
| [29] | 15 | not specified |
| [73] | 53 | using J48 for aggregated scores of VRS and CRS DEA models; imbalanced (79.9), balanced (81.6) (after DECORATE) |
| [24] | 22 | MPI-based AM model (91.2) |
| [28] | 200 | using C4.5; CCR (86.5) |
| [27] | 7548 | CCR (98.2)-2 *, BCC (98.9)-2 * |
| [20] | 131 | not specified |
| [25] | 444 | using C5.0; CCR (100) |
| [19] | 21 | using J48; inputs only: imbalanced (76.2), balanced (88.1); inputs and outputs: imbalanced (71.4), balanced (89.3) |
| [21] | 18 | using CART based on the Latent Variable VRS DEA model; Period I: 98.02, Period II: 100 (with Bootstrap) |
| [22] | 151 | using CART: VRS DEA (75.5); using CIT: VRS DEA (67.5) |
| [23] | 36 | using CART; VRS DEA (97.9) (with Bootstrap) |
| [74] | 200 | not specified |
* 2 indicates the number of classes.