During the procurement process of an aircraft platform, there is an emphasis on affordability and cost management issues. Potential buyers can always be offered reliable estimates of the O&S costs of an aircraft platform which has been in operation and has reached its ‘fleet maturity’ stage. This is not possible, though, when comparing and evaluating new (‘unknown’) aircraft platforms; therefore, decisions should be based, where possible, on cost estimates whose variables are known at the time of the procurement process. One of the main objectives of this paper is therefore to develop a CER that will enable an analyst to carry out a timely and reliable O&S cost forecast, despite the lack of actual data from the utilization and support life cycle stages of the platform, during which the largest portion of the Life Cycle Cost (LCC), nearly 60%, is incurred [
16]. As such, the aircraft’s physical and performance characteristics and parameters, which are known well in advance, at the aircraft’s conceptual design phase, are investigated in the present work as potential variables of an O&S parametric cost estimation model. An introductory demonstration of this concept was presented by one of the authors at the 2016 International Training Symposium of the ICEAA [
17].
2.3. The Development of the Parametric Model
The relationship between historical CPFH and specific aircraft characteristics is investigated with the objective of identifying a strong CER that can be used to estimate the hypothetical CPFH of an ‘unknown’ aircraft. Typically, the CPFH includes the following six main cost categories [
16] according to the O&S cost element structure: unit-level manpower, unit operations, maintenance, sustaining support, continuing system improvements and indirect support. Since the purpose of the presented parametric model is the assessment of the relationship between cost and technical or performance characteristics, the ‘indirect support’ cost category is excluded from the analysis.
The Fiscal Year (FY) 2013 CPFH data from the Hellenic Air Force (HAF) aircraft fleet have been used as input for the developed parametric model (
Table 1). The Ministry of National Defence of Greece restricts the publication of CPFH data for the fighter jet fleet; CPFH data for aircraft types other than fighter jets, however, are publicly available without any restriction. The CPFH data in
Table 1 include the contribution of the ‘indirect support’ cost category which, as mentioned previously, was excluded from the present study.
In more detail, the 2012 CPFH Issue of the OGGG was based on O&S data collected during FY 2011, the 2013 CPFH Issue on O&S data collected during FY 2012, and the 2018 CPFH Issue on an analysis of O&S data collected during FY 2015. At any point in time, the most recently published CPFH Issue of the OGGG is used to inform the decision making of the Ministry of National Defence of Greece.
The 2013 CPFH Issue of the OGGG, which has been used for the development of the parametric model, is considered to contain more reliable data than that of 2012, as the Ministry has transitioned, from 2012, to a more GAO-streamlined [
22] procedure for categorizing and analysing the cost input data. It was deemed necessary to exclude the contribution of the ‘indirect support’ cost category when developing the parametric model, as this category is influenced by extrinsic factors (institutional framework, organizational structure, integrated logistic support policies to name a few). As a matter of fact, the ‘indirect support’ cost category portion in HAF aircraft types ranges from 5% up to 50% of the CPFH, depending on these extrinsic factors.
The HAF fire-fighting fleet (CL-215, CL-415) case can provide some evidence to support our decision to exclude the ‘indirect support’ cost category. As observed from
Table 1 (data including ‘indirect support’ cost), column ‘CPFH FY 2018,’ the CPFH of the CL-415 seems to be considerably higher than the CL-215 CPFH. It is worth noting that the CL-415, as opposed to CL-215, is a contemporary aircraft with enhanced reliability and maintainability provisions, equipped with very fuel efficient, latest technology turboprop engines. The paradox in the CPFH difference has to do with the fact that the CL-415 squadron experiences a higher ‘indirect support’ cost than the CL-215 squadron does. Almost the entire structure of 113th Wing supports the CL-415 squadron; on the contrary, the CL-215 squadron belongs in the organizational chart of 112th Wing, which supports other squadrons as well. Thus, in the case of 112th Wing, the ‘indirect support’ cost is allocated to multiple aircraft types (generally, each HAF Wing supports multiple aircraft squadrons and/or aircraft types [
23]). Nevertheless, if the ‘indirect support’ cost is excluded, the CL-415 exhibits a lower CPFH than the CL-215, which makes more sense for the comparative purpose of the parametric model.
A generic view of the constraints/requirements and the parametric model performance is presented in
Table 2. The variables used for the analysis are shown in
Table 3.
Work done by Bryant ([
24], Table 2, page 10) offers an overview of independent variables used in previous CPFH research. The variables are grouped in four distinct groups, namely ‘aircraft characteristics,’ ‘operational factors,’ ‘economic factors’ and ‘environmental factors.’ The independent variables most frequently used by researchers are ‘average aircraft age’ (belonging to the ‘aircraft characteristics’ group) and ‘utilization rate’ (belonging to the ‘operational factors’ group), which reflects the fact that CPFH research efforts are mainly channelled towards policy hypotheses concerning the effects of aircraft fleet aging and utilization. Instead, the present study aims to provide a ‘universal’ CER applicable to a wide range of platforms, which can be used at the very early development stages of a ‘new’ aircraft design and, as such, the independent variables are restricted to aircraft design and performance characteristics. Moreover, in most cases, the values of the variables are unclassified and readily available online, which enhances the usability of the developed model.
Figure 3 offers an overview of the way that the systems are being classified, based on the selected independent variables and depending on the desired systems similarity level. Because the selected independent variables serve as system identifiers, it is important to examine if the systems are perceptible in a realistic way and identified as ‘different’ or ‘similar’ through the regression process. The vertical axis of the cluster dendrogram in
Figure 3 corresponds to the level of ‘resolution’ of the sample’s ‘image.’ The ‘resolution’ is tied to the independent variables, which serve as system identifiers.
For example, the red horizontal dashed line in
Figure 3 corresponds to the selection of a low difference level (high ‘resolution’). This line cuts 8 branches of the dendrogram, meaning that the 22 different systems are identified and classified as 8 different entities (clusters), at the selected difference level. Specifically, the systems are grouped as follows: The 1st cluster includes two training aircraft (T-2E, T-6A II); the 2nd cluster includes the rotary wing platforms (AB-205, A-109E, AS-332C1, B-212) plus two light fixed-wing, single reciprocating engine, aircraft (T-41D, PZL); the 3rd cluster includes the four-engine transporter C-130H/B; the 4th cluster includes the lighter, two-engine transporter C-27J; the 5th cluster includes the two fire-fighting aircraft (CL-215, CL-415); the 6th cluster includes the two-engine supersonic jet fighter (F/RF-4E); the 7th cluster includes the single-engine supersonic jet fighters (F-16 blocks 30/50/52+, M2000/-5); the 8th cluster includes the VIP and early warning subsonic jet aircraft (ERJ-135BJ/LR, ERJ-145H, GV) plus the A-7H single-engine subsonic jet fighter. Apart from a few pitfalls, the classification of the systems according to the aforementioned clusters seems quite realistic.
Prior to the Log-transformation of the CPFH, the sample’s actual CPFH was multiplied by a positive real constant c, chosen such that the selected model estimates CPFH = 1 for the F-16C/D Block 52+. The F-16’s CPFH was chosen as the reference point for relative comparisons against the CPFH of other systems. As a result, the model expresses the CPFH of any aircraft type as a multiple of the F-16 mean CPFH. Analysts may use analogy to turn the model’s outputs into absolute CPFH estimates, by choosing a known system as their reference and multiplying its actual CPFH by the model’s output.
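The scaling and the analogy step can be illustrated with a minimal Python sketch. The function names, the base-10 logarithm and all numerical values are assumptions made for illustration only; none of them is taken from the study's data. The sketch shows that multiplying every CPFH by c shifts each log-transformed value by the same constant, and that a relative output is converted into an absolute estimate by multiplying it by the reference system's actual CPFH.

```python
import math

def relative_log_cpfh(actual_cpfh, c):
    """Scale actual CPFH values by the positive constant c, then log-transform.

    The base-10 logarithm is an assumption; the text only says 'Log'.
    """
    if c <= 0:
        raise ValueError("c must be a positive real number")
    return [math.log10(c * value) for value in actual_cpfh]

def absolute_cpfh(model_output, reference_cpfh):
    """Turn a relative model output into an absolute CPFH by analogy."""
    return model_output * reference_cpfh

# Hypothetical figures, for illustration only.
sample_cpfh = [100.0, 1000.0]
scaled = relative_log_cpfh(sample_cpfh, c=0.01)
unscaled = relative_log_cpfh(sample_cpfh, c=1.0)

# Scaling by c shifts every log value by the same amount, log10(c).
shift = [a - b for a, b in zip(scaled, unscaled)]

# A relative output of 1.5 against a reference system costing 8000 per hour.
estimate = absolute_cpfh(model_output=1.5, reference_cpfh=8000.0)
```

Because the shift is constant across the sample, the scaling affects only the intercept of a log-linear model and leaves the slopes unchanged.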
The pairwise assessment of the selected independent variables reveals multicollinearity issues. Two or more independent variables may be highly correlated, for example LogEMPTY and LogMTOW, meaning that one can be linearly estimated from the others with a substantial degree of accuracy. A parametric model should not include strongly correlated independent variables, because its predictive ability will decrease. The Pearson correlation matrix (
Figure 4) offers an overview of the existing correlations among the transformed variables.
Figure 5 shows examples of multicollinearity amongst various variables.
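A pairwise screening of this kind can be sketched in a few lines of Python. The sample values and the 0.8 threshold below are hypothetical and chosen only to illustrate the idea; they are not the study's data, and the paper does not state a numerical cut-off.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def collinear_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Hypothetical transformed variables (illustrative values only).
columns = {
    "LogEMPTY": [3.5, 4.0, 4.5, 4.9],
    "LogMTOW": [3.8, 4.3, 4.8, 5.2],
    "LogSFC": [-0.3, 0.1, -0.2, 0.0],
}
flagged = collinear_pairs(columns)
```

In this toy sample only the LogEMPTY/LogMTOW pair is flagged, which mirrors the example given in the text; a pair such as LogEMPTY/LogSFC stays below the threshold.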
2.4. Selection of the Optimal CER
As seen in
Figure 4, the highest correlation coefficient between LogCPFH and the independent variables is $r = 0.83$, attained for LogMTOW. Therefore, LogMTOW would be the best choice for developing a simple CER. Unfortunately, this model fails at least one of the requirements in
Table 2, namely $R^2_{adj} \geq 0.75$ (indeed, $r^2 = 0.83^2 = 0.69 < 0.75$).
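The arithmetic behind this check can be reproduced with a short Python sketch. For a single-predictor model the coefficient of determination equals the squared correlation coefficient, and the adjusted value is never larger. The sample size n = 22 is an assumption taken from the 22 systems mentioned for the dendrogram; the number of observations actually entering the regression is not restated here.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

r = 0.83                 # highest correlation with LogCPFH (from the text)
r2 = r ** 2              # simple regression: R^2 equals r squared
r2_adj = adjusted_r2(r2, n=22, p=1)   # n = 22 is an assumption

meets_requirement = r2_adj >= 0.75
```

Since the adjustment can only lower the value, a model with $r^2 < 0.75$ cannot satisfy the adjusted requirement, whatever the sample size.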
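The next step is to investigate all possible complex CERs with two independent variables, including their interaction, of the form:
$$\mathrm{LogCPFH} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 \quad (2)$$
where $\beta_0, \beta_1, \beta_2, \beta_3$ are the model’s coefficients and $X_1$, $X_2$ are two distinct independent variables among the six candidates. After performing stepwise regression on all possible (2 out of 6 = 15) models with two independent variables, as shown in Equation (2), the following model is chosen, according to the AIC as the measure of the CERs’ relative quality:
$$\mathrm{LogCPFH} = \beta_0 + \beta_1\,\mathrm{LogEMPTY} + \beta_2\,\mathrm{LogSFC}$$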
Notably, the two selected independent variables do not correlate significantly (
Figure 4 and
Figure 6), so there is no multicollinearity in the selected model. Also, the interaction of the two independent variables is not significant, hence the interaction term (LogEMPTY × LogSFC) is omitted from the right-hand side of the equation.
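The search over the 15 two-variable candidates can be sketched in Python as follows. This is a simplified illustration, not the procedure run in R: it fits every pair with its interaction term by ordinary least squares and ranks the fits by AIC, without the stepwise removal of non-significant terms. The data are synthetic, the names V4, V5 and V6 are hypothetical placeholders for the candidates not named in the text, and the AIC is computed up to an additive constant.

```python
import math
import random
from itertools import combinations

def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_aic(rows, y):
    """Least-squares fit via the normal equations; returns (beta, AIC)."""
    k, n = len(rows[0]), len(y)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, t in zip(rows, y))
    # Gaussian AIC up to an additive constant; k + 1 counts the error variance.
    return beta, n * math.log(rss / n) + 2 * (k + 1)

def best_pair(data, y):
    """Fit intercept + X1 + X2 + X1*X2 for every pair; keep the lowest AIC."""
    best = None
    for a, b in combinations(sorted(data), 2):
        rows = [[1.0, u, v, u * v] for u, v in zip(data[a], data[b])]
        _, aic = ols_aic(rows, y)
        if best is None or aic < best[0]:
            best = (aic, a, b)
    return best

# Synthetic data: the response depends on LogEMPTY and LogSFC only.
rng = random.Random(0)
names = ["LogEMPTY", "LogMTOW", "LogSFC", "V4", "V5", "V6"]
data = {name: [rng.uniform(0.0, 1.0) for _ in range(22)] for name in names}
y = [1.0 + 0.5 * e - 0.8 * s + rng.gauss(0.0, 0.01)
     for e, s in zip(data["LogEMPTY"], data["LogSFC"])]

n_models = len(list(combinations(names, 2)))   # 2 out of 6
best = best_pair(data, y)
```

On this synthetic sample the pair that generated the response attains the lowest AIC. With real data, the comparison would be followed by significance tests on each retained term, as described in the text.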
The selected model explains a remarkable 82.15% of the LogCPFH variance, while the intercept, LogEMPTY and LogSFC have significant explanatory power at the 5% significance level. The model’s linearity assumptions are not rejected at the 5% significance level. Moreover, there is a statistically strong indication that no power transformation is required on LogCPFH.
Table 4 shows the regression analysis details obtained from R software (v3.5.1, released on 2 July 2018) and
Figure 6 shows a 2D density plot (log-scale) for the model’s independent variables.
2.5. Residuals Diagnostics
Obtaining valid prediction or confidence intervals relies on the assumptions that the residuals are normally distributed with zero mean, have constant variance and exhibit no autocorrelation. The residuals of the selected model pass all the necessary tests (
Table 5).
The residual plots (
Figure 7) offer some additional insight, following the tests in
Table 5. The histogram indicates normality of the residuals; a few high-discrepancy observations (candidate outliers) appear in the boxplot, but the respective test in
Table 5 supports the null hypothesis that there are no outliers. The residual plots versus the fitted values indicate the absence of a curvature pattern and the presence of constant variance, supported by the minor slope in the spread-level plot; furthermore, no significant autocorrelation is evident in the respective plots. There are also a few high-leverage observations; however, no hat value exceeds the empirical limit for small samples (three times the mean hat value). The studentized residuals seem to follow the theoretical Student’s t distribution, as shown in the Q-Q plot.
Observation no. 14 needs further attention, as it is influential for the model: its Cook’s distance exceeds the empirical cut-off limit. This data point corresponds to the C-130H/B, which is the heaviest aircraft within the sample (high-leverage observation); additionally, the model greatly overestimates the C-130H/B’s actual CPFH (high-discrepancy observation). Excluding observation no. 14 from the analysis would trigger a vicious cycle of consecutive exclusions, diminishing the size of the sample. Because the sample size is small, we decided not to exclude any observation from the regression analysis.
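The leverage limit and Cook's distance can be computed with the following Python sketch. It uses the standard textbook formulas; the cut-off 4/(n − p) is a common convention assumed here, since the text does not state which empirical limit was applied, and n = 22 and p = 3 (intercept plus two slopes) are likewise assumptions.

```python
def leverage_limit(n, p, factor=3.0):
    """Empirical leverage limit: factor times the mean hat value, p / n."""
    return factor * p / n

def cooks_distance(residual, hat, s2, p):
    """Cook's distance from a raw residual, its hat value and the error variance."""
    return (residual ** 2 / (p * s2)) * hat / (1.0 - hat) ** 2

def influential(residuals, hats, s2, p, cutoff=None):
    """Indices (1-based) of observations whose Cook's distance exceeds the cutoff."""
    n = len(residuals)
    if cutoff is None:
        cutoff = 4.0 / (n - p)   # assumed convention, not taken from the paper
    return [i + 1 for i, (e, h) in enumerate(zip(residuals, hats))
            if cooks_distance(e, h, s2, p) > cutoff]

# Assumed sizes: 22 observations, 3 coefficients.
limit = leverage_limit(n=22, p=3)

# Hypothetical residuals and hat values; the third point is far off and has high leverage.
flagged = influential(residuals=[0.1, -0.1, 0.6, 0.05],
                      hats=[0.1, 0.1, 0.4, 0.1], s2=0.04, p=3, cutoff=1.0)
```

An observation is influential when a large residual coincides with a large hat value, which is the combination described for the C-130H/B.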
The construction of prediction or confidence intervals requires the residuals standard error and degrees of freedom (provided in
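Table 4), as well as the hat matrix. For any given input $\mathbf{x}_0 = (1, \mathrm{LogEMPTY}, \mathrm{LogSFC})^{T}$, the hat value $h_0 = \mathbf{x}_0^{T}(X^{T}X)^{-1}\mathbf{x}_0$ can be calculated using the following information obtained from the sample: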
Figure 8 shows the standard deviation map for the construction of LogCPFH confidence intervals. The highest precision for mean CPFH predictions is obtained when SFC ≈ 0.75 lb/(lbf∙h) or lb/(hp∙h) and empty weight ≈ 15000 lb.
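The interval construction can be sketched in Python under stated assumptions. The matrix `xtx_inv`, the residual standard error `s`, the fitted value and the critical t value below are hypothetical placeholders; the actual quantities would come from Table 4 and from the sample's design matrix. The sketch uses the standard results that the standard error of the mean response is s·√h and that of a new observation is s·√(1 + h), where h is the hat value of the input.

```python
import math

def hat_value(x0, xtx_inv):
    """h0 = x0' (X'X)^-1 x0 for an input vector x0 = (1, LogEMPTY, LogSFC)."""
    k = len(x0)
    return sum(x0[i] * xtx_inv[i][j] * x0[j] for i in range(k) for j in range(k))

def interval(fit, h0, s, t_crit, prediction=False):
    """Confidence interval for the mean LogCPFH, or prediction interval if requested."""
    spread = math.sqrt(1.0 + h0) if prediction else math.sqrt(h0)
    half_width = t_crit * s * spread
    return fit - half_width, fit + half_width

# Hypothetical inputs, for illustration only.
xtx_inv = [[0.25, 0.0, 0.0],
           [0.0, 0.25, 0.0],
           [0.0, 0.0, 0.25]]
x0 = [1.0, 0.0, 0.0]
h0 = hat_value(x0, xtx_inv)

confidence = interval(fit=0.0, h0=h0, s=0.2, t_crit=2.0)
prediction = interval(fit=0.0, h0=h0, s=0.2, t_crit=2.0, prediction=True)
```

The prediction interval is always wider than the confidence interval, and both narrow as the hat value decreases, which is why the precision is highest near the centre of the sample.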