1. Introduction
In aviation, it is crucial for airlines to maintain their fleet in an airworthy state. Each individual aircraft is required to meet a high standard of technical reliability, which is accomplished through maintenance. Aircraft maintenance encompasses a variety of tasks that can be deployed to keep the aircraft in an airworthy state. One of these tasks is replacement. As components are replaced, demand for new components is generated. To minimize the associated downtime of an aircraft, maintenance, repair and overhaul (MRO) providers (a term used here to indicate the contributions of Part 145 Approved Maintenance Organizations and Part M Continuing Airworthiness Management Organizations) aim to meet this generated demand by having spare parts available in their inventory. However, available inventory may not always be sufficient for the experienced demand, a phenomenon which is compounded by the highly variable nature (in both frequency and quantity) of spare part demand in aviation [1, 2, 3]. Due to this high variability, actual demand is difficult to estimate or forecast accurately. Consequently, this drives companies to keep relatively high stock buffers in order to ensure the availability of parts, leading to increased holding costs and waste of part life.
As noted by Regattieri et al. [2], many MROs and airlines do not use sophisticated techniques for demand estimation and forecasting, but rely on in-house experience or component supplier suggestions. If forecasting is in place, time-series techniques are often employed. Regattieri et al. [2] analysed the accuracy of twenty time-series forecasting techniques and noted their strengths and weaknesses. In a similar vein, Ghobbar and Friend [4] stated that airline operators could improve their forecasts by identifying which drivers induce the variable behaviour of demand.
The latter points towards one of the noteworthy limitations in the current state of the art: because many studies rely on time-series techniques, in which demand is the only variable taken into account, they do not provide further understanding of how demand is generated. Multiple demand drivers are not usually considered, either individually or jointly. Furthermore, a substantial subset of the literature assumes that installed components are always in an “as-good-as-new” state, i.e., denoting perfect repairs, or an “as-bad-as-old” state, i.e., denoting minimal repairs [5]. Neither assumption necessarily matches spare part configurations in which overhauled or refurbished components are reintroduced into service. While the current state of the art does present models for imperfect repairs [6, 7, 8, 9, 10] and applies them for policy evaluation and optimisation purposes [11, 12], their use for the prediction of demand behaviour has not been explored, to the best of the authors’ knowledge.
One of the downsides of the use of time-series techniques is that no further understanding of the generation of the demand is provided. As the study of Van der Auweraer et al. [13] noted, installed base information can be used to forecast the upcoming demand of spare parts. Similarly, environmental conditions can be used to improve the quality of forecasts [14]. The research by Lowas and Ciarallo [15] uncovered some reasons for the unpredictable behaviour of spare part demand. The most significant single factor driving demand variability was found to be the size of the aircraft fleet. It was concluded that smaller fleets have higher values for the Coefficient of Variation (CV) and Average Demand Interval (ADI), which measure the variability in the quantity and frequency of demand, respectively, when compared to large fleets. The authors of the study recommended further study to better understand demand generation drivers, as only some were tested.
One commonly made assumption throughout the literature is related to the state of the component when installed on an aircraft. Studies using the expected lifetime of components often neglect the fact that errors in the repair process occur and hence repairable components are not restored in an as-good-as-new state [6, 7, 16]. As maintenance personnel face high levels of time pressure and the effects of environmental circumstances in the industry, errors in the process will occur. Research has shown that in at least 39% of cases, maintenance errors are related to installation errors or incomplete repairs [17]. Broken components that are placed back into operation result in subsequent failures due to the incorrect state of the component [18]. This leads to concentrations in failures, resulting in a peak in spare part demand. However, this type of failure dependency is typically not addressed within the scope of spare part demand forecasting.
This research aims to address the aforementioned limitations in the state of the art by (1) modelling, simulating and evaluating the effect of incorrect repairs on demand patterns; and (2) quantifying the effect of multiple demand drivers in conjunction, leading to an improved understanding of demand driver priority. In terms of demand drivers, fleet size, incorrect repairs, environmental conditions, and different component commonality strategies are considered.
Section 2 gives an overview of the academic state of the art regarding the research topic, and further highlights limitations that will be addressed in this research. Section 3 presents the modelling and simulation approach. Subsequently, this approach is applied to a case study comprising real-life component data from an aircraft MRO provider that services CS25 category aircraft. Section 4 presents the case study characteristics and gives the results of a quantitative evaluation. Next, Section 5 discusses the validity and applicability of the results. Finally, Section 6 and Section 7 present the conclusions and recommendations for future studies.
3. Modelling and Simulation Approach
This section describes the model formulation and implementation, and elaborates on the required input data and the simulation setup. Section 3.1 explains the modelling and simulation approach, and Section 3.2 describes the systematic evaluation of the influencing parameters.
3.1. Methods
The approach aims to quantify the impact of different levels of repair quality on spare part demand. To this end, the impact of varying the level of repair quality on the ADI, the CV2 (the squared coefficient of variation of demand) and the overall number of failures is captured. The first two metrics cover the variability in demand, whereas the final metric captures the overall demand size.
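As an illustration, the two variability metrics can be computed from a demand series as sketched below. This is a minimal sketch that assumes the definitions commonly used in the intermittent-demand literature (ADI as the number of periods per period with non-zero demand, CV2 as the squared coefficient of variation of the non-zero demand sizes); the exact definitions used in this study may differ, and the function names and the example series are hypothetical.

```python
from statistics import mean, pstdev


def adi(demand):
    """Average demand interval: number of periods per non-zero demand period."""
    nonzero = [d for d in demand if d > 0]
    if not nonzero:
        raise ValueError("ADI is undefined for a series without demand")
    return len(demand) / len(nonzero)


def cv2(demand):
    """Squared coefficient of variation of the non-zero demand sizes."""
    nonzero = [d for d in demand if d > 0]
    if not nonzero:
        raise ValueError("CV2 is undefined for a series without demand")
    return (pstdev(nonzero) / mean(nonzero)) ** 2


# Hypothetical demand per period (e.g., failures per day at one location).
series = [0, 2, 0, 0, 1, 0, 3, 0]
print(adi(series), cv2(series), sum(series))
```

A higher ADI indicates more periods without demand, and a higher CV2 indicates more variable demand sizes; the sum of the series corresponds to the total number of failures.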
In order to capture the effect of incorrect repairs on spare part demand in conjunction with several other drivers of demand, the developed approach combines a model for characterising incorrect repairs with a Monte Carlo simulation that generates spare part demand sequences on the basis of multiple input parameters. A visualisation of this approach is provided in Figure 3. As the occurrence of incorrect repairs and the distribution of subsidiary failures are random, a total of 50 iterations of the model are performed before analysing the results. The final results are based on distributional characteristics taken from across the individual iterations.
For modelling incorrect repairs, a Branching Poisson Process (BPP) is implemented. The BPP utilises a parameter, r, that represents the chance of an incorrect repair being performed. This parameter influences the discrete random variable that represents the spawning of subsequent failures. Hence, when an incorrect repair takes place and the component is placed back into service, the number of failures during a relatively short timespan may peak due to a certain number of subsequent failures. The influence of this parameter on CV2 and ADI is the core attribute of this model. Although it could be argued that the value of r might change over time, and that therefore a time-dependent function r(t) might be present, this is not undertaken in this study.
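The mechanism described above can be sketched for a single component location as follows. This is a minimal sketch and not the implementation used in this study: primary failures follow a homogeneous Poisson process with rate `rate`, each primary failure is followed by an incorrect repair with probability `r`, and an incorrect repair spawns subsidiary failures within a fourteen-day window. The distribution `SUBSIDIARY_PMF` for the number of subsidiary failures and the function name `simulate_bpp` are hypothetical.

```python
import random

# Hypothetical distribution of the number of subsidiary failures that follow
# one incorrect repair (not taken from the study's data).
SUBSIDIARY_PMF = {1: 0.7, 2: 0.2, 3: 0.1}


def simulate_bpp(rate, r, horizon_days, window_days=14, rng=None):
    """Simulate one component location; return a time-sorted list of (day, kind)."""
    rng = rng or random.Random()
    sizes = list(SUBSIDIARY_PMF)
    weights = list(SUBSIDIARY_PMF.values())
    events = []
    t = rng.expovariate(rate)  # time of the first primary failure
    while t < horizon_days:
        events.append((t, "primary"))
        if rng.random() < r:  # the repair following this failure is incorrect
            n_sub = rng.choices(sizes, weights=weights)[0]
            for _ in range(n_sub):
                # Subsidiary failures are spread uniformly over the window;
                # they may fall slightly beyond the horizon.
                events.append((t + rng.uniform(0.0, window_days), "subsidiary"))
        t += rng.expovariate(rate)  # exponential gap to the next primary failure
    return sorted(events)


events = simulate_bpp(rate=0.05, r=0.3, horizon_days=365, rng=random.Random(42))
print(len(events), "failures simulated")
```

In this sketch only primary failures can trigger an incorrect repair; whether subsidiary failures can themselves spawn further failures is a modelling choice that is left open here.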
Aside from the main parameter r, additional parameters that may influence spare part demand patterns are considered. The work of Lowas and Ciarallo [15] showed the influence of fleet size on demand patterns. Thijssens and Verhagen [14] showed that environmental conditions impact the reliability of components for multiple reasons. Air pollutants and salinity both have an impact on the corrosion process of components. In addition to a natural reference climate (i.e., temperate), humid and desert climates are taken into account as well, both of which affect the Mean Time Between Failure (MTBF) (note that other relevant metrics include Mean Time Between Repair (MTBR) and Mean Time Between Overhaul (MTBO), but the cited study focuses on MTBF). The impact of incorrect repairs in combination with these other varying circumstances provides a wider perspective on the general behaviour of spare part demand.
The BPP and the previously highlighted parameters are implemented and subsequently simulated for every aircraft in the fleet. Failures at the selected aircraft component locations (expressed using system ATA codes; see Section 3.2) are simulated according to their corresponding failure rate λ, obtained from the analysis of the underlying data. Based on the number of primary removals in the data set that are followed by subsequent removals, an estimate of the probability of an incorrect repair r can be made for the specific component location. Subsequently, possible subsequent failures are simulated. Next, the results of the individual aircraft are summed, resulting in the sum of the failures over time for every ATA location that is selected to be part of the model. This is done for multiple combinations of parameters. The values for ADI and CV2 are stored. These metrics describe the predictability of the failures over time, but do not provide an answer regarding the quantity of failures. Therefore, the total number of failures is added to the results as well, in order to capture both the behaviour and the sum of the failures.
The model can be applied across a variety of scenarios. In this study, four scenario variants are considered: each variant builds on the previous one to allow for the progressive generation of results, enabling the evaluation of individual effects followed by joint effects. The variants and their progressive nature are briefly discussed below.
Variant 1—Base: In this variant, the level of repair is the only parameter to be varied. Only temperate environmental conditions are taken into account. All fleet sizes are taken into account, but no distinction is made in the presentation of the results. No increase in component commonality across different aircraft is taken into consideration.
Variant 2—Incorporation of varying fleet sizes: This variant uses the same set of results as Variant 1, but a distinction between the different fleet sizes is made in the presentation of the results.
Variant 3—Incorporation of varying fleet sizes and environmental conditions: As different environmental conditions influence the effect of the expected lifetime of components, this will result in varying values for the different λs. Here, the results of humid and desert environments are also taken into account.
Variant 4: Incorporation of varying fleet sizes, environmental conditions, and component commonality strategies. As flag carriers tend to have more diverse fleets than low-cost carriers [31], MRO providers have to deal with different aircraft types. Component commonality across the aircraft types is typically limited, with aircraft types in a family concept usually sharing the greatest degree of commonality. However, recent research by Zhang et al. [32] has shown promising results regarding potential cost gains when component commonality is increased. Hence, this variant investigates the effect of increasing component commonality. From a practical perspective, this may give insights into any additional requirements on aircraft and component design, where OEMs have an opportunity to increase the similarity of components across multiple aircraft types. This has obvious manufacturing and supply chain benefits, but using Variant 4, it becomes possible to assess any potential effects on spare part demand.
The model algorithm can be described as per the pseudo-code given in Algorithm 1.
Algorithm 1: Pseudocode of model
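The overall simulation loop described in this section could look roughly like the Python sketch below. It is an illustration under stated assumptions rather than a reproduction of Algorithm 1: daily failure counts are generated per aircraft, summed over the fleet, and summarised by ADI, CV2 and total failures, averaged over the iterations. The subsidiary-failure distribution (`SUB_SIZES`, `SUB_WEIGHTS`), the ten-year horizon, the metric definitions, and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev

# Hypothetical distribution of subsidiary failures per incorrect repair.
SUB_SIZES, SUB_WEIGHTS = (1, 2, 3), (0.7, 0.2, 0.1)


def simulate_aircraft(rate, r, days, rng, window=14):
    """Daily failure counts for one aircraft location (primary + subsidiary)."""
    counts = [0] * days
    t = rng.expovariate(rate)
    while t < days:
        day = int(t)
        counts[day] += 1  # primary failure
        if rng.random() < r:  # incorrect repair spawns subsidiary failures
            for _ in range(rng.choices(SUB_SIZES, weights=SUB_WEIGHTS)[0]):
                sub_day = day + rng.randint(1, window)
                if sub_day < days:  # failures beyond the horizon are dropped
                    counts[sub_day] += 1
        t += rng.expovariate(rate)
    return counts


def demand_metrics(counts):
    """ADI, CV2 and total failures of one fleet-level demand series."""
    nonzero = [c for c in counts if c > 0]
    if not nonzero:
        raise ValueError("no failures in this run")
    return {
        "adi": len(counts) / len(nonzero),
        "cv2": (pstdev(nonzero) / mean(nonzero)) ** 2,
        "total": sum(counts),
    }


def run_monte_carlo(fleet_size, rate, r, days=3650, iterations=50, seed=0):
    """Average the metrics over several iterations of the fleet simulation."""
    rng = random.Random(seed)
    runs = []
    for _ in range(iterations):
        fleet_counts = [0] * days
        for _ in range(fleet_size):
            for day, c in enumerate(simulate_aircraft(rate, r, days, rng)):
                fleet_counts[day] += c
        runs.append(demand_metrics(fleet_counts))
    return {key: mean(run[key] for run in runs) for key in ("adi", "cv2", "total")}


print(run_monte_carlo(fleet_size=8, rate=0.01, r=0.2, days=1000, iterations=10))
```

Running the function for each combination of repair quality, fleet size and environmental condition would yield one data point per combination, in the spirit of the variants described above.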
3.2. Parameters
In order to determine the influence of the different scenarios, multiple parameter values have to be taken into consideration.
The main goal of this research is to reveal the impact of the quality of the repair process for components that are placed back into the aircraft on the CV2 and ADI of spare part demand. As the initial values of r are retrieved from the analysis of the dataset, these values are used as reference values (i.e., the Normal scenario). Scenarios with values for r increased by 100% (the Worse scenario), or decreased by 50% (the Improved scenario) or 100% (the Perfect scenario) are tested. The Worse scenario represents a situation in which the number of incorrectly repaired components placed back into service is twice as high as in the reference scenario. The Improved scenario represents a situation in which the chance of an incorrect repair is decreased by 50%, so that fewer incorrectly repaired components are placed back into service. The Perfect scenario represents a situation in which no incorrectly repaired components are placed back into the aircraft, and thus all reinstalled components function properly.
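The four scenario values of r follow directly from the reference value, as the short sketch below shows. The function name is hypothetical, and capping the Worse value at 1.0 is an assumption made here so that r remains a valid probability; the source does not state how large reference values are handled.

```python
def repair_scenarios(r_normal):
    """Derive the scenario values of r from the data-derived reference value."""
    if not 0.0 <= r_normal <= 1.0:
        raise ValueError("r must be a probability")
    return {
        "Worse": min(1.0, 2.0 * r_normal),  # +100%; the cap at 1.0 is an assumption
        "Normal": r_normal,                 # reference value from the data
        "Improved": 0.5 * r_normal,         # -50%
        "Perfect": 0.0,                     # -100%: no incorrect repairs
    }


print(repair_scenarios(0.1))
```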
The work of Thijssens and Verhagen [14] showed the impact of three environmental factors on the Restricted Mean Survival Time (RMST) of components in aviation. The RMST is equal to the mean survival time, except that the RMST is restricted to within a time range to avoid the negative influences of the poorly determined right tail of a survival curve during estimation [33]. In this study, the impact of the environmental factors is directly related to the MTBF of components by the numerical factor provided in Table 1. For every aircraft considered in the analysis, the airline can be traced back via the external organisation code. In this way, the dominant environmental conditions at the main hub of the airline can be applied, and the values for the specific aircraft can be adjusted.
The study by Lowas and Ciarallo [15] provided insights into the reasons for lumpy spare part demand. The study found that the parameter with the greatest impact on the lumpiness of the demand for spare parts was the fleet size. In order to validate this finding and to extend its scope, fleet size is tested in this research as well. As the reference study clearly described the range of values selected for the fleet size, the same range of values was adopted for the model without further analysis. Finally, the effect of increasing component commonality across different aircraft types was tested. Here, it is assumed that different aircraft types perform differently, resulting in variations in the average operating time of components. A deviation of 20% is assumed. The size of the deviation itself is not crucial, as the outcome is directly compared to variants in which no different aircraft types are considered. If significant differences are observed, this will serve as a stepping stone for future research.
Table 2 summarises the parameters discussed above and used in the Monte Carlo simulation in a single overview.
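The combinations of parameters to be simulated can be enumerated as a simple grid, as sketched below. The repair quality levels and the three climates are taken from the text; the six fleet size values are hypothetical placeholders (the actual values are those of the reference study and Table 2), and the two-level commonality flag and all names are assumptions made for illustration.

```python
from itertools import product

REPAIR_QUALITY = ("Worse", "Normal", "Improved", "Perfect")
CLIMATES = ("temperate", "humid", "desert")
SHARED_COMPONENTS = (False, True)    # without / with increased commonality
FLEET_SIZES = (2, 4, 8, 16, 32, 64)  # hypothetical placeholder values


def parameter_grid():
    """All combinations of the four varied parameters."""
    return [
        {"repair_quality": q, "fleet_size": n, "climate": c, "shared": s}
        for q, n, c, s in product(
            REPAIR_QUALITY, FLEET_SIZES, CLIMATES, SHARED_COMPONENTS
        )
    ]


grid = parameter_grid()
print(len(grid), "parameter combinations")
```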
4. Results
Before the application of the proposed approach and the subsequent evaluation of the results, this section starts with a brief discussion of the case study application and the associated data characteristics.
Section 4.1 provides insights into the dataset and the manner in which input data for the model are generated.
4.1. Case Study Characteristics
In order to provide the model with the right input parameters based on the failure behaviour of aircraft components, data from an anonymous aircraft manufacturer were used. The data consist of removal data spanning across multiple decades.
For each data point, the part number, date, aircraft type, ATA chapter code (denoting the associated (sub)system), the serial number of the aircraft, and the operator are used in this research. A selection of components is made in order to limit the scope. An overview of this analysis can be found in Table 3.
This limits the scope to components in the following eight ATA chapters: 23 (Communications), 24 (Electrical Power), 27 (Flight Controls), 28 (Fuel), 29 (Hydraulic Power), 32 (Landing Gear), 34 (Navigation) and 77 (Engine Indicating). Based on the operator, the environmental conditions can be determined for every aircraft in the data set. This has a direct impact on the lifetime of the components, and therefore influences the MTBF, and hence the spawn rate of primary failures [14]. From the selected data, primary and subsidiary removals could be identified. In this analysis, a subsidiary removal is defined as a removal occurring within fourteen days of the primary removal. Here, it is assumed that components are interdependent if and only if they are located in the same ATA chapter. With this information, the spawn rate of primary removals can be determined for every ATA chapter code and location on every aircraft. Adjustments are made with respect to the environmental conditions in order to correctly quantify the effect of variations in environmental conditions. With an overview of primary and subsidiary removals, the likelihood of an incorrect repair occurring at each ATA location can be estimated by reviewing the number of primary failures that are followed by subsidiary removals. For every location, the composition of subsequent failures is reviewed. Through this, in the proposed approach, the offset of subsequent failures from a primary failure can be varied based on the distribution of the offset in the data. In the implementation, the subsidiary failures are randomly distributed over the fourteen days following the day on which the primary failure occurs.
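The identification of primary and subsidiary removals can be sketched as follows for the removal dates of one ATA chapter on one aircraft. This is a minimal sketch under assumptions: a removal within the fourteen-day window after the current primary removal (including the same day) is treated as subsidiary, any later removal starts a new primary removal, and r is estimated as the fraction of primary removals followed by at least one subsidiary removal. The function names and the example dates are hypothetical.

```python
from datetime import date


def classify_removals(dates, window_days=14):
    """Group removal dates into (primary, [subsidiary, ...]) clusters."""
    clusters = []
    for d in sorted(dates):
        if clusters and (d - clusters[-1][0]).days <= window_days:
            clusters[-1][1].append(d)  # within the window of the current primary
        else:
            clusters.append((d, []))   # starts a new primary removal
    return clusters


def estimate_r(clusters):
    """Fraction of primary removals followed by at least one subsidiary removal."""
    if not clusters:
        raise ValueError("no removals available")
    return sum(1 for _, subs in clusters if subs) / len(clusters)


# Hypothetical removal dates for one ATA chapter on one aircraft.
removals = [
    date(2020, 1, 1), date(2020, 1, 5),   # primary with one subsidiary removal
    date(2020, 3, 1),                     # isolated primary
    date(2020, 6, 1), date(2020, 6, 15),  # subsidiary exactly 14 days later
    date(2020, 6, 16),                    # 15 days after 1 June: new primary
]
clusters = classify_removals(removals)
print(len(clusters), "primary removals, estimated r =", estimate_r(clusters))
```

The offsets between each subsidiary removal and its primary removal, available from the same clusters, could serve as the empirical offset distribution mentioned above.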
Next, for all aircraft and ATA locations, a check has to be made regarding the homogeneity of the primary removal rates of the components. The results of this test are presented in Table 4. It can be concluded that the spawn rate of primary removals is constant in most cases. Therefore, a Homogeneous Poisson Process can be used to simulate these removals. Furthermore, no significant differences were found among the performances of the different aircraft represented in the data. Hence, the results could be aggregated.
The results are presented in the order of the four different variants considered in this study. Visualisations of the results are available, although only a small selection of the visualisations is provided here for ease of interpretation. In the figures, each data point represents the average demand characteristics (ADI and CV2) of a unique combination of varying parameters.
For each variant, the impact of improving the repair quality is quantified. The motivation for presenting the deviations resulting from this improvement originates from the desire of MRO providers to minimize the number of errors during the repair process and to strive for improvement. Therefore, MRO providers can use the outcomes of this study to quantify the effect of improving their repair quality on ADI, CV2 and total number of failures.
The results of the statistical tests are provided in Appendix A. The results are presented in the form of p-values of the Mann–Whitney U test and the Kruskal–Wallis H test [34, 35, 36]. Values that are not significantly different according to these tests (p > 0.05) are marked with an asterisk in the tables.
4.2. Variant 1—The Influence of Variations in Repair Quality
The visual representation of the results in Figure 4 does not directly indicate a discernible difference in performance between the different levels of repair quality. Table 5 and Table 6 provide a quantitative comparison. Here, the comparison is made between the current level of repair quality (left column) and the desired level of repair quality (top row). Each entry is the ratio of the average value of the metric at the desired level of repair quality to its average value at the current level. Table 7 provides the total number of failures for the different levels of repair quality. Here, it can be seen that improved levels of repair quality result in lower numbers of failures.
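The ratios in the comparison tables and the percentage changes quoted in the text are related by simple arithmetic, sketched below. The function names and the ADI values are hypothetical and serve only to show the calculation; they are not results of this study.

```python
from statistics import mean


def improvement_ratio(current_values, desired_values):
    """Ratio of the average metric at the desired level to that at the current level."""
    return mean(desired_values) / mean(current_values)


def as_percent_change(ratio):
    """Express a ratio as a percentage increase (positive) or decrease (negative)."""
    return (ratio - 1.0) * 100.0


# Hypothetical ADI values per simulated scenario at two repair quality levels.
adi_current = [2.0, 2.2, 1.8]
adi_desired = [2.3, 2.5, 2.1]
ratio = improvement_ratio(adi_current, adi_desired)
print(f"ratio = {ratio:.3f}, change = {as_percent_change(ratio):+.1f}%")
```

A ratio above 1 therefore corresponds to an increase in the metric when moving from the current to the desired level of repair quality, and a ratio below 1 to a decrease.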
Improving the repair quality from “Worse” to “Normal” increased the ADI by 15.1%, decreased the CV2 by 3.0% and decreased the total number of failures by 17.5%.
Improving the repair quality from “Normal” to “Improved” led to an increase in the ADI by 12.6%, an increase in the CV2 by 1.1%, and a reduction in the total number of failures by 15.6%. Hence, this improvement has a positive effect on the total number of failures, but decreases the predictability of failures over time.
Improving the repair quality from the “Improved” level to the “Perfect” level resulted in an increase in the ADI of 15.2%, a reduction in the CV2 of 3.9%, and a reduction in the total number of failures of 20.1%.
4.3. Variant 2—The Influence of Variations in Repair Quality and Fleet Size
It can be seen from the results presented in Figure 5 that an increased fleet size lowers the ADI and increases the CV2. The results of varying the repair quality for all fleet sizes are provided in Table 8.
Generally, it can be concluded that improved repair quality results in a higher ADI, a lower CV2 and a lower total number of failures. It can be seen from Table 8 that the impact of improvements in repair quality is larger for smaller fleet sizes. However, the impact on the number of failures is not strongly influenced by fleet size: the relative decrease remains roughly constant across fleet sizes.
It is interesting to note that the improvement in repair quality from “Normal” to “Improved” in most cases does not have a positive effect on CV2 (see the italicised numbers in Table 8). This can be explained by the fact that, although fewer subsequent failures occur, the variance in demand quantity increases at a higher rate than the mean value of demand quantity. An example is given in Table 9. For each level of repair quality, an overview of the number of failures at each time point is provided. Primary failures are indicated by bold numbers, and subsequent failures are given in regular text. It can be observed that the time series for the “Improved” scenario has longer periods of zero demand, resulting in a lumpier demand pattern when compared to the “Normal” and “Worse” scenarios. The CV2 rises accordingly.
4.4. Variant 3—The Influence of Variations in Repair Quality, Fleet Size and Environmental Conditions
The combination of fleet size and climate is taken into account here, and eighteen different scenarios (six fleet sizes, three environmental conditions) were generated. However, these scenarios were increasingly hard to interpret. Hence, only a quantitative overview in the form of Table 10 is provided. The table shows the influence of changing the level of repair quality on ADI and CV2. Note that the results for temperate environmental conditions were already provided in Section 4.3. Each row of the table shows the deviations resulting from the improvement named in the first column.
The main addition of Variant 3 to the study is the exploration of the effect of improving the level of repair quality for varying environmental conditions, expressed in the values for ADI, CV2 and total failures.
For desert and humid environments, patterns similar to those of temperate environmental conditions were found. The changes in ADI and CV2 were dampened as the fleet size became larger, while the relative reductions in total failures remained somewhat constant. Both the “Worse to Normal” and “Improved to Perfect” improvements performed similarly for all metrics. However, the improvement in repair quality from “Normal to Improved” saw a limited decrease in CV2. In fact, many scenarios showed an increase in the CV2. This is similar to the results of Variant 2.
By comparing the temperate and humid environmental scenarios, it can be concluded that under humid conditions, the ADI is less sensitive to the improvement in the repair quality. This results in smaller increments in ADI compared to under temperate environmental conditions. The results of the deviation in CV2 provide no clear winner, as each environmental condition outperforms the other for different values of fleet size and improvement. The relative reductions in total failures are higher for temperate environmental conditions, although the difference between the two scenarios is small.
When comparing the temperate and desert environmental scenarios, it is clear that the desert environmental conditions perform slightly better than the temperate conditions when it comes to increasing the ADI. That is to say, the increase in the ADI under the same situation is slightly less compared to the increase in the ADI under temperate environmental conditions. When comparing the deviations in CV2, no clear pattern can be found. In some cases, desert conditions outperform the temperate conditions, but the opposite occurs for the same number of scenarios. With respect to decreasing the total number of failures, desert conditions are slightly less advantageous compared to temperate environmental conditions.
Generally, the increment in ADI with improvement is the most limited under humid conditions, the performance with decreasing CV2 is similar for all environmental conditions, and the relative reduction in the total number of failures is similar for all environmental conditions, although the temperate environmental conditions perform slightly better in most cases.
4.5. Variant 4—The Influence of Variations in Repair Quality, Fleet Size, Environmental Conditions and Component Commonality Strategies
The individual results of the outcome of this variant are not presented, but are directly compared with the results of Variant 3. In this way, the impact of increasing the component commonality index can be evaluated. Hence, the results are discussed with the support of Table 11. Here, the results are presented as the difference in performance between Variant 3 and Variant 4. Therefore, if, in a certain scenario, the ADI of Variant 3 is increased by 1.0% and the ADI of Variant 4 is increased by 2.0%, the table will state a difference of +1.0%.
Although there are significant differences in some cases, most of the deviations are relatively small. Hence, the variability in spare part demand does not deteriorate with the introduction of components that can be used on multiple aircraft types.
5. Discussion
The influence of repair quality was quantified for different fleet sizes, component commonality strategies and environmental conditions by capturing the changing values of the ADI and CV2. Table 12 provides the total-effect indices obtained using Sobol’s sensitivity analysis for the different variables [37].
The total-effect index expresses the contribution of a variable to the output variance, including the contributions of its interactions with the other variables. The influence of the fleet size is dominant for the variance in the outcomes of both ADI and CV2. Therefore, it can be concluded that the fleet size is the main influencing factor for both metrics. This suggests that adjusting the fleet size will have the greatest impact on potentially lowering the ADI and CV2. However, for many reasons, expanding the fleet is not always possible. In cases where the fleet size cannot be increased, the influence of repair quality on the demand pattern becomes more dominant. This can be seen in Table 13, where the fleet size is fixed and the variance in the outcome depends on repair quality, component commonality, and environmental conditions.
A critical note has to be made regarding the values of parameters for different levels of repair quality. The results obtained from the data analysis are used as a reference scenario (i.e., the “Normal” repair quality), while the other three are based on a multiplication of this scenario. The values of the parameters for different levels of repair quality were chosen in order to conduct a thorough numerical evaluation. In practice, the difference in performance is unlikely to be of this size.
The strong assumption regarding the interdependency among components in the same ATA chapter results in the limitation of the usefulness of the outcome when it comes to the location of failures and the corresponding failure patterns. In other words, while the ATA chapter results are related to aircraft systems, and an assumption of interdependency is made, this is not necessarily true. The same ATA code can refer to multiple instances of a system on a single aircraft; for example, a failure related to ATA chapter 38, which covers waste/water systems, may in fact relate to different instances of galleys, bathrooms, wastewater tanks, etc., which are located at a number of different places in the aircraft, and may not have interdependent functionality. In addition, components could be connected with and dependent on components belonging to different ATA chapters. Without detailed ATA chapter indications (using the full six digits to describe systems at the unit level) or additional information on the location (for instance, through ATA zonal codes), conclusions with respect to failure location and their associated patterns are difficult to draw, and spare part demand can only be determined at a higher level of system aggregation. From this perspective, when aggregating the results of the failures, the locations of the components are not decisive for the outcomes of this research.
Another assumption is made in Variant 4, where the effect of the different component commonalities in a fleet is tested. Due to the lack of research and data on component commonality across heterogeneous fleets, only a rough estimation of the associated effect on component demand was performed.
As this research quantifies the impact of repair quality on the different demand metrics, repair quality is used as a varying parameter in the model. However, the influence of changes in other varying parameters might also affect the metrics. Hence, there is no proof that all changes are the result of variations in repair quality alone, and variances caused by the interaction of the different parameters should also be included.
Table 12 provides the first- and total-effect indices of the sensitivity analysis. As can be seen from the table, the differences among the first- and total-effect indices are relatively small. Hence, the influence of the interaction is limited.
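For reference, the standard definitions of the first-order and total-effect Sobol indices are given below. The notation is introduced here for illustration and does not appear in the source: Y denotes the model output (for example, the ADI or the CV2), X_i the i-th input variable, and X_{\sim i} all input variables except X_i.

```latex
S_i = \frac{\operatorname{Var}_{X_i}\!\left(\mathbb{E}_{X_{\sim i}}\left[\,Y \mid X_i\,\right]\right)}{\operatorname{Var}(Y)},
\qquad
S_{T_i} = 1 - \frac{\operatorname{Var}_{X_{\sim i}}\!\left(\mathbb{E}_{X_i}\left[\,Y \mid X_{\sim i}\,\right]\right)}{\operatorname{Var}(Y)}
```

Since S_{T_i} \geq S_i, the difference S_{T_i} - S_i captures the share of the output variance that is due to interactions involving X_i; a small difference is consistent with a limited influence of interactions.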
Another important note should be made regarding the statistical outcomes of the Kruskal–Wallis H and Mann–Whitney U tests. With the commonly chosen significance level of 0.05 (i.e., a 95% confidence level) as the threshold for rejecting the null hypothesis, p-values below 0.05 cause the null hypothesis to be rejected, and it can thus be assumed that the compared groups of data have different medians. However, the p-value is highly dependent on the number of data points in the compared groups. As the number of iterations for the simulation was set to 50, the sizes of the subsets grew by a factor of 50. Therefore, the p-values became smaller, resulting in a more frequent rejection of the null hypothesis. When reviewing only a single iteration, the p-values are higher, and the null hypothesis is rejected less often. It is, however, not an option to exclude the iterations from the model, as they stabilise the outcomes by averaging out random effects.
A final critical note can be made on the limited set of drivers for failures. As frequently stated in previous research, not all drivers of failure are known, resulting in research that includes a limited number of drivers. However, this research broadens current knowledge by including the effect of different levels of repair quality.
6. Conclusions
The impact of changing repair quality on the predictability of the failures of components was quantified. In general, the following can be established:
An improvement in repair quality induces an increase in ADI, a reduction in CV2 and a reduction in the total number of failures.
For larger fleet sizes (more than 64 aircraft of the same type), the effects of increased repair quality on the ADI and CV2 become less significant, while the effect on the total failures remains the same.
Therefore, it can be concluded that when facing larger fleets, the improvement in repair quality has a wider support base, as the downside of the implementation becomes smaller. Ironically, larger fleets have fewer problems with variability in spare part demand.
This research contributes towards a more complete understanding of the way in which drivers for component spare part demand may behave. To the best of the authors’ knowledge, this study is the first to explicitly address the influence of repair quality on the demand behaviour of components, while systematically exploring and verifying the influence of a range of additional demand drivers. This gives further insight into spare part demand behaviour under more realistic conditions, where multiple drivers may apply at the same time.