A Software Reliability Model with a Weibull Fault Detection Rate Function Subject to Operating Environments

The main focus when developing software is to improve the reliability and stability of a software system. When software systems are introduced, these systems are used in field environments that are the same as or close to those used in the development-testing environment; however, they may also be used in many different locations that may differ from the environment in which they were developed and tested. In this paper, we propose a new software reliability model that takes into account the uncertainty of operating environments. The explicit mean value function solution for the proposed model is presented. Examples are presented to illustrate the goodness-of-fit of the proposed model and several existing non-homogeneous Poisson process (NHPP) models and confidence intervals of all models based on two sets of failure data collected from software applications. The results show that the proposed model fits the data more closely than other existing NHPP models to a significant extent.


Introduction
Software systems have become an essential part of our lives.These systems are very important because they are able to ensure the provision of high-quality services to customers due to their reliability and stability.However, software development is a difficult and complex process.Therefore, the main focus of software companies is on improving the reliability and stability of a software system.This has prompted research in software reliability engineering and many software reliability growth models (SRGM) have been proposed over the past decades.Many existing non-homogeneous Poisson process (NHPP) software reliability models have been developed through the fault intensity rate function and the mean value functions m(t) within a controlled testing environment to estimate reliability metrics such as the number of residual faults, failure rate, and reliability of software.Generally, the reliability increases more quickly and later the improvement slows down.Software reliability models are used to estimate and predict the reliability, number of remaining faults, failure intensity, total software development cost, and so forth, of software.Various software reliability models and application studies have been developed to date.Discovering the confidence intervals of software reliability is done in the field of software reliability because it can enhance the decision of software releases and control the related expenditures for software testing [1].First, Yamada and Osaki [2] considered that the maximum likelihood estimates concerning the confidence interval of the mean value function can be estimated.Yin and Trivedi [3] present the confidence bounds for the model parameters via the Bayesian approach.Huang [4] also present a graph to illustrate the confidence interval of the mean value function.Gonzalez et al. [5] presented a general methodology that applied to a power distribution test system considering the effect of weather conditions and aging of components in the system reliability indexes for the analysis of repairable systems using non-homogeneous Poisson process, including several conditions in the system at the same time.Nagaraju and Fiondella [6] presented an adaptive expectation-maximization algorithm for non-homogeneous Poisson process software reliability growth models, and illustrated the steps of this adaptive approach through a detailed example, which demonstrates improved flexibility over the standard expectation-maximization (EM) algorithm.Srivastava and Mondal [7] proposed a predictive maintenance model for an N-component repairable system by integrating non-homogeneous Poisson process (NHPP) models and system availability concept, such that the use of costly predictive maintenance technology is minimized.Kim et al. [8] described application of the software reliability model of the target system to increase the software reliability, and presented some analytical methods as well as the prediction and estimation results.
Chatterjee and Singh [9] proposed a software reliability model based on NHPP that incorporates a logistic-exponential testing coverage function with imperfect debugging.In addition, Chatterjee and Shukla [10] developed a software reliability model that considers different types of faults incorporating both imperfect debugging and a change point.Yamada et al. [11] developed a software-reliability growth model incorporating the amount of test effort expended during the software testing phase.Joh et al. [12] proposed a new Weibull distribution based on vulnerability discovery model.Sagar et al. [13] presented best software reliability growth model with including feature of both Weibull distribution and inflection S-shaped SRGM to estimate the defects of software system, and provide help to researchers and software industries to develop highly reliable software products.
Generally, existing models are applied to software testing data and then used to make predictions on the software failures and reliability in the field.Here, the important point is that the test environment and operational environment are different from each other.Once software systems are introduced, the software systems used in the field environments are the same as or close to those used in the development-testing environment; however, the systems may be used in many different locations.Several researchers started applying the factor of operational environments.A few researchers, Yang and Xie, Huang et al., and Zhang et al. [14][15][16], proposed a method of predicting the fault detection rate to reflect changes in operating environments, and used methodology that modifies the software reliability model in the operating environments by introducing a calibration factor.Teng and Pham [17] discussed a generalized model that captures the uncertainty of the environment and its effects upon the software failure rate.Pham [18,19] and Chang et al. [20] developed a software reliability model incorporating the uncertainty of the system fault detection rate per unit of time subject to the operating environment.Honda et al. [21] proposed a generalized software reliability model (GSRM) based on a stochastic process and simulated developments that include uncertainties and dynamics.Pham [22] recently presented a new generalized software reliability model subject to the uncertainty of operating environments.And also, Song et al. [23] presented a new model with consideration of a three-parameter fault detection rate in the software development process, and relate it to the error detection rate function with consideration of the uncertainty of operating environments.
In this paper, we discuss a new model with consideration for the Weibull function in the software development process and relate it to the error detection rate function with consideration of the uncertainty of operating environments.We examine the goodness of fit of the fault detection rate software reliability model and other existing NHPP models based on several sets of software testing data.The explicit solution of the mean value function for the new model is derived in Section 2. Criteria for model comparisons and confidence interval for selection of the best model are discussed in Section 3. Model analysis and results are discussed in Section 4. Section 5 presents the conclusions and remarks.

A New Software Reliability Model
In this section, we propose a new NHPP software reliability model.First, we describe the NHPP software reliability model and present a solution of the new mean value function considering the new fault detection rate function against the generalized NHPP software reliability model, incorporating the uncertainty of fault detection rate per unit of time in the operating environments.

Non-Homogeneous Poisson Process Model
The software fault detection process has been widely formulated by using a counting process.A counting process, {N(t), t ≥ 0}, is said to be a non-homogeneous Poisson process with intensity function λ(t) if N(t) follows a Poisson distribution with the mean value function m(t), namely, A general framework for NHPP-based SRGM has been proposed by Pham et al. [24].They have modeled m(t) using the differential equation Solving Equation (1) makes it possible to obtain different values of m(t) using different values for a(t) and b(t), which reflects various assumptions of the software testing process.

Weibull Fault Detection Rate Function Model
A generalized NHPP model incorporating the uncertainty of operating environments can be formulated as follows [19]: where η is a random variable that represents the uncertainty of the system fault detection rate in the operating environments with a probability density function g, N is the expected number of faults that exists in the software before testing, b(t) is the fault detection rate function, which also represents the average failure rate of a fault, and m(t) is the expected number of errors detected by time t or the mean value function.We propose an NHPP software reliability model, including the uncertainty of the operating environment using Equation (2) and the following assumptions [19,23]: The solution for the mean value function m(t), where the initial condition m(0) = 0, is given by [19]: Pham [22] recently developed a generalized software reliability model incorporating the uncertainty of fault detection rate per unit of time in the operating environments where the random variable η has a generalized probability density function g with two parameters, α ≥ 0 and β ≥ 0, and the mean value function from Equation ( 3) is given by: where b(t) is the fault detection rate per fault per unit of time.
The Weibull distribution is one of the most commonly used distributions for modeling irregular data, is very easy to interpret and very useful.The Weibull distribution is a distribution that can be used instead of a normal distribution for data with a bias, and used lifetime distributions in reliability engineering.Weibull distribution has been applied in the area of reliability quality control duration, and failure time modelling.This distribution can be widely and effectively used in reliability applications because it has wide variety of shapes in its density and failure rate functions, making it useful for fitting many types of data.In the modelling software development was often described by Weibull-type curves.The discrete Weibull distribution can describe flexibility stochastic behavior of the failure occurrence times.The Weibull-based method is significantly better than the Laplacian-based rate prediction.Both logistic and Weibull distributions will result in a cumulative distribution function with an S-shaped for the lifetime software product [25,26].
In this paper, we consider a Weibull fault detection rate function b(t) to be as follows: where a and b are known as the scale and shape parameters, respectively.A Weibull fault detection rate function b(t) is decreasing for b < 1, increasing for b > 1, and constant when b = 1.We obtain a new NHPP software reliability model subject to the uncertainty of the environments, m(t), that can be used to determine the expected number of software failures detected by time t by substituting the function b(t) above into Equation (4):

Model Comparisons
In this section, we present a set of comparison criteria for best model selection, quantitatively compare the models using these comparison criteria, and obtain the confidence intervals of the NHPP software reliability model.

Criteria for Model Comparisons
Once the analytical expression for the mean value function m(t) is derived, the model parameters to be estimated in the mean value function can then be obtained with the help of a developed Matlab program based on the least-squares estimate (LSE) method.Five common criteria [27,28], namely the mean squared error (MSE), the sum absolute error (SAE), the predictive ratio risk (PRR), the predictive power (PP), and Akaike's information criterion (AIC), will be used as criteria for the model estimation of the goodness of fit and to compare the proposed model and other existing models as listed in Table 1.Table 1 summarizes the proposed model and several existing well-known NHPP models with different mean value functions.Note that models 9 and 10 in Table 1 did consider environmental uncertainty.

No.
Model m(t) The mean squared error is given by The sum absolute error is given by The predictive ratio risk and the predictive power are given as follows: To compare the all model's ability in terms of maximizing the likelihood function (MLF) while considering the degrees of freedom, Akaike's information criterion (AIC) is applied: where y i is the total number of failures observed at time t i ; m is the number of unknown parameters in the model; and m(t i ) is the estimated cumulative number of failures at t The mean squared error measures the distance of a model estimate from the actual data with the consideration of the number of observations, n, and the number of unknown parameters in the model, m.The sum absolute error is similar to the sum squared error, but the way of measuring the deviation is by the use of absolute values, and sums the absolute value of the deviation between the actual data and the estimated curve.The predictive ratio risk measures the distance of model estimates from the actual data against the model estimate.The predictive power measures the distance of model estimates from the actual data against the actual data.MSE, SAE, PRR, and PP are the criterion to measure the difference between the actual and predicted values.AIC is a measure of goodness of fit of an estimated statistical model, and considered to be a measure which can be used to rank the models, and it gives a penalty to a model with more number of parameters.For all five of these criteria-MSE, SAE, PRR, PP and AIC-the smaller the value, the closer the model fits relative to other models run on the same data set.

Estimation of the Confidence Intervals
In this section, we use Equation ( 7) to obtain the confidence intervals [27] of the software reliability models in Table 1.The confidence interval is given by m(t) ± Z α/2 m(t), (7) where Z α/2 is 100(1 − α) percentile of the standard normal distribution.

Numerical Examples
Wireless base stations provide the interface between mobile phone users and the conventional telephone network.It can take hundreds of wireless base stations to provide adequate coverage for users within a moderately sized metropolitan area.Controlling the cost of an individual base station is therefore an important objective.On the other hand, the availability of a base station is also an important consideration since wireless users expect the system availability to be comparable to the high availability they experience with the conventional telephone network.The software in this numerical example runs on an element within a wireless network switching center.Its main function includes routing voice channels and signaling messages to relevant radio resources and processing entities [35].Dataset #1, field failure data for Release 1 listed in Table 2, was reported by Jeske and Zhang [35].Release 1 included Year 2000 compatibility modifications, an operating system upgrade, and some new features pertaining to the signaling message processing.Release 1 had a life cycle of 13 months in the field.The cumulative field exposure time of the software was 167,900 system days, and a total of 115 failures were observed in the field.Table 2 shows the field failure data for Release 1 for each of the 13 months.Software failure data is available from the field for Release 1. Dataset #2, test data for Release 2 listed in Table 3, was also reported by Jeske and Zhang [35].The test data is the set of failures that were observed during a combination of feature testing and load testing.The test interval that was used in this analysis was a 36-week period between.At times, as many as 11 different base station controller frame (BCF) frames were being used in parallel to test the software.Thus, to obtain an overall number of days spent testing the software we aggregated the number of days spent testing the software on each frame.The 36 weeks of Release 2 testing accumulates 1001 days of exposure time.Dataset #2 also show the cumulative software failures and the cumulative exposure time for the software on a weekly basis during the test interval.Tables 4 and 5 summarize the results of the estimated parameters of all 11 models in Table 1 using the least-squares estimation (LSE) technique and the values of the five common criteria (MSE, SAE, PRR, PP and AIC).We obtained the five common criteria when t = 1, 2, • • • , 13 from Dataset #1 (Table 2), with exposure time (Cum.System days) from Dataset #2 (Table 3), As can be seen from Table 4, the MSE, SAE, PRR, PP and AIC values for the proposed new model are the lowest values compared to all models.We can see that the values of MSE, SAE, and AIC of the proposed new model are 11.2281, 26.5568, and 79.3459, respectively, which is significantly smaller than the value of the other software reliability models.The values of PRR and PP of the proposed new model are 0.2042, 0.1558, respectively.As can be seen from Table 5, the MSE, SAE and PRR value for the proposed new model are the lowest values, and the PP and AIC value for the proposed new model are the second lowest values compared to all models.We can see that the values of MSE and SAE of the proposed new model are 9.8789, 90.3633, respectively, which is significantly smaller than the value of the other software reliability models.The values of PRR, PP, and AIC of the proposed new model are 0.2944, 0.5159, 187.4204, respectively.The results show the difference between the actual and predicted values of the new model is smaller than the other models and the AIC value which is the measure of goodness of fit of an estimated statistical model is much smaller than the other software reliability models.
Figure 1 shows the graph of the mean value functions for all 11 models for Datasets #1 and #2, respectively.Figures 2 and 3 show that the relative error value of the software reliability model can quickly approach zero in comparison with the other models confirming its ability to provide more accurate prediction.Figures 4 and A1 show the graph of the mean value function and confidence interval each of the proposed new model for Datasets #1 and #2, respectively.Refer to the Appendix A for confidence intervals for the other software reliability models.
includes routing voice channels and signaling messages to relevant radio resources and processing entities [35].Dataset #1, field failure data for Release 1 listed in Table 2, was reported by Jeske and Zhang [35].Release 1 included Year 2000 compatibility modifications, an operating system upgrade, and some new features pertaining to the signaling message processing.Release 1 had a life cycle of 13 months in the field.The cumulative field exposure time of the software was 167,900 system days, and a total of 115 failures were observed in the field.Table 2 shows the field failure data for Release 1 for each of the 13 months.Software failure data is available from the field for Release 1. Dataset #2, test data for Release 2 listed in Table 4, was also reported by Jeske and Zhang [35].The test data is the set of failures that were observed during a combination of feature testing and load testing.The test interval that was used in this analysis was a 36-week period between.At times, as many as 11 different base station controller frame (BCF) frames were being used in parallel to test the software.Thus, to obtain an overall number of days spent testing the software we aggregated the number of days spent testing the software on each frame.The 36 weeks of Release 2 testing accumulates 1001 days of exposure time.Dataset #2 also show the cumulative software failures and the cumulative exposure time for the software on a weekly basis during the test interval.Tables 3 and 5 summarize the results of the estimated parameters of all 11 models in Table 1 using the least-squares estimation (LSE) technique and the values of the five common criteria (MSE, SAE, PRR, PP and AIC).(a) (b)   (a) (b)

Conclusions
Generally, existing models are applied to software testing data and then used to make predictions on the software failures and reliability in the field.Here, the important point is that the test environment and operational environment are different from each other.We do not know in which operating environment the software will be used.Therefore, we need to develop the software reliability model considering uncertainty of operating environment.In this paper, we discussed a new software reliability model based on a Weibull fault detection rate function of Weibull distribution, which is the most commonly used distribution for modeling irregular data subject to the uncertainty of operating environments.Tables 4 and 5 summarized the results of the estimated parameters of all 11 models in Table 1 using the LSE technique and the five common criteria (MSE, SAE, PRR, PP and AIC) value for two data sets.As can be seen from Table 4, the MSE, SAE, PRR, PP and AIC value for the proposed new model are the lowest values compared to all models.As can be seen from Table 5, the MSE, SAE and PRR value for the proposed new model are the lowest values compared to all models.As the results show the difference between the actual and predicted values of the new model is smaller than the other models, and the AIC value, which is the measure of goodness of fit of an estimated statistical model, is much smaller than the other models.Finally, we show confidence intervals of all 11 models from Dataset #1 and #2, respectively.By estimating the confidence interval, we will help to find the optimal software reliability model at different confidence levels.Future work will involve broader validation of this conclusion based on recent data sets.

( a )
The occurrence of software failures follows an NHPP.(b) Software can fail during execution, caused by faults in the software.(c) The software-failure detection rate at any time is proportional to the number of remaining faults in the software at that time.(d) When a software failure occurs, a debugging effort removes the faults immediately.(e) For each debugging effort, regardless of whether the faults are successfully removed, some new faults may be introduced into the software system.(f) The environment affects the unit failure detection rate, b(t), by multiplying by a factor η.

Table 5 .
Model parameter estimation and comparison criteria from Dataset #2.

Table 1
Relative error value of 11 models in Table1for Dataset #2.
for Dataset #1.Figure 2. Relative error value of 11 models in Table 1 for Dataset #1.Figure 2. Relative error value of 11 models in Table 1 for Dataset #1.Figure 3.

Table 1 for
Dataset #2.Relative error value of 11 models in Table1for Dataset #1.Relative error value of 11 models in Table1for Dataset #2.