Optimal Release Time and Sensitivity Analysis Using a New NHPP Software Reliability Model with Probability of Fault Removal Subject to Operating Environments

Abstract: With the latest technological developments, the software industry is at the center of the fourth industrial revolution. In today's complex and rapidly changing environment, where software applications must be developed quickly and easily, software must keep pace with rapidly changing information technology. The basic goal of software engineering is to produce high-quality software at low cost. However, because of the complexity of software systems, software development can be time consuming and expensive. Software reliability models (SRMs) are used to estimate and predict the reliability, number of remaining faults, failure intensity, total development cost, etc., of software. Additionally, it is very important to decide when, how, and at what cost to release the software to users. In this study, we propose a new nonhomogeneous Poisson process (NHPP) SRM with a fault detection rate function affected by the probability of fault removal on failure subject to operating environments, and we discuss the optimal release time and software reliability with the new NHPP SRM. The example results show a good fit to the proposed model, and we propose an optimal release time for a given change in the proposed model.


Introduction
With the latest technological developments, the software industry is at the center of the fourth industrial revolution. The fourth industrial revolution relies on new and innovative information and communication technologies, cyber-physical systems, network communications, simulation, big data analysis, and cloud computing. Software systems play a vital role in controlling key machines in large industries such as aviation, medical, defense, and energy. In today's complex and rapidly changing environment, where software applications must be developed quickly and easily, software must keep pace with rapidly changing information technology. Software systems improve solutions for immediate problems in a variety of industries and continue to offer customers convenience. Such systems should be easy to use and error-free, and should give value to their users. To create good software, functions must be implemented that exactly match user requirements while guaranteeing reliability, functionality, and performance. Software reliability, defined as the probability of failure-free operation under certain conditions and for a specific time, is one of the significant attributes of the software system development life cycle. Many software reliability models (SRMs) have been proposed to measure, predict, and ensure reliability. Moreover, the growth of reliability and the trade-off between cost expenditure and optimal release both depend on the accuracy of the established SRM. The basic goal of software engineering is to produce high-quality software at low cost. However, because of the complexity of software systems, software development can be time-consuming and expensive. Therefore, the main focus of software companies is on improving the reliability and stability of a software system. This has prompted research into software reliability engineering, and many software reliability growth models have been proposed in recent decades.
SRMs are used to estimate and predict the reliability, number of remaining faults, failure intensity, total development cost, etc., of software. Estimating reliability confidence intervals is an important task in the field of software reliability because it can support software release decisions and help control the related expenditures for software testing [1]. The purpose of many nonhomogeneous Poisson process (NHPP) software reliability models is to obtain an explicit formula for the mean value function m(t), which is applied to the software testing data and used to make predictions of software failures and reliability in field environments [2]. The Goel-Okumoto (GO) [3] model is one of the most representative SRMs. The GO model defines the mean value function m(t) and the intensity function λ(t) using an exponential distribution and estimates the reliability during mission time t by estimating the number of failures in the course of removing the remaining defects in the software. Yamada et al. [4] and Ohba [5] developed models of the software fault detection process to evaluate software reliability based on test results during software development. Since then, many models incorporating environmental changes have been developed; Teng and Pham [6] discussed a generalized model that captures the uncertainty of the environment and its effects upon the software failure rate. Recently, Inoue et al. [7] conducted software reliability modeling with uncertainty of testing environments. In addition, Li and Pham [2,8] proposed NHPP SRMs considering fault removal efficiency and error generation, and the uncertainty of operating environments with imperfect debugging and testing coverage. Song et al. [9][10][11] studied NHPP SRMs with various fault detection rates considering the uncertainty of operating environments.
Achieving the unique intended purpose of the software is more important than anything else. If a software failure occurs, such as when software that has been running well suddenly stops, it causes a huge loss to the enterprise and society. The general and specific requirements of the software are interrelated and require a variety of complex features. The cost of investing in software has increased; indeed, this phenomenon is evident in the management of many software companies. Many companies today prefer short software release cycles. A short software release cycle has both advantages and disadvantages, but the main reason for preferring one is to reduce the user's waiting time. Therefore, it is very important to decide when, how, and at what cost to release the software to users. Some early works addressed the optimal software release problem [12][13][14][15][16], and new software cost models have been developed in related works in recent years [17][18][19][20][21].
In this paper, we propose a new NHPP SRM with a fault detection rate function affected by the probability of fault removal on failure when considering operating environments, and we discuss the optimal release time and software reliability with the new NHPP SRM. The explicit solution of the mean value function for the new NHPP software reliability model is derived in Section 2. The optimal release policy is discussed in Section 3, and the criteria for model comparison, results of the model analysis, optimal release time, and software reliability are discussed in Section 4. Finally, Section 5 provides some concluding remarks.

NHPP SRM
A repairable system is a system capable of switching from an active state to a faulty state and back to an active state. Statistical analysis of repairable systems is a very important part of the reliability field because there are many more repairable products than products that are virtually non-repairable. Such repairable systems are described by replacement, maintenance, and repair models, and the choice of model may vary depending on how a failure is handled. If the time required for replacement, maintenance, and repair is negligible, that is, if it can be assumed to be zero, the failure times can be modeled as a point process. The Poisson process, one of the most important counting processes, arises when the number of occurrences in a given interval follows the Poisson distribution and the numbers of occurrences in disjoint intervals are independent of each other. The Poisson process can be classified as a homogeneous Poisson process (HPP) or an NHPP; this study deals only with the NHPP.
If the counting process {N(t), t ≥ 0} satisfies the following three conditions: N(0) = 0; the process has independent increments; and the number of failures in any interval [t1, t2] follows a Poisson distribution with mean m(t2) − m(t1), then it is called an NHPP with an intensity function λ(t). N(t) (t ≥ 0) follows a Poisson distribution with parameter m(t):

Pr{N(t) = n} = ([m(t)]^n / n!) e^(−m(t)), n = 0, 1, 2, . . . ,

where m(t) is the mean value function of the NHPP. The intensity function λ(t) is the derivative of the mean value function:

λ(t) = dm(t)/dt.
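As a quick illustration, the Poisson probability of observing n failures by time t can be computed directly from the mean value function; the exponential m(t) below is a hypothetical example chosen only for this sketch, not the proposed model:

```python
import math

def nhpp_pmf(n, m_t):
    """Pr{N(t) = n} for an NHPP whose mean value function equals m_t at time t."""
    return (m_t ** n) * math.exp(-m_t) / math.factorial(n)

# Hypothetical mean value function m(t) = 10 (1 - e^(-0.3 t))
def m(t):
    return 10.0 * (1.0 - math.exp(-0.3 * t))

# Probability of observing exactly 2 failures by t = 1
p2 = nhpp_pmf(2, m(1.0))
```

Because N(t) is Poisson with parameter m(t), the probabilities over n = 0, 1, 2, . . . sum to one for any fixed t.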
Pham et al. [22] formalized the general framework for software reliability based on the NHPP and provided expressions for the mean value function m(t) using differential equations. The mean value function m(t) of the general NHPP SRM, with different choices of a(t) and b(t) reflecting various assumptions, satisfies

dm(t)/dt = b(t)[a(t) − m(t)], (1)

with the initial condition m(t0) = m0. The general solution of (1) is

m(t) = e^(−B(t)) [m0 + ∫_{t0}^{t} a(s) b(s) e^(B(s)) ds], (2)

where B(t) = ∫_{t0}^{t} b(s) ds and m(t0) = m0 is the marginal condition of (2). A generalized NHPP SRM that incorporates uncertainty in the operating environment is formulated as follows [23]:

dm(t)/dt = η b(t)[N − m(t)], (3)

where η is a random variable that represents the uncertainty of the system fault detection rate in the operating environment with a probability density function g; b(t) is the fault detection rate function, which also represents the average failure rate caused by faults; and N is the expected number of faults that exist in the software before testing. Thus, the generalized mean value function m(t), with the initial condition m(0) = 0, is given by

m(t) = N [1 − ∫_η e^(−η ∫_{0}^{t} b(s) ds) dg(η)]. (4)

The mean value function [24] obtained from (4), when the random variable η has a generalized probability density function g with two parameters α ≥ 0 and β ≥ 0, is given by

m(t) = N [1 − (β / (β + ∫_{0}^{t} b(s) ds))^α], (5)

where b(t) is the fault detection rate per fault per unit time.
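Equation (5) can be evaluated numerically for any fault detection rate function b(t) by integrating b over [0, t]. The constant b(t) and the parameter values below are illustrative only:

```python
import math

def B(t, b, n_steps=1000):
    """Numerically integrate the fault detection rate b(s) over [0, t] (trapezoid rule)."""
    h = t / n_steps
    s = 0.5 * (b(0.0) + b(t))
    for i in range(1, n_steps):
        s += b(i * h)
    return s * h

def mean_value(t, N, alpha, beta, b):
    """m(t) = N [1 - (beta / (beta + B(t)))^alpha], i.e., Equation (5) above."""
    return N * (1.0 - (beta / (beta + B(t, b))) ** alpha)

# Hypothetical constant fault detection rate b(t) = 0.1
b = lambda t: 0.1
m10 = mean_value(10.0, N=100, alpha=2.0, beta=5.0, b=b)
```

For a constant rate b(t) = 0.1, B(10) = 1, so m(10) = 100 [1 − (5/6)^2] ≈ 30.56, and m(0) = 0 as required by the initial condition.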

New NHPP SRM
We propose a new NHPP SRM using Equations (3)-(5) and add the following assumption to those of the existing NHPP SRMs subject to operating environments [9][10][11]:

• The fault detection rate function will be affected by the probability of fault removal on a failure.
In this study, we consider an S-shaped fault detection rate function b(t) that can capture the learning process of the software testers/developers and that is affected by the probability of fault removal on a failure. Substituting this b(t) into Equation (5) yields a new NHPP SRM with a fault detection rate function affected by the probability of fault removal on failure subject to the uncertainty of the environments, whose mean value function m(t) gives the expected number of software failures detected by time t. Table 1 summarizes the model types and the different mean value functions and intensity functions of the proposed new model and other existing NHPP models. Models 10 through 13 take into account the uncertainty of the operating environment.

Optimal Software Release
Figure 1 describes the basic infrastructure of the software cost, from the testing environment to the end of the software field environment. As can be seen from Figure 1, we can construct a basic cost model comprising the software testing and operating costs, the software fault removal cost, and the risk cost when the software is released.

The expected total software cost EC(T) can be expressed as

EC(T) = C1 T + C2 m(T) + C3 (1 − R(x|T)) + C4 [m(T + x) − m(T)],

where C1 T is the cost of testing; C2 m(T) and C4 [m(T + x) − m(T)] are the costs of removing all errors detected by time T during the testing phase and during the operating phase, respectively; and C3 (1 − R(x|T)) is the risk cost owing to failures that occur after the system release time T. We aim to find the optimal software release time, T*, that satisfies the basic reliability requirement with the minimum total software cost:

Minimize EC(T) subject to R(x|T) ≥ R0.
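The cost model and the constrained release-time search can be sketched numerically. The mean value function, cost coefficients, mission time x, and reliability requirement below are hypothetical stand-ins (a Goel-Okumoto form is used in place of the proposed model), and R(x|T) = exp(−[m(T + x) − m(T)]) is the usual NHPP reliability function:

```python
import math

# Hypothetical Goel-Okumoto mean value function standing in for the proposed m(t)
a, b = 200.0, 0.05
def m(t):
    return a * (1.0 - math.exp(-b * t))

def R(x, T):
    """Software reliability: probability of no failure in (T, T + x]."""
    return math.exp(-(m(T + x) - m(T)))

def EC(T, x, C1, C2, C3, C4):
    """Expected total cost: testing + fault removal (testing and field) + risk."""
    return C1 * T + C2 * m(T) + C3 * (1.0 - R(x, T)) + C4 * (m(T + x) - m(T))

# Grid search for T* subject to R(x|T) >= R0 (illustrative coefficient values)
x, R0 = 9.0, 0.85
C1, C2, C3, C4 = 5.0, 100.0, 2000.0, 300.0
candidates = [T / 10 for T in range(1, 2001)]
feasible = [T for T in candidates if R(x, T) >= R0]
T_star = min(feasible, key=lambda T: EC(T, x, C1, C2, C3, C4))
```

A finer grid or a numerical optimizer would refine T*, but the structure of the problem (minimize EC subject to a reliability floor) is as above.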

Criteria for Model Comparison
The parameters in the mean value function m(t) of the models can be estimated using various parameter estimation methods. Here, the parameters of the mean value function m(t) are estimated by the LSE (least squares estimation) method. Eight common criteria for model comparison, i.e., MSE (mean squared error), RMSE (root mean squared error), AIC (Akaike's information criterion), R2 (correlation index of the regression curve equation), Adj R2 (adjusted R2), SAE (sum of absolute errors), PRR (predictive ratio risk), and PP (predictive power), will be used for the goodness-of-fit estimation of the model and to compare the proposed model with the other models in Table 1. These criteria are described in Table 2. For six of these criteria, i.e., MSE, RMSE, AIC, SAE, PRR, and PP, the smaller the value, the better the model fits relative to the other models. R2 and Adj R2 values should be close to 1 for a good fit. Table 2. List of criteria for model comparisons.

In Table 2, m̂(t_i) is the estimated cumulative number of failures at time t_i for i = 1, 2, · · · , n; y_i is the total number of failures observed at time t_i; n is the total number of observations; and m is the number of unknown parameters in the model. We use the following Equation (9) to obtain the confidence intervals of the NHPP SRMs in Table 1:

m̂(t) ± Z_(α/2) √(m̂(t)), (9)
where Z_(α/2) is the percentile of the standard normal distribution and α is the significance level [30].
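As a minimal sketch, the comparison criteria and the confidence limits can be computed from observed and fitted cumulative failure counts. The function names and the example usage are illustrative; AIC is omitted because it depends on the chosen likelihood, and the criterion forms used here are the standard ones (e.g., MSE as the sum of squared errors divided by n − m), which should be confirmed against Table 2:

```python
import math

def fit_criteria(y, y_hat, n_params):
    """Goodness-of-fit criteria (standard forms) from observed y and fitted y_hat."""
    n = len(y)
    resid = [yi - yh for yi, yh in zip(y, y_hat)]
    sse = sum(r * r for r in resid)
    mse = sse / (n - n_params)          # degrees-of-freedom-corrected MSE
    rmse = math.sqrt(mse)
    sae = sum(abs(r) for r in resid)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    prr = sum(((yh - yi) / yh) ** 2 for yi, yh in zip(y, y_hat))  # predictive ratio risk
    pp = sum(((yh - yi) / yi) ** 2 for yi, yh in zip(y, y_hat))   # predictive power
    return {"MSE": mse, "RMSE": rmse, "SAE": sae, "R2": r2,
            "AdjR2": adj_r2, "PRR": prr, "PP": pp}

def confidence_limits(m_hat, z=2.576):
    """Equation (9): m_hat +/- Z_(alpha/2) sqrt(m_hat); z = 2.576 gives 99% limits."""
    half = z * math.sqrt(m_hat)
    return m_hat - half, m_hat + half
```

For example, `fit_criteria([10, 20, 28, 35], [11.0, 19.0, 29.0, 34.0], 2)` returns the criteria for a hypothetical two-parameter fit over four observations.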

Model Analysis
Dataset #1, presented in Table 3, was reported by [31]. The failure data come from two releases of a large medical record system (LMRS). The week index is from 1 week to 18 weeks, and there are 176 cumulative failures at 18 weeks in Dataset #1. In Dataset #2, the week index is from 1 week to 17 weeks, and there are 204 cumulative failures at 17 weeks. Detailed information can be seen in [31].
Dataset #3, given in Table 4, was reported by [32] based on system test data for a telecommunication system (TS dataset). In Dataset #3, the week index is from 1 week to 21 weeks, and there are 43 cumulative failures at 21 weeks. Detailed information can be seen in [32].

Table 4. System test data for a telecommunication system (TS)-Dataset #3.

Week Index    Exposure Time (Cumulative System Test Hours)    Failures    Cumulative Failures
1             416                                             3           3
2             832                                             1           4
3             1248                                            0           4
4             1664                                            3           7
5             2080                                            2           9
6             2496                                            0           9
7             2912                                            1           10
8             3328                                            3           13
9             3744                                            4           17
10            4160                                            2           19
11            4576                                            4           23
12            4992                                            2           25
13            5408                                            5           30
14            5824                                            2           32
15            6240                                            4           36
16            6656                                            1           37
17            7072                                            2           39
18            7488                                            0           39
19            7904                                            0           39
20            8320                                            3           42
21            8736                                            1           43

We obtained the estimated parameters and the eight common criteria in Table 2 for all 13 models at t = 1, 2, 3, · · · , 18 from Dataset #1, at t = 1, 2, 3, · · · , 17 from Dataset #2, and at t = 1, 2, 3, · · · , 21 from Dataset #3. We used the LSE method for parameter estimation. Table 5 shows the estimated parameters, and Tables 6-8 show the values of the eight common criteria for all 13 models. The closer the values of MSE, RMSE, AIC, SAE, PRR, and PP are to 0, and the closer the values of R2 and Adj R2 are to 1, the better the fit. Table 6 shows that for the proposed model, the values of MSE, RMSE, AIC, SAE, and PP are 93.2910, 9.6587, 182.2178, 99.6157, and 0.9340, respectively, which are lower than those of the other models; in addition, the values of R2 and Adj R2 are 0.9836 and 0.9726, respectively, which are higher than those of the other models. In Table 7, the values of MSE, RMSE, AIC, SAE, PRR, and PP for the proposed model are 25.2102, 5.0210, 131.1529, 56.3823, 0.0073, and 0.0072, respectively, which are lower than those of the other models, and the values of R2 and Adj R2 are 0.9872 and 0.9773, respectively, which are higher than those of the other models. In Table 8, the values of MSE, RMSE, SAE, PRR, and PP for the proposed model are 1.1626, 1.0783, 15.7154, 0.2447, and 0.1851, respectively, which are lower than those of the other models, and the values of R2 and Adj R2 are 0.9964 and 0.9944, respectively, which are higher than those of the other models.
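As an illustration of the LSE step, a simple Goel-Okumoto model (standing in for the proposed model, whose estimated parameters are reported in Table 5) can be fitted to the Dataset #3 cumulative failures by a coarse grid search; a production analysis would use a proper nonlinear least-squares routine:

```python
import math

# Cumulative failures for Dataset #3 (TS data), weeks 1..21, from Table 4
weeks = list(range(1, 22))
cum_failures = [3, 4, 4, 7, 9, 9, 10, 13, 17, 19, 23, 25, 30, 32, 36, 37,
                39, 39, 39, 42, 43]

def go_m(t, a, b):
    """Goel-Okumoto mean value function, used here only as a simple example SRM."""
    return a * (1.0 - math.exp(-b * t))

def sse(a, b):
    """Sum of squared errors between observed and fitted cumulative failures."""
    return sum((y - go_m(t, a, b)) ** 2 for t, y in zip(weeks, cum_failures))

# Coarse grid search over (a, b) minimizing the SSE
best = min(((a, b) for a in range(40, 201, 2)
            for b in [i / 1000 for i in range(10, 301, 5)]),
           key=lambda ab: sse(*ab))
a_hat, b_hat = best
```

The fitted pair (a_hat, b_hat) minimizes the SSE over the grid; the same data would be passed to each candidate model's mean value function when building Tables 5-8.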
As can be seen from the results, the proposed model is the most suitable when comparing the common criteria with the other models. Figures 2-4 show graphs of the mean value functions of all 13 models for Datasets #1, #2, and #3, respectively. Figures 5-7 show graphs of the 95% and 99% confidence limits of the proposed model for Datasets #1, #2, and #3. Figures 8-10 show graphs of the relative errors of all 13 models for Datasets #1, #2, and #3, respectively; the proposed model provides a more accurate prediction because its relative error is closer to zero. Figures 11-13 compare the RMSE and AIC of all 13 models for Datasets #1, #2, and #3, showing that the values of the proposed model are closest to zero. In addition, Tables A1-A3 in Appendix A list the 99% confidence intervals of all 13 models for Datasets #1, #2, and #3.

We consider the coefficients in the cost model for the baseline case. As shown in Table 9, the optimal release time T* with the expected minimum total cost of 19,157.92 is 102.6. The software reliability at T* = 102.6 is 0.8564, which is larger than R0 = 0.85 for the baseline case. In addition, we compared several operating periods and coefficients to study the impact of the operating period and the various coefficients on the optimal release time with the expected minimum total cost and the software reliability. First, we examine the impact of the operating period x on the optimal release time with the expected minimum total cost by changing the value of x and comparing the resulting optimal release times and software reliability values. We change the operating period x from 9 weeks to 12, 18, and 21 weeks. In Table 9, when x = 9, the optimal release time T* is 88.0, the expected minimum total cost is 19,024.95, and the software reliability is 0.8729. When x = 12, the optimal release time T* is 96.1, the expected minimum total cost is 19,098.35, and the software reliability is 0.8641. When x = 18, the optimal release time T* is 108.1, the expected minimum total cost is 19,211.82, and the software reliability is 0.8496. When x = 21, the optimal release time T* is 112.7, the expected minimum total cost is 19,261.71, and the software reliability is 0.8431. Figure 14 shows graphs of the expected total cost and the software reliability subject to the operating period, and Figure 15 shows graphs of the expected total cost for different operating periods. As shown in Table 9, as the operating period increases, the optimal release time also increases, and the software reliability decreases.
Second, we examine the impact of the cost coefficients C1, C2, C3, and C4 on the optimal release time with the expected minimum total cost and the software reliability by changing their values. When C1 = 2.5, the optimal release time T* is 129.1, the expected minimum total cost is 18,907.92, and the software reliability is 0.9074. When C1 = 7.5, the optimal release time T* is 89.2, the expected minimum total cost is 195,395.72, and the software reliability is 0.8135. When C2 = 80, the optimal release time T* is 104.3, the expected minimum total cost is 15,496.43, and the software reliability is 0.8608. When C2 = 120, the optimal release time T* is 100.9, the expected minimum total cost is 22,819.4, and the software reliability is 0.8518. When C3 = 1500, the optimal release time T* is 92.6, the expected minimum total cost is 19,078.56, and the software reliability is 0.8260. When C3 = 2500, the optimal release time T* is 111.2, the expected minimum total cost is 19,233.28, and the software reliability is 0.8767. When C4 = 200, the optimal release time T* is 100.4, the expected minimum total cost is 19,141.58, and the software reliability is 0.8504. When C4 = 400, the optimal release time T* is 104.7, the expected minimum total cost is 19,174.25, and the software reliability is 0.8618. As shown in Table 10, as the values of C1 and C2 increase, the optimal release time decreases and the software reliability decreases. On the contrary, as the values of C3 and C4 increase, the optimal release time increases and the software reliability increases.

Figure 15. Expected total cost for the baseline case.

Sensitivity Analysis
A sensitivity analysis examines the effect of a parameter change by substituting all the possible values of the parameter when it is uncertain in a model. We conducted a sensitivity analysis on the parameters of the proposed model. The sensitivity with respect to a parameter P can be computed as the relative change

S_T = (T*_P − T*) / T*, S_R = (R_P − R) / R,

where S_T is the sensitivity of the optimal release time, S_R is the sensitivity of the software reliability, T*_P and R_P are the values obtained after changing the parameter, and P is the parameter of the proposed model. Here, the relative changes in the parameters are set to 0%, ±10%, ±20%, and ±30%. As shown in Table 11, the variation in the parameter p is greater than that in the other parameters for the sensitivity of the optimal release time. In Table 12, the variations in the parameters a and p are greater than those in the other parameters for the sensitivity of the software reliability. Figure 16 shows the graphs of the variations in the parameters of the optimal release time.

Figure 16. Variation in parameters of the optimal release time.
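A minimal sketch of the sensitivity computation, assuming the relative-change form of S_T and S_R described above; the baseline and perturbed optimal release times below are hypothetical values used only to show the calculation:

```python
def sensitivity(baseline, perturbed):
    """Relative change used for the sensitivity measures S_T and S_R (assumed form)."""
    return (perturbed - baseline) / baseline

# Hypothetical baseline T* and the T* obtained after perturbing a parameter by +10%
T_star, T_star_p = 102.6, 95.3
S_T = sensitivity(T_star, T_star_p)
```

In practice, T*_P and R_P are recomputed from the cost model after shifting each parameter by 0%, ±10%, ±20%, and ±30%, and the resulting S_T and S_R are tabulated as in Tables 11 and 12.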


Conclusions
Many NHPP SRMs have been developed for a controlled (testing) environment, whereas the environments in which software runs are very diverse and complex. Moreover, it is very important to decide when, how, and at what cost to release software to users. We proposed a new NHPP SRM with a fault detection rate function affected by the probability of fault removal on failure when considering operating environments, and we discussed the optimal release time and software reliability with the new NHPP SRM. A comparison of eight common criteria through numerical examples shows that the proposed NHPP SRM fits better than the other NHPP SRMs. Additionally, we analyzed the software reliability, the optimal release time, and the expected total cost with respect to changes in the operating period and the cost coefficients. A sensitivity analysis was performed to examine the uncertainty of the parameters of the proposed model. As a result, it was confirmed that the variation in p, the parameter of the fault detection rate function, is the largest.

Conflicts of Interest:
The authors declare no conflict of interest.