Article

A Software Reliability Model Considering the Syntax Error in Uncertainty Environment, Optimal Release Time, and Sensitivity Analysis

1 Department of Computer Science and Statistics, Chosun University, 309 Pilmun-daero, Dong-gu, Gwangju 61452, Korea
2 Department of Industrial and Systems Engineering, Rutgers University, 96 Frelinghuysen Road, Piscataway, NJ 08855-8018, USA
* Authors to whom correspondence should be addressed.
Appl. Sci. 2018, 8(9), 1483; https://doi.org/10.3390/app8091483
Submission received: 27 July 2018 / Revised: 24 August 2018 / Accepted: 25 August 2018 / Published: 28 August 2018

Abstract

The goal of software developers is to produce high-quality, reliable software products. Over the past decades, software has become more complex, which makes stable products harder to develop. Because software failures often cause serious social or economic losses, software reliability is considered important. Software reliability growth models (SRGMs) have been used to estimate software reliability. In this work, we introduce a new software reliability model and compare it with several non-homogeneous Poisson process (NHPP) models. In addition, we compare the goodness of fit of existing SRGMs on actual data sets using eight criteria. The results allow us to determine which model is optimal.

1. Introduction

The basic goal of software developers is to produce high-quality software products that are stable and reliable. As technology has advanced, consumers have come to demand more functionality. For this reason, software structure has become more complex over the last few decades, which makes it difficult to produce software of high quality and stable reliability [1]. Software reliability growth models (SRGMs) have been used by researchers to estimate software reliability; their goodness of fit is judged by common criteria, which are discussed in Section 3. Many SRGMs have been developed during the past decades, and most are based on a non-homogeneous Poisson process (NHPP). These models assume that the expected cumulative number of failures by time t follows a mean value function m(t) based on an NHPP. Each SRGM is distinguished by its particular m(t), which reflects different environments through assumptions about its parameters. After developing a software reliability model, we fit it to actual data and assess its goodness of fit. Once the model that best fits the actual data is determined, it is possible to estimate the optimal release time and the minimum expected development cost, thereby establishing a release policy. Thus, it is crucial to develop a model that not only reflects diverse environmental factors but also provides the best goodness of fit for actual data sets.
Goel and Okumoto [2] developed an exponential model that has served as a basic framework for extending SRGMs. Yamada et al. proposed the delayed S-shaped NHPP model [3] and SRGMs incorporating testing effort [4]. Quadri et al. [5] and Ahmad et al. [6] extended SRGMs with the exponentiated Weibull distribution to account for testing effort. Pham et al. [7] proposed the Pham–Zhang model, which uses the inflection S-shaped model as its fault detection function. Pham and Zhang [8] proposed a generalized NHPP software testing coverage model. Several studies have focused on factors of the development environment. Teng and Pham [9] dealt with software reliability models whose parameters represent random field environments. Pham [10] discussed a new model that incorporates the uncertainty of the system fault detection rate per unit of time under operating environments. Inoue et al. [11] modeled software reliability considering the uncertainty of the testing-environment factor. Li and Pham [12] developed a new NHPP testing coverage model that considers both error generation and fault removal efficiency. Song et al. [13,14,15] considered operating environments, applying a random variable to the software reliability model to represent the uncertainty in the operating environment. Zhu and Pham [16] defined two types of software faults to model software fault dependency and imperfect fault removal. Zhu and Pham [17] described a software reliability model with gamma-distributed environmental factors, in which the fault detection process is stochastic because of the randomness caused by those factors. As shown above, many software reliability models have been proposed. Several papers have also approached the problem from a statistical-method perspective. Zeephongsekul et al. [18] applied maximum-likelihood estimation of parameters. Candini and Gioletta [19] used a Bayesian Monte Carlo method to estimate small failure probabilities under uncertainty. Meta learning and deep learning have recently attracted attention. Caiuta et al. [20] applied meta-learning algorithms to the selection of software reliability models. Tamura et al. [21] performed simulations to select the optimal software reliability model based on deep learning. Tamura et al. [22] and Wang et al. [23] studied the prediction of the number of software failures with deep learning models. Kim et al. [24] described an application of software reliability models to improve software reliability, introducing analytical methods as well as prediction and estimation results.
In this paper, we propose a new NHPP SRGM based on the Weibull distribution that accounts for syntax errors in the testing time. We then discuss the optimal release time and a sensitivity analysis using the proposed model. In Section 2, the mean value function of the proposed model is presented along with common NHPP SRGMs. Section 3 presents eight criteria for assessing goodness of fit. Section 4 determines the best model under these criteria using actual data sets. Section 5 examines the software release policy, and Section 6 deals with the sensitivity analysis of the parameters affecting the release policy. Finally, conclusions are presented in Section 7, and future research topics are suggested in Section 8.

2. Proposed Software Reliability Model

2.1. Non Homogeneous Poisson Process Model

General software reliability models follow the NHPP:

Pr{N(t) = n} = ([m(t)]^n / n!) e^{−m(t)},  n = 0, 1, 2, …,  (1)

where m(t) is the mean value function, i.e., the expected number of failures detected by testing time t.
It can be written as:

m(t) = ∫_0^t λ(s) ds,  (2)

where λ(t) is the failure intensity function.
Most NHPP SRGMs are expressed using the differential equation:

dm(t)/dt = b(t)[a(t) − m(t)].  (3)

Solving Equation (3) for given a(t) and b(t) yields a unique m(t); different choices of a(t) and b(t) correspond to different assumptions about the software testing process.
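As a sketch of how Equation (3) determines m(t), the snippet below (the function name is ours, illustrative only) integrates dm/dt = b(t)[a(t) − m(t)] with a simple Euler scheme and checks it against the closed form a(1 − e^{−bt}) of the Goel–Okumoto model, which is recovered when a(t) = a and b(t) = b are constant:

```python
import numpy as np

def solve_mean_value(a_fn, b_fn, t_end, steps=200000):
    """Euler integration of dm/dt = b(t) * (a(t) - m(t)) with m(0) = 0."""
    dt = t_end / steps
    m, t = 0.0, 0.0
    for _ in range(steps):
        m += b_fn(t) * (a_fn(t) - m) * dt
        t += dt
    return m

# Constant a(t) = a and b(t) = b recover the Goel-Okumoto model a(1 - e^{-bt}).
a, b, t_end = 100.0, 0.1, 5.0
numeric = solve_mean_value(lambda t: a, lambda t: b, t_end)
closed = a * (1.0 - np.exp(-b * t_end))
```

The same solver accepts time-dependent a(t) and b(t), which is how different SRGMs arise from Equation (3).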

2.2. Proposed Software Reliability Model

The proposed model is also based on the NHPP. Its mean value function is similar to that of the testing coverage model, but there are some differences.
The key point is that the proposed model considers an uncertain environment. The actual testing environment of software differs considerably from the theoretical one: in practice, the developer may encounter unexpected obstacles such as syntax errors, and the proposed model takes this into account.
Figure 1 shows the structure of the testing time. The generalized testing time of SRGMs is 0 < t. However, if the software to be tested contains a syntax error, it cannot be compiled and tested; the code must first be modified to remove the syntax errors before testing can continue. Thus, the actual testing time differs from the theoretical testing time, and this is the point the proposed model addresses. The testing time of the proposed model accounts for syntax errors, i.e., errors in the syntax or tokens written in a programming language, so the testing time of the proposed model is t_0 < t.
The mean value function m(t) of the proposed model is based on the Weibull distribution model [14]; it gives the expected number of software failures by time t and is defined by:

dm(t)/dt = η b(t)[N − m(t)],  (4)

where b(t) is the failure detection rate function, N is the expected number of failures existing in the software before the testing phase, and η is a random variable that represents the uncertainty of the system fault detection rate in the operating environments. For a given η, solving Equation (4) gives:

m(t) = N(1 − e^{−η ∫_{t0}^{t} b(s) ds}),  t > t0 > 0.  (5)

Taking the expectation over η, assumed to follow a gamma distribution with parameters α and β as in [13,14,15], yields:

m(t) = N(1 − (β / (β + ∫_{t0}^{t} b(s) ds))^α),  (6)

where t0 is the time when debugging starts after modifying the code that causes syntax errors.
The b(t) of the proposed model follows a Weibull failure detection rate function, which can be written as:

b(t) = abt^{b−1},  a, b > 0,  (7)

where a is a known scale parameter and b is a known shape parameter.
We then obtain m(t) for the proposed model with syntax error by substituting the b(t) above into Equation (6). Finally, the mean value function is:

m(t) = N(1 − (β / (β + a(t − t0)^b))^α).  (8)
Table 1 lists the mean value functions m(t) of existing, well-known NHPP software reliability models.
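The proposed mean value function is straightforward to evaluate. The sketch below (our own helper, with illustrative rather than fitted parameter values) implements m(t) = N(1 − (β/(β + a(t − t0)^b))^α) and can be used to verify its basic properties: m(t0) = 0, m(t) is increasing, and m(t) approaches N as t grows:

```python
import numpy as np

def m_proposed(t, N, a, b, alpha, beta, t0):
    """Mean value function of the proposed model, defined for t >= t0:
    m(t) = N * (1 - (beta / (beta + a * (t - t0)**b)) ** alpha)."""
    t = np.asarray(t, dtype=float)
    return N * (1.0 - (beta / (beta + a * (t - t0) ** b)) ** alpha)

# Illustrative (not fitted) parameter values:
vals = m_proposed(np.array([1.0, 2.0, 5.0, 50.0]),
                  N=30.0, a=0.5, b=1.2, alpha=2.0, beta=4.0, t0=1.0)
```

Here m(t0) = 0 reflects that no failures are counted before debugging starts at t0, and N bounds the expected number of failures.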

3. Numerical Example

To assess how well the proposed model performs, we compare it with other NHPP SRGMs by evaluating the criteria on actual data sets.

3.1. Criteria

In this section, we compare the NHPP SRGMs using the two data sets (in Section 3.2) and discuss the goodness of fit of the models. Eight criteria—MSE, PRR, PP, R 2 , AIC, SAE, PRV, and RMSPE—are used for the comparison of goodness of fit.
The first criterion is the mean squared error (MSE), which measures the distance of the model estimates from the actual data, taking into account the number of observations n and the number N of parameters in the model. It is defined as:

MSE = Σ_{i=1}^{n} (m̂(t_i) − y_i)² / (n − N),

where y_i is the total number of failures observed at time t_i according to the real data and m̂(t_i) is the estimated cumulative number of failures at time t_i, for i = 1, 2, …, n.
The second criterion, the predictive ratio risk (PRR) [31], measures the distance of the model estimates from the actual data relative to the model estimates. It is defined by:

PRR = Σ_{i=1}^{n} ((m̂(t_i) − y_i) / m̂(t_i))².
The third criterion, the predictive power (PP) [31], measures the distance of the model estimates from the actual data relative to the actual data. It is defined by:

PP = Σ_{i=1}^{n} ((m̂(t_i) − y_i) / y_i)².
The fourth criterion is R-square (R²) [32], which examines the fitting power of the SRGMs; it is the correlation index of the regression curve equation, expressed as:

R² = 1 − Σ_{i=1}^{n} (y_i − m̂(t_i))² / Σ_{i=1}^{n} (y_i − ȳ)².
The fifth criterion is Akaike's information criterion (AIC) [33], which compares models by the maximized likelihood while penalizing the number of parameters; it can be regarded as an approximate distance from the true probability model:

AIC = −2 ln L + 2N,

where N is the number of parameters (degrees of freedom) in the model. L and ln L are given as follows:

L = Π_{i=1}^{n} [(m(t_i) − m(t_{i−1}))^{y_i − y_{i−1}} / (y_i − y_{i−1})!] e^{−(m(t_i) − m(t_{i−1}))},

ln L = Σ_{i=1}^{n} {(y_i − y_{i−1}) ln(m(t_i) − m(t_{i−1})) − (m(t_i) − m(t_{i−1})) − ln((y_i − y_{i−1})!)}.
The sixth criterion is the sum of absolute errors (SAE) [14], which measures the distance between the predicted and observed numbers of failures. SAE is defined by:

SAE = Σ_{i=1}^{n} |m̂(t_i) − y_i|.
The seventh criterion is the predicted relative variation (PRV), also called the variance [34,35,36]. It is the standard deviation of the prediction bias and is defined as:

PRV = sqrt( Σ_{i=1}^{n} ((m̂(t_i) − y_i) − Bias)² / (n − 1) ),

where the bias is given by:

Bias = Σ_{i=1}^{n} (m̂(t_i) − y_i) / n.
The last criterion is the root mean square prediction error (RMSPE), which estimates how closely the model predicts the observations [34,35,36]:

RMSPE = sqrt(PRV² + Bias²).
For all of these criteria except R², a smaller value indicates a better fit of the model; conversely, the larger the value of R², the better the goodness of fit.
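The criteria above can be computed directly from the fitted values m̂(t_i) and the observations y_i. The following sketch (the function names are ours) implements MSE, PRR, PP, R², SAE, PRV, RMSPE, and AIC as defined in this section:

```python
import numpy as np
from math import lgamma

def fit_criteria(m_hat, y, n_params):
    """MSE, PRR, PP, R2, SAE, PRV and RMSPE from estimates and observations."""
    m_hat = np.asarray(m_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    resid = m_hat - y
    bias = resid.mean()
    prv = np.sqrt(np.sum((resid - bias) ** 2) / (n - 1))
    return {
        "MSE": np.sum(resid ** 2) / (n - n_params),
        "PRR": np.sum((resid / m_hat) ** 2),
        "PP": np.sum((resid / y) ** 2),
        "R2": 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2),
        "SAE": np.sum(np.abs(resid)),
        "PRV": prv,
        "RMSPE": np.sqrt(prv ** 2 + bias ** 2),
    }

def aic(m_hat, y, n_params):
    """AIC = -2 ln L + 2 N, with ln L built from the NHPP increments."""
    dm = np.diff(np.concatenate(([0.0], np.asarray(m_hat, dtype=float))))
    dy = np.diff(np.concatenate(([0.0], np.asarray(y, dtype=float))))
    lnL = np.sum(dy * np.log(dm) - dm
                 - np.array([lgamma(k + 1.0) for k in dy]))  # ln((dy)!)
    return -2.0 * lnL + 2.0 * n_params
```

Given any fitted m(t) evaluated at the observation times, these functions reproduce the comparison tables of Section 4.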

3.2. Data Sets Information

Table 2 and Table 3 present the cumulative failures of each data set [37]. Data set 1, collected from software based on code from a product with enhancements provided with a new hardware platform, was observed during 13 months in the field; the failures were collected over 58,633 system-days. In this work, t_i is the cumulative real time, so t_1, t_2, …, t_13 = {1249, 4721, …, 58,633}. Data set 2 contains test data collected from a product with a high rate of wireless data service, during a combination of feature testing and load testing. With t_i = {1, 2, …, 19}, it comprises 19 observations of the field failure process and 22 total failures. Detailed information can be found in [37].
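For reference, the two data sets can be encoded as arrays (values transcribed from Table 2 and Table 3):

```python
import numpy as np

# Data set 1 (Table 2): cumulative test time (system-days) and cumulative failures.
t1 = np.array([1249, 4721, 8786, 13669, 19094, 24750, 32299,
               40594, 49476, 55596, 58061, 58588, 58633], dtype=float)
y1 = np.array([4, 10, 14, 17, 23, 24, 26, 30, 31, 31, 32, 33, 33], dtype=float)

# Data set 2 (Table 3): 19 observation periods, 22 total failures.
t2 = np.arange(1, 20, dtype=float)
y2 = np.array([1, 2, 4, 5, 6, 7, 9, 10, 11, 12,
               14, 15, 16, 18, 20, 21, 22, 22, 22], dtype=float)
```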

4. Results

4.1. Comparison of Goodness of Fit

We estimated the parameters of all nine models at t_1, t_2, …, t_13 = {1249, 4721, …, 58,633} for data set 1 and at t_1, t_2, …, t_19 = {1, 2, …, 19} for data set 2, using the least squares estimation (LSE) method with Matlab and R. Table 4 summarizes the estimated parameters for all the SRGMs in Table 1. Table 5 and Table 6 compare all the models for both data sets using the criteria presented in Section 3. As noted above, smaller values are better for all criteria except R², for which larger is better.
In Table 5, the values of MSE, PRR, PP, SAE, PRV, and RMSPE for the new model are the lowest in comparison with those of the other models. Furthermore, its R² value, 0.9937, is the largest. Although its AIC value is not the lowest among all the SRGMs, we can safely say that the proposed model fits data set 1 better than the other SRGMs.
In Table 6, the AIC value of the new model is likewise not the smallest among those of all the SRGMs. However, its values of MSE, PRR, PP, SAE, PRV, and RMSPE are markedly smaller. Considering all the values together, the new model proposed in this work is the optimal one among the SRGMs for data set 2.

4.2. Confidence Interval

We also estimate the confidence interval [31] of the new proposed model for the data sets in Table 7, which is defined as:

m(t) ± z_{α/2} sqrt(m(t)),

where z_{α/2} is the 100(1 − α/2) percentile of the standard normal distribution. Figure 2 and Figure 3 show the confidence interval of the proposed model for both data sets. Figure 4 and Figure 5 show the mean value functions of all the SRGMs for both data sets.
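This band is easy to compute once m(t) is evaluated; a minimal sketch (our helper name, with the quantile taken from the standard library):

```python
import numpy as np
from statistics import NormalDist

def confidence_band(m_vals, alpha=0.05):
    """(1 - alpha) band for an NHPP mean value function:
    m(t) +/- z_{alpha/2} * sqrt(m(t))."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    m_vals = np.asarray(m_vals, dtype=float)
    half = z * np.sqrt(m_vals)
    return m_vals - half, m_vals + half

lo, hi = confidence_band([4.0, 16.0, 25.0])
```

The sqrt(m(t)) term reflects that the variance of an NHPP count N(t) equals its mean m(t).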

5. Software Release Policy

5.1. Optimal Release Time and Cost

In this section, we address the optimal release policy from the point of view of time and cost. The aim is to find the optimal software release time (T*) that minimizes the expected total software development cost. Although the optimal release policy has been studied for decades, it remains a delicate trade-off: if the testing period is long, the software can be made reliable, but the development cost increases; if the testing period is short, the product can be unreliable, and risk costs, such as follow-up service costs, can increase. Thus, it is crucial to find a balanced time point between release time and minimum cost.
Figure 6 shows the system development lifecycle considered in the following cost model, which includes the testing phase before release time T, the testing environment period, the warranty period, and the operational life in the actual field environment (that is usually quite different from the testing environment) [33].
The expected total software development cost C(T) depends on various factors and can be expressed as:

C(T) = C_0 + C_1 T + C_2 m(T) μ_y + C_3 (1 − R(x|T)) + C_4 [m(T + T_W) − m(T)] μ_W,

where C_0 is the set-up cost of testing, C_1 T is the cost of testing, C_2 m(T) μ_y is the expected cost of removing all failures detected by time T during the testing period, C_3 (1 − R(x|T)) is the penalty cost for failures that occur after the system release time T, and C_4 [m(T + T_W) − m(T)] μ_W is the expected cost of removing all failures detected during the warranty period [T, T + T_W]. This formulation assumes that removing an error during the operating period costs more, and takes much longer, than removing it during the testing period.
Finally, the expected total software cost can be calculated using the m(t) function with the estimated parameters. The primary purpose of the equation is to find the optimal software release time (T*) that minimizes the expected total cost.
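A minimal sketch of the release-time search follows, assuming the standard NHPP software reliability function R(x|T) = e^{−(m(T+x) − m(T))}; the parameter values, cost coefficients, and μ values below are purely illustrative (not the fitted values of Table 4 or the baseline coefficients of Table 8):

```python
import numpy as np

# Hypothetical values for illustration only.
N, a, b, alpha, beta, t0 = 30.0, 0.5, 1.2, 2.0, 4.0, 1.0
C0, C1, C2, C3, C4 = 50.0, 10.0, 20.0, 2000.0, 60.0
mu_y, mu_w, x, Tw = 0.1, 0.5, 1.0, 10.0

def m(t):
    """Proposed mean value function, Equation (8)."""
    return N * (1.0 - (beta / (beta + a * max(t - t0, 0.0) ** b)) ** alpha)

def reliability(x, T):
    # R(x|T) = exp(-(m(T + x) - m(T))): probability of no failure in (T, T + x].
    return np.exp(-(m(T + x) - m(T)))

def cost(T):
    return (C0 + C1 * T + C2 * m(T) * mu_y
            + C3 * (1.0 - reliability(x, T))
            + C4 * (m(T + Tw) - m(T)) * mu_w)

T_grid = np.linspace(1.0, 100.0, 991)
costs = np.array([cost(T) for T in T_grid])
T_star = T_grid[np.argmin(costs)]  # optimal release time on the grid
```

Early release is dominated by the penalty term C_3(1 − R(x|T)); late release by the testing cost C_1 T, so the minimum lies between.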

5.2. Results of Optimal Release Time and Cost

We apply the mean value function m ( t ) obtained in Section 4 to the defined cost model C ( T ) and consider coefficients in the cost model for the baseline case. The coefficients of the baseline case are listed in Table 8.
Table 9 presents the values of the release time and expected total cost under conditions derived from variation of the cost coefficients and the warranty period T_w. Similarly, Table 10, Table 11, Table 12 and Table 13 present the values as the parameters of the C(T) function change. Figure 7, Figure 8, Figure 9, Figure 10 and Figure 11 illustrate Table 9, Table 10, Table 11, Table 12 and Table 13 when T_w is 10; in addition, the baseline case is drawn as a red line.
In Table 9, T * and C ( T ) increase when the warranty period T w increases. Further, C ( T ) increases when C 0 increases in Table 10. In Table 11, Table 12 and Table 13, T * and C ( T ) increase when C 1 , C 3 , and C 4 increase. As a result, C 0 does not affect T * , but it affects C ( T ) . C 1 affects T * , and has some effect on C ( T ) . C 3 and C 4 have a significant effect on T * and C ( T ) .

6. Sensitivity Analysis

6.1. Sensitivity Analysis of Parameters

We conduct a sensitivity analysis of the parameters for the optimal release time [38]. S_T is defined as the relative change of the release time when a parameter θ of the mean value function m(t) is changed by 100p%. S_T can be expressed as:

S_T = (T((1 + p)θ) − T(θ)) / T(θ).
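S_T can be evaluated numerically by perturbing one parameter at a time and re-solving for the cost-minimizing release time. The sketch below uses a reduced cost function (testing cost plus a post-release penalty) and illustrative parameter values, not those of the paper, so the magnitudes are only indicative:

```python
import numpy as np

# Illustrative baseline parameters of the proposed m(t).
base = {"N": 30.0, "a": 0.5, "b": 1.2, "alpha": 2.0, "beta": 4.0, "t0": 1.0}

def t_star(p):
    """Grid-search release time minimizing a reduced cost model."""
    def m(t):
        g = p["a"] * max(t - p["t0"], 0.0) ** p["b"]
        return p["N"] * (1.0 - (p["beta"] / (p["beta"] + g)) ** p["alpha"])
    def cost(T):
        # testing cost + penalty for post-release failures (illustrative weights)
        return 10.0 * T + 2000.0 * (1.0 - np.exp(-(m(T + 1.0) - m(T))))
    grid = np.linspace(1.0, 100.0, 1981)
    return grid[np.argmin([cost(T) for T in grid])]

T_base = t_star(base)
sens = {}
for name in base:
    perturbed = dict(base, **{name: base[name] * 1.10})  # 100p% change, p = 0.1
    sens[name] = (t_star(perturbed) - T_base) / T_base   # S_T for this parameter
```

Repeating this for each parameter reproduces the kind of comparison reported in Table 14.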

6.2. Results of Sensitivity Analysis

Table 14 and Figure 12 show how much the release time T changes with each parameter. From Table 14, b is the most sensitive parameter and T_0 the least sensitive. Moreover, the sensitivity values for α and β are almost equal, because the software reliability function varies by the same amount when α and β vary. In brief, the optimal release time (T*) increases as a and b decrease, and decreases as α, β, T_0, and N increase.
If the software is released too early, more resources will be required, such as risk costs, update costs, and human resources from users and the development company. Conversely, if the software is released too late, higher development costs must be assumed. Therefore, overestimation of α, β, T_0, and N and underestimation of a and b should be avoided, since they can lead to misestimations such as underestimation of the optimal release time (T*).

7. Conclusions

We suggested a new SRGM that considers the actual start of debugging time for software affected by syntax errors. To compare it with several existing NHPP SRGMs, we applied two real data sets, estimating the parameters of all models by the LSE method. Common criteria were then used to compare the goodness of fit and identify the optimal model (Table 5 and Table 6). The criteria values of the proposed model are better than those of the other SRGMs listed in Table 1. Although the AIC values for both data sets were not the lowest among the models, the proposed model performed best from the point of view of the other criteria. We then applied a cost model to the new SRGM and examined how the coefficients, release time, and cost respond to changes in each parameter.
In summary, the proposed model fits both data sets better than all the other models. As seen in Section 5, variations of C_3 and C_4, the coefficients related to the field environment, affect the results more than the other coefficients; to establish the optimal release policy, it is therefore necessary to subdivide the coefficients related to the field environment. As discussed in Section 6, b is the most sensitive parameter and T_0 the least sensitive (Table 14). Overestimation of α, β, T_0, and N and underestimation of a and b must be avoided because they can lead to misestimation of the optimal release time.
Recently, many researchers have studied software reliability models that consider the software development environment. Likewise, we studied a software reliability model that considers uncertainty in the development environment, such as syntax errors, and provided optimal release policies that minimize the total development cost for various environments. Therefore, the proposed model is beneficial when other data sets and various environments are given.

8. Future Research

A further direction of this study will be to find diverse and more recent data sets to demonstrate the goodness of fit of the new model more clearly. In addition, since the parameters here were estimated with the LSE method, we plan to apply MLE or Bayesian inference to parameter estimation and to consider change-points.

Author Contributions

D.H.L. analyzed the data; K.Y.S. contributed analysis tools; D.H.L. and K.Y.S. wrote the paper; I.H.C. supported funding; H.P. suggested development of new proposed model; I.H.C. and H.P. designed the paper.

Funding

This research was supported by NRF-2015R1D1A1A01060050, NRF-2018R1D1A1B07045734.

Acknowledgments

We are pleased to thank the Editor and the Referees for their useful suggestions. This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2015R1D1A1A01060050, NRF-2018R1D1A1B07045734).

Conflicts of Interest

The authors declare no conflict of interest.

Acronyms

SRGM   Software Reliability Growth Model
NHPP   Non-Homogeneous Poisson Process
LSE    Least Squares Estimation
MLE    Maximum Likelihood Estimation
MSE    Mean Squared Error
PRR    Predictive Ratio Risk
PP     Predictive Power
R2     R-square
AIC    Akaike's Information Criteria
SAE    Sum of Absolute Error
PRV    Predicted Relative Variation
RMSPE  Root Mean Square Prediction Error

References

  1. Clarke, P.; O’Connor, R.V. The situational factors that affect the software development process: Towards a comprehensive reference framework. Inf. Softw. Technol. 2012, 54, 433–447. [Google Scholar] [CrossRef] [Green Version]
  2. Musa, J.D.; Iannino, A.; Okumoto, K. Software Reliability: Measurement, Prediction, and Application; McGraw-Hill: New York, NY, USA, 1987. [Google Scholar]
  3. Yamada, S.; Ohba, M.; Osaki, S. S-shaped reliability growth modeling for software fault detection. IEEE Trans. Reliab. 1983, 32, 475–484. [Google Scholar] [CrossRef]
  4. Yamada, S.; Ohba, M.; Osaki, S. Software Reliability Growth Models with Testing-effort. IEEE Trans. Reliab. 1986, 35, 19–23. [Google Scholar] [CrossRef]
  5. Quadri, S.M.K.; Ahmad, N.; Peer, M.A. Software optimal release policy and reliability growth modeling. In Proceedings of the 2nd National Conference on Computing for Nation Development, New Delhi, India, 8–9 February 2008; pp. 423–431. [Google Scholar]
  6. Ahmad, N.; Khan, M.G.M.; Rafi, L.S. A study of testing-effort dependent inflection S-shaped software reliability growth models with imperfect debugging. Int. J. Qual. Reliab. Manag. 2009, 27, 89–110. [Google Scholar] [CrossRef]
  7. Pham, H.; Zhang, X. An NHPP software reliability model and its comparison. Int. J. Reliab. Qual. Saf. Eng. 1997, 4, 269–282. [Google Scholar] [CrossRef]
  8. Pham, H.; Zhang, X. Software Reliability and Cost Models with Testing Coverage. Eur. J. Oper. Res. 2003, 145, 443–454. [Google Scholar] [CrossRef]
  9. Teng, X.; Pham, H. A new methodology for predicting software reliability in the random field environments. IEEE Trans. Reliab. 2006, 55, 458–468. [Google Scholar] [CrossRef]
  10. Pham, H. Loglog Fault-Detection Rate and Testing Coverage Software Reliability Models Subject to Random Environments. Vietnam J. Comput. Sci. 2014, 1, 39–45. [Google Scholar] [CrossRef]
  11. Inoue, S.; Ikeda, J.; Yamada, S. Bivariate change-point modeling for software reliability assessment with uncertainty of testing-environment factor. Ann. Oper. Res. 2016, 244, 209–220. [Google Scholar] [CrossRef]
  12. Li, Q.; Pham, H. A testing-coverage software reliability model considering fault removal efficiency and error generation. PLoS ONE 2017, 12, e0181524. [Google Scholar] [CrossRef] [PubMed]
  13. Song, K.Y.; Chang, I.H.; Pham, H. A three-parameter fault-detection software reliability model with the uncertainty of operating environments. J. Syst. Sci. Syst. Eng. 2017, 26, 121–132. [Google Scholar] [CrossRef]
  14. Song, K.Y.; Chang, I.H.; Pham, H. A Software Reliability Model with a Weibull Fault Detection Rate Function Subject to Operating Environments. Appl. Sci. 2017, 7, 983. [Google Scholar] [CrossRef]
  15. Song, K.Y.; Chang, I.H.; Pham, H. An NHPP Software Reliability Model with S-Shaped Growth Curve Subject to Random Operating Environments and Optimal Release Time. Appl. Sci. 2017, 7, 1304. [Google Scholar] [CrossRef]
  16. Zhu, M.; Pham, H. A two-phase software reliability modeling involving with software fault dependency and imperfect fault removal. Comput. Lang. Syst. Struct. 2018, 53, 27–42. [Google Scholar] [CrossRef]
  17. Zhu, M.; Pham, H. A software reliability model incorporating martingale process with gamma-distributed environmental factors. Ann. Oper. Res. 2018, 1–22. [Google Scholar] [CrossRef]
  18. Zeephongsekul, P.; Jayasinghe, C.L.; Fiondella, L.; Nagaraju, V. Maximum-Likelihood Estimation of Parameters of NHPP Software Reliability Models Using Expectation Conditional Maximization Algorithm. IEEE Trans. Reliab. 2016, 65, 1571–1583. [Google Scholar] [CrossRef]
  19. Candini, F.; Gioletta, A. A Bayesian Monte Carlo-based algorithm for the estimation of small failure probabilities of systems affected by uncertainties. Reliab. Eng. Syst. Saf. 2016, 153, 15–27. [Google Scholar] [CrossRef]
  20. Caiuta, R.; Pozo, A.; Vergilio, S.R. Meta-learning based selection of software reliability models. Automat. Softw. 2017, 24, 575–602. [Google Scholar] [CrossRef]
  21. Tamura, Y.; Yamada, S. Software Reliability Model Selection Based on Deep Learning with Application to the Optimal Release Problem. J. Ind. Eng. Manag. Sci. 2018, 2016, 43–58. [Google Scholar] [CrossRef]
  22. Tamura, Y.; Matsumoto, S.; Yamada, S. Software Reliability Model Selection Based on Deep Learning. In Proceedings of the International Conference on Industrial Engineering Management Science and Application, Jeju, Korea, 23–26 May 2016; pp. 1–5. [Google Scholar]
  23. Wang, J.; Zhang, C. Software reliability prediction using a deep learning model based on the RNN encoder-decoder. Reliab. Eng. Syst. Saf. 2018, 170, 73–82. [Google Scholar] [CrossRef]
  24. Kim, K.C.; Kim, Y.H.; Shin, J.H.; Han, K.J. A Case Study on Application for Software Reliability Model to Improve Reliability of the Weapon System. J. KIISE 2011, 38, 405–418. [Google Scholar]
  25. Goel, A.L.; Okumoto, K. Time dependent error detection rate model for software reliability and other performance measures. IEEE Trans. Reliab. 1979, 28, 206–211. [Google Scholar] [CrossRef]
  26. Ohba, M. Inflexion S-shaped software reliability growth models. In Stochastic Models in Reliability Theory; Osaki, S., Hatoyama, Y., Eds.; Springer: Berlin, Germany, 1984; pp. 144–162. [Google Scholar]
  27. Yamada, S.; Tokuno, K.; Osaki, S. Imperfect debugging models with fault introduction rate for software reliability assessment. Int. J. Syst. Sci. 1992, 23, 2241–2252. [Google Scholar] [CrossRef]
  28. Pham, H.; Nordmann, L.; Zhang, X. A general imperfect software debugging model with S-shaped fault detection rate. IEEE Trans. Reliab. 1999, 48, 169–175. [Google Scholar] [CrossRef]
  29. Pham, H. Software Reliability Models with Time Dependent Hazard Function Based on Bayesian Approach. Int. J. Autom. Comput. 2007, 4, 325–328. [Google Scholar] [CrossRef]
  30. Chang, I.H.; Pham, H.; Lee, S.W.; Song, K.Y. A testing-coverage software reliability model with the uncertainty of operation environments. Int. J. Syst. Sci. Oper. Logist. 2014, 1, 220–227. [Google Scholar]
  31. Pham, H. System Software Reliability; Springer: London, UK, 2006; p. 13. [Google Scholar]
  32. Li, Q.; Pham, H. NHPP software reliability model considering the uncertainty of operating environments with imperfect debugging and testing coverage. Appl. Math. Model. 2017, 51, 68–85. [Google Scholar] [CrossRef]
  33. Akaike, H. A new look at statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–719. [Google Scholar] [CrossRef]
  34. Pillai, K.; Nair, V.S. A model for software development effort and cost estimation. IEEE Trans. Softw. Eng. 1997, 23, 485–497. [Google Scholar] [CrossRef]
  35. Xu, J.; Yao, S. Software Reliability Growth model with Partial Differential Equation for Various Debugging Processes. Math. Probl. Eng. 2016, 2016, 1–13. [Google Scholar] [CrossRef]
  36. Anjum, M.; Haque, M.A.; Ahmad, N. Analysis and ranking of software reliability models based on weighted criteria value. J. Inform. Technol. Comput. Sci. 2013, 2, 1–14. [Google Scholar] [CrossRef]
  37. Jeske, D.R.; Zhang, X. Some successful approaches to software reliability modeling in industry. J. Syst. Softw. 2005, 74, 85–99. [Google Scholar]
  38. Li, X.; Xie, M.; Ng, S.H. Sensitivity analysis of release time of software reliability models incorporating testing effort with multiple change-points. Appl. Math. Model. 2010, 34, 3560–3570. [Google Scholar] [CrossRef]
Figure 1. Testing time infrastructure.
Figure 2. Confidence interval of the new proposed model for data set 1.
Figure 3. Confidence interval of the new proposed model for data set 2.
Figure 4. m(t) of all SRGMs for data set 1.
Figure 5. m(t) of all SRGMs for data set 2.
Figure 6. System development lifecycle.
Figure 7. T* and C(T) for T_w (Table 9).
Figure 8. T* and C(T) for T_w = 10 (Table 10).
Figure 9. T* and C(T) for T_w = 10 (Table 11).
Figure 10. T* and C(T) for T_w = 10 (Table 12).
Figure 11. T* and C(T) for T_w = 10 (Table 13).
Figure 12. Sensitivity analysis for parameters of the optimal release time.
Table 1. Mean value functions for SRGMs.

No. | Model | m(t)
1 | Goel–Okumoto [25] | m(t) = a(1 − e^(−bt))
2 | Delayed S-shaped [3] | m(t) = a(1 − (1 + bt)e^(−bt))
3 | Inflection S-shaped [26] | m(t) = a(1 − e^(−bt)) / (1 + βe^(−bt))
4 | Yamada Imperfect [27] | m(t) = a(1 − e^(−bt))(1 − α/b) + αat
5 | Pham–Nordmann–Zhang (PNZ) [28] | m(t) = [a(1 − e^(−bt))(1 − α/b) + αat] / (1 + βe^(−bt))
6 | Pham–Zhang (PZ) [7] | m(t) = [(c + a)(1 − e^(−bt)) − (ab/(b − α))(e^(−αt) − e^(−bt))] / (1 + βe^(−bt))
7 | Dependent Parameter 2 [29] | m(t) = m0((γt + 1)/(γt0 + 1))e^(−γ(t − t0)) + α(γt + 1)[γt − 1 + (1 − γt0)e^(−γ(t − t0))]
8 | Testing Coverage [30] | m(t) = N[1 − (β/(β + (at)^b))^α]
9 | New model | m(t) = N(1 − β/(β + a(t − t0)^b))^α
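For concreteness, the closed forms in Table 1 can be evaluated directly. The sketch below implements three of the mean value functions (Goel–Okumoto, delayed S-shaped, and the testing-coverage form) in Python; the function names and the illustrative parameter values are ours, not the paper's.

```python
import math

def m_goel_okumoto(t, a, b):
    """Goel-Okumoto: m(t) = a(1 - e^(-bt))."""
    return a * (1.0 - math.exp(-b * t))

def m_delayed_s(t, a, b):
    """Delayed S-shaped: m(t) = a(1 - (1 + bt) e^(-bt))."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))

def m_testing_coverage(t, a, b, alpha, beta, N):
    """Testing coverage: m(t) = N[1 - (beta / (beta + (at)^b))^alpha]."""
    return N * (1.0 - (beta / (beta + (a * t) ** b)) ** alpha)

# All three forms satisfy m(0) = 0 and increase toward the asymptote (a or N).
print(m_goel_okumoto(0, 33.01, 0.000058))  # 0.0
```

The remaining models in Table 1 follow the same pattern: each m(t) is a closed-form expression in t and a handful of shape parameters.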
Table 2. Data set 1.

Index | Time | Failure | Cum. Failure | Index | Time | Failure | Cum. Failure
1 | 1249 | 4 | 4 | 8 | 40,594 | 4 | 30
2 | 4721 | 6 | 10 | 9 | 49,476 | 1 | 31
3 | 8786 | 4 | 14 | 10 | 55,596 | 0 | 31
4 | 13,669 | 3 | 17 | 11 | 58,061 | 1 | 32
5 | 19,094 | 6 | 23 | 12 | 58,588 | 1 | 33
6 | 24,750 | 1 | 24 | 13 | 58,633 | 0 | 33
7 | 32,299 | 2 | 26 | – | – | – | –
Table 3. Data set 2.

Time | Failure | Cum. Failure | Time | Failure | Cum. Failure
1 | 1 | 1 | 11 | 2 | 14
2 | 1 | 2 | 12 | 1 | 15
3 | 2 | 4 | 13 | 1 | 16
4 | 1 | 5 | 14 | 2 | 18
5 | 1 | 6 | 15 | 2 | 20
6 | 1 | 7 | 16 | 1 | 21
7 | 2 | 9 | 17 | 1 | 22
8 | 1 | 10 | 18 | 0 | 22
9 | 1 | 11 | 19 | 0 | 22
10 | 1 | 12 | – | – | –
Table 4. Estimation of parameters for both data sets.

Model | Data Set 1 | Data Set 2
GO | a = 33.01, b = 0.000058 | a = 104.806, b = 0.0131
DS | a = 31.013, b = 0.000144 | a = 28.34, b = 0.156
IS | a = 33.009, b = 0.000058, β = 0.00001 | a = 30.71, b = 0.137, β = 3.152
YID | a = 33.009, b = 0.000058, α = 0.00000001 | a = 42.489, b = 0.03, α = 0.027
PNZ | a = 32.74, b = 0.00006, α = 0.0000001, β = 0.002 | a = 30.589, b = 0.138, α = 0.00001, β = 3.179
PZ | a = 33.008, b = 0.000058, α = 10.000, β = 0.000, c = 0.001 | a = 0.00001, b = 0.138, α = 0.000001, β = 3.179, c = 30.595
DP2 | α = 623.23, γ = 0.000004, m0 = 15.200, t0 = 0.000 | α = 867.712, γ = 0.011, m0 = 4.999, t0 = 0.173
TC | a = 0.0099, b = 0.721, α = 1.001, β = 61.100, N = 52.24 | a = 0.071, b = 1.218, α = 1294.000, β = 2555.000, N = 44.940
New model | a = 0.001, b = 1.502, α = 0.414, β = 9125.8, t0 = 151.304, N = 39.171 | a = 0.001, b = 6.169, α = 0.179, β = 75096, t0 = 0.001, N = 25.744
Table 5. Comparison of all the criteria for data set 1.

Model | MSE | PRR | PP | R² | SAE | AIC | PRV | RMSPE
GO | 1.6260 | 0.6277 | 0.2425 | 0.9839 | 12.9385 | 49.4088 | 1.1979 | 1.2191
DS | 7.3713 | 65.1778 | 1.1664 | 0.9269 | 25.6464 | 72.2377 | 2.5922 | 2.3315
IS | 1.7887 | 0.6279 | 0.2426 | 0.9839 | 12.9397 | 51.4089 | 1.2191 | 1.2940
YID | 1.7870 | 0.6278 | 0.2426 | 0.9839 | 12.9216 | 51.3945 | 1.2187 | 1.2922
PNZ | 2.0232 | 0.5604 | 0.2266 | 0.9836 | 12.9766 | 53.6534 | 1.2307 | 1.4418
PZ | 2.2359 | 0.6280 | 0.2426 | 0.9839 | 12.9400 | 55.4091 | 1.2192 | 1.6175
DP2 | 33.5928 | 1.0252 | 8.3639 | 0.7273 | 50.9805 | 131.6861 | 5.0194 | 5.6645
TC | 1.3975 | 0.0547 | 0.0696 | 0.9899 | 10.6421 | 52.2273 | 0.9640 | 1.3303
New model | 1.0029 | 0.0165 | 0.0152 | 0.9922 | 8.1965 | 54.1355 | 0.7760 | 0.8417
Table 6. Comparison of all the criteria for data set 2.

Model | MSE | PRR | PP | R² | SAE | AIC | PRV | RMSPE
GO | 0.5472 | 0.1919 | 0.3174 | 0.9898 | 10.9960 | 48.9315 | 0.7004 | 0.7179
DS | 0.7472 | 6.2594 | 0.9673 | 0.9858 | 13.6902 | 48.5178 | 0.8226 | 0.8492
IS | 0.3395 | 0.0715 | 0.0605 | 0.9941 | 8.1437 | 49.3896 | 0.5489 | 0.5493
YID | 0.4886 | 0.1210 | 0.1786 | 0.9915 | 9.2881 | 50.5397 | 0.6564 | 0.6589
PNZ | 0.3621 | 0.0721 | 0.0606 | 0.9941 | 8.1484 | 51.3791 | 0.5488 | 0.5493
PZ | 0.3880 | 0.0722 | 0.0606 | 0.9941 | 8.1485 | 53.3790 | 0.5488 | 0.5493
DP2 | 5.2747 | 1.3468 | 19.3541 | 0.9135 | 32.4992 | 70.1656 | 2.0966 | 2.0966
TC | 0.4558 | 0.0797 | 0.0648 | 0.9930 | 8.5831 | 53.8763 | 0.5951 | 0.5954
New model | 0.3481 | 0.0621 | 0.0522 | 0.9951 | 7.3657 | 54.7293 | 0.5011 | 0.5014
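The criteria in Tables 5 and 6 can be recomputed from the fitted values m(t_i) and the observed cumulative failure counts y_i. Below is a sketch using definitions common in the SRGM literature; the paper's exact conventions (for example, degrees-of-freedom adjustments or the likelihood used for AIC) may differ in detail, so treat this as illustrative rather than a reproduction of the paper's computation.

```python
def fit_criteria(y, m_hat, k):
    """Goodness-of-fit criteria for an SRGM fit.
    y: observed cumulative failures; m_hat: fitted m(t_i) values;
    k: number of estimated model parameters."""
    n = len(y)
    resid = [f - o for f, o in zip(m_hat, y)]
    sse = sum(r * r for r in resid)
    ybar = sum(y) / n
    return {
        "MSE": sse / (n - k),                                  # df-adjusted mean squared error
        "SAE": sum(abs(r) for r in resid),                     # sum of absolute errors
        "R2": 1.0 - sse / sum((o - ybar) ** 2 for o in y),     # coefficient of determination
        "PRR": sum((r / f) ** 2 for r, f in zip(resid, m_hat)),  # predictive-ratio risk
        "PP": sum((r / o) ** 2 for r, o in zip(resid, y)),       # predictive power
    }
```

A perfect fit drives MSE, SAE, PRR, and PP to zero and R² to one, which is why the "New model" rows, with the smallest error criteria and largest R², dominate both tables.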
Table 7. Confidence interval of the new proposed model for both data sets (α = 0.05).

Data Set 1 | | | Data Set 2 | |
Time | LC | UC | Time | LC | UC
1249 | 0.076714 | 7.910112 | 1 | −0.95994 | 2.962066
4721 | 3.510316 | 15.64006 | 2 | −0.72276 | 5.029449
8786 | 6.608054 | 21.23353 | 3 | −0.22797 | 6.968153
13,669 | 9.550651 | 26.10079 | 4 | 0.413007 | 8.848266
19,094 | 12.13476 | 30.16134 | 5 | 1.153972 | 10.6953
24,750 | 14.27082 | 33.41065 | 6 | 1.969788 | 12.52126
32,299 | 16.47781 | 36.68851 | 7 | 2.844421 | 14.33204
40,594 | 18.29813 | 39.34200 | 8 | 3.766162 | 16.12977
49,476 | 19.76057 | 41.44564 | 9 | 4.725129 | 17.91337
55,596 | 20.55722 | 42.58206 | 10 | 5.71162 | 19.67833
58,061 | 20.83969 | 42.98349 | 11 | 6.71484 | 21.41613
58,588 | 20.89753 | 43.06561 | 12 | 7.721905 | 23.1137
58,633 | 20.90243 | 43.07256 | 13 | 8.717243 | 24.75315
– | – | – | 14 | 9.682677 | 26.31238
– | – | – | 15 | 10.59846 | 27.76699
– | – | – | 16 | 11.44533 | 29.09345
– | – | – | 17 | 12.20722 | 30.27298
– | – | – | 18 | 12.87369 | 31.29499
– | – | – | 19 | 13.44132 | 32.15873
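The bounds in Table 7 are consistent with the usual normal-approximation interval for an NHPP mean value function, m(t) ± z_(α/2)·√m(t), which follows from a Poisson count's variance equaling its mean. A minimal sketch (the helper name is ours):

```python
import math

def nhpp_confidence_interval(m_hat, z=1.96):
    """Two-sided normal-approximation bounds for an NHPP mean value function:
    m(t) +/- z * sqrt(m(t)); z = 1.96 corresponds to alpha = 0.05."""
    half = z * math.sqrt(m_hat)
    return m_hat - half, m_hat + half

# A fitted mean of about 3.99 failures gives bounds near (0.08, 7.91),
# matching the first row of Table 7 for data set 1.
lc, uc = nhpp_confidence_interval(3.9934)
```

Note the lower bound can go negative for small m(t), as in the early rows for data set 2; in practice it is often truncated at zero.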
Table 8. Baseline case.

C0 | C1 | C2 | C3 | C4 | x | μx | μw | Tw
500 | 10 | 50 | 5000 | 500 | 10 | 0.1 | 0.1 | 10
Table 9. Optimal release time T* and total cost C(T) for Tw (Case 1).

Model | Tw = 5: T*, C(T) | Tw = 10: T*, C(T) | Tw = 15: T*, C(T) | Tw = 20: T*, C(T)
New | 46.6, 1166.8328 | 47, 1172.8471 | 47.2, 1176.2287 | 47.4, 1178.2435
Table 10. Optimal release time T* and total cost C(T) for C0 (Case 2).

C0 | Tw = 5: T*, C(T) | Tw = 10: T*, C(T) | Tw = 15: T*, C(T) | Tw = 20: T*, C(T)
100 | 46.6, 766.8328 | 47, 772.8471 | 47.2, 776.2287 | 47.4, 778.2435
200 | 46.6, 866.8328 | 47, 872.8471 | 47.2, 876.2287 | 47.4, 878.2435
500 | 46.6, 1166.8328 | 47, 1172.8471 | 47.2, 1176.2287 | 47.4, 1178.2435
700 | 46.6, 1366.8328 | 47, 1372.8471 | 47.2, 1376.2287 | 47.4, 1378.2435
Table 11. Optimal release time T* and total cost C(T) for C2 (Case 3).

C2 | Tw = 5: T*, C(T) | Tw = 10: T*, C(T) | Tw = 15: T*, C(T) | Tw = 20: T*, C(T)
20 | 46.6, 1089.6537 | 47, 1095.6652 | 47.2, 1099.0456 | 47.4, 1101.0591
50 | 46.6, 1166.8328 | 47, 1172.8471 | 47.2, 1176.2287 | 47.4, 1178.2435
100 | 46.6, 1295.4647 | 47, 1301.4835 | 47.2, 1304.8673 | 47.3, 1306.8837
150 | 46.6, 1424.0966 | 47, 1430.1199 | 47.2, 1433.5059 | 47.3, 1435.5233
Table 12. Optimal release time T* and total cost C(T) for C3 (Case 4).

C3 | Tw = 5: T*, C(T) | Tw = 10: T*, C(T) | Tw = 15: T*, C(T) | Tw = 20: T*, C(T)
3000 | 43.4, 1131.1875 | 43.8, 1136.4696 | 44, 1139.3614 | 44.1, 1141.0439
4000 | 45.2, 1150.9796 | 45.6, 1156.6584 | 45.8, 1159.8148 | 45.9, 1161.6762
5000 | 46.6, 1166.8328 | 47, 1172.8471 | 47.2, 1176.2287 | 47.4, 1178.2435
7000 | 48.7, 1191.5935 | 49.2, 1198.1598 | 49.5, 1201.9114 | 49.6, 1204.1773
Table 13. Optimal release time T* and total cost C(T) for C4 (Case 5).

C4 | Tw = 5: T*, C(T) | Tw = 10: T*, C(T) | Tw = 15: T*, C(T) | Tw = 20: T*, C(T)
100 | 46.5, 1166.5033 | 47, 1172.3822 | 47.2, 1175.6963 | 47.3, 1177.6743
300 | 46.6, 1166.6685 | 47, 1172.6146 | 47.2, 1175.9625 | 47.3, 1177.9591
500 | 46.6, 1166.8328 | 47, 1172.8471 | 47.2, 1176.2287 | 47.4, 1178.2435
1000 | 46.6, 1167.2435 | 47.1, 1173.4273 | 47.3, 1176.8886 | 47.4, 1178.9461
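The T* values in Tables 9–13 come from minimizing a total cost function C(T) built from the cost coefficients C0–C4, μx, μw, and Tw. Since that cost function is not reproduced in this back matter, the sketch below only shows a generic grid search over a user-supplied cost callable; the toy quadratic cost is purely illustrative and is not the paper's C(T).

```python
def optimal_release_time(cost, t_min, t_max, step=0.1):
    """Grid search for the release time T* minimizing a cost function C(T).
    `cost` is any callable C(T); finer steps or a bounded scalar minimizer
    can refine the answer."""
    best_t, best_c = t_min, cost(t_min)
    t = t_min
    while t <= t_max:
        c = cost(t)
        if c < best_c:
            best_t, best_c = t, c
        t = round(t + step, 10)  # keep grid points on exact 0.1 increments
    return best_t, best_c

# Toy convex cost with its minimum at T = 47 (illustrative only).
t_star, c_star = optimal_release_time(lambda T: (T - 47.0) ** 2 + 1172.8, 40.0, 55.0)
```

The tables' qualitative pattern matches this picture: increasing a cost coefficient such as C3 shifts the minimizer T* later and raises the minimum cost, while C0 shifts the whole cost curve without moving T*.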
Table 14. Sensitivity analysis for parameters of the optimal release time.

Parameter | −30% | −20% | −10% | 0 | +10% | +20% | +30%
a | 0.022837 | 0.014165 | 0.006637 | 0 | −0.005927 | −0.011273 | −0.016140
b | 1.739638 | 0.498721 | 0.187518 | 0 | −0.118787 | −0.197279 | −0.251138
α | −0.021859 | −0.013792 | −0.006561 | 0 | 0.006012 | 0.011565 | 0.016728
β | −0.021811 | −0.013763 | −0.006547 | 0 | 0.006000 | 0.011543 | 0.016696
t0 | −0.000003 | −0.000002 | −0.000001 | 0 | 0.000001 | 0.000002 | 0.000003
N | −0.054793 | −0.035748 | −0.017538 | 0 | 0.016989 | 0.033519 | 0.049658
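Each entry in Table 14 is the relative change in the optimal release time when a single model parameter is perturbed by the given percentage while all others stay at their estimates, i.e. (T*(perturbed) − T*(baseline))/T*(baseline). A generic sketch of that computation follows; the mapping from parameters to T* is supplied by the caller, since it depends on the cost model.

```python
def release_time_sensitivity(t_star_of, params, name,
                             deltas=(-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3)):
    """Relative change in T* as one parameter varies by each fraction in
    `deltas`, holding the others at baseline. `t_star_of` maps a parameter
    dict to the optimal release time (however it is computed)."""
    base = t_star_of(params)
    rows = []
    for d in deltas:
        perturbed = dict(params)
        perturbed[name] = params[name] * (1.0 + d)
        rows.append((t_star_of(perturbed) - base) / base)
    return rows

# By construction the entry for delta = 0 is always 0, as in Table 14.
```

Read this way, Table 14 says T* is most sensitive to the shape parameter b (especially when it is underestimated) and essentially insensitive to t0.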

Lee, D.H.; Chang, I.H.; Pham, H.; Song, K.Y. A Software Reliability Model Considering the Syntax Error in Uncertainty Environment, Optimal Release Time, and Sensitivity Analysis. Appl. Sci. 2018, 8, 1483. https://doi.org/10.3390/app8091483
