Article

Modeling Software Fault-Detection and Fault-Correction Processes by Considering the Dependencies between Fault Amounts

1
School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China
2
Science and Technology on Reliability and Environmental Engineering Laboratory, Beijing 100191, China
3
Department of Industrial and Systems Engineering, Rutgers University, Piscataway, NJ 08854, USA
*
Author to whom correspondence should be addressed.
Appl. Sci. 2021, 11(15), 6998; https://doi.org/10.3390/app11156998
Submission received: 10 June 2021 / Revised: 22 July 2021 / Accepted: 23 July 2021 / Published: 29 July 2021

Abstract

Many non-homogeneous Poisson process (NHPP) software reliability growth models (SRGMs) have been proposed over the past 40 years to assess software reliability, but most of them treat the fault correction process (FCP) in one of two ways. The first is to ignore the FCP and model only the fault detection process (FDP), i.e., faults are assumed to be removed instantaneously once the failure they cause is detected. In real software development this assumption is not always realistic, because fault removal usually takes time: the faults causing failures cannot always be removed at once, and detected faults become more and more difficult to correct as testing progresses. The second is to model the time delay between fault detection and fault correction, where the delay is assumed to be a constant, a function of time, or a random variable following some distribution. In this paper, some useful approaches to modeling the dual fault detection and correction processes are discussed. The dependencies between the fault amounts of the two processes are considered instead of the fault correction time delay. A model is proposed that integrates the fault-detection and fault-correction processes and incorporates a fault introduction rate and a testing coverage rate into the software reliability evaluation. The model parameters are estimated using the Least Squares Estimation (LSE) method. The descriptive and predictive performance of the proposed model and of other existing NHPP SRGMs is investigated on three real data sets using four criteria. The results show that the new model can yield significantly better reliability estimation and prediction.

1. Introduction

Software reliability is regarded as a key factor in the dependability of safety-critical software systems. Many time-dependent SRGMs have been studied over the past four decades to determine reliability measures for software [1,2,3,4]. Researchers have developed different models based on different assumptions. Some models assume that once a failure is detected, the faults that caused it are corrected immediately and no new faults are introduced (i.e., perfect debugging) [5]. Other models take imperfect debugging into account [6,7], i.e., faults are not always perfectly removed, and new ones can be introduced as a by-product of the fault repair process. However, most existing models still assume that faults are repaired instantaneously after being detected. This is not realistic: in practice, detected faults become more and more difficult to correct as testing progresses. Therefore, it is of great importance to build software reliability models from the viewpoint of the fault correction process, i.e., to give the modeling of the fault correction process the same priority as that of the fault detection process.
Schneidewind first modeled the software correction process along with the software detection process by proposing a fault-correction model that applies a constant time delay to the fault-detection process [8]. Xie and Zhao then extended Schneidewind’s idea from a constant time delay to a time-dependent delay function [9]. Later, Schneidewind extended his original model by treating the time delay as a random variable following an exponential distribution [10]. Xie et al. further proposed another distributed correction time model to provide more flexible modeling of correction processes [11,12], and Peng et al. incorporated a testing effort function and imperfect debugging into the time delay function [13]. Lo and Huang proposed a general framework for modeling software fault detection and correction processes and showed that many existing NHPP-based SRGMs could be covered by the proposed approaches [14]. Shu proposed a model from the viewpoint of the fault amount relationship between the two processes [15]. Additionally, other attempts have been made to model these two processes from different viewpoints, such as Markov chains [16,17,18], finite and infinite server queuing models [19], and a quasi-renewal time-delay fault removal model [20].
Research also suggests that the estimation accuracy of SRGMs can be further improved by considering real issues that arise during the testing process [21,22], such as testing coverage. Testing coverage is a promising indicator of testing completeness and effectiveness, which can help developers evaluate how much test effort has been spent and help customers estimate their confidence in accepting the software product. Many time-dependent test coverage functions (TCFs) have been proposed using different distributions, such as logarithmic-exponential [23], S-shaped [24], Rayleigh [25], Weibull and logistic [26] and lognormal [27].
Many TCF-based reliability models have been developed to formulate the relationship between the testing coverage and the number of detected faults, such as the Rayleigh model [25], the logarithmic-exponential model [23], the beta model, the hyper-exponential model [22] and so on [22,24,26].
Therefore, it is of great importance to model dual fault detection and correction processes. In contrast to the existing research that considers the time dependency between fault detection and fault correction processes, in this paper, we will propose a new software reliability model considering both fault detection and correction processes from the viewpoint of fault content, that is, the quantitative dependence between the number of faults detected by the fault detection process and the number of faults corrected by the fault correction process. The fault introduction rate and test coverage are also considered to improve the accuracy of the final derived model.
The remainder of this paper is organized as follows. In Section 2, we first give a brief overview of the fault-detection models’ assumptions and fault-correction process, then build a relationship between the numbers of detected faults and corrected faults, after which we present the proposed model incorporating the fault introduction rate and testing coverage rate, and several existing SRGMs are also presented. In Section 3, we examine the fitting and prediction performance of this model on three sets of software failure data compared with other existing SRGMs. Finally, Section 4 gives the conclusions.

2. Modeling Fault Detection and Fault Correction Processes

2.1. Basic Assumptions of Existing NHPP SRGMs

An NHPP is used to describe the failure phenomenon during the testing process. The counting process $\{N(t), t \geq 0\}$ of an NHPP is given as follows.
$$\Pr\{N(t) = k\} = \frac{[m(t)]^k}{k!}\, e^{-m(t)}, \quad k = 0, 1, 2, \ldots \tag{1}$$
The mean value function (MVF) $m(t)$ can be expressed as follows.
$$m(t) = \int_0^t \lambda(s)\, ds \tag{2}$$
where $\lambda(s)$ is the fault intensity function.
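As a rough illustration of how the NHPP probability and the MVF fit together, the sketch below integrates an intensity function numerically and evaluates the Poisson probability of observing k failures by time t. The exponentially decaying intensity and the values of `a` and `b` are assumptions chosen only for the example; they are not taken from the paper's data sets.

```python
import math

def mvf_from_intensity(intensity, t, steps=10_000):
    """Approximate m(t) = integral of lambda(s) over [0, t] with the trapezoidal rule."""
    h = t / steps
    total = 0.5 * (intensity(0.0) + intensity(t))
    for i in range(1, steps):
        total += intensity(i * h)
    return total * h

def nhpp_pmf(k, m_t):
    """Pr{N(t) = k} for an NHPP whose mean value function equals m_t at time t."""
    if m_t == 0.0:
        return 1.0 if k == 0 else 0.0
    # Work in log space so that large k does not overflow.
    return math.exp(k * math.log(m_t) - m_t - math.lgamma(k + 1))

# Hypothetical intensity lambda(s) = a * b * exp(-b * s); a and b are illustrative only.
a, b = 100.0, 0.1

def intensity(s):
    return a * b * math.exp(-b * s)

m_10 = mvf_from_intensity(intensity, 10.0)  # expected number of faults detected by t = 10
p_60 = nhpp_pmf(60, m_10)                   # probability of exactly 60 failures by t = 10
```

For this particular intensity the integral has the closed form a(1 − e^{−bt}), which gives a simple check on the numerical result.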
Most existing SRGMs based on NHPP have the following basic assumptions concerning the software fault detection process:
  • The software failures’ occurrence and faults’ removal follow NHPP;
  • The software failure intensity at any time is proportional to the number of remaining faults present at that time;
  • The detected faults are immediately removed with certainty and correction of faults takes only negligible time.
According to the above assumptions, the general NHPP model can be obtained by solving the following equation:
$$\frac{d m(t)}{d t} = b(t)\,[a(t) - m(t)] \tag{3}$$
where $a(t)$ is the total fault content function, $m(t)$ is the mean number of detected faults and $b(t)$ represents the fault detection rate.
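The general equation rarely needs a closed form to be explored. As a minimal sketch, the function below integrates dm/dt = b(t)[a(t) − m(t)] from m(0) = 0 with a classical fourth-order Runge-Kutta scheme. The function name `solve_mvf` and the constant choices a(t) = 100 and b(t) = 0.1 are assumptions made for illustration; with constant a and b the result should match the G-O form a(1 − e^{−bt}).

```python
import math

def solve_mvf(a_func, b_func, t_end, steps=1000):
    """Integrate dm/dt = b(t) * (a(t) - m(t)) from m(0) = 0 using classical RK4."""
    def f(t, m):
        return b_func(t) * (a_func(t) - m)

    h = t_end / steps
    t, m = 0.0, 0.0
    for _ in range(steps):
        k1 = f(t, m)
        k2 = f(t + h / 2, m + h * k1 / 2)
        k3 = f(t + h / 2, m + h * k2 / 2)
        k4 = f(t + h, m + h * k3)
        m += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return m

# Illustrative constants (not from the paper): a(t) = 100 faults, b(t) = 0.1 per time unit.
m_numeric = solve_mvf(lambda t: 100.0, lambda t: 0.1, 10.0)
m_closed = 100.0 * (1 - math.exp(-0.1 * 10.0))  # G-O closed form for comparison
```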

2.2. Considering the Fault-Detection Process and Fault-Correction Process Together

Most existing SRGMs focus only on describing the behavior of the fault detection process and assume that faults are fixed instantaneously upon detection. In practice, many software faults remain uncorrected for a long time even after they are detected, owing to the complexity of software systems, the testers’ incomplete comprehension of the software, and the learning process.
Suppose $m_c(t)$ denotes the mean number of faults corrected by time $t$, and assume that the mean number of faults corrected in the time interval $(t, t + \Delta t)$ is proportional to the mean number of detected but not yet corrected faults remaining in the software system. The MVF $m_c(t)$ can then be expressed by the following equation:
$$\frac{d m_c(t)}{d t} = \mu(t)\,[m(t) - m_c(t)] \tag{4}$$
where $\mu(t)$ is the fault correction rate, and $m(t)$ is given by Equation (3).
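To see how the detection and correction equations interact, the sketch below integrates Equations (3) and (4) jointly for the simplest case of constant rates. The constants `a`, `b` and `mu` are illustrative assumptions, not estimates from the paper. For constant rates with b ≠ μ, the corrected-fault curve has the closed form a[1 − μ/(μ − b) e^{−bt} + b/(μ − b) e^{−μt}], which is used here as a cross-check.

```python
import math

def simulate(a, b, mu, t_end, steps=2000):
    """Jointly integrate Eq. (3) and Eq. (4) with RK4; returns (m, m_c) at t_end."""
    def deriv(m, mc):
        return b * (a - m), mu * (m - mc)

    h = t_end / steps
    m = mc = 0.0
    for _ in range(steps):
        k1 = deriv(m, mc)
        k2 = deriv(m + h * k1[0] / 2, mc + h * k1[1] / 2)
        k3 = deriv(m + h * k2[0] / 2, mc + h * k2[1] / 2)
        k4 = deriv(m + h * k3[0], mc + h * k3[1])
        m += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        mc += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return m, mc

# Illustrative constants: 100 initial faults, detection rate 0.1, correction rate 0.3.
a, b, mu = 100.0, 0.1, 0.3
m_end, mc_end = simulate(a, b, mu, 20.0)

# Closed form for constant rates (valid when b != mu), used as a cross-check.
t = 20.0
mc_closed = a * (1 - mu / (mu - b) * math.exp(-b * t) + b / (mu - b) * math.exp(-mu * t))
```

The corrected count always lags the detected count, which is the gap that the ratio function of Section 2.3 is meant to capture.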

2.3. The Relationship between m ( t ) and m c ( t )

Here we use data collected from testing a software program (Data Set 1, DS-1) [28] to study the relationship between $m(t)$ and $m_c(t)$. Let $r(t) = m_c(t)/m(t)$. The cumulative numbers of detected and corrected faults are shown in Figure 1, and the value of $r(t)$ is shown in Figure 2. At the beginning of the testing phase, many faults are detected and most of them are simple and easy to remove, so the difference between the numbers of corrected and detected faults is small. Later, the detected faults become more complicated and difficult to remove, so the difference becomes larger, after which it shrinks again. Thus, a concave or S-shaped function can be used to model the ratio of the number of corrected faults to the number of detected faults.
Here we use two S-shaped functions to model $r(t)$. From Table 1, we can see that $r(t) = \frac{1}{1 + \beta e^{-bt}}$ provides better descriptive power.
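The two candidate ratio functions compared in Table 1 are simple to evaluate. The sketch below implements both, under the reading that the second candidate is 1 − (1 + bt)e^{−bt}; the parameter values are hypothetical and serve only to show the shape of the curves.

```python
import math

def ratio_logistic(t, beta, b):
    """r(t) = 1 / (1 + beta * exp(-b * t)), candidate No. 1 in Table 1."""
    return 1.0 / (1.0 + beta * math.exp(-b * t))

def ratio_delayed(t, b):
    """r(t) = 1 - (1 + b * t) * exp(-b * t), candidate No. 2 in Table 1."""
    return 1.0 - (1.0 + b * t) * math.exp(-b * t)

# Hypothetical parameter values, chosen only for illustration.
beta, b = 0.5, 0.3
logistic_values = [ratio_logistic(week, beta, b) for week in range(0, 18)]
delayed_values = [ratio_delayed(week, b) for week in range(0, 18)]
```

One practical difference: the logistic form starts at 1/(1 + β) rather than 0, so it can represent a data set in which a sizeable share of detected faults is already corrected in the first observation period.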

2.4. A New Model with Imperfect Debugging and Testing Coverage

Here we incorporate testing coverage and a fault introduction rate into the software reliability model.
Suppose $c(t)$ denotes the proportion of the code covered by time $t$ relative to the whole code. Obviously, $c(t)$ increases with testing time $t$. When testing starts, $c(t)$ grows quickly as more test cases are executed to examine the software; after a certain point in time, the software becomes stable and less additional coverage is gained while detecting the residual faults, so $c(t)$ flattens as testing comes to an end. Thus, a concave or S-shaped function is suitable for modeling the testing coverage function. Obviously, $1 - c(t)$ is the percentage of the software code that has not yet been covered by test cases up to time $t$. The derivative of the testing coverage function, $c'(t)$, represents the coverage rate. Therefore, $c'(t)/(1 - c(t))$ can be used to measure the fault detection rate $b(t)$ in Equation (3).
To build a model incorporating fault-detection process and fault-correction process as well as fault introduction rate and testing coverage, the following assumptions are proposed for this model:
  • The software failures’ occurrence follows an NHPP process.
  • The software failure rate at any time depends on both the fault detection rate and the number of remaining faults in the software at that time.
  • The fault detection rate can be expressed by $\frac{c'(t)}{1 - c(t)}$; $c(t)$ is the percentage of the code that has been examined up to time $t$, and $c'(t)$ is the derivative of the testing coverage function and represents the coverage rate.
  • Faults can be introduced during the debugging phase with a constant fault introduction rate and the overall fault content function is linear time-dependent.
  • $m_c(t)$ denotes the mean number of faults corrected by time $t$, which is proportional to the mean number of detected but not yet corrected faults remaining in the software system, and $r(t)$ represents the relationship between $m(t)$ and $m_c(t)$, expressed by $r(t) = m_c(t)/m(t)$; $m(t)$ is the cumulative number of detected faults.
From Assumption 4, the total fault number function $a(t)$ is a linear function of the expected number of faults detected up to time $t$. That is,
$$a(t) = a + \alpha\, m(t) \tag{5}$$
where $a$ denotes the initial number of faults present in the software system before testing starts and $\alpha > 0$.
Substituting $a(t)$ from Equation (5) and $b(t) = c'(t)/(1 - c(t))$ into Equation (3), and solving with the initial condition $m(0) = 0$, we obtain
$$m(t) = \frac{a}{1 - \alpha}\left[1 - \left(\frac{1 - c(t)}{1 - c(0)}\right)^{1 - \alpha}\right] \tag{6}$$
where $c(0)$ is the value of the testing coverage function at $t = 0$.
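The step from Equations (3) and (5) to the closed-form MVF can be filled in as follows. This is a sketch of one standard route (separation of variables), assuming $b(t) = c'(t)/(1 - c(t))$, $\alpha \neq 1$ and $m(0) = 0$; the auxiliary variable $u(t)$ is introduced here only for the derivation.

```latex
\begin{aligned}
\frac{dm(t)}{dt} &= \frac{c'(t)}{1 - c(t)}\,\bigl[a + \alpha\, m(t) - m(t)\bigr]
                  = \frac{c'(t)}{1 - c(t)}\,\bigl[a - (1 - \alpha)\, m(t)\bigr], \\
u(t) &:= a - (1 - \alpha)\, m(t)
  \;\Longrightarrow\;
  \frac{du}{u} = -(1 - \alpha)\,\frac{c'(t)}{1 - c(t)}\,dt, \\
\ln u(t) &= (1 - \alpha)\,\ln\bigl(1 - c(t)\bigr) + C,
  \qquad u(0) = a
  \;\Longrightarrow\;
  u(t) = a\left(\frac{1 - c(t)}{1 - c(0)}\right)^{1 - \alpha}, \\
m(t) &= \frac{a - u(t)}{1 - \alpha}
      = \frac{a}{1 - \alpha}\left[1 - \left(\frac{1 - c(t)}{1 - c(0)}\right)^{1 - \alpha}\right].
\end{aligned}
```

With $c(0) = 0$ the ratio reduces to $1 - c(t)$, which is the form that appears in the final model once a specific coverage function is substituted.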
From Assumption 5, once $m(t)$ is determined, $m_c(t)$ can be obtained. That is,
$$m_c(t) = \frac{a}{1 - \alpha}\left[1 - \left(\frac{1 - c(t)}{1 - c(0)}\right)^{1 - \alpha}\right] \cdot r(t) \tag{7}$$
Substituting different types of testing coverage functions for $c(t)$ and different ratio functions of the number of corrected faults to the number of detected faults for $r(t)$ in Equation (7), we can obtain the corresponding MVFs. As mentioned above, the testing coverage function should be a non-negative and non-decreasing function of testing time $t$. The following function can therefore be used to model the testing coverage:
$$c(t) = \frac{A\,(1 - e^{-rt})}{1 + c\, e^{-rt}} \tag{8}$$
where $A$ denotes the maximum percentage of testing coverage, $r$ is the shape parameter and $c$ is the scale parameter. Clearly, when $t = 0$, $c(0) = 0$.
According to the results of Section 2.3, the following function is used to describe $r(t)$:
$$r(t) = \frac{1}{1 + \beta e^{-bt}} \tag{9}$$
Substituting Equations (8) and (9) into Equation (7), we obtain the corresponding MVF of corrected faults:
$$m_c(t) = \frac{a}{1 - \alpha}\left[1 - \left(1 - \frac{A\,(1 - e^{-rt})}{1 + c\, e^{-rt}}\right)^{1 - \alpha}\right] \cdot \frac{1}{1 + \beta e^{-bt}} \tag{10}$$
It can be seen that fault detection process and correction process as well as fault introduction rate and testing coverage are all integrated into the proposed model.
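The proposed MVFs translate directly into code. The following sketch implements the coverage function of Equation (8), the ratio function of Equation (9), and the detected and corrected MVFs, assuming c(0) = 0 and α ≠ 1. The function names and all parameter values are illustrative assumptions; they are not the estimates reported in the paper's tables.

```python
import math

def coverage(t, A, r, c):
    """Testing coverage c(t) = A (1 - e^{-rt}) / (1 + c e^{-rt}), Eq. (8)."""
    e = math.exp(-r * t)
    return A * (1.0 - e) / (1.0 + c * e)

def ratio(t, beta, b):
    """Corrected-to-detected ratio r(t) = 1 / (1 + beta e^{-bt}), Eq. (9)."""
    return 1.0 / (1.0 + beta * math.exp(-b * t))

def m_detected(t, a, alpha, A, r, c):
    """Mean number of detected faults, using c(0) = 0 (requires alpha != 1)."""
    return a / (1.0 - alpha) * (1.0 - (1.0 - coverage(t, A, r, c)) ** (1.0 - alpha))

def m_corrected(t, a, alpha, A, r, c, beta, b):
    """Mean number of corrected faults, Eq. (10): detected MVF times r(t)."""
    return m_detected(t, a, alpha, A, r, c) * ratio(t, beta, b)

# Hypothetical parameter values for illustration only.
params = dict(a=120.0, alpha=0.05, A=0.9, r=0.2, c=2.0)
detected = [m_detected(t, **params) for t in range(0, 51)]
corrected = [m_corrected(t, beta=0.5, b=0.3, **params) for t in range(0, 51)]
```

Because r(t) < 1 for every finite t, the corrected curve stays below the detected curve and approaches it as testing proceeds.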
Table 2 lists the existing NHPP models to depict the MVF of fault correction process [14] and fault detection process [24] as well as the proposed new model. All models together will be used for the following comparisons.

3. Model Comparisons

3.1. Comparison Criteria and Parameter Estimation Method

3.1.1. Criteria for Models’ Descriptive Power Comparison

Here we use three criteria to judge the performance of the models. The first criterion is the mean squared error (MSE), which can be calculated as follows:
$$\mathrm{MSE} = \frac{1}{n - N} \sum_{i=1}^{n} \bigl(y_i - \hat{m}(t_i)\bigr)^2 \tag{11}$$
where $n$ represents the number of observations, $y_i$ represents the total number of faults observed by time $t_i$, $\hat{m}(t_i)$ denotes the estimated cumulative number of faults up to time $t_i$, and $N$ represents the number of parameters in the model. The lower the value of MSE, the better the model performs.
The second criterion is the correlation index of the regression curve equation ($R^2$), which can be expressed as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \bigl(y_i - \hat{m}(t_i)\bigr)^2}{\sum_{i=1}^{n} \bigl(y_i - \bar{y}\bigr)^2} \tag{12}$$
where $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$. The larger $R^2$, the better the model explains the variation in the data.
The last criterion is the adjusted $R^2$ (Adjusted $R^2$), which can be expressed as follows:
$$\text{Adjusted } R^2 = 1 - \frac{(1 - R)(n - 1)}{n - M - 1} \tag{13}$$
where $R$ denotes the value of $R^2$ and $M$ represents the number of predictors in the model. The larger the value of Adjusted $R^2$, the better the model’s performance.
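The three goodness-of-fit criteria are straightforward to compute once a model's fitted values are available. The sketch below is a minimal implementation of the formulas above; the function names and the four-point data set are made up for illustration.

```python
def mse(y, y_hat, n_params):
    """Mean squared error with n - N degrees of freedom (N = number of parameters)."""
    n = len(y)
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / (n - n_params)

def r_squared(y, y_hat):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - M - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

# Made-up observations and fitted values, for illustration only.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
fit_mse = mse(y, y_hat, n_params=2)
fit_r2 = r_squared(y, y_hat)
fit_adj = adjusted_r_squared(fit_r2, n=len(y), n_predictors=1)
```

Both MSE and Adjusted R² penalize extra parameters, which matters when a seven-parameter model is compared with two- or three-parameter models.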

3.1.2. Criteria for Models’ Predictive Power Comparison

Here we use the SSE criterion to examine the predictive power of SRGMs. SSE is the sum of squared errors, which is expressed as follows:
$$\mathrm{SSE} = \sum_{i=m}^{n} \bigl(y_i - \hat{m}(t_i)\bigr)^2 \tag{14}$$
Assume that by the end of testing time $t_n$, a total of $y_n$ faults have been detected. First, we use the data points up to time $t_{m-1}$ ($t_{m-1} < t_n$) to estimate the parameters of $m(t)$; substituting the estimated parameters into the mean value function then yields the predicted cumulative number of faults $\hat{m}(t_m)$ by time $t_m$ ($t_m < t_n$), where $y_m$ is the actual number of faults detected by $t_m$. The procedure is then repeated for $t_i$ ($i = m+1, m+2, \ldots, n$) until $t_n$.
Therefore, the smaller the SSE, the better the model’s performance.
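The prediction procedure can be sketched as a small loop. The text leaves open whether the parameters are re-estimated at each step or estimated once from the first portion of the data, so the hypothetical `refit` flag below covers both readings. The straight-line model used in the demonstration is a placeholder with a closed-form least squares fit, not one of the SRGMs in the paper.

```python
def predictive_sse(t, y, first_held_out, fit, predict, refit=True):
    """Sum of squared prediction errors over points first_held_out..n-1 (0-based).

    fit(t_train, y_train) returns parameters; predict(t_i, params) returns m_hat(t_i).
    With refit=True the model is re-estimated from all earlier points before each
    prediction; with refit=False it is estimated once from the initial training part.
    """
    params = fit(t[:first_held_out], y[:first_held_out])
    sse = 0.0
    for i in range(first_held_out, len(t)):
        if refit:
            params = fit(t[:i], y[:i])
        sse += (y[i] - predict(t[i], params)) ** 2
    return sse

# Placeholder model m(t) = k * t with closed-form least squares slope (illustrative only).
def fit_line(t_train, y_train):
    return sum(ti * yi for ti, yi in zip(t_train, y_train)) / sum(ti * ti for ti in t_train)

def predict_line(t_i, k):
    return k * t_i

t = [1.0, 2.0, 3.0, 4.0, 5.0]
y_exact = [2.0, 4.0, 6.0, 8.0, 10.0]
y_noisy = [2.0, 4.0, 6.0, 9.0, 10.0]
sse_exact = predictive_sse(t, y_exact, 3, fit_line, predict_line)
sse_noisy = predictive_sse(t, y_noisy, 3, fit_line, predict_line, refit=False)
```

With `refit=False` and a split at 80% of the points, the loop corresponds to a single train and hold-out comparison.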

3.1.3. Parameter Estimation Method

Once the analytical expression for m ( t ) is derived, the parameters in m ( t ) can be estimated by using the maximum likelihood estimation (MLE) method or the least square estimation (LSE) method. Though estimates from MLE are consistent and asymptotically normally distributed as the sample size increases, sometimes the estimations may not be obtained especially under conditions where m ( t ) is too complex. Here we turn to LSE methods to estimate the models’ parameters.
The sum of the squared distances is given as follows:
$$L = \sum_{i=1}^{n} \bigl(y_i - \hat{m}(t_i)\bigr)^2 \tag{15}$$
where $y_i$ is the cumulative number of faults detected or corrected in the time interval $(0, t_i)$, and all failure data are denoted in the form of pairs $(t_i, y_i)$ ($i = 1, 2, \ldots, n$; $0 < t_1 < t_2 < \cdots < t_n$).
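By taking the derivatives of Equation (15) with respect to each parameter and setting the results equal to zero, we obtain the following equations for the proposed model:
$$\frac{\partial L}{\partial a} = \frac{\partial L}{\partial \alpha} = \frac{\partial L}{\partial A} = \frac{\partial L}{\partial c} = \frac{\partial L}{\partial r} = \frac{\partial L}{\partial b} = \frac{\partial L}{\partial \beta} = 0 \tag{16}$$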
After solving the above equations simultaneously, we can obtain the least square estimates of all parameters for the proposed model.
As noted, solving Equation (16) analytically is extremely difficult and requires either graphical or numerical methods. With the help of MATLAB, calculating the parameters is not a critical problem, although adding parameters makes the software reliability model more complex and the parameter estimation more difficult.
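As an illustration of what a numerical least squares solution looks like, the sketch below fits the two-parameter G-O model m(t) = a(1 − e^{−bt}) rather than the seven-parameter proposed model. It is not the MATLAB procedure used in the paper: for a fixed b the optimal a has a closed form (the model is linear in a), so only a one-dimensional golden-section search over b is needed. The search bounds, the assumption that the profiled loss is unimodal on that interval, and the synthetic data are all assumptions made for the example.

```python
import math

def fit_go_lse(t, y, b_lo=1e-4, b_hi=2.0, iters=100):
    """Least squares fit of m(t) = a (1 - exp(-b t)); returns (a_hat, b_hat)."""
    def best_a(b):
        # For fixed b the model is linear in a, so dL/da = 0 has a closed-form solution.
        g = [1.0 - math.exp(-b * ti) for ti in t]
        return sum(gi * yi for gi, yi in zip(g, y)) / sum(gi * gi for gi in g)

    def loss(b):
        a = best_a(b)
        return sum((yi - a * (1.0 - math.exp(-b * ti))) ** 2 for ti, yi in zip(t, y))

    # Golden-section search over b (assumes the profiled loss is unimodal on [b_lo, b_hi]).
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = b_lo, b_hi
    for _ in range(iters):
        x1 = hi - phi * (hi - lo)
        x2 = lo + phi * (hi - lo)
        if loss(x1) < loss(x2):
            hi = x2
        else:
            lo = x1
    b_hat = (lo + hi) / 2.0
    return best_a(b_hat), b_hat

# Synthetic, noise-free data generated from a = 100, b = 0.2 (illustrative only).
t = [float(i) for i in range(1, 21)]
y = [100.0 * (1.0 - math.exp(-0.2 * ti)) for ti in t]
a_hat, b_hat = fit_go_lse(t, y)
```

Models with more nonlinear parameters generally need a multidimensional optimizer and sensible starting values, which is where the estimation difficulty mentioned above comes from.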

3.2. Data Analysis and Model Comparison with Real Application

3.2.1. A Middle-Size Software System Data

Here we examine the performance of the proposed model and compare it with several traditional models using data collected from testing a middle-size software system (Data Set 1, DS-1) [11]. The failure data are recorded by week and are shown in Table 3. In contrast to a traditional software reliability data set, this data set includes not only fault-detection data but also fault-correction data. We use all data points to fit the models and obtain the parameter estimates of all models. The model parameters, MSE values, $R^2$ values and Adjusted $R^2$ values are listed in Table 4.
All models’ fitting results for DS-1 are graphically illustrated in Figure 3. Figure 4 is three-dimensional; the coordinates X, Y and Z give the values of MSE, $R^2$ and Adjusted $R^2$, respectively. We can see that the proposed model has the best values of MSE, $R^2$ and Adjusted $R^2$, i.e., the new model fits the real data set better than all the other models.

3.2.2. Monitor and Control System Data

Here we examine the models using testing data collected from a software program developed for a real-time monitor and control system (Data Set 2, DS-2) [35]. The failure data are recorded by day and are shown in Table 5. We use all data points to fit the models and estimate their parameters. This data set contains only the number of detected faults. The model parameters, MSE values, $R^2$ values and Adjusted $R^2$ values are listed in Table 6.
From Table 6, we can see that, comparing all models on all three criteria, the new model yields the best criteria values and provides the best descriptive power. All models’ fitting results for DS-2 are graphically illustrated in Figure 5. Figure 6 is three-dimensional; the coordinates X, Y and Z give the values of MSE, $R^2$ and Adjusted $R^2$, respectively.

3.2.3. Tandem Computer Data

Here we examine the models using testing data collected from Tandem Computers Release #1 (Data Set 3, DS-3) [5]. The failure data are recorded by week and are shown in Table 7. We use all data points to fit the models and estimate their parameters. This data set also contains only the number of detected faults. The model parameters, MSE values, $R^2$ values and Adjusted $R^2$ values for goodness-of-fit are listed in Table 8.
From Table 8, we can see that the new model provides the best descriptive power. The fitting of the models (existing and proposed) to DS-3 is graphically illustrated in Figure 7. Figure 8 is three-dimensional; the coordinates X, Y and Z give the values of MSE, $R^2$ and Adjusted $R^2$, respectively.

3.2.4. Comparison of Models’ Predictive Power

For the predictive power comparison, we divide each data set into two parts: the first 80% of the data points and the remaining 20%. We use the first 80% of the data points to estimate the models’ parameters, then use the remaining data points to compare the models’ predictive power. The SSE values of the proposed model for DS-1, DS-2 and DS-3 are 37.0169, 236.0253 and 23.0348, respectively, as shown in Table 9. For DS-1, the new model has the smallest SSE value of all models (37.0169); the SSE values of the other models range from 1.87 times (P-Z coverage model, 69.0496) to 22.24 times (Yamada imperfect (1) model, 823.0770) that of the proposed model. For DS-2, the new model again has the smallest SSE value (236.0253); the SSE values of the other models can be 11.75 times (Inflection S-shaped model, 2.7742 × 10^3) and even 1468.78 times (Yamada Rayleigh model, 3.4667 × 10^5) that of the proposed model. For DS-3, the proposed model’s SSE of 23.0348 is not the best result; the smallest SSE value is 6.9655 (given by the delayed S-shaped model). Compared with the remaining models, however, the new model’s SSE value is much smaller, e.g., the others’ SSE values can be 3.16 times (Yamada Rayleigh model, 73.4246) and even 555.33 times (SRGM-3 model, 1.2792 × 10^4) that of the proposed model. Additionally, the delayed S-shaped model provides the best result only for DS-3, not for DS-1 and DS-2. This may indicate that the proposed model gives better predictive power overall.

4. Conclusions

In this paper, the problem of modeling software fault-detection and fault-correction processes together with imperfect debugging and testing coverage has been investigated. From the viewpoint of fault amount instead of fault correction time delay, a new SRGM addressing fault re-introduction and testing coverage is put forward by introducing a relationship function between the MVFs of detected faults and corrected faults. The proposed model is applied to two kinds of data sets: one that contains both detected and corrected fault numbers, and another that contains only detected fault numbers. For both kinds of data sets, the proposed model gives significantly better goodness-of-fit and prediction results than other existing NHPP models on three real data sets according to four criteria. It should be noted that, although adding more parameters makes the software reliability model more complicated and the task of parameter estimation more difficult, this is not a critical problem when the calculations are automated using software tools.

Author Contributions

Conceptualization, Q.L. and H.P.; Data Curation, Q.L.; Formal Analysis, Q.L.; Funding Acquisition, Q.L.; Investigation, Q.L.; Methodology, Q.L. and H.P.; Project Administration, Q.L.; Resources, Q.L. and H.P.; Supervision, H.P.; Writing—Review and Editing, H.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly funded by “National Key Laboratory of Science and Technology on Reliability and Environmental Engineering of China”, grant number “WDZC2019601A303”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Erto, P.; Giorgio, M.; Lepore, A. The Generalized Inflection S-Shaped Software Reliability Growth Model. Reliab. IEEE Trans. 2018, 69, 228–244.
  2. Utkin, L.V.; Coolen, F. A robust weighted SVR-based software reliability growth model. Reliab. Eng. Syst. Saf. 2018, 176, 93–101.
  3. Saraf, I.; Iqbal, J. Generalized multi-release modelling of software reliability growth models from the perspective of two types of imperfect debugging and change point. Qual. Reliab. Eng. Int. 2019, 35, 2358–2370.
  4. Jin, C.; Jin, S.W. Parameter optimization of software reliability growth model with S-shaped testing-effort function using improved swarm intelligent optimization. Appl. Soft Comput. 2016, 40, 283–291.
  5. Pham, H. System Software Reliability; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2006.
  6. Yamada, S.; Tokuno, K.; Osaki, S. Software reliability measurement in imperfect debugging environment and its application. Reliab. Eng. Syst. Saf. 1993, 40, 139–147.
  7. Huang, C.Y.; Lyu, M.R. Estimation and analysis of some generalized multiple change-point software reliability models. Reliab. IEEE Trans. 2011, 60, 498–514.
  8. Schneidewind, N.F. Analysis of error processes in computer software. Sigplan Not. 1975, 10, 337–346.
  9. Xie, M.; Zhao, M. The Schneidewind software reliability model revisited. In Proceedings of the Third International Symposium on Software Reliability Engineering, Research Triangle Park, NC, USA, 7–10 October 1992.
  10. Schneidewind, N.F. Modelling the fault correction process. In Proceedings of the 12th International Symposium on Software Reliability Engineering, Los Alamitos, CA, USA, 27–30 November 2001; pp. 185–190.
  11. Xie, M.; Hu, Q.P.; Wu, Y.P.; Ng, S.H. A study of the modeling and analysis of software fault-detection and fault-correction processes. Qual. Reliab. Eng. Int. 2007, 23, 459–470.
  12. Wu, Y.P.; Hu, Q.P.; Xie, M.; Ng, S.H. Modeling and analysis of software fault detection and correction process by considering time dependency. Reliab. IEEE Trans. 2007, 56, 629–642.
  13. Peng, R.; Li, Y.F.; Zhang, W.J.; Hu, Q.P. Testing effort dependent software reliability model for imperfect debugging process considering both detection and correction. Reliab. Eng. Syst. Saf. 2014, 126, 37–43.
  14. Lo, J.H.; Huang, C.Y. An integration of fault detection and correction processes in software reliability analysis. J. Syst. Softw. 2006, 79, 1312–1323.
  15. Shuy, J.; Liu, H.W.; Wu, Z.B.; Yang, X.Z. A software reliability growth model integrating fault detection and fault correction processes. Chin. High Technol. Lett. 2010, 20, 386–391. (In Chinese)
  16. Gokhale, S.S.; Lyu, M.R.; Trivedi, K.S. Analysis of Software Fault Removal Policies Using a Non-Homogeneous Continuous Time Markov Chain. Softw. Qual. J. 2004, 12, 211–230.
  17. Gokhale, S.S.; Lyu, M.R.; Trivedi, K.S. Incorporating fault debugging activities into software reliability models: A simulation approach. Reliab. IEEE Trans. 2006, 55, 281–292.
  18. Jia, L.; Yang, B.; Guo, S.; Park, D.H. Software reliability modeling considering fault correction process. IEICE Trans. Inf. Syst. 2010, 93, 185–188.
  19. Huang, C.Y.; Huang, W.C. Software reliability analysis and measurement using finite and infinite server queueing models. Reliab. IEEE Trans. 2008, 57, 192–203.
  20. Hwang, S.; Pham, H. Quasi-renewal time-delay fault-removal consideration in software reliability modeling. Systems, Man and Cybernetics, Part A: Systems and Humans. IEEE Trans. 2009, 39, 200–209.
  21. Huang, C.Y.; Kuo, S.Y.; Lyu, M.R. An assessment of testing-effort dependent software reliability growth models. Reliab. IEEE Trans. 2007, 56, 198–211.
  22. Cai, X.; Lyu, M.R. Software Reliability Modeling with Test Coverage: Experimentation and Measurement with A Fault-Tolerant Software Project. 18th IEEE Int. Symp. Softw. Reliab. 2007, 2007, 17–26.
  23. Malaiya, Y.K.; Li, M.N.; Bieman, J.M.; Karcich, R. Software reliability growth with test coverage. Reliab. IEEE Trans. 2002, 51, 420–426.
  24. Pham, H.; Zhang, X. NHPP software reliability and cost models with testing coverage. Eur. J. Oper. Res. 2003, 145, 443–454.
  25. Vouk, M.A. Using Reliability Models during Testing with Non-Operational Profiles. Available online: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.47.8863&rep=rep1&type=pdf (accessed on 29 July 2021).
  26. Gokhale, S.; Trivedi, K.S. A time/structure based software reliability model. Ann. Softw. Eng. 1999, 8, 85–121.
  27. Park, J.-Y.; Lee, G.; Park, J.H. A class of coverage growth functions and its practical application. J. Korean Stat. Soc. 2008, 37, 241–247.
  28. Zhang, X.; Teng, X.; Pham, H. Considering fault removal efficiency in software reliability assessment. IEEE Trans. Syst. Man, Cybern. Part A Syst. Hum. 2003, 33, 114–120.
  29. Ohba, M. Inflection S-Shaped Software Reliability Growth Model; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 1984; pp. 144–162.
  30. Yamada, S.; Ohba, M.; Osaki, S. S-shaped reliability growth modeling for software fault detection. Reliab. IEEE Trans. 1983, 12, 475–484.
  31. Hossain, S.A.; Ram, C.D. Estimating the parameters of a non-homogeneous Poisson process model for software reliability. Reliab. IEEE Trans. 1993, 42, 604–612.
  32. Yamada, S.; Tokuno, K.; Osaki, S. Imperfect debugging models with fault introduction rate for software reliability assessment. Int. J. Syst. Sci. 1992, 23, 2241–2252.
  33. Pham, H.; Zhang, X. An NHPP Software Reliability Model and Its Comparison. Int. J. Reliab. Qual. Saf. Eng. 1997, 4, 269–282.
  34. Kapur, P.K.; Pham, H.; Anand, S.; Yadav, K. A Unified Approach for Developing Software Reliability Growth Models in the Presence of Imperfect Debugging and Error Generation. Reliab. IEEE Trans. 2011, 60, 331–340.
  35. Tohma, Y.; Yamano, H.; Ohba, M.; Jacoby, R. The estimation of parameters of the hypergeometric distribution and its application to the software reliability growth model. IEEE Trans. Softw. Eng. 1991, 17, 483–489.
Figure 1. The number of cumulative detected faults and corrected faults for DS-1.
Figure 2. The ratio of the number of corrected faults to the number of detected faults for DS-1.
Figure 3. Comparison of the fitting results of the SRGMs for DS-1.
Figure 4. (X, Y, Z) represents the (MSE, $R^2$, Adjusted $R^2$) values for DS-1.
Figure 5. Comparison of the fitting results of the SRGMs for DS-2.
Figure 6. (X, Y, Z) represents the (MSE, $R^2$, Adjusted $R^2$) values for DS-2.
Figure 7. The fitting results of the SRGMs compared with the actual data for DS-3.
Figure 8. (X, Y, Z) represents the (MSE, $R^2$, Adjusted $R^2$) values for DS-3.
Table 1. Comparison of two S-shaped functions as $r(t)$ for DS-1.
| No. | $r(t)$ | MSE | $R^2$ | Adjusted $R^2$ |
|---|---|---|---|---|
| 1 | $\frac{1}{1+\beta e^{-bt}}$ | 0.0022 | 0.9765 | 0.9749 |
| 2 | $1-(1+bt)e^{-bt}$ | 0.0038 | 0.9786 | 0.9756 |
Note: The definitions of MSE, $R^2$ and Adjusted $R^2$ are shown in Section 3.1.
Table 2. Software reliability models and their MVFs.
| No. | Model Name | Model Type | MVF ($m(t)$ or $m_c(t)$) |
|---|---|---|---|
| 1 | G-O model (also called $m_1(t)$ [14]) | Concave | $m(t)=a\left(1-e^{-bt}\right)$ |
| 2 | $m_2(t)$ [14] | Concave | $m_c(t)=a\left(1+\frac{c}{b-c}e^{-bt}-\frac{b}{b-c}e^{-ct}\right)$ |
| 3 | $m_3(t)$ [14] | S-shaped | $m_c(t)=a\left(1-(1+bt)e^{-bt}\right)$ |
| 4 | Inflection S-shaped [29] | Concave | $m(t)=\frac{a\left(1-e^{-bt}\right)}{1+\beta e^{-bt}}$ |
| 5 | Yamada exponential [30] | Concave | $m(t)=a\left(1-e^{-\gamma\alpha\left(1-e^{-\beta t}\right)}\right)$ |
| 6 | Yamada Rayleigh [30] | S-shaped | $m(t)=a\left(1-e^{-\gamma\alpha\left(1-e^{-\beta t^{2}/2}\right)}\right)$ |
| 7 | Yamada Weibull [30] | Concave and S-shaped | $m(t)=a\left(1-e^{-\gamma\alpha\left(1-e^{-\beta t^{r}}\right)}\right)$ |
| 8 | Delayed S-shaped [30] | S-shaped | $m(t)=a\left(1-(1+bt)e^{-bt}\right)$ |
| 9 | HD/G-O [31] | Concave | $m(t)=\log\left[\left(e^{a}-c\right)/\left(e^{ae^{-bt}}-c\right)\right]$ |
| 10 | Yamada imperfect (1) [32] | Concave | $m(t)=\frac{ab}{\alpha+b}\left(e^{\alpha t}-e^{-bt}\right)$ |
| 11 | Yamada imperfect (2) [32] | Concave | $m(t)=a\left(1-e^{-bt}\right)\left(1-\frac{\alpha}{b}\right)+\alpha at$ |
| 12 | P-Z (1997) model [33] | S-shaped and Concave | $m(t)=\frac{1}{1+\beta e^{-bt}}\left[(c+a)\left(1-e^{-bt}\right)-\frac{ab}{b-\alpha}\left(e^{-\alpha t}-e^{-bt}\right)\right]$ |
| 13 | Fault removal model (2003) [28] | S-shaped | $m(t)=\frac{a}{p-\beta}\left[1-\left(\frac{(1+\alpha)e^{-bt}}{1+\alpha e^{-bt}}\right)^{\frac{c}{b}(p-\beta)}\right]$ |
| 14 | P-Z coverage model (2003) [24] | S-shaped and Concave | $m(t)=a\left(1+\alpha t-\frac{bt+1}{e^{bt}}\right)-\frac{a\alpha(1+bt)}{be^{bt+1}}\left[\ln(bt+1)+\sum_{i=0}^{\infty}\frac{(1+bt)^{i+1}-1}{(i+1)!\,(i+1)}\right]$ |
| 15 | SRGM-3 model (2011) [34] | S-shaped | $m(t)=\frac{A}{1-\alpha}\left[1-\left(\left(1+bt+\frac{b^{2}t^{2}}{2}\right)e^{-bt}\right)^{p(1-\alpha)}\right]$ |
| 16 | Proposed model | S-shaped and Concave | $m_c(t)=\frac{a}{1-\alpha}\left[1-\left(1-\frac{A\left(1-e^{-rt}\right)}{1+ce^{-rt}}\right)^{1-\alpha}\right]\frac{1}{1+\beta e^{-bt}}$ |
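As a rough illustration of how the MVFs in Table 2 can be evaluated numerically, the following Python sketch implements three of them: the G-O model, the delayed S-shaped model, and the proposed model for $m_c(t)$. The function names and the parameter values in the usage lines are hypothetical and are not the estimates reported in this paper. The proposed-model function assumes the form listed in row 16 of Table 2, i.e., a detected-fault MVF multiplied by the ratio $1/(1+\beta e^{-bt})$.

```python
import math


def go_model(t, a, b):
    """G-O model: m(t) = a * (1 - exp(-b t))."""
    return a * (1.0 - math.exp(-b * t))


def delayed_s_shaped(t, a, b):
    """Delayed S-shaped model: m(t) = a * (1 - (1 + b t) exp(-b t))."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))


def proposed_mc(t, a, A, alpha, c, r, beta, b):
    """Proposed model (assumed form): detected-fault MVF times the ratio r(t)."""
    # Testing coverage term A(1 - e^{-rt}) / (1 + c e^{-rt}).
    coverage = A * (1.0 - math.exp(-r * t)) / (1.0 + c * math.exp(-r * t))
    # Mean number of detected faults with fault introduction rate alpha.
    detected = a / (1.0 - alpha) * (1.0 - (1.0 - coverage) ** (1.0 - alpha))
    # Ratio of corrected to detected faults, 1 / (1 + beta e^{-bt}).
    ratio = 1.0 / (1.0 + beta * math.exp(-b * t))
    return detected * ratio


# Hypothetical parameter values, chosen only to exercise the functions.
weeks = list(range(0, 21))
go_curve = [go_model(t, a=100.0, b=0.1) for t in weeks]
dss_curve = [delayed_s_shaped(t, a=100.0, b=0.3) for t in weeks]
mc_curve = [
    proposed_mc(t, a=100.0, A=0.9, alpha=0.2, c=5.0, r=0.5, beta=4.0, b=0.4)
    for t in weeks
]
```

All three curves start at zero and increase with $t$, which is a quick sanity check before the functions are passed to any parameter estimation routine.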
Table 3. Data from a middle-size software system (DS-1).
| Weeks | Cumulative Detected Faults | Cumulative Corrected Faults | Weeks | Cumulative Detected Faults | Cumulative Corrected Faults |
|---|---|---|---|---|---|
| 1 | 12 | 3 | 10 | 114 | 109 |
| 2 | 23 | 3 | 11 | 116 | 113 |
| 3 | 43 | 12 | 12 | 123 | 120 |
| 4 | 64 | 32 | 13 | 126 | 125 |
| 5 | 84 | 53 | 14 | 128 | 127 |
| 6 | 97 | 78 | 15 | 132 | 127 |
| 7 | 109 | 89 | 16 | 141 | 135 |
| 8 | 111 | 98 | 17 | 144 | 143 |
| 9 | 112 | 107 | | | |
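The dependency between the two processes can be inspected directly from the DS-1 data in Table 3. The short Python sketch below computes, for each week, the ratio of cumulative corrected faults to cumulative detected faults (the quantity plotted in Figure 2) and the number of detected faults still awaiting correction. The variable names are illustrative only.

```python
# Cumulative weekly counts for DS-1 (weeks 1 to 17), as listed in Table 3.
detected = [12, 23, 43, 64, 84, 97, 109, 111, 112,
            114, 116, 123, 126, 128, 132, 141, 144]
corrected = [3, 3, 12, 32, 53, 78, 89, 98, 107,
             109, 113, 120, 125, 127, 127, 135, 143]

# Ratio of cumulative corrected faults to cumulative detected faults per week.
ratio = [c / d for c, d in zip(corrected, detected)]

# Faults detected but not yet corrected at the end of each week.
backlog = [d - c for d, c in zip(detected, corrected)]

for week, (r, q) in enumerate(zip(ratio, backlog), start=1):
    print(f"week {week:2d}: ratio = {r:.3f}, uncorrected = {q}")
```

The ratio starts at 0.25 in week 1 and approaches 1 by week 17, which is the behavior an S-shaped $r(t)$ is meant to capture.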
Table 4. Comparison of goodness-of-fit power of SRGMs for $m_c(t)$ of DS-1.
| Model No. | Model Name | Model Parameter Estimation Results | MSE | $R^2$ | Adjusted $R^2$ |
|---|---|---|---|---|---|
| 1 | $m_1(t)$ | $\hat{a}=231.6$, $\hat{b}=0.05838$ | 148.3333 | 0.9396 | 0.9356 |
| 2 | $m_2(t)$ | $\hat{a}=147.5$, $\hat{b}=0.2627$, $\hat{c}=0.263$ | 56.2357 | 0.9786 | 0.9756 |
| 3 | $m_3(t)$ | $\hat{a}=147.5$, $\hat{b}=0.2627$ | 52.4867 | 0.9786 | 0.9772 |
| 4 | Inflection S-shaped | $\hat{a}=129.9$, $\hat{b}=0.5074$, $\hat{c}=17.3$ | 48.1286 | 0.9817 | 0.9791 |
| 5 | Yamada exponential | $\hat{a}=235.8$, $\hat{\beta}=0.001389$, $\hat{\gamma}=6.448$, $\hat{\alpha}=6.403$ | 171.4615 | 0.9395 | 0.9256 |
| 6 | Yamada Rayleigh | $\hat{a}=142.9$, $\hat{\beta}=0.01177$, $\hat{\gamma}=2.109$, $\hat{\alpha}=1.638$ | 40.3308 | 0.9858 | 0.9825 |
| 7 | Yamada Weibull | $\hat{a}=141.9$, $\hat{r}=2.315$, $\hat{\alpha}=1.197$, $\hat{\beta}=0.004362$ | 34.6385 | 0.9878 | 0.9850 |
| 8 | Delayed S-shaped | $\hat{a}=147.5$, $\hat{b}=0.2627$ | 52.4867 | 0.9786 | 0.9772 |
| 9 | HD/G-O | $\hat{a}=231.6$, $\hat{b}=0.05837$, $\hat{c}=4.891$ | 158.9286 | 0.9396 | 0.9310 |
| 10 | Yamada imperfect (1) | $\hat{a}=231.6$, $\hat{b}=0.05837$, $\hat{\alpha}=1.436\times10^{-9}$ | 158.9286 | 0.9396 | 0.9356 |
| 11 | Yamada imperfect (2) | $\hat{a}=231.6$, $\hat{b}=0.05837$, $\hat{\alpha}=1.641\times10^{-9}$ | 158.9286 | 0.9396 | 0.9356 |
| 12 | P-Z (1997) model | $\hat{a}=0.9179$, $\hat{b}=0.5102$, $\hat{c}=129.3$, $\hat{\alpha}=0.08294$, $\hat{\beta}=17.49$ | 55.7917 | 0.9818 | 0.9758 |
| 13 | Fault removal model (2003) | $\hat{a}=99.8$, $\hat{\alpha}=932.9$, $\hat{b}=2.483$, $\hat{p}=0.8698$, $\hat{c}=0.3123$, $\hat{\beta}=0.1595$ | 19.0364 | 0.9943 | 0.9917 |
| 14 | P-Z coverage model (2003) | $\hat{a}=147.5$, $\hat{b}=0.2627$, $\hat{\alpha}=4.776\times10^{-10}$ | 56.2357 | 0.9786 | 0.9772 |
| 15 | SRGM-3 model (2011) | $\hat{A}=126.7$, $\hat{\alpha}=0.08195$, $\hat{b}=2.569$, $\hat{p}=0.1126$ | 34.0154 | 0.9880 | 0.9863 |
| 16 | Proposed model | $\hat{a}=230.5$, $\hat{A}=0.8538$, $\hat{\alpha}=0.7528$, $\hat{c}=73.78$, $\hat{b}=3.794$, $\hat{\beta}=0.05419$, $\hat{r}=1.093$ | 5.9370 | 0.9984 | 0.9974 |
Notes: The bold number means the result of the best SRGM in this column.
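The three goodness-of-fit criteria reported in these tables can be sketched in a few lines of Python. The definitions used here are assumptions, since Section 3.1 is the authoritative source: MSE is taken as SSE divided by $n-k$, and Adjusted $R^2$ as $1-(1-R^2)(n-1)/(n-k)$, where $n$ is the number of observations and $k$ the number of model parameters. These forms appear consistent with the tabulated values; for example, the G-O row of Table 4 ($R^2=0.9396$, $n=17$, $k=2$) gives an adjusted value that rounds to 0.9356.

```python
def sse(observed, predicted):
    """Sum of squared errors between observations and model values."""
    return sum((y - m) ** 2 for y, m in zip(observed, predicted))


def mse(observed, predicted, n_params):
    """Mean squared error, assumed here to be SSE / (n - k)."""
    return sse(observed, predicted) / (len(observed) - n_params)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SSE / SST."""
    mean_y = sum(observed) / len(observed)
    sst = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - sse(observed, predicted) / sst


def adjusted_r_squared(r2, n, n_params):
    """Adjusted R^2, assumed here to be 1 - (1 - R^2)(n - 1) / (n - k)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)


# Toy data, used only to exercise the functions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
toy_mse = mse(observed, predicted, n_params=2)
toy_r2 = r_squared(observed, predicted)
toy_adj = adjusted_r_squared(toy_r2, n=len(observed), n_params=2)

# Consistency check against the G-O row of Table 4 (n = 17 weeks, k = 2).
go_adj = adjusted_r_squared(0.9396, n=17, n_params=2)
```

A smaller MSE and larger $R^2$ and Adjusted $R^2$ indicate a better fit; the adjusted form penalizes models with more parameters.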
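Table 5. Failures per day and cumulative failures for DS-2.
| Days | Faults | Cumulative Faults | Days | Faults | Cumulative Faults |
|---|---|---|---|---|---|
| 1 | 5 | 5 | 57 | 2 | 448 |
| 2 | 5 | 10 | 58 | 3 | 451 |
| 3 | 5 | 15 | 59 | 2 | 453 |
| 4 | 5 | 20 | 60 | 7 | 460 |
| 5 | 6 | 26 | 61 | 3 | 463 |
| 6 | 8 | 34 | 62 | 0 | 463 |
| 7 | 2 | 36 | 63 | 1 | 464 |
| 8 | 7 | 43 | 64 | 0 | 464 |
| 9 | 4 | 47 | 65 | 1 | 465 |
| 10 | 2 | 49 | 66 | 0 | 465 |
| 11 | 31 | 80 | 67 | 0 | 465 |
| 12 | 4 | 84 | 68 | 1 | 466 |
| 13 | 24 | 108 | 69 | 1 | 467 |
| 14 | 49 | 157 | 70 | 0 | 467 |
| 15 | 14 | 171 | 71 | 0 | 467 |
| 16 | 12 | 183 | 72 | 1 | 468 |
| 17 | 8 | 191 | 73 | 1 | 469 |
| 18 | 9 | 200 | 74 | 0 | 469 |
| 19 | 4 | 204 | 75 | 0 | 469 |
| 20 | 7 | 211 | 76 | 0 | 469 |
| 21 | 6 | 217 | 77 | 1 | 470 |
| 22 | 9 | 226 | 78 | 2 | 472 |
| 23 | 4 | 230 | 79 | 0 | 472 |
| 24 | 4 | 234 | 80 | 1 | 473 |
| 25 | 2 | 236 | 81 | 0 | 473 |
| 26 | 4 | 240 | 82 | 0 | 473 |
| 27 | 3 | 243 | 83 | 0 | 473 |
| 28 | 9 | 252 | 84 | 0 | 473 |
| 29 | 2 | 254 | 85 | 0 | 473 |
| 30 | 5 | 259 | 86 | 0 | 473 |
| 31 | 4 | 263 | 87 | 2 | 475 |
| 32 | 1 | 264 | 88 | 0 | 475 |
| 33 | 4 | 268 | 89 | 0 | 475 |
| 34 | 3 | 271 | 90 | 0 | 475 |
| 35 | 6 | 277 | 91 | 0 | 475 |
| 36 | 13 | 290 | 92 | 0 | 475 |
| 37 | 19 | 309 | 93 | 0 | 475 |
| 38 | 15 | 324 | 94 | 0 | 475 |
| 39 | 7 | 331 | 95 | 0 | 475 |
| 40 | 15 | 346 | 96 | 1 | 476 |
| 41 | 21 | 367 | 97 | 0 | 476 |
| 42 | 8 | 375 | 98 | 0 | 476 |
| 43 | 6 | 381 | 99 | 0 | 476 |
| 44 | 20 | 401 | 100 | 1 | 477 |
| 45 | 10 | 411 | 101 | 0 | 477 |
| 46 | 3 | 414 | 102 | 0 | 477 |
| 47 | 3 | 417 | 103 | 1 | 478 |
| 48 | 8 | 425 | 104 | 0 | 478 |
| 49 | 5 | 430 | 105 | 0 | 478 |
| 50 | 1 | 431 | 106 | 1 | 479 |
| 51 | 2 | 433 | 107 | 0 | 479 |
| 52 | 2 | 435 | 108 | 0 | 479 |
| 53 | 2 | 437 | 109 | 1 | 480 |
| 54 | 7 | 444 | 110 | 0 | 480 |
| 55 | 2 | 446 | 111 | 1 | 481 |
| 56 | 0 | 446 | | | |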
Table 6. Comparison of goodness-of-fit power of SRGMs for DS-2.
| Model No. | Model Name | Model Parameter Estimation Results | MSE | $R^2$ | Adjusted $R^2$ |
|---|---|---|---|---|---|
| 1 | G-O model | $\hat{a}=538.1$, $\hat{b}=0.02575$ | 804.2202 | 0.9646 | 0.9643 |
| 2 | $m_2(t)$ | $\hat{a}=484.9$, $\hat{b}=0.06756$, $\hat{c}=0.06659$ | 338.3333 | 0.9852 | 0.9850 |
| 3 | $m_3(t)$ | $\hat{a}=488.1$, $\hat{b}=0.06629$ | 331.8349 | 0.9854 | 0.9853 |
| 4 | Inflection S-shaped | $\hat{a}=484.6$, $\hat{b}=0.06681$, $\hat{\beta}=3.648$ | 300.0000 | 0.9869 | 0.9867 |
| 5 | Yamada exponential | $\hat{a}=6.987\times10^{4}$, $\hat{\beta}=0.02566$, $\hat{\gamma}=0.03822$, $\hat{\alpha}=0.2023$ | 820.9346 | 0.9645 | 0.9635 |
| 6 | Yamada Rayleigh | $\hat{a}=568.2$, $\hat{\beta}=0.001074$, $\hat{\gamma}=1.325$, $\hat{\alpha}=1.384$ | 461.5888 | 0.9800 | 0.9795 |
| 7 | Yamada Weibull | $\hat{a}=485.4$, $\hat{r}=1.456$, $\hat{\alpha}=128.7$, $\hat{\beta}=3.359\times10^{-5}$ | 308.7850 | 0.9867 | 0.9863 |
| 8 | Delayed S-shaped | $\hat{a}=488.1$, $\hat{b}=0.06629$ | 331.8349 | 0.9854 | 0.9853 |
| 9 | HD/G-O | $\hat{a}=538.1$, $\hat{b}=0.02575$, $\hat{c}=2.849$ | 811.6667 | 0.9646 | 0.9639 |
| 10 | Yamada imperfect (1) | $\hat{a}=538.1$, $\hat{b}=0.02575$, $\hat{\alpha}=4.124\times10^{-10}$ | 811.6667 | 0.9646 | 0.9643 |
| 11 | Yamada imperfect (2) | $\hat{a}=538.1$, $\hat{b}=0.02575$, $\hat{\alpha}=1.074\times10^{-12}$ | 811.6667 | 0.9646 | 0.9643 |
| 12 | P-Z (1997) model | $\hat{a}=0.9988$, $\hat{b}=0.06682$, $\hat{c}=483.6$, $\hat{\alpha}=0.5211$, $\hat{\beta}=3.65$ | 305.6604 | 0.9869 | 0.9864 |
| 13 | Fault removal model (2003) | $\hat{a}=468.2$, $\hat{\alpha}=150.7$, $\hat{b}=0.8082$, $\hat{p}=0.9518$, $\hat{c}=0.03979$, $\hat{\beta}=0.02274$ | 419.2381 | 0.9822 | 0.9814 |
| 14 | P-Z coverage model (2003) | $\hat{a}=488.1$, $\hat{b}=0.06629$, $\hat{\alpha}=1.254\times10^{-10}$ | 334.9074 | 0.9854 | 0.9853 |
| 15 | SRGM-3 model (2011) | $\hat{A}=480.7$, $\hat{\alpha}=0.02476$, $\hat{b}=0.3128$, $\hat{p}=0.1695$ | 354.8598 | 0.9847 | 0.9842 |
| 16 | Proposed model | $\hat{a}=71.06$, $\hat{A}=0.9998$, $\hat{\alpha}=0.942$, $\hat{c}=0.05328$, $\hat{b}=0.4515$, $\hat{\beta}=32.14$, $\hat{r}=0.1498$ | 193.8462 | 0.9919 | 0.9914 |
Notes: The bold number means the result of the best SRGM in this column.
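Table 7. Failures per week and cumulative failures for DS-3.
| Testing Time (Weeks) | CPU (h) | Faults | Cumulative Faults | Testing Time (Weeks) | CPU (h) | Faults | Cumulative Faults |
|---|---|---|---|---|---|---|---|
| 1 | 519 | 16 | 16 | 11 | 6539 | 6 | 81 |
| 2 | 968 | 8 | 24 | 12 | 7083 | 5 | 86 |
| 3 | 1430 | 3 | 27 | 13 | 7487 | 4 | 90 |
| 4 | 1893 | 6 | 33 | 14 | 7846 | 3 | 93 |
| 5 | 2490 | 8 | 41 | 15 | 8205 | 3 | 96 |
| 6 | 3058 | 8 | 49 | 16 | 8564 | 2 | 98 |
| 7 | 3625 | 5 | 54 | 17 | 8923 | 1 | 99 |
| 8 | 4422 | 4 | 58 | 18 | 9282 | 1 | 100 |
| 9 | 5218 | 11 | 69 | 19 | 9641 | 0 | 100 |
| 10 | 5823 | 6 | 75 | 20 | 10,000 | 0 | 100 |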
Table 8. Comparison of goodness-of-fit power of SRGMs for DS-3.
| Model No. | Model Name | Model Parameter Estimation Results | MSE | $R^2$ | Adjusted $R^2$ |
|---|---|---|---|---|---|
| 1 | G-O model | $\hat{a}=130.2$, $\hat{b}=0.08317$ | 12.9056 | 0.9857 | 0.9849 |
| 2 | $m_2(t)$ | $\hat{a}=130.2$, $\hat{b}=0.08317$, $\hat{c}=4.646\times10^{4}$ | 13.6647 | 0.9857 | 0.9840 |
| 3 | $m_3(t)$ | $\hat{a}=104$, $\hat{b}=0.2654$ | 28.0611 | 0.9689 | 0.9672 |
| 4 | Inflection S-shaped | $\hat{a}=110.8$, $\hat{b}=0.1721$, $\hat{\beta}=1.205$ | 10.5647 | 0.9890 | 0.9877 |
| 5 | Yamada exponential | $\hat{a}=999.5$, $\hat{\beta}=0.07685$, $\hat{\gamma}=0.279$, $\hat{\alpha}=0.51$ | 14.8438 | 0.9854 | 0.9827 |
| 6 | Yamada Rayleigh | $\hat{a}=115.8$, $\hat{\beta}=0.01721$, $\hat{\gamma}=3.03$, $\hat{\alpha}=0.6548$ | 49.4188 | 0.9514 | 0.9422 |
| 7 | Yamada Weibull | $\hat{a}=121.1$, $\hat{r}=1.027$, $\hat{\alpha}=245.2$, $\hat{\beta}=3.542\times10^{-4}$ | 15.1750 | 0.9851 | 0.9823 |
| 8 | Delayed S-shaped | $\hat{a}=104$, $\hat{b}=0.2654$ | 28.0611 | 0.9689 | 0.9672 |
| 9 | HD/G-O | $\hat{a}=130.2$, $\hat{b}=0.08317$, $\hat{c}=0.2769$ | 13.6647 | 0.9857 | 0.9840 |
| 10 | Yamada imperfect (1) | $\hat{a}=130.2$, $\hat{b}=0.08317$, $\hat{\alpha}=9.363\times10^{-10}$ | 13.6647 | 0.9857 | 0.9849 |
| 11 | Yamada imperfect (2) | $\hat{a}=130.2$, $\hat{b}=0.08317$, $\hat{\alpha}=1.04\times10^{-9}$ | 13.6647 | 0.9857 | 0.9849 |
| 12 | P-Z (1997) model | $\hat{a}=1.589\times10^{-8}$, $\hat{b}=0.1721$, $\hat{c}=110.8$, $\hat{\alpha}=0.000368$, $\hat{\beta}=1.205$ | 11.9733 | 0.9890 | 0.9860 |
| 13 | Fault removal model (2003) | $\hat{a}=103.6$, $\hat{\alpha}=57.49$, $\hat{b}=0.08193$, $\hat{p}=0.9993$, $\hat{c}=4.916$, $\hat{\beta}=4.827\times10^{-5}$ | 10.8786 | 0.9906 | 0.9873 |
| 14 | P-Z coverage model (2003) | $\hat{a}=73.19$, $\hat{b}=0.3563$, $\hat{\alpha}=0.02588$ | 27.1588 | 0.9716 | 0.9683 |
| 15 | SRGM-3 model (2011) | $\hat{A}=83.46$, $\hat{\alpha}=0.3433$, $\hat{b}=37.1$, $\hat{p}=0.003678$ | 14.9250 | 0.9853 | 0.9826 |
| 16 | Proposed model | $\hat{a}=160.4$, $\hat{A}=0.5101$, $\hat{\alpha}=0.7292$, $\hat{c}=87.26$, $\hat{b}=0.2813$, $\hat{\beta}=6.468$, $\hat{r}=7.007$ | 2.3192 | 0.9981 | 0.9973 |
Notes: The bold number means the result of the best SRGM in this column.
Table 9. Comparison of SRGMs’ predictive power for DS-1, DS-2 and DS-3.
| Model No. | Model Name | SSE (80% of DS-1) | SSE (80% of DS-2) | SSE (80% of DS-3) |
|---|---|---|---|---|
| 1 | G-O model (also called $m_1(t)$) | 823.0713 | $6.7713\times10^{4}$ | 147.2964 |
| 2 | $m_2(t)$ | 119.7111 | $7.3709\times10^{3}$ | 356.5743 |
| 3 | $m_3(t)$ | 69.0496 | $3.5343\times10^{3}$ | 6.9655 |
| 4 | Inflection S-shaped | 750.1803 | $2.7742\times10^{3}$ | 356.5814 |
| 5 | Yamada exponential | 401.7862 | $6.0381\times10^{4}$ | 377.2231 |
| 6 | Yamada Rayleigh | 197.3302 | $3.4667\times10^{5}$ | 73.4246 |
| 7 | Yamada Weibull | 553.6076 | $3.4596\times10^{3}$ | 676.6515 |
| 8 | Delayed S-shaped | 69.0496 | $3.5343\times10^{3}$ | 6.9655 |
| 9 | HD/G-O | 823.0713 | $6.7713\times10^{4}$ | 356.5818 |
| 10 | Yamada imperfect (1) | 823.0770 | $6.7713\times10^{4}$ | 356.5837 |
| 11 | Yamada imperfect (2) | 823.0724 | $5.7234\times10^{4}$ | 356.5827 |
| 12 | P-Z (1997) model | 595.6770 | $2.8149\times10^{3}$ | 334.1059 |
| 13 | Fault removal model (2003) | 286.2138 | $1.9332\times10^{4}$ | 345.1824 |
| 14 | P-Z coverage model (2003) | 69.0496 | $3.5343\times10^{3}$ | $1.5362\times10^{3}$ |
| 15 | SRGM-3 model (2011) | 549.9081 | $1.1495\times10^{4}$ | $1.2792\times10^{4}$ |
| 16 | Proposed model | 37.0169 | 236.0253 | 23.0348 |
Notes: The bold number means the result of the best SRGM in this column.
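The predictive comparison above estimates each model on the first 80% of a data set and measures the SSE on the remaining observations. The Python sketch below illustrates that procedure for the G-O model on the DS-3 cumulative faults from Table 7. Several choices are assumptions made for illustration: time is measured in weeks, the split is taken as the first 16 of 20 weeks, and the least squares fit scans a hypothetical grid of $b$ values (for a fixed $b$ the optimal $a$ has a closed form). It is not the authors' estimation routine, and it is not expected to reproduce the tabulated SSE values exactly.

```python
import math

# Cumulative faults for DS-3 at weeks 1..20 (Table 7).
cum_faults = [16, 24, 27, 33, 41, 49, 54, 58, 69, 75,
              81, 86, 90, 93, 96, 98, 99, 100, 100, 100]
weeks = list(range(1, len(cum_faults) + 1))


def go_model(t, a, b):
    """G-O model: m(t) = a * (1 - exp(-b t))."""
    return a * (1.0 - math.exp(-b * t))


def fit_go(ts, ys, b_grid):
    """Least squares fit of the G-O model by scanning b over a grid."""
    best = None
    for b in b_grid:
        f = [1.0 - math.exp(-b * t) for t in ts]
        # For fixed b the model is linear in a, so a has a closed form.
        a = sum(y * fi for y, fi in zip(ys, f)) / sum(fi * fi for fi in f)
        err = sum((y - a * fi) ** 2 for y, fi in zip(ys, f))
        if best is None or err < best[2]:
            best = (a, b, err)
    return best


# Assumed split: the first 80% of the observations are used for estimation.
n_train = int(0.8 * len(weeks))
b_grid = [i / 1000.0 for i in range(1, 1001)]
a_hat, b_hat, fit_sse = fit_go(weeks[:n_train], cum_faults[:n_train], b_grid)

# Prediction SSE on the remaining 20% of the data.
pred_sse = sum((y - go_model(t, a_hat, b_hat)) ** 2
               for t, y in zip(weeks[n_train:], cum_faults[n_train:]))
print(f"a = {a_hat:.1f}, b = {b_hat:.3f}, prediction SSE = {pred_sse:.2f}")
```

The same split and SSE computation apply to any other MVF once a fitting routine for its parameters is available; only `go_model` and `fit_go` would need to be replaced.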
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Li, Q.; Pham, H. Modeling Software Fault-Detection and Fault-Correction Processes by Considering the Dependencies between Fault Amounts. Appl. Sci. 2021, 11, 6998. https://doi.org/10.3390/app11156998
