Are Infinite-Failure NHPP-Based Software Reliability Models Useful?

Abstract: In the literature, infinite-failure software reliability models (SRMs), such as the Musa-Okumoto SRM (1984), have been demonstrated to be effective in quantitatively characterizing software testing processes and assessing software reliability. This paper focuses on the infinite-failure (type-II) non-homogeneous Poisson process (NHPP)-based SRMs and comprehensively evaluates their performance by comparing them with the existing finite-failure (type-I) NHPP-based SRMs. More specifically, to describe the software fault-detection time distribution, we postulate 11 representative probability distribution functions that can be categorized into the generalized exponential distribution family and the extreme-value distribution family. Then, we compare the goodness-of-fit and predictive performances of the associated 11 type-I and 11 type-II NHPP-based SRMs. In numerical experiments, we analyze software fault-count data collected from 16 actual development projects, known in the software industry as fault-count time-domain data and fault-count time-interval data (group data). The maximum likelihood method is utilized to estimate the model parameters of both types of NHPP-based SRMs. Comparing the type-I with the type-II SRMs, it is shown that the type-II NHPP-based SRMs can exhibit better predictive performance than the existing type-I NHPP-based SRMs, especially in the early stage of software testing.


Introduction
In actual software development projects, clients expect software products to be of high quality. As software products become larger and more complex, software reliability is gaining attention from developers as an important attribute of software quality. Therefore, in the modern software development process, developers concentrate their human and material resources on the testing process to detect and fix as many inherent faults as possible for the purpose of improving software reliability. Due to the high cost of software testing, quantification of software reliability is also a significant concern during the verification phase.
In software reliability engineering, the probabilistic behavior of the fault detection and correction process during the software testing phase is usually characterized by a stochastic counting process. On the other hand, after the software product is released, the probability of the product not experiencing any failure caused by software faults over a specific time interval is usually defined as the quantitative software reliability. To measure this software reliability, over the last five decades, hundreds of probabilistic models known as software reliability models (SRMs) have been developed to quantitatively assess software reliability during the testing and operational phases. Among the existing SRMs, the non-homogeneous Poisson process (NHPP)-based SRMs are recognized as a very important class because of their mathematical tractability and high applicability. By modeling the software failure time, Kuo and Yang [1] classified NHPP-based SRMs into general order statistics SRMs and record value statistics SRMs. The same authors [1] proposed an alternative and more general classification by dividing NHPP-based SRMs into two types: finite-failure (type-I) and infinite-failure (type-II) NHPP-based SRMs, whose mean value functions are defined as the expected cumulative number of software failures. The best-known finite-failure (type-I) NHPP-based SRM was proposed by Goel and Okumoto [2], who assumed the exponential distribution as the fault-detection time distribution in software testing. Its mean value function is proportional to the cumulative distribution function (CDF) of the exponential distribution.
After that, postulating other fault-detection time distributions, several type-I NHPP-based SRMs were proposed in the literature, such as the truncated-normal NHPP-based SRM [3], the lognormal NHPP-based SRM [3,4], the truncated-logistic NHPP-based SRM [5], the log-logistic NHPP-based SRM [6], the extreme-value NHPP-based SRMs [7,8], the gamma NHPP-based SRM [9,10], and the Pareto NHPP-based SRM [11]. At the same time, introducing a series of lifetime CDFs commonly used for modeling failure time in reliability engineering, a few infinite-failure (type-II) NHPP-based SRMs have also been developed and widely used to quantitatively evaluate software reliability. The power-law process model [12-14] and the logarithmic Poisson execution time model [15,16] are classified as type-II NHPP-based SRMs. Note that the use of type-I NHPP-based SRMs does not necessarily imply that all inherent software faults are detected over an infinite time horizon. In other words, the precise number of inherent faults cannot be known, even if software testing is performed for an indefinite period. Thus, the finiteness in the type-I NHPP-based SRMs holds in the sense of the expected cumulative number of detected faults. Unfortunately, a fair comparison between the type-I and type-II NHPP-based SRMs has not been made in the past, because only a limited number of type-II NHPP-based SRMs have been considered in the literature.
The research question of this paper is "Are infinite-failure NHPP-based SRMs useful?" More specifically, this paper investigates whether infinite-failure NHPP-based SRMs can guarantee better goodness-of-fit performance for the fault-count data collected in the software testing phase in comparison with finite-failure NHPP-based SRMs, and whether they can guarantee more accurate prediction of the number of software faults. Goodness-of-fit and predictive performance are generally recognized as the critical factors determining which SRM should be practically applied to quantitatively assess software reliability.
The original contribution of this paper is to investigate the type-II NHPP-based SRMs with 11 representative CDFs from the literature. Three of these type-II NHPP-based SRMs are confirmed to be equivalent to the existing Cox-Lewis process, logarithmic Poisson execution time model, and power-law process, while the remaining eight type-II SRMs are novel. We confirm that the corresponding type-I and type-II NHPP-based SRMs can be obtained by importing the same software fault-detection time CDFs into the finite- and infinite-failure assumptions, respectively. We make a comprehensive comparison between the existing type-I NHPP-based SRMs and their associated type-II NHPP-based SRMs. As shown in [17], it seems sufficient to consider 11 kinds of software fault-detection time CDFs in making goodness-of-fit and predictive performance comparisons between the two different NHPP-based modeling frameworks.
The rest of this paper is organized as follows. Section 2 describes the definition of the NHPP and illustrates NHPP-based software reliability modeling under the finite-failure and infinite-failure hypotheses. We present the 11 existing type-I NHPP-based SRMs based on the finite-failure hypothesis in [17] and propose 11 type-II NHPP-based SRMs with the same CDFs. The maximum likelihood approach to estimating the model parameters is summarized. We confirm that maximum likelihood estimation can be used for parameter estimation of our type-II NHPP-based SRMs for software fault-count time-domain data. We also give specific expressions of the likelihood and log-likelihood functions for software fault-count time-interval data (group data), which are more common in the industry.
In the numerical examples in Section 3, we employ a total of 16 datasets collected from 16 actual development projects. For each dataset, we investigate the goodness-of-fit and predictive performances of the type-I and type-II NHPP-based SRMs whose parameters are estimated by the maximum likelihood method. In addition, we also use these SRMs to quantitatively evaluate software reliability over a given time period, and analyze the applicability of the type-I and type-II NHPPs in predicting software reliability. In Section 4, the paper is summarized with some remarks and future directions.

Preliminary
As a well-known Markov process, the non-homogeneous Poisson process (NHPP) is regarded as a generalization of the classical homogeneous Poisson process (HPP). If the constant intensity in the definition of the HPP is replaced by a function λ(t) of the time point t, then the HPP is generalized to an NHPP. More specifically, a non-negative, non-decreasing stochastic counting process {N(t), t ≥ 0} becomes an NHPP under the following assumptions.

•
The NHPP has independent increments, so the number of occurrences in a specific time interval depends only on that interval and not on the past history of the process, which is also known as the Markov property.

•
The initial state of the process is given by N(0) = 0.

•
The occurrence probability of exactly one event in a given time period [t, t + ∆t) for an NHPP is Pr{N(t + ∆t) − N(t) = 1} = λ(t)∆t + o(∆t), where λ(t), an absolutely continuous function, is named the intensity function of the NHPP, and ∆t is recognized as an infinitesimal period of time.

•

The NHPP has negligible probability of two or more events occurring in [t, t + ∆t), i.e., Pr{N(t + ∆t) − N(t) ≥ 2} = o(∆t), where o(∆t) is a higher-order term of ∆t satisfying lim ∆t→0 o(∆t)/∆t = 0.

As a typical Markov process, the NHPP obeys the Kolmogorov forward equations

dP0(t)/dt = −λ(t; θ)P0(t),
dPn(t)/dt = λ(t; θ)[Pn−1(t) − Pn(t)] (n = 1, 2, . . .),

with P0(0) = 1 and Pn(0) = 0, where θ represents the free parameter vector in the transition rate (intensity) function λ(t; θ). By solving the above simultaneous differential equations, the transient probability Pr{N(t) = n | N(0) = 0} = Pn(t) is given by

Pn(t) = ([M(t; θ)]^n / n!) e^{−M(t; θ)}, M(t; θ) = ∫_0^t λ(s; θ) ds.

Through the Poisson property, M(t; θ) is defined as the mean value function of the NHPP and represents the expected cumulative number of event occurrences during the interval (0, t].
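As a concrete check of the Poisson property above, the transient probabilities Pn(t) can be evaluated numerically once a mean value function is fixed. The minimal sketch below assumes the bounded exponential mean value function M(t) = µ0(1 − e^{−bt}) of the Goel-Okumoto SRM discussed later, with purely illustrative parameter values:

```python
import math

def mean_value(t, mu0=100.0, b=0.05):
    # Bounded mean value function of the Goel-Okumoto (type-I) SRM:
    # expected cumulative number of faults detected by time t.
    return mu0 * (1.0 - math.exp(-b * t))

def p_n(n, t):
    # Transient probability Pr{N(t) = n}: Poisson pmf with mean M(t),
    # evaluated in log space for numerical stability at large n.
    m = mean_value(t)
    return math.exp(n * math.log(m) - m - math.lgamma(n + 1))

t = 30.0
total = sum(p_n(n, t) for n in range(1000))      # should be ~1
mean = sum(n * p_n(n, t) for n in range(1000))   # should be ~M(t)
print(round(total, 6), round(mean, 3), round(mean_value(t), 3))
```

The probabilities sum to one and the mean of N(t) recovers M(t), as the Poisson property requires.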

NHPP-Based SRMs
Most textbooks [16,18,19] have pointed out that, when the mean value function is used to characterize the cumulative number of software failures by time t, there are two types of NHPP-based SRMs: finite-failure NHPP-based SRMs and infinite-failure NHPP-based SRMs.

Finite-Failure (Type-I) NHPP-Based SRMs
In the software reliability modeling framework based on the finite-failure (type-I) NHPP, the number of software faults remaining before testing is assumed to obey a Poisson distribution with a positive mean µ0. Each software fault is assumed to be detected at an independent and identically distributed (i.i.d.) random time and to be fixed immediately after it is detected. For any t ∈ (0, +∞), a non-decreasing function F(t; α) describes the time distribution of each fault detection during the software testing phase, which is also known as the cumulative distribution function (CDF), where α indicates the free parameter vector of the CDF. Then, the resultant software fault-detection process is characterized by a binomially distributed random variable with probability F(t; α) over a Poisson-distributed population with parameter µ0. From a simple algebraic manipulation, the mean value function of the NHPP can be derived as

M(t; θ) = µ0 F(t; α), (3)

which is recognized as the expected cumulative number of faults detected by time point t, with θ = (µ0, α) and lim t→∞ M(t; θ) = µ0 (> 0). This property is consistent with the type-I modeling assumption that the expected number of faults remaining before software testing begins is finite. In Table 1, we summarize the 11 existing type-I NHPP-based SRMs with their associated CDFs and bounded mean value functions, which were employed in the software reliability assessment tool on the spreadsheet (SRATS) by Okamura and Dohi [17]. Even though the type-I NHPP-based SRMs are recognized as plausible models in terms of software reliability growth phenomena, it must be acknowledged that reliability engineers sometimes feel discomfort when handling finite-failure NHPPs, since the inter-failure time distributions in the type-I NHPP-based SRMs are defective [20]. Let the random variables T1, T2, . . ., Tn represent the first, second, . . ., n-th failure times after the software testing starts at T0 = 0, and let the random variables X1, X2, . . ., Xn denote the inter-failure times between two consecutive failures:

Xn = Tn − Tn−1 (n = 1, 2, . . .). (5)

Table 1. Finite-failure (type-I) NHPP-based SRMs: time distribution F(t; α) and mean value function M(t; θ).
From Equations (3) and (5), the CDF of Tn can be obtained as

Gn(t; θ) = Pr{Tn ≤ t} = Pr{N(t) ≥ n} = 1 − ∑_{i=0}^{n−1} ([M(t; θ)]^i / i!) e^{−M(t; θ)}. (6)

Then, it is straightforward to see in the type-I NHPP-based SRMs that lim t→∞ Gn(t; θ) < 1 for an arbitrary n. In other words, even if the testing time tends to infinity, there still exists a positive probability that the n-th failure does not occur, so the CDF of Tn is defective. Similarly, for realizations t1, t2, . . ., tn of Ti (i = 1, 2, . . ., n), we can obtain the CDF of the inter-failure time Xn in the time interval (tn−1, tn−1 + x] as

F_Xn(x; θ) = 1 − Pr{N(tn−1 + x) − N(tn−1) = 0 | N(tn−1) = n − 1} = 1 − e^{−(M(tn−1 + x; θ) − M(tn−1; θ))}, (7)

where Pr{N(tn−1 + x) − N(tn−1) = 0 | N(tn−1) = n − 1} denotes the probability that no failure occurs in the time interval (tn−1, tn−1 + x]. Since the mean value function is bounded, i.e., lim t→∞ M(t; θ) = µ0, Equation (7) reduces, as x → ∞, to 1 − e^{−(µ0 − M(tn−1; θ))} < 1.
It means that, regardless of the number of previous failures, the probability that the software never fails again over an infinite time horizon is always non-zero. Hence, the inter-failure time CDF of the type-I NHPP is also defective. For the type-I NHPP-based SRMs, it is not meaningful to discuss some reliability metrics, such as the mean time between failures (MTBF), because the moments of Tn and Xn always diverge.
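The defectiveness of Gn(t; θ) is easy to see numerically. The sketch below (a minimal example, again assuming the Goel-Okumoto mean value function with illustrative parameters) shows that even for a very large testing time, the CDF of the n-th failure time stays strictly below one:

```python
import math

def M(t, mu0=10.0, b=0.1):
    # Type-I (bounded) mean value function: lim M(t) = mu0 as t grows.
    return mu0 * (1.0 - math.exp(-b * t))

def G_n(n, t):
    # CDF of the n-th failure time T_n under a type-I NHPP:
    # G_n(t) = Pr{N(t) >= n} = 1 - sum_{i<n} e^{-M(t)} M(t)^i / i!
    m = M(t)
    return 1.0 - sum(math.exp(-m) * m**i / math.factorial(i) for i in range(n))

# Even for an effectively infinite testing time, G_5(t) < 1:
limit = G_n(5, 1e9)
print(round(limit, 4))
```

Because M(t) saturates at µ0 = 10, the Poisson tail mass below n = 5 never vanishes, so the 5-th failure fails to occur with positive probability.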

Infinite-Failure (Type-II) NHPP-Based SRMs
The type-I NHPP implicitly assumes that no new software fault is introduced at each software debugging. However, this assumption may be somewhat restrictive, because so-called imperfect debugging may occur in actual software testing phases. When the possibility of imperfect debugging is considered, the finiteness assumption in the type-I NHPP-based SRMs seems rather strong. Similarly to classical preventive maintenance modeling [21], if each software failure is minimally repaired through debugging, the mean value function of the software fault-detection process is unbounded and is given by

M(t; α) = −ln(1 − F(t; α)), (8)

where lim t→∞ M(t; α) = ∞. It is obvious that the CDFs Gn(t; θ) and F_Xn(x; θ) in Equations (6) and (7) are then not defective; for instance, lim t→∞ Gn(t; θ) = 1 and lim x→∞ F_Xn(x; θ) = 1. Hence, it becomes meaningful to consider important metrics such as the MTBF. In this modeling framework, investigating the residual number of software faults before testing has no significant meaning, because it may increase through imperfect debugging during software testing.
One well-known type-II NHPP-based SRM is the power-law process model [12-14], whose mean value function and CDF are given by M(t; α) = e^{µ2/µ1} t^{1/µ1} and F(t; α) = 1 − exp(−exp((µ2 + ln t)/µ1)), respectively. The latter is also recognized as the log-extreme-value minimum distribution in [17]. In addition to the power-law process model, the well-known logarithmic Poisson execution time SRM [15,16] belongs to the type-II category, too; its logarithmic mean value function is obtained by substituting the Pareto CDF in Table 1 into Equation (8) [17]. In Table 1, it is easy to see that the same CDFs are used for the type-I Txvmin SRM vs. the type-II Cox-Lewis SRM, the type-I Lxvmin SRM vs. the type-II power-law SRM, and the type-I Pareto SRM vs. the type-II Musa-Okumoto SRM, respectively. Hence, by substituting the 11 software fault-detection time CDFs in Table 1 into Equation (8), we can derive the corresponding type-II NHPP-based SRMs in Table 2.

Table 2. Infinite-failure (type-II) NHPP-based SRMs.
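The correspondence between a fault-detection time CDF and its type-II mean value function can be sketched directly from Equation (8). The example below assumes, for illustration only, the Pareto CDF in the parameterization F(t) = 1 − (β/(t + β))^γ, and checks that Equation (8) reproduces a logarithmic (Musa-Okumoto-type) mean value function, which is unbounded:

```python
import math

def pareto_cdf(t, gamma=2.0, beta=50.0):
    # Illustrative Pareto fault-detection time CDF (assumed parameterization).
    return 1.0 - (beta / (t + beta)) ** gamma

def type2_mean_value(F, t):
    # Equation (8): under minimal repair, the type-II mean value function
    # is the cumulative hazard of the fault-detection time distribution.
    return -math.log(1.0 - F(t))

# Substituting the Pareto CDF yields the logarithmic closed form
# gamma * ln(1 + t / beta), which grows without bound.
t = 120.0
direct = type2_mean_value(pareto_cdf, t)
closed_form = 2.0 * math.log(1.0 + t / 50.0)
print(round(direct, 6), round(closed_form, 6))
```

The two evaluations agree, and letting t grow shows the unboundedness that distinguishes the type-II models from their bounded type-I counterparts.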

Parameter Estimation
For the existing type-I and type-II NHPP-based SRMs, maximum likelihood (ML) estimation is the technique most widely applied in software reliability modeling. The parameter values that maximize the log-likelihood function (LLF) provide the ML estimates. The LLF depends on the observed fault-count data and on the intensity function and/or the mean value function of the type-I and type-II NHPP-based SRMs. Next, we give the likelihood functions for the software fault-count time-domain data and the software fault-count time-interval data (group data).
For the time-domain data D = {t1, t2, . . ., tm}, consisting of the m fault-detection times observed by the end of testing, the likelihood function is given by

L(θ or α; D) = [∏_{i=1}^{m} λ(ti; θ or α)] e^{−M(tm; θ or α)}. (9)

Taking the logarithm of both sides of Equation (9), the log-likelihood function is obtained as

ln L(θ or α; D) = ∑_{i=1}^{m} ln λ(ti; θ or α) − M(tm; θ or α).

For the group data I = {(ti, ni); i = 1, 2, . . ., m}, where ni denotes the cumulative number of faults detected by time ti, the log-likelihood function is

ln L(θ or α; I) = ∑_{i=1}^{m} {(ni − ni−1) ln[M(ti; θ or α) − M(ti−1; θ or α)] − ln (ni − ni−1)!} − M(tm; θ or α),

with t0 = 0 and n0 = 0. The ML estimate, θ̂ or α̂, is given by argmax_θ ln L(θ; D or I) or argmax_α ln L(α; D or I).
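The group-data LLF above can be maximized with any numerical optimizer. The sketch below is a minimal, self-contained illustration (not the authors' implementation): it evaluates the group-data LLF for the exponential (Goel-Okumoto) mean value function on toy fault-count data and uses a naive grid search over (µ0, b) in place of a proper optimizer:

```python
import math

# Toy group data: cumulative fault counts n_i observed at times t_i = 1..8.
times  = [1, 2, 3, 4, 5, 6, 7, 8]
counts = [5, 9, 12, 14, 16, 17, 18, 18]

def llf(mu0, b):
    # Group-data log-likelihood for the Goel-Okumoto mean value function
    # M(t) = mu0 * (1 - exp(-b t)).
    M = lambda t: mu0 * (1.0 - math.exp(-b * t))
    ll, prev_t, prev_n = 0.0, 0.0, 0
    for t, n in zip(times, counts):
        d = n - prev_n                 # faults detected in (prev_t, t]
        inc = M(t) - M(prev_t)         # expected increment of M
        ll += d * math.log(inc) - math.lgamma(d + 1)
        prev_t, prev_n = t, n
    return ll - M(times[-1])

# Naive grid search over (mu0, b) as a stand-in for a numerical optimizer.
best = max(((llf(mu0, b), mu0, b)
            for mu0 in range(19, 60)
            for b in [i / 100.0 for i in range(5, 100)]),
           key=lambda x: x[0])
print("LLF=%.3f  mu0=%d  b=%.2f" % best)
```

In practice, a quasi-Newton or EM-type routine would replace the grid search, but the objective being maximized is exactly the group-data LLF given above.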

Datasets
As data sources for the numerical experiments, we selected well-known benchmark software fault-count datasets in software reliability engineering, which were observed in mission-critical systems. Although the evolution of these systems may be slower than that of business-oriented systems, the effects of a failure are much greater. Hence, reliability is particularly important for the developers of these mission-critical systems. In the industry, the software fault-count data observed in the distributed test environment for mission-critical systems can be divided into two categories: software fault-count time-domain data and software fault-count time-interval data (group data). We selected eight sets of each type of data, all of which have been widely utilized as fault-count data in software reliability engineering. The details of these datasets are shown in Tables 3 and 4, respectively.

Goodness-of-Fit Performance
Assuming that the parameters of the SRMs were estimated by the maximum likelihood method, in the first experiment we employed two criteria for evaluating the goodness-of-fit performance of the 11 type-I and 11 type-II NHPP-based SRMs: the Akaike information criterion (AIC),

AIC(θ̂ or α̂) = 2 × (number of free parameters) − 2 ln L(θ̂ or α̂), (13)

and the mean squared error (MSE),

MSE(θ̂ or α̂; D) = (1/mD) ∑_{i=1}^{mD} [i − M(ti; θ̂ or α̂)]^2 (14)

or

MSE(θ̂ or α̂; I) = (1/mI) ∑_{i=1}^{mI} [ni − M(ti; θ̂ or α̂)]^2, (15)

respectively. In Equations (14) and (15), ni is defined as the number of faults detected in the time interval (0, ti], mD and mI are the lengths of the time-domain and group data, and θ̂ and α̂ are the ML estimates obtained by maximizing ln L(θ or α; D) and ln L(θ or α; I). The AIC with the ML estimates approximates the Kullback-Leibler divergence between the SRM and the empirical stochastic process behind the fault-count data, while the MSE measures the vertical distance between the estimated mean value function and the fault-count data. A smaller AIC/MSE indicates that the SRM has a better goodness-of-fit performance (a better fit to the underlying data).
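Both criteria are straightforward to compute once a model has been fitted. The sketch below (illustrative only; the fitted parameter values and the maximized log-likelihood are assumed, not taken from the paper's datasets) evaluates Equation (13) and the group-data MSE of Equation (15):

```python
import math

def aic(num_params, max_loglik):
    # Equation (13): AIC = 2 * (number of free parameters) - 2 * ln L.
    return 2.0 * num_params - 2.0 * max_loglik

def mse_group(times, counts, M):
    # Equation (15): mean squared vertical distance between the fitted
    # mean value function and the cumulative fault counts.
    return sum((n - M(t)) ** 2 for t, n in zip(times, counts)) / len(times)

# Toy illustration with an already-fitted Goel-Okumoto curve (assumed values).
M_hat = lambda t, mu0=19.0, b=0.35: mu0 * (1.0 - math.exp(-b * t))
times  = [1, 2, 3, 4, 5, 6, 7, 8]
counts = [5, 9, 12, 14, 16, 17, 18, 18]
print(round(aic(2, -15.2), 3), round(mse_group(times, counts, M_hat), 3))
```

AIC penalizes the number of free parameters, so two models with similar likelihoods but different dimensions are ranked in favor of the simpler one.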

Figures 1 and 2 plot the behavior of the mean value functions of the type-I and type-II SRMs in the time-domain data, TDDS1, and the group data, TIDS7. The red curve and the orange curve are the best SRMs selected from the 11 type-II SRMs and the 11 type-I SRMs based on their AIC, respectively. Not surprisingly, the two modeling frameworks showed slightly different growth trends. More specifically, the type-I SRM (orange curve) always fitted the actual data better in the tail segment, in both the time-domain and the group data. However, a visual inspection alone could not provide a comprehensive assessment of which SRM exhibited a better fitting ability over the whole dataset, so we turned to AIC and MSE as such criteria. First, in Table 5, we make a more precise comparison between our proposed type-II SRMs and the existing type-I SRMs on AIC and MSE. It is evident that, in the vast majority of cases, the best models among the type-I SRMs were given by the extreme-value distributions, whereas among the type-II SRMs, the Pareto (Musa-Okumoto) SRM performed better than the others. Next, comparing the best type-I and type-II SRMs for each dataset, it is not difficult to observe that in three cases (TDDS1, TDDS3, and TDDS6), the type-II SRMs provided a smaller AIC than the type-I SRMs. However, in all the datasets, the type-I SRMs provided a smaller MSE than the type-II SRMs.

In Table 6, we compared our type-II NHPP-based SRMs with the existing type-I NHPP-based SRMs in the eight group datasets. It can be seen that our type-II SRMs could guarantee a smaller AIC than the existing type-I SRMs in three cases (TIDS2, TIDS3, and TIDS7), but could not outperform the type-I SRMs from the viewpoint of MSE for any group dataset. We can therefore draw the conclusion that the type-II NHPP-based SRMs could not consistently outperform the existing type-I NHPP-based SRMs in terms of goodness-of-fit performance, but in some cases, especially in the time-domain data, the three existing type-II NHPP-based SRMs, the Musa-Okumoto, Cox-Lewis, and power-law SRMs, indicated better experimental results.

Predictive Performance
Notably, according to previous studies, SRMs with better goodness-of-fit do not necessarily provide excellent predictive performance. In other words, investigating the predictive performance of the type-I and type-II NHPP-based SRMs is of significant importance. Hence, in our second experiment, we employed the predictive mean squared error (PMSE) to measure the predictive performance of our type-II SRMs, where

PMSE(θ̂ or α̂; D) = (1/l) ∑_{i=m+1}^{m+l} [i − M(ti; θ̂ or α̂)]^2

and

PMSE(θ̂ or α̂; I) = (1/l) ∑_{i=m+1}^{m+l} [ni − M(ti; θ̂ or α̂)]^2

for the time-domain and group data, respectively, where m or nm software faults were observed in (0, tm], and the prediction length is given by l (= 1, 2, . . .). θ̂ and α̂ are the ML estimates at observation time tm for the type-I and type-II NHPP-based SRMs, respectively. Similarly to the MSE, the PMSE evaluates the mean squared distance between the predicted number of detected faults and its (unknown) realization for each prediction length. For a comprehensive investigation of the predictive performance at different software testing phases, three observation points were set at 20%, 50%, and 80% of the total length of each dataset, representing the early, middle, and late phases of software testing, and the total number of software faults was predicted over the remaining 80%, 50%, and 20% of the time period. Then, we calculated the PMSE for the type-I and type-II NHPP-based SRMs; evidently, a larger observation point corresponds to a shorter prediction length. In Figures 3-5, we plot the predictive behavior of the best existing type-I and the best type-II NHPP-based SRMs in the time-domain data, TDDS1, at the three observation points. The red curve in each figure represents the best type-II NHPP, while the orange curve denotes the best type-I NHPP. All the best SRMs were selected from the 11 type-I and 11 type-II NHPP-based SRMs by their smaller PMSEs in TDDS1.
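The PMSE computation mirrors the MSE but is restricted to the hold-out points after the observation time. The minimal sketch below (toy data and hypothetical fitted curves; the parameter values are assumptions for illustration, as if estimated from the first 50% of the data only) compares a bounded type-I-style curve with an unbounded type-II-style curve:

```python
import math

def pmse(obs_times, obs_counts, M, start_index):
    # PMSE over the l points after the observation point: mean squared
    # distance between the predicted and realized cumulative counts.
    future = list(zip(obs_times[start_index:], obs_counts[start_index:]))
    return sum((n - M(t)) ** 2 for t, n in future) / len(future)

times  = [1, 2, 3, 4, 5, 6, 7, 8]
counts = [5, 9, 12, 14, 16, 17, 18, 18]

# Hypothetical curves, as if fitted on the first 50% of the data:
type1 = lambda t: 19.0 * (1.0 - math.exp(-0.35 * t))   # bounded (type-I)
type2 = lambda t: 7.0 * math.log(1.0 + t)              # unbounded (type-II)

half = len(times) // 2
print(round(pmse(times, counts, type1, half), 3),
      round(pmse(times, counts, type2, half), 3))
```

Only the curve with the smaller PMSE at a given observation point would be reported as the best SRM for that phase.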
It can be seen that both the type-I and type-II SRMs tended to give almost the same number of predicted software faults in the early and late testing phases. However, after the mid-term of testing, the type-I NHPP-based SRM tended to make more optimistic software fault predictions. In Figures 6-8, we also plot the predictive behavior of the best existing type-I and the best type-II NHPP-based SRMs in the group data, TIDS7. It can be seen that the type-I SRM still tended to mispredict the number of software faults in the early and middle testing phases. More specifically, in Figures 6 and 7, the type-II NHPP-based SRMs showed an increasing trend, while the opposite was true for the type-I SRM, whose predictive trend for future phases became very flat. However, in Figure 8, the type-I and type-II SRMs showed more similar predictive trends. In general, predicting unknown trend changes over long future periods is essentially difficult for both the type-I and type-II NHPPs. In contrast, predicting trend changes over a short period is relatively easy, although absolute accuracy still cannot be guaranteed.
Software 2023, 2, FOR PEER REVIEW 12

In Table 7, we present the PMSEs of the best type-I SRM compared to the best type-II SRM in each set of time-domain data. We compared the PMSEs of the 11 type-I SRMs and the 11 type-II SRMs by selecting the models with the smaller PMSEs as the best SRMs at each observation point. It can be seen that, at the 20% observation point, our type-II SRMs provided smaller PMSEs than the existing type-I SRMs in three cases (TDDS2, TDDS6, and TDDS7). During the middle testing phase (at the 50% observation point), our type-II SRMs outperformed the type-I SRMs in four datasets (TDDS4, TDDS6-TDDS8). As the test proceeded to the late phase (at the 80% observation point), the type-II SRMs were able to guarantee smaller PMSEs in TDDS1, TDDS4, and TDDS7. On the other hand, the best type-II SRMs with better predictive performance than the type-I SRMs were all given by the logistic distribution, the Musa-Okumoto SRM, and the power-law SRM. Upon comparing the PMSEs in the time-domain data, we believe that the type-II SRMs could become a good alternative to the type-I SRMs. As shown in Table 8, in the early testing phase (at the 20% observation point), it was immediately noticed that our type-II SRMs showed smaller PMSEs than the type-I SRMs in seven out of eight group datasets (all except TIDS3).
In addition to the logistic-based SRM, the Musa-Okumoto SRM, and the power-law SRM, which were shown to perform better in Table 7, we observed that the Cox-Lewis SRM was also appropriate in some cases of the group data (TIDS4 and TIDS5) in terms of predictive performance. At the 50% observation point, the type-II SRMs could guarantee better predictive performance than the type-I SRMs in three cases (TIDS3, TIDS6, and TIDS7). In the late testing phase (at the 80% observation point), only in TIDS2 did our truncated-logistic (Tlogist) type-II SRM give the smallest PMSE in the future prediction phase. In the group data, the predictive performance of the type-II SRMs decreased as the software testing proceeded. Hence, it is possible to conclude that the infinite-failure NHPP-based SRMs outperformed the existing finite-failure NHPP-based SRMs in predicting software fault detection in the early testing phase when group data were available.

Software Reliability Assessment
Our final research question for the type-II NHPP-based SRMs is how to utilize them to quantitatively assess software reliability. In NHPP-based software reliability modeling, the software reliability at a given time point tr, R(tr; θ or α), is the probability that the software is failure-free during the time interval (tm, tr], and can be written as

R(tr; θ or α) = Pr{N(tr) − N(tm) = 0} = exp[−(M(tr; θ or α) − M(tm; θ or α))], (18)

where, for the time-domain data and the group data, tm is defined as the time point of the last fault detected during the software test and the calendar time when the test was stopped, respectively, and m is the total number of faults detected before the time point tm. In this numerical experiment, the prediction time tr (> tm) was fixed for each dataset. The software reliability in each software development project was predicted quantitatively by importing the mean value functions of the type-I and type-II NHPPs into Equation (18).
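Equation (18) can be sketched in a few lines. The example below uses hypothetical fitted mean value functions (the parameter values are assumptions for illustration, not estimates from the paper's datasets) to contrast a bounded type-I-style curve with an unbounded type-II-style curve:

```python
import math

def reliability(M, t_m, t_r):
    # Equation (18): probability of no failure in (t_m, t_r],
    # R = exp(-(M(t_r) - M(t_m))).
    return math.exp(-(M(t_r) - M(t_m)))

# Hypothetical fitted mean value functions (illustrative parameters):
type1 = lambda t: 20.0 * (1.0 - math.exp(-0.3 * t))   # bounded (type-I)
type2 = lambda t: 6.0 * math.log(1.0 + t)             # unbounded (type-II)

t_m, t_r = 10.0, 12.0
r1 = reliability(type1, t_m, t_r)
r2 = reliability(type2, t_m, t_r)
print(round(r1, 4), round(r2, 4))
# The unbounded curve keeps a non-vanishing intensity beyond t_m, so it
# yields the lower (more conservative) reliability estimate here.
```

This illustrates the qualitative pattern discussed below: the bounded model's intensity decays toward zero near the end of testing, inflating its reliability estimate relative to the unbounded model's.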
In Tables 9 and 10, we compare the quantitative software reliability of the best type-I SRMs and the best type-II SRMs in the time-domain data and the group data, respectively. We selected the type-I SRM and the type-II SRM with the smaller AIC in each time-domain and group dataset as the best SRMs. We can see that in almost all datasets (except TDDS1 and TIDS7), the type-I SRMs tended to predict higher software reliability than our type-II SRMs. In other words, during the time interval (tm, tr], the probability of software failure predicted by the type-II NHPP was much higher than that predicted by the type-I NHPP. This observation indicates that our type-II SRMs tended to make more conservative decisions than the type-I SRMs in software reliability assessment. It is important to note that optimistic reliability estimates are often undesirable, because additional software faults were actually detected, as ex-post results, after each observation point in all the datasets.

Conclusions
Under the infinite-failure assumption, in addition to the well-known Musa-Okumoto model, Cox-Lewis model, and power-law model, in this work we proposed another eight type-II NHPP-based SRMs with eight different software fault-detection time CDFs. By analyzing eight software fault-count time-domain datasets and eight software fault-count time-interval datasets (group data), we investigated the goodness-of-fit and predictive performances of our SRMs. We also compared these SRMs with the 11 existing type-I NHPP-based SRMs under the finite-failure assumption. The important point to note is that the type-I and type-II NHPP-based SRMs considered in this paper had the same software fault-detection time CDFs, a setting that has never been addressed in the past literature.
The experimental results confirmed that our type-II NHPP-based SRMs showed better goodness-of-fit performance in some cases. For the group data, the type-II NHPP-based SRMs exhibited better predictive ability than the existing type-I NHPP-based SRMs in the early testing phase. However, as software testing progressed, the advantage of the type-II NHPP in terms of predictive performance diminished. Hence, we can conclude that the type-II NHPP-based SRMs are a good complement to the type-I NHPP-based SRMs for describing the fault-detection process of software systems, and that they have greater potential in the early software testing phase. We also confirmed that the type-II NHPP tended to make more conservative predictions than the type-I NHPP in software reliability assessment.
We do not believe that there are any significant threats to the validity of this work. The datasets used for the numerical experiments were collected during the software/system testing phase under careful supervision and with specific objectives, and they have been proven to be of high quality [18,23-26]. Both the finite-failure and infinite-failure software reliability modeling assumptions have been shown to be well founded. The experimental results exhibited by the type-I and type-II NHPP-based SRMs, such as the software reliability estimates, also match the actual situation. There is no evidence so far that our proposed SRMs are inapplicable to any type of software.
In the future, we will introduce the notion of virtual testing time into the type-II NHPP-based SRMs, which will be beneficial as we continue to explore the potential of the type-II modeling hypothesis.