1. Introduction
Software reliability has long been a key indicator of software quality and a basis for planning software testing work. In the modern software industry, developers must not only deliver the required functionality of their software or systems but also ensure that quality and stability remain above an acceptable level. If quality and stability fail to satisfy clients’ or customers’ requirements, the result will be lost sales or customer dissatisfaction, no matter how superior the software’s functionality and performance may be. However, given the constraints of project budget, human resources, and testing time, it is unrealistic for software developers to pursue a perfect, faultless software system. In practice, therefore, most software developers settle on a compromise plan rather than spending a huge budget pursuing faultless software. Over the past few decades, various software reliability growth models (SRGMs) have been proposed, most of which adopt a Non-Homogeneous Poisson Process (NHPP) to describe software testing and debugging phenomena [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21].
Generally, the testing staff’s learning and experience influence debugging efficiency, so the velocity of error detection increases in the mid to late testing phases. The curve of the mean value function of a software reliability growth model (SRGM) therefore takes an S-shaped form. Yamada et al. [2] proposed an S-shaped reliability growth model for software error detection, but they did not clearly indicate which parameters relate to the learning factor in their model. Li and Pham [22] took failure intensity functions into consideration to represent the learning effect during the testing process. Huang et al. [23] developed an SRGM that considered testing effort. They argued that the efficiency of error detection may be low in the early stage, while the growth rate of error detection accelerates in the mid-stage; the curve of the SRGM is therefore S-shaped, reflecting the learning effect in debugging work. Chiu et al. [7] state that the learning effect should be treated as a crucial parameter in an SRGM, and that its value can be used to estimate the inflection point during the testing period. In their experiments, the curve of the SRGM can adapt to concave or S-shaped forms, reflecting the learning effect that exists in testing work. Moreover, because the learning-effect parameter can be hard to estimate when the related historical data are insufficient, the decision-maker needs other methods for evaluating its value. To address this issue, Chiu et al. [24] extended their original model with a Bayesian method to reasonably estimate the parameter values regarding the learning effect. However, their research concerns perfect-debugging SRGMs and does not take the testing staff’s negligence into consideration. In addition, they simply assumed that all software errors are homogeneous, an unrealistic assumption because different error types require different amounts of time to rectify. Ahmad et al. [25] proposed a testing-effort-dependent inflection S-shaped SRGM that includes the learning effect in the software testing process. Raju [26] also proposed an inflection S-shaped software reliability model based on the NHPP, and the model can adapt to both exponential and S-shaped data. Ramsamy et al. [27] note that most SRGMs do not consider the learning factors of testing staff; to improve adaptability across software testing projects, they used learning and debugging indices to develop an SRGM. Kim and Kim [28] proposed SRGMs sensitive to learning effects based on Yamada and Ohba’s delayed S-shaped models, and discussed the relationship between release time and testing effort to minimize software development cost. Pachauri et al. [29] took the S-shaped curve, imperfect debugging, and error reduction factors into consideration to develop a multi-stage SRGM. Jin and Jin [30] extended the study of Huang et al. [23] to propose an SRGM that includes testing effort and accommodates exponential and S-shaped data patterns simultaneously. Chiu et al. [31] extended their former model by assuming that the learning effect is time-dependent and may grow linearly or exponentially with testing time; the extended model can also describe the phenomenon of unstable debugging efficiency during the early testing stage. Based on the above discussion, although the learning effect has been considered in previous research, that work has not fully covered all the human-resource factors involved, including the basic ability, learning efficacy, and negligence of testing staff. Accordingly, the present study incorporates all of these elements to propose a software release decision SRGM under multiple alternatives.
Moreover, most related studies only consider the scenario of perfect debugging: they simply assume that the testing/debugging staff make no new mistakes when correcting erroneous code. Such an assumption can be unrealistic because humans are imperfect by nature, and the possibility that staff carelessness introduces new software defects cannot be ruled out. Recently, therefore, some studies have begun to investigate imperfect-debugging SRGMs. Aktekin and Caglar [32] proposed a multiplicative failure rate that varies randomly during the test phase, and their model can explain the phenomenon of imperfect debugging in practice. Peng et al. [33] proposed an SRGM that considers testing-effort allocation for imperfect debugging processes; they designed three testing-effort functions (constant, Weibull, and logistic forms) to describe how a software developer allocates testing resources over the time horizon. Wang and Wu [34] proposed an imperfect-debugging SRGM using a fault content function with a log-logistic distribution. They assumed that the testing/debugging staff’s skills, CASE tools, and testing resources are complex and uncertain factors that significantly influence the testing process and may therefore introduce new software faults during testing. Zhu and Pham [35] proposed a multi-release software reliability model that considers both the faults remaining from previous software releases and newly introduced faults distinct from those that existed earlier. Li and Pham [36] also highlighted that debugging and test coverage might not be perfect, since the operating environment can be uncertain and varies across testing projects; they therefore proposed a model that uses randomly distributed variables to simulate the uncertainty of the operating environment. Zhao et al. [37] took both perfect and imperfect debugging into consideration in a Bayesian SRGM that deals with the uncertainty of software reliability. Saraf and Iqbal [38] developed an SRGM that incorporates two types of imperfect debugging and a change-point factor to handle multiple versions of a software release. Li et al. [39] proposed an NHPP-based SRGM that accounts for testability growth effort, rectification delays, and imperfect debugging. The common assumption in the above studies is that new software errors are introduced simply as testing time elapses. However, pure testing work only discovers errors in the system and should not produce any new ones, because program code is not changed during pure testing. Therefore, in this study, the software error correction process and the debugging staff’s negligence are considered, and our assessment of efficiency includes the results from the detected software error patterns. This differs from the idea of a traditional imperfect-debugging SRGM.
It is easy to see that the processing time for removing or correcting each error can differ, because different types of errors need different amounts of time to correct. Furthermore, this processing time should be regarded as a random variable, to be estimated in advance from the related historical data. Most existing SRGMs, however, simply assume that the processing time is constant and uniform, an assumption that is clearly unrealistic in practice. Accordingly, some related studies have begun to revise this assumption and focus on the issue of multiple error types in order to improve the practicality of SRGMs. Kapur et al. [9,10] assumed that error detection rates differ across error types; in their model, errors are classified as simple, difficult, or complex. Jain et al. [40] constructed an SRGM that considers simple and complex errors during the debugging process. Garmabaki et al. [41] examined software version upgrades and error conditions with different degrees of severity to construct reliability models for successive software versions, in which the number of errors removed in each version includes those detected in the current version plus those left over from previous versions. Khatri and Chhillar [42] classified software faults into three types (simple, difficult, and complex), according to the difficulty of correction, to develop an imperfect-debugging SRGM. Kaushal and Khullar [43] proposed neural-network approaches for predicting software errors in NASA’s public defect datasets, classifying faults by the severity of the damage they cause. Song and Rhew [44] developed a method for automatically identifying types of software errors, roughly classifying them into logic, data, interface, document, and computational problems. Wang et al. [45] also considered two types of faults, assumed to be mutually independent, in proposing an SRGM. Zhu and Pham [46] proposed a two-phase SRGM that considers software error dependency and imperfect fault removal, assuming that two types of software error occur during the two-phase debugging process. Huang et al. [47] likewise considered two types of software error in developing an imperfect-debugging SRGM. In summary, previous studies have addressed multiple error types because correcting a complex error usually requires more time than correcting a simple one, and it is inappropriate to estimate the time and cost for different error types with a single unified setting. Moreover, since different errors require different processing times during correction, the time for removing or correcting an error should be treated as a random variable rather than a constant, which the decision-maker can reasonably estimate from the related operational data. Most related studies have failed to recognize this factor. Accordingly, in order to increase practicability, our study assumes that the processing times for removing or correcting different error types follow specified probability distributions with different parameters.
In general, the testing environment may change due to increases in testing staff, hardware upgrades, or changes of CASE tools, and these factors alter debugging efficiency. The time at which the testing environment changes is known as a change-point in the SRGM. Huang [5] proposed an SRGM that considers both testing effort and change-points. Huang and Hung [48] extended this model to multiple change-points, based on the observation that, in practice, software developers upgrade a product many times during its life cycle. Hu et al. [49] proposed a modified adaptive testing model that considers how the testing environment may change when the testing policy changes. Inoue et al. [50] proposed an all-stage truncated multiple change-point model for assessing software reliability, using a zero-truncated Poisson distribution to describe the error-counting process. Nagaraju et al. [51] proposed an SRGM that incorporates heterogeneous change-points, with an error detection process characterized by different time segments. Ke and Huang [52] highlighted that a development environment or method may change for various reasons, and that such changes in the development process must be taken into account when adjusting previous software reliability analyses. Pradhan et al. [53] also incorporated the change-point issue in their testing-effort-based NHPP software reliability growth model. SRGMs that incorporate change-points offer more flexibility than traditional models when software developers face varying development or running environments. Khurshid et al. [54] proposed an imperfect-debugging model that integrates testing effort, the fault-reduction factor, and change-points. Accordingly, this study takes the change-point issue into consideration: in practice, a software developer may add manpower and resources to a testing system to accelerate testing tasks, and these changes can alter software reliability. The current study therefore incorporates these practical considerations within its model.
To sum up, this study considers multiple issues in software testing in order to construct a software reliability growth model under multiple alternatives. Furthermore, the best timing for the software release can be evaluated by considering both reliability and cost. In addition, the parameters of the proposed model are more intuitive and can be estimated or evaluated more easily. Accordingly, we consider our study helpful to software companies aiming to release their software products or systems to the market. We identify three advantages of the present study, as follows: (1) The study integrates the learning effect, human factors, imperfect debugging, multiple error classification, change-points, and other issues to construct a software reliability growth model under multiple alternatives. This differs from most related studies, which consider only some of these factors. (2) The proposed model exhibits better goodness-of-fit performance than other existing models in statistical analyses. Estimation methods for the parameter values and confidence intervals are also developed in this study. (3) The presented model is easier to apply in practice. The optimal timing for software release can be obtained from the proposed model in terms of testing cost under a software reliability constraint, and managers can evaluate all feasible alternatives and decide on one before proceeding with the software testing work. The rest of this paper is organized as follows: Section 2 presents the development of the model, parameter estimation, and the optimal software release model. Section 3 provides the model verification and comparison. The application and numerical analysis are presented in Section 4. Finally, Section 5 draws concluding remarks and identifies topics for future study.
2. Software Reliability Modeling
In recent years, the NHPP has been widely and successfully used to examine reliability issues for a range of hardware and software applications. The NHPP is a derivative of the homogeneous Poisson process (HPP); the major difference between the two stochastic processes is that the NHPP allows the expected number of failures to vary with time. Since the expected number of remaining software faults decreases with testing time during the debugging process, the NHPP is well suited to software reliability growth modeling.
An implemented software system needs to be tested and debugged in advance to ensure quality when the system is released to the market. Several feasible alternatives can be designed and prepared for the software department or company, with the decision-maker selecting the best option for the software testing work. To balance system stability and related costs, the managers need to know the efficiency of any system reliability improvements under different resource allocations. Therefore, the managers will estimate the testing/debugging efficiency and the spending budget under several resource alternatives in advance. However, it is not easy to effectively estimate testing/debugging efficiency when the managers want to design and change input resources over time. Moreover, the environment of software testing might be changed or adjusted in practice due to changing testing staff, hardware, tools, or strategies midway through the testing process, and these changes may result in changes in testing/debugging efficiency. Accordingly, it is necessary to develop an effective SRGM that can deal with such issues of multiple testing alternatives or testing environment changes.
Generally, the process of software reliability growth can be described mathematically as a counting process {N(t), t ≥ 0}, where N(t) follows an NHPP with the mean value function m(t), and this probability can be formulated as follows:

Pr{N(t) = n} = [m(t)]^n e^(−m(t))/n!,  n = 0, 1, 2, … (1)

From another perspective, the mean value function can be expressed as the expected number of errors detected within the time period (0, t]:

m(t) = E[N(t)]. (2)

Furthermore, software reliability R(x|T) can be defined as the probability that no error is detected within the time period (T, T + x], and it can be formulated as follows:

R(x|T) = exp{−[m(T + x) − m(T)]}. (3)

The operating time x reflects the stability requirements in practice, and the decision-maker can set an adequate value according to managerial requirements. Note that enlarging the operating time decreases the value of reliability. In addition, the software reliability finally approaches one as the testing time approaches infinity (T → ∞). The notations and terminology used throughout the study are presented in Table 1.
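These standard NHPP quantities can be sketched directly. In the snippet below, the exponential mean value function and the parameter values are illustrative stand-ins only, not the model proposed in this paper:

```python
import math

def nhpp_prob(n, m_t):
    # P{N(t) = n} = m(t)^n * exp(-m(t)) / n!  for an NHPP counting process
    return (m_t ** n) * math.exp(-m_t) / math.factorial(n)

def reliability(m, T, x):
    # R(x | T) = exp(-(m(T + x) - m(T))): probability of zero errors in (T, T + x]
    return math.exp(-(m(T + x) - m(T)))

# Illustrative exponential mean value function m(t) = a(1 - e^{-bt});
# a and b below are arbitrary values, not estimates from any dataset.
a, b = 100.0, 0.05
m = lambda t: a * (1.0 - math.exp(-b * t))
```

As the prose notes, lengthening the operating time x lowers R(x|T), while letting the testing time T grow drives R(x|T) toward one.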
2.1. Basic Model Development
In general, it is unrealistic to assume that testing work is conducted under perfect-debugging conditions, because the debugging staff may introduce new errors or bugs while removing and correcting the detected ones. Therefore, some related studies have assumed that the number of software errors may increase with testing time, which is known as the imperfect-debugging issue in SRGMs. In our proposed model, the major influential factors are the testing staff’s autonomous errors-detected factor, learning factor, and negligent factor, and the values of these three parameters must be greater than zero. In order to describe the concept of the proposed model, a causal loop diagram is used to illustrate the process of software reliability growth.
Figure 1 illustrates the logical concept of the software testing/debugging task; the total efficiency of error detection is associated with the three factors. The arrows indicate the direction of influence among these factors and/or components, the circles represent the factors regarding debugging work and testing costs, and the rectangles represent the relevant components. Additionally, the mean value function m(t) can be regarded as the average cumulative number of errors detected, and it is used to estimate the expected cumulative number in practice. If the stochastic nature of the process is taken into consideration, however, the actual cumulative number will be influenced by random noise. Such a definition is often seen in similar studies.
In this causal loop diagram, the autonomous errors-detected factor can be described as the rate at which the testing staff find errors through their original ability; that is, errors found spontaneously without any reference to previously detected error patterns. This ability originates from the testing staff’s intelligence and prior training, and it is unrelated to the learning factor, i.e., to knowledge acquired from the current project. The learning factor, in contrast, indicates the testing staff’s learning ability based on previously detected error patterns; the learning effect therefore increases with testing time. Both factors improve the efficiency of software debugging. The negligent factor, however, is the main cause of the phenomenon of imperfect debugging, and it can be regarded as the rate at which new errors are introduced through the debugging staff’s carelessness. In other words, the negligent factor negatively impacts debugging efficiency and increases the number of software errors. In this study, the total number of software errors is treated as a function of testing time, denoted a(t), and its increment is related to the mean value function m(t). The function of the total software errors can therefore be defined as follows:
where a(0) is the initial number of potential errors before the testing work starts, and the new errors are those introduced along with the revised code. The negligent factor can be regarded as this rate of increase, which is related to the accumulated number of detected errors. In addition, m(t)/a(t) represents the cumulative fraction of the errors detected within the time range (0, t), and a(t) − m(t) signifies the number of errors still undetected at time t. Accordingly, the detection process can be described by the following differential equation:
where the parameters must be greater than or equal to zero to ensure that the effect of debugging activity is positive in the testing process. Equations (4) and (5) imply that the learning effect and imperfect debugging exist in the testing process. Accordingly, the mean value function m(t) can be derived using differential equation methods; the following steps deduce the function m(t).
Since the cumulative fraction of the originally detected pattern is defined as m(t)/a(t) and the total number of errors increases with testing time, Equation (5) can be rewritten accordingly. In order to deduce the mathematical form of m(t), we arrange the equation in the following steps:
Taking the integral of both sides of the equation, we can obtain the result as follows:
Solving the above equation for m(t), we obtain the mathematical form of m(t) with an unknown constant, as in the following equation:
Since the initial condition m(0) = 0 is given (no error is detected at testing time t = 0), we can use it to solve for the value of the unknown constant as follows:
Solving Equation (8) for the unknown constant, the constant can be obtained as follows:
Substituting the constant into Equation (8), we obtain the complete form of m(t) as follows:
The form of m(t) can then be simplified as follows:
Furthermore, the intensity function, i.e., the number of errors detected per unit time at time t, can be obtained as follows:
In order to characterize the variation of the error detection rate per error at time t, the form of the error detection rate must be known; it can be derived from Equations (4), (6), and (12), as follows:
The error detection rate is a strictly increasing function, which means that the testing staff’s efficiency always improves during the process. Proposition A1 proves that the error detection rate increases at any testing time; please see the proof in Appendix A.
In general, the testing staff’s learning effect produces an acceleration phenomenon in the testing process. Once this acceleration is significant, the mean value function presents as S-shaped. Verifying the S-shape depends on whether the inflection point is greater than zero. By Proposition A2, the mean value function exhibits the acceleration phenomenon as an S-shape when the stated condition on the parameters is satisfied. The existence of an inflection point implies that the learning effect is significant in the testing process, and the inflection time point is given by the following:
The related proof of Proposition A2 can be seen in Appendix A.
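Since the paper’s closed form is not reproduced here, the S-shape and its inflection point can be illustrated with the classic inflection S-shaped (Bass-type) mean value function as a stand-in; the function name, the symbols a, p, q, and the parameter values below are assumptions for illustration only:

```python
import math

def m_s_shaped(t, a, p, q):
    # Classic inflection S-shaped (Bass-type) mean value function, used as a
    # stand-in: p plays the autonomous errors-detected role, q the learning role.
    e = math.exp(-(p + q) * t)
    return a * (1.0 - e) / (1.0 + (q / p) * e)

def inflection_time(p, q):
    # Inflection point of this curve; it is positive only when q > p,
    # i.e. when the learning effect dominates the autonomous factor.
    return math.log(q / p) / (p + q)

# Illustrative parameter values (not estimates from any dataset)
a, p, q = 100.0, 0.01, 0.10
t_star = inflection_time(p, q)
```

For these values the curve accelerates before t_star and decelerates after it, which is exactly the S-shape behavior discussed above; when q ≤ p the inflection time is non-positive and the curve is simply concave.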
2.2. Parameter Estimation
Two estimation methods can be applied to estimate the model parameters in this study. The collected software failure data are then used to compare the degree of fit between the proposed model and other existing ones.
(1) The least-squares estimation (LSE) method is a standard statistical approach that estimates model parameters by minimizing the sum of squared residuals. A set of n pairs of observed data (t_i, m_i), i = 1, 2, …, n, is taken into consideration to estimate all parameters of the proposed model, where m_i is the total number of errors detected within the period (0, t_i]. The calculation can be given by the following:
Taking the first-order derivative of Equation (16) with respect to each parameter and letting them be equal to zero, the simultaneous equations can be given as follows:
Since a closed-form expression for the solution cannot be obtained, numerical methods are employed to solve the simultaneous equations for the estimated parameter values. However, if software testing managers use an error seeding method to estimate the initial number of potential errors according to the system scale, the estimation of the remaining parameters is simplified by solving the reduced simultaneous equations.
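A minimal LSE sketch, assuming the same Bass-type stand-in mean value function as earlier and SciPy’s least-squares solver `curve_fit` as the numerical method; the synthetic data, starting values, and bounds are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

def m_s_shaped(t, a, p, q):
    # Stand-in inflection S-shaped mean value function (illustrative form only).
    e = np.exp(-(p + q) * t)
    return a * (1.0 - e) / (1.0 + (q / p) * e)

# Synthetic "cumulative detected errors" generated from known parameter values,
# so the fitted curve can be checked against the data it came from.
t_obs = np.arange(1.0, 26.0)
y_obs = m_s_shaped(t_obs, 100.0, 0.02, 0.15)

popt, pcov = curve_fit(m_s_shaped, t_obs, y_obs,
                       p0=(80.0, 0.05, 0.10),
                       bounds=([1.0, 1e-4, 1e-4], [1e4, 5.0, 5.0]))
residual = np.max(np.abs(m_s_shaped(t_obs, *popt) - y_obs))
```

With real failure data one would minimize the same sum of squares; the bounds simply keep the solver in the positive-parameter region required by the model.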
(2) The maximum likelihood estimation (MLE) method is another approach for estimating the parameters of an assumed probability distribution. Under the Non-Homogeneous Poisson Process, the likelihood function can be given as follows:

Taking the natural logarithm of Equation (18), the log-likelihood function can be given as follows:
Likewise, taking the first-order derivative of Equation (19) with respect to each parameter and letting them be equal to zero, the simultaneous equations can be given as follows:
The estimated values of the parameters can be obtained by maximizing the log-likelihood function with numerical methods. However, if an error seeding method can be employed to estimate the initial number of all potential errors in advance, the estimation of the remaining parameters is simplified by solving the reduced simultaneous equations.
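A minimal MLE sketch for grouped NHPP failure-count data; it uses a simple exponential mean value function as a stand-in and drops the constant ln(d!) term, since that term does not affect the maximizer:

```python
import numpy as np
from scipy.optimize import minimize

def m_exp(t, a, b):
    # Simple exponential mean value function used as a stand-in.
    return a * (1.0 - np.exp(-b * t))

t = np.arange(1.0, 21.0)
counts = np.round(m_exp(t, 100.0, 0.1))   # synthetic cumulative error counts

def neg_log_lik(theta):
    a, b = theta
    if a <= 0 or b <= 0:
        return np.inf
    m_vals = m_exp(t, a, b)
    d = np.diff(np.concatenate(([0.0], counts)))   # observed errors per interval
    dm = np.diff(np.concatenate(([0.0], m_vals)))  # expected errors per interval
    if np.any(dm <= 0):
        return np.inf
    # Grouped-data NHPP log-likelihood (the ln(d!) constant is dropped).
    return -np.sum(d * np.log(dm) - dm)

res = minimize(neg_log_lik, x0=(80.0, 0.05), method="Nelder-Mead")
a_hat, b_hat = res.x
```

Because the synthetic counts were generated from known values, the recovered estimates land close to them; with real data the same routine applies to the model’s own mean value function.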
In addition, confidence intervals for the parameters can be derived from the variance–covariance matrix of all the maximum likelihood estimators. The Fisher information matrix F can be used to obtain the variance–covariance matrix, as follows:

The variance–covariance matrix is calculated as the inverse of the matrix F, and it can be given as follows:
The variance–covariance matrix can be used to measure the possible bias of the estimated parameters. The two-sided approximate 100(1 − α)% confidence interval for the estimated parameters can be obtained as follows:

where the critical value is taken from the Student-t distribution with the appropriate degrees of freedom for a given tail area α/2.
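The Fisher-information pipeline can be sketched generically: evaluate a finite-difference Hessian of the negative log-likelihood at the MLE, invert it to obtain the variance–covariance matrix, and build a Student-t interval. The exponential-sample likelihood below is only a stand-in for the model’s own log-likelihood, chosen because its MLE and information matrix have known closed forms to check against:

```python
import numpy as np
from scipy.stats import t as t_dist

def numerical_hessian(f, x, h=1e-5):
    """Central finite-difference Hessian of a scalar function f at point x."""
    k = len(x)
    H = np.zeros((k, k))
    I = np.eye(k)
    for i in range(k):
        for j in range(k):
            H[i, j] = (f(x + h * I[i] + h * I[j]) - f(x + h * I[i] - h * I[j])
                       - f(x - h * I[i] + h * I[j]) + f(x - h * I[i] - h * I[j])) / (4 * h * h)
    return H

# Stand-in negative log-likelihood: i.i.d. exponential sample with mean mu.
data = np.array([1.2, 0.7, 2.3, 1.9, 0.4, 1.1, 3.0, 0.8])
negll = lambda theta: len(data) * np.log(theta[0]) + data.sum() / theta[0]

mu_hat = data.mean()                               # closed-form MLE for this stand-in
F = numerical_hessian(negll, np.array([mu_hat]))   # observed Fisher information
V = np.linalg.inv(F)                               # variance-covariance matrix
half_width = t_dist.ppf(0.975, len(data) - 1) * np.sqrt(V[0, 0])
ci = (mu_hat - half_width, mu_hat + half_width)
```

For the SRGM itself, the same `numerical_hessian` call would be applied to the NHPP negative log-likelihood at the fitted parameter vector.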
2.3. Optimal Software Release with Consideration of Multiple Alternatives
In general, more than one feasible alternative is prepared for the software department or company, and the testing manager evaluates all the feasible alternatives and decides on one before proceeding with the software testing work. However, the manager should consider not only savings in the related testing costs but also the requirements of system reliability and the release date. In order to balance these conflicting objectives, the manager needs a decision model to evaluate which alternative is worth proceeding with. According to historical data from former testing projects and different arrangements of testing/debugging staff, the software testing engineers can devise multiple alternatives with the corresponding parameter values and costs. Accordingly, the proposed model can be presented as follows:
The objective function (24) minimizes the total testing cost at alternative p’s testing time, chosen from the designed alternatives, under the minimal requirement of system reliability and the restricted timeline of software release. The estimated parameter values of the model are specific to each testing alternative, and the reliability term is the probability of no errors occurring within the operation time under testing alternative p; it is calculated from the reliability function defined earlier. The objective function comprises five costs, described as follows. The setup cost of a testing alternative includes the initial cost of test planning, equipment, and preparation work. The daily administrative cost per unit time during the testing period may include utility fees, office rental, and insurance. The correction cost term denotes the cost per unit time of removing and correcting a simple, complex, or difficult error under the testing alternative, and a further term denotes the total number of errors detected for the alternative in its testing environment. The error types are categorized as simple, complex, and difficult, and the numbers of simple, complex, and difficult errors can be formulated as follows:
where the two ratios are the estimated proportions of the simple and difficult errors, respectively; the complex errors account for the remainder.
In addition, it is necessary to estimate in advance the time needed to remove and rectify an error, which is assumed to follow a truncated exponential distribution (Huang et al. [43]), as follows:

Here, the upper time limit for correcting an error depends on the error’s category; the processing time for correcting an error is a random variable; and the distribution parameter governs the expected correction time for each error type and can be evaluated with the maximum likelihood estimation method. In addition, complex errors always require more time to correct than simple ones because of the difficulty of the correction process; as a result, the upper time limit for difficult errors is greater than that for complex or simple ones. Based on the above, taking the natural logarithm of the likelihood function based on Equation (26), the result can be obtained as follows:
In Equation (27), the samples are the correction times taken from the historical data. Taking the first-order derivative of the log-likelihood function with respect to the distribution parameter and setting it to zero, the estimator can be obtained as follows:
Equation (28) can be solved by numerical methods to calculate the estimator. The expected time for correcting an error of a given type, based on the estimated parameter and the corresponding upper time limit, can then be estimated from the following equation:

Because Equation (29) lacks a closed-form solution, a numerical integration method is needed to obtain its value. Based on the above considerations, the total cost of error correction during the testing process can then be obtained.
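A sketch of this truncated-exponential step: estimate the rate parameter by MLE, then evaluate the expected correction time by numerical integration. All names, the sample correction times, and the upper limit U are illustrative; for this particular parameterization the mean also has a closed form, which is used here purely as a cross-check on the quadrature:

```python
import math
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

U = 4.0                                                  # upper time limit (illustrative)
y = np.array([0.2, 1.1, 0.5, 2.8, 0.9, 1.7, 0.3, 3.5])  # sample correction times

def neg_log_lik(lam):
    # Truncated exponential on [0, U]: f(t) = lam * exp(-lam*t) / (1 - exp(-lam*U))
    if lam <= 0:
        return np.inf
    return -(len(y) * math.log(lam) - lam * y.sum()
             - len(y) * math.log(1.0 - math.exp(-lam * U)))

lam_hat = minimize_scalar(neg_log_lik, bounds=(1e-3, 10.0), method="bounded").x

pdf = lambda t: lam_hat * math.exp(-lam_hat * t) / (1.0 - math.exp(-lam_hat * U))
expected_time, _ = quad(lambda t: t * pdf(t), 0.0, U)    # numerical integration
# Closed form for this parameterization, used only as a cross-check:
closed_form = 1.0 / lam_hat - U * math.exp(-lam_hat * U) / (1.0 - math.exp(-lam_hat * U))
```

In the cost model, this expected time would be computed separately for the simple, complex, and difficult error types, each with its own upper time limit and rate estimate.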
The risk cost can be regarded as the loss if an error occurs after the software has been released to the market or implemented within an organization. In this study, this loss may stem from operational failure or damage to commercial reputation, weighted by the probability that failures occur after the software release; the expected risk cost is evaluated accordingly. In addition, delaying the software release may result in both tangible and intangible losses. Accordingly, a time-dependent opportunity loss due to a delay in the software release must be considered; its scale coefficient, intercept value, and growth exponent determine the degree of opportunity loss over time. In this study, we take a power-law form for the opportunity cost. However, if decision-makers have other considerations, the mathematical form of the opportunity cost can be redefined according to their needs.
Moreover, the managers might intend to accelerate the current testing work for some reason, and will therefore incorporate more manpower or resources to shorten the testing period. Once the manager decides to proceed in this way, a change-point will appear in the curve of the mean value function. In such circumstances, the related parameters need to be re-estimated, and the mean value function should be redefined as follows:
where the two component functions denote the mean value functions before and after the change-point, and the remaining parameters are those re-estimated for the changed testing environment.
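The change-point construction can be sketched as the piecewise function below. The exponential (Goel-Okumoto-style) mean value functions and all parameter values are illustrative assumptions, since the model's actual mean value functions are defined earlier in the paper.

```python
import math

def mvf_exp(t, a, b):
    # Illustrative NHPP mean value function, m(t) = a * (1 - exp(-b t))
    return a * (1 - math.exp(-b * t))

def mvf_with_change_point(t, tau, before, after):
    """Piecewise mean value function: follow `before` up to the change-point
    tau, then continue with the increments of `after` so that the curve
    remains continuous at tau."""
    if t <= tau:
        return before(t)
    return before(tau) + after(t) - after(tau)

# Hypothetical parameters: testing accelerates after week 5
before = lambda t: mvf_exp(t, a=1650, b=0.08)
after = lambda t: mvf_exp(t, a=1650, b=0.15)
m_at_tau = mvf_with_change_point(5.0, 5.0, before, after)
m_after = mvf_with_change_point(6.0, 5.0, before, after)
```

The increment form `before(tau) + after(t) - after(tau)` is the standard way to splice two mean value functions without introducing a jump at the change-point.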
3. Model Verification and Comparison
This section assesses the fitting ability of different SRGMs on several practical datasets. The datasets are publicly available and have been used in related studies in this research field.
Table 2 presents the information of the datasets, and further details are provided in
Table 3. In order to verify the fitting validity of the proposed model, this study compared it with three classic imperfect-debugging SRGMs, which can be seen in
Table 4.
Figure 1,
Figure 2,
Figure 3 and
Figure 4 present the fitting results of each imperfect-debugging SRGM under the different datasets with the traditional CI. In order to compare the effectiveness of these models fairly, the three most common evaluation criteria were chosen, as follows:
- (1)
Mean square error (MSE) evaluates the difference between the estimated and true values, i.e., MSE = Σᵢ₌₁ᴷ [m(tᵢ) − m̂(tᵢ)]² / (K − N), where m(tᵢ) is the cumulative number of detected errors observed from time 0 to tᵢ; m̂(tᵢ) is the cumulative number of detected errors from 0 to tᵢ estimated by using the mean value function; K is the number of observations; and N is the number of parameters.
- (2)
R-squared explains the variability of data in a model, with a greater value indicating a better fit, and can be given by R² = 1 − Σᵢ₌₁ᴷ [m(tᵢ) − m̂(tᵢ)]² / Σᵢ₌₁ᴷ [m(tᵢ) − m̄]², where m̄ is the mean of the observed values.
- (3)
Akaike information criterion (AIC) [55] is defined as the log-likelihood term penalized by the number of model parameters, which is given by AIC = −2 ln L + 2N, where L is the maximized value of the likelihood function.
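The three criteria can be computed with short helper functions like these; the toy observed/predicted arrays and the log-likelihood value fed to the AIC helper are assumptions for illustration only.

```python
def mse(observed, predicted, n_params):
    # MSE = sum of squared residuals / (K - N)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return sse / (len(observed) - n_params)

def r_squared(observed, predicted):
    # R^2 = 1 - SSE / SST; closer to 1 means a better fit
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / sst

def aic(log_likelihood, n_params):
    # AIC = -2 ln L + 2 N; smaller is better
    return -2.0 * log_likelihood + 2.0 * n_params

# Toy cumulative error counts (observed) vs. model estimates (predicted)
observed = [2, 5, 9]
predicted = [2.5, 5.5, 8.5]
fit_mse = mse(observed, predicted, n_params=1)
fit_r2 = r_squared(observed, predicted)
```

Note that MSE divides by K − N rather than K, so a model with more parameters is penalized slightly, echoing the over-parameterization concern discussed below for Wang's model.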
After the goodness-of-fit analysis, it can be seen that the proposed model mostly outperforms the others in terms of MSE, R-squared, and AIC under datasets 3 and 4. Moreover, referring to the details in
Table 5 and
Figure 2 and
Figure 3, we can see that Wang's model presents an extremely strong fitting ability for datasets 1 and 2, with R-squared values of 99.43% and 98.81%, respectively. In general, such indicators measure a model's performance in adapting to different data patterns, and decision-makers can use a model's applicability to different scenarios to predict future behavior accurately. However, overfitting should also be avoided. Examining the bottom-left corners of
Figure 2 and
Figure 4 for Wang's model, we can observe fluctuating curves fitted to datasets 1 and 3, which might be due to overfitting. Overfitting is an error that occurs in data modeling, in which a particular function aligns too closely with a specific dataset but not with other datasets. In other words, a model may overly learn the detail and noise in a specific training dataset. Obviously, overfitting may negatively impact the performance of a model on new and future datasets. Although the performance of our model may not be superior to that of Wang's, no such overfitting appeared in our model, as its fitted curves did not fluctuate. It is worth noting that the number of model parameters can influence the fitting ability. In practice, increasing the number of model parameters can effectively improve a model's flexibility and adaptability. As can be seen in the mean value functions listed in
Table 4, Wang's model has five parameters, while the other models use only four, and therefore Wang's model can exhibit more flexibility and adaptability than the others. On the other hand, Kapur's model shows weak adaptation to S-shaped datasets, perhaps because it was developed for a specific scenario. Based on the data presented in
Figure 2 and
Figure 5, it can be seen that Kapur's model could only fit concave or exponential curves to the S-shaped datasets. Nonetheless, Kapur's model adapts well to the other datasets. In summary, our model presents better fitting validity in the average case; in other words, its performance is not inferior to that of the others. Moreover, since the proposed model's parameters are easier to comprehend and evaluate, software testing managers can adjust or modify their values, thereby more accurately predicting the efficiency of a testing scenario that might change in the future.
4. Application and Numerical Analysis
Suppose that a business application software provider has developed a commercial software product, and the development work is close to its final phase. Therefore, the product manager has to decide on the best timing to release the software product. In general, numerous potential software bugs exist in the software, and their number can be estimated by using error-seeding methods with consideration of the software scale. In this case, the number of software errors was estimated to be approximately 1650. However, in order to ensure the quality and stability of the software product, testing and debugging work is necessary to satisfy customers' requirements. Although follow-up service packs can amend the defects of the developed software, customers' minimum requirement for software reliability is 90%. Consequently, the product manager needs to arrange appropriate software testing projects and schedules in advance to achieve the objectives of lower cost and higher reliability. Given the company's internal human resources, four software testing alternatives can be performed, from which the manager must choose the most suitable. After investigation and evaluation by the company's experts and senior engineers, the information and estimated parameters for the four candidate alternatives can be seen in
Table 6. Alternative P1 is devised as a project with low-intensity manpower, with a testing team composed of more junior staff. Even though the efficiency of this alternative's testing work is inferior to the others, it saves on the costs of removing and correcting the different error types. On the contrary, alternative P4 can accelerate the testing work by using high-intensity manpower, but it also costs more money due to the requirement of hiring senior staff with testing experience. Furthermore, each testing staff member performs his/her job in rotation for 44 h per week. Every detected software bug requires a different amount of time to be corrected. The time required for removing bugs is assumed to follow a truncated exponential distribution, and the parameters of the distribution can be obtained by the proposed MLE method. After the calculation using MLE and historical data, the expected times for correcting simple, complex, and difficult errors are estimated to be approximately 1.5, 2.5, and 3.5 h, respectively. The manager can use this information to evaluate the related debugging cost. In addition, the risk cost and the opportunity loss are both taken into consideration. The expected risk cost per one percent loss in system reliability is about
$3500. The opportunity loss is devised as a time-dependent function; its scale coefficient, intercept value, and degree of increase over time are estimated to be 1800, 2.5, and 1.5, respectively. The other detailed information can be seen in
Table 6.
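The correction-time estimates can be turned into an expected debugging cost as sketched below. Only the roughly 1650 estimated errors and the 1.5/2.5/3.5 h expected correction times come from the case; the split of errors across types and the hourly labor cost are hypothetical assumptions.

```python
# Expected correction hours per error type (from the MLE step in the case)
expected_hours = {"simple": 1.5, "complex": 2.5, "difficult": 3.5}

# Hypothetical split of the ~1650 estimated errors across types
error_counts = {"simple": 990, "complex": 495, "difficult": 165}

HOURLY_COST = 50.0  # hypothetical labor cost per testing hour ($)

def expected_debugging_cost(counts, hours, hourly_cost):
    """Expected correction cost = sum over error types of
    (number of errors) * (expected hours per error) * (cost per hour)."""
    return sum(counts[k] * hours[k] * hourly_cost for k in counts)

total_correction_cost = expected_debugging_cost(error_counts,
                                                expected_hours,
                                                HOURLY_COST)
```

This makes explicit how the truncated-exponential expected times feed into the cost model: each error type contributes its count times its expected correction effort.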
After calculation employing the proposed model, as shown in
Table 7 and
Figure 6, the curves of the four alternatives’ expected testing cost show convexity with testing time. Comparing the four alternatives, it can be seen that alternative P3 is more appropriate for the company since its expected testing cost is the lowest of all the other alternatives. Therefore, the manager should adopt alternative P3 and schedule the software release to occur after 13.5 weeks. The software reliability can reach 90.41% and the total expected cost is evaluated to be
$387,557. With regard to alternatives P1 and P2, the lowest costs are
$418,204 and
$427,751 at times 15.5 and 15 weeks according to
Table 7. However, despite the lower costs, these time points are not feasible solutions, since the two alternatives' software reliability can only reach 85.55% and 87.68%, respectively, meaning they cannot satisfy customers' minimum software reliability requirement (90%). This minimum requirement would only be reached after 17 and 16 weeks of testing for these two alternatives, respectively. The growth of reliability for the different alternatives can be seen in
Figure 7. Moreover, alternative P4 can rapidly raise the reliability but the related cost is higher than the other alternatives, by about 5–15%. Alternative P3 therefore offers a compromise between reliability and testing cost, and the manager should choose to proceed with the testing work as long as the testing time is within 13.5–18 weeks.
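The release-timing logic above can be sketched as a constrained search: minimize expected cost over candidate release weeks, subject to reliability reaching at least 90%. The cost and reliability curves below are hypothetical stand-ins that only mimic the convex-cost and growing-reliability shapes reported for the alternatives; the resulting numbers are illustrative, not the paper's.

```python
import math

def expected_cost(weeks):
    # Hypothetical convex cost: debugging-related cost falls early,
    # while opportunity loss grows with release delay
    return 500000.0 / (1 + 0.5 * weeks) + 1500.0 * weeks ** 1.5

def reliability(weeks):
    # Hypothetical reliability curve growing toward 1.0
    return 1.0 - math.exp(-0.17 * weeks)

def best_release_week(min_reliability=0.90, horizon=30):
    """Pick the half-week grid point with the lowest expected cost
    among those meeting the reliability requirement."""
    candidates = [w / 2.0 for w in range(1, 2 * horizon + 1)]
    feasible = [w for w in candidates if reliability(w) >= min_reliability]
    return min(feasible, key=expected_cost)

week = best_release_week()
```

This mirrors why P1 and P2's cost minima were rejected in the case study: the unconstrained cost minimum is discarded whenever it falls outside the feasible (reliability >= 90%) region.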
Furthermore, the manager may want to change his/her previous decision in order to release the software product as early as possible. Suppose that the testing work of alternative P3 has proceeded for 5 weeks, and then the manager decides to increase the manpower over the subsequent weeks in order to shorten the original testing schedule, for instance due to commercial competition. The estimated values of the new alternative’s parameters are presented in
Table 8. The learning factor remains the same as before, but both the autonomous error-detection factor and the negligent factor are higher than in the previous setting. Since the detection rate can be used to investigate the efficiency of testing and debugging,
Figure 8 is presented to compare the new alternative with the original one. Although the detection rate of the new alternative is always higher than that of the original, the two patterns are similar to each other because the increase in the detection rate over time depends on the size of the learning factor. By employing Equation (30) of the change-point model, the comparative results for the original and new alternatives can be obtained; these are presented in
Table 9 and
Figure 9 and
Figure 10. The optimal timing of software release for the new alternative should be scheduled for the end of the 12th week. The estimated testing cost is approximately equal to
$389,808 with a reliability of 91.32%. Although the cost is slightly higher than that of the previous alternative, the duration of the testing can be shorter with no sacrifice in reliability. This is because the increase in the related debugging costs can be offset by the decrease in the opportunity loss. However, if the manager chose to follow the original alternative and did not increase the manpower to raise the testing efficiency, the reliability would only be 81.56% at the end of 12 weeks.
A sensitivity analysis was also performed on some additional critical parameters to investigate their impacts on the total cost and the software release time decision.
Figure 11 presents the impacts of the related parameters on the total cost. As can be seen in
Figure 11, with regard to the model parameters, the cost is most sensitive to the two factors governing error-detection efficiency. This means that misestimating these two parameters will disturb the testing alternative's budget plan; if the manager underestimated them, the testing cost might be overestimated. Furthermore, their misestimation could also influence the company's software release decision. As can be seen in
Figure 12, if the manager overestimated these parameters, he/she would shorten the testing period and rush to release the software. Such a situation may cause customer dissatisfaction and damage the company's reputation, since the software would be released to the market while still unreliable. Moreover, from a different perspective, if the manager wants to increase the efficiency of testing, he/she has to provide more job training to the testing staff in advance, which also increases education costs. Although this education investment can raise the values of the two factors, the manager needs to consider the trade-off between any investment in the staff's skills and the benefit of reducing the cost of reliability improvements. Additionally, the time-dependent administrative cost and the error-correction cost also impact the testing cost. In this case, saving 10% of the error-correction cost can reduce the total testing cost by about 4.5%. The administrative cost is less impactful; saving 10% of the administrative cost may only reduce the total testing cost by about 1.5%. Accordingly, the manager might need to pay attention to improving the use of the expenditure for error-correction work.
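A one-at-a-time sensitivity check like the one reported can be sketched as follows. The cost breakdown is hypothetical; its shares are chosen so that the correction and administrative components reproduce the reported 4.5% and 1.5% effects, purely to illustrate the procedure.

```python
def total_cost(components):
    return sum(components.values())

def sensitivity(components, name, change=-0.10):
    """Percentage change in total cost when one component is perturbed
    by `change` (e.g., -0.10 represents a 10% saving)."""
    base = total_cost(components)
    perturbed = dict(components)
    perturbed[name] = perturbed[name] * (1 + change)
    return (total_cost(perturbed) - base) / base * 100.0

# Hypothetical cost breakdown ($)
components = {"correction": 180000.0, "administrative": 60000.0,
              "risk": 90000.0, "opportunity": 70000.0}

impact_correction = sensitivity(components, "correction")
impact_admin = sensitivity(components, "administrative")
```

Since each component enters the total additively, a component's sensitivity is simply its share of the total, which is why the larger error-correction expenditure dominates the administrative one.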
5. Conclusions
In this study, we proposed an imperfect debugging software reliability growth model with consideration of human factors and the nature of errors during the debugging process to estimate the related costs and the reliability indicator. Compared with previous models, the estimation of the learning and negligent factors in the proposed model is more intuitive and can be more easily evaluated by internal engineers or domain experts. Furthermore, this study also extended the issue of error classification by considering the impacts of different error types on the system. The expected time required to correct the different types of errors was assumed to obey different truncated exponential distributions. Moreover, in this study, the presented model enables software developers to produce multiple alternatives for a testing project, each with its distinct allocation of human resources. The best alternative can then be identified by utilizing the proposed software release model. These allocations consider debuggers’ learning and negligent factors, which influence the efficiency of software testing in practice. Furthermore, the change-point issue was taken into consideration, i.e., the fact that a software developer may increase manpower and resources so as to accelerate the testing work in practice.
With regard to the impact of the model parameters on the testing cost, raising the learning factor and the autonomous error-detection factor can help improve debugging efficiency. However, increasing them requires more on-the-job training or more senior manpower. In addition, decreasing the negligent factor depends on successfully reducing errors in correction work. On-the-job training and online CASE tools may also help reduce repetitions of the same mistakes. Although education investment is able to raise the learning-related factors, the software developer has to consider the trade-off between any investment in staff's skills and the given benefit in cost reduction due to reliability improvements. In addition, it should be noted that the optimal time to release a software product to the market is not only a matter of cost, because the software provider must also satisfy the minimal level of software reliability stipulated by the contract or required by most customers. In general, opportunity loss is related to the delay in the software release, and therefore the software developer should carefully estimate the parameters of the opportunity loss from marketing surveys.
Finally, two directions can be considered for future study: (1) The issue of multiple change-points could be taken into consideration, since many environmental factors may impact the debugging process at different times, resulting in a scenario of multiple change-points. Furthermore, error classification could also be extended to a function of the errors' impacts on the system. The testing staff can then prioritize correcting the errors that have more serious impacts on the system, and the probability distribution can be set to different forms in order to increase the flexibility of applications. (2) The time delay issue in developing a software reliability growth model could also be considered, i.e., the fact that software error correction consumes testing time and might therefore postpone the testing schedule. Most related studies have tended to ignore this factor and assume that the time required to correct software errors is zero, potentially leading to errors in the estimation of software reliability. Accordingly, future research might consider the time delay issue to make the model more realistic in practice.