1. Introduction
Reliability is one of the key attributes of a quality model according to ISO/IEC 25010:2023 (Systems and software engineering – Systems and software Quality Requirements and Evaluation (SQuaRE) – Product quality model) [1]. Moreover, for software it is a critical quality metric that reflects a system’s resistance to failure during operation. Traditional reliability assessment methods often require extensive testing or production deployment data. One basic approach to software reliability prediction is the use of software reliability growth models (SRGMs). Their main aim is to support reliability estimation during software development, especially in the testing and validation phases.
The development of new software reliability growth models constitutes an important topic in the scientific literature. Each model is based on distinct assumptions concerning the software fault detection process. For instance, some models assume perfect debugging, where faults are immediately and completely removed without introducing new errors. However, this is often unrealistic, as the difficulty of correcting faults tends to increase as testing progresses. In contrast, other models incorporate imperfect debugging, acknowledging that the fault correction process may be incomplete and may introduce additional faults. Some new models for imperfect debugging can be found in [2,3,4,5,6,7].
Open-source software (OSS) projects are developed collaboratively by geographically dispersed developers without centralized management. Unlike in traditional software development, the testing phase is not clearly demarcated; it is carried out in parallel with coding. Testing relies on users to identify defects rather than on dedicated test teams, so not all bugs can be detected. Given the dynamic and distributed nature of OSS development, assessing its reliability is essential, particularly as such software is increasingly used in new systems. Studying OSS reliability helps ensure quality despite the absence of formal testing phases. Our objective is to investigate the reliability of open-source projects by applying an SRGM. While this is not a novel technique, it remains a meaningful approach for understanding and predicting the fault-finding process. Other advanced methods and applications of the methodology are explored in various studies, such as [8,9,10,11,12].
Most software reliability growth models assume that software faults follow a nonhomogeneous Poisson process (NHPP). They are characterized by a model-specific mean value function $m(t)$ used to estimate the expected number of faults detected over time. Based on $m(t)$, the reliability function for an NHPP model can be expressed as the conditional probability $R(x \mid t)$, which represents the probability that no software failure will occur in the time interval $(t, t+x]$, where $t \ge 0$ and $x > 0$. Formally, this can be written as
$$R(x \mid t) = e^{-[m(t+x) - m(t)]}.$$
Table 1 presents a summary of some basic software reliability growth models, including the definitions of their mean value functions and their corresponding types (concave or S-shaped).
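As an illustration of the reliability function above, the following minimal Python sketch evaluates the standard NHPP form R(x | t) = exp(-[m(t + x) - m(t)]) for the Goel–Okumoto mean value function m(t) = a(1 - e^{-bt}). The function names and the parameter values a = 100 and b = 0.05 are hypothetical and chosen only for demonstration; they are not estimates from any dataset in this study.

```python
import math


def goel_okumoto_mvf(t, a, b):
    """Goel-Okumoto mean value function m(t) = a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))


def nhpp_reliability(mvf, t, x):
    """R(x | t) = exp(-(m(t + x) - m(t))): probability of no failure in (t, t + x]."""
    if t < 0 or x < 0:
        raise ValueError("t and x must be non-negative")
    return math.exp(-(mvf(t + x) - mvf(t)))


# Hypothetical parameters (assumption): a = 100 faults in total, b = 0.05 per day.
def m(t):
    return goel_okumoto_mvf(t, a=100.0, b=0.05)


r_early = nhpp_reliability(m, t=10.0, x=1.0)   # one-day reliability early in testing
r_late = nhpp_reliability(m, t=100.0, x=1.0)   # one-day reliability much later
print(f"R(1 | 10) = {r_early:.4f}, R(1 | 100) = {r_late:.4f}")
```

Because m(t) flattens as t grows, the expected number of new faults in (t, t + x] shrinks and the reliability over a fixed horizon x increases, which is the growth behaviour these models are meant to capture.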
In 2018, Pham [
20] proposed estimating the mean value function
using a logistic growth model, also known as the Verhulst model, defined by
where
is fault content function and
is software fault detection rate per unit of time.
Haque and Ahmad [21] proposed a software reliability model that takes into account the improvement of the testing team’s skills over time. Its mean value function is expressed by
They used (
1) with
,
and
. Function
is the same as in the Generalized Goel Model and represents an initial increase followed by a decrease in the failure rate.
Song, Kim and Chang [22] consider a software reliability model that accounts for a finite number of faults and for dependent faults, with the mean value function defined by
It is obtained from (
1) with
,
and
. This model is assumed to be used for open-source software.
Some other SRGMs derived from the logistic growth model can be found in [23,24,25].
In this study, we have two main aims. First, we introduce an intrinsic characteristic known as Hausdorff saturation for software reliability models. It provides a quantitative measure of how close the software is to completing its failure discovery process. In Section 2, we present a detailed analysis of two software reliability models, Haque–Ahmad and Song–Kim–Chang. We prove precise estimates of the Hausdorff distance to the horizontal asymptote. Additionally, we use four datasets from open-source platforms to perform a comparative analysis of three classical and two proposed software reliability growth models. We apply standard goodness-of-fit tests and Hausdorff saturation to determine the most appropriate approximation model. Our second goal is to assess the reliability of open-source software (OSS) projects by analyzing the patterns of issue creation and resolution over time. By focusing on the trends in opened and closed issues, we aim to determine the maturity and reliability of OSS projects. Section 3 describes the development of a software reliability assessment tool designed for this purpose. The proposed tool implements five software reliability growth models within a dynamic Wolfram Language environment that directly interfaces with the GitHub API. The presented process enables real-time reliability assessment and provides a comparative analysis of model fits to determine the most appropriate model for a given software project.
3. Software Reliability Assessment Tool
Although software reliability growth models can be implemented manually using specialized mathematical software, this process is often time-consuming and demands a high level of expertise. Most software engineers are reluctant to implement these models due to limited awareness, a lack of understanding of the underlying mathematical concepts or time constraints. The development of software engineering tools has been driven by the objective of bridging the gap between theoretical academic research and its practical application. In [39], Xiao, Okamura and Dohi present a review of over 40 software reliability assessment tools. A review of five open-source tools, SFRAT (https://github.com/LanceFiondella/srt.core, accessed on 27 July 2025), SRATS (https://github.com/SwReliab/SRATS2010, accessed on 27 July 2025), RSRAT (https://github.com/SwReliab/Rsrat, accessed on 27 July 2025), PHSRM (https://github.com/okamumu/phsrm, accessed on 27 July 2025) and SMERFS (https://www.slingcode.com/smerfs/, accessed on 27 July 2025), can be found in [40]. Some examples of other tools are SFRAT [41], C-SFRAT [42] and C SFRAT [43]. The basic differences among the tools lie in how they collect data, which software reliability models they support and the programming languages used in their development.
Several tools support the full process of reliability analysis. For instance, STRAIT [44] is a free, open-source tool that collects issue report data directly from GitHub and supports nine software reliability models. It was developed using Java and R. Another example is RAAF-OSS [45], a reliability automatic assessment framework designed for open-source software and compatible with Jira. It is implemented in Python and supports six software reliability models.
We present a computational tool for assessing software reliability through the analysis of issue tracking patterns in GitHub repositories. The tool supports the full reliability analysis process and is designed to be accessible to users without expert knowledge.
It is a dynamic Wolfram Language environment that directly interfaces with GitHub’s API. To the best of the authors’ knowledge, this is the first tool of its kind.
The tool operates in five main steps: data collection, data normalization, model fitting, visualization and evaluation.
Data Collection: In open-source software (OSS) projects, public repositories serve both as development platforms and as sources of issue data. Issues are retrieved from the GitHub repository, and their creation and closure timestamps are captured. The dataset comprises closed issues from GitHub repositories, obtained via the GitHub REST API v3. Data points are organized as pairs consisting of the time t, in days since the first reported issue, and the cumulative number of issues up to time t. This transformed dataset emulates software reliability growth testing data, in which the discovery times of software faults (represented by issues) are modeled over calendar time. This substitution is needed because formal test-phase data are often unavailable in open-source projects. Our tool automates the collection and storage of issue reports directly from GitHub using the project URL. Users only need to provide the repository owner and name.
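The collection step can be sketched in Python as follows. This is not the authors' Wolfram Language implementation; it only illustrates, under stated assumptions, how the GitHub REST endpoint for listing repository issues could be queried page by page. The helper names `issues_url`, `fetch_page` and `extract_timestamps` are hypothetical, and the sample payload is hand-written in the shape of an API response rather than taken from a real repository.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"


def issues_url(owner, repo, page=1, per_page=100, state="closed"):
    """Build the URL for one page of the 'list repository issues' endpoint."""
    query = urlencode({"state": state, "per_page": per_page, "page": page})
    return f"{API_ROOT}/repos/{owner}/{repo}/issues?{query}"


def fetch_page(url):
    """Download and decode one page (unauthenticated requests are rate-limited)."""
    request = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_timestamps(payload):
    """Return (created_at, closed_at) pairs from one decoded page.

    The issues endpoint also returns pull requests; they carry a
    'pull_request' key and are skipped here.
    """
    return [
        (item["created_at"], item.get("closed_at"))
        for item in payload
        if "pull_request" not in item
    ]


# Hand-written sample in the shape of an API response (not real data).
sample_page = json.loads("""
[
  {"number": 1, "created_at": "2015-03-01T12:30:00Z", "closed_at": "2015-03-04T09:00:00Z"},
  {"number": 2, "created_at": "2015-03-02T08:00:00Z", "closed_at": null},
  {"number": 3, "created_at": "2015-03-02T10:00:00Z", "closed_at": "2015-03-02T11:00:00Z",
   "pull_request": {}}
]
""")
pairs = extract_timestamps(sample_page)
print(issues_url("PhilJay", "MPAndroidChart"))
print(pairs)
```

In practice one would call `fetch_page` for successive `page` values until an empty list is returned; the sketch avoids network access so that it can be run offline.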
Data Normalization: Timestamps are normalized to analyze the progression of issues over time. The raw timestamps undergo the following processing steps: 1. conversion to absolute time values; 2. normalization relative to the creation date of the first issue; and 3. binning into daily intervals for consistent time series analysis. In the interactive interface, the user can choose how much data to include, either all issue reports or only a portion selected with a dynamic slider. Another option allows the user to set the maximum number of iterations used during model fitting.
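The three normalization steps can be sketched in Python as below. The function names and the `fraction` argument (standing in for the interface slider) are hypothetical, and the timestamps are made-up examples in the ISO 8601 format returned by GitHub. Days are counted in 24-hour intervals from the first issue, which is one possible reading of the daily binning described above.

```python
from collections import Counter
from datetime import datetime, timezone


def to_datetime(stamp):
    """Step 1: parse an ISO 8601 UTC timestamp such as '2015-03-01T12:30:00Z'."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def cumulative_daily_counts(stamps, fraction=1.0):
    """Return [(day, cumulative_count), ...] with day 0 at the first issue.

    fraction in (0, 1] keeps only the earliest share of the issues,
    mimicking a slider that limits the amount of data used.
    """
    if not stamps:
        return []
    times = sorted(to_datetime(s) for s in stamps)
    times = times[: max(1, int(len(times) * fraction))]
    origin = times[0]
    # Step 2: shift to the first issue; step 3: bin into whole days.
    per_day = Counter((t - origin).days for t in times)
    series, total = [], 0
    for day in sorted(per_day):
        total += per_day[day]
        series.append((day, total))
    return series


# Made-up creation timestamps (not real data).
example = [
    "2015-03-03T08:00:00Z",
    "2015-03-01T12:30:00Z",
    "2015-03-01T18:00:00Z",
    "2015-03-05T00:00:00Z",
]
print(cumulative_daily_counts(example))
```

Days on which no issue was created are simply absent from the series; a fitting routine that expects a value for every day would need to forward-fill the cumulative count.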
Accurate parameter estimation is essential for the effective application of software reliability growth models. Since software failures exhibit nonlinear behavior, merely fitting test data to models is insufficient. Instead, optimization techniques are required to identify the most suitable parameters that ensure reliable prediction accuracy. Our tool allows users to select the most appropriate optimization method for parameter estimation, including
“ConjugateGradient”—nonlinear conjugate gradient;
“Gradient” —gradient descent;
“LevenbergMarquardt”—Gauss–Newton method for least squares;
“Newton”—Newton method;
“QuasiNewton” —quasi-Newton BFGS;
“InteriorPoint”—interior point method;
“NMinimize”—optimization that always attempts to find a global minimum;
“Automatic”—automatic default method.
Model Fitting: Five reliability growth models (Goel–Okumoto, delayed S-shaped, inflection S-shaped, Haque–Ahmad and Song–Kim–Chang) are applied to the issue data. Each model can be selected by the user. We use the computer algebra system Wolfram Mathematica to fit the data with high precision. The model fitting function NonlinearModelFit [36] returns a FittedModel object, which provides convenient access to both model predictions and diagnostic information without the need to refit the model. One of its key advantages is support for direct evaluation at specific input values, as well as queries for parameter estimates, residuals and other diagnostic metrics. Once a model is chosen, the parameter estimates are displayed in a table view. The table presents the best-fit parameter information, including values for “Estimate,” “Standard Error,” “t-Statistic” and “p-Value.”
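To make the fitting step concrete without relying on Mathematica, the following Python sketch fits the Goel–Okumoto model m(t) = a(1 - e^{-bt}) by least squares. It deliberately uses a different and much simpler technique than NonlinearModelFit: for a fixed b the model is linear in a, so a has a closed-form least-squares value, and the remaining one-dimensional problem in b is solved by golden-section search. The search interval for b and the assumption that the profiled cost has a single minimum on it are assumptions of this sketch, and the data are synthetic and noise-free.

```python
import math


def go_mvf(t, a, b):
    """Goel-Okumoto mean value function."""
    return a * (1.0 - math.exp(-b * t))


def best_a_for_b(data, b):
    """For fixed b the model is linear in a, so least squares has a closed form."""
    g = [1.0 - math.exp(-b * t) for t, _ in data]
    numerator = sum(gi * y for gi, (_, y) in zip(g, data))
    denominator = sum(gi * gi for gi in g)
    return numerator / denominator


def rss(data, a, b):
    """Residual sum of squares of the model against the data."""
    return sum((y - go_mvf(t, a, b)) ** 2 for t, y in data)


def fit_goel_okumoto(data, b_lo=1e-6, b_hi=1.0, iterations=200):
    """Golden-section search over b with a eliminated analytically."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0

    def cost(b):
        return rss(data, best_a_for_b(data, b), b)

    lo, hi = b_lo, b_hi
    for _ in range(iterations):
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if cost(c) < cost(d):
            hi = d      # minimum lies in [lo, d]
        else:
            lo = c      # minimum lies in [c, hi]
    b = 0.5 * (lo + hi)
    return best_a_for_b(data, b), b


# Synthetic, noise-free data generated from a = 120, b = 0.03 (assumed values).
synthetic = [(t, go_mvf(t, 120.0, 0.03)) for t in range(0, 201, 5)]
a_hat, b_hat = fit_goel_okumoto(synthetic)
print(f"a = {a_hat:.3f}, b = {b_hat:.5f}")
```

Unlike a FittedModel object, this sketch returns only point estimates; standard errors, t-statistics and p-values would require an additional estimate of the parameter covariance.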
Visualization: Plots are generated to visualize how the selected models fit the actual issue data. These plots allow for analysis of issue trends and provide a clear graphical representation of model performance, helping users choose the most appropriate model.
Evaluation: Goodness-of-fit criteria for model comparison are presented in a table view. The criteria are “AIC” (Akaike information criterion), “BIC” (Bayesian information criterion), “R-Squared” (coefficient of determination $R^2$) and “Adjusted R-Squared” ($R^2$ adjusted for the number of model parameters). In each row of the table, the best value among the selected software reliability models is highlighted in blue: the minimum for AIC and BIC, and the maximum for R-squared and adjusted R-squared. This makes it easy to identify the model that best fits the observed data.
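The selection rule described above can be sketched in Python as follows. The formulas are textbook definitions under the assumption of independent Gaussian errors, with the error variance counted as one additional estimated parameter and a centered total sum of squares in $R^2$; the values reported by Mathematica may follow different conventions, so the numbers are not expected to match the tool exactly. The function names and the two sets of predictions are hypothetical.

```python
import math


def fit_criteria(observed, predicted, n_params):
    """AIC, BIC, R-squared and adjusted R-squared for a least-squares fit."""
    n = len(observed)
    rss = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    mean_y = sum(observed) / n
    tss = sum((y - mean_y) ** 2 for y in observed)
    # Maximized Gaussian log-likelihood with variance estimate rss / n.
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)
    k = n_params + 1  # assumption: the error variance counts as a parameter
    r2 = 1.0 - rss / tss
    return {
        "AIC": 2.0 * k - 2.0 * log_lik,
        "BIC": k * math.log(n) - 2.0 * log_lik,
        "R-Squared": r2,
        "Adjusted R-Squared": 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1),
    }


def best_model(table):
    """Per criterion, pick the model with minimum AIC/BIC or maximum R-squared."""
    lower_is_better = {"AIC", "BIC"}
    winners = {}
    for criterion in next(iter(table.values())):
        pick = min if criterion in lower_is_better else max
        winners[criterion] = pick(table, key=lambda name: table[name][criterion])
    return winners


# Made-up observations and predictions from two hypothetical two-parameter models.
observed = [1.0, 3.0, 4.0, 7.0, 9.0]
table = {
    "Model A": fit_criteria(observed, [1.2, 2.8, 4.1, 6.9, 9.0], n_params=2),
    "Model B": fit_criteria(observed, [2.0, 2.0, 5.0, 6.0, 10.0], n_params=2),
}
print(best_model(table))
```

A perfect fit (zero residual sum of squares) would make the log-likelihood undefined, so real use would need a guard for that case.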
The main advantages provided by the interactive interface of our tool are as follows:
Dynamic model fitting with adjustable sample sizes;
Support for multiple optimization algorithms;
Visual comparison of model fits;
Parameter estimates with statistical significance;
Ease of use, even for users without expert knowledge.
We demonstrate the capabilities of our tool using the MPAndroidChart repository as a case study (see Figure 13). MPAndroidChart is a powerful and user-friendly charting library for Android (https://github.com/PhilJay/MPAndroidChart, accessed on 27 July 2025). This open-source project belongs to Philipp Jahoda and is licensed under the Apache License, Version 2.0. Its issue tracker is dedicated exclusively to bug reports and feature requests.
Analysis of the MPAndroidChart repository revealed a decreasing trend in opened issues over time, suggesting a reduction in the introduction of new defects. We also observe a high closure rate: a significant proportion of opened issues have been resolved and closed, indicating active maintenance and responsiveness from the development community.
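The two indicators used in this analysis, the number of issues opened per period and the closure rate, can be computed from (created_at, closed_at) timestamp pairs as in the minimal Python sketch below. The function names are hypothetical and the sample pairs are made up for illustration; they are not data from the MPAndroidChart repository.

```python
from collections import Counter


def opened_per_year(pairs):
    """Count opened issues per calendar year from ISO 8601 creation timestamps."""
    counts = Counter(created[:4] for created, _ in pairs)
    return dict(sorted(counts.items()))


def closure_rate(pairs):
    """Share of issues that have a closure timestamp."""
    if not pairs:
        return 0.0
    closed = sum(1 for _, closed_at in pairs if closed_at is not None)
    return closed / len(pairs)


# Made-up (created_at, closed_at) pairs; None marks an issue that is still open.
sample_pairs = [
    ("2015-03-01T12:30:00Z", "2015-03-04T09:00:00Z"),
    ("2015-06-02T08:00:00Z", "2015-06-03T08:00:00Z"),
    ("2015-11-20T10:00:00Z", "2016-02-01T10:00:00Z"),
    ("2016-01-10T10:00:00Z", None),
]
print(opened_per_year(sample_pairs))
print(closure_rate(sample_pairs))
```

A falling sequence of yearly counts together with a closure rate close to 1 corresponds to the pattern described for the case study.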
In our software reliability tool, we provide a comparative analysis of all five models. Based on the parameter estimates and standard goodness-of-fit tests, we conclude that the Song–Kim–Chang software reliability model offers the most accurate predictions for the number of future faults.
The decreasing number of opened issues, coupled with a high closure rate, suggests that the MPAndroidChart project is maturing and becoming more reliable over time. This aligns with findings from previous studies indicating that OSS projects exhibit reliability growth patterns similar to closed source projects. Moreover, the active engagement of the community in resolving issues contributes to the project’s reliability. As more users and developers participate in identifying and fixing bugs, the software becomes more robust.
By analyzing issue tracking data, we can gain valuable insights into the reliability and maturity of OSS projects. The trends observed in the MPAndroidChart repository demonstrate that a decreasing rate of new issues and a high resolution rate are indicative of a reliable and well-maintained project.
4. Discussion and Conclusions
Open-source projects have become integral to modern software development, offering transparency and collaborative improvement. However, assessing their reliability remains a challenge due to the decentralized nature of their development. The paper presents an analysis of the so-called Hausdorff saturation characteristic for two software reliability models based on the logistic growth model. The main goal is to propose an additional criterion for model selection. The Hausdorff distance to the horizontal asymptote acts as a geometric indicator of software reliability maturity, helping to assess how close the software is to a stable, dependable state. This methodology can be applied to other models as well. We considered four numerical examples using fault detection datasets from two different open-source projects. In the two examples presented, the proposed metric supports the results obtained from the other metrics, including standard goodness-of-fit tests. The question of the usefulness of the proposed Hausdorff saturation criterion remains open and will be the subject of future research, particularly in the context of large datasets.
We also present a new dynamic tool for evaluating the reliability of open-source software through the analysis of issue tracking dynamics. The tool supports the complete reliability assessment process in real time and is designed for use by individuals without specialized expertise. To demonstrate its applicability, we include a case study based on an open-source project. Furthermore, the tool is easily extensible, allowing for the integration of additional software reliability growth models.
The applicability of SRGMs to real-world projects depends heavily on the characteristics of software repositories and the adopted development methodology. For OSS in particular, development typically follows agile practices rather than classical waterfall models. This approach, characterized by short iterations, continuous feedback and the parallel execution of coding and testing, aligns well with the decentralized nature of OSS communities. Consequently, the reliability assessment of OSS must account for the heterogeneity of the reported data, which includes not only genuine defects but also duplicates, enhancement requests and rejected reports. This distinguishes OSS reliability analysis from that of traditional systems, where testing is a more formalized and managed process. Some recent studies [46,47,48] emphasize the need to integrate repository mining and empirical software engineering into SRGM-based analysis, while foundational works [49] continue to provide the guiding theoretical framework.
Future research may explore several enhancements to improve software reliability assessment. The first is incorporating additional metrics. While this study focuses on issue opening and closing trends, integrating other indicators such as pull request activity, code churn, commit frequency and test coverage could offer a more comprehensive view of software reliability and overall project health. A second direction is analyzing the roles and behaviors of contributors to gain insight into how community dynamics affect the evolution of reliability in open-source projects. A third involves developing a real-time monitoring dashboard that continuously tracks issue dynamics and estimates software reliability, thereby supporting maintainers and contributors in evaluating project status on an ongoing basis. A fourth involves correlating issue trends with external quality indicators, such as user ratings, download counts or real-world usage in production systems, to strengthen the link between internal repository data and perceived reliability. Lastly, predictive modeling based on historical issue data could enable the forecasting of future reliability trends, issue arrival rates and resolution times, ultimately aiding resource planning and risk assessment for OSS projects.