Improving Software Reliability in Nuclear Power Plants via Diversity in the Requirements Phase: An Experimental Study
Abstract
1. Introduction
2. Background
2.1. Dependent and Independent Failures in Requirements Inspection
2.2. Inspection Methods
2.3. Z-Model
- Detection activity: A detection activity refers to the actions undertaken by the inspection team in an attempt to identify a defect.
- Multiplicity (k): The multiplicity of a detection activity is the number of failures in the detection activity, denoted by k.
- Dependency (d): The dependency of a detection activity is the number of dependent failures it contains, denoted by d.
- Perception zone: The perception domain of an inspector encompasses the knowledge that the inspector can accurately access or apply to identify defects in an SRS. The universal perception domain U, representing all knowledge required to detect every defect in the SRS, is partitioned into zones according to the perception domains of the inspectors in a team. An example is illustrated in Figure 1. For two inspectors, the universal perception domain U is divided into four perception zones: knowledge held by both inspectors, knowledge held only by the first inspector, knowledge held only by the second inspector, and knowledge held by neither. Assuming that detecting a defect requires the knowledge in a given perception zone, the dependency of a zone is defined as the number of inspectors who lack the necessary knowledge for that zone. For instance, the dependency of the zone lying outside both inspectors' perception domains is 2, since neither inspector possesses the knowledge needed to detect defects in that zone.
- Dependency of a perception zone: The dependency of a perception zone is the number of inspectors lacking the knowledge within that zone. Perception zones can be categorized according to their dependencies, with each zone identified by its dependency d and an index i among the zones sharing that dependency. In this paper, a defect whose detection requires the knowledge contained in a given perception zone is referred to as a defect in that zone.
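As a minimal sketch of the zone definitions above, the following Python snippet enumerates the perception zones for a small team and computes each zone's dependency as the number of inspectors who lack the knowledge in that zone. The inspector names and the function name `perception_zones` are illustrative assumptions, not notation from the Z-model.

```python
from itertools import combinations


def perception_zones(inspectors):
    """Enumerate perception zones for a team.

    Each zone is identified by the set of inspectors who hold its knowledge;
    its dependency is the number of inspectors who lack that knowledge.
    """
    m = len(inspectors)
    zones = []
    for size in range(m, -1, -1):
        for holders in combinations(inspectors, size):
            zones.append({"holders": frozenset(holders), "dependency": m - size})
    return zones


# Hypothetical two-inspector team (names are placeholders).
zones = perception_zones(["inspector_1", "inspector_2"])
for zone in zones:
    print(sorted(zone["holders"]), "-> dependency", zone["dependency"])
```

For two inspectors this yields four zones, and the zone held by neither inspector has dependency 2, matching the example in the text.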
3. Experiment Design
- A background-diverse team is an inspection team in which the members are pursuing different majors.
- A background-uniform team is an inspection team in which all members share the same major.
- A technique-diverse team is an inspection team in which different inspection techniques are used.
- A technique-uniform team is an inspection team in which all members use the same inspection technique.
3.1. Research Questions
- To determine whether the performance of a background-diverse team is better than that of a background-uniform team.
- To determine whether the performance of a technique-diverse team is better than that of a technique-uniform team.
3.2. Hypotheses
- Null hypothesis (background diversity): there is no difference between the performance of the background-diverse teams and the background-uniform teams.
- Alternative hypothesis (background diversity): the performance of the background-diverse teams is better than that of the background-uniform teams.
- Null hypothesis (technique diversity): there is no difference between the performance of the technique-diverse teams and the technique-uniform teams.
- Alternative hypothesis (technique diversity): the performance of the technique-diverse teams is better than that of the technique-uniform teams.
3.3. Research Variables
- The inspection techniques used by the subjects in an inspection team (RIMSM or CBR).
- The backgrounds (i.e., major) of the subjects in an inspection team.
- The number of subjects in an inspection team.
- The SRS documents under inspection.
3.4. Experiment Instrumentations
3.4.1. SRS Documents
3.4.2. Checklist
3.4.3. RITSM Tool
3.4.4. Defect Recording Sheet
3.5. Research Subject Identification
3.6. Experimental Procedure
3.6.1. Collection of Data for Individual Inspectors
- Investigator 1 designed the experiment and prepared all instruments and training materials. Investigator 1 also conducted the analysis of the experimental results.
- Investigator 2 was unaware of the experiment’s hypotheses. To minimize research bias, Investigator 2 delivered the training session presentations and hosted all practice and testing sessions. At the end of each session, Investigator 2 collected the results recording sheets, removed identifiers and methods used by the subjects, and provided the anonymized data for analysis. The results were subsequently analyzed by Investigators 1 and 3.
- Investigator 3 did not participate in the 3-day experiment. Investigators 1 and 3 independently analyzed the results to evaluate inter-rater reliability.
- Investigator 4 supervised the entire experiment.
3.6.2. Creation of Data for Inspection Teams
3.6.3. Identification of Dependent and Independent Failures
- Criterion 1: the inspector must detect at least c defects in the SRS.
- Criterion 2: the inspector must detect at least c defects within the function containing the defect or within a closely related function.
- Criterion 3: the inspector must detect at least c defects of the same type in previous testing or practice sessions.
3.6.4. Analysis of Results
4. Experiment Results
4.1. Stability of the Subjects’ Performance
- Null hypothesis: there is no significant difference between the performance of the inspectors using the two inspection techniques in the third and fourth rounds.
- Alternative hypothesis: there is a significant difference between the performance of the inspectors using the two inspection techniques in the third and fourth rounds.
4.2. Inter-Rater Reliability
4.3. Comparison of Background-Diverse Teams and Background-Uniform Teams
4.3.1. Creation of Background-Diverse Teams and Background-Uniform Teams
4.3.2. Results of Comparing Background-Diverse Teams and Background-Uniform Teams
4.4. Comparison of Technique-Diverse Teams and Technique-Uniform Teams
4.4.1. Creation of Technique-Diverse Teams and Technique-Uniform Teams
4.4.2. Results of Comparing Technique-Diverse Teams and Technique-Uniform Teams
4.5. Threats to Validity
4.5.1. Internal Validity
- Selection bias
- Rivalry
- History
- Maturation
- Repeated testing
- Hawthorne effect
- Experimenter bias
- Observer-expectancy effect
- Mortality
4.5.2. External Validity
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Sensitivity Analysis for the Criteria of the Dependent Failures and Independent Failures
Size | Team | c = 2 | c = 1 | Change% |
---|---|---|---|---|
m = 2 | BU_C1 | 0.13 | 0.16 | 18.7% |
m = 2 | BU_C2 | 0.13 | 0.13 | 0.0% |
m = 2 | BD_C | 0.14 | 0.15 | 7.0% |
m = 2 | BU_R1 | 0.04 | 0.04 | 12.7% |
m = 2 | BU_R2 | 0.03 | 0.03 | 0.0% |
m = 2 | BD_R | 0.06 | 0.06 | 4.5% |
m = 2 | TU_C | 0.14 | 0.15 | 7.2% |
m = 2 | TU_R | 0.05 | 0.05 | 4.9% |
m = 2 | TD | 0.02 | 0.03 | 26.7% |
m = 4 | TU_C | 0.04 | 0.05 | 9.9% |
m = 4 | TU_R | 0.0020 | 0.0022 | 9.1% |
m = 4 | TD | 0.0003 | 0.0008 | 56.1% |
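The Change% column appears consistent with the relative difference between the two settings, taken with respect to the c = 1 value. This is an inference from the tabulated numbers rather than a definition stated in the text, and because the displayed values are rounded, not every row can be reproduced exactly. A minimal sketch under that assumption (the function name `change_percent` is hypothetical):

```python
def change_percent(value_c2, value_c1):
    """Relative change from c = 2 to c = 1, as a percentage of the c = 1 value."""
    if value_c1 == 0:
        raise ValueError("c = 1 value must be non-zero")
    return (value_c1 - value_c2) / value_c1 * 100.0


# TU_R row for m = 4 in the table above: 0.0020 (c = 2) and 0.0022 (c = 1),
# which the table reports as a 9.1% change.
print(f"{change_percent(0.0020, 0.0022):.1f}%")
```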
Size | Team | c = 2 | c = 1 | Change% |
---|---|---|---|---|
m = 2 | BU_C1 | 0.44 | 0.46 | 4.3% |
m = 2 | BU_C2 | 0.43 | 0.44 | 3.0% |
m = 2 | BD_C | 0.47 | 0.48 | 2.7% |
m = 2 | BU_R1 | 0.24 | 0.25 | 2.0% |
m = 2 | BU_R2 | 0.26 | 0.26 | 1.7% |
m = 2 | BD_R | 0.26 | 0.27 | 1.6% |
m = 2 | TU_C | 0.49 | 0.50 | 2.7% |
m = 2 | TU_R | 0.26 | 0.26 | 1.6% |
m = 2 | TD | 0.21 | 0.22 | 5.3% |
m = 4 | TU_C | 0.31 | 0.32 | 3.8% |
m = 4 | TU_R | 0.15 | 0.15 | 1.5% |
m = 4 | TD | 0.04 | 0.04 | 12.0% |
Size | Team | c = 2 | c = 1 | Change% |
---|---|---|---|---|
m = 2 | BU_C1 | 0.55 | 0.55 | 0.7% |
m = 2 | BU_C2 | 0.56 | 0.56 | 0.3% |
m = 2 | BD_C | 0.57 | 0.57 | 0.4% |
m = 2 | BU_R1 | 0.21 | 0.22 | 2.4% |
m = 2 | BU_R2 | 0.24 | 0.24 | 1.7% |
m = 2 | BD_R | 0.25 | 0.25 | 1.7% |
m = 2 | TU_C | 0.53 | 0.53 | 0.4% |
m = 2 | TU_R | 0.29 | 0.29 | 1.6% |
m = 2 | TD | 0.25 | 0.25 | 1.7% |
m = 4 | TU_C | 0.37 | 0.37 | 0.4% |
m = 4 | TU_R | 0.14 | 0.14 | 2.8% |
m = 4 | TD | 0.06 | 0.07 | 3.9% |
SRS Size | BU_C1 vs. BD_C | BU_C2 vs. BD_C | BU_R1 vs. BD_R | BU_R2 vs. BD_R |
---|---|---|---|---|
Small | Not rejected | Not rejected | Not rejected | Not rejected |
Medium | Not rejected | Not rejected | Not rejected | Not rejected |
Large | Not rejected | Not rejected | Not rejected | Not rejected |
SRS Size | TD vs. TU_C (m = 2) | TD vs. TU_C (m = 4) | TD vs. TU_R (m = 2) | TD vs. TU_R (m = 4) |
---|---|---|---|---|
Small | Reject | Reject | Reject | Reject |
Medium | Reject | Reject | Reject | Reject |
Large | Reject | Reject | Reject | Reject |
Appendix B. Summary of the Performance Statistics for the BU_R1, BU_R2, and BD_R Teams
BU_R1 mean | BU_R1 std | BU_R2 mean | BU_R2 std | BD_R mean | BD_R std | BU_R1 vs. BD_R p-value | BU_R1 vs. BD_R power | BU_R1 vs. BD_R eff. size | BU_R2 vs. BD_R p-value | BU_R2 vs. BD_R power | BU_R2 vs. BD_R eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.04 | 0.09 | 0.03 | 0.07 | 0.06 | 0.09 | 0.22 | 0.12 | 0.19 | 0.12 | 0.18 | 0.27 | |
0.18 | 0.07 | 0.18 | 0.09 | 0.18 | 0.09 | 0.49 | 0.05 | 0.00 | 0.44 | 0.05 | 0.04 | |
0.16 | 0.08 | 0.16 | 0.08 | 0.16 | 0.08 | 0.49 | 0.05 | 0.00 | 0.43 | 0.05 | 0.04 | |
0.02 | 0.03 | 0.02 | 0.03 | 0.02 | 0.04 | 0.50 | 0.05 | 0.00 | 0.50 | 0.05 | 0.00 | |
0.80 | 0.41 | 0.84 | 0.34 | 0.70 | 0.44 | 0.17 | 0.15 | 0.23 | 0.08 | 0.24 | 0.34 | |
0.16 | 0.37 | 0.11 | 0.28 | 0.22 | 0.40 | 0.24 | 0.10 | 0.17 | 0.08 | 0.22 | 0.32 | |
0.13 | 0.09 | 0.13 | 0.07 | 0.11 | 0.08 | 0.21 | 0.13 | 0.20 | 0.14 | 0.17 | 0.27 | |
0.04 | 0.09 | 0.03 | 0.07 | 0.05 | 0.09 | 0.23 | 0.11 | 0.18 | 0.10 | 0.20 | 0.29 |
BU_C1 mean | BU_C1 std | BU_C2 mean | BU_C2 std | BD_C mean | BD_C std | BU_C1 vs. BD_C p-value | BU_C1 vs. BD_C power | BU_C1 vs. BD_C eff. size | BU_C2 vs. BD_C p-value | BU_C2 vs. BD_C power | BU_C2 vs. BD_C eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.25 | 0.15 | 0.26 | 0.16 | 0.27 | 0.17 | 0.32 | 0.07 | 0.11 | 0.45 | 0.05 | 0.03 | |
0.45 | 0.15 | 0.47 | 0.15 | 0.46 | 0.16 | 0.35 | 0.07 | 0.09 | 0.45 | 0.05 | 0.03 | |
0.42 | 0.15 | 0.44 | 0.15 | 0.43 | 0.16 | 0.37 | 0.06 | 0.08 | 0.43 | 0.05 | 0.05 | |
0.03 | 0.05 | 0.03 | 0.03 | 0.03 | 0.04 | 0.45 | 0.05 | 0.03 | 0.40 | 0.06 | 0.06 | |
0.51 | 0.28 | 0.47 | 0.26 | 0.48 | 0.32 | 0.35 | 0.06 | 0.09 | 0.46 | 0.05 | 0.03 | |
0.49 | 0.28 | 0.53 | 0.26 | 0.52 | 0.32 | 0.35 | 0.06 | 0.09 | 0.46 | 0.05 | 0.03 | |
0.19 | 0.10 | 0.20 | 0.11 | 0.18 | 0.11 | 0.35 | 0.06 | 0.09 | 0.31 | 0.08 | 0.13 | |
0.22 | 0.16 | 0.24 | 0.15 | 0.24 | 0.17 | 0.28 | 0.08 | 0.14 | 0.43 | 0.05 | 0.04 |
BU_C1 mean | BU_C1 std | BU_C2 mean | BU_C2 std | BD_C mean | BD_C std | BU_C1 vs. BD_C p-value | BU_C1 vs. BD_C power | BU_C1 vs. BD_C eff. size | BU_C2 vs. BD_C p-value | BU_C2 vs. BD_C power | BU_C2 vs. BD_C eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.22 | 0.17 | 0.24 | 0.14 | 0.25 | 0.18 | 0.25 | 0.10 | 0.17 | 0.43 | 0.05 | 0.04 | |
0.45 | 0.20 | 0.42 | 0.17 | 0.44 | 0.20 | 0.38 | 0.06 | 0.08 | 0.37 | 0.06 | 0.08 | |
0.43 | 0.21 | 0.40 | 0.17 | 0.41 | 0.20 | 0.38 | 0.06 | 0.08 | 0.36 | 0.06 | 0.09 | |
0.02 | 0.02 | 0.03 | 0.03 | 0.03 | 0.02 | 0.40 | 0.06 | 0.06 | 0.42 | 0.05 | 0.06 | |
0.59 | 0.27 | 0.45 | 0.17 | 0.50 | 0.27 | 0.10 | 0.26 | 0.36 | 0.18 | 0.11 | 0.20 | |
0.41 | 0.27 | 0.55 | 0.17 | 0.50 | 0.27 | 0.10 | 0.26 | 0.36 | 0.18 | 0.11 | 0.20 | |
0.23 | 0.13 | 0.17 | 0.09 | 0.18 | 0.11 | 0.07 | 0.35 | 0.43 | 0.36 | 0.06 | 0.09 | |
0.20 | 0.17 | 0.22 | 0.13 | 0.23 | 0.18 | 0.25 | 0.10 | 0.18 | 0.42 | 0.05 | 0.05 |
Appendix C. Summary of the Performance Statistics for the TU_R, TU_C, and TD Teams
TU_C mean | TU_C std | TU_R mean | TU_R std | TD mean | TD std | TU_C vs. TD p-value | TU_C vs. TD power | TU_C vs. TD eff. size | TU_R vs. TD p-value | TU_R vs. TD power | TU_R vs. TD eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.50 | 0.23 | 0.26 | 0.16 | 0.22 | 0.20 | 0.00 | 1.00 | 1.31 | 0.02 | 0.50 | 0.22 | |
0.66 | 0.16 | 0.44 | 0.15 | 0.55 | 0.16 | 0.00 | 1.00 | 0.69 | 0.00 | 1.00 | 0.66 | |
0.63 | 0.17 | 0.41 | 0.16 | 0.52 | 0.17 | 0.00 | 1.00 | 0.70 | 0.00 | 1.00 | 0.67 | |
0.02 | 0.04 | 0.03 | 0.04 | 0.03 | 0.04 | 0.06 | 0.31 | 0.16 | 0.14 | 0.19 | 0.12 | |
0.32 | 0.32 | 0.47 | 0.31 | 0.72 | 0.28 | 0.00 | 1.00 | 1.38 | 0.00 | 1.00 | 0.89 | |
0.68 | 0.32 | 0.53 | 0.31 | 0.28 | 0.28 | 0.00 | 1.00 | 1.38 | 0.00 | 1.00 | 0.89 | |
0.17 | 0.13 | 0.17 | 0.10 | 0.34 | 0.11 | 0.00 | 1.00 | 1.47 | 0.00 | 1.00 | 1.54 | |
0.47 | 0.25 | 0.24 | 0.16 | 0.18 | 0.20 | 0.00 | 1.00 | 1.33 | 0.00 | 0.79 | 0.31 |
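For readers who want to sanity-check the effect-size column, the sketch below computes Cohen's d from summary statistics using a pooled standard deviation weighted by sample size. This is an assumed, standard formulation; the text here does not state the exact estimator used. The sample sizes (121 TU_C teams and 264 TD teams, the m = 2 team counts) are likewise an assumption about which team size this table reports. The function name `cohens_d` is illustrative.

```python
import math


def cohens_d(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Cohen's d with a pooled standard deviation weighted by sample size."""
    pooled_var = ((n_a - 1) * std_a**2 + (n_b - 1) * std_b**2) / (n_a + n_b - 2)
    return abs(mean_a - mean_b) / math.sqrt(pooled_var)


# First row of the table above: TU_C (mean 0.50, std 0.23) vs. TD (mean 0.22, std 0.20).
d = cohens_d(0.50, 0.23, 121, 0.22, 0.20, 264)
print(round(d, 2))
```

With these rounded inputs the result is about 1.33, close to the tabulated 1.31; the small gap is plausibly due to rounding in the displayed means and standard deviations.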
TU_C mean | TU_C std | TU_R mean | TU_R std | TD mean | TD std | TU_C vs. TD p-value | TU_C vs. TD power | TU_C vs. TD eff. size | TU_R vs. TD p-value | TU_R vs. TD power | TU_R vs. TD eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.32 | 0.23 | 0.15 | 0.11 | 0.04 | 0.08 | 0.00 | 1.00 | 2.59 | 0.00 | 1.00 | 1.20 | |
0.66 | 0.10 | 0.45 | 0.10 | 0.55 | 0.11 | 0.00 | 1.00 | 1.07 | 0.00 | 1.00 | 0.96 | |
0.64 | 0.11 | 0.41 | 0.11 | 0.52 | 0.11 | 0.00 | 1.00 | 1.07 | 0.00 | 1.00 | 0.94 | |
0.02 | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 | 0.00 | 1.00 | 0.17 | 0.06 | 0.40 | 0.06 | |
0.07 | 0.06 | 0.20 | 0.13 | 0.12 | 0.16 | 0.00 | 1.00 | 0.35 | 0.00 | 1.00 | 0.51 | |
0.12 | 0.15 | 0.19 | 0.18 | 0.51 | 0.24 | 0.00 | 1.00 | 1.69 | 0.00 | 1.00 | 1.35 | |
0.40 | 0.29 | 0.30 | 0.23 | 0.33 | 0.25 | 0.00 | 1.00 | 0.27 | 0.00 | 0.72 | 0.09 | |
0.42 | 0.33 | 0.30 | 0.25 | 0.04 | 0.11 | 0.00 | 1.00 | 2.53 | 0.00 | 1.00 | 1.98 | |
0.04 | 0.03 | 0.08 | 0.05 | 0.05 | 0.05 | 0.00 | 0.97 | 0.14 | 0.00 | 1.00 | 0.59 | |
0.02 | 0.03 | 0.03 | 0.03 | 0.08 | 0.04 | 0.00 | 1.00 | 1.69 | 0.00 | 1.00 | 1.58 | |
0.08 | 0.06 | 0.04 | 0.03 | 0.06 | 0.05 | 0.00 | 1.00 | 0.29 | 0.00 | 1.00 | 0.47 | |
0.29 | 0.25 | 0.14 | 0.12 | 0.03 | 0.08 | 0.00 | 1.00 | 2.48 | 0.00 | 1.00 | 1.37 |
TU_C mean | TU_C std | TU_R mean | TU_R std | TD mean | TD std | TU_C vs. TD p-value | TU_C vs. TD power | TU_C vs. TD eff. size | TU_R vs. TD p-value | TU_R vs. TD power | TU_R vs. TD eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.53 | 0.24 | 0.29 | 0.19 | 0.25 | 0.21 | 0.00 | 1.00 | 1.29 | 0.03 | 0.43 | 0.21 | |
0.69 | 0.16 | 0.49 | 0.19 | 0.57 | 0.15 | 0.00 | 1.00 | 0.82 | 0.00 | 1.00 | 0.53 | |
0.68 | 0.16 | 0.46 | 0.19 | 0.55 | 0.16 | 0.00 | 1.00 | 0.83 | 0.00 | 1.00 | 0.53 | |
0.02 | 0.01 | 0.02 | 0.02 | 0.02 | 0.02 | 0.03 | 0.41 | 0.20 | 0.08 | 0.32 | 0.17 | |
0.28 | 0.23 | 0.43 | 0.19 | 0.66 | 0.27 | 0.00 | 1.00 | 1.45 | 0.00 | 1.00 | 0.92 | |
0.72 | 0.23 | 0.57 | 0.19 | 0.34 | 0.27 | 0.00 | 1.00 | 1.45 | 0.00 | 1.00 | 0.92 | |
0.16 | 0.10 | 0.18 | 0.10 | 0.33 | 0.11 | 0.00 | 1.00 | 1.57 | 0.00 | 1.00 | 1.37 | |
0.52 | 0.24 | 0.27 | 0.19 | 0.22 | 0.21 | 0.00 | 1.00 | 1.33 | 0.01 | 0.59 | 0.25 |
TU_C mean | TU_C std | TU_R mean | TU_R std | TD mean | TD std | TU_C vs. TD p-value | TU_C vs. TD power | TU_C vs. TD eff. size | TU_R vs. TD p-value | TU_R vs. TD power | TU_R vs. TD eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.37 | 0.23 | 0.14 | 0.10 | 0.07 | 0.11 | 0.00 | 1.00 | 2.38 | 0.00 | 1.00 | 0.70 | |
0.70 | 0.11 | 0.50 | 0.15 | 0.57 | 0.10 | 0.00 | 1.00 | 1.37 | 0.00 | 1.00 | 0.54 | |
0.70 | 0.11 | 0.49 | 0.15 | 0.55 | 0.11 | 0.00 | 1.00 | 1.38 | 0.00 | 1.00 | 0.56 | |
0.02 | 0.01 | 0.02 | 0.01 | 0.02 | 0.01 | 0.00 | 1.00 | 0.29 | 0.00 | 1.00 | 0.26 | |
0.05 | 0.04 | 0.13 | 0.11 | 0.09 | 0.11 | 0.00 | 1.00 | 0.41 | 0.00 | 1.00 | 0.40 | |
0.11 | 0.11 | 0.25 | 0.16 | 0.48 | 0.22 | 0.00 | 1.00 | 1.73 | 0.00 | 1.00 | 1.07 | |
0.36 | 0.22 | 0.35 | 0.17 | 0.35 | 0.20 | 0.08 | 0.33 | 0.06 | 0.41 | 0.05 | 0.01 | |
0.48 | 0.28 | 0.27 | 0.15 | 0.08 | 0.15 | 0.00 | 1.00 | 2.42 | 0.00 | 1.00 | 1.27 | |
0.03 | 0.02 | 0.05 | 0.03 | 0.04 | 0.04 | 0.00 | 1.00 | 0.26 | 0.00 | 1.00 | 0.29 | |
0.02 | 0.02 | 0.04 | 0.03 | 0.08 | 0.03 | 0.00 | 1.00 | 1.87 | 0.00 | 1.00 | 1.27 | |
0.08 | 0.05 | 0.06 | 0.04 | 0.07 | 0.05 | 0.00 | 1.00 | 0.25 | 0.00 | 0.99 | 0.17 | |
0.36 | 0.24 | 0.13 | 0.10 | 0.05 | 0.11 | 0.00 | 1.00 | 2.38 | 0.00 | 1.00 | 0.71 |
Method | CBR | RIMSM |
---|---|---|
Instrumentation | Checklist | Tool (RITSM) |
Defect coverage | Defects defined in the checklist | Defects considered in model mutation |
Inspection item | SRS document | Results of execution of the SRS model |
Defect detection activity | Answer questions in the checklist | Examine the system behaviors and outputs in different scenarios |
Data | Definition |
---|---|
m | Number of Inspectors in the Team |
Total Number of Detection Activities | |
Detection Activities | |
Detection Activities |
SRS Label | SRS Topic | Number of Functions | Number of Pages | Number of Defects |
---|---|---|---|---|
S1 | A water level control system | 4 | 3 | 5 |
S2 | A reaction chamber control system in a chemical plant | 5 | 3 | 4 |
S3 | An automated car assembly system | 5 | 3 | 4 |
S4 | A fly safety system | 5 | 3 | 3 |
S5 | A valve control system | 5 | 3 | 5 |
S6 | A vehicle speed monitor system | 5 | 3 | 4 |
S7 | A post-collision event control system | 5 | 3 | 6 |
S8 | An automobile cruise control and monitoring system | 5 | 3 | 5 |
S9 | A digital-based small reactor protection system | 10 | 6 | 9 |
S10 | An elevator control system | 11 | 6 | 11 |
S11 | An integrated vehicle-based safety system | 21 | 11 | 55 |
S12 | An embedded control software for smart sensor | 22 | 12 | 22 |
Major | Number of Subjects |
---|---|
Computer Science & Engineering (CSE) | 11 |
Electrical Engineering (EE) | 10 |
Mechanical Engineering (ME) | 2 |
Day | Sessions | Content |
---|---|---|
Day 1 | training 1 | Introduced what requirements engineering is |
Day 1 | training 2 | Introduced the CBR method |
Day 1 | practice 1 | Practiced the CBR method using a small-size SRS (S1) |
Day 1 | training 3 | Introduced the RIMSM method |
Day 1 | practice 2 | Practiced the RIMSM method using a small-size SRS (S2) |
Day 2 | training 4 | Reviewed the CBR method |
Day 2 | practice 3 | Practiced the CBR method using a small-size SRS (S3) |
Day 2 | training 5 | Reviewed the RIMSM method |
Day 2 | practice 4 | Practiced the RIMSM method using a small-size SRS (S4) |
Day 2 | practice 5 | Practiced the CBR method using a small-size SRS (S5) |
Day 2 | practice 6 | Practiced the RIMSM method using a small-size SRS (S6) |
Day 3 | testing 1 | Inspected a small-size SRS (S7) |
Day 3 | testing 2 | Inspected a small-size SRS (S8) |
Day 3 | testing 3 | Inspected a medium-size SRS (S9) |
Day 3 | testing 4 | Inspected a medium-size SRS (S10) |
Day 3 | testing 5 | Inspected a large-size SRS (S11) |
Day 3 | testing 6 | Inspected a large-size SRS (S12) |
Session | SRS | Method, Group 1: CSE (6) | Method, Group 2: EE (5) | Method, Group 3: ME (1) | Method, Group 4: CSE (5) | Method, Group 5: EE (5) | Method, Group 6: ME (1) |
---|---|---|---|---|---|---|---|
testing session 1 | S7 (small size) | RIMSM | RIMSM | RIMSM | CBR | CBR | CBR |
testing session 2 | S8 (small size) | CBR | CBR | CBR | RIMSM | RIMSM | RIMSM |
testing session 3 | S9 (medium size) | RIMSM | RIMSM | RIMSM | CBR | CBR | CBR |
testing session 4 | S10 (medium size) | CBR | CBR | CBR | RIMSM | RIMSM | RIMSM |
testing session 5 | S11 (large size) | RIMSM | RIMSM | RIMSM | CBR | CBR | CBR |
testing session 6 | S12 (large size) | CBR | CBR | CBR | RIMSM | RIMSM | RIMSM |
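The assignment in the table above follows a crossover pattern: Groups 1–3 use RIMSM in odd-numbered testing sessions and CBR in even-numbered ones, while Groups 4–6 do the opposite. A minimal sketch that regenerates this schedule (the function name `inspection_method` is an illustrative assumption):

```python
def inspection_method(group, session):
    """Return the inspection method for a group (1-6) in a testing session (1-6)."""
    if not (1 <= group <= 6 and 1 <= session <= 6):
        raise ValueError("group and session must be between 1 and 6")
    first_half = group <= 3          # Groups 1-3 start with RIMSM
    odd_session = session % 2 == 1
    return "RIMSM" if first_half == odd_session else "CBR"


for session in range(1, 7):
    row = [inspection_method(group, session) for group in range(1, 7)]
    print(f"testing session {session}:", " | ".join(row))
```

Because every subject alternates between the two methods across SRS documents of each size, each inspector contributes data for both RIMSM and CBR.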
Statistic | RIMSM | CBR |
---|---|---|
Number of data points | 23 | 23 |
Significance level | 0.05 | 0.05 |
p-value (two-tailed) | 0.98 | 0.78 |
Statistical power | 5.9% | 5.0% |
Label | Method | Major | Team Creation | Number of Teams of Size m = 2 |
---|---|---|---|---|
BU_R1 | RIMSM | CSE | Select m inspectors from Group 1 and combine their results in testing session 1; select m inspectors from Group 4 and combine their results in testing session 2 | 25 |
BU_R2 | RIMSM | EE | Select m inspectors from Group 2 and combine their results in testing session 1; select m inspectors from Group 5 and combine their results in testing session 2 | 20 |
BU_C1 | CBR | CSE | Select m inspectors from Group 1 and combine their results in testing session 2; select m inspectors from Group 4 and combine their results in testing session 1 | 25 |
BU_C2 | CBR | EE | Select m inspectors from Group 2 and combine their results in testing session 2; select m inspectors from Group 5 and combine their results in testing session 1 | 20 |
Label | Method | Major | Team Creation | Number of Teams of Size m = 2 |
---|---|---|---|---|
BD_R | RIMSM | Both (CSE, EE) | Select m/2 inspectors from Group 1 and m/2 inspectors from Group 2, and combine their results in testing session 1; select m/2 inspectors from Group 4 and m/2 inspectors from Group 5, and combine their results in testing session 2 | 55 |
BD_C | CBR | Both (CSE, EE) | Select m/2 inspectors from Group 1 and m/2 inspectors from Group 2, and combine their results in testing session 2; select m/2 inspectors from Group 4 and m/2 inspectors from Group 5, and combine their results in testing session 1 | 55 |
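The team counts in the two tables above follow from simple combinatorics on the group sizes listed in the session-assignment table (Group 1: 6 CSE, Group 2: 5 EE, Group 4: 5 CSE, Group 5: 5 EE). The sketch below reproduces them; the helper names are illustrative assumptions.

```python
from math import comb

# Group sizes taken from the session-assignment table (CSE and EE groups only).
GROUP_SIZES = {1: 6, 2: 5, 4: 5, 5: 5}


def uniform_team_count(group_a, group_b, m=2):
    """Teams of size m drawn entirely from one group, summed over two groups."""
    return comb(GROUP_SIZES[group_a], m) + comb(GROUP_SIZES[group_b], m)


def diverse_team_count(pair_a, pair_b, m=2):
    """Teams with m/2 members from each group of a pair, summed over two pairs."""
    half = m // 2
    return sum(
        comb(GROUP_SIZES[g1], half) * comb(GROUP_SIZES[g2], half)
        for g1, g2 in (pair_a, pair_b)
    )


print(uniform_team_count(1, 4))            # BU_R1 and BU_C1 (CSE groups)
print(uniform_team_count(2, 5))            # BU_R2 and BU_C2 (EE groups)
print(diverse_team_count((1, 2), (4, 5)))  # BD_R and BD_C
```

The three printed values are 25, 20, and 55, matching the tabulated numbers of teams for m = 2.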
BU_C1 mean | BU_C1 std | BU_C2 mean | BU_C2 std | BD_C mean | BD_C std | BU_C1 vs. BD_C p-value | BU_C1 vs. BD_C power | BU_C1 vs. BD_C eff. size | BU_C2 vs. BD_C p-value | BU_C2 vs. BD_C power | BU_C2 vs. BD_C eff. size | Row |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.16 | 0.17 | 0.13 | 0.15 | 0.15 | 0.17 | 0.44 | 0.05 | 0.04 | 0.26 | 0.09 | 0.16 | 1 | |
0.39 | 0.22 | 0.32 | 0.16 | 0.35 | 0.21 | 0.23 | 0.12 | 0.19 | 0.23 | 0.10 | 0.18 | 2 | |
0.34 | 0.21 | 0.28 | 0.18 | 0.31 | 0.21 | 0.28 | 0.09 | 0.14 | 0.29 | 0.08 | 0.14 | 3 | |
0.05 | 0.06 | 0.04 | 0.05 | 0.04 | 0.06 | 0.25 | 0.10 | 0.16 | 0.28 | 0.08 | 0.14 | 4 | |
0.66 | 0.36 | 0.62 | 0.43 | 0.58 | 0.40 | 0.17 | 0.15 | 0.23 | 0.33 | 0.07 | 0.12 | 5 | |
0.22 | 0.27 | 0.23 | 0.35 | 0.26 | 0.33 | 0.26 | 0.09 | 0.14 | 0.35 | 0.07 | 0.10 | 6 | |
0.23 | 0.13 | 0.18 | 0.12 | 0.18 | 0.13 | 0.07 | 0.31 | 0.36 | 0.42 | 0.05 | 0.05 | 7 | |
0.11 | 0.15 | 0.11 | 0.16 | 0.13 | 0.17 | 0.33 | 0.07 | 0.10 | 0.32 | 0.07 | 0.12 | 8 | |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
BU_C1 mean | BU_C1 std | BU_C2 mean | BU_C2 std | BD_C mean | BD_C std | BU_C1 vs. BD_C p-value | BU_C1 vs. BD_C power | BU_C1 vs. BD_C eff. size | BU_C2 vs. BD_C p-value | BU_C2 vs. BD_C power | BU_C2 vs. BD_C eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.46 | 0.19 | 0.44 | 0.28 | 0.48 | 0.24 | 0.37 | 0.06 | 0.08 | 0.29 | 0.09 | 0.16 | |
0.64 | 0.12 | 0.63 | 0.17 | 0.63 | 0.17 | 0.43 | 0.05 | 0.04 | 0.46 | 0.05 | 0.03 | |
0.61 | 0.11 | 0.61 | 0.19 | 0.61 | 0.18 | 0.49 | 0.05 | 0.00 | 0.49 | 0.05 | 0.01 | |
0.03 | 0.05 | 0.02 | 0.03 | 0.02 | 0.04 | 0.29 | 0.09 | 0.15 | 0.35 | 0.06 | 0.09 | |
0.35 | 0.28 | 0.43 | 0.40 | 0.34 | 0.33 | 0.46 | 0.05 | 0.03 | 0.19 | 0.16 | 0.25 | |
0.65 | 0.28 | 0.57 | 0.40 | 0.66 | 0.33 | 0.46 | 0.05 | 0.03 | 0.19 | 0.16 | 0.25 | |
0.19 | 0.15 | 0.20 | 0.14 | 0.16 | 0.12 | 0.16 | 0.19 | 0.27 | 0.17 | 0.18 | 0.27 | |
0.42 | 0.21 | 0.41 | 0.31 | 0.45 | 0.25 | 0.26 | 0.09 | 0.15 | 0.32 | 0.08 | 0.14 |
BU_C1 mean | BU_C1 std | BU_C2 mean | BU_C2 std | BD_C mean | BD_C std | BU_C1 vs. BD_C p-value | BU_C1 vs. BD_C power | BU_C1 vs. BD_C eff. size | BU_C2 vs. BD_C p-value | BU_C2 vs. BD_C power | BU_C2 vs. BD_C eff. size |
---|---|---|---|---|---|---|---|---|---|---|---|
0.55 | 0.22 | 0.56 | 0.26 | 0.57 | 0.24 | 0.35 | 0.07 | 0.10 | 0.44 | 0.05 | 0.04 | |
0.70 | 0.15 | 0.71 | 0.18 | 0.71 | 0.18 | 0.44 | 0.05 | 0.04 | 0.44 | 0.05 | 0.04 | |
0.68 | 0.16 | 0.70 | 0.18 | 0.69 | 0.18 | 0.41 | 0.06 | 0.06 | 0.42 | 0.06 | 0.06 | |
0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0.01 | 0.20 | 0.13 | 0.23 | 0.15 | 0.15 | 0.25 | |
0.25 | 0.20 | 0.27 | 0.27 | 0.24 | 0.23 | 0.37 | 0.06 | 0.09 | 0.30 | 0.09 | 0.16 | |
0.75 | 0.20 | 0.73 | 0.27 | 0.76 | 0.23 | 0.37 | 0.06 | 0.09 | 0.30 | 0.09 | 0.16 | |
0.15 | 0.08 | 0.15 | 0.10 | 0.13 | 0.09 | 0.22 | 0.12 | 0.20 | 0.22 | 0.12 | 0.22 | |
0.53 | 0.22 | 0.55 | 0.26 | 0.56 | 0.24 | 0.32 | 0.07 | 0.12 | 0.44 | 0.05 | 0.04 |
SRS Size | BU_C1 vs. BD_C | BU_C2 vs. BD_C | BU_R1 vs. BD_R | BU_R2 vs. BD_R |
---|---|---|---|---|
Small | Not rejected | Not rejected | Not rejected | Not rejected |
Medium | Not rejected | Not rejected | Not rejected | Not rejected |
Large | Not rejected | Not rejected | Not rejected | Not rejected |
Label | Method | Team Creation | Number of Teams (m = 2) | Number of Teams (m = 4) |
---|---|---|---|---|
TU_R | RIMSM | Select m inspectors from Groups 1–3 and combine their results in testing session 1; select m inspectors from Groups 4–6 and combine their results in testing session 2 | 121 | 825 |
TU_C | CBR | Select m inspectors from Groups 1–3 and combine their results in testing session 2; select m inspectors from Groups 4–6 and combine their results in testing session 1 | 121 | 825 |
Label | Method | Team Creation | Number of Teams (m = 2) | Number of Teams (m = 4) |
---|---|---|---|---|
TD | RIMSM, CBR | Select m/2 inspectors from Groups 1–3 and m/2 inspectors from Groups 4–6, and combine their results in testing session 1 and testing session 2 | 264 | 7260 |
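The technique-uniform and technique-diverse team counts can be checked by explicit enumeration. The sketch below assumes 12 inspectors in Groups 1–3 and 11 in Groups 4–6 (the group sizes from the session-assignment table), and assumes that technique-diverse teams are formed once for each of the two testing sessions. Inspector IDs and function names are placeholders.

```python
from itertools import combinations, product

# Inspector counts per block of groups, from the session-assignment table:
# Groups 1-3 have 6 + 5 + 1 = 12 inspectors, Groups 4-6 have 5 + 5 + 1 = 11.
block_a = [f"A{i}" for i in range(12)]  # placeholder inspector IDs
block_b = [f"B{i}" for i in range(11)]


def uniform_teams(m):
    """All size-m teams drawn from a single block (one method per session)."""
    return list(combinations(block_a, m)) + list(combinations(block_b, m))


def diverse_teams(m, sessions=2):
    """Teams with m/2 members from each block, formed once per testing session."""
    half = m // 2
    pairs = list(product(combinations(block_a, half), combinations(block_b, half)))
    return [(s, a + b) for s in range(1, sessions + 1) for a, b in pairs]


for m in (2, 4):
    print(m, len(uniform_teams(m)), len(diverse_teams(m)))
```

Under these assumptions the enumeration gives 121 and 825 uniform teams and 264 and 7260 diverse teams for m = 2 and m = 4, in agreement with the tables.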
TU_C mean | TU_C std | TU_R mean | TU_R std | TD mean | TD std | TU_C vs. TD p-value | TU_C vs. TD power | TU_C vs. TD eff. size | TU_R vs. TD p-value | TU_R vs. TD power | TU_R vs. TD eff. size | Row |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.15 | 0.18 | 0.047 | 0.09 | 0.027 | 0.05 | 0.00 | 1.00 | 1.10 | 0.01 | 0.83 | 0.32 | 1 | |
0.33 | 0.20 | 0.17 | 0.08 | 0.25 | 0.14 | 0.00 | 1.00 | 0.54 | 0.00 | 1.00 | 0.60 | 2 | |
0.28 | 0.20 | 0.16 | 0.08 | 0.22 | 0.15 | 0.00 | 0.97 | 0.42 | 0.00 | 0.98 | 0.45 | 3 | |
0.05 | 0.05 | 0.02 | 0.03 | 0.03 | 0.05 | 0.00 | 0.89 | 0.35 | 0.00 | 0.91 | 0.37 | 4 | |
0.57 | 0.42 | 0.74 | 0.43 | 0.84 | 0.34 | 0.00 | 1.00 | 0.72 | 0.01 | 0.70 | 0.27 | 5 | |
0.24 | 0.34 | 0.19 | 0.38 | 0.03 | 0.11 | 0.00 | 1.00 | 1.00 | 0.00 | 1.00 | 0.68 | 6 | |
0.16 | 0.12 | 0.11 | 0.08 | 0.20 | 0.13 | 0.00 | 0.80 | 0.31 | 0.00 | 1.00 | 0.74 | 7 | |
0.12 | 0.18 | 0.04 | 0.09 | 0.01 | 0.05 | 0.00 | 1.00 | 1.00 | 0.00 | 0.99 | 0.49 | 8 | |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
TU_C mean | TU_C std | TU_R mean | TU_R std | TD mean | TD std | TU_C vs. TD p-value | TU_C vs. TD power | TU_C vs. TD eff. size | TU_R vs. TD p-value | TU_R vs. TD power | TU_R vs. TD eff. size | Row |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.05 | 0.10 | 0.002 | 0.018 | 0.001 | 0.002 | 0.00 | 1.00 | 1.44 | 0.01 | 1.00 | 0.24 | 1 | |
0.34 | 0.15 | 0.17 | 0.06 | 0.25 | 0.10 | 0.00 | 1.00 | 0.94 | 0.00 | 1.00 | 0.75 | 2 | |
0.30 | 0.15 | 0.16 | 0.05 | 0.22 | 0.10 | 0.00 | 1.00 | 0.74 | 0.00 | 1.00 | 0.57 | 3 | |
0.05 | 0.03 | 0.02 | 0.02 | 0.03 | 0.03 | 0.00 | 1.00 | 0.55 | 0.00 | 1.00 | 0.46 | 4 | |
0.35 | 0.36 | 0.43 | 0.36 | 0.64 | 0.34 | 0.00 | 1.00 | 0.83 | 0.00 | 1.00 | 0.60 | 5 | |
0.30 | 0.31 | 0.44 | 0.36 | 0.33 | 0.33 | 0.01 | 0.59 | 0.08 | 0.00 | 1.00 | 0.34 | 6 | |
0.24 | 0.30 | 0.12 | 0.29 | 0.02 | 0.08 | 0.00 | 1.00 | 1.81 | 0.00 | 1.00 | 0.87 | 7 | |
0.07 | 0.18 | 0.01 | 0.08 | 0.00 | 0.00 | 0.00 | 1.00 | 1.16 | 0.01 | 1.00 | 0.28 | 8 | |
0.07 | 0.05 | 0.06 | 0.04 | 0.12 | 0.06 | 0.00 | 1.00 | 0.76 | 0.00 | 1.00 | 0.97 | 9 | |
0.03 | 0.03 | 0.02 | 0.02 | 0.03 | 0.04 | 0.38 | 0.06 | 0.01 | 0.00 | 1.00 | 0.19 | 10 | |
0.03 | 0.04 | 0.01 | 0.02 | 0.00 | 0.01 | 0.00 | 1.00 | 1.95 | 0.00 | 1.00 | 0.60 | 11 | |
0.04 | 0.10 | 0.00 | 0.02 | 0.00 | 0.00 | 0.00 | 1.00 | 1.14 | 0.00 | 1.00 | 0.29 | 12 | |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
SRS Size | TD vs. TU_C (m = 2) | TD vs. TU_C (m = 4) | TD vs. TU_R (m = 2) | TD vs. TU_R (m = 4) |
---|---|---|---|---|
Small | Reject | Reject | Reject | Reject |
Medium | Reject | Reject | Reject | Reject |
Large | Reject | Reject | Reject | Reject |
SRS Size | TD vs. TU_C (m = 2) | TD vs. TU_C (m = 4) | TD vs. TU_R (m = 2) | TD vs. TU_R (m = 4) |
---|---|---|---|---|
Small | 560% | 5899% | 178% | 284% |
Medium | 228% | 733% | 119% | 336% |
Large | 213% | 552% | 117% | 212% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, B.; Li, J.; Huang, X. Improving Software Reliability in Nuclear Power Plants via Diversity in the Requirements Phase: An Experimental Study. Energies 2025, 18, 4794. https://doi.org/10.3390/en18184794