Effectiveness of a Standardized Nursing Process Using NANDA International, Nursing Interventions Classification and Nursing Outcome Classification Terminologies: A Systematic Review

Clinical nursing decision-making regarding diagnoses, interventions and outcomes can be assessed using standardized language systems such as NANDA International, the Nursing Interventions Classification and the Nursing Outcome Classification; these taxonomies are the most commonly used by nurses in computerized clinical records. The purpose of this review is to synthesize the evidence on the effectiveness of the nursing process with standardized terminology using NANDA International, the Nursing Interventions Classification and the Nursing Outcome Classification in care practice, in order to assess the association between the presence of related/risk factors and clinical decision-making about nursing diagnoses, and to assess whether their use increases the effectiveness of nursing interventions, health outcomes and people’s satisfaction. A systematic review was carried out in Medline and PreMedline (OvidSP), Embase (Embase-Elsevier), The Cochrane Library (Wiley), CINAHL (EbscoHOST), SCI-EXPANDED, SSCI and Scielo (WOS), LILACS (Health Virtual Library) and SCOPUS (SCOPUS-Elsevier) and included randomized clinical trials as well as quasi-experimental, cohort and case-control studies. Selection and critical appraisal were conducted by two independent reviewers. The certainty of the evidence was assessed with the Grading of Recommendations Assessment, Development and Evaluation methodology. A total of 17 studies were included, with variability in the level and certainty of evidence. According to the outcomes, 6 studies assessed diagnostic decision-making and 11 assessed improvements in individual health outcomes. No studies assessed improvements in intervention effectiveness or population satisfaction.
There is a need to increase studies with rigorous methodologies that address clinical decision-making about nursing diagnoses using NANDA International and individuals’ health outcomes using the Nursing Interventions Classification and the Nursing Outcome Classification as well as implementing studies that assess the use of these terminologies for improvements in the effectiveness of nurses’ interventions and population satisfaction with the nursing process.


Introduction
The nursing process (NP) is the method most commonly used by nurses to provide and document their actions; it applies a scientific method to identify, diagnose, intervene in and resolve health issues in the population within the scope of their disciplinary field. The complexity of the NP involves problem solving, reflective judgement and decision-making to achieve desired outcomes through five sequential steps: assessment, diagnosis, planning, implementation and evaluation [1]. Its implementation demands cognitive, psychomotor and affective skills and capacities that underlie the clinical reasoning and care provided by nurses [2]. Each stage of the NP involves carrying out strategies to address the observed phenomenon, from the aspects concerned to the establishment of clinical judgment, including the gathering of information and recognition of health patterns, along with decision-making to determine the main and secondary interventions required for its resolution [3]. Nursing clinical decision-making regarding diagnoses, interventions and health outcomes of individuals can be assessed through the records made by nurses in information systems using standardized language systems (SLSs). Therefore, the phenomena and activities of nurses can be defined and described using SLSs through the retrieval of data from electronic records [4].
The use of such nursing terminologies in the scientific literature has been variable, with up to 72% of published studies using NANDA International (NANDA-I) [5] or its combination with the Nursing Interventions Classification (NIC) [6] and the Nursing Outcome Classification (NOC) [7], thus establishing it as the system most widely used by nurses in the international context [8]. Through a review of the scientific literature, it is possible to assess nurses' use of NANDA-NIC-NOC (NNN) in clinical practice, as the records made in patients' clinical histories provide evidence of the efficacy of the NP.
Two systematic reviews addressing the use of standardized nursing terminologies have recently been published [9,10], but they did not focus specifically on NNN terminologies. A preliminary search of the scientific literature found no other review on the effectiveness of the NP using NNN in clinical practice. The only study that approaches this topic is a systematic review (SR) conducted in 2017 by Sanson et al. [1] to understand the impact of nursing diagnoses on patient and organizational outcomes. These authors found studies with methodological inconsistencies and an insufficient level of evidence (LE) about the impact of nursing diagnoses on patient and organizational outcomes [1].
For this assessment, the following two review questions were posed: Does any association exist between the presence of related and risk factors and clinical decision-making about nursing diagnoses? And do the effectiveness of interventions, people's health outcomes and people's satisfaction increase when nurses use standardized NNN terminology? The aims of this review are to synthesize the evidence on the effectiveness of the NP with standardized terminology using NNN in care practice, to assess the association between the presence of related and risk factors and clinical decision-making about nursing diagnoses, and to assess whether the effectiveness of nursing interventions, health outcomes and people's satisfaction increase.

Materials and Methods
An SR was carried out according to Joanna Briggs Institute (JBI) criteria; the reporting of results followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement [11]. The research protocol was registered in the International Prospective Register of Systematic Reviews (PROSPERO), registration number CRD42020170350.

Sources of Information
The first step consisted of identifying previous publications on the subject of interest that could answer the research question, through various searches in PROSPERO and Google Scholar®. After this initial check, search strategies were employed in the following databases: Medline and PreMedline (through OvidSP), Embase (through Embase-Elsevier), The Cochrane Library (through Wiley), CINAHL (through EbscoHOST), SCI-EXPANDED, SSCI and Scielo (through WOS), LILACS (through the Health Virtual Library) and SCOPUS (SCOPUS-Elsevier). To complement these, manual searches were carried out in the Trip Database metasearch engine.

Search Methods
Searches were conducted on 12 and 13 January 2021 (File S1), limiting results to publications after 1992. Search strategies included the following terms: "nursing interventions classification" OR "nursing outcomes classification" OR "nanda international" OR "nnn terminology" in the title and abstract fields. The search strategy was first checked by a documentalist in the Embase database (File S2) and independently reviewed by two of the authors. Once the definitive strategy was designed, it was adapted to each of the remaining databases selected.

Inclusion Criteria
Studies with the following designs were included: randomized clinical trials (RCTs), quasi-experimental studies (non-randomized clinical trials and pre-post studies) and observational studies (cohort, case-control and case series) that considered the NP and were published in English, Spanish or Portuguese. Studies published after 1992 were included, coinciding with the year in which NNN terminology was officially recognized.

Exclusion Criteria
Other reviews (narrative reviews, scoping reviews, SRs or umbrella reviews) and grey literature were excluded. Similarly, studies that did not consider the NP assessing the use of NNN were also excluded.

Quality Appraisal
The records were exported to an Excel® spreadsheet for the selection process. Following the elimination of duplicates, studies were screened by title and abstract and classified into three groups: "potentially eligible", "doubtful eligibility" and "excluded". "Potentially eligible" and "doubtful eligibility" records were retrieved for full-text screening. The process was carried out by two independent reviewers, and a third reviewer was consulted in the case of discrepancies. To determine study suitability, Critical Appraisal Skills Programme Español (CASPe) templates appropriate to each type of design were used; for cohort studies, case-control studies and RCTs (11 items), scores ≤ 5 were considered low quality, scores of 6-8 moderate quality and scores ≥ 9 high quality. To verify the suitability of the process, a pilot test was carried out on an initial sample of records.
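The CASPe score bands described above can be expressed as a small lookup. The following is only an illustrative sketch: the function name `caspe_quality` and its error handling are assumptions made for the example and are not part of the review's tooling; only the thresholds (≤ 5, 6-8, ≥ 9 out of 11 items) come from the text.

```python
def caspe_quality(score: int, max_items: int = 11) -> str:
    """Map a CASPe critical-appraisal score to the quality band used in the review.

    Thresholds follow the text: <= 5 low, 6-8 moderate, >= 9 high (11-item templates).
    The function name and the validation step are hypothetical.
    """
    if not 0 <= score <= max_items:
        raise ValueError(f"score must be between 0 and {max_items}")
    if score <= 5:
        return "low"
    if score <= 8:
        return "moderate"
    return "high"


if __name__ == "__main__":
    # A score of 9/11 is the value reported for the high-quality RCTs.
    for s in (5, 6, 8, 9):
        print(s, caspe_quality(s))
```

Keeping the bands in one function makes it easy to apply the same cut-offs consistently when two reviewers score studies independently.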
The certainty of the evidence (random sequence and allocation concealment), blinding bias of participants and researchers (concealment of allocation to study arm, intention to blind, method of blinding and blinding effectiveness), blinding bias to outcome assessors (reported, requiring researcher judgment or not requiring researcher judgment), attrition bias (incomplete data or omitted from analysis) and reporting bias (selective outcome reporting) were assessed, identifying each as: low risk, high risk, uncertain risk or not applicable. A pilot test of bias risk assessment was conducted on a sample of studies. Bias risk was considered in determining the degree of certainty of the evidence using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) methodology.

Data Extraction
The research outcomes analysed correspond to information on improvements in the diagnostic association between the presence of related and risk factors and clinical decision-making about nursing diagnoses, the effectiveness of interventions, health outcomes and people's satisfaction. Separately, general study data were extracted. Data extraction was performed independently by two researchers, with discrepancies resolved through consensus with a third researcher. The Mendeley® bibliographic reference manager was used for data extraction, which was recorded in detail in the data extraction document. A pilot test of the extraction process was carried out on a sample of studies.

Data Synthesis
To organize the presentation of results, the criteria established by JBI were first followed to determine the LE for effectiveness of each of the studies. The results were then organized according to the research outcomes below.

Results
The number of records identified was n = 4511; following elimination of n = 1601 duplicates, n = 2910 remained. During title and abstract screening, n = 2820 were excluded, leaving n = 90 records for full-text retrieval. Of these, n = 4 could not be retrieved (1 because of conflicting references, with the same title under 2 Digital Object Identifiers (DOIs) and different author names (Jones vs. Adams) in different journals; 3 because the full text could not be accessed and the corresponding authors did not respond to emails), so the number of studies assessed for eligibility was n = 86, of which n = 69 did not satisfy the inclusion criteria. Thus, the final number of included studies was n = 17, as can be seen in the flow chart in Figure 1.
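The selection counts reported above can be checked with simple arithmetic. The sketch below is offered only as a consistency check; the function `prisma_flow` and its key names are hypothetical, while the numbers are those stated in the text.

```python
def prisma_flow(identified, duplicates, excluded_screening,
                not_retrieved, excluded_eligibility):
    """Recompute each stage of a PRISMA-style selection flow.

    Returns a dict with the number of records remaining after each step.
    Function and key names are illustrative, not taken from the review.
    """
    after_dedup = identified - duplicates
    sought_for_retrieval = after_dedup - excluded_screening
    assessed = sought_for_retrieval - not_retrieved
    included = assessed - excluded_eligibility
    return {
        "after_deduplication": after_dedup,
        "sought_for_retrieval": sought_for_retrieval,
        "assessed_for_eligibility": assessed,
        "included": included,
    }


if __name__ == "__main__":
    # Counts as reported in the Results section.
    flow = prisma_flow(identified=4511, duplicates=1601,
                       excluded_screening=2820, not_retrieved=4,
                       excluded_eligibility=69)
    for stage, n in flow.items():
        print(f"{stage}: n = {n}")
```

With the reported inputs, each intermediate value matches the figures in the text (2910, 90, 86 and 17).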
Following the screening process, the studies meeting the eligibility criteria were distributed among the authors for critical reading in pairs (CARS-CEMA; PRBB-MNHDL; and DAFG-HGDLT). Interobserver agreement, measured with Cohen's weighted kappa coefficient, is shown in Table S1: Interobserver agreement on included studies. When the coefficient did not reach statistical significance, a third reviewer was consulted (CARS and MNHDL) to resolve discrepancies.
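For readers unfamiliar with the statistic, a minimal sketch of Cohen's weighted kappa for two raters follows. It assumes linear disagreement weights and ordered categories; the review does not state which weighting scheme was used, and the ratings in the usage example are invented for illustration, not data from Table S1.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Cohen's weighted kappa for two raters with linear disagreement weights.

    kappa_w = 1 - sum(w_ij * O_ij) / sum(w_ij * E_ij), where O is the observed
    proportion matrix, E the matrix expected from the marginals, and
    w_ij = |i - j| / (k - 1). `categories` must list the ordered rating levels.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same, non-empty set of items")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed proportions of each (rater A, rater B) category pair.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    levels = ["low", "moderate", "high"]
    # Hypothetical quality ratings from two reviewers for four studies.
    reviewer_1 = ["low", "moderate", "high", "high"]
    reviewer_2 = ["low", "high", "high", "moderate"]
    print(round(weighted_kappa(reviewer_1, reviewer_2, levels), 3))
```

Linear weights penalize a low/high disagreement twice as much as a low/moderate one; quadratic weights are a common alternative and would give different values.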
All the studies showed high or moderate quality following critical reading with CASPe. The studies that showed high quality were the RCTs (score 9/11) by Corcoles et al. [12], Guerra et al. [13], Gencbas et al. [14] and Sampaio et al. [15]. The remaining studies showed moderate quality, as shown in Table S2: Critical reading scores for the included studies.
With regard to the design methodology, the studies included nine experimental designs (five RCTs, one pseudo-RCT and three quasi-experimental studies) and eight observational designs (one case-control and seven cohort studies), which are shown together with sociodemographic characteristics in Table 1.
Following the GRADE methodology criteria, the overall certainty of the scientific evidence was determined for each of the outcomes assessed. GRADE stipulates that studies with experimental designs start with greater initial certainty and observational studies with lesser initial certainty; this initial estimate is then corrected by applying the criteria for lowering or raising certainty that correspond to each of the GRADE domains. Final certainty was shown to be high in the study outcomes by Corcoles et al. [12], Silva et al. [16], Pascoal et al. [17], Silva et al. [18], Pascoal et al. [19], Reis and Jesus [20] and Pascoal et al. [21]. JBI criteria were simultaneously applied to assign the LE to each one, as shown in Table S3: JBI level of evidence and degree of certainty using GRADE methodology.
Regarding research outcomes, the included studies assessed improvements in diagnostic accuracy (n = 6) and in people's health outcomes (n = 11). No studies were identified that assessed outcomes in the efficacy of interventions or improvements in population satisfaction.

Diagnostic Etiological Association and Accuracy of Defining Characteristics
Studies assessing diagnostic indicators of NANDA-I determined the association with related/risk factors (RFs) (n = 3) and accuracy of defining characteristics (DCs) (n = 3).
The NANDA-I nursing diagnoses that addressed the etiological association of RFs were: risk of delayed surgical recovery (00246), dysfunctional ventilatory response to weaning (00034) and risk of falls (00155). The effect measures of these RFs were found to be statistically significant in most of the etiological indicators assessed, as shown in Table 2.

The articles that assessed the accuracy of the DCs (n = 3) concerned the NANDA-I nursing diagnoses: impaired gas exchange (00030), ineffective airway clearance (00031) and ineffective respiratory pattern (00032), as shown in Table 3.

People's Health Outcomes
Articles that addressed effectiveness in people's health outcomes did so from two perspectives.
The first concerned the general aspects of effectiveness (n = 2): on the one hand, the assessment of care planning using NNN and, on the other, clinical reasoning. The study carried out by Cárdenas-Valladolid et al. [22] evaluated the implementation of care planning in primary care centres using standardized NNN terminology in the intervention group (IG) compared with the usual recording of non-standardized care in the control group (CG) through the prospective follow-up of a cohort (n = 23,488) over 2 years, demonstrating that both groups experienced a moderate reduction in cardiovascular risk factors observed at 12, 18 and 24 months for systolic blood pressure (SBP), diastolic blood pressure (DBP), glycosylated hemoglobin (HbA1c), LDL cholesterol and body mass index (BMI). The effect measure improved in the IG for all outcomes except LDL cholesterol and DBP. Following adjustment of the reference parameters for age, sex, type of treatment and physical activity, a reducing effect was observed in all outcomes except HbA1c, which was statistically significant for DBP (mean = −0.33 (CI = −0.63-0.04); p = 0.02). In general, the changes in the values for SBP, DBP, HbA1c, LDL cholesterol and BMI were greater in the IG than the CG, despite only reaching statistical significance in favour of the IG in HbA1c (p < 0.01), while the CG reached statistical significance in SBP (p < 0.01).
With regard to clinical reasoning, Müller-Staub et al. [23] developed a training program for nurses using guided clinical reasoning as an IG, compared with nurses who received training through classic discussion of clinical cases as a CG, showing greater acquisition of critical thinking skills for the application of NNN in clinical practice in the IG due to better internal consistency between diagnoses, interventions and outcomes, as shown in Table 4.
Secondly, the studies that assessed the effectiveness of health outcomes in specific situations (n = 9) corresponded to the NANDA-I nursing diagnoses: functional urinary incontinence (00020), risk of falls (00155), ineffective health management (00078), risk of perioperative postural injury (00087), ineffective airway clearance (00031), nutritional imbalance: less than the body needs (00002), anxiety (00146) and sleep pattern disorder (00198). These studies assessed the interrelationship of the NANDA-I diagnosis with NIC and NOC terminologies. However, Guerra et al. [13] did not use NOC terminology to measure the effect of fall prevention on the reduction in risk of falls, while Bjorklund-Lima et al. [24] assessed the risk of perioperative postural injury using various NOCs but without reporting the NICs performed in the NP.
The statistically significant effect measures for each of the indicators of effectiveness on improving people's health outcomes are shown in Table 5.

Discussion
Brazil is the country with the greatest number of publications, showing a marked tendency to explore aspects related to the clinical applicability of NNN, while Spain ranked second, reflecting the growing interest in the study of nursing terminologies in our environment. The increase in the use and effectiveness of nursing SLSs in clinical practice is accompanied by improvements in the diagnostic reasoning capacities of nurses [25].
Regarding the quality of evidence in these studies, the use of traditional systems such as the JBI proposal to establish the LE has been refined with the application of GRADE methodology. GRADE makes it possible to adjust the initial evidence rating granted according to study design by reassessing the factors or domains that confer the final certainty of the evidence, either to reduce it (assessing the risk of bias, inconsistency, indirectness, imprecision and publication bias) or to increase it (assessing the magnitude of the effect, response gradient and absence of residual confounding) [29,30]. According to GRADE methodology, an RCT starts from high LE (1c according to JBI); thus, the Corcoles et al. [12] study maintains high certainty of this LE. However, this certainty decreases to low in the RCTs carried out by Guerra et al. [13], Vázquez-Sánchez et al. [27], Sampaio et al. [15] and Müller-Staub et al. [23] due to methodological limitations (risk of bias, indirectness and imprecision). These aspects make it necessary to improve the rigour of the design of these studies. In contrast, cohort studies, which start from a lower LE according to JBI (3c: cohort with control group; 3d: case control; and 3e: observational without control group) and a low certainty of evidence according to GRADE, increased to a high certainty of LE in the studies of Pascoal et al. [17], Pascoal et al. [19,21] and Reis and Jesus [20]. The presence of these methodological weaknesses in the designs of the included studies, combined with the fact that each of these studies addressed different NNN concepts, has contributed to the heterogeneity of the findings, making it impossible to carry out comparative analyses of the measures of effect.
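The rating logic described above, in which the starting certainty depends on study design and is then lowered or raised, can be sketched as follows. This is a simplified illustration under stated assumptions: the four GRADE levels and the starting points (RCTs high, observational studies low) are standard, but the function `grade_certainty`, the one-level-per-step arithmetic and the step counts in the usage example are hypothetical and do not reproduce the reviewers' actual judgments.

```python
GRADE_LEVELS = ["very low", "low", "moderate", "high"]

# Starting certainty by design, as described in the text:
# RCTs start high, observational studies start low.
START_LEVEL = {"rct": 3, "observational": 1}


def grade_certainty(design: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return the final GRADE certainty after applying level changes.

    `downgrades` counts levels removed (risk of bias, inconsistency,
    indirectness, imprecision, publication bias); `upgrades` counts levels
    added (large effect, response gradient, residual confounding).
    The result is clamped to the four GRADE levels.
    """
    if design not in START_LEVEL:
        raise ValueError(f"unknown design: {design!r}")
    if downgrades < 0 or upgrades < 0:
        raise ValueError("level changes must be non-negative")
    level = START_LEVEL[design] - downgrades + upgrades
    level = max(0, min(len(GRADE_LEVELS) - 1, level))
    return GRADE_LEVELS[level]


if __name__ == "__main__":
    # Hypothetical step counts, chosen only to mirror the pattern in the text.
    print(grade_certainty("rct"))                        # RCT with no concerns
    print(grade_certainty("rct", downgrades=2))          # RCT rated down
    print(grade_certainty("observational", upgrades=2))  # cohort rated up
```

In practice each domain is judged qualitatively by the reviewers, so a numeric count is only a convenient way to show how an RCT can end at low certainty and a cohort study at high certainty.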
As background to this research, a study conducted by Müller-Staub et al. [30] assessed, among other aspects, the accuracy of the Standardized Nursing Terminology, in addition to the coherence between diagnoses, interventions and people's health results. The authors identified deficits in the diagnostic process as well as in the notification of signs, symptoms and aetiologies, arguing for the need to implement training measures that ensure accuracy in nurses' diagnostic reasoning [31,32]. To complement these criteria, the present study adds the importance of linking nurses' critical thinking to the use of clinical indicators based on the best scientific evidence available from the results of rigorous research.
With respect to diagnostic etiological association, all the assumptions studied indicated that exposure to the aetiologies (related factors and risk factors) is a diagnostic indicator for the presence of the health problems identified. The nursing diagnosis of risk of delayed surgical recovery (00246) includes people aged over 80 years in the NANDA-I classification, although the study only reported results that indicated an absence of statistical significance in this population with extreme ages. In contrast, the remaining aetiologies presented showed semantic variations.
Concerning the diagnosis of dysfunctional ventilatory response to weaning (00034), most of the statistically significant RFs reported by Silva et al. [16] were not included.
Regarding the analysis of diagnostic accuracy through the study of DCs, all studies concerned respiratory diseases and were conducted by the same authors, and the DCs identified in these health problems were key indicators in each of the nursing diagnoses. The diagnosis of impaired gas exchange (00030) showed that abnormal skin colour and hypoxemia indicate the presence of this health issue with greater statistical accuracy. The 2021-2023 NANDA-I edition [5] includes these major DCs, which have high predictive value. A considerable number of minor or secondary DCs, with less predictive value for clinical judgement, have also been included. As such, it would be beneficial to add diagnostic accuracy criteria that distinguish between major and minor DCs to NANDA-I.
The diagnosis of ineffective airway clearance (00031) showed that the only DCs not included in the 2021-2023 NANDA-I edition [5] correlate significantly with open eyes, albeit with an excessively wide CI. On the other hand, the diagnosis of ineffective respiratory pattern (00032) showed effectiveness for diagnostic accuracy in all DCs, including others that were not observed in the study, suggesting that it would be valuable in future research to assess the rest of the DCs included in NANDA-I.
The assessment of changes in people's health using NOC terminology has shown that planned interventions in clinical settings with specific diseases and certain risk situations using SLSs provide tools for the correct planning of nursing care. However, the literature supporting the use of these NOC indicators with validated tools providing objective data is limited; only the study conducted by Laguna-Parras et al. [28] has evaluated the NOC sleep (0004) for the diagnosis of sleep pattern disorder (00198) using the Oviedo Sleep Questionnaire.
In the effectiveness analysis for the resolution of specific health issues, certain modifications or the elimination of some diagnoses in the latest published edition of NANDA-I were notable [5]. Thus, functional urinary incontinence (00020) was replaced by another diagnosis called disability-associated urinary incontinence (00297). Similarly, in the 2021-2023 NANDA-I edition, the diagnosis risk of falls (00155) was removed from the classification and replaced by new diagnoses that distinguish between adults and children: risk of falls in adults (00303) and risk of falls in children (00306). Likewise, for the diagnosis dysfunctional ventilatory response to weaning (00034), the 2021-2023 NANDA-I edition included a diagnosis called dysfunctional ventilatory response to adult weaning, which differs from the previous definition by specifying that it refers to individuals over 18 who required mechanical ventilation for at least 24 h.
Only two studies, Cárdenas-Valladolid et al. [22] and Müller-Staub et al. [23], addressed general aspects of the use of NNN in the NP, showing results that support the use of the NP with NNN to improve clinical indicators in diabetes control with planned follow-up and to increase reasoning ability after a training program, respectively. In recent years, there has been growing interest among nurses in studying the clinical application of NNN. In addition, recent studies have shown more rigorous methodological designs, including cohort studies with adequate follow-up and randomized interventions with control groups that estimate the risk of bias. However, it is still essential to diversify international contexts and sample sizes in the populations studied with the aim of increasing effect measures in the population. Separately, it is vital that the results of these studies are transferred more quickly to the subsequent published NNN editions in order to improve nurses' clinical impact.
In relation to diagnostic aetiology, this review has only assessed the association of RFs with clinical decision-making to identify nursing diagnoses; further studies should analyse the effects on the diagnostic accuracy of these aetiologies. In this sense, in the nursing field, a gold standard for diagnostic accuracy still needs to be developed. Moreover, this research has not addressed the existence of possible differences in relation to nurses' gender and the use of SLSs. It would be interesting to develop future lines of research to explore differences between men and women in the application of the NP using NNN.
The limitations of the current research are due to the heterogeneity of the studies included in the SR, addressing distinct clinical situations corresponding to various health issues and NNN labels independently, which prevents comparison of results and the accumulated meta-analysis of their effect measures. Taking this into account, future research should examine larger sample sizes and the effect of longer follow-up periods in the populations studied.

Conclusions
It must be concluded that the scientific literature using NNN is very extensive but that there is still a deficit regarding the amount and quality of evidence and the degree of certainty concerning the effectiveness of the NP using these terminologies. At present, the use of NNN shows the clinical impact of nurses in health systems using SLSs; however, it is not yet possible to conclude that the use of NNN improves the effectiveness of the NP, except in some rather specific clinical settings in which it has been assessed. The association between aetiologies and health problems identified by nurses is statistically significant in the few nursing diagnoses reviewed, but clinical decision-making must be studied in further nursing diagnoses. NANDA-I should update the diagnostic indicators in some diagnostic labels according to the evidence retrieved from the scientific literature. In addition, it is essential to approach diagnostic accuracy and the health results in people using NNN terminologies from the clinical perspective.
Most of the studies reviewed have been based on the use of NNN in disease situations, so there is a need to develop more studies on the use of these terminologies in health promotion, community health and public health contexts. Similarly, it is important to implement the findings of new studies that assess the use of these terminologies with respect to improvements in the efficacy of nursing interventions and the satisfaction of the population with the NP. Finally, further methodologically rigorous studies are needed in a large number of clinical settings.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/healthcare11172449/s1, File S1: Search notebook; File S2: Test notebook; Table S1: Interobserver agreement on included studies; Table S2: Critical reading scores for the included studies; Table S3: JBI level of evidence and degree of certainty using GRADE methodology.

Table 1. Sociodemographic characteristics of the included studies.

Table 2. Statistically significant effect measures for the diagnostic etiological association with related/risk factors.

Table 3. Statistically significant effect measures for diagnostic accuracy of defining characteristics.

Table 4. Statistically significant effect measures for overall effectiveness in health outcomes.

Table 5. Statistically significant effect measures for people's health outcomes.