Quality of Systematic Reviews with Network Meta-Analyses on JAK Inhibitors in the Treatment of Rheumatoid Arthritis: Application of the AMSTAR 2 Scale

Ramalho, Bruna; Penedones, Ana; Mendes, Diogo; Alves, Carlos

doi:10.3390/jcm15020725

¹ Laboratory of Social Pharmacy and Public Health, Faculty of Pharmacy, University of Coimbra, 3000-548 Coimbra, Portugal

² Clevidence, 2740-122 Oeiras, Portugal

* Author to whom correspondence should be addressed.

J. Clin. Med. 2026, 15(2), 725; https://doi.org/10.3390/jcm15020725

Submission received: 7 December 2025 / Revised: 6 January 2026 / Accepted: 8 January 2026 / Published: 15 January 2026

(This article belongs to the Section Immunology & Rheumatology)

Abstract

Background/Objective: Systematic reviews (SRs) with network meta-analysis (NMA) support evidence-based decision-making by enabling both direct and indirect comparisons across multiple interventions. Given the expanding use of Janus kinase (JAK) inhibitors in rheumatoid arthritis (RA), the methodological rigor of SRs with NMA is essential for trustworthy conclusions. This study aimed to evaluate the methodological quality of SRs with NMA assessing the efficacy and/or safety of JAK inhibitors in RA. Methods: PubMed and Embase were searched for full-text SRs with NMAs evaluating JAK inhibitors as a therapeutic class in RA. Eligible publications were English-language articles reporting efficacy and/or safety outcomes. Narrative reviews, letters, duplicates, reviews focused on a single JAK inhibitor, and reviews without quantitative synthesis were excluded. Three independent reviewers assessed methodological quality using AMSTAR 2. Descriptive statistics were used to summarize findings. Results: Of the 222 records identified, 18 SRs with NMA met the inclusion criteria: 5 focused on efficacy, 5 on safety, and 8 assessed both. The most consistently fulfilled AMSTAR 2 items were a clearly defined PICO question (100%), duplicate study selection (100%), and reporting of conflicts of interest (100%). Common shortcomings concerned protocol registration (reported by only 44% of reviews), complete reporting of the search strategy (39%), and assessment of publication bias (50%). Risk-of-bias assessment varied by review focus: all safety reviews complied (100%), compared with 20% of efficacy reviews and 37% of mixed reviews. Conclusions: Most SRs with NMA of JAK inhibitors in RA present relevant methodological limitations, particularly in protocol registration, search reporting, and risk-of-bias assessment. Methodological standards were generally higher in safety-focused reviews, underscoring the need for more consistent and rigorous conduct and reporting, especially in efficacy and mixed reviews, to strengthen confidence in NMA-derived conclusions.

Keywords: rheumatoid arthritis; JAK inhibitors; network meta-analysis; systematic reviews; AMSTAR 2; methodological quality; evidence-based medicine

1. Introduction

Rheumatoid arthritis (RA) is a chronic autoimmune condition marked by systemic inflammation that primarily affects synovial joints [1,2]. The disease typically leads to persistent joint inflammation and progressive destruction of cartilage and bone, and may also involve extra-articular organs, such as the lungs, heart, and eyes [3]. Its clinical consequences include chronic pain, reduced mobility, and functional disability, all of which contribute to a substantial socioeconomic burden through increased healthcare utilization and decreased work productivity [3,4]. RA affects an estimated 0.5% to 1% of the global population, with a higher prevalence among women and a peak onset between 40 and 60 years of age [2]. The progressive nature of the disease, coupled with potential delays in diagnosis and intervention, continues to challenge effective long-term disease management.

Therapeutic strategies for RA have advanced significantly, aiming to achieve disease remission or maintain low disease activity. Disease-modifying antirheumatic drugs (DMARDs) form the backbone of RA treatment and are categorized into conventional synthetic (csDMARDs), biological (bDMARDs), and targeted synthetic (tsDMARDs) agents [1]. Methotrexate remains the cornerstone of initial treatment; however, patients who fail to respond adequately may require the addition of either a bDMARD or a tsDMARD [2]. Among the tsDMARDs, Janus kinase (JAK) inhibitors have emerged as a relevant class due to their targeted mechanism of action on the JAK/STAT signaling pathway, which regulates key cytokines involved in RA pathophysiology [5]. These agents have demonstrated efficacy comparable or superior to that of bDMARDs in clinical trials and offer practical benefits, such as oral administration and rapid symptom control. Nevertheless, safety concerns, particularly infections, thromboembolic events, and potential cardiovascular and oncologic risks, have led regulatory bodies to issue updated recommendations for their use [6,7].

Systematic reviews (SRs) play a pivotal role in evidence-based medicine, enabling the synthesis of findings from multiple studies to inform clinical and policy decisions. When supported by meta-analyses, SRs enhance the statistical power and precision of estimated treatment effects. Traditional meta-analyses, however, are limited to head-to-head comparisons. In contrast, network meta-analysis (NMA) allows for the comparison of multiple interventions simultaneously by combining both direct and indirect evidence across a network of trials [8,9]. In rheumatology, NMAs have become increasingly common in evaluating therapeutic options, including JAK inhibitors, and are often used to guide clinical guidelines and regulatory decisions [10,11].

Given their complexity, the validity of NMAs depends heavily on the methodological rigor of the underlying SRs. Inconsistent reporting, absence of predefined protocols, and flawed risk-of-bias assessments can significantly affect the credibility of findings. These methodological inconsistencies raise particular concern when NMAs are used to underpin treatment recommendations. The AMSTAR 2 tool (A Measurement Tool to Assess Systematic Reviews) was developed to critically appraise SRs, whether or not they include meta-analyses and regardless of whether the primary studies are randomized or non-randomized [12]. This 16-item tool enables the identification of major methodological flaws and promotes the transparent and responsible use of published evidence. Considering the expanding role of JAK inhibitors in RA and the growing number of SRs with NMAs assessing their efficacy and safety, a structured evaluation of the methodological quality of these reviews is warranted. The present study aimed to assess the methodological quality of SRs with NMAs focusing on JAK inhibitors in RA [13].

2. Materials and Methods

2.1. Literature Search Strategy

This systematic review was registered with the International Prospective Register of Systematic Reviews (PROSPERO; number CRD420261279961), conducted in accordance with the guidelines of the Cochrane Collaboration, and reported according to the PRISMA 2020 (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement [14,15]. A systematic literature search was performed in PubMed (http://www.ncbi.nlm.nih.gov/pubmed/, accessed on 23 June 2025) and Embase (http://www.embase.com/, accessed on 23 June 2025), with the last update in June 2025. Both MeSH terms and free-text keywords were used, combining the concepts “JAK inhibitors”, “rheumatoid arthritis” and “network meta-analysis”. No filters were applied. The full search strategy is described in Appendix A, Table A1 and Table A2, Supplementary Material. Search results were imported into and managed in Mendeley Reference Manager (version 2.139.0), and duplicates were identified and removed.

2.2. Inclusion and Exclusion Criteria

Eligible publications were SRs that included an NMA and assessed the efficacy and/or safety of JAK inhibitors in the treatment of RA. Only studies published in English were considered, given that English is the predominant language for the dissemination of biomedical research. Narrative reviews, editorials, letters to the editor, commentaries, SRs without NMA, duplicate publications, and SRs focused exclusively on a single JAK inhibitor were excluded.

2.3. Study Selection Process

The study selection was conducted in two stages: (1) screening of titles and abstracts and (2) full-text review of potentially eligible articles. The selection process was performed by three independent reviewers. Disagreements were resolved through discussion and consensus. Screening was conducted manually, with Mendeley used for reference management and duplicate removal.

2.4. Methodological Quality Assessment

The methodological quality of the included SRs was assessed using the AMSTAR 2 tool [12]. AMSTAR 2 is composed of 16 items that evaluate the risk of bias of SRs that include randomized controlled trials (RCTs), non-randomized studies, or both. Each item is rated using a trichotomous approach: Yes (all requirements of the criterion were met), Partial Yes (only some requirements were met), or No (the criterion was not met). AMSTAR 2 provides an overall rating of confidence in the results of each included SR, based on the presence of flaws in critical and non-critical domains. The critical domains are the following: protocol registered before commencement of the review (item 2), adequacy of the literature search (item 4), justification for excluding individual studies (item 7), risk of bias from individual studies being included in the review (item 9), appropriateness of meta-analytical methods (item 11), consideration of risk of bias when interpreting the results of the review (item 13) and assessment of presence and likely impact of publication bias (item 15) [12].

The tool was applied independently by three reviewers, and discrepancies were resolved by consensus, in accordance with the AMSTAR 2 guidance document [12]. The detailed instrument and its main guidance are described in Appendix B, Supplementary Material.

2.5. Data Extraction and Analysis

Data extracted included the main methodological characteristics (e.g., reference/year, methodological orientation followed, bibliographic databases searched, type of outcomes assessed, protocol registration, quality assessment scale, type of statistical model, heterogeneity/inconsistency assessment) and the compliance level for each AMSTAR 2 item. A descriptive analysis was performed, with results presented in tables. The results of the AMSTAR 2 assessment were visually presented using robvis, version 0.3.0 [16].
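As an illustration of the descriptive step, the per-item tally can be sketched in a few lines of Python. The ratings below are invented for three hypothetical reviews and a subset of items. The function names and the choice to count only "Yes" as full compliance are assumptions made for this sketch, not a description of the procedure actually used in this study.

```python
from collections import Counter

# Hypothetical ratings: review label -> {AMSTAR 2 item number: rating}.
# These values are made up for illustration and are NOT study data.
ratings = {
    "Review A": {1: "Yes", 2: "No", 4: "Partial Yes"},
    "Review B": {1: "Yes", 2: "Yes", 4: "No"},
    "Review C": {1: "Yes", 2: "No", 4: "Yes"},
}


def summarize(all_ratings):
    """Count how often each rating was given, per AMSTAR 2 item."""
    summary = {}
    for per_review in all_ratings.values():
        for item, rating in per_review.items():
            summary.setdefault(item, Counter())[rating] += 1
    return summary


def percent_yes(summary, item, n_reviews):
    """Share of reviews rated 'Yes' on an item, as a rounded percentage."""
    return round(100 * summary[item]["Yes"] / n_reviews)


summary = summarize(ratings)
for item in sorted(summary):
    print(item, dict(summary[item]), f"{percent_yes(summary, item, len(ratings))}% Yes")
```

Whether "Partial Yes" should count towards compliance is a reporting decision; the sketch keeps the three categories separate so that either convention can be applied afterwards.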

3. Results

3.1. Study Selection

The literature search conducted in the PubMed and Embase databases yielded 222 potentially eligible publications. Following a rigorous application of the predefined eligibility criteria, a total of 18 SRs with NMA were included in the final sample [10,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33]. Among these, 5 focused exclusively on the efficacy of JAK inhibitors, 5 on safety outcomes, and 8 assessed both aspects (Figure 1).

Characteristics of the Studies

In terms of methodological orientation, most reviews (n = 17; 94%) reported compliance with the PRISMA guidelines. Two of these also followed the Centre for Reviews and Dissemination (CRD) recommendations, and one explicitly adhered to the Cochrane Collaboration’s guidance. Only one review did not specify adherence to any formal methodological framework.

The bibliographic databases most frequently searched among the included SRs were the Cochrane Central Register of Controlled Trials (CENTRAL) (n = 18; 100%) and EMBASE (n = 17; 94%). Searches in PubMed (n = 10; 56%) and MEDLINE (n = 9; 50%) were performed in at least half of the SRs included. Overall, most studies conducted searches across at least three major databases, ensuring a comprehensive and systematic retrieval of relevant evidence. Regarding the type of outcomes assessed, the sample was fairly evenly distributed: five reviews (28%) focused exclusively on efficacy, five (28%) on safety, and eight (44%) evaluated both efficacy and safety.

Concerning protocol registration, the majority of reviews did not report prior registration (n = 10; 56%). Among the registered reviews, PROSPERO was reported in six reviews (33%), the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance in three reviews (17%), and INPLASY in two reviews (11%). Some reviews reported registration in more than one registry.

With respect to quality assessment of included primary studies, the most frequently used instruments were the Cochrane Risk of Bias 2 (RoB 2) tool (n = 10; 56%) and the Jadad score (n = 7; 39%), both used to assess clinical trials. Only one review (6%) did not report any quality assessment tool. No review applied more than one assessment method.

The statistical modelling approaches were heterogeneous. The most common approach was a Bayesian fixed-effects model (n = 8; 44%), followed by a Bayesian random-effects model (n = 7; 39%) and frequentist random-effects models (n = 4; 28%). Only two reviews used more than one modelling approach (i.e., a Bayesian primary analysis with a frequentist sensitivity analysis).

Finally, methods for assessing network heterogeneity and inconsistency varied across reviews. The most frequently reported approaches were inconsistency plots or comparisons between fixed- and random-effects models (n = 6; 33%) and node splitting (n = 6; 33%). Other statistical methods included the I² statistic (n = 4; 22%) and the Wald test (n = 3; 17%). Two reviews (11%) did not report any method for assessing heterogeneity or inconsistency. Several methods were often applied in the same review to assess robustness.

Overall, these results indicate variability in methodological choices and reporting practices across the included NMAs, notably the limited use of prospective protocol registration and the heterogeneous application of quality-assessment instruments and inconsistency/heterogeneity assessment. The methodological characteristics of the eighteen SRs with NMAs included in this study are summarized in Table 1.

3.2. Methodological Quality Assessment (AMSTAR 2): Overall Compliance with AMSTAR 2

Compliance with AMSTAR 2 domains across the 18 included NMAs was heterogeneous, as illustrated in Figure 2. Several domains demonstrated consistently high adherence, while others revealed substantial methodological weaknesses.

The domains most frequently fulfilled were those related to structural components of the reviews, including clear definition of the research question (Domain 1), justification for study design (Domain 3), study selection performed in duplicate (Domain 5), duplicate data extraction (Domain 6), and reporting of conflicts of interest (Domain 16). All of these were fulfilled by 100% of the reviews. High compliance was also observed for the use of appropriate statistical methods (Domain 11) and for the discussion of heterogeneity (Domain 14), each reported by 94% of the reviews.

In contrast, the domains most closely associated with transparency and reproducibility exhibited the poorest performance. None of the reviews justified the exclusion of studies during the full-text screening stage (Domain 7; 0%), and none reported the funding sources of the included primary studies (Domain 10; 0%). Only 44% of the reviews presented a preregistered protocol (Domain 2), and just 39% described their search strategy in sufficient detail (Domain 4). Furthermore, only half of the reviews assessed publication bias (Domain 15; 50%). These patterns reveal notable methodological weaknesses in several domains considered critical to the reliability of SRs’ findings.

Beyond individual domain performance, all included reviews exhibited at least one critical methodological flaw, which affects the level of confidence that can be placed in their conclusions. The absence of protocol registration, incomplete reporting of search strategies, inconsistent assessment of publication bias, and lack of justification for excluded studies collectively represent gaps that undermine transparency and traceability, two pillars of SR integrity. Despite the strong compliance observed in structural aspects such as duplicate processes and conflict-of-interest reporting, the deficits in these critical domains weaken the overall credibility of the evidence generated. This is particularly concerning given the increasing reliance on NMAs to inform therapeutic guidelines and regulatory frameworks. As a result, unresolved methodological shortcomings may compromise both internal validity and the applicability of NMA findings in clinical and policy settings.

3.3. Comparative Analysis of Review Categories: Efficacy, Safety, and Combined Outcomes

When stratifying the results according to the primary focus of the reviews, safety-focused reviews consistently demonstrated higher methodological quality compared to efficacy-focused ones, as shown in Figure 3.

This difference was particularly marked in several critical AMSTAR 2 domains. In Domain 2 (protocol registration), all safety reviews reported a pre-registered protocol (100%), while none of the efficacy reviews did so, reflecting greater transparency and methodological planning in the former group.

Similarly, in Domain 4 (comprehensive search strategy), 80% of safety reviews provided a well-detailed and appropriate search, compared to only 20% of efficacy reviews. In Domain 9 (risk of bias assessment), compliance was universal in safety reviews (100%) but observed in fewer than half of the efficacy reviews (40%).

The same pattern was found in Domain 15 (publication bias), which was assessed by all safety reviews but by only 20% of efficacy reviews.

Transparency gaps were evident across both groups, particularly in Domain 10 (reporting of funding sources), which none of the reviews addressed, and in Domain 7 (justification for study exclusions), which was consistently absent. These findings highlight the systematic methodological advantages of safety-focused reviews, while also exposing shortcomings that remain common to both domains of investigation.

Reviews that assessed both efficacy and safety demonstrated an intermediate and often inconsistent methodological profile.

Although they performed well in structural domains, such as clearly defining the PICO question and conducting duplicate data extraction, they showed notable shortcomings in several critical areas. Only 37% reported a preregistered protocol (Domain 2), 25% described their search strategy in sufficient detail (Domain 4), and 37% assessed the risk of publication bias (Domain 15).

This variability likely reflects the additional complexity of synthesizing two distinct categories of outcomes, which may increase methodological demands and the likelihood of omissions. Consequently, although mixed reviews have the potential to provide a broader and more integrated perspective, their conclusions must be interpreted cautiously due to the heterogeneity observed in methodological quality.
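The subgroup percentages in this section can be reconciled with the overall compliance rates reported in Section 3.2 by simple arithmetic. The Python sketch below back-calculates the number of compliant reviews in each group from the reported percentages and group sizes (5 safety, 5 efficacy, 8 mixed). The counts for mixed reviews are an assumption: 37% is taken to mean 3 of 8 and 25% to mean 2 of 8. The variable and function names are illustrative only.

```python
# Group sizes reported in Section 3.1.
GROUP_SIZES = {"safety": 5, "efficacy": 5, "mixed": 8}

# Compliant reviews per group, back-calculated from the reported percentages.
# Assumption: 37% of 8 mixed reviews = 3 reviews; 25% of 8 = 2 reviews.
compliant = {
    "Domain 2 (protocol registration)": {"safety": 5, "efficacy": 0, "mixed": 3},
    "Domain 4 (search strategy)": {"safety": 4, "efficacy": 1, "mixed": 2},
    "Domain 15 (publication bias)": {"safety": 5, "efficacy": 1, "mixed": 3},
}


def pooled_percent(counts):
    """Overall share of compliant reviews across all groups, rounded."""
    total_reviews = sum(GROUP_SIZES.values())
    return round(100 * sum(counts.values()) / total_reviews)


for domain, counts in compliant.items():
    print(f"{domain}: {sum(counts.values())}/18 = {pooled_percent(counts)}%")
```

Under these assumptions the pooled values are 44%, 39%, and 50%, matching the overall figures given for Domains 2, 4, and 15.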

4. Discussion

This study examined the methodological quality of 18 SRs with NMAs that evaluated JAK inhibitors for the treatment of RA, applying the AMSTAR 2 tool. Clear differences emerged across the reviews according to their primary focus: efficacy, safety, or both outcomes simultaneously. Safety-focused reviews demonstrated the strongest methodological performance. All reported a preregistered protocol, adequately evaluated the risk of bias in the included primary studies, and assessed the possibility of publication bias. This pattern likely reflects the heightened scientific and regulatory scrutiny typically associated with safety research, particularly in fields concerned with pharmacovigilance and drug-related harms. Similar findings have been described in other clinical domains, where safety-oriented reviews tend to perform better in critical methodological domains [13]. By contrast, reviews addressing efficacy presented considerable weaknesses. None reported a pre-registered protocol, and several described their search strategies only partially or with insufficient detail. Furthermore, the assessment of publication bias was also infrequent. These limitations have also been documented in earlier studies, which identified the absence of protocol registration and incomplete bias assessment as recurring problems in SRs of pharmacological interventions [34].

Mixed reviews, those addressing both efficacy and safety, showed intermediate performance. Although they complied well with several structural domains, they exhibited deficiencies similar to those of efficacy-focused reviews, particularly regarding protocol registration, search strategy detail, and publication-bias assessment. This may be attributable to the added complexity of synthesizing two distinct categories of outcomes, which can increase methodological demands and the risk of inconsistencies [35].

Across all review types, two shortcomings stood out: none of the reviews justified the exclusion of studies after full-text screening, and none reported the funding sources of the included primary studies. These omissions, which have also been highlighted in reviews from other clinical areas, raise concerns regarding transparency and traceability, two core dimensions of methodological quality [36].

It is important to note, however, that the present findings do not fully align with all previous literature. In a cross-sectional analysis of 127 safety-focused SRs in the surgical field, Zhou and colleagues observed that none achieved a high methodological quality rating, with the majority being classified as low or critically low quality [37]. This contrast suggests that the relative strength of safety-focused reviews observed in our study may not be universal, but rather dependent on the context and the scientific standards prevailing in a given field.

The implications of these results extend directly to clinical practice and regulatory decision making. NMAs are frequently relied upon as a basis for therapeutic guidelines and policy recommendations. Methodological flaws in critical domains, such as the absence of a protocol, insufficient assessment of bias, or failure to consider publication bias, may undermine the validity of the evidence and increase the risk of misleading conclusions [36,38]. In the case of JAK inhibitors, the impact of such flaws is particularly concerning, given the therapeutic importance of these drugs and the safety issues that accompany their use.

This study also has limitations. Although the search was expanded to include both PubMed and Embase, other databases may include relevant reviews not captured here. The assessment of methodological quality relied exclusively on AMSTAR 2, which, although widely validated, does not capture every possible dimension of review quality [34]. Finally, the analysis reflects the evidence available up to the date of the search, and more recent reviews or updated versions may not have been included.

Future Implications

Future evidence syntheses on JAK inhibitors in rheumatoid arthritis would benefit from greater standardization and methodological rigor, particularly as NMAs are increasingly used to inform comparative decision-making [38]. Several improvements should therefore be prioritized. Protocols should be pre-registered in platforms such as PROSPERO (or an equivalent registry), enhancing transparency and reducing the likelihood of unplanned methodological changes. Literature searches should be comprehensive and fully documented (including reproducible strategies and clear justification of exclusions), and reporting should align more consistently with PRISMA guidance to facilitate reproducibility and interpretation of network structure and study contributions.

In addition, NMA-specific assumptions, like transitivity and consistency, should be explicitly assessed and transparently reported, alongside heterogeneity and small-study effects/publication bias where feasible [34,38]. Risk-of-bias and publication-bias assessments should be conducted systematically and integrated into interpretation, with consideration of structured approaches to rate certainty of evidence in NMA (e.g., GRADE for NMA). Finally, when reviews aim to evaluate both efficacy and safety, tailored methodological strategies are needed to address the additional complexity without compromising rigor; given evolving safety signals and regulatory recommendations for JAK inhibitors, living or regularly updated reviews, stronger long-term follow-up, and more consistent reporting of patient-centred outcomes would further strengthen the clinical and policy relevance of future NMAs [39].

5. Conclusions

In conclusion, these results reveal significant methodological discrepancies across efficacy-, safety-, and mixed-focused reviews of JAK inhibitors. These findings underscore the need for more consistent adoption of registered protocols, fully reported search strategies, and bias assessment. Addressing these shortcomings is essential if such reviews are to provide a reliable foundation for evidence synthesis, clinical practice and regulatory decision-making.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/jcm15020725/s1, PRISMA 2020 Checklist.

Author Contributions

Conceptualization, B.R. and C.A.; methodology, B.R. and C.A., writing—original draft preparation, B.R.; writing—review and editing, B.R., A.P., D.M. and C.A.; data extraction: B.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

Authors Ana Penedones, Diogo Mendes and Carlos Alves are partners from the company Clevidence. The company has no connection with the article’s content and there are no conflicts of interest. All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A. Search Strategy

Table A1. PubMed Search Strategy.

Search	Query
#11	Search: #4 AND #7 AND #10 Sort by: Most Recent
#10	Search: #8 OR #9 Sort by: Most Recent
#9	Search: “janus kinase inhibitor” OR “JAK inhibitor” OR “tofacitinib” OR “upadacitinib” OR “baricitinib” OR “filgocitinib” Sort by: Most Recent
#8	Search: “Janus Kinase Inhibitors” [Mesh] Sort by: Most Recent
#7	Search: #5 OR #6 Sort by: Most Recent
#6	Search: “network meta-analysis” OR “mixed treatment comparison” OR “multiple treatment comparison” Sort by: Most Recent
#5	Search: “Network Meta-Analysis as Topic” [Mesh] OR “Network Meta-Analysis” [Publication Type] Sort by: Most Recent
#4	Search: #1 OR #2 OR #3 Sort by: Most Recent
#3	Search: “arthritis, rheumatoid” Sort by: Most Recent
#2	Search: “rheumatoid arthritis” Sort by: Most Recent
#1	Search: rheumatoid arthritis [MeSH Terms] Sort by: Most Recent
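The way the numbered lines in Table A1 nest into a single Boolean query can be sketched as follows. This is an illustrative Python snippet, not the tool used by the authors: the helper names `any_of` and `all_of` are hypothetical, typographic quotes are replaced with straight quotes, and the terms are copied as they appear in the table, including their spelling.

```python
def any_of(*terms: str) -> str:
    """Join terms with OR and wrap the block in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*blocks: str) -> str:
    """Join concept blocks with AND."""
    return " AND ".join(blocks)


# Lines #1 to #3, combined as #4 (condition).
condition = any_of(
    "rheumatoid arthritis [MeSH Terms]",
    '"rheumatoid arthritis"',
    '"arthritis, rheumatoid"',
)

# Lines #5 and #6, combined as #7 (study design).
design = any_of(
    '"Network Meta-Analysis as Topic" [Mesh]',
    '"Network Meta-Analysis" [Publication Type]',
    '"network meta-analysis"',
    '"mixed treatment comparison"',
    '"multiple treatment comparison"',
)

# Lines #8 and #9, combined as #10 (intervention).
intervention = any_of(
    '"Janus Kinase Inhibitors" [Mesh]',
    '"janus kinase inhibitor"',
    '"JAK inhibitor"',
    '"tofacitinib"',
    '"upadacitinib"',
    '"baricitinib"',
    '"filgocitinib"',  # spelled as in Table A1
)

# Line #11: #4 AND #7 AND #10.
query = all_of(condition, design, intervention)
print(query)
```

Keeping each concept as its own OR block mirrors the table's structure and makes it easy to see which synonyms contribute to each of the three ANDed concepts.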

Table A2. Embase Search Strategy.

Search	Query
#13	#4 AND #8 AND #12
#12	#9 OR #10 OR #11
#11	tofacitinib OR upadacitinib OR baricitinib OR filgocitinib
#10	jak AND inhibitor
#9	‘janus kinase inhibitor’
#8	#5 OR #6 OR #7
#7	‘mixed treatment comparison’
#6	‘network meta-analysis’
#5	‘network meta-analysis’ (topic)’
#4	#1 OR #2 OR #3
#3	‘rheumatoid arthritis’/exp
#2	‘rheumatoid arthritis’
#1	‘rheumatoid arthritis’/exp OR ‘rheumatoid arthritis’

Appendix B. AMSTAR 2 Instrument Methodology

Domains:

(1): Did the research questions and inclusion criteria for the review include the components of PICO (Population, Intervention, Comparator, Outcome)?
(2): Did the report of the review contain an explicit statement that the review methods were established prior to the conduct of the review and did the report justify any significant deviations from the protocol? (critical)
(3): Did the review authors explain their selection of the study designs for inclusion in the review?
(4): Did the review authors use a comprehensive literature search strategy? (critical)
(5): Did the review authors perform study selection in duplicate?
(6): Did the review authors perform data extraction in duplicate?
(7): Did the review authors provide a list of excluded studies and justify the exclusions? (critical)
(8): Did the review authors describe the included studies in adequate detail?
(9): Did the review authors use a satisfactory technique for assessing the risk of bias (RoB) in individual studies that were included in the review? (critical)
(10): Did the review authors report on the sources of funding for the studies included in the review?
(11): If meta-analysis was performed, did the review authors use appropriate methods for statistical combination of results? (critical)
(12): If meta-analysis was performed, did the review authors assess the potential impact of risk of bias (RoB) in individual studies on the results of the meta-analysis or other evidence synthesis?
(13): Did the review authors account for RoB in primary studies when interpreting/discussing the results of the review? (critical)
(14): Did the review authors provide a satisfactory explanation for, and discussion of, any heterogeneity observed in the results of the review?
(15): If they performed quantitative synthesis did the review authors carry out an adequate investigation of publication bias (small study bias) and discuss its likely impact on the results of the review? (critical)
(16): Did the review authors report any potential sources of conflict of interest, including any funding they received for conducting the review?

The overall confidence can be interpreted as follows:

High (zero or one non-critical weakness: the systematic review provides an accurate and comprehensive summary of the results of the available studies that address the question of interest);
Moderate (more than one non-critical weakness *: the systematic review has more than one weakness, but no critical flaws. It may provide an accurate summary of the results of the available studies that were included in the review);
Low (one critical flaw with or without non-critical weaknesses: the review has a critical flaw and may not provide an accurate and comprehensive summary of the available studies that address the question of interest);
Critically low (more than one critical flaw with or without non-critical weaknesses: the review has more than one critical flaw and should not be relied on to provide an accurate and comprehensive summary of the available studies).

* Multiple non-critical weaknesses may diminish confidence in the review and it may be appropriate to move the overall appraisal down from moderate to low confidence.
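The rating rules above can be expressed as a small decision function. The following Python sketch is illustrative only: the function name `overall_confidence`, the dictionary layout, and the choice to count only a rating of "No" as a flaw or weakness are assumptions made here, not part of the AMSTAR 2 instrument or of the procedure used in this study. The footnoted option of moving an appraisal down from moderate to low is left to reviewer judgment and is not implemented.

```python
# Critical domains as listed in Section 2.4 and marked "(critical)" above.
CRITICAL_ITEMS = {2, 4, 7, 9, 11, 13, 15}


def overall_confidence(ratings: dict[int, str]) -> str:
    """Map per-item ratings to an overall confidence level.

    Assumption: only a rating of "No" counts as a critical flaw
    (critical items) or a non-critical weakness (all other items).
    """
    unmet = [item for item, rating in ratings.items() if rating == "No"]
    critical_flaws = sum(1 for item in unmet if item in CRITICAL_ITEMS)
    non_critical = len(unmet) - critical_flaws

    if critical_flaws > 1:
        return "Critically low"
    if critical_flaws == 1:
        return "Low"
    if non_critical > 1:
        return "Moderate"
    return "High"


# Hypothetical review: every item met except Domains 7 and 10.
review = {item: "Yes" for item in range(1, 17)}
review[7] = "No"
review[10] = "No"
print(overall_confidence(review))  # Low

# Same review with an unregistered protocol (Domain 2) as well.
review[2] = "No"
print(overall_confidence(review))  # Critically low
```

Under these assumptions, a review that fails only Domains 7 and 10 would be rated "Low", because Domain 7 is critical; a second unmet critical item moves it to "Critically low".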

References

Fraenkel, L.; Bathon, J.M.; England, B.R.; St.CLair, E.W.; Arayssi, T.; Carandang, K.; Deane, K.D.; Genovese, M.; Huston, K.K.; Kerr, G.; et al. 2021 American College of Rheumatology Guideline for the Treatment of Rheumatoid Arthritis. Arthritis Care Res. 2021, 73, 924–939. [Google Scholar] [CrossRef]
Smolen, J.S.; Landewé, R.B.M.; Bergstra, S.A.; Kerschbaumer, A.; Sepriano, A.; Aletaha, D.; Caporali, R.; Edwards, C.J.; Hyrich, K.L.; Pope, J.E.; et al. EULAR recommendations for the management of rheumatoid arthritis with synthetic and biological disease-modifying antirheumatic drugs: 2022 update. Ann. Rheum. Dis. 2023, 82, 3–18. [Google Scholar] [CrossRef]
Di Matteo, A.; Bathon, J.M.; Emery, P. Rheumatoid arthritis. Lancet 2023, 402, 2019–2033. [Google Scholar] [CrossRef] [PubMed]
GBD 2021 Rheumatoid Arthritis Collaborators. Global, regional, and national burden of rheumatoid arthritis, 1990–2020, and projections to 2050: A systematic analysis of the Global Burden of Disease Study 2021. Lancet Rheumatol. 2023, 5, e594–e610. [Google Scholar] [CrossRef] [PubMed]
Bonelli, M.; Kerschbaumer, A.; Kastrati, K.; Ghoreschi, K.; Gadina, M.; Heinz, L.X.; Smolen, J.S.; Aletaha, D.; O’Shea, J.; Laurence, A. Selectivity, efficacy and safety of JAKinibs: New evidence for a still evolving story. Ann. Rheum. Dis. 2024, 83, 139–160. [Google Scholar] [CrossRef]
Ytterberg, S.R.; Bhatt, D.L.; Mikuls, T.R.; Koch, G.G.; Fleischmann, R.; Rivas, J.L.; Germino, R.; Menon, S.; Sun, Y.; Wang, C.; et al. Cardiovascular and Cancer Risk with Tofacitinib in Rheumatoid Arthritis. N. Engl. J. Med. 2022, 386, 316–326. [Google Scholar] [CrossRef] [PubMed]
European Medicines Agency. Direct Healthcare Professional Communication (DHPC): Updated Recommendations to Minimise the Risks of Malignancy, Major Adverse Cardiovascular Events, Serious Infections, Venous Thromboembolism and Mortality with Use of Janus Kinase Inhibitors (JAKi). 2023. Available online: https://www.ema.europa.eu/en/medicines/dhpc/cibinqo-jyseleca-olumiant-rinvoq-xeljanz (accessed on 6 January 2026).
Caldwell, D.M.; Ades, A.E.; Higgins, J.P.T. Simultaneous comparison of multiple treatments: Combining direct and indirect evidence. BMJ 2005, 331, 897–900. [Google Scholar] [CrossRef]
Salanti, G. Indirect and mixed-treatment comparison, network, or multiple-treatments meta-analysis: Many names, many benefits, many concerns for the next generation evidence synthesis tool. Res. Synth. Methods 2012, 3, 80–97. [Google Scholar] [CrossRef]
Lee, Y.H.; Song, G.G. Comparative efficacy and safety of tofacitinib, baricitinib, upadacitinib, and filgotinib in active rheumatoid arthritis refractory to biologic disease-modifying antirheumatic drugs. Z. Rheumatol. 2021, 80, 379–392. [Google Scholar] [CrossRef]
Taylor, P.C.; Takeuchi, T.; Burmester, G.R.; Durez, P.; Smolen, J.S.; Deberdt, W.; Issa, M.; Terres, J.R.; Bello, N.; Winthrop, K.L. Safety of baricitinib for the treatment of rheumatoid arthritis over a median of 4.6 and up to 9.3 years of treatment: Final results from long-term extension study and integrated database. Ann. Rheum. Dis. 2022, 81, 335–343. [Google Scholar] [CrossRef]
Shea, B.J.; Reeves, B.C.; Wells, G.; Thuku, M.; Hamel, C.; Moran, J.; Moher, D.; Tugwell, P.; Welch, V.; Kristjansson, E.; et al. AMSTAR 2: A critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ 2017, 358, j4008. Available online: https://www.bmj.com/content/358/bmj.j4008 (accessed on 25 July 2025).
Zorzela, L.; Golder, S.; Liu, Y.; Pilkington, K.; Hartling, L.; Joffe, A.; Loke, Y.; Vohra, S. Quality of reporting in systematic reviews of adverse events: Systematic review. BMJ 2014, 348, f7668. [Google Scholar] [CrossRef]
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]
Higgins, J.P.T.; Altman, D.G.; Gøtzsche, P.C.; Jüni, P.; Moher, D.; Oxman, A.D.; Savović, J.; Schulz, K.F.; Weeks, L.; Sterne, J.A.C.; et al. The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ 2011, 343, d5928. [Google Scholar] [CrossRef] [PubMed]
McGuinness, L.A.; Higgins, J.P.T. Risk-of-bias VISualization (robvis): An R package and Shiny web app for visualizing risk-of-bias assessments. Res. Synth. Methods 2021, 12, 55–61. [Google Scholar]
Lee, Y.H.; Song, G.G. Relative remission rates of Janus kinase inhibitors in comparison with adalimumab in patients with active rheumatoid arthritis: A network meta-analysis. Z. Rheumatol. 2024, 83, 88–96. [Google Scholar] [CrossRef]
Cai, W.; Tong, R.; Sun, Y.; Yao, Y.; Zhang, J. Comparative efficacy of five approved Janus kinase inhibitors as monotherapy and combination therapy in patients with moderate-to-severe active rheumatoid arthritis: A systematic review and network meta-analysis of randomized controlled trials. Front. Pharmacol. 2024, 15, 1387585. [Google Scholar] [CrossRef] [PubMed]
Lee, Y.H.; Song, G.G. Relative efficacy and safety of tofacitinib, baricitinib, upadacitinib, and filgotinib in comparison to adalimumab in patients with active rheumatoid arthritis. Z. Rheumatol. 2020, 79, 785–796. [Google Scholar] [CrossRef]
Pope, J.; Sawant, R.; Tundia, N.; Du, E.X.; Qi, C.Z.; Song, Y.; Tang, P.; Betts, K.A. Comparative Efficacy of JAK Inhibitors for Moderate-to-Severe Rheumatoid Arthritis: A Network Meta-Analysis. Adv. Ther. 2020, 37, 2356–2372. [Google Scholar] [CrossRef]
Sung, Y.K.; Lee, Y.H. Comparative study of the efficacy and safety of tofacitinib, baricitinib, upadacitinib, and filgotinib versus methotrexate for disease-modifying antirheumatic drug-naïve patients with rheumatoid arthritis. Z. Rheumatol. 2021, 80, 889–898. [Google Scholar] [CrossRef]
Li, N.; Gou, Z.-P.; Du, S.-Q.; Zhu, X.-H.; Lin, H.; Liang, X.-F.; Wang, Y.-S.; Feng, P. Effect of JAK inhibitors on high- and low-density lipoprotein in patients with rheumatoid arthritis: A systematic review and network meta-analysis. Clin. Rheumatol. 2022, 41, 677–688. [Google Scholar] [CrossRef]
Lee, Y.H.; Song, G.G. Relative Remission and Low Disease Activity Rates of Tofacitinib, Baricitinib, Upadacitinib, and Filgotinib versus Methotrexate in Patients with Disease-Modifying Antirheumatic Drug-Naive Rheumatoid Arthritis. Pharmacology 2023, 108, 589–598. [Google Scholar] [CrossRef] [PubMed]
Lee, Y.H.; Song, G.G. Comparative efficacy and safety of tofacitinib, baricitinib, upadacitinib, filgotinib and peficitinib as monotherapy for active rheumatoid arthritis. J. Clin. Pharm. Ther. 2020, 45, 674–681. [Google Scholar] [CrossRef]
Alves, C.; Penedones, A.; Mendes, D.; Marques, F.B. The Risk of Infections Associated with JAK Inhibitors in Rheumatoid Arthritis: A Systematic Review and Network Meta-analysis. J. Clin. Rheumatol. 2022, 28, e407–e414. [Google Scholar] [CrossRef]
Pugliesi, A.; Oliveira, D.G.C.; de Souza Filho, V.A.; de Oliveira Machado, J.; Pereira, A.G.; de Castro Silveira Bichuette, J.; Sachetto, Z.; de Carvalho, L.S.F.; Bertolo, M.B. Cardiovascular safety of the class of JAK inhibitors or tocilizumab compared with TNF inhibitors in patients with rheumatoid arthritis: Systematic review and a traditional and Bayesian network meta-analysis of randomized clinical trials. Semin. Arthritis Rheum. 2024, 69, 152563. [Google Scholar] [CrossRef]
Wei, Q.; Wang, H.; Zhao, J.; Luo, Z.; Wang, C.; Zhu, C.; Su, N.; Zhang, S. Cardiovascular safety of Janus kinase inhibitors in patients with rheumatoid arthritis: Systematic review and network meta-analysis. Front. Pharmacol. 2023, 14, 1237234. [Google Scholar] [CrossRef]
Alves, C.; Penedones, A.; Mendes, D.; Marques, F.B. Risk of Cardiovascular and Venous Thromboembolic Events Associated with Janus Kinase Inhibitors in Rheumatoid Arthritis: A Systematic Review and Network Meta-analysis. J. Clin. Rheumatol. 2022, 28, 69–76. [Google Scholar] [CrossRef]
Alves, C.; Penedones, A.; Mendes, D.; Batel-Marques, F. Risk of infections and cardiovascular and venous thromboembolic events associated with JAK inhibitors in rheumatoid arthritis: Protocols of two systematic reviews and network meta-analyses. BMJ Open 2020, 10, e041420. [Google Scholar] [CrossRef]
Lee, Y.H.; Song, G.G. Relative effectiveness and safety of interleukin-6 and Janus kinase inhibitors versus adalimumab in patients with rheumatoid arthritis: A network meta-analysis. Z. Rheumatol. 2023, 82, 696–705. [Google Scholar] [CrossRef]
Weng, C.; Xue, L.; Wang, Q.; Lu, W.; Xu, J.; Liu, Z. Comparative efficacy and safety of Janus kinase inhibitors and biological disease-modifying antirheumatic drugs in rheumatoid arthritis: A systematic review and network meta-analysis. Ther. Adv. Musculoskelet. Dis. 2021, 13, 1759720X21999564. [Google Scholar] [CrossRef] [PubMed]
Almoallim, H.M.; Omair, M.A.; Ahmed, S.A.; Vidyasagar, K.; Sawaf, B.; Yassin, M.A. Comparative Efficacy and Safety of JAK Inhibitors in the Management of Rheumatoid Arthritis: A Network Meta-Analysis. Pharmaceuticals 2025, 18, 178. [Google Scholar] [CrossRef] [PubMed]
Qu, B.; Zhao, F.; Song, Y.; Zhao, J.; Yao, Y.; Chen, Y.; Liao, R.; Fu, L. The efficacy and safety of different Janus kinase inhibitors as monotherapy in rheumatoid arthritis: A Bayesian network meta-analysis. PLoS ONE 2024, 19, e0305621. [Google Scholar] [CrossRef] [PubMed]
Page, M.J.; Shamseer, L.; Altman, D.G.; Tetzlaff, J.; Sampson, M.; Tricco, A.C.; Catalá-López, F.; Li, L.; Reid, E.K.; Sarkis-Onofre, R.; et al. Epidemiology and Reporting Characteristics of Systematic Reviews of Biomedical Research: A Cross-Sectional Study. PLoS Med. 2016, 13, e1002028. [Google Scholar] [CrossRef] [PubMed]
Lunny, C.; Pieper, D.; Thabet, P.; Kanji, S. Managing overlap of primary study results across systematic reviews: Practical considerations for authors of overviews of reviews. BMC Med. Res. Methodol. 2021, 21, 140. [Google Scholar] [CrossRef] [PubMed]
Ioannidis, J.P.A. The Mass Production of Redundant, Misleading, and Conflicted Systematic Reviews and Meta-analyses. Milbank Q. 2016, 94, 485–514. [Google Scholar] [CrossRef] [PubMed]
Zhou, X.; Li, L.; Lin, L.; Ju, K.; Kwong, J.S.W.; Xu, C. Methodological quality for systematic reviews of adverse events with surgical interventions: A cross-sectional survey. BMC Med. Res. Methodol. 2021, 21, 223. [Google Scholar] [CrossRef]
Sterne, J.A.C.; Savović, J.; Page, M.J.; Elbers, R.G.; Blencowe, N.S.; Boutron, I.; Cates, C.J.; Cheng, H.Y.; Corbett, M.S.; Eldridge, S.M.; et al. RoB 2: A revised tool for assessing risk of bias in randomised trials. BMJ 2019, 366, l4898. [Google Scholar] [CrossRef]
Pieper, D.; Antoine, S.L.; Mathes, T.; Neugebauer, E.A.M.; Eikermann, M. Systematic review finds overlapping reviews were not mentioned in every other overview. J. Clin. Epidemiol. 2014, 67, 368–375. [Google Scholar] [CrossRef]

Figure 1. PRISMA flow diagram of the study selection process.

Figure 2. Methodological quality assessment for each SR with NMA, according to the AMSTAR 2 scale. Green (low concern; + symbol) indicates “Yes” (all criterion requirements met), yellow (some concerns; - symbol) indicates “Partial Yes” (some requirements met), and red (high concern; × symbol) indicates “No” (criterion not met). The overall confidence rating is determined by the presence and number of critical flaws and non-critical weaknesses, and can be interpreted as follows: green (high; + symbol) indicates that the SR provides an accurate and comprehensive summary of the results; red (critically low; ! symbol) indicates that the SR should not be relied on to provide an accurate and comprehensive summary of the available studies.

Figure 3. Comparative compliance with AMSTAR 2 domains by review focus. Bars represent the percentage of reviews in each category (Efficacy, Safety, Efficacy + Safety) that fulfilled each AMSTAR 2 domain (P1–P16 correspond to Domains D1–D16). Higher bars indicate greater compliance with the respective domain.

Table 1. Methodologic characteristics of the SRs with NMAs included.

Reference	Guideline	Bibliographic Databases	Type of Outcomes Assessed	Protocol Registration	Quality Assessment Scale	Type of Statistical Model	Heterogeneity/Inconsistency Assessment
Lee & Song, 2021 [10]	PRISMA	MEDLINE, EMBASE, CENTRAL, ACR, EULAR	Efficacy & Safety	NR	Jadad scores	Bayesian fixed-effects model	Inconsistency plots, Fixed vs. random effects model comparison
Lee & Song, 2024 [17]	PRISMA	MEDLINE, EMBASE, CENTRAL, ACR, EULAR	Efficacy	NR	Jadad scores	Bayesian fixed-effects model	NR
Cai et al., 2024 [18]	NR	PubMed, EMBASE, Web of Science and CENTRAL	Efficacy	NR	RoB2	Random-effects model (Stata 14, RevMan 5.4)	Cochran’s Q test
Lee & Song, 2020 [19]	PRISMA	MEDLINE, EMBASE, CENTRAL, ACR, EULAR	Efficacy & Safety	NR	Jadad scores	Bayesian fixed-effects model	Inconsistency plots, Fixed vs. random effects model comparison
Pope et al., 2020 [20]	PRISMA	MEDLINE, EMBASE, CENTRAL	Efficacy	NR	NR	Bayesian random-effects model	NR
Sung & Lee, 2021 [21]	PRISMA	MEDLINE, EMBASE, CENTRAL, ACR, EULAR	Efficacy & Safety	NR	Jadad scores	Bayesian fixed-effects model	Inconsistency plots, Fixed vs. random effects model comparison
Li et al., 2022 [22]	PRISMA	PubMed, MEDLINE, EMBASE, CENTRAL	Efficacy	NR	RoB2	Bayesian random-effects model	I² statistic
Lee & Song, 2023 [23]	PRISMA	MEDLINE, EMBASE, CENTRAL, ACR, EULAR	Efficacy	NR	Jadad scores	Bayesian fixed-effects model	Inconsistency plots, Fixed vs. random effects model comparison
Lee & Song, 2020 [24]	PRISMA	MEDLINE, EMBASE, CENTRAL, ACR, EULAR	Efficacy & Safety	NR	Jadad scores	Bayesian fixed-effects model	Inconsistency plots, Fixed vs. random effects model comparison
Alves et al., 2022 [25]	PRISMA; CRD	PubMed, EMBASE, CENTRAL, and ClinicalTrials.gov	Safety	PROSPERO, ENCePP	RoB2	Frequentist random-effects model	Wald test; Node-splitting; Loop-specific approach
Pugliesi et al., 2024 [26]	PRISMA	MEDLINE (PubMed), CENTRAL, EMBASE, Web of Science, Scopus, LILACS	Safety	PROSPERO	RoB2	Bayesian random-effects model (MCMC)	I² statistic, node-splitting
Wei et al., 2023 [27]	PRISMA, Cochrane	PubMed, EMBASE, CENTRAL	Safety	PROSPERO	RoB2	Frequentist random-effects model; Bayesian random-effects model	Cochran’s Q test, node-splitting analysis
Alves et al., 2022 [28]	PRISMA	PubMed, EMBASE, CENTRAL, and ClinicalTrials.gov	Safety	PROSPERO, ENCePP	RoB2	Frequentist random-effects model	Wald test; Node-splitting; Loop-specific
Alves et al., 2020 [29]	PRISMA-NMA, CRD	PubMed, EMBASE, CENTRAL, and ClinicalTrials.gov	Safety	ENCePP	RoB2	Frequentist random-effects model	Wald test; Node-splitting
Lee & Song, 2023 [30]	PRISMA	MEDLINE, EMBASE, CENTRAL, ACR, EULAR	Efficacy & Safety	NR	Jadad scores	Bayesian fixed-effects model	Inconsistency plots, Fixed vs. random effects model comparison
Weng et al., 2021 [31]	PRISMA	PubMed, EMBASE and CENTRAL	Efficacy & Safety	PROSPERO, INPLASY	RoB2	Bayesian random-effects model	Node-splitting, design-by-treatment test, I² statistic, τ² and funnel plots
Almoallim et al., 2025 [32]	PRISMA	PubMed, CENTRAL, ClinicalTrials.gov, ICTRP Network	Efficacy & Safety	INPLASY	RoB2	Frequentist random-effects model	I² statistic and τ²
Qu et al., 2024 [33]	PRISMA	CNKI, VIP, Wanfang, CBM, PubMed, EMBASE, CENTRAL and Web of Science	Efficacy & Safety	PROSPERO	RoB2	Bayesian fixed-effects and random-effects models	Funnel plots, Monte Carlo method and random-effects model

ACR, American College of Rheumatology; CENTRAL, Cochrane Central Register of Controlled Trials; CRD, Centre for Reviews and Dissemination; EULAR, European League Against Rheumatism; ENCePP, European Network of Centres for Pharmacoepidemiology and Pharmacovigilance; MCMC, Markov chain Monte Carlo; NR, not reported; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses; PRISMA-NMA, PRISMA extension statement for reporting systematic reviews incorporating network meta-analyses of healthcare interventions; RoB2, revised Cochrane risk-of-bias tool for randomized trials.
