Flawed Metrics, Damaging Outcomes: A Rebuttal to the RI2 Integrity Index Targeting Top Indonesian Universities
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This is a clearly written comment on the RI2 index. I think it is very helpful in shaping the right use of bibliometric indices.
- Please focus more on the flaws of the index. Describe it in a more academic way.
- Based on the academic analysis of the flaws of the index, propose how to, for example, improve the index, or how to interpret the index.
- Give practical advice on how to use retraction data and delisting data.
Author Response
Reviewer 1
General comment: This is a clearly written comment on the RI2 index. I think it is very helpful in shaping the right use of bibliometric indices.
Response to general comment: Thank you for the encouraging feedback. We are grateful that the clarity and relevance of our analysis were appreciated. In response to your further queries, we have strengthened the manuscript by deepening the analytical critique, improving scholarly framing, and providing concrete, constructive recommendations.
Query #1: Please focus more on the flaws of the index. Describe it in a more academic way.
Response #1: We have thoroughly revised Sections 4 and 5 of the manuscript to provide a more rigorous and academically grounded critique of the RI² index. Specifically: (a) We now examine the flawed assumption that delisting is a reliable proxy for misconduct (Section 4); (b) We discuss the methodological opacity of Scopus delisting, referencing the peer-reviewed case of Nurture (Ahmad, 2025); (c) We present the imbalance caused by the equal weighting of retractions and delisting without contextual calibration. The index’s failure to incorporate any form of validation or stakeholder engagement is emphasized in the revised subsection titled “Global Metric, Local Blindness: RI²’s Design Without Validation”.
Query #2: Based on the academic analysis of the flaws of the index, propose how to, for example, improve the index, or how to interpret the index.
Response #2: We now include a dedicated discussion (Sections 5 and 6, and Concluding Remarks) that outlines actionable improvements: (a) A weighted model is suggested, assigning higher value to retractions (e.g., 70%) and lower weight to delisting (e.g., 30%) to reflect their differing severity and directness. (b) We propose contextual flexibility in the weighting model. For example, a 50:50 model may be valid in well-resourced settings but unsuitable in LMIC contexts. (c) We argue for a mechanism for institutional rebuttals and case-specific audits, which would allow universities to clarify circumstances surrounding delisted publications. (d) Most importantly, we emphasize that RI² should be treated as an exploratory and experimental tool, not as a definitive or policy-ready metric. Until it undergoes peer-reviewed validation and stakeholder engagement, its use should be strictly limited to hypothesis generation or academic discussion, not formal assessment or reputational ranking.
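The reweighting proposed in point (a) above can be sketched as simple arithmetic. The function below is an illustrative sketch only: the 70:30 split comes from the response, but the function name, the per-1,000-publication normalization, and the example rates are hypothetical assumptions, not the index's published formula.

```python
def weighted_integrity_score(retraction_rate, delisting_rate,
                             w_retraction=0.7, w_delisting=0.3):
    """Illustrative weighted risk score (hypothetical sketch).

    retraction_rate and delisting_rate are assumed to be rates per
    1,000 publications; the weights default to the 70:30 split
    proposed in the response, with retractions weighted more heavily
    than delisting.
    """
    # Weights form a convex combination so scores stay on a comparable scale.
    assert abs(w_retraction + w_delisting - 1.0) < 1e-9, "weights must sum to 1"
    return w_retraction * retraction_rate + w_delisting * delisting_rate


# Hypothetical institution: few retractions, many delisted-journal papers.
equal_weight = weighted_integrity_score(2.0, 8.0, 0.5, 0.5)  # 50:50 model
proposed = weighted_integrity_score(2.0, 8.0)                # 70:30 model
```

Under these example rates the 50:50 model yields 5.0 while the 70:30 model yields 3.8, illustrating how equal weighting inflates the apparent risk of institutions whose main exposure is delisting rather than retraction.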
Query #3: Give practical advice on how to use retraction data and delisting data.
Response #3: Our revised manuscript now includes the following practical guidance: (a) Retraction data should be interpreted in light of the retraction type, such as fraud, error, authorship dispute, or editorial oversight. Not all retractions signal misconduct. (b) Delisted-journal data should not be used as a standalone indicator of author integrity, especially in regions where structural publication constraints prevail. (c) Both indicators should be accompanied by qualitative context, such as correction notices or investigations, to avoid misleading inferences. (d) We also caution that institutions should treat RI² outputs as flags for further inquiry, not definitive proof of wrongdoing.
Reviewer 2 Report
Comments and Suggestions for Authors
I believe that this work represents a critical and timely contribution to the debate on academic integrity metrics, especially in contexts of the Global South, and can foster a fairer and more contextualized reflection on institutional evaluation systems. The manuscript meets a high standard and is a valuable contribution to the debate on research integrity.
Overall Assessment:
The manuscript presents a comprehensive and well-argued critique of the Research Integrity Risk Index (RI2), highlighting its methodological limitations, structural biases, and negative consequences, especially in the context of Indonesian universities.
The argument is coherent and supported by empirical evidence, relevant references, and a solid conceptual framework.
The manuscript's strengths are numerous, including the expository clarity of RI2, the strength of its methodological critique, evidence of disproportionate impact in Indonesia, historical and conceptual contextualization, analysis of the media ecosystem, perspective of the Global South, and a clear call to action.
Suggestions for improvement are for further refinement of the manuscript, addressing the need for more concrete examples of policy impact, proposing alternative metrics or approaches, and additional emphasis on the critical importance of peer review.
Author Response
Reviewer 2,
General Comment: “This work represents a critical and timely contribution... fosters a fairer and more contextualized reflection on institutional evaluation systems.”
Response: We are sincerely grateful for your generous and thoughtful evaluation. Your recognition of the manuscript's contribution to equity and methodological rigor has been deeply encouraging. Below we address your suggestions point by point.
Query #1: Include more concrete examples of the impact of the RI2 on public policies.
Response #1: We now reference the significant media amplification of RI² scores in Indonesia (Section 6), noting that institutional reputations were publicly questioned despite a lack of peer-reviewed validation. Furthermore, the potential impact on research funding allocations and compliance monitoring is highlighted in Section 9, second paragraph.
Query #2: Propose fairer and more contextualized metrics or alternative approaches.
Response #2: We have expanded the recommendation section to propose: (a) A flexible weighting scheme tailored to national contexts. (b) The use of multi-dimensional indicators (e.g., correction rates, ethics training, review transparency). (c) Adoption of qualitative triangulation with institutional ethics boards. (d) The example of the SES-C scale by Sacre et al. is included to illustrate how context-specific validation can enhance global research metrics.
Query #3: Reinforce the importance of peer review as an essential principle.
Response #3: This point has been reinforced in Section 6 (Conclusion). We explicitly call for the suspension of RI² as a policy tool until its methods undergo rigorous, collaborative peer review. We also criticize the disproportionate media attention received by an unreviewed, single-authored preprint that lacks interdisciplinary co-authorship. Most importantly, we emphasize that RI² should be treated as an exploratory and experimental tool, not as a definitive or policy-ready metric.
Query #4: Ethical and formatting considerations
Response #4: The manuscript has been revised to improve neutrality, avoid overstatement, and enhance scholarly tone throughout. We have also corrected all formatting inconsistencies.