Review Reports
- Mei Yang1,2,
- Tao Wang1,* and
- Yuchun Li1
- et al.
Reviewer 1: Celeste Varum
Reviewer 2: Anonymous
Reviewer 3: Lorena Del Carmen Espina Romero
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Thank you for the opportunity to read the paper.
In my view, the article is too long: it mixes too many methodologies and side issues, yet in the end the conclusions do not differ from what a traditional analysis would reveal. In my opinion, the results add little, either to academia or to the professional community. It is not clear to me that there was a gap in the literature regarding the theme under analysis. Possibly, the contribution is more related to the methodology used. I would suggest making the paper shorter and more focused, leaving aside complementary analyses. Otherwise, if the aim is to demonstrate the methodological contribution compared to other alternatives, the focus should go in that direction.
Author Response
Comment 1: In my view, the article is too long: it mixes too many methodologies and side issues, yet in the end the conclusions do not differ from what a traditional analysis would reveal. In my opinion, the results add little, either to academia or to the professional community. It is not clear to me that there was a gap in the literature regarding the theme under analysis. Possibly, the contribution is more related to the methodology used. I would suggest making the paper shorter and more focused, leaving aside complementary analyses. Otherwise, if the aim is to demonstrate the methodological contribution compared to other alternatives, the focus should go in that direction.
Response: We sincerely thank the reviewer for this critical and constructive comment. Following this suggestion, we have substantially revised the manuscript to sharpen the focus on the methodological contribution rather than peripheral analyses. Specifically, we (i) removed several complementary and descriptive analyses, (ii) condensed the Introduction, Literature Review, and Discussion sections, and (iii) restructured the contribution statement to explicitly emphasize the CFA–CatBoost–SHAP integrated methodological framework as the core novelty of the study. In addition, we have rewritten the conclusion to clearly distinguish our nonlinear, interpretable mechanism findings from what traditional linear models can reveal. These revisions significantly reduce the manuscript length and strengthen its academic and practical contribution. The relevant changes are mainly reflected in Sections 1, 2, 5, and 6.
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript presents an ambitious and innovative integration of confirmatory factor analysis, CatBoost modeling and SHAP interpretability within the regional innovation system framework. The study offers strong methodological potential and brings forward valuable insights into the nonlinear dynamics among firms, knowledge institutions, government and the economic environment. The paper is promising, but several areas would benefit from improvement. The introduction clearly identifies gaps in the literature, yet the manuscript would be strengthened by explicitly stating the research questions or hypotheses to guide the reader through the analytical logic. The structure and readability could be improved by simplifying some dense sections and by shortening overly long sentences, which would make the arguments more accessible. The connection between the theoretical foundations of regional innovation systems and the empirical machine learning framework deserves clearer articulation, especially with regard to the justification of the four latent dimensions used in the study. The SHAP interpretations are interesting but in a few places appear overstated; they would benefit from a more concise and cautious explanation that separates empirical findings from theoretical speculation. The manuscript should also include the complete reference list and ensure that all cited works are properly represented. A more explicit discussion of limitations, including issues of data aggregation, noncausal interpretation of machine learning models and potential endogeneity among latent variables, would increase the transparency and credibility of the study. Finally, the quality of English is generally good, but the text would benefit from stylistic polishing in order to improve fluency and clarity. Despite these points, the study makes a meaningful contribution and has strong publication potential once these refinements are addressed.
Author Response
Comment 1: The introduction clearly identifies gaps in the literature, yet the manuscript would be strengthened by explicitly stating the research questions or hypotheses to guide the reader through the analytical logic. The structure and readability could be improved by simplifying some dense sections and by shortening overly long sentences, which would make the arguments more accessible. The connection between the theoretical foundations of regional innovation systems and the empirical machine learning framework deserves clearer articulation, especially with regard to the justification of the four latent dimensions used in the study.
Response: We appreciate the reviewer’s valuable suggestion. We have explicitly added three research questions (RQs) at the end of the Introduction to guide the analytical logic of the study. In addition, we have systematically simplified long sentences and revised dense paragraphs to improve readability. Moreover, we have strengthened the theoretical–empirical linkage between RIS theory and the four latent dimensions (Firm, Knowledge, Gov, Econ) by adding a clearer theoretical justification in Section 2.1.
Comment 2: The SHAP interpretations are interesting but in a few places appear overstated; they would benefit from a more concise and cautious explanation that separates empirical findings from theoretical speculation. The manuscript should also include the complete reference list and ensure that all cited works are properly represented.
Response: We thank the reviewer for this important reminder. We have revised the SHAP interpretation throughout Sections 4.4 and 5 to clearly distinguish between (i) data-driven empirical findings and (ii) theoretical inference, and we have adopted a more cautious, evidence-based tone. In addition, we have carefully checked and completed the entire reference list, ensuring that all in-text citations are fully and correctly reported.
Comment 3: A more explicit discussion of limitations, including issues of data aggregation, noncausal interpretation of machine learning models and potential endogeneity among latent variables, would increase the transparency and credibility of the study.
Response: We fully agree with this comment. A new “Limitations and Future Directions” subsection has been added to Section 5, explicitly discussing (i) data aggregation at the provincial level, (ii) the non-causal nature of machine learning inference, and (iii) potential endogeneity among RIS latent variables, as well as possible solutions for future research.
Comment 4: Finally, the quality of English is generally good, but the text would benefit from stylistic polishing in order to improve fluency and clarity.
Response: We thank the reviewer for this helpful suggestion. The entire manuscript has been further proofread and stylistically polished to improve fluency, clarity, and academic tone.
Reviewer 3 Report
Comments and Suggestions for Authors
PEER REVIEW REPORT sustainability-4013581
This manuscript presents an ambitious and technically robust integration of CFA, CatBoost, and SHAP to examine the mechanisms of Regional Innovation Systems. The study is well structured and provides valuable empirical insights. However, several sections would benefit from clearer focus, reduced redundancy, and more compact explanations to enhance the overall coherence and readability of the work. Below, I outline the main weaknesses identified in each section and offer concrete examples for improvement.
The introduction offers a strong theoretical base, but it is longer than necessary and repeats arguments on the limits of linear models. This delays a clear presentation of the research gap and weakens the impact of the main contribution. A concise revision would help. For example, two long paragraphs describing traditional methods could be replaced with a single sentence such as: “The study addresses the lack of models capable of validating the latent RIS structure while capturing nonlinear mechanisms through interpretable prediction”.
The literature review covers RIS theory and methodological advances, yet some explanations overlap, particularly when justifying why linear models are insufficient. Consolidating these arguments would improve flow. For instance, instead of repeating this justification in separate sections, it could be expressed once as: “Linear methods fall short of capturing RIS nonlinearities, which motivates the use of gradient-based and XAI approaches”.
The methodology is rigorous and well detailed, but the mathematical description is more extensive than needed. Parts of the derivations could be simplified without affecting the study’s rigor. An example of improvement would be reducing the formula-heavy segments and summarizing key points such as: “ULS estimation was selected for its robustness to non-normal data, ensuring stable CFA results across provinces”.
The CFA results are accurately presented, though the section lists each fit index individually, making it unnecessarily long. A more compact approach would keep the essential message without extra detail. For instance: “CFI, TLI, and AGFI all exceed accepted thresholds, confirming the adequacy of the latent RIS structure”.
The comparison of predictive models is clear, but the narrative repeatedly emphasizes why CatBoost outperforms linear models. This point needs to appear only once. A refined version might read: “CatBoost achieves the highest predictive accuracy because it captures nonlinear interactions across RIS dimensions, which linear regressions inherently miss”.
The SHAP analysis section includes valuable interpretations, yet the explanations of each curve are more detailed than needed. Focusing on the dominant patterns would make the section clearer. A concise example could be: “Firm is the primary driver; Knowledge and Gov act as modulators; and Econ displays threshold effects that complement system interpretation”.
The discussion section effectively interprets regional heterogeneity, but it would be strengthened by incorporating recent evidence showing that technology adoption and innovation depend on institutional, economic, and organizational contexts. At this point, it is appropriate to reference external studies that support this contextual interpretation. A relevant example is the work of Noroño Sánchez (2025), who demonstrates that technology perception and adoption vary across sectors, organizational cultures, and economic conditions. These findings align with the SHAP patterns observed in this study and help explain why provinces with similar inputs may display different innovation outcomes. Including a sentence such as the following would reinforce the argument: “Evidence shows that technology adoption does not depend solely on internal technical capacity; Noroño Sánchez (2025) reports that cultural, economic, and sectoral factors condition the assimilation of new technologies, which supports the differentiated RIS dynamics observed in this study”. DOI: https://doi.org/10.64923/ceniiac.e0002
In the same discussion, concepts such as the “knowledge trap” and the substitution between government and firm contributions are repeated. Grouping these ideas into a single subsection would avoid redundancy. For example, the two paragraphs explaining Knowledge's modulation role could be merged to present the idea once in a clearer, stronger way.
The conclusion summarises the study effectively but repeats technical explanations already covered in the discussion. It would benefit from focusing on the distinctive value of the integrated CFA–CatBoost–SHAP framework. An example revision could be: “The integrated CFA–CatBoost–SHAP framework offers a replicable pathway to analyze latent structures and nonlinear mechanisms in regional innovation systems”.
Overall, the manuscript offers a valuable contribution to the study of innovation systems through an integrated and interpretable modeling framework. Refining the structure, reducing redundancies, and incorporating contextual evidence—such as the findings of Noroño Sánchez (2025)—would substantially strengthen the clarity, coherence, and theoretical grounding of the work. With these adjustments, the manuscript would be well positioned for publication.
Comments for author File: Comments.pdf
Author Response
Comment 1: This manuscript presents an ambitious and technically robust integration of CFA, CatBoost, and SHAP to examine the mechanisms of Regional Innovation Systems. The study is well structured and provides valuable empirical insights. However, several sections would benefit from clearer focus, reduced redundancy, and more compact explanations to enhance the overall coherence and readability of the work. Below, I outline the main weaknesses identified in each section and offer concrete examples for improvement.
Response: We sincerely thank the reviewer for the positive evaluation and constructive guidance. In response, we have conducted a systematic condensation and restructuring across all sections, particularly in the Introduction, Literature Review, Methodology, and Discussion, to eliminate redundancies and improve overall coherence and compactness.
Comment 2: The introduction offers a strong theoretical base, but it is longer than necessary and repeats arguments on the limits of linear models. This delays a clear presentation of the research gap and weakens the impact of the main contribution. A concise revision would help. For example, two long paragraphs describing traditional methods could be replaced with a single sentence such as: “The study addresses the lack of models capable of validating the latent RIS structure while capturing nonlinear mechanisms through interpretable prediction”.
Response: We agree with the reviewer’s assessment. The Introduction has been substantially shortened, and repeated discussions on the limitations of linear models have been merged into a concise statement, following the suggested example. This change improves clarity and highlights the research gap more directly.
Comment 3: The literature review covers RIS theory and methodological advances, yet some explanations overlap, particularly when justifying why linear models are insufficient. Consolidating these arguments would improve flow. For instance, instead of repeating this justification in separate sections, it could be expressed once as: “Linear methods fall short of capturing RIS nonlinearities, which motivates the use of gradient-based and XAI approaches”.
Response: We thank the reviewer for this valuable suggestion. Following this advice, we have consolidated the repeated justifications regarding the limitations of linear models into a single, coherent statement in Section 2. This revision improves the logical flow of the literature review and avoids redundancy, while clearly motivating the adoption of gradient boosting and XAI-based approaches.
Comment 4: The methodology is rigorous and well detailed, but the mathematical description is more extensive than needed. Parts of the derivations could be simplified without affecting the study’s rigor. An example of improvement would be reducing the formula-heavy segments and summarizing key points such as: “ULS estimation was selected for its robustness to non-normal data, ensuring stable CFA results across provinces”.
Response: We appreciate the reviewer’s constructive comment. We have streamlined the mathematical exposition in Section 3 by removing overly detailed derivations and directly summarizing key methodological choices. In particular, we now explicitly state the rationale for selecting ULS estimation based on its robustness to non-normal data and its ability to ensure stability across provinces. This revision preserves rigor while improving readability.
Comment 5: The CFA results are accurately presented, though the section lists each fit index individually, making it unnecessarily long. A more compact approach would keep the essential message without extra detail. For instance: “CFI, TLI, and AGFI all exceed accepted thresholds, confirming the adequacy of the latent RIS structure”.
Response: We thank the reviewer for this helpful suggestion. We have revised the CFA results description in Section 4.1 by adopting a more concise presentation. The fit indices are now summarized collectively to emphasize their joint confirmation of the adequacy of the latent RIS structure, thereby improving clarity and conciseness.
Comment 6: The comparison of predictive models is clear, but the narrative repeatedly emphasizes why CatBoost outperforms linear models. This point needs to appear only once. A refined version might read: “CatBoost achieves the highest predictive accuracy because it captures nonlinear interactions across RIS dimensions, which linear regressions inherently miss”.
Response: We appreciate the reviewer’s insightful observation. We have removed redundant explanations and retained a single, unified statement in Section 4.3 emphasizing that CatBoost outperforms linear models due to its ability to capture nonlinear interactions across RIS dimensions. This revision improves narrative efficiency while preserving the core message.
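The point about nonlinear interactions can be illustrated with a minimal sketch. It does not use CatBoost or the authors' data: the two inputs, the grid, and the outcome are hypothetical, and a hand-written least-squares fit stands in for both model families. The outcome depends only on the product of the two inputs, so an additive linear fit explains none of the variance, while a fit that is allowed the interaction term explains all of it.

```python
from itertools import product

# Balanced 3x3 grid: both inputs are centered and mutually orthogonal.
GRID = [-1.0, 0.0, 1.0]
points = list(product(GRID, GRID))
x1 = [p[0] for p in points]
x2 = [p[1] for p in points]

# Hypothetical outcome driven purely by an interaction between the inputs.
y = [a * b for a, b in zip(x1, x2)]


def fit_predict(columns, target):
    """Least squares for centered, mutually orthogonal columns.

    Under that assumption the intercept is the target mean and each slope
    reduces to <column, target> / <column, column>.
    """
    intercept = sum(target) / len(target)
    slopes = [
        sum(c * t for c, t in zip(col, target)) / sum(c * c for c in col)
        for col in columns
    ]
    return [
        intercept + sum(s * col[i] for s, col in zip(slopes, columns))
        for i in range(len(target))
    ]


def r_squared(target, predicted):
    """Share of variance in target explained by predicted."""
    mean_t = sum(target) / len(target)
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    ss_res = sum((t - p) ** 2 for t, p in zip(target, predicted))
    return 1.0 - ss_res / ss_tot


# Additive model: intercept + x1 + x2.
r2_additive = r_squared(y, fit_predict([x1, x2], y))

# Same fit, but with the x1 * x2 interaction made available as a column.
interaction = [a * b for a, b in zip(x1, x2)]
r2_interaction = r_squared(y, fit_predict([x1, x2, interaction], y))

print(f"additive R^2:    {r2_additive:.3f}")
print(f"interaction R^2: {r2_interaction:.3f}")
```

The closed-form slopes are valid only because the grid makes every column centered and orthogonal to the others; real data would need a general least-squares solver. A tree ensemble would have to learn the interaction from the data rather than be given it as a column, so this sketch shows only why an additive specification can miss such structure.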
Comment 7: The SHAP analysis section includes valuable interpretations, yet the explanations of each curve are more detailed than needed. Focusing on the dominant patterns would make the section clearer. A concise example could be: “Firm is the primary driver; Knowledge and Gov act as modulators; and Econ displays threshold effects that complement system interpretation”.
Response: We thank the reviewer for this valuable suggestion. We have revised the SHAP analysis in Section 4.4 by reducing overly detailed curve-by-curve explanations and emphasizing the dominant patterns. The interpretation now clearly highlights Firm as the primary driver, Knowledge and Gov as modulators, and Econ as exhibiting threshold effects, thereby improving the clarity of the mechanism presentation.
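For readers unfamiliar with how such attributions arise, the following is a minimal sketch of exact Shapley values computed by enumerating feature subsets. It is not the SHAP library and not the authors' fitted model: `toy_model`, its coefficients, the 0.6 threshold, the all-zero baseline, and the example province are hypothetical, chosen only to mimic the described pattern of a dominant Firm term, Knowledge and Gov acting through interactions with Firm, and a threshold effect in Econ.

```python
from itertools import combinations
from math import factorial

FEATURES = ["Firm", "Knowledge", "Gov", "Econ"]


def toy_model(x):
    """Hypothetical innovation score; all coefficients are illustrative."""
    score = 2.0 * x["Firm"]                     # dominant direct effect
    score += 0.5 * x["Firm"] * x["Knowledge"]   # Knowledge modulates Firm
    score += 0.5 * x["Firm"] * x["Gov"]         # Gov modulates Firm
    score += 1.0 if x["Econ"] > 0.6 else 0.0    # threshold effect
    return score


def coalition_value(subset, x, baseline):
    """Model output when only the features in subset take their real values."""
    mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
    return toy_model(mixed)


def shapley_values(x, baseline):
    """Exact Shapley values: weighted marginal contributions over all subsets."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = set(subset)
                with_f = without_f | {f}
                total += weight * (
                    coalition_value(with_f, x, baseline)
                    - coalition_value(without_f, x, baseline)
                )
        phi[f] = total
    return phi


baseline = {f: 0.0 for f in FEATURES}
province = {"Firm": 1.0, "Knowledge": 0.8, "Gov": 0.4, "Econ": 0.7}
phi = shapley_values(province, baseline)

for name, contribution in phi.items():
    print(f"{name:10s} {contribution:+.3f}")
```

The attributions sum to the difference between the model output for the example province and for the baseline, and each interaction term is split equally between the two features involved. Subset enumeration grows exponentially with the number of features, so it is practical here only because there are four.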
Comment 8: The discussion section effectively interprets regional heterogeneity, but it would be strengthened by incorporating recent evidence showing that technology adoption and innovation depend on institutional, economic, and organizational contexts. At this point, it is appropriate to reference external studies that support this contextual interpretation. A relevant example is the work of Noroño Sánchez (2025), who demonstrates that technology perception and adoption vary across sectors, organizational cultures, and economic conditions. These findings align with the SHAP patterns observed in this study and help explain why provinces with similar inputs may display different innovation outcomes. Including a sentence such as the following would reinforce the argument: “Evidence shows that technology adoption does not depend solely on internal technical capacity; Noroño Sánchez (2025) reports that cultural, economic, and sectoral factors condition the assimilation of new technologies, which supports the differentiated RIS dynamics observed in this study”. DOI: https://doi.org/10.64923/ceniiac.e0002
Response: We sincerely thank the reviewer for this important recommendation. We have incorporated the study by Noroño Sánchez (2025) into the discussion section (Section 5.2) and explicitly linked its findings on cultural, economic, and sectoral conditioning of technology adoption to the heterogeneous SHAP patterns observed in our results. This addition strengthens the contextual interpretation of differentiated RIS dynamics and enhances the theoretical grounding of our discussion.
Comment 9: In the same discussion, concepts such as the “knowledge trap” and the substitution between government and firm contributions are repeated. Grouping these ideas into a single subsection would avoid redundancy. For example, the two paragraphs explaining Knowledge's modulation role could be merged to present the idea once in a clearer, stronger way.
Response: We appreciate the reviewer’s careful reading and helpful suggestion. We have reorganized the relevant content in Section 5 by merging the previously repetitive discussions of the “knowledge trap” and the substitution effects between government and firm contributions into a single unified subsection. This restructuring eliminates redundancy and presents these mechanisms in a clearer and more coherent manner.
Comment 10: The conclusion summarises the study effectively but repeats technical explanations already covered in the discussion. It would benefit from focusing on the distinctive value of the integrated CFA–CatBoost–SHAP framework. An example revision could be: “The integrated CFA–CatBoost–SHAP framework offers a replicable pathway to analyze latent structures and nonlinear mechanisms in regional innovation systems”.
Response: We thank the reviewer for this valuable comment. We have revised the conclusion (Section 6) to reduce repeated technical explanations and instead emphasize the distinctive methodological contribution of the integrated CFA–CatBoost–SHAP framework. The conclusion now highlights its replicability and value for analyzing latent structures and nonlinear mechanisms in regional innovation systems.
Comment 11: Overall, the manuscript offers a valuable contribution to the study of innovation systems through an integrated and interpretable modeling framework. Refining the structure, reducing redundancies, and incorporating contextual evidence—such as the findings of Noroño Sánchez (2025)—would substantially strengthen the clarity, coherence, and theoretical grounding of the work. With these adjustments, the manuscript would be well positioned for publication.
Response: We sincerely appreciate the reviewer’s positive evaluation and constructive guidance. Following all the above suggestions, we have refined the manuscript structure, reduced redundancies, strengthened contextual evidence, and further clarified the theoretical contributions. We believe these revisions have substantially improved the clarity, coherence, and academic robustness of the manuscript.
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
Dear Authors,
I would like to congratulate you on the quality of your manuscript and on the careful and rigorous way in which you have addressed all the reviewer comments. The revised version shows clear improvements in structure, coherence, writing quality, and theoretical articulation. The study now presents a solid methodological approach and a well-supported discussion with relevant implications for both research and practice.
Overall, the manuscript makes a valuable contribution to the literature in its field, and the current version reflects a high level of academic rigor and clarity. I appreciate the authors’ responsiveness and commitment to strengthening the manuscript. The paper is now well positioned for publication.