Article
Peer-Review Record

How Do Rural Households Achieve Poverty Alleviation? Identification and Characterization of Development Pathways Using Explainable Machine Learning

Sustainability 2025, 17(21), 9704; https://doi.org/10.3390/su17219704
by Shoujie Jia 1,2,3, Qiong Li 3, Wenji Zhao 1,2,* and Yanhui Wang 1,2
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4:
Submission received: 1 October 2025 / Revised: 22 October 2025 / Accepted: 27 October 2025 / Published: 31 October 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The introduction clearly situates the study in the broader literature on poverty dynamics, sustainable livelihoods, and machine learning applications.

The arguments are presented in a coherent and well-structured manner, showing a logical flow from the empirical results to their theoretical implications. The discussion effectively integrates the findings with both the conceptual framework proposed in the introduction and relevant international literature on poverty dynamics and sustainable livelihoods. The authors critically interpret their results, highlighting policy relevance. Comparative references to similar approaches in other developing contexts demonstrate balance and awareness of global perspectives.

The empirical results are presented with clarity and precision. The structure of the Results section follows a logical progression—from model validation and feature importance to cluster analysis and spatial visualization—which helps readers understand the methodological flow and its outcomes. The use of figures and visual aids (heatmaps, Sankey diagrams) is particularly effective in communicating complex multidimensional relationships in an accessible way.

Each visualization is accompanied by adequate explanation and interpretation, linking quantitative findings to substantive insights about household poverty alleviation pathways.

Extensive and current references.

Conclusions logically derive from the empirical and interpretive analysis.

 

Comments on the Quality of English Language

The English could be improved slightly to enhance flow and reduce redundancy (some long, complex sentences).

 

Author Response

Comment 07:
The English could be improved slightly to enhance flow and reduce redundancy (some long, complex sentences).

Response 07:
Thank you for this helpful comment. We agree that improving the overall language flow and reducing redundancy would enhance readability. In the revised manuscript, we carefully edited the English throughout to simplify long or overly complex sentences and ensure smoother transitions. Specifically, we:
(1) Refined the Introduction to improve logical flow and clarity by structuring it around the research gap, theoretical framework, and study contributions;
(2) Revised the Abstract, Discussion, and Conclusions to strengthen the connection between findings, mechanisms, and policy implications while removing unnecessary methodological repetition; and
(3) Reviewed the entire manuscript to shorten and restructure sentences where needed for conciseness and clarity.
All these linguistic and stylistic revisions have been implemented across the manuscript and are marked in red in the revised version.

 

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors,

I congratulate you for the choice of the research topic. It is important both for the dynamics of scientific research and for the practical use in the elaboration of policies for poverty reduction in rural areas. However, I consider that some completions and clarifications are necessary.

The abstract is well written and reflects the content of the paper, but it could benefit from a more explicit clarification of the research question and the original contribution.

The context in which the research was done is well presented (lines 34–83) through a comprehensive overview of the studies on poverty and transition to sustainable development, and the integration of capability theory (Sen) and concepts of vulnerability and sustainable livelihood is welcome.

The aim of the paper is clearly outlined in lines 164–173, where it is stated that the paper tries to identify the poverty reduction pathways at rural household level. Maybe it would be good to specify that the three contributions also represent the research objectives.

Research methodology
The paper uses a complex and solid methodology, integrating XGBoost and SHAP algorithms to characterize the transition states and processes (lines 255–388), but the description of the model parameters could be condensed for clarity (lines 275–278).

Research results are detailed and well documented (lines 398–723), offering a rigorous analysis of the exit from poverty pathways.

Novelty or strong points of the research
I think the research is innovative by integrating a dynamic approach to poverty reduction and by using AI model interpretation techniques for public policies. The practical relevance is emphasized and well argued (lines 26–29, 707–723).

Limits of the research and recommendations for future research
I think it would be useful to include a few paragraphs about the research limitations in the results and implications section, moving also some parts from the conclusions (e.g., potential data biases or limits of the XGBoost model), and to propose formally future research directions such as applying the method in other regions or integrating other data sources like qualitative ones.

The conclusions are consistent with the objectives (the stated contributions) and the data, while unjustified generalizations are avoided. Maybe it would be more useful to focus more on the results and less on the methodology.

The bibliography is up to date and has a high proportion of sources published in the last 5 years. I consider the relevance and balance of references are appropriate.

I congratulate you for the effort and I consider that by addressing these recommendations, the manuscript can become more relevant and informative for the readers, increasing its chances for publication.

Author Response

Comment 1:
The abstract is well written and reflects the content of the paper, but it could benefit from a more explicit clarification of the research question and the original contribution.

Response 1:
Thank you for this insightful comment. We agree that the original abstract did not make the research question and the study’s original contribution sufficiently explicit. Following your suggestion, and in alignment with your later comments on the Discussion (Comment 7) and Conclusions (Comment 8), we have revised the abstract to clearly articulate the core research questions and to highlight the study’s conceptual, methodological, and empirical contributions.

The revised abstract appears on Page 1, Lines 11–34 of the revised manuscript, with all modifications marked in red.

 

 

Comment 3:
The aim of the paper is clearly outlined in lines 164–173, where it is stated that the paper tries to identify the poverty reduction pathways at rural household level. Maybe it would be good to specify that the three contributions also represent the research objectives.
Response 3:
Thank you for this valuable suggestion. We agree that the original Introduction listed the study’s contributions but did not explicitly present the corresponding research objectives, which may have reduced the clarity of the paper’s aim. Based on your comment, we revised the relevant text to make the research objectives explicit and to clarify that these objectives directly correspond to the three stated contributions.

In addition, to make the structure of the Introduction clearer, we adjusted the overall organization of this section by first summarizing the existing research gaps and then explaining how explainable machine learning helps address them, leading naturally to the statement of objectives and contributions.

The corresponding revisions have been made in the Introduction section, specifically on Page 3 (Lines 131–140) and Page 4 (Lines 180–183) of the revised manuscript. All modifications are highlighted in red for easy reference.

 

 

Comment 4:
Research methodology — The paper uses a complex and solid methodology, integrating XGBoost and SHAP algorithms to characterize the transition states and processes (lines 255–388), but the description of the model parameters could be condensed for clarity (lines 275–278).

Response 4:
Thank you for this helpful comment. We agree that the description of model parameters in the original version was unnecessarily detailed and could be streamlined for better readability. In the revised manuscript, we condensed the parameter section in the main text and clarified the hyperparameter tuning process, while providing the full parameter configurations in the appendix for completeness. Specifically, the text now explains that hyperparameters were tuned via stratified cross-validation with early stopping, using Macro-F1 on validation folds for model selection, and that the complete configurations and search ranges for XGBoost, Random Forest, and Decision Tree models are listed in Appendix Table B. These revisions appear in the Methods section on Page 9 (Lines 340–343) and are marked in red in the revised manuscript.
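As an illustration of the selection metric named in this response, the sketch below computes Macro-F1 in plain Python. It is not the authors' code; the function name `macro_f1` and the toy labels are assumptions made for the example. Macro-F1 averages the per-class F1 scores without weighting by class size, so a model that ignores a minority class scores poorly even when its accuracy looks high.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced labels: 8 households in class 0, 2 in class 1.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10  # a degenerate model that always predicts class 0

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(accuracy)         # 0.8
print(round(score, 4))  # 0.4444
```

The gap between 0.8 and 0.4444 shows why a macro-averaged score is a reasonable choice for model selection when the outcome classes are unevenly sized.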

 

Comment 7:
Limits of the research and recommendations for future research — I think it would be useful to include a few paragraphs about the research limitations in the results and implications section, moving also some parts from the conclusions (e.g., potential data biases or limits of the XGBoost model), and to propose formally future research directions such as applying the method in other regions or integrating other data sources like qualitative ones.

 

Response 7:
Thank you for this constructive suggestion. We agree that the manuscript would benefit from a clearer and more consolidated discussion of limitations and future directions. Accordingly, we have added a new subsection in the Discussion (Section 4.3: “Research Limitations and Future Directions”) and relocated relevant content from the Conclusions (e.g., remarks on regional specificity and model-related limits) into this section to improve coherence. The new subsection explicitly addresses external validity constraints, methodological limits, and extensions for future work (including broader geographic applications and richer data integration). All changes (Page 29, Lines 1035–1061) are marked in red in the revised manuscript.

 

 

Comment 8:
The conclusions are consistent with the objectives (the stated contributions) and the data, while unjustified generalizations are avoided. Maybe it would be more useful to focus more on the results and less on the methodology.

 

Response 8:
Thank you for this constructive suggestion. We agree that the Conclusions should prioritize the substantive results and their policy implications rather than revisiting methodological details. Accordingly, we streamlined the methodological recap, reorganized the summary around the main empirical findings, and made the policy implications explicit for different household types and regional contexts. The revised text appears in the Conclusions on Page 30, Lines 1090–1115, with all changes marked in red.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

Comment-01: The literature review section's review of "The Application of Machine Learning in Poverty Research" is brief. It is suggested to expand this section to more systematically sort out the advantages and disadvantages of different ML methods and add the application of interpretable machine learning in different fields, such as DOI: 10.3934/DSFE.2025002. And more prominently emphasize the superiority of the XML method adopted in this paper compared with these methods.

 

Comment-02: The paper mentioned that the data was cleaned and processed (such as handling missing values and standardization), but did not elaborate on the specific methods, such as whether the missing values were filled in with the median or deleted? How are categorical variables encoded? It is suggested to supplement these details to ensure the reproducibility of the research.

 

Comment-03: Although some hyperparameters of XGBoost, such as learning rate and max depth, were mentioned, it is suggested to supplement other key settings and explain the process of hyperparameter tuning.

 

Comment-04: Apart from the XGBoost model, has the author considered using any other machine learning models? For example, random forests, decision trees, etc.

 

Comment-05: Using SHAP values for cluster analysis is a major innovation in this paper, but it is necessary to more fully demonstrate the rationality and advantages of this approach.

 

Comment-06: Apart from SHAP interpretable analysis, can other interpretable analysis methods, such as LIME, also be used for the analysis in this article? It is suggested that the author provide additional explanations for the reasons why no other explainable analytical methods were used.

Author Response

Comment 01:
The literature review section's review of "The Application of Machine Learning in Poverty Research" is brief. It is suggested to expand this section to more systematically sort out the advantages and disadvantages of different ML methods and add the application of interpretable machine learning in different fields, such as DOI: 10.3934/DSFE.2025002. And more prominently emphasize the superiority of the XML method adopted in this paper compared with these methods.

Response 01:
Thank you for this valuable suggestion. We agree that the literature review was too brief. Accordingly, we expanded the section on “Machine Learning in Poverty Research” to systematically compare the strengths and weaknesses of major ML methods and to add cross-domain evidence of interpretable ML applications (including the study suggested by the reviewer, DOI: 10.3934/DSFE.2025002). We also clarified the advantages of our X-ML framework, emphasizing how it extends beyond local explanations by clustering SHAP attribution vectors to reveal mechanism-based poverty pathways. These revisions are included in the Literature Review section (Page 4, Lines 141–179) and are marked in red in the revised manuscript.

 

Comment 02:
The paper mentioned that the data was cleaned and processed (such as handling missing values and standardization), but did not elaborate on the specific methods, such as whether the missing values were filled in with the median or deleted? How are categorical variables encoded? It is suggested to supplement these details to ensure the reproducibility of the research.

Response 02:
Thank you for this important comment. We agree that additional details on data preprocessing are necessary for reproducibility. Accordingly, we added a description of the encoding and standardization procedures used for different variable types. Specifically, we clarified that categorical variables were encoded as discrete numerical values (0/1 for binary variables and integer mappings for multi-category variables) and that all indicators were standardized or discretized to ensure comparability across dimensions. These details have been added to the Methods section on Page 8, Lines 303–308, with changes marked in red in the revised manuscript.
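The encoding scheme described in this response can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' preprocessing code: the helper names, the category order, and the sample values are all hypothetical, and z-score scaling is only one of the possible readings of "standardized".

```python
from statistics import mean, pstdev


def encode_binary(values, positive):
    """Map a two-valued variable to 0/1, with `positive` encoded as 1."""
    return [1 if v == positive else 0 for v in values]


def encode_categories(values, order):
    """Map a multi-category variable to integers following an explicit order."""
    mapping = {category: index for index, category in enumerate(order)}
    return [mapping[v] for v in values]


def standardize(values):
    """Z-score a numeric indicator (population standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - mu) / sigma for v in values]


# Hypothetical household records, invented for this example.
has_insurance = encode_binary(["yes", "no", "yes"], positive="yes")
education = encode_categories(
    ["primary", "none", "secondary"],
    order=["none", "primary", "secondary"],
)
income_z = standardize([1.0, 2.0, 3.0])

print(has_insurance)  # [1, 0, 1]
print(education)      # [1, 0, 2]
```

Passing the category order explicitly keeps the integer mapping reproducible across runs, which is the property the reviewer's reproducibility concern turns on.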

 

Comment 03:
Although some hyperparameters of XGBoost, such as learning rate and max depth, were mentioned, it is suggested to supplement other key settings and explain the process of hyperparameter tuning.

Response 03:
Thank you for this helpful suggestion. We agree that the original version did not clearly explain the hyperparameter tuning process. In the revised manuscript, we added a concise description of the tuning procedure in the Methods section (Page 9, Lines 340–343), specifying that hyperparameters were tuned via stratified cross-validation with early stopping, using Macro-F1 on validation folds for model selection. We also moved the full configurations and search ranges for XGBoost, Random Forest, and Decision Tree models to Appendix Table B, ensuring transparency and reproducibility. A comparative analysis of model performance is summarized in Section 3.1. All revisions are marked in red in the revised manuscript.
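To make the "stratified" part of the tuning procedure concrete, the sketch below assigns samples to folds so that every fold keeps roughly the same class proportions as the full data set. It is a simplified stand-in written for this note, not the authors' pipeline: the function name `stratified_folds` is hypothetical, no shuffling is applied, and early stopping is not shown.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds, round-robin within each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        # Dealing each class out in turn keeps class shares similar per fold.
        for position, index in enumerate(by_class[label]):
            folds[position % k].append(index)
    return folds


# Hypothetical labels: six samples of class 0 and three of class 1.
labels = [0] * 6 + [1] * 3
folds = stratified_folds(labels, k=3)
print(folds)  # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

Each fold here holds two class-0 samples and one class-1 sample, so a validation score computed on any fold reflects both classes. A production setup would normally shuffle within each class before dealing indices out.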



Comment 04:
Apart from the XGBoost model, has the author considered using any other machine learning models? For example, random forests, decision trees, etc.

Response 04:
Thank you for this insightful suggestion. We agree that including additional models would strengthen the robustness of the analysis. Accordingly, we implemented Random Forest and Decision Tree models as comparative baselines to benchmark the performance of XGBoost. The training and tuning procedures for all models are described in the Methods section (Page 9, Lines 340–343), and the corresponding hyperparameter configurations are provided in Appendix Table B. All related additions are marked in red in the revised manuscript.


Comment 05:
Using SHAP values for cluster analysis is a major innovation in this paper, but it is necessary to more fully demonstrate the rationality and advantages of this approach.

Response 05:
Thank you for this constructive comment. We agree that the rationale for using SHAP values as the basis for clustering should be made clearer. In the revised manuscript, we added a concise explanation in the Introduction (Page 4, Lines 171–179) to clarify that clustering based on SHAP values (feature attribution vectors) groups households according to similar mechanisms of feature influence, rather than raw feature levels. This approach enhances comparability across heterogeneous indicators and provides more interpretable, mechanism-oriented insights into transitions of poverty states and development pathways. The added content is marked in red in the revised manuscript.
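The idea of grouping households by attribution vectors rather than by raw feature levels can be illustrated with a small clustering sketch. The record does not say which clustering algorithm the authors used, so k-means is an assumption here, and the attribution values and feature names are invented for the example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k vectors."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical per-household attribution vectors.
# Columns (assumed names): health_burden, education, income.
attributions = [
    [0.9, 0.1, 0.0],  # prediction driven mainly by health burden
    [0.0, 0.1, 0.8],  # prediction driven mainly by income
    [0.8, 0.0, 0.1],
    [0.1, 0.0, 0.9],
]

assignment, centroids = kmeans(attributions, k=2)
print(assignment)  # [0, 1, 0, 1]
```

Because every column is an attribution expressed in the same units as the model output, distances between rows compare like with like, which is the comparability argument made in the response.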

 

Comment 06:
Apart from SHAP interpretable analysis, can other interpretable analysis methods, such as LIME, also be used for the analysis in this article? It is suggested that the author provide additional explanations for the reasons why no other explainable analytical methods were used.

Response 06:
Thank you for this insightful comment. We agree that it is important to clarify why SHAP was selected over other interpretable methods such as LIME. In the revised manuscript, we added a comparative explanation in the Introduction to discuss the differences between LIME and SHAP. Specifically, we note that LIME approximates complex decision boundaries with local linear surrogates, but its explanations are highly sensitive to sampling strategies and neighborhood definitions, which can limit robustness in multi-dimensional socioeconomic data. In contrast, SHAP provides theoretically grounded, globally consistent feature attributions based on cooperative game theory and offers exact solutions for tree-based models (TreeSHAP), which aligns well with our XGBoost framework. These clarifications are included on Page 4, Lines 164–171 and are marked in red in the revised manuscript.
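The game-theoretic basis mentioned in this response can be shown with a brute-force Shapley computation on a toy model. This sketch is an illustration only and is not TreeSHAP: it enumerates every feature ordering, which is feasible only for a handful of features, and it replaces "absent" features with a single baseline value, a simplifying assumption. The model, instance, and baseline are invented for the example.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values by averaging marginal contributions over orderings."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for ordering in orderings:
        current = list(baseline)
        previous_output = model(current)
        for feature in ordering:
            # Switch this feature from its baseline value to its actual value.
            current[feature] = instance[feature]
            output = model(current)
            totals[feature] += output - previous_output
            previous_output = output
    return [total / len(orderings) for total in totals]


def toy_model(x):
    # Additive terms plus an interaction between features 0 and 1.
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[1] + 0.5 * x[2]


instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)  # [3.5, 2.5, 0.5]
```

The attributions sum to `toy_model(instance) - toy_model(baseline)`, which is 6.5, and the interaction term is split equally between the two features involved. Properties of this kind are what distinguish Shapley-based attributions from a locally fitted linear surrogate.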

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

The review comments for SUS-3935351

Generally speaking, the paper studies poverty alleviation in rural China. However, it does not reveal the impact factors through strict model analysis, nor the core mechanisms of targeted poverty alleviation in China. This is not a new research question, and the paper offers no new insights for the literature in this area. I have some major concerns about the manuscript.

  • The abstract is too simple and does not summarize the findings and innovations of the paper. We could not find the new findings or the insights they offer, and the core mechanism of how poverty is alleviated is not revealed in the abstract.
  • The authors have done a good literature summary and review at different levels. This should be commended.
  • The Introduction is too long and should be simplified. The authors should put forward the research gap and then introduce their innovations and the possible contributions to the literature and to practice.
  • The authors have not constructed a theoretical framework for the impact factors of household poverty alleviation or for the possible evolution pathways of poverty alleviation, but move directly into the data and methodology. At the least, a conceptual framework that guides the contents of the paper must be constructed.
  • The authors have constructed an indicator system for household poverty alleviation and development pathways across 7 dimensions. I think this should be commended and has some practical meaning.
  • The authors have used a new methodology to present the spatial distribution of household poverty alleviation and the evolution pathways in the research area. This should also be commended, and the figures in the paper are clear and good.
  • The authors have discussed the different development pathways for different types of household alleviation areas. However, the paper does not state its research limitations, nor does it put forward detailed policy implications for different types of areas and different types of households.

 

All in all, the paper deals with the problem of rural poverty alleviation and reveals some impact factors and evolution pathways of poverty alleviation. I suggest the authors construct a research diagram or theoretical framework, which would make the paper more complete and readable. The analysis and methodology are in-depth and the discussion is sufficient, but the conclusions and policy implications are not. Because the policy implications and the theoretical contribution are insufficient, I give the authors the opportunity to revise the paper at the current stage.

Author Response

Comment 01:
The abstract is too simple and does not summarize the findings and innovations of the paper. We could not find the new findings or the insights they offer, and the core mechanism of how poverty is alleviated is not revealed in the abstract.

Response 01:
Thank you for this constructive comment. We agree that the original abstract was overly general and did not sufficiently summarize the main findings, innovations, and mechanisms of poverty alleviation. Accordingly, we completely rewrote the abstract to explicitly present the research questions, methodological innovation, key empirical findings, and policy implications. The revised abstract now highlights the core mechanisms of poverty alleviation, the role of initial household typologies and transition processes, and the spatial heterogeneity of development pathways. These revisions appear on Page 1, Lines 11–34 of the revised manuscript and are marked in red.

 

 

Comment 03:
The Introduction is too long and should be simplified. The authors should put forward the research gap and then introduce their innovations and the possible contributions to the literature and to practice.

Response 03:
Thank you for this constructive suggestion. We agree that the original Introduction was overly lengthy and could be more focused. Following your advice, we restructured and condensed the Introduction to present a clearer logical flow: first outlining the research background, then summarizing the key research gaps, and finally introducing the methodological innovation, research objectives, and practical contributions of this study. The revised version highlights the main bottlenecks in existing poverty research, explains the practical advantages of explainable machine learning for addressing these gaps, and explicitly connects the study’s innovations to both academic and policy relevance. These revisions appear in the Introduction on Pages 3–5 and are marked in red in the revised manuscript.

 

 

Comment 04:
The authors haven’t constructed the theoretical framework regarding the impact factors of household poverty alleviation and the possible evolution pathways, but directly enter into data and methodology. At least a conceptual framework that directs the paper must be constructed.

Response 04:
We agree. In the revised manuscript we explicitly added a conceptual/theoretical framework and linked it to our empirical design. Specifically, Section 2.1 now defines states and processes, introduces the pathway concept, and proposes two testable hypotheses—H1 (synergy between burden-reduction and livelihood-development mechanisms) and H2 (spatial moderation of pathways). Section 2.3 then operationalizes these constructs by mapping each hypothesis to the indicator system (initial multidimensional poverty profile and subsequent livelihood dynamics) to guide variable selection and analysis. Finally, Section 4.1 (Discussion) synthesizes the empirical results to articulate the mechanisms and evolution pathways, providing a unifying framework (Figure 18) that ties initial conditions, process transitions, and spatial contexts to stable poverty alleviation. All additions are marked in red in Sections 2.1, 2.3, and 4.1 of the revised manuscript.

 

 

Comment 07:
The authors have discussed the different development pathways for different types of household alleviation areas. However, the paper does not state its research limitations, nor does it put forward detailed policy implications for different types of areas and different types of households.

Response 07:
Thank you for this valuable suggestion. We agree that the initial version lacked a clear statement of research limitations and more detailed policy implications. In the revised manuscript, we added a new subsection, “4.3 Research Limitations and Future Directions,” to explicitly discuss the study’s scope, data constraints, and methodological limitations, and to outline directions for future research. In addition, we expanded the policy discussion in Sections 4.1 and 4.2, integrating area-specific and household-type policy implications—for example, differentiated strategies for mountainous, agricultural, and peri-urban regions, and tailored measures for households facing distinct burdens or development potentials. These revisions together provide a more comprehensive understanding of how findings translate into targeted, practical policy actions. All changes are marked in red in the revised manuscript.

 

Author Response File: Author Response.docx

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

No comments at this time.
