Article
Peer-Review Record

Big-Data-Assisted Urban Governance: A Machine-Learning-Based Data Record Standard Scoring Method

Systems 2025, 13(5), 320; https://doi.org/10.3390/systems13050320
by Zicheng Zhang 1,* and Tianshu Zhang 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 26 February 2025 / Revised: 10 April 2025 / Accepted: 24 April 2025 / Published: 26 April 2025
(This article belongs to the Topic Data Science and Intelligent Management)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This study utilizes government hotline data to develop a comprehensive scoring method for evaluating multidimensional data recording standards. The research holds significant academic value and practical implications. However, minor revisions are necessary before publication to enhance the paper’s completeness and clarity.

  1. Insufficient introduction to research background and significance: In the first chapter, the author provides a relatively brief discussion of the research background and significance. It is recommended to enrich this section to improve readers' understanding of the research problem.
  2. Emphasizing research innovation: The author should explicitly highlight the innovative contributions of this study, as this is crucial for publication. Comparing it with existing research can help underscore the methodological advantages, practical applications, or novel theoretical perspectives of this study.
  3. Providing data source links: It is suggested that the author include links to data sources where appropriate to enhance the study’s transparency and reproducibility.
  4. Reorganizing the research hypothesis section: Currently, the research hypotheses are placed within the research design chapter. It is recommended to integrate them with the literature review into a new chapter, making the paper’s structure more coherent.
  5. Contextualizing findings within China's reality: The author is encouraged to further interpret the study's findings in the context of China's specific circumstances, enhancing the research’s practical relevance and applicability.
  6. Addressing research limitations and future directions: The paper should include a discussion on research limitations and, in the conclusion, provide insights into potential future research directions to strengthen its academic rigor.
Comments on the Quality of English Language
  1. Language refinement: The paper’s language should be further polished to ensure precise, fluent, and academically appropriate expression.

Author Response

We sincerely appreciate the reviewers’ efforts. Our point-by-point responses to the reviewers’ comments are provided in the attached file for your reference.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper establishes a multidimensional data quality evaluation system based on machine learning to verify the impact of data recording standards on the work order completion rate. It aims to provide empirical support and practical guidance for improving the efficiency of government data management and public services. The research is innovative and necessary to some extent, but there is still room for improvement in how some details are presented, which would raise the quality of the paper.

 

2 Literature Review

  • The coverage of the literature review is insufficient. This part focuses on the application of government hotline data quality, data recording standards, and machine learning in urban governance; emphasizes the core role of government hotline data in public service and decision-making accuracy; introduces existing research on data classification, assignment accuracy, and task assignment models; and discusses the application of deep learning and text analysis techniques in data quality assessment. It points out that current research mainly focuses on data classification and lacks systematic quantitative evaluation of data recording standards. However, the review lacks a systematic synthesis of data quality assessment methods, the advantages and disadvantages of different models and methods are not described in enough detail, and the uniqueness and improvements of the method in this paper are therefore not highlighted. The six proposed evaluation indicators lack an in-depth theoretical basis and justification, and the review fails to systematically explain their rationality and authority in data quality evaluation.
  • It is suggested to add a systematic literature review that comprehensively surveys data quality evaluation methods and establishes a clearer theoretical framework; to expand the comparison of different data quality evaluation methods so as to highlight the innovation of this method in multidimensional quantitative evaluation; to further discuss the theoretical basis behind the six evaluation indicators and cite authoritative literature supporting the rationality and importance of their selection; and to describe research progress in related fields in detail, point out gaps and challenges, and clarify how the method of this paper differs from existing research.

 

3 Research Design

  • The calculation methods for some evaluation indicators in this chapter are not described in enough detail, which limits the reproducibility of the method. It is suggested to clarify how text similarity is calculated, supplement the formula and criteria for the address accuracy calculation, and explain how the accuracy and completeness of address information are quantified. The data sources, calculation logic, and possible sources of error for the six indicators should be explained in detail (see the sketch following this list).
  • This chapter does not detail pre-processing steps such as data cleaning, deduplication, and outlier handling. It is suggested to add the specific steps and criteria for data cleaning, deduplication, and outlier handling when describing the text processing flow.
  • This chapter does not describe the design and implementation of the weighted summation of indicators, nor does it explain the basis for determining the weights, which is not sufficiently justified. It is suggested to clarify the basis for weight setting (such as expert scoring, model training results, or data distribution analysis), explore other data fusion methods, and improve the rationality of the indicator integration (see the sketch following this list).
  • The description of model training parameters (such as learning rate, batch size, and number of training epochs) is insufficient, and the basis for their selection is not explained. It is suggested to state the model training parameters and optimization strategy, and to add a description of the hyperparameter tuning method used in model training.
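A minimal, hypothetical sketch of how the first and third points above could be made concrete; the sentence-embedding model, indicator names, and equal weights are assumptions for illustration, not the authors' actual pipeline:

```python
# Illustrative sketch only: adjacent-sentence similarity via sentence embeddings,
# plus a weighted total score. Model name and indicator weights are assumed.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # assumed model

def adjacent_sentence_similarity(sentences: list[str]) -> float:
    """Mean cosine similarity between each pair of adjacent sentences in a record."""
    if len(sentences) < 2:
        return 0.0
    emb = model.encode(sentences, normalize_embeddings=True)
    sims = [float(np.dot(emb[i], emb[i + 1])) for i in range(len(emb) - 1)]
    return float(np.mean(sims))

def total_score(indicators: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted summation of normalized indicator values; weights should sum to 1."""
    return sum(indicators[k] * weights[k] for k in weights)

# Hypothetical indicator names with equal weights (one possible weighting basis,
# which the review asks the authors to state and justify explicitly):
weights = {k: 1 / 6 for k in [
    "record_accuracy", "dispatch_accuracy", "address_accuracy",
    "adjacent_sentence_similarity", "full_text_similarity", "completeness"]}
```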

 

4 Experimental Results

  • This chapter constructs a multiple linear regression model and discusses the relationship between the total data-recording-standard score and the work order completion rate. However, the robustness of the model has not been tested. It is suggested to supplement the validation of the regression model, add tests of model fit and robustness, and thereby improve the reliability and explanatory power of the model results (see the sketch following this list).
  • This chapter does not analyze the reasons for the weak influence of "dispatch accuracy" and "address accuracy", nor does it show how each of the six indicators specifically contributes to improved data quality and work order completion. It is suggested to add an analysis of why "dispatch accuracy" and "address accuracy" play a weak role, and to analyze the specific impact of the six indicators on data quality and the work order completion rate in combination with practical application scenarios.
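For illustration, a minimal sketch of the kind of robustness checks the first point asks for, assuming an OLS model that relates the indicator scores to the work order completion rate (variable and column names are hypothetical, not taken from the paper):

```python
# Illustrative robustness checks for an OLS regression; not the authors' code.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.diagnostic import het_breuschpagan

def robustness_checks(df: pd.DataFrame, target: str, features: list[str]):
    X = sm.add_constant(df[features])
    y = df[target]
    model = sm.OLS(y, X).fit()

    # Multicollinearity: variance inflation factor for each regressor (incl. constant)
    vif = {col: variance_inflation_factor(X.values, i) for i, col in enumerate(X.columns)}

    # Heteroskedasticity: Breusch-Pagan test on the residuals
    _, bp_pvalue, _, _ = het_breuschpagan(model.resid, X)

    return model, vif, bp_pvalue
```

A fuller check might also report adjusted R², refit the model on subsamples or with robust (HC) standard errors, and compare coefficient stability.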

 

5 Conclusions and Recommendations

  • The research conclusions in this section are not deep enough. The theoretical analysis is not deep enough to explain the difference of the impact of data quality index on the completion rate of work order based on the literature. There is insufficient discussion on the rationality of the results, so it is suggested to summarize the research results, introduce the relevant theories of data quality, information management and urban governance, and conduct a theoretical analysis on the difference of the role of the six indicators.
  • The policy recommendations in this section are not practical enough, and the recommendations are too macro and lack details. It is suggested to refine policy recommendations, improve training programs, standardization guidelines and technical implementation paths, and improve the operability and implementation effect of recommendations.

Author Response

We sincerely appreciate the reviewers’ efforts. Our point-by-point responses to the reviewers’ comments are provided in the attached file for your reference.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The authors of this manuscript present a comprehensive analysis of over 300,000 data records of governmental hotline data. The aim of the analysis is to assess the impact of data quality on work order completion. The authors' conclusion is that data standards could help increase the rate of work order completion significantly.

The general finding that increased data quality helps with the handling of the work orders is not surprising at all. The authors' analyses add some scientific rigour to this. The manuscript is very well written, other studies are mentioned and the methodology is explained very well. However, there are several remarks and points to be improved that I want the authors to address before I can recommend publication of this work.

  • In Section 1, the authors state that they want to establish a scoring system. However, there are inconsistencies between the scoring model and the statistical findings (see below). Moreover, the scoring system is not analysed sufficiently well in this paper for it to become "established".
  • The authors cite several other studies, mostly from areas other than governmental hotline data. The authors conclude that there is a lack of scoring for government hotlines. However, they do not explain why hotlines in commercial settings should be considered so different; I would imagine standardised tools and scoring methods to exist there. The authors should comment on why these methods cannot be applied here.
  • Figure 5 lacks axis labels.
  • Tables 1 and 2 lack information about the version of ChatGPT used for the analysis.
  • In their analysis in Section 4, the authors present the outcomes of their algorithms and ML method pipelines as if these were absolutely correct. The authors should comment on the quality of the various methods used to calculate the indicators. What was the accuracy of the automated address recognition, the adjacent sentence similarity calculation, etc.? For the quantitative analysis, this adds a certain level of uncertainty to the conclusions. That is, the question remains how much of each value is the actual ground-truth indicator value and how much is due to limitations of the methods used. In particular, for the cases with an indicator equal to 0, did the authors check whether this was actually the case or an error of the algorithms/methods used?
  • The model in Section 4.3 shows no weights for the indicators' summation, despite the descriptive text stating a "weighted summation". 
  • In the equation in Section 4.4 there is no explanation of the coefficients b and the meaning of u (a generic formulation is given after this list for reference).
  • In Section 4.5 and Table 6 there is a negative correlation of Dispatch Accuracy and Record Accuracy with work order completion. This seems counter-intuitive. The authors should comment on this and give their interpretation.
  • Figure 12 shows significant differences for the various indicators. Doesn't this contradict the model for the total score that assumed equal weight for all indicators?
  • At the end of Section 4.5, it says "improving semantic clarity and structure within data records can enhance the efficiency of work order completion". That is a strong conclusion, stated too briefly, to draw from the statistical(!) correlations.
  • In Table 10 the authors conclude, from the misinterpretation of the text by their machine learning methods, that this affected the work order completion. Does this mean that the handling of calls is done by machine learning methods? Wouldn't a human be able to identify the actual meaning despite the wrong classification by the machine learning? This again supports my previous comment that the impact of the algorithms used to analyse the data records on the conclusions remains unknown.
  • Below Table 10, it states that address accuracy was important. But isn't this interpretation in contradiction to the outcomes shown above? Address accuracy was identified as having minimal impact, and dispatch accuracy showed a negative correlation.
  • In Section 5.1 it says that the analyses revealed a significant positive correlation. This is not the case for all indicators!
  • In their recommendation (1) in Section 5.2 the authors state that training is of importance. This conclusion is not backed by the findings here. The authors only assessed the performance of their machine learning methods in identifying words and meanings correctly. The conclusion that training is needed for the humans doing this job cannot be drawn directly from that finding.
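For reference regarding the comment on Section 4.4: a multiple linear regression of the kind described is conventionally written as follows (standard notation, not reconstructed from the paper's own equation), and the manuscript should define its symbols in the same explicit way:

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k + u,$$

where, in this context, $Y$ would be the work order completion rate, $X_1, \dots, X_k$ the indicator scores, $b_0, \dots, b_k$ the estimated regression coefficients, and $u$ the error (disturbance) term.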

These points need major revision of several parts of the manuscript.

Author Response

We sincerely appreciate the reviewers’ efforts. Our point-by-point responses to the reviewers’ comments are provided in the attached file for your reference.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

Dear Authors,

I would like to reveal my opinion that you have done a thorough job, which you have presented more or less adequately in your article. I would just like to ask for a few minor corrections.

  1. The captions to Figures 2, 6 and 11 should be expanded with a few informative thoughts to make the message easier to understand.

  2. In Section 4.5, the terms correlation and significance are not used clearly, specifically where the numbers 0.9 and 0.22 appear in the text. Rephrase the sentences so that it is clear that 0.9 is the correlation and 0.22 is the significance. Moreover, the mention of a strong linear correlation of 0.9 requires that the corresponding significance level also be stated in the text. I note that Table 6 does not include the above-mentioned figure of 0.9 as the correlation between the 'Add. Sen. Sim.' and 'Full. Text Sim' parameters. If possible, please also add the correlation matrix to the article, since numbers from it are already mentioned in the text, and indicate significant entries in the matrix, for example with stars or boldface (an illustrative sketch of how such a matrix could be computed is given below).
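By way of illustration only (this is not the authors' code), a correlation matrix with significance markers could be produced along the following lines; the column names are hypothetical:

```python
# Illustrative sketch: Pearson correlation matrix with significance stars.
# Column names (the indicators plus completion rate) are hypothetical.
import pandas as pd
from scipy.stats import pearsonr

def starred_correlation_matrix(df: pd.DataFrame) -> pd.DataFrame:
    cols = df.columns
    out = pd.DataFrame(index=cols, columns=cols, dtype=object)
    for a in cols:
        for b in cols:
            r, p = pearsonr(df[a], df[b])
            stars = "***" if p < 0.001 else "**" if p < 0.01 else "*" if p < 0.05 else ""
            out.loc[a, b] = f"{r:.2f}{stars}"
    return out

# Example usage with hypothetical columns:
# matrix = starred_correlation_matrix(df[["adj_sen_sim", "full_text_sim", "completion_rate"]])
```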

Author Response

We sincerely appreciate the reviewers’ efforts. Our point-by-point responses to the reviewers’ comments are provided in the attached file for your reference.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

This study has been revised relative to the previous version, but some problems remain in parts of the paper, and the issues raised in the previous round have not been fully resolved.

 

Abstract

The Abstract introduces the main problems and main results of the research, but it lacks the specific research background and application requirements; it is suggested to add these.

 

1 Introduction

(1) The Introduction provides some academic background for the research, introduces the importance of government hotline data, and points out research gaps. However, this part does not fully demonstrate the necessity and urgency of the research problem and lacks support from policies, cases, and data. The comparison with existing research is not clear, and the description of the application background is broad, failing to reflect the value of the research for real governance systems. It is recommended to add relevant data or cases illustrating the serious problems with government hotline data quality, to reflect the urgency of the problem, and to cite government policies, industry reports, or other evidence of the actual need for data quality scoring.

(2) The distinction between research objectives and contributions is not clear in this part. The last paragraph refers to "exploring and ensuring the standardization of government hotline data": is this a goal or a method? Please specify clearly. It is suggested to further clarify the research objectives to make them more specific, to clearly distinguish between "research objectives" and "research methods", and to detail how this paper addresses the existing problems.

(3) This part lacks cross-field comparison and therefore cannot show the uniqueness of the research. The study is compared only with government hotline data research and does not cover other approaches to public data governance. It is recommended to compare practices in other fields in order to explain the uniqueness of this study and the particularity of scoring government hotline data quality.

 

2 Literature Review

  • In the Literature Review section, although relevant studies have been added and a number of research cases are mentioned, there is no in-depth synthesis and classification of the different data quality assessment methods; the review is not systematic enough and lacks a clear and complete theoretical framework. It is suggested to organize the literature review systematically and form a clearer theoretical framework.
  • Some data quality assessment methods are listed, but their advantages and disadvantages, applicable scenarios, and a comparative analysis are missing, and the advantages of BERT and other methods for text processing are not clearly demonstrated. As a result, the uniqueness and innovation of this method cannot be highlighted. It is suggested to expand the comparison of different data quality assessment methods to highlight the innovation of this method.
  • The core evaluation dimensions proposed in this paper cite only some literature in Section 2.2 and do not cite authoritative literature that fully demonstrates the rationality and importance of these indicators, so their theoretical support is still weak. The six evaluation indicators lack an in-depth theoretical basis and justification, and the paper fails to systematically explain their rationality and authority in data quality evaluation. It is suggested to further explore the theoretical basis of the six evaluation indicators and strengthen their scientific grounding and authority.
  • The paper mentions that existing studies "mainly focus on data classification and lack systematic quantitative evaluation of data recording standards", but the description of the research's innovations is still vague. The paper states that its innovation lies mainly in the use of BERT for text processing, yet BERT is not compared with other methods and its advantages are not highlighted. It is suggested to clearly explain the reasons for choosing BERT over other methods and to fully demonstrate its applicability and superiority for government hotline data analysis.

 

3 Research Design

  • This part briefly mentions de-duplication and text segmentation, but it does not describe the specific data preprocessing steps in detail, such as how missing values and outliers are handled and whether any denoising technique is applied. It is recommended to add a detailed description of the data cleaning and processing workflow (a sketch of a typical workflow follows this list).
  • Although this part mentions the use of multidimensional evaluation indicators, it does not specify the basis and design details of the weighting and does not clearly explain how the weight of each indicator is determined. It is recommended to state the basis for determining the weights.
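A minimal sketch of a typical cleaning workflow of the kind requested in the first point above (the column names and thresholds are illustrative assumptions, not the authors' procedure):

```python
# Illustrative cleaning workflow for hotline work-order records.
# Column names and thresholds are hypothetical assumptions.
import pandas as pd

def clean_records(df: pd.DataFrame) -> pd.DataFrame:
    # Remove exact duplicates on the record identifier
    df = df.drop_duplicates(subset=["record_id"])
    # Drop records missing core free-text or dispatch fields rather than imputing them
    df = df.dropna(subset=["description", "address", "dispatch_unit"])
    # Flag outliers in handling time using the interquartile-range rule
    q1, q3 = df["handling_days"].quantile([0.25, 0.75])
    iqr = q3 - q1
    within = df["handling_days"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[within].reset_index(drop=True)
```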

It is recommended that the manuscript be accepted with modification.

 

Author Response

We sincerely thank the reviewer for their diligent efforts and valuable feedback. We have carefully reviewed your comments and provided detailed, point-by-point responses to each concern raised. We kindly invite you to review our revisions and further evaluate our manuscript. Please see the attachment for detailed responses.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

I would like to thank the authors for following my suggestions and addressing my previous concerns and remarks. The revised version of the manuscript is of much better quality.

Author Response

Dear Reviewer,

Thank you very much for your valuable comments and suggestions. We sincerely appreciate the time and effort you have taken to review our manuscript. We are glad to know that the revised version meets your expectations.
