Machine Learning Applied to Improve Prevention of, Response to, and Understanding of Violence Against Women
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Detailed Comments:
- Several paragraphs are just 1-2 sentences long and should be combined into larger paragraphs (e.g., lines 188-89, 208-214).
- Section 4.3 does not describe variable selection in the sense that I understand the term. I suggest that you either rename this section or merge it with section 4.2.
- Section 5 is largely redundant. This is elementary textbook material on machine learning models and does not need to be described in a research paper. I suggest adding a new subsection to Section 4 (this is methodology, anyway) that simply states which models were used. State that you used random forest and boosted trees (with citations), name the specific implementation used and the hyperparameter settings, and explain how those hyperparameters were set (default values or some tuning). Talk of the Gini index and entropy is beyond redundant in this context.
- The beginning of the results section (lines 323-325) brings up additional methodology (chi-squared tests, ANOVA and LASSO). This should be in the methodology section. It should also be explained more, and by that I mean, please explain how you implemented these methods, not how they work. These are all methods that can be used for feature selection, but it is not clear in what combination they were used for this work. Note that since these are all heuristics for the purpose of feature selection, claiming that you have an “optimal set of features” is too strong (line 326).
- I find the Random Forest Architecture section to be quite odd (section 6.1). I have never seen RF described in this manner. It seems to draw on traditional representations for neural networks to describe RF, which I simply do not think is useful (or correct). The same comment holds for the description of GB later. In any case, there is simply no need for this. This is a results section. Stick to results (not methodology).
- The results for RF and GB should be combined into a single section (Table 5 contains results, so move it to the results section), with the performance of both models evaluated consistently. For example, you report the F1 score for GB but not for RF. The same set of measures should be used for both models. In any case, the RF makes 1 error on the test data and the GB makes 4 errors. Reporting the confusion matrix as a heatmap is unnecessary; just report the simple numbers. Drawing the ROC curve seems like overkill for models that are already nearly perfect, which can be observed from a simple error count. In this case, the other numbers do not add any insights. The key observation is that the error rate is very low for both models, and extremely low for RF. Any other attempted observations are suspect because your test data is very small (and hence the estimate of the error, AUC, F1, etc. is highly variable).
- I suspect there is an error in Table 5. The reported results are consistent with other things you report for the RF, but not for the GB. I suspect the results for RF accidentally got put in both sets of columns.
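The point above about small test sets can be made concrete with a confidence interval on the error rate. The sketch below is illustrative only: the test-set size `n_test = 100` is an assumed placeholder (the actual size is not stated in this report), while the error counts of 1 (RF) and 4 (GB) are the ones quoted above. It uses the Wilson score interval, one common choice for a binomial proportion.

```python
import math


def wilson_interval(errors: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly a 95% level)."""
    if n <= 0 or not 0 <= errors <= n:
        raise ValueError("need n > 0 and 0 <= errors <= n")
    p_hat = errors / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# ASSUMPTION: hypothetical test-set size, chosen only for illustration.
n_test = 100

# Error counts as quoted in the review: RF makes 1 error, GB makes 4.
for model, errors in {"RF": 1, "GB": 4}.items():
    low, high = wilson_interval(errors, n_test)
    print(f"{model}: error rate {errors / n_test:.3f}, interval [{low:.3f}, {high:.3f}]")
```

Under this assumed test-set size the two intervals overlap, which is why a difference of three errors is weak evidence that one model is better than the other.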
Overall Comments:
- First, you describe a very worthwhile application, and all the material that is directly related to violence prevention is fairly well written and logical. On the other hand, it appears that machine learning is not an area where you have a great deal of background. In my opinion you have tried to include too much trivial material on ML, whereas the application should be the focus. If you streamline this material, then I think you have a nice paper.
- Second, I think you may have wanted to consider different ML tools in the first place. The RF is fine but the GB is overkill, overfits to your very limited data and doesn’t do as well as the RF anyway. I would advise you to leave in the RF but add a couple of simpler models: decision trees and logistic regression. I suspect they will do quite well and they have the advantage of being interpretable, which is important in your application.
The language is fine. I understood everything clearly, but it could use some careful editing.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
COMMENTS TO THE AUTHORS
Line 56: “Stephanie et al.,”: be consistent with the citation style, and check the entire paper.
Line 103: “Prieto et al.,”: be consistent with the citation style.
Lines 131, 136, ...: be consistent with the citation style. Use (surname, year).
The contents of Table 2 are poorly presented. They should be arranged properly.
Line 326: “The features ranked by importance (see Table 4).” This is an incomplete sentence.
Abstract keywords should be arranged in ascending order.
An abstract should cover the following components, a few lines each, in one continuous paragraph: introduction, problem statement, objective, methodology, findings/results, conclusion/benefits.
The abstract presented in this paper lacks some of these components.
There are some typographical/grammatical errors that need to be corrected by the authors.
The authors should check the entire paper for such typo corrections.
The gaps in related work should be clearly highlighted in Section 3, shortly before Section 4 (Methodology).
Comments on the Quality of English Language
There are some typographical/grammatical errors that need to be corrected by the authors.
The authors should check the entire paper for such typo corrections.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
This revision is vastly improved. Great job on a successful revision in a short amount of time. I do have a few more comments, focused on just two issues: the feature selection method and the parameter tuning.
- Section 4.3. I think the description here is still too vague. For example, what does “considered for removal” really mean? Were any variables removed? Was there a threshold significance level such that if a variable didn’t meet that threshold then it would have been removed? In short, I wouldn’t be able to replicate what you did based on this description, and the description should be sufficiently precise that the procedure is replicable. I’m also not able to infer it because results are only reported for the RF variable importance, not the other methods (see comment below).
- Line 268. This is picky but I don’t think it is strictly correct to say “LASSO regularization”. LASSO is regression with shrinkage regularization using an L1 penalty. It makes sense to me to talk about L1 regularization or shrinkage regularization, but not LASSO regularization. Again, it is not clear to me exactly how the results of the regression model are used on top of the results of the RF.
- Table 4 and any corresponding comments should be in the results section. I would also suggest expanding the table to include the results of the chi-squared, ANOVA and LASSO regression (after they are clearly explained in the methods section). Adding those results may help clear up some of my confusions listed above. Finally, do you want to use the English translations of the terms to be consistent with Table 2?
- Line 307. Please specify what parameters were changed and how for each model. It is also very surprising that you used the same number of combinations for boosted trees, which have many hyperparameters and are quite sensitive to how those are set, and the random forest, which has at most two hyperparameters and is generally insensitive to their settings. Some comment on that would be good, especially if you found that the RF really needed to be tuned. Finally, please specify the final parameter values selected for each model.
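One way to make the selection rule for Section 4.3 replicable, in the sense requested above, is to state it as an explicit threshold test. The following is a minimal sketch and not the authors' procedure: the significance level `ALPHA`, the feature names, and the toy counts are all hypothetical, and it covers only a Pearson chi-squared test of independence between one binary feature and a binary outcome.

```python
import math

# ASSUMPTION: hypothetical significance threshold; the manuscript's value is not given here.
ALPHA = 0.05


def chi2_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table (1 dof, no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, col in enumerate(col_totals):
            expected = r * col / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def select_features(tables, alpha=ALPHA):
    """Keep a feature only if its p-value is below alpha; report what was removed."""
    kept, removed = [], []
    for name, table in tables.items():
        _, p_value = chi2_2x2(table)
        (kept if p_value < alpha else removed).append(name)
    return kept, removed


# Hypothetical counts: rows are feature present/absent, columns are outcome yes/no.
toy_tables = {
    "feature_a": ((30, 10), (10, 30)),  # strongly associated with the outcome
    "feature_b": ((20, 20), (21, 19)),  # nearly independent of the outcome
}

kept, removed = select_features(toy_tables)
print("kept:", kept)
print("removed:", removed)
```

Reporting the threshold together with the per-variable statistics, alongside the RF importances, would let a reader see which variables were dropped and why.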
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 3
Reviewer 1 Report
Comments and Suggestions for Authors
My apologies to the authors for the lengthy time it took me to get back to this review. I think the authors have made substantial improvements. While I do not entirely agree with all of the choices made by the authors in their response to my second review, they are the authors' choices to make. I do not have any further comments but recommend that the authors review the paper one more time for language and presentation before it is published.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf