Article
Peer-Review Record

Comparison of the Predictive Performance of Medical Coding Diagnosis Classification Systems

Technologies 2022, 10(6), 122; https://doi.org/10.3390/technologies10060122
by Dimitrios Zikos * and Nailya DeLellis
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 8 October 2022 / Revised: 15 November 2022 / Accepted: 23 November 2022 / Published: 28 November 2022

Round 1

Reviewer 1 Report (New Reviewer)

Manuscript title: Comparison of the Predictive Performance of Medical Coding Diagnosis Classification Systems

In this study, the authors found that Random Forest performance was significantly superior to Naïve Bayes for MS-DRG and ICD-10-CM. NB performance is quite poor, and although Random Forest performs better, it is still not strong enough for practical use. The authors should also apply a few more algorithms, such as SVM, linear regression, and decision trees, for comparison and to find the best model.

1. The authors used a 10-fold cross-validation method, and six NB and four Random Forest models were developed. Please explain in more detail the reasoning behind building multiple models. Also, please provide information on how the random sampling of the 50,000 cases was performed and how the data were split into training, validation, and test subsets.
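For context on this point, 10-fold cross-validation partitions the sampled cases into ten disjoint folds, each serving once as the validation set while the remaining nine train the model. A minimal, self-contained sketch of such a split (illustrative only; the `kfold_indices` helper and the seed are hypothetical, not the authors' pipeline):

```python
import random

def kfold_indices(n_samples, k=10, seed=42):
    """Shuffle sample indices once, then yield k disjoint train/validation splits."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    for i in range(k):
        # every k-th shuffled index, starting at offset i, forms fold i
        val = idx[i::k]
        val_set = set(val)
        train = [j for j in idx if j not in val_set]
        yield train, val

# e.g. 10 folds over a 50,000-case random sample
splits = list(kfold_indices(50_000, k=10))
```

Each case appears in exactly one validation fold, so the ten folds together cover the whole sample without overlap.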

2. How were the control variables fed into the model? Were non-numeric variables, such as gender, one-hot encoded into 0/1 indicators?
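For reference, one-hot encoding replaces a categorical column with 0/1 indicator columns, one per category. A tiny illustrative sketch (the `one_hot` helper and the example rows are hypothetical, not from the manuscript's data):

```python
def one_hot(records, column):
    """Replace one categorical column with 0/1 indicator columns (in place)."""
    categories = sorted({r[column] for r in records})
    for r in records:
        value = r.pop(column)
        for c in categories:
            r[f"{column}_{c}"] = 1 if value == c else 0
    return records

rows = [{"gender": "F", "age": 71}, {"gender": "M", "age": 64}]
one_hot(rows, "gender")
# rows[0] now carries gender_F = 1, gender_M = 0 instead of the string "F"
```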

3. The authors should provide clear details on the pipeline used for the analysis, which would help interested readers replicate the work.

4. The authors should comment on the novelty of this work and on how impactful the results are, given that a large dataset was used for the study. However, the conclusions drawn are not sufficiently supported by the analysis, and the authors should try adding more algorithms for comparison.

5. How do the authors compare the performance of the algorithms against the ground truth? Was this ground truth produced by experts at the authors' institution? Can the authors compare performance between two readers?
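On the question of comparing two readers, the standard chance-corrected agreement statistic for that comparison is Cohen's kappa. A minimal illustrative implementation (the function and sample labels are hypothetical, not from the manuscript):

```python
def cohen_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two raters' labels."""
    n = len(a)
    labels = set(a) | set(b)
    p_o = sum(x == y for x, y in zip(a, b)) / n                      # observed agreement
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)   # agreement expected by chance
    return (p_o - p_e) / (1 - p_e)

reader_1 = [1, 0, 1, 1, 0, 0]
reader_2 = [1, 0, 1, 0, 0, 0]
kappa = cohen_kappa(reader_1, reader_2)
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance, which makes it a fairer two-reader comparison than raw percent agreement.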

Author Response

Thank you for taking the time to review our work and for your invaluable comments! Please find attached the communication document with a point-by-point response to your remarks.

Author Response File: Author Response.docx

Reviewer 2 Report (New Reviewer)

The paper deals with an important task: the authors compared the predictive power of these alternatives against ICD-10-CM for two outcomes of hospital care, inpatient mortality and length of stay.

The paper has great practical value.

Suggestions:

1. In the abstract section, please delete the structured labels "(1) Background" … "(4) Conclusions" and other general words.

2. The Introduction section should be extended to cover modifications of the RF algorithm, for example DOI: 10.15587/1729-4061.2018.134319.

3. It would be good to add an outline of the remainder (structure) of the paper at the end of the Introduction section.

4. The quality of Fig. 2 should be improved.

5. The authors should report all optimal parameters for the RF algorithm.

6. The Conclusion section should be extended with: 1) the numerical results obtained in the paper; 2) the limitations of the proposed approach; and 3) prospects for future research.
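Relating to point 5 above, the optimal RF parameters to be reported are typically found by a grid search over candidate values. A schematic, model-agnostic sketch (the parameter grid and the `score_fn` callback are purely illustrative, not the authors' settings):

```python
from itertools import product

def best_params(grid, score_fn):
    """Evaluate every parameter combination; return (best_score, best_params)."""
    keys = list(grid)
    best = None
    for combo in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, combo))
        score = score_fn(params)  # e.g. mean cross-validated accuracy for these params
        if best is None or score > best[0]:
            best = (score, params)
    return best

# candidate RF hyperparameters (illustrative values only)
grid = {"n_estimators": [100, 300, 500], "max_depth": [10, 20, 30]}
```

In practice `score_fn` would train and cross-validate an RF with the given parameters; the winning combination is what a reader would expect to see reported.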

Author Response

Thank you for taking the time to review our work and for your invaluable comments! Please find attached the communication document with a point-by-point response to your remarks.

Author Response File: Author Response.docx

Reviewer 3 Report (New Reviewer)

The focus of the manuscript is to compare the predictive power of these alternatives against ICD-10-CM for two outcomes of hospital care, using several methods including Naïve Bayes (NB) and Random Forests (RF). Overall, the manuscript's writing and readability are good. However, the Methods section needs further polishing.

1) Please report all the features used for training the NB and RF models.

2) How many samples were used for training and for testing, respectively?

3) Please generate correlation scatter plots to aid understanding of the data.

4) Please report the tuning parameters for the NB and RF models. For example, for the RF method, how many trees did you decide to use after training the model?

5) For the RF method, please generate the regression tree plot.

6) Regarding the regression results, did you check all the model assumptions, e.g., normality? Also, did you check the model-data fit?
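Regarding point 6, a first-pass check of the regression assumptions can be made on the residuals: their mean should be near zero and their distribution roughly symmetric. A minimal illustrative sketch (the `residual_summary` helper and the toy data are hypothetical, not the authors' analysis):

```python
import statistics

def residual_summary(y_true, y_pred):
    """Mean, spread, and skewness of residuals as crude regression diagnostics."""
    res = [a - b for a, b in zip(y_true, y_pred)]
    mean = statistics.fmean(res)
    sd = statistics.pstdev(res)
    # standardized third moment; near 0 suggests a symmetric residual distribution
    skew = sum(((r - mean) / sd) ** 3 for r in res) / len(res) if sd else 0.0
    return {"mean": mean, "sd": sd, "skewness": skew}

# toy LOS values vs. model predictions
summary = residual_summary([3, 5, 4, 6, 2], [3.5, 4.5, 4.0, 6.5, 1.5])
```

A full diagnosis would add residual-vs-fitted and Q-Q plots, but even this summary flags gross violations of the normality assumption.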

 

 

Author Response

Thank you for taking the time to review our work and for your invaluable comments! Please find attached the communication document with a point-by-point response to your remarks.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report (New Reviewer)

All answers were satisfactory.

Reviewer 3 Report (New Reviewer)

The authors addressed all my comments. No further comments.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Please see the attached file.

Comments for author File: Comments.doc

Reviewer 2 Report

Health analytics frequently involve tasks to predict outcomes of care. A foundational predictor to study clinical outcomes is the medical diagnosis (Dx). The most used expression of medical Dx is the International Classification of Diseases (ICD-10-CM). Since ICD-10-CM includes >70,000 codes, its use renders the training of models challenging, because of its high computational cost. Alternative lower dimensionality medical Dx systems include the Clinical Classification Software (CCS) and the Diagnosis Related Groups (MS-DRGs).

The authors seek to compare the predictive power of these alternatives against ICD-10-CM, for two outcomes of hospital care: inpatient mortality and length of stay (LOS).

They proposed a methodology in which Naïve Bayes (NB) and Random Forest models were created for each Dx system to examine their predictive performance for inpatient mortality, and Multiple Linear Regression models for the continuous LOS variable.

Their results showed that: (I) the MS-DRGs had the best performance for both outcomes, outperforming the original ICD-10-CM codes; (II) the CCS system, although having a much lower dimensionality than ICD-10-CM, had similar performance; and (III) Random Forests outperformed NB for MS-DRG and ICD-10-CM by a large margin.

The authors concluded that their results can provide insights into the performance of, and the compromise involved in using, these lower-dimensionality representations in outcome studies.

This is a good paper.

Some minor suggestions:

1. Minimize the use of acronyms (starting from the abstract), and collect them in a table/list.

2. Please improve the statement of purpose: "The study does not aim to develop clinically applicable models, but to compare the performance of the aforementioned Dx systems…" Readers are interested in what the study aims to do, not in what it does not aim to do.

3. Introduce the themes of the results.

4. Check the resolution of the figures.

5. Insert the limitations in the Discussion.

6. The recommendation for future research is interesting. Consider inserting a short paragraph entitled "Future research" or similar.

7. Insert a Conclusions section.
