Using Natural Language Processing to Identify Low Back Pain in Imaging Reports
Round 1
Reviewer 1 Report
Interesting work, I thank the authors for this contribution. The article presents an interesting application of NLP in the medical domain. It is also good to see the authors aware of possible limitations. However, please consider the points below in the next version.
(1)
My main critique is that the study used a much smaller subset of the original dataset, only about 5%. I believe that the study would have delivered a stronger impact with a larger dataset. I am not sure why this decision was taken, and it is important that the authors explain their reasoning.
(2)
Having a (relatively) small dataset, cross-validation should be considered for evaluating the model. The basic train/test split could have overestimated the model performance.
(3)
The motivation behind the study needs further clarification. For example, is it based on a practical need at the university hospital, on a lack of such NLP studies in this particular context, or both?
Overall, the study should mention what it would add to the literature from a theoretical or practical aspect.
(4)
More related work is genuinely needed in the introduction. The related work should include more studies that applied the state-of-the-art methods (e.g. BERT) to extract knowledge from free-text clinical documents. For example:
https://doi.org/10.5220/0011012800003123
(5)
The authors have touched on possible directions for future work. However, the use of BERT-based models was not mentioned in this regard. NLP research is currently dominated by the use of transformer models (e.g. BERT). That said, I recommend considering that as part of the future work as well.
Author Response
(1) My main critique is that the study used a much smaller subset of the original dataset, only about 5%. I believe that the study would have delivered a stronger impact with a larger dataset. I am not sure why this decision was taken, and it is important that the authors explain their reasoning.
Response
Thank you for your comment. We agree with your point, and we think your suggestion will make our manuscript more understandable. Following the reviewer’s suggestion, we cited similar previous research that our choice of subset refers to.
Manuscript
[Materials and Methods]
Following the method of a previous NLP system for extracting words associated with LBP, 5% of the original dataset was randomly selected and used for this study [40].
(2) Having a (relatively) small dataset, cross-validation should be considered for evaluating the model. The basic train/test split could have overestimated the model performance.
Response
Thank you for your kind comment. We agree with your point, and we added a description of the cross-validation method used in this research.
Manuscript
[Materials and Methods]
K-fold cross-validation was used to validate the NLP model. The dataset was repeatedly partitioned into different training and test sets.
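As an illustration only, and not the authors' implementation, the k-fold procedure described above can be sketched in plain Python. The function name `k_fold_indices`, the value k = 5, and the sample count of 10 are assumptions chosen for the example; the response does not state which k was used.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; not taken from the manuscript.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread samples as evenly as possible: the first (n_samples % k)
    # folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Example: 10 reports split into 5 folds (both numbers are assumptions).
folds = list(k_fold_indices(n_samples=10, k=5))
```

Each report appears in exactly one test fold, so every report contributes to the evaluation once, which is what makes the estimate less dependent on a single train/test split.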
(3) The motivation behind the study needs further clarification. For example, is it based on a practical need at the university hospital, on a lack of such NLP studies in this particular context, or both? Overall, the study should mention what it would add to the literature from a theoretical or practical aspect.
Response
Thank you for your kind comment. We agree with your point that the motivation of the study needs more clarification. Following your suggestion, we added our research motivation, based on a need in the medical institution and the lack of previous research.
Manuscript
[Introduction]
There is no clear standard notation for many of the technical terms used in radiologic reports. It is necessary to develop an NLP system tailored to extracting terms from the unique free-text format of the medical institution (i.e. Bundang CHA Medical Center).
In previous studies, no NLP model was developed for extracting words associated with LBP from CT reports [13, 25, 26, 33, 35, 40]. This study expands the utilization of the NLP system to CT reports as well as X-ray and MRI reports for extracting words associated with LBP.
(4) More related work is genuinely needed in the introduction. The related work should include more studies that applied the state-of-the-art methods (e.g. BERT) to extract knowledge from free-text clinical documents. For example: https://doi.org/10.5220/0011012800003123
Response
Thank you for your suggestion. We agree with your point that related work using state-of-the-art methods is needed in the introduction. Following your suggestion, we added a recent, state-of-the-art NLP model as a reference in our manuscript.
Manuscript
[Introduction]
Arnaud et al. developed an NLP system based on the BERT model. Free-text records from the University Hospital of Amiens-Picardy in France were used to develop the BERT model, which learned contextual embeddings. That research discussed the use of the BERT method in medical care [43].
[References]
- Arnaud E, Elbattah M, Gignon M, Dequen G. Learning Embeddings from Free-text Triage Notes using Pretrained Transformer Models. Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies - Vol 5: HEALTHINF, Feb 2022, Lisbon, Portugal. pp. 835-841.
(5) The authors have touched on possible directions for future work. However, the use of BERT-based models was not mentioned in this regard. NLP research is currently dominated by the use of transformer models (e.g. BERT). That said, I recommend considering that as part of the future work as well.
Response
Thank you for your kind suggestion. Following your suggestion, we added the use of BERT-based models as a possible direction for future work.
Manuscript
[Discussion]
Additionally, transformer models (e.g. the BERT model) can be used in further research. Datasets of patients with and without LBP will be used to train the BERT model. Through this, sentences or words highly correlated with LBP are expected to be discovered, including keywords other than the 23 radiologic findings used in the current study.
Reviewer 2 Report
There are flaws in the structure and content of this paper. The description of the whole content is weak, and many details are unclear, which makes it hard to understand the substantial contributions of the developed NLP system.
The specific comments are below.
1. The authors did not state the relevance of this study to NLP, moreover, the manuscript should be pattern recognition by using computed tomography (CT), magnetic resonance imaging (MRI) and Python (3.8.10).
2. What is NLP? The authors did not state or define it clearly.
3. What is the relevance of this study to NLP?
4. The authors did not suggest what kinds of algorithm to be used in NLP. What are the differences between other existing algorithms? What are the contributions or new discoveries?
Author Response
Reviewer 2
There are flaws in the structure and content of this paper. The description of the whole content is weak, and many details are unclear, which makes it hard to understand the substantial contributions of the developed NLP system.
The specific comments are below.
- The authors did not state the relevance of this study to NLP, moreover, the manuscript should be pattern recognition by using computed tomography (CT), magnetic resonance imaging (MRI) and Python (3.8.10).
Response
Thank you for your comment. We agree with your point, and we think your suggestion will make our manuscript more understandable. Following the reviewer’s suggestion, we added a description of the NLP system and pipeline of this study.
Manuscript
[Materials and Methods]
An NLP system was developed for extracting words associated with LBP from CT, MRI, and X-ray reports. NLP systems are classified into three approaches: rule-based, machine learning-based, and hybrid [42]. In this study, a rule-based approach was used.
To implement the rule-based approach, the Python “re” module was used to tokenize sentences and to build regular-expression patterns.
The words associated with LBP were extracted from X-ray, CT, and MRI reports using these regular-expression patterns. The extracted words were represented as a matrix.
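A minimal sketch of this kind of rule-based extraction, using only Python's standard `re` module, might look as follows. It is not the authors' code: the three findings, their patterns, the sample report texts, and the name `findings_matrix` are all hypothetical, whereas the study used 23 clinician-defined findings whose patterns are not given here.

```python
import re

# Hypothetical patterns for three example findings (assumptions for
# illustration; the study's actual 23 patterns are not shown in this record).
FINDING_PATTERNS = {
    "disc herniation": re.compile(
        r"\b(?:disc|disk)\s+(?:herniation|protrusion|extrusion)\b", re.IGNORECASE
    ),
    "spinal stenosis": re.compile(
        r"\b(?:spinal|central canal)\s+stenosis\b", re.IGNORECASE
    ),
    "spondylolisthesis": re.compile(r"\bspondylolisthesis\b", re.IGNORECASE),
}


def findings_matrix(reports):
    """Return column names and one 0/1 row per report, one column per finding."""
    columns = list(FINDING_PATTERNS)
    rows = [
        [1 if FINDING_PATTERNS[name].search(text) else 0 for name in columns]
        for text in reports
    ]
    return columns, rows


# Invented report snippets, used only to exercise the patterns.
reports = [
    "L4-5 disc herniation with mild central canal stenosis.",
    "Grade I spondylolisthesis of L5 on S1.",
    "No significant abnormality.",
]
columns, matrix = findings_matrix(reports)
```

Each row of `matrix` corresponds to one report and each column to one finding, which is one plausible reading of the statement that the extracted words were represented as a matrix. This sketch does not handle negated mentions.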
[References]
- Bacco L, Russo F, Ambrosio L, et al. Natural language processing in low back pain and spine diseases: A systematic review. Front Surg. 2022;9:957085. Published 2022 Jul 14.
- What is NLP? The authors did not state or define it clearly.
Response
Thank you for your kind comments. Following your comments, we added a clear definition of NLP.
Manuscript
[Introduction]
Natural language processing (NLP) is the use of computational methods to understand, analyze, and extract meaningful information from text (natural language) [42]. NLP is used for translation, chatbots, text classification, and other tasks. Additionally, research on the application of NLP systems is being actively conducted in medical studies.
[References]
- Bacco L, Russo F, Ambrosio L, et al. Natural language processing in low back pain and spine diseases: A systematic review. Front Surg. 2022;9:957085. Published 2022 Jul 14.
- What is the relevance of this study to NLP?
Response
Thank you for your kind comments. We agree with your point that we need to describe the relevance of this study clearly. Following the reviewer’s suggestion, we added the relevance of this study to NLP in the conclusion.
Manuscript
[Conclusion]
In this research, an NLP system was developed to identify and extract the necessary terms from free-text clinical data in medical institutions. The system extracts words associated with LBP not only from X-ray and MRI reports but also from CT reports.
- The authors did not suggest what kinds of algorithm to be used in NLP. What are the differences between other existing algorithms? What are the contributions or new discoveries?
Response
Thank you for your kind comment. We agree with your point that the algorithm used in this study was not clearly described. Following your suggestion, we added the method (rule-based methods) and the Python library (“re”) that were used in this NLP system. We also added the new discoveries of this study and its differences from other algorithms.
Manuscript
[Discussion]
In this study, the NLP system was developed using rule-based methods. Clinician-determined labels were expressed as patterns and recognized through regular expressions ("re"). The various patterns found in free-text clinical reports can be defined as rules in the rule-based method. Because the rules are set by the user, the rule-based method is easier to debug and can be expected to have higher precision than the machine learning-based method.
[Introduction]
In previous studies, no NLP model was developed for extracting words associated with LBP from CT reports [13, 25, 26, 33, 35, 40]. This study expands the utilization of the NLP system to CT reports as well as X-ray and MRI reports for extracting words associated with LBP.
Reviewer 3 Report
This study proposed a rule-based NLP system for extracting clinical findings from radiologic X-ray, CT, and MRI reports to assist treatment and decisions. The current work is a good case study that serves the aim of this Journal. However, the paper's motivation and novelty are limited and not sufficient for a research paper. An in-depth analysis of the results and of how the new method is evaluated must be conducted. The discussion should include more analyses. The authors need to explain model development, training, and testing in more detail. The paper needs in-depth discussion and explanation of how the X-ray, CT, and MRI reports are read and how the primary information is extracted. It needs to be clarified how the efficiency of the proposed system is computed and compared to other studies. Besides, the technical writing has many grammatical errors, making it hard to read. The authors should justify the novelty of the proposed work, as similar studies have been carried out in the existing literature.
In addition to the following:
- Enhance the introduction to present a detailed background of the current problem and solutions.
-Enhance the conclusion to present the findings and contributions of the work.
- Add more recent literature (2022-2021) that describes the works developed in the last few years and then conclude the advantages and disadvantages of each method.
-What are the current research gaps in the studies mentioned in the literature survey, and how will this work fill them?
- Avoid using many references together, such as [1-7], [14-18], etc. You should classify the studies and write a proper paragraph about each study or category.
- Add the equations with a reference of how to compute the F1 score, recall, precision, and accuracy.
- Abbreviations must be written in the complete form where they are first used, such as MRI. Check the main text and edit for the same.
-The research paper should be written in the third person's perspective; words such as "we", "our," etc., must be avoided.
- Need more details about the proposed system and the NLP pipeline.
-Too-long sentences make the meaning unclear. Consider breaking them into multiple sentences, for example, L51-L54, L58-L60, L177-L180, L185-L188, etc.
-Many grammatical or spelling errors make the meaning unclear, and sentence construction errors need proofreading. Improve the English language, redaction, and punctuation in general. The manuscript should undergo editing before being submitted to the Journal again.
The following are some examples:
L56: Compared previous study …. Should be … Compared to previous studies
L57: NLP system in LPB …. Should be … an NLP system for LPB
L59: developed, and it showed …. Should be … developed. It showed
L60: However, for using NLP system …. Should be … However, using the NLP system
L61: or systems, should …. Should be …. or systems should
L62: methods to NLP system developing. …. Should be … for NLP system development.
Author Response
Reviewer 3
This study proposed a rule-based NLP system for extracting clinical findings from radiologic X-ray, CT, and MRI reports to assist treatment and decisions. The current work is a good case study that serves the aim of this Journal. However, the paper's motivation and novelty are limited and not sufficient for a research paper. An in-depth analysis of the results and of how the new method is evaluated must be conducted. The discussion should include more analyses. The authors need to explain model development, training, and testing in more detail. The paper needs in-depth discussion and explanation of how the X-ray, CT, and MRI reports are read and how the primary information is extracted. It needs to be clarified how the efficiency of the proposed system is computed and compared to other studies. Besides, the technical writing has many grammatical errors, making it hard to read. The authors should justify the novelty of the proposed work, as similar studies have been carried out in the existing literature.
In addition to the following:
- Enhance the introduction to present a detailed background of the current problem and solutions.
Response
Thank you for your kind comment. We agree with your point that we should justify the novelty and present a detailed background of the current problem. Following your suggestion, we added our research motivation, based on a need in the medical institution.
Manuscript
[Introduction]
There is no clear standard notation for many of the technical terms used in radiologic reports. It is necessary to develop an NLP system tailored to extracting terms from the unique free-text format of the medical institution (i.e. Bundang CHA Medical Center).
-Enhance the conclusion to present the findings and contributions of the work.
Response
Thank you for your comments. We agree with your point that we need to describe the findings and contributions of this study clearly. Following the reviewer’s suggestion, we added the contribution of this NLP system in relation to the lack of previous research.
Manuscript
[Conclusion]
In this research, an NLP system was developed to identify and extract the necessary terms from free-text clinical data in medical institutions. The system extracts words associated with LBP not only from X-ray and MRI reports but also from CT reports.
The utilization of this NLP system will be expanded to extract, from the free-text clinical data of the medical institution, data needed for medical research or data that can serve as markers for diagnosis or treatment.
- Add more recent literature (2022-2021) that describes the works developed in the last few years and then conclude the advantages and disadvantages of each method.
Response
Thank you for your suggestion. We agree with your point, and we added more recent works (2022-2021). We also described the advantages and disadvantages of each study’s method and added a recent NLP model. In addition, we added a column with the publication year of each work to Table 4.
Manuscript
[Introduction]
Arnaud et al. developed an NLP system based on the BERT model. Free-text records from the University Hospital of Amiens-Picardy in France were used to develop the BERT model, which learned contextual embeddings. That research discussed the use of the BERT method in medical care [43].
[Table 4]
Study | Topic (Findings) | Source | Year
Datasets from radiological images reports
Tan et al. [13] | Radiologic findings associated with LBP | Lumbar MRI reports and X-ray reports | 2018
Jujjavarapu et al. [40] | Radiologic findings associated with LBP | Lumbar MRI reports and X-ray reports | 2022
Caton et al. [25] | Relationship between radiologic findings of degenerative spinal stenosis on lumbar MRI (LMRI) and patient characteristics | Lumbar MRI reports | 2021
Caton et al. [26] | Performance metric for measuring global severity of lumbar degenerative disease (LSDD) | Lumbar MRI reports | 2021
Huhdanpaa et al. [33] | Patients with Type 1 modic endplate changes | Lumbar MRI reports | 2018
Lewandrowski et al. [34] | Producing automatic MRI reports | MRI DICOM datasets | 2020
Galbusera et al. [35] | Generating annotations for radiographic images | Lumbar X-ray reports | 2021
Datasets from free-text type reports
Miotto et al. [27] | Classification of acute LBP episode | Free-text clinical notes | 2020
Walsh et al. [28] | Identifying the axial SpA concepts | Electro medical records | 2017
Walsh et al. [29] | Identifying the axial SpA | Clinical chart database | 2020
Zhao et al. [30] | Classification of the axial SpA patients | Electronic health records | 2020
Ehresman et al. [31] | Identifying incidental durotomy | Intra-operative electronic health records | 2020
Karhade et al. [32] | Identifying incidental durotomy | Free-text operation notes | 2020
Karhade et al. [36] | Prediction of intra-operative vascular injury (VI) and identifying VI | Free-text operation notes | 2021
Karhade et al. [38] | Identifying the post-operative wound infection which needs reoperation after lumbar discectomy | Free-text operation notes | 2020
Karhade et al. [39] | Predicting 90-day unplanned readmission of the lumbar spine fusion patients | Free-text note during hospitalization period | 2022
Dantes et al. [37] | Venous thromboembolism | Electronic medical records | 2018
[Discussion]
In many previous NLP studies on LBP and spine disease, a machine learning approach was used [42]. Machine learning algorithms have “learnability”. Thus, when regular expressions are not applicable in the NLP system, machine learning algorithms can automatically learn new rules and remove unnecessary ones. The disadvantages of machine learning algorithms are that a training dataset must be set up and that debugging is more complicated than for rule-based algorithms.
In this study, the NLP system was developed using rule-based methods. The various patterns found in free-text clinical reports can be defined as rules in the rule-based method. Because the rules are set by the user, the rule-based method is easier to debug and can be expected to have higher precision than the machine learning-based method.
[References]
- Bacco L, Russo F, Ambrosio L, et al. Natural language processing in low back pain and spine diseases: A systematic review. Front Surg. 2022;9:957085. Published 2022 Jul 14.
- Arnaud E, Elbattah M, Gignon M, Dequen G. Learning Embeddings from Free-text Triage Notes using Pretrained Transformer Models. Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies - Vol 5: HEALTHINF, Feb 2022, Lisbon, Portugal. pp. 835-841.
-What are the current research gaps in the studies mentioned in the literature survey, and how will this work fill them?
Response
Thank you for your kind comment. Following your suggestion, we added our research’s strengths based on the research gaps associated with previous studies.
Manuscript
[Introduction]
In previous studies, no NLP model was developed for extracting words associated with LBP from CT reports [13, 25, 26, 33, 35, 40]. This study expands the utilization of the NLP system to CT reports as well as X-ray and MRI reports for extracting words associated with LBP.
- Avoid using many references together, such as [1-7], [14-18], etc. You should classify the studies and write a proper paragraph about each study or category.
Response
Thank you for pointing out the mistake. Following your comment, we classified and revised some of the references.
Manuscript
[Introduction]
LBP can be classified as acute or chronic depending on its onset [1, 3, 5]. Acute LBP usually occurs suddenly, and if the pain persists for more than 3 months, it can be called chronic LBP [14]. LBP is one of the most common causes of hospital visits and the second leading cause of sick leave [4]. Because of its high direct and indirect costs, the health, social, and economic impacts on individuals, families, and society are significant [2, 6, 7]. In particular, the 5% to 10% of cases that are chronic low back pain (CLBP) require high costs and long-term care [23]. It is important to develop tools that help accurately predict whether persistent pain will occur in patients with recently developed back pain [8, 15].
A radiology report is a formal interpretation of a radiological examination and contains radiological findings related to the clinical diagnosis and treatment decisions [10, 16].
Twenty-three radiologic findings known to be related to LBP from previous studies were established, and each report was annotated by two physicians specializing in LBP diagnosis [17, 18] (Table 1).
- Add the equations with a reference of how to compute the F1 score, recall, precision, and accuracy.
Response
Thank you for your kind comments. We agree with your point that the equations should be added with a reference. We removed some sentences about how to compute the F1 score, recall, precision, and accuracy, and we added the equations and a reference.
Manuscript
[Materials and Methods]
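In the following, TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
The recall is defined as $\text{Recall} = \frac{TP}{TP + FN}$.
The precision is defined as $\text{Precision} = \frac{TP}{TP + FP}$.
The accuracy is defined as $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$.
The F1 score is defined as $F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$ and is used to estimate performance. This score is the harmonic mean of precision and recall, and it is considered more useful than accuracy because of the prevalence of class imbalance in text classification [41].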
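For readers who want to check such values by hand, the four metrics can be computed from confusion-matrix counts using their standard definitions. The following is a minimal sketch; the function name `classification_metrics` and the counts in the example are invented for illustration and are not results from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute recall, precision, accuracy, and F1 from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives. Ratios with a zero denominator are reported as 0.0.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"recall": recall, "precision": precision, "accuracy": accuracy, "f1": f1}


# Invented counts for an imbalanced case: 10 positive and 90 negative reports.
metrics = classification_metrics(tp=8, fp=2, tn=88, fn=2)
```

With these invented counts the accuracy is 0.96 while recall, precision, and F1 are each about 0.8, which illustrates why the F1 score is regarded as more informative than accuracy when classes are imbalanced.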
[References]
- Blair DC. Information Retrieval, 2nd ed. C.J. Van Rijsbergen. London: Butterworths; 1979: 208 pp. Journal of the American Society for Information Science. 1979;30(6):374-375.
- Abbreviations must be written in the complete form where they are first used, such as MRI. Check the main text and edit for the same.
Response
Thank you for pointing out the mistake. Following your comment, we double-checked the format and revised the manuscript.
-The research paper should be written in the third person's perspective; words such as "we", "our," etc., must be avoided.
Response
Thank you for pointing out the mistake. Following your comment, we changed the words (we, our) to the third person’s perspective.
- Need more details about the proposed system and the NLP pipeline.
Response
Thank you for your kind comment. We think your suggestion will make our manuscript more understandable. Following the reviewer’s suggestion, we added a description of the NLP system and pipeline of this study.
Manuscript
[Materials and Methods]
An NLP system was developed for extracting words associated with LBP from CT, MRI, and X-ray reports. NLP systems are classified into three approaches: rule-based, machine learning-based, and hybrid [42]. In this study, a rule-based approach was used. To implement the rule-based approach, the Python “re” module was used to tokenize sentences and to build regular-expression patterns.
The words associated with LBP were extracted from X-ray, CT, and MRI reports using these regular-expression patterns. The extracted words were represented as a matrix.
K-fold cross-validation was used to validate the NLP model. The dataset was repeatedly partitioned into different training and test sets.
[References]
- Bacco L, Russo F, Ambrosio L, et al. Natural language processing in low back pain and spine diseases: A systematic review. Front Surg. 2022;9:957085. Published 2022 Jul 14.
-Too-long sentences make the meaning unclear. Consider breaking them into multiple sentences, for example, L51-L54, L58-L60, L177-L180, L185-L188, etc.
Response
Thank you for your kind comments. Following your comment, we split the overly long sentences into multiple sentences to make their meaning clear.
-Many grammatical or spelling errors make the meaning unclear, and sentence construction errors need proofreading. Improve the English language, redaction, and punctuation in general. The manuscript should undergo editing before being submitted to the Journal again.
The following are some examples:
L56: Compared previous study …. Should be … Compared to previous studies
L57: NLP system in LPB …. Should be … an NLP system for LPB
L59: developed, and it showed …. Should be … developed. It showed
L60: However, for using NLP system …. Should be … However, using the NLP system
L61: or systems, should …. Should be …. or systems should
L62: methods to NLP system developing. …. Should be … for NLP system development.
Response
Thank you for pointing out mistakes we might have missed. Following the reviewer’s comment, we used a proofreading service.
Round 2
Reviewer 1 Report
Thanks, I have no further comments.
Author Response
Thank you so much
Reviewer 2 Report
Accept in present form
Author Response
Thank you so much
Reviewer 3 Report
Thank you for your response. Still, the new findings and contributions are limited. Unfortunately, the authors failed to answer my comments and did not explain the motivation and novelty of the work, which are limited and not sufficient for a research paper in its current form. Also, the paper needs a more in-depth analysis of the results and of how the new method is better than the other research papers in the literature survey. The discussion is weak and needs to be improved. The authors should have explained how they built the model and carried out the training and testing process, and how they read the X-ray, CT, and MRI reports and extracted the primary information. Besides, the technical writing is weak and has many grammatical errors, making it hard to read.
In addition to the following:
- Enhance the introduction to present a detailed background of the current problem and solutions.
The author did not answer my comment.
The author's answer is unsatisfactory in enhancing the introduction that should present the current problem and studies.
- Add more recent literature (2022-2021) that describes the works developed in the last few years and then conclude the advantages and disadvantages of each method.
The author did not answer my comment.
The author only adds a new column to the table with the year of the published papers, which is not the answer to my comment.
- Avoid using many references together, such as [1-7], [14-18], etc. You should classify the studies and write a proper paragraph about each study or category.
The author did not answer my comment. See L63:[1, 3, 5], L66:[2, 6, 7], etc.
-Still, many grammatical or spelling errors make the meaning unclear, and sentence construction errors need proofreading. Improve the English language, redaction, and punctuation in general. The manuscript should undergo editing before being submitted to the Journal again.
The following are some examples:
L66: Especially, cases that is between ....should be.... Especially cases between
L74: reports use a free form language ....should be.... reports use free-form language
L77: extracting the clinical information from the radiology reports. ....should be.... extracting clinical information from radiology reports.
L83: encoder representations from transformer ....should be.... encoder representation from the transformer
Author Response
Thank you for your response. Still, the new findings and contributions are limited.
Unfortunately, the author failed to answer my comments and did not explain the motivation and novelty of the work, which is not sufficient and limited for a research paper in its current form.
Response
Thank you for your comment. We agree with your point that the motivation and the novelty of the study need more clarification. Following your suggestion, we added an explanation of the significance of this study.
Manuscript
[Introduction]
This study developed a rule-based NLP system that improves on previous research.
In this study, the utilization of the NLP system was expanded to the CT radiology reports for extracting words associated with LBP.
By developing an NLP system, it is possible to analyze the free-text radiology reports unique to the medical institution. Unstructured clinical information in the radiology reports was processed into structured data. The structured clinical data will be used for data-driven research on LBP. This study laid the foundation for extracting clinical data from free-text radiology reports and using it in various clinical studies.
Also, it needs a more in-depth analysis of the results and how the new method is better than other research papers in the literature survey. The discussion is weak and needs to be improved.
Response
Thank you for your kind comment. We agree with your point, and we think your suggestion will make our manuscript more understandable. Following the reviewer’s suggestion, we added an analysis of the results and the strengths of our research.
Manuscript
[Discussion]
This NLP system achieved an accuracy of more than 0.9611. To correct for the effects of dataset bias, the NLP system was also evaluated using the F1 score, which is the most important of the parameters for NLP system evaluation. The F1 score is an indicator that reflects both recall and precision. In this NLP system, all radiologic findings had an F1 score of 0.9802 or higher. Recall corresponds to sensitivity in statistics. The developed NLP system had a sensitivity of at least 0.984. Compared to the previous rule-based NLP system [16], the sensitivity of the rule-based NLP pipeline was improved.
Also, the developed NLP system showed higher sensitivity for potentially clinically important radiological findings than the previous rule-based NLP system [16].
In this study, the NLP system was developed using a rule-based approach. Clinicians determined and set the regular expressions using "re". The various patterns found in free-text clinical reports can be defined as regular expressions. Because the rules are set by the user, the rule-based method is easier to debug and can be expected to have higher precision than a machine learning-based method. Several limitations of previous studies were overcome. First, this NLP system was trained with a smaller dataset (n = 720) than previous studies. Second, the NLP system had improved sensitivity (from 0.984 to 1), which had been pointed out as low in the previous rule-based NLP system [16]. Third, utilization of the NLP system was extended to CT radiology reports for extracting words associated with LBP.
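As an illustration of the metrics named in the Discussion excerpt above, the following is a minimal sketch of how precision, recall (sensitivity), and the F1 score relate to one another. The function names and the toy labels are hypothetical and are not taken from the authors' code or data.

```python
def confusion_counts(gold, predicted):
    """Count true positives, false positives, and false negatives.

    `gold` and `predicted` are equal-length sequences of 0/1 labels,
    one per report, for a single radiologic finding (hypothetical layout).
    """
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    return tp, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1); each is 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # recall == sensitivity
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example only: six reports, reference labels vs. system output.
gold = [1, 1, 1, 0, 0, 1]
predicted = [1, 1, 0, 0, 1, 1]
tp, fp, fn = confusion_counts(gold, predicted)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Because F1 ignores true negatives, it is less flattered than accuracy when most reports are negative for a finding, which is presumably why the excerpt reports it alongside accuracy.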
The authors should have explained how they built the model and carried out the training and testing process, as well as how they read the X-ray, CT, and MRI reports and extracted the primary information.
Response
Thank you for your comment. We agree with your point, and we think your suggestion will make our manuscript more understandable. Following the reviewer’s suggestion, we revised and summarized the explanation of the NLP system and pipeline used in this study.
Manuscript
[Materials and Methods]
This NLP pipeline was developed using Python (3.8.10). Section and sentence segmentation of the radiology reports was done using a rule-based approach: regular expressions with negation detection. A regular expression is a sequence of characters that defines a pattern for identifying terms in text and analyzing patterns. "re" is the Python standard library module used to implement regular expressions. All models were developed by analyzing the training dataset of X-ray, CT, and MRI readings (n = 720).
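The excerpt above does not list the actual rules, so the following is only a minimal sketch of what regular-expression matching with sentence-level negation detection can look like using "re". The finding names, patterns, and negation cues are hypothetical placeholders, not the authors' rule set, and section segmentation is omitted.

```python
import re

# Hypothetical finding patterns; the real rules were defined by clinicians
# and are not given in the response.
FINDING_PATTERNS = {
    "disc herniation": re.compile(
        r"\b(?:disc|disk)\s+(?:herniation|protrusion|extrusion)\b",
        re.IGNORECASE,
    ),
    "spinal stenosis": re.compile(
        r"\b(?:spinal|central\s+canal)\s+stenosis\b", re.IGNORECASE
    ),
}

# Hypothetical negation cues that cancel a finding later in the same sentence.
NEGATION_CUE = re.compile(
    r"\b(?:no|without|negative\s+for)\b", re.IGNORECASE
)

# Naive sentence segmentation: split on whitespace that follows ., ! or ?
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def extract_findings(report):
    """Map each finding name to True if it is asserted (not negated)."""
    results = {name: False for name in FINDING_PATTERNS}
    for sentence in SENTENCE_SPLIT.split(report.strip()):
        for name, pattern in FINDING_PATTERNS.items():
            match = pattern.search(sentence)
            if match is None:
                continue
            # Treat the finding as negated when a cue precedes it
            # within the same sentence.
            negated = NEGATION_CUE.search(sentence[: match.start()]) is not None
            if not negated:
                results[name] = True
    return results


findings = extract_findings(
    "Disc herniation at L4-5. No evidence of spinal stenosis."
)
```

A rule set like this is easy to inspect and debug sentence by sentence, which matches the advantage the authors claim for the rule-based approach; a production system would need broader cue lists and handling of abbreviations that contain periods.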
Besides, the technical writing is weak and has many grammatical errors, making it hard to read.
Response
Thank you for pointing out mistakes we might have missed. Following your comment, we had the manuscript professionally proofread and revised it.
In addition to the following:
- Enhance the introduction to present a detailed background of the current problem and solutions.
The author did not answer my comment. The author's answer is unsatisfactory in enhancing the introduction that should present the current problem and studies.
Response
Thank you for your comment. We agree that the current problem and solutions need more clarification. Following your suggestion, we added a detailed background based on the gaps in previous research.
Manuscript
[Introduction]
Previous studies leave some points to be improved. First, while the machine learning approach has been actively studied, the rule-based approach has not been studied much in NLP systems for extracting words associated with LBP. The set-up cost of the rule-based approach is lower than that of the machine learning approach, and the training dataset it requires is smaller. In previous studies, the rule-based approach showed high specificity but moderate sensitivity [16]. This study developed a rule-based NLP system that improves on previous research. Second, an NLP system is needed for extracting words associated with LBP from CT radiology reports. In previous studies, the NLP system was not trained and tested on CT radiology reports [16-21]. In this study, the utilization of the NLP system was expanded to CT radiology reports for extracting words associated with LBP.
- Add more recent literature (2022-2021) that describes the works developed in the last few years and then conclude the advantages and disadvantages of each method.
The author did not answer my comment. The author only adds a new column to the table with the year of the published papers, which is not the answer to my comment.
Response
Thank you for your suggestion. We agree with your point, and we added more recent works (2022-2021) to the Discussion. We also described the advantages and disadvantages of each study's method. Thanks to your suggestion, we were able to add recent NLP models.
Manuscript
[Discussion]
Recently, NLP system research on clinical reports written in each country's official language has been conducted. Emilien et al. developed an NLP system for learning contextual embeddings from free-text clinical records in France [15]. Kim Y et al. developed an NLP system for identifying a Korean medical corpus using BERT models [37]. Dahl et al. developed an NLP system for classifying Norwegian pediatric CT radiology reports. A bidirectional recurrent neural network model, a convolutional neural network model, and a support vector machine model were used for training and testing the NLP system [38]. Fink et al. developed an NLP system for identifying oncologic outcomes from structured oncology reports written in German [39].
In previous research, machine learning or deep learning based NLP systems could predict multiple findings using the same datasets. Machine learning based NLP systems were also more scalable than rule-based models. However, machine learning or deep learning based NLP systems required larger datasets and higher set-up costs than rule-based NLP systems [16].
[References]
- Kim, Y.; Kim, J.H.; Lee, J.M.; Jang, M.J.; Yum, Y.J.; Kim, S.; Shin, U.; Kim, Y.M.; Joo, H.J.; Song, S. A pre-trained BERT for Korean medical natural language processing. Sci Rep 2022, 12, 13847, doi:10.1038/s41598-022-17806-8.
- Dahl, F.A.; Rama, T.; Hurlen, P.; Brekke, P.H.; Husby, H.; Gundersen, T.; Nytrø, Ø.; Øvrelid, L. Neural classification of Norwegian radiology reports: using NLP to detect findings in CT-scans of children. BMC medical informatics and decision making 2021, 21, 84, doi:10.1186/s12911-021-01451-8.
- Fink, M.A.; Kades, K.; Bischoff, A.; Moll, M.; Schnell, M.; Küchler, M.; Köhler, G.; Sellner, J.; Heussel, C.P.; Kauczor, H.U.; et al. Deep Learning-based Assessment of Oncologic Outcomes from Natural Language Processing of Structured Radiology Reports. Radiology. Artificial intelligence 2022, 4, e220055, doi:10.1148/ryai.220055.
- Avoid using many references together, such as [[1-7], [14-18],], etc. You should classify the studies and write a proper paragraph about each study or category.
The author did not answer my comment. See L63:[1, 3, 5], L66:[2, 6, 7], etc.
Response
Thank you for pointing out the mistake. Following your comment, we classified the references and removed or added some of them. The reference numbers have also been rearranged.
Manuscript
[Introduction]
Low back pain (LBP) is defined as pain and discomfort localized under the ribs and above the inferior gluteal folds, with or without leg pain [1]. 70-80% of adults can experience LBP in some form during their lives [2]. More than 85% of people under 45 years of age experience at least one LBP symptom that requires medication or interventional treatment [3]. LBP can be classified as acute or chronic depending on its onset. Acute LBP usually occurs suddenly, and if the pain persists for more than 3 months, it can be called chronic LBP [4]. LBP is one of the most common causes of hospital visits and the second leading cause of sick leave [5]. Because of its high direct and indirect costs, the health, social, and economic impacts on individuals, families, and society are significant [6]. Especially cases between 5% and 10% of chronic low back pain (CLBP) incur high costs and need long-term care [7]. It is important to develop tools that help predict accurately whether persistent pain will occur in patients with recently developed back pain [8].
- Still, many grammatical or spelling errors make the meaning unclear, and sentence construction errors need proofreading. Improve the English language, redaction, and punctuation in general. The manuscript should undergo editing before being submitted to the Journal again.
The following are some examples:
L66: Especially, cases that is between ....should be.... Especially cases between
L74: reports use a free form language ....should be.... reports use free-form language
L77: extracting the clinical information from the radiology reports. ....should be.... extracting clinical information from radiology reports.
L83: encoder representations from transformer ....should be.... encoder representation from the transformer
Response
Thank you for pointing out mistakes we might have missed. Following the reviewer’s comment, we modified some sentences in the manuscript. We also had the manuscript professionally proofread.
Author Response File: Author Response.docx