Article
Peer-Review Record

Data Balancing Based on Pre-Training Strategy for Liver Segmentation from CT Scans

Appl. Sci. 2019, 9(9), 1825; https://doi.org/10.3390/app9091825
by Yong Zhang 1,2, Yi Wang 1,*, Yizhu Wang 2, Bin Fang 1, Wei Yu 1, Hongyu Long 1 and Hancheng Lei 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 2 March 2019 / Revised: 14 April 2019 / Accepted: 29 April 2019 / Published: 2 May 2019
(This article belongs to the Special Issue Intelligent Imaging and Analysis)

Round 1

Reviewer 1 Report

This study improves the mean Dice score in liver segmentation by enhancing hard samples (more-difficult-to-detect samples). The hard samples are identified by a pre-training model. The partitioned data and enhanced hard samples are fed into the network for liver segmentation. There is no improvement in the max Dice score, but there is an improvement in the min Dice score. I have some comments for the authors below:

In the last sentence of the Introduction section, the declaration that “We emphasize that the improvement of segmentation effects is not based on the model but on the enhancement of hard examples” is not true. In this study, the data enhancement provides an incremental improvement to the segmentation. The study doesn’t verify the segmentation effects with different models.

Please provide the details for the pre-training model. Did it use the same FCN model as the segmentation?

Are the selection ratios in Tables 1, A1, and A2 the ratios of selected training scans (part A) to the total number of scans (A + B)? Also, in Tables 1, A1, and A2, the mean, min, and max Dice scores are obtained from a number of trials, aren’t they? How many trials are used? This should be described in the manuscript.

In section 2.2, there is a statement that “(4) selecting slices from part A based on their classification results and then continue model training process until reaching 10×10⁴ iterations” (lines 91-93). This means only selected slices from A are used for the segmentation. However, in section 3, it is stated that “Hard samples in A dataset were enhanced by flipping and added to the B dataset” (lines 128-129). So, I’m wondering what part of the dataset is used for the segmentation. Please state it clearly and consistently throughout the manuscript.

Do the authors divide the dataset (A and B) into two datasets with a ratio of 1:1 (selection ratio 0.5)? How do you partition the dataset? Is it random or not?

The number of training iterations is not consistent throughout the manuscript. In (4) of section 2.2, the training process is executed with 10×10⁴ iterations; in section 2.4, it is 60×10⁴ iterations; and in section 3, it is 10×10⁴ iterations. Is this the number of training iterations for the segmentation task? Additionally, the pre-training is executed with 5×10⁴ iterations, isn’t it?

Some minor comments:

The last sentence of the second paragraph in page 2 (line 67-68) is not readable. Please revise it to be more readable.

The sentence “We find that in liver and kidney segmentation tasks.” in line 155-156, page 6, Discussion section is meaningless. Please double-check it.

There are some English typos, e.g. in the Figure 1 caption (“The read regions” -> “The red regions”), “the dentified hard samples” (-> “the identified hard samples”) in line 67, page 2, “[]” (missing reference) in line 99, page 3, and “Figure1” in line 137, page 4 (I believe it should be Figure 2). Please check the English writing throughout the manuscript.

 

Author Response

Q1: In the last sentence of Introduction section, the declaration that “We emphasize that the improvement of segmentation effects is not based on the model but on the enhancement of hard examples” is not true. In this study, the data enhancement is incremental improvement of the segmentation. The study doesn’t verify the segmentation effects based on different models.

 

A1: Thank you for your suggestion. The sentence “We emphasize that the improvement of segmentation effects is not based on the model but on the enhancement of hard examples” has been deleted from the manuscript. In addition, we have supplemented the segmentation results on the 2D U-Net structure in the appendix section: Table A3 (liver segmentation), Table A4 (kidney segmentation), and Table A5 (spleen segmentation).

 

 

Q2: Please provide the details for the pre-training model. Did it use the same FCN model as the segmentation?

A2: In this work, we used the same model structure and training strategy (section 2.3) in the pre-training process and the final training stage.

 

Q3: Are the selection ratios in Tables 1, A1, and A2 the ratios of selected training scans (part A) to the total number of scans (A + B)? Also, in Tables 1, A1, and A2, the mean, min, and max Dice scores are obtained from a number of trials, aren’t they? How many trials are used? This should be described in the manuscript.

A3: The selection ratio used in this work is the ratio of training scans in part B to the total number of scans in the training dataset (part A + part B); this information has been added to the manuscript (line 157). The mean, min, and max Dice scores in Tables 1, A1, A2, A3, A4, and A5 are obtained from the test dataset, which consists of 40 scans.

Q4: In section 2.2, there is a statement that “(4) selecting slices from part A based on their classification results and then continue model training process until reaching 10×10⁴ iterations” (lines 91-93). This means only selected slices from A are used for the segmentation. However, in section 3, it is stated that “Hard samples in A dataset were enhanced by flipping and added to the B dataset” (lines 128-129). So, I’m wondering what part of the dataset is used for the segmentation. Please state it clearly and consistently throughout the manuscript.

A4: Thank you for your advice. In this work, slices from part B are enhanced by flipping and mirroring, and these enhanced slices are used for the pre-training process. Hard samples in part A are enhanced by flipping, and these slices are then added to the training dataset (part B) used in the pre-training process. The relevant expression errors have been corrected in section 2.2 (lines 104-121) and section 3 (lines 175-176).
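The flip/mirror enhancement described above can be sketched as follows. This is an illustrative assumption of how such augmentation is typically done with NumPy, not the authors' actual implementation; the function name `augment_slice` is hypothetical:

```python
import numpy as np

def augment_slice(ct_slice):
    """Return the original CT slice plus a flipped and a mirrored copy.

    Illustrative only: the paper enhances part-B slices by flipping and
    mirroring, and hard part-A slices by flipping.
    """
    return [
        ct_slice,
        np.flipud(ct_slice),  # flip top-to-bottom
        np.fliplr(ct_slice),  # mirror left-to-right
    ]

s = np.arange(6).reshape(2, 3)
print(len(augment_slice(s)))  # 3 versions per slice
```

Each training slice thus contributes several geometric variants, which is one simple way to raise the proportion of a chosen subset (here, the hard samples) in the training pool.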

Q5: The number of training iterations is not consistent throughout the manuscript. In (4) of section 2.2, the training process is executed with 10×10⁴ iterations; in section 2.4, it is 60×10⁴ iterations; and in section 3, it is 10×10⁴ iterations. Is this the number of training iterations for the segmentation task? Additionally, the pre-training is executed with 5×10⁴ iterations, isn’t it?

A5: Thank you very much for pointing out the iteration mistakes in this work. In this study, the total number of iterations for the segmentation task is set to 10×10⁵. Only 5×10⁵ iterations are needed in the final training process if 5×10⁵ iterations were done in the pre-training process and the pre-training model structure is consistent with the final model, while 10×10⁵ iterations are needed in the final training stage if the pre-training model structure is inconsistent with the final model. In this study, we use the same model structure in the pre-training process and the final training stage. The related description has been reorganized in section 2.2 (lines 104-121), and the iteration-related errors have been revised in section 3 (line 177) of the revised manuscript. The pre-training is executed with 5×10⁵ iterations in this work.

Q6: The last sentence of the second paragraph in page 2 (line 67-68) is not readable. Please revise it to be more readable.

A6: Thank you for your advice. The sentence has been revised to: “Second, the hard slices identified by the pre-training model are selected and enhanced by flipping, and then these slices are added to the dataset used in the pre-training process to increase the ratio of hard slices in the training dataset and improve the contribution of hard slices in the training process.” (lines 86-89).

Q7: The sentence “We find that in liver and kidney segmentation tasks.” in line 155-156, page 6, Discussion section is meaningless. Please double-check it.

A7: The sentence “We find that in liver and kidney segmentation tasks.” has been deleted.

Q8: There are some English typos as in Figure 1 caption (“The read regions” -> “The red regions”), “the dentified hard samples” (“the identified samples”) in line 67, page 2, “[]” (lack of the reference) in line 99, page 3, “Figure1” in line 137, page 4 (I believe it should be Figure 2). Please check the English writing throughout the manuscript.

A8: Thank you for pointing out the writing and grammatical errors in the manuscript. We have carefully revised the manuscript according to the reviewers' comments and have also re-scrutinized it to improve the English.


Author Response File: Author Response.docx

Reviewer 2 Report

In this work, the authors address the problem of data imbalance in the segmentation tasks of CT scans. In particular, they use a pre-training strategy to distinguish hard samples from easy samples and increase the proportion of hard slices in training datasets. The results of the work show that the prediction ability of the model can be improved by increasing the ratio of hard samples in the training dataset.


I think that this problem is common to many classification tasks for segmentation and the authors' proposal may be scientifically relevant to the biomedical research community. However, there are some important points that must be revised.

1. For example, it would be appropriate for the readers of the journal "Applied Sciences" to specify the difference between hard and easy samples, providing illustrative examples, even for the case under investigation.


2. Frankly, I think that section "2.1 Dataset and processing" should be entirely reorganized, by providing a flowchart figure of the method and by explaining in detail all the steps of the proposed method. Also, please explain the choice of the number of iterations (for example, lines 89 and 93).


3. Please, explain better what the pre-training strategy represents.


4. Please, make sure that all references are reported in the main text (see line 99).


5. I recommend inserting a graph of the network architecture (2D FCN) for the sake of clarity.


In conclusion, I think the manuscript is interesting, but it is too concise in some salient parts. A reorganization of the methodology sections is recommended.

Author Response

Q1: For example, it would be appropriate for the readers of the journal "Applied Sciences" to specify the difference between hard and easy samples, providing illustrative examples, even for the case under investigation.

 

A1: Thank you for your constructive suggestion. Descriptive sentences on hard and easy samples, “(the easily segmented slices are called easy samples or easy slices, while the difficult slices are defined as hard samples or hard slices) in training datasets. As shown in Figure 2 (A and B), the features of some slices are obvious and easy to segment. However, in some others, as shown in Figure 2 (C and D), the features of the liver are not obvious, which may be due to poor quality of the CT image or defects of the liver itself (e.g., liver morphological variation, liver lesions, etc.), and it is difficult to accurately segment the liver from these slices. Moreover, it is easy to qualitatively divide hard samples and easy samples according to the segmentation results, but it is difficult or almost impossible to describe the characteristics of hard samples and easy samples, and to accurately distinguish them in the training dataset before the training process”, have been inserted into the introduction section (lines 47-55). An illustrative example of hard samples and easy samples has been inserted into the introduction section as Figure 2.
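As a concrete illustration of the hard/easy split discussed above, the Dice score used to rank slices by segmentation quality can be computed as below. This is a generic sketch of the standard Dice coefficient, not the authors' exact code; the mask values and threshold interpretation are illustrative:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# A slice whose prediction overlaps the ground truth poorly receives a
# low Dice score and would be flagged as a hard sample.
pred = np.zeros((4, 4), dtype=np.uint8)
target = np.zeros((4, 4), dtype=np.uint8)
pred[:2, :2] = 1       # predicted liver region (4 pixels)
target[:2, :] = 1      # ground-truth liver region (8 pixels)
print(round(dice_score(pred, target), 3))  # 0.667
```

Ranking each part-A slice by this score lets a threshold separate "easy" (high Dice) from "hard" (low Dice) slices after pre-training, without having to characterize hardness beforehand.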

 

Q2: Frankly, I think that section "2.1 Dataset and processing" should be entirely reorganized, by providing a flowchart figure of the method and by explaining in detail all the steps of the proposed method. Also, please explain the choice of the number of iterations (for example, lines 89 and 93).

A2: Thank you for your suggestion. The description of the proposed method and related details have been entirely reorganized in the manuscript (section 2.2, lines 104-121): “Inspired by the pre-training strategy, a pre-training model was used as a sample classifier to classify hard samples and easy samples in the training dataset. First, the whole training dataset was divided into two parts (A and B) based on simple statistics (e.g., the number of slices per volume, the proportion of positive and negative samples per volume). In this way, the ratio of positive and negative slices in the two subsets (A and B) can be guaranteed to be the same as that of the whole training dataset. Part A is used for the later sample classification and screening, while part B is used for model pre-training. Second, slices in part B are enhanced by flipping and mirroring, and these enhanced slices are used in the model pre-training process. We obtain a pre-training model when the model is trained to a set iteration (5×10⁵ iterations in this work). Third, the pre-training model is used to predict slices in part A, and all slices in part A are divided into two categories, hard samples and easy samples, by their Dice score. The hard slices in part A are then enhanced by flipping and added to the training dataset (part B) used in the pre-training process. Finally, we continue the training process until reaching the set 10×10⁵ iterations and obtain the final segmentation model. Only 5×10⁵ iterations are needed in the final training process if 5×10⁵ iterations were done in the pre-training process and the pre-training model structure is consistent with the final model, while 10×10⁵ iterations are needed in the final training stage if the pre-training model structure is inconsistent with the final model. In this study, we use the same model structure in the pre-training process and the final training stage.”
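The reorganized procedure above can be condensed into a short sketch that makes the data flow explicit. Every function, threshold, and value below is a hypothetical stand-in (the real work trains a 2D FCN for 5×10⁵ / 10×10⁵ iterations and scores slices against ground-truth masks); only the two-stage structure mirrors the text:

```python
import numpy as np

# Hypothetical stand-ins; names and bodies are illustrative only.
def train(model_state, slices, iterations):
    """Placeholder for one FCN training phase."""
    return {"trained_on": len(slices), "iters": iterations}

def predict_dice(model_state, ct_slice):
    """Placeholder: segment the slice and score it against ground truth.
    Here the slice mean stands in for a real Dice score."""
    return float(ct_slice.mean())

def flip_augment(ct_slice):
    return [np.flipud(ct_slice), np.fliplr(ct_slice)]

def pretrain_and_mine(part_a, part_b, hard_threshold=0.5):
    """Two-stage strategy: pre-train on part B, mine hard slices from
    part A by Dice score, enhance them, then continue training."""
    model = train(None, part_b, iterations=5 * 10**5)          # pre-training
    hard = [s for s in part_a if predict_dice(model, s) < hard_threshold]
    enhanced = part_b + [f for s in hard for f in flip_augment(s)]
    model = train(model, enhanced, iterations=5 * 10**5)       # final stage
    return model, len(hard)

# Toy volumes: slices with mean < 0.5 act as "hard" under the stand-in scorer.
part_a = [np.full((8, 8), v) for v in (0.2, 0.8, 0.4, 0.9)]
part_b = [np.full((8, 8), v) for v in (0.3, 0.7)]
_, n_hard = pretrain_and_mine(part_a, part_b)
print(n_hard)  # 2 hard slices mined from part A
```

The key design choice reflected here is that the same model state continues from pre-training into the final stage, which is why only the remaining 5×10⁵ iterations are needed when the architectures match.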

I'm very sorry for the inconvenience caused by our unclear statements in the manuscript. The choice of the number of iterations is based on the experimental results. In addition to 10×10⁵ iterations, we also experimented with 12×10⁵, 15×10⁵, and 20×10⁵ iterations. When the number of iterations exceeded 10×10⁵, the segmentation effect hardly improved with additional iterations; moreover, over-fitting appeared at 20×10⁵ iterations. So, we selected 10×10⁵ as the total number of iterations in this work. We determined the number of iterations in the pre-training stage (5×10⁵ iterations) using the same strategy.

Q3: Please, explain better what the pre-training strategy represents.

A3: The basic purpose of the pre-training strategy is to obtain a sample classifier that can distinguish hard and easy slices according to the needs of the actual segmentation task. The sentence “Therefore, the basic purpose of the pre-training strategy is to obtain a sample classifier, which could distinguish hard/easy slices according to the actual task need” has been inserted into the introduction section of the manuscript (lines 89-90).

Q4: Please, make sure that all references are reported in the main text (see line 99).

A4: Thank you for pointing out the reference-related mistake in the manuscript; the missing reference tag has been added (line 141).

Q5: I recommend inserting a graph of the network architecture (2D FCN) for the sake of clarity.

A5: Thank you for your suggestion. Network architecture of 2D FCN is inserted into appendix section, Figure A1.


Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors have addressed all the reviewer's previous concerns in the revision. Please be sure to mind and correct all the typos throughout the manuscript again.


read regions -> red regions (Line 57)

frist -> first (Lines 79 and 100)

Reviewer 2 Report

The authors are deemed to have provided acceptable responses to the Reviewer's comments.

Please improve the text editing throughout the manuscript.
