Article
Peer-Review Record

Discrimination of Tumor Texture Based on MRI Radiomic Features: Is There a Volume Threshold? A Phantom Study

Appl. Sci. 2022, 12(11), 5465; https://doi.org/10.3390/app12115465
by João Santinha 1,2,†, Linda Bianchini 3,†, Mário Figueiredo 2,4, Celso Matos 1,5, Alessandro Lascialfari 3, Nikolaos Papanikolaou 1,6, Marta Cremonesi 7, Barbara A. Jereczek-Fossa 8,9, Francesca Botta 10,* and Daniela Origgi 10
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 15 March 2022 / Revised: 25 May 2022 / Accepted: 26 May 2022 / Published: 27 May 2022
(This article belongs to the Special Issue Medical Image Processing and Analysis Methods for Cancer Applications)

Round 1

Reviewer 1 Report

This paper evaluates the effect of the size of the volume of interest on the usefulness of its radiomic features for classification. The authors generated test data with a phantom emulating a female pelvis, with inserts of various sizes and textures. Three different MRI scanners were then used to scan the phantom. The scans were used to extract the radiomic features, which were in turn used to distinguish between inserts of various sizes.

The paper attempts to answer a good question, but the methodology requires much more polishing, which is why I am recommending a major revision.

1. When answering a question such as 'what is the threshold', the methodology has to be more rigorous. Ideally, the authors may consider receiver operating characteristic (ROC) curves, which plot the true-positive rate against the false-positive rate, evaluated as a function of volume size. In the current state, the threshold of 7.4 seems very ad hoc.
2. There is only a single visualization of the phantom. I recommend adding a larger figure with images of several phantoms, including comparisons from all three scanners for a fixed subject.
3. I do understand the point of the comparison plots -- the authors are essentially isolating each radiomic and comparing classification accuracy for each volume size. However, a more robust classifier would combine all radiomics to output a single classification result. In such a case, perhaps the threshold volume can be decreased? 
4. The authors have alluded to a point that the same subject has varying accuracies across different scanners -- this is a concerning finding and may require further investigation. Specifically, the authors may visualize the subjects causing this inaccuracy, and pinpoint the exact textural feature leading to this finding.
5. The formatting of the paper is not polished. Several figures are cut off -- I believe this is possibly because of the figure size. I also wonder if it is possible to combine some of the figures and reduce the overall number of figures. In the current format, it is difficult to compare results across figures.
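
Point 1 above can be made concrete with a small sketch. The Python below computes, for a single radiomic feature, the area under the ROC curve (AUC) between two textures at each VOI volume, using the Mann-Whitney formulation. It is only an illustration: the function names, the feature values, and the pairing of those values with the volumes 3.3 and 7.4 are assumptions made for the example, not data or methods from the manuscript.

```python
def auc(pos, neg):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Counts the fraction of (pos, neg) pairs in which the value from
    `pos` is larger; ties count as half a win.
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def separability(texture_a, texture_b):
    """AUC folded into [0.5, 1] so the result does not depend on
    which texture happens to have the larger feature values."""
    value = auc(texture_a, texture_b)
    return max(value, 1.0 - value)


# Hypothetical feature values for two textures, keyed by VOI volume (cm^3).
example = {
    3.3: ([1.0, 1.4, 1.2, 1.6], [1.1, 1.5, 1.3, 1.7]),
    7.4: ([1.0, 1.1, 1.2, 1.3], [1.25, 1.5, 1.6, 1.7]),
}

auc_by_volume = {
    volume: separability(a, b) for volume, (a, b) in example.items()
}

for volume in sorted(auc_by_volume):
    print(f"{volume} cm^3: AUC = {auc_by_volume[volume]}")
```

An AUC near 0.5 at a given volume means the feature does not separate the two textures there. A threshold could then be reported as the smallest volume at which the AUC stays above a level fixed in advance, rather than being chosen by eye.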

Overall, this is a promising paper, but it requires some attention to detail before it can be published.

Author Response

See attached file.

Author Response File: Author Response.pdf

Reviewer 2 Report

The work under review is interesting and important. Not enough sophisticated radiomics phantom studies are being published. However, the authors need to improve the analysis and presentation of their results. What is shown here is a possible first step but to allow drawing conclusions, more energy needs to go into data analysis/visualisation.

Major comments:

Results & Discussion

Generally, the authors need to up their game in terms of analysis and visualisation of results. The figures 2-10 are not very informative. There is hardly any statistical analysis included here. The authors simply look at those figures and make observations from them which is a good start but should not be the end. There is very little novelty shown in terms of phantom development or data acquisition so the authors really need to show their skills in analysing the data.

The first thing I think needs optimising is the dots on those large, almost unreadable figures. The dot size is at the absolute lowest limit. The symbols used for the different VOI sizes could represent different sizes! My suggestion is to use circles (no filling) of different sizes, so that multiple dots on top of each other can be perceived as such and the largest circle represents the largest VOI, etc.

Next, the authors need to come up with a more sophisticated analysis summarising the different comparisons, features etc. One very simple suggestion could be to see if the VOI size correlates with percentage (as shown in fig 2-10) for a given feature? Could you group features and report on GLCM features etc. to see if some matrices are particularly good/poor? I would very much welcome further suggestions made by the authors for their analysis. The authors describe general tendencies and visual observations in the discussion but they are encouraged to develop corresponding ways of putting numbers to their observations and designing analysis and visualisation methods.
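
The correlation suggested above could be computed, for example, as a Spearman rank correlation between VOI volume and the percentage of discriminative features. The sketch below is a plain-Python illustration; the helper names and the volume and percentage values are hypothetical and are not taken from the manuscript.

```python
def ranks(values):
    """Return 1-based ranks of `values`, giving tied entries their
    average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical example: VOI volumes (cm^3) and the percentage of
# comparisons in which one feature discriminated the two textures.
volumes = [1.0, 3.3, 7.4, 15.0]
percent_discriminative = [35.0, 20.0, 70.0, 90.0]

rho = spearman(volumes, percent_discriminative)
print(f"Spearman rho = {rho:.2f}")
```

Reporting one such coefficient per feature (or per feature class and scanner) would put a number on the "larger volumes are more discriminative" tendency that is currently described only visually.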

Examples:

The authors describe “the general tendency for larger volumes to be more discriminative is confirmed across radiomic features”. Can any correlations be calculated/visualised? Etc etc

“Focusing on feature classes, it is possible to observe an overall trend for First-Order, GLCM, and NGTDM features to have a higher percentage of discriminative radiomic features across VOI sizes, while GLDM, GLRLM, and GLSZM radiomic features show lower percentage discrimination for smaller VOI sizes. Although several exceptions may be found to these tendencies, such exceptions might likely be a result of noise and artifacts that may affect these experimental results.” Again, summary statistics per feature group are needed to support this! Visualise it, and then rephrase.
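
One way such per-class summary statistics might be produced is sketched below: each record holds a feature class, a VOI volume, and whether the feature discriminated the two textures, and the function returns the percentage of discriminative features for every class and volume. The record layout, the function name, and all values are hypothetical; only the class names (GLCM, GLSZM) come from the quoted text.

```python
from collections import defaultdict


def percent_by_class(records):
    """Percentage of discriminative features per (class, volume).

    `records` is an iterable of (feature_class, volume, is_discriminative)
    tuples, one per feature and comparison.
    """
    counts = defaultdict(lambda: [0, 0])  # [discriminative, total]
    for feature_class, volume, is_discriminative in records:
        entry = counts[(feature_class, volume)]
        entry[0] += int(is_discriminative)
        entry[1] += 1
    return {key: 100.0 * hits / total for key, (hits, total) in counts.items()}


# Hypothetical records, not results from the manuscript.
records = [
    ("GLCM", 3.3, True),
    ("GLCM", 3.3, False),
    ("GLSZM", 3.3, False),
    ("GLSZM", 3.3, False),
    ("GLCM", 7.4, True),
    ("GLCM", 7.4, True),
]

summary = percent_by_class(records)
for (feature_class, volume), percent in sorted(summary.items()):
    print(f"{feature_class} at {volume} cm^3: {percent:.0f}% discriminative")
```

A table or bar chart of these percentages, one row per feature class and one column per VOI volume, would let readers check the stated trend directly.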

The authors state: “Scanner C, the 3T MRI scanner, demonstrated an overall lower discriminability of radiomic features across sizes, when compared with the 1.5T scanners (scanners A and B). Again, this result is likely due to the higher magnetic field that leads to higher water-fat shift artifacts, as observed by visual inspection,…” This is an extremely interesting observation! Please show example images for your observations. This would be extremely interesting as it shows a possible advantage of lower over higher field scanners (many centres would love to read this). Has something similar been shown previously? Would this encourage you to do further studies investigating this?

Please also explain in more detail and show examples of artifacts around the insert edge that is described in the discussion. Do these artifacts reach the VOIs?

“…it can be inferred that for 1.5 T scanners and for the specific setting considered in the present study (in terms of voxel size, MR sequence and imaged region), most radiomic features extracted from VOI volumes below ~7 cm³ likely fail to discriminate two textures…” Would it not be more appropriate to write “3.3 cm³ or smaller”, as we do not have data for volumes between 7 and 3.3 cm³?

“…If the textures under investigation are very different for the presence of areas with very high and very low signal intensities (like could occur, for example, in necrotic areas or in the presence of contrast medium accumulation) it could be advisable to set a higher VOI volume threshold,…” Is this not exactly what you are showing with mixtures of agar (very high SI) and Styrofoam (very low SI)? I would advise caution in this context and would even suggest any such suggestions are avoided.

M&M

Phantom:

The authors of this manuscript and of the referenced paper [8] state: the phantom container is “filled with an aqueous solution of MnCl2 to reproduce the transverse relaxation time T2 of the abdominal muscle tissue.”

I have not come across this phantom previously but was surprised by the hyperintensity of the MnCl2 in Figure 1, which made me look into the referenced paper [8]. Even when placed directly next to a T2-weighted image of a pelvis, the MnCl2 appears very hyperintense. Is this due to completely different window/level settings of the phantom image when compared to the human image?

I am finding it difficult to understand the appearance of the inserts versus the background, especially on T2w images, which I think Figure 1 is (agar brighter than MnCl2; please indicate the weighting/contrast in the legend!). In [8], the authors report a longer T2 relaxation time for all inserts, including Insert 3 (117 ms), than for the MnCl2 (57 ms). Why is Insert 3 then darker in Figure 1 of this paper and also in Figure 6 of paper [8]? Should it not be brighter due to the longer T2 relaxation time? Please forgive me if I am missing an important point here, but you investigated this phantom in a lot of detail, so please explain where my thinking goes wrong. I do appreciate that phantom development was not part of this work.

Table 1

It might be worth including the scanner models and sequence types here as well.

VOIs

Where were the VOIs placed along the z-axis of the phantom? At 50% of the height of the phantom? Please indicate.

When reading the first two paragraphs of page 5, it is not entirely clear from the wording what exactly was done for this analysis and whether anything was simply taken from the Bianchini et al. paper. Please change the wording so that it becomes crystal clear.

Minor points:

Introduction:

“As a matter of fact, when a tumor originates, the non-linear interactions happening at the genomic level result in intra-tumoral heterogeneity affecting its physiology and anatomy.” Please explain in one or more easier to understand sentence(s) the point you are making here. This will be difficult to understand for readers with an imaging-only background (explain ‘non-linear interactions at the genomic level’ and what they have to do with intratumoral heterogeneity).

Figures:

Starting with Figure 2, the legends are suddenly placed above the figures and often the legend on a page does not match the figure shown on the same page. This needs some reformatting for the next iteration of this work.

Author Response

See attached file.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have addressed several of my concerns from the previous version. I would still recommend carefully polishing the look of each figure. They seem quite big in places.

Author Response

We thank the reviewer for appreciating our efforts in the revision. We now upload a modified version of the manuscript with improved Figures.

We also provide separate full-resolution files for each Figure for possible editing (e.g., further scaling) by the Editorial team, and we are available in case additional modifications are required.
