Article
Peer-Review Record

Classification of Thin-Section Rock Images Using a Combined CNN and SVM Approach

Minerals 2025, 15(9), 976; https://doi.org/10.3390/min15090976
by İlhan Aydın 1,*, Taha Kubilay Şener 1, Ayşe Didem Kılıç 2 and Hüseyin Derviş 1
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 22 August 2025 / Accepted: 12 September 2025 / Published: 15 September 2025
(This article belongs to the Special Issue Thin Sections: The Past Serving The Future)

Round 1

Reviewer 1 Report (Previous Reviewer 1)

Comments and Suggestions for Authors

The authors have carefully revised the paper, and its quality has improved significantly. The paper can therefore be accepted for publication.

Reviewer 2 Report (Previous Reviewer 2)

Comments and Suggestions for Authors

Dear Authors, I acknowledge your effort to improve the paper. The present form is much improved.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper presents a hybrid Convolutional Neural Network (CNN) and Support Vector Machine (SVM) approach for classifying thin-section rock images, with a focus on sedimentary, metamorphic, and igneous rocks. The method combines the strengths of VGG16 and EfficientNetV2 architectures, demonstrating improved accuracy over individual models. The overall structure and content of the paper are strong, but there are some areas that can be improved for clarity and robustness.

  1. The introduction provides a solid foundation on rock classification techniques but could benefit from incorporating mathematical methods such as fractal theory. Referencing relevant studies, such as DOI: 10.1021/acs.energyfuels.4c03095, would enhance the manuscript by offering a more robust theoretical background.
  2. Figure 2 (RGB brightness distribution) lacks sufficient annotations and key information. I suggest improving the quality of this figure, adding labels, and highlighting key features to make it more informative and representative of the dataset.
  3. The paper mentions the use of certain evaluation metrics (accuracy, precision, recall, F1-score), but it would be beneficial to cite the foundational works or references from which these metrics were derived. If these metrics were created by the authors, providing more context and references to their development would add credibility to their application.
  4. The combined CNN-SVM approach is innovative and promising. However, the paper would benefit from a deeper discussion on the pros and cons of the different methods, such as VGG16 and EfficientNetV2, particularly in comparison to other classification methods. A critical evaluation of their respective advantages and shortcomings will make the discussion more robust, and addressing potential limitations of the hybrid approach will provide a clearer understanding of its performance.
  5. The conclusion should more explicitly highlight the advantages of the proposed method, particularly regarding its accuracy and efficiency in distinguishing between rock types. A more detailed comparison with existing methods and a clearer explanation of how this approach improves upon them will significantly strengthen the final section.
  6. The methodology section is well-organized, but the explanation of feature extraction and the use of the ReliefF algorithm could be expanded. More details about the implementation process of these techniques would make it easier for readers to replicate the study and understand the significance of the feature selection process.
  7. The quality of the figures in the entire text does not meet the publication requirements. Please revise them one by one.
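
As an illustration of comment 3, the four metrics have standard definitions in terms of a multi-class confusion matrix. The following is a minimal Python sketch under that assumption, not the authors' code; the function name `per_class_metrics` and the example matrix are hypothetical and chosen only for illustration.

```python
def per_class_metrics(confusion):
    """Compute overall accuracy and per-class precision, recall and F1.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[c][c] for c in range(n_classes)) / total

    report = []
    for c in range(n_classes):
        tp = confusion[c][c]                                      # true positives
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp  # false positives
        fn = sum(confusion[c]) - tp                               # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        report.append((precision, recall, f1))
    return accuracy, report


# Hypothetical 3-class matrix (rows: true class, columns: predicted class),
# e.g. sedimentary, metamorphic, igneous. The counts are invented.
example_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
example_accuracy, example_report = per_class_metrics(example_confusion)
```

Averaging the per-class tuples gives macro-averaged scores. Whether the manuscript reports macro, micro, or weighted averages is not stated in the comments above.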

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors, the topic of your research is crucial for the geological community and would provide significant support for microscopic-scale analysis.
My review covers only the geological aspects, as my computational skills are limited. Based on my experience with microscopic studies, the paper as a whole lacks a clear definition of the problem and of the methodology applied. First, it is unclear what you want to identify when classifying an unknown rock image (e.g., the rock type: magmatic, sedimentary, or metamorphic). Second, on which microscopic features is your classification based? For example, identifying minerals is not enough to discriminate between rock types; objects such as clasts or fossils must be recognised before a rock can be assigned to the sedimentary group. Distinguishing between magmatic and metamorphic rocks is even harder, because it is the type of mineral phase, rather than a structure such as foliation, that determines whether a rock is metamorphic.
In your research, I expect a clear definition of the parameters to be checked and an indication of which parameters are recognised best. In addition, you mention subgroups, but they are not named. Are they real rock subgroups? Please address this point in both the methods and the discussion sections.
Throughout the paper, you refer to the ability to recognise mineral phases. Is that true? If so, for which mineral phases? In that case, you have to list them and report the overall recognition performance.
Figure 14 reports subgroups of a specific type of sedimentary rock. Is it an example of a capability that also applies to the other rock groups?
Based on this review, I suggest resubmitting your paper after addressing the fundamental problems identified.

Comments for author File: Comments.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript presents a hybrid deep learning approach (FSHNet) for classifying rock thin-section images, reporting very high accuracy figures on two 'external' datasets. However, the manuscript in its present form suffers from fundamental flaws in its structure, methodological reporting, scientific assertions, and overall presentation. The work lacks the scientific rigor and clarity required for publication. Key information regarding the experimental setup and datasets is missing, which makes the study non-reproducible and non-verifiable and puts it at odds with the FAIR principles. The introduction is poorly structured, the methods section contains both excessive and insufficient information, and several geological and technical assertions are questionable or incorrect. Furthermore, the manuscript is full of grammatical errors and awkward phrasing that impede readability.

Below, you can find specific comments:

1. The introduction chapter is overly long and lacks a clear narrative structure. It reads like a disjointed list of summaries of other papers rather than a focused review that builds a case for the present study. The citation practice is also poor! In many paragraphs it is unclear whether the authors are describing their own work, summarizing a previous study, or making a general statement. Several bold assertions are made without any supporting citation (see my specific comments within the attached .pdf manuscript).

I strongly recommend a comparative table to properly contextualize this study's contribution. This table should summarize previous key studies, detailing the methodologies, the specific datasets used (including the number of images and classes), and the reported accuracies. Comparing model performance is only meaningful if trained and tested on the same, or at least comparable, datasets (magnification of photomicrographs, number and ratio between PPL:XPL images used, type of classes/rocks, etc.).

2. The Materials and Methods chapter has severe deficiencies that prevent the study from being reproducible. The MS fails to adhere to FAIR principles. The authors do not state whether the model, code, or feature sets are publicly available. Without access to these materials, the results cannot be independently verified. Furthermore, there is no information on the hardware or software environment used for training (e.g., local machine vs. HPC cluster, GPU type, core libraries and versions). This information is standard and required for assessing computational cost and reproducibility.

3. The origin and composition of the primary dataset are ambiguous. The authors cite reference [37] and state that the dataset covers "more than 90% of the common rock types", a claim that is almost certainly an overstatement and lacks evidence. It is never made clear whether the authors use the dataset from [37] directly or have constructed their own. Figure 1, for example, is not explicitly attributed in its caption. The authors must provide a clear, unambiguous description of the datasets used (in the following chapters at least, and ideally in the abstract as well).

4. Another major problem here is that the critical parameters are presented without justification. The selection of the top 500 features using ReliefF appears arbitrary. I think that a sensitivity analysis is needed. Furthermore, the number of nearest neighbors (k) for the ReliefF algorithm is not mentioned, and the range of values tested for the SVM's C parameter during grid search is not specified.

5. One of my major concerns is that the geological descriptions are poor. Classifying igneous rocks as "neutral rocks" is vague, and the term is less common than "intermediate rocks". The context provided is insufficient for a specialized journal like Minerals. There are several other geological misunderstandings (see my specific comments in the attached .pdf MS).

6. You state that the dataset contains images from "single polarized and cross-polarized light," with "8 or 9 images for each sample," implying a heavily skewed ratio. The effect of this significant imbalance on model training is never discussed. For example, I browsed the dataset [37], and each sub-type has a ratio of 7:1:1 (XPL:PPL:CPL).

7. Chapter 2 is also bloated with unnecessary information. Figures 3, 4, and 5, which depict the architectures of well-known models like EfficientNetV2 and VGG16, are superfluous and lack proper citation to their original sources. A simple citation would suffice.

8. The relevance of Figure 2 ("RGB brightness distributions") is highly questionable. Thin-section petrography relies on optical properties like birefringence and extinction, which are not captured by simple RGB histograms. Its inclusion must be justified, or it should be removed.

9. The paper claims that "The extracted features capture distinctive mineral compositions, crystal structures, and textural differences of various rock types...". Determining crystal structures is not possible from 2D optical photomicrographs alone and requires other methods (e.g., XRD). This statement is scientifically inaccurate and must be corrected! Furthermore, in my opinion, the term 'textural' is not appropriate either, because the image magnification in [37] does not cover enough textural features of the main rock types and sub-types.

10. A key contribution is listed as achieving high generalization "without the need for any data augmentation". This is an unusual claim that requires significant discussion. The authors should justify this choice and ideally compare it against a model trained with standard augmentation techniques (e.g., rotation, flipping) to prove their assertion.

11. Another major point is that the MS introduces "FSHNet" but its definition is ambiguous. It is unclear if FSHNet refers to the entire pipeline (CNN feature extraction + ReliefF + SVM) or just the final classification stage. This is a source of major confusion in the results section.

12. In Table 3, the "Total training time" for FSHNet is left blank. This is a critical omission. The time required for feature selection and SVM training must be reported for a fair comparison.

13. In the discussion of Figure 9, the authors claim that clusters for EfficientNetV2B0 (Fig. 9b) are "not as clearly separated as those in VGG16". This interpretation is debatable on visual inspection and needs to be revised and better supported. I clearly see separated clusters!
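
To make the parameters raised in comment 4 concrete, the following is a minimal pure-Python sketch of ReliefF feature weighting, in which the number of nearest neighbours `k` and the number of retained features `n_top` (500 in the manuscript) appear explicitly. It is not the authors' implementation. It assumes numeric features, a Manhattan distance on range-normalised values, and that every sample is used once as a query. The names `relieff_weights` and `top_features` and the toy data are hypothetical.

```python
from collections import Counter


def relieff_weights(X, y, k=10):
    """Return one ReliefF relevance weight per feature (higher = more relevant).

    X is a list of numeric feature rows, y the matching class labels, and
    k the number of nearest hits or misses consulted per class.
    """
    n, d = len(X), len(X[0])
    # Normalise each feature difference by that feature's value range.
    span = []
    for j in range(d):
        column = [row[j] for row in X]
        span.append((max(column) - min(column)) or 1.0)
    prior = {c: count / n for c, count in Counter(y).items()}

    def diff(j, a, b):
        return abs(a[j] - b[j]) / span[j]

    def distance(a, b):
        return sum(diff(j, a, b) for j in range(d))

    weights = [0.0] * d
    for i in range(n):  # every sample serves once as the query instance
        for c in prior:
            candidates = [m for m in range(n) if m != i and y[m] == c]
            nearest = sorted(candidates, key=lambda m: distance(X[i], X[m]))[:k]
            if not nearest:
                continue
            # Hits (same class) lower the weight; misses raise it, scaled by
            # the prior of the miss class relative to all other classes.
            scale = -1.0 if c == y[i] else prior[c] / (1.0 - prior[y[i]])
            for m in nearest:
                for j in range(d):
                    weights[j] += scale * diff(j, X[i], X[m]) / (n * len(nearest))
    return weights


def top_features(weights, n_top):
    """Indices of the n_top highest-weighted features, best first."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return order[:n_top]


# Invented toy data: feature 0 separates the two classes, feature 1 does not.
toy_X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.4], [1.0, 0.8], [0.8, 0.2]]
toy_y = [0, 0, 0, 1, 1, 1]
toy_weights = relieff_weights(toy_X, toy_y, k=2)
toy_selected = top_features(toy_weights, n_top=1)
```

A sensitivity analysis of the kind requested would repeat the selection for several values of `k` and `n_top` and report the downstream classification accuracy for each pair, together with the range of SVM `C` values searched.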

Other minor concerns:
- the MS has numerous grammatical errors, typos, and awkward sentence constructions that require a thorough language review. Several words are comma-separated!

- there are instances of duplicated or near-duplicated text (e.g., the description of the petrographic microfacies dataset), indicating careless preparation.

- there are two tables labeled "Table 1" (on pages 6 and 12). The caption for the table on page 12 is also incorrect. All tables and figures need to be checked for correct numbering and accurate captions.

- the term "volcanic" is used interchangeably with "igneous," which can cause confusion. Consistent use of the primary category "igneous" is recommended.

Due to the multitude of severe and fundamental issues, including a lack of reproducibility, questionable methodological choices, scientifically inaccurate statements, and poor overall presentation, I cannot recommend this manuscript for publication in Minerals.

Comments for author File: Comments.pdf
