Article

Identification of AI-Generated Rock Thin-Section Images by Feature Analysis Under Data Scarcity

Faculty of Geology, Geophysics and Environmental Protection, AGH University of Krakow, al. Mickiewicza 30, 30-059 Kraków, Poland
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(15), 8314; https://doi.org/10.3390/app15158314
Submission received: 4 July 2025 / Revised: 22 July 2025 / Accepted: 24 July 2025 / Published: 25 July 2025

Abstract

An important aspect of geoscience and energy research is the analysis of microscopic images, where the assessment of rock properties combines imaging methods with numerical analysis. Given the significant advancements in generative artificial intelligence technologies in recent years, which have enabled the creation of realistic images, a need arises to assess the authenticity of visual data and to distinguish synthetic images from authentic geological ones. This article evaluates the potential for identifying artificially generated microscopic rock images. Synthetic images were generated using a widely accessible diffusion model, based on real training data. Expert evaluation noted high realism, though some structural and rock-type differences remained detectable. In the study, image descriptors were analyzed to assess their usefulness in distinguishing synthetic data from real data. Discriminative feature selection was conducted, and the effectiveness of various classification models based on the selected parameter sets was compared. The study also proposes a heuristic coefficient with demonstrated discriminative potential for the analyzed images. The results confirm the feasibility of building classifiers for synthetic images that could aid in detecting generated visual data in geological and petrographic research. They also serve as a foundation for further exploration of the importance of individual features in such applications.

1. Introduction

In recent years, there has been rapid advancement in generative artificial intelligence (GAI) technologies, which enable the creation of realistic images, texts, and other types of synthetic data. GAI models, such as Generative Adversarial Networks (GANs) and diffusion models, are increasingly being applied across various fields of science and technology [1]. In the domain of geosciences, this development creates new opportunities for image-based analysis, particularly in cases where access to empirical data is limited or expensive. One of the key aspects where image data is utilized in geology is the microscopic analysis of rocks. Microscopic techniques enable the identification and characterization of mineral components and textural structures, which are fundamental in lithology, petrography, and geochemistry studies [2]. This analysis can be supported by computational methods [3], including the classification of microscopic rock components [4], coal maceral identification [5], or even unconventional image-based search approaches [6,7]. Computer image analysis of microscopic thin-section images enables the precise determination of the area content of individual mineral phases. Applying machine learning methods in geological image analysis offers powerful tools to support scientific research. In particular, they facilitate advancements such as rock texture classification [8,9] and lithological identification [10,11]. However, in most cases, due to the high variability of rocks, manual adaptation of the code to the given rock and mineral type, or at least additional verification, is required. Also, effectively implementing these solutions requires large, diverse, and well-annotated datasets. Unfortunately, access to high-quality field data is often limited, particularly in remote or deeply situated locations, and rare geological structures appear infrequently.
Moreover, obtaining empirical data requires a significant time investment and is not always possible (e.g., in the case of very rare minerals or rocks). In response to these limitations, synthetic data is being increasingly used to train and test interpretive algorithms. Generative models enable the creation of images that replicate complex geological configurations, such as faults, uncommon stratigraphic sequences, or mineral deposits [12,13]. This promotes greater diversity in datasets and improves the modeling of rare or hard-to-access scenarios [14,15]. Synthetic data is also widely used in data augmentation as a method for increasing the representativeness of small datasets and enhancing the performance of training algorithms [16,17,18,19,20]. The development of generative artificial intelligence enables the transformation of one data modality into another, depending on the specific task. This includes models such as text-to-image (creating images based on textual descriptions), image-to-image (transforming images using patterns from other images), and text-to-code or text-to-text [21]. The growing accessibility of such tools, including pre-trained models available through open repositories like Hugging Face, means their use is no longer limited to specialists but is increasingly accessible to non-domain users. As the realism of synthetic images increases, a new risk emerges: the possibility of their unintentional use as real data. GAI models can generate images that replicate real-world patterns with such a high degree of realism that they may be mistaken for authenticity. Research into synthetic data detection is gaining increasing importance [22], particularly in fields such as medicine [23], fake information detection [23], and the identification of generated human faces [24,25]. While initial efforts are also emerging in geoinformation data, such as satellite image analysis [26], little attention has been paid to microscopic images in environmental sciences.
This can influence the reliability of scientific or engineering analyses. While employing synthetic geological imagery has clear practical benefits (e.g., as an augmentation method [16,20]), several concerns should be considered. The authors acknowledge that as the use of synthetic data expands, there is a risk of unintentional mixing of artificial and authentic geological datasets, especially considering that geological data are frequently stored locally (e.g., on servers, drives, or USB devices) among researchers. Consequently, synthetic data, often convincingly realistic, could inadvertently enter standard stereological analysis workflows, introducing significant inaccuracies. It is essential to recognize that synthetic images represent statistical configurations of pixels designed to resemble authentic geological data rather than a faithful record of geological processes occurring over the years. Incorporating such data into analyses, such as grain size measurements, grain orientation assessments, or anisotropy evaluations, could lead to misleading geological descriptors that propagate errors through subsequent multi-dimensional analyses and conclusions. An additional critical concern raised is the potential loss of natural variability within datasets. Specifically, if machine learning models predominantly learn from synthetic data (in the long-term perspective), their ability to accurately interpret genuine geological images might deteriorate. Moreover, despite privacy issues being less prevalent in geological contexts compared to social data, proprietary restrictions remain common, such as those imposed by oil corporations or due to strategic confidentiality, thus complicating the unrestricted sharing and validation of data. In the Earth sciences and energy sector, where data-driven analysis is crucial for reliable scientific conclusions, it is vital to develop robust methods for detecting synthetic data.
Therefore, this study aims to highlight these challenges by presenting basic, initial, and exploratory research, and encouraging the scientific community to start a discussion about robust methodologies capable of identifying critical geological features within mixed datasets of real and synthetic imagery. This study aims to analyze the feasibility of detecting synthetic microscopic images of rocks using global features and various classification models. This research serves as a starting point for further studies on detecting falsified image data and identifying key features that distinguish such data from real images.

2. Materials and Methods

The assessment of identification capability relied on analyzing feature sets derived from real and generated images. Empirical evaluations were conducted on parameters, and discriminative models were employed. Following this, classifiers were developed using various feature sets to evaluate their discrimination effectiveness.

2.1. Data Acquisition

Data acquisition involved using microscopic rock samples as training data for a generative text/image to image model [27]. The dataset utilized real-world samples of thin sections of rocks from various geological units. A rock thin section is a physical rock sample prepared for microscopic observation. The preparation process includes selecting a fragment of rock, cutting it, grinding the surface, and mounting it onto a microscope slide with a specialized adhesive. The sample is then further abraded and polished (to 0.02 mm thickness) to enable precise imaging under the microscope. The images were obtained using a polarized light microscope at a constant magnification (10×). The resulting images depicted various geological structures, such as fractures, conglomerates, and grains, that can be used in geological and petrographic analyses. An example of a real and a synthetic microscopic image of Dolomite is presented in Figure 1 and Figure 2.
The synthetic images were generated using a widely accessible tool that implements a diffusion-based algorithm and offers a relatively simple user interface [28]. The paid version of the tool, which includes an image-to-image (img2img) model fine-tuning feature, was used to simulate usage by individuals without programming expertise. According to the documentation [28], the application utilizes Stable Diffusion algorithms to fine-tune the model based on user-provided data. No additional fine-tuning or manual hyperparameter configuration is required by the user, although some options are available for configuration [27]. The user simply uploads their own data to the tool, which is then used to train the model accordingly. The authors emphasize the importance of data privacy, particularly in the context of utilizing cloud-based tools rather than locally hosted solutions. Given that the platform employed in this study operates in the cloud, it is not recommended for use with protected or confidential datasets. Consequently, the authors deliberately limited the dataset used in this work to a small subset of non-sensitive samples from their broader collection. Nevertheless, the authors opted to use this tool due to the high quality of the generated imagery and the ease with which image-to-image models could be fine-tuned. The option to train a custom text/image-to-image model was selected to generate synthetic images. The model was trained on authentic images, using 20 samples of a given rock along with the textual prompt “microscopic images” [27], which was the same for all rock types. Separate models were fine-tuned independently for each geological unit.
Key generation parameters included the following: model weight—indicating the influence of the trained model during image generation (ranging from 1.08 to 1.5 for the tested rocks); steps—the number of denoising iterations (fixed at 28 for all models); scale—guidance scale, controlling prompt adherence; and seed—a numerical value initializing the random number generator, unique for each image. The seed parameter was deliberately varied for every generated image. Due to the deterministic nature of diffusion-based algorithms, using the same seed along with an identical prompt and model configuration can yield the exact same image. This behavior ensures reproducibility and control over synthetic output. The fine-tuned models for each lithological unit used the following parameter ranges (Figure 1 and Figure 2): Dolomite I: Model weight: 1.5, Scale: 2 to 5; Dolomite II: Model weight: 1.5, Scale: 10; Limestone: Model weight: 1.17 to 1.26, Scale: 7 to 15; Schist: Model weight: 1.08, Scale: 3.5. All models shared consistent image dimensions and step count, while only the seed and scale varied across images to ensure diversity and detail retention and to balance between creativity and guided generation.
Upon completing the training process, it was possible to generate new data for the given rock type, using a minimal prompt consisting of a single white space character (to minimize the influence of the model’s predefined behavior). A total of 20 synthetic images were generated for each rock type, resulting in a dataset of 160 images (80 authentic and 80 generated) used in the study. An example of an image from a given group and its corresponding synthetic equivalent is presented in Figure 2.
A visual similarity is apparent. The generated images exhibited a relatively high level of realism for individuals without specialized knowledge in petrography; however, visual differences were noticeable to experts. These differences primarily concerned structural features, such as the arrangement of grains within the rock. It was empirically observed, among other things, that the diffusion algorithm tended to generate smaller grains with more vivid colors. Individuals with geological expertise compared real and synthetic images. The expert evaluation indicated that the best model fit was achieved for images of Dolomite I (Figure 3a), followed by Dolomite II and Schist, with the weakest fit for Limestone. In the case of Dolomite images, specialists did not observe significant differences. The synthetic Limestone images, however, were readily identifiable to geologists, who recognized that they did not represent actual rock structures (Figure 3b).
Standard image processing transformations were applied to prepare the data for feature extraction, a step necessary for further analysis. The images were converted to grayscale, and direct features were extracted. Where necessary, the Fourier Transform (FFT) was applied to analyze frequency-domain characteristics in both the central and peripheral regions of the image. Additionally, segmentation was conducted to measure coherent color clusters. After blurring the image, grayscale-level clustering (across 255 levels) was performed, followed by contour detection (using the Sobel operator), labeling, and calculating the geometric features of objects (grains) [6]. This procedure was based on the hypothesis that generated images may exhibit more stable color clusters, which could support their identification.
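The preprocessing chain described above can be illustrated with a minimal Python sketch using NumPy and SciPy. This is not the authors' actual pipeline; the gradient-threshold percentile and the quantization to 255 grey levels are assumptions made for illustration.

```python
import numpy as np
from scipy import ndimage

def preprocess_and_segment(rgb, blur_sigma=2.0):
    """Grayscale conversion, FFT view, blurring, grey-level quantisation,
    Sobel contour detection, and object labelling (a simplified sketch)."""
    gray = rgb.mean(axis=2)                          # simple luminance proxy
    spectrum = np.fft.fftshift(np.fft.fft2(gray))    # frequency-domain view
    blurred = ndimage.gaussian_filter(gray, sigma=blur_sigma)
    # Quantise to 255 grey levels to form coherent colour clusters
    levels = np.floor(blurred / blurred.max() * 254).astype(np.uint8)
    # Sobel gradient magnitude as a stand-in for contour detection
    gx = ndimage.sobel(blurred, axis=0)
    gy = ndimage.sobel(blurred, axis=1)
    edges = np.hypot(gx, gy)
    # Treat low-gradient regions as cluster interiors and label them
    mask = edges < np.percentile(edges, 60)          # heuristic threshold
    labels, n_objects = ndimage.label(mask)
    return spectrum, levels, labels, n_objects
```

Geometric features of each labelled object (area, perimeter, etc.) could then be measured, e.g., with `scipy.ndimage` or `skimage.measure.regionprops`.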

2.2. Feature Extraction and Selection

A range of global and local parameters was calculated from the preprocessed images. In total, 70 parameters were computed, consisting of 33 object-based descriptors and 37 global texture descriptors. The texture parameters were calculated based on the main thematic areas indicated in Table 1, including color, texture, and spectral characteristics.
For the segmented images, geometric features were extracted for objects representing color clusters (interpreted as grains). For each detected object, the following features were computed: area, eccentricity, solidity, extent, and perimeter. As the number of objects varied between images, these features were consolidated into a standardized global feature vector. This was achieved by computing statistical measures for each feature: mean, standard deviation, minimum, maximum, skewness, and kurtosis, resulting in a total of 30 descriptors. Additionally, the proportion of objects exceeding the median value of the area feature (fraction large), the total number of objects, and the number of characteristic clusters (determined using the elbow method) were calculated, constituting the 31st–33rd object-based parameters. During the feature extraction quality assessment stage, remediation techniques were applied; for example, a small constant value (eps) was added to the denominators to avoid division by zero. In selected cases, min–max normalization was used. Features were verified both manually (using an exploratory approach, visualizations, and class separability analysis) and through statistical and model-based methods:
  • Kruskal-Wallis test—a non-parametric alternative to ANOVA that does not assume normality of the distribution.
  • Random Forest—a classifier with feature importance evaluation, which assessed the impact of individual features using the permutation method (OOBPermutedPredictorDeltaError, MATLAB R2024b), accounting for nonlinear relationships. However, this method is susceptible to missing data or inadequate remediation, particularly in features based on segmentation of initial color clusters, where division by zero may occur.
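The two ranking approaches above can be sketched in Python with SciPy and scikit-learn. Note that the study used MATLAB's OOBPermutedPredictorDeltaError; scikit-learn's permutation importance is a comparable substitute, and the hyperparameters below are assumptions.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

def rank_features(X, y, n_top=10, seed=0):
    """Rank features by the Kruskal-Wallis test (lowest p-values first)
    and by Random Forest permutation importance (a hedged sketch)."""
    # Kruskal-Wallis: smaller p-value -> stronger class separability
    p_values = np.array([
        kruskal(X[y == 0, j], X[y == 1, j]).pvalue
        for j in range(X.shape[1])
    ])
    kw_rank = np.argsort(p_values)[:n_top]

    # Random Forest importance via feature permutation on the fit model
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X, y)
    perm = permutation_importance(rf, X, y, n_repeats=10, random_state=seed)
    rf_rank = np.argsort(perm.importances_mean)[::-1][:n_top]
    return kw_rank, rf_rank
```

Both rankings return feature indices, which can then define the "top features" scenarios described in Section 2.3.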

2.3. Image Type Identification

A comparison was made to evaluate the ability to identify image types (real vs. generated) using selected basic classifiers trained on different sets of input features. Six classification models were employed to assess the effectiveness of distinguishing real images from generated ones, based on feature vectors describing image structure. The data were split into a training set (80%) and a test set (20%), ensuring class balance was maintained. The classifiers used are presented in Table 2. A total of 36 classification scenarios were applied (Section 3), differing in the scope and type of features used. In the top features (Kruskal-Wallis) scenario, features were selected based on the lowest p-values in the non-parametric Kruskal-Wallis test. In the top features (Random Forest) scenario, features were selected based on their importance determined via the permutation method. The all-features scenario used the complete feature set without any selection. The global features scenario (texture-based features) was limited to texture and frequency-domain parameters, while the object-based scenario used only geometric features derived from object analysis after segmentation. The dimensionality reduction scenario was based on all features transformed using Principal Component Analysis (PCA).
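A minimal scikit-learn sketch of the comparison protocol follows: an 80/20 stratified split and the six model families named in the text. The exact hyperparameters used in the study are not given, so defaults are assumed here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

def compare_classifiers(X, y, seed=0):
    """Train six model families on an 80/20 stratified split and return
    their test-set accuracies (an illustrative sketch, default settings)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    models = {
        "RandomForest": RandomForestClassifier(random_state=seed),
        "DecisionTree": DecisionTreeClassifier(random_state=seed),
        "NaiveBayes": GaussianNB(),
        "KNN": KNeighborsClassifier(),
        # Scale inputs for the distance/margin-based and neural models
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "MLP": make_pipeline(StandardScaler(),
                             MLPClassifier(max_iter=1000, random_state=seed)),
    }
    return {name: m.fit(X_tr, y_tr).score(X_te, y_te)
            for name, m in models.items()}
```

The same loop can be rerun per feature scenario (top-KW, top-RF, all features, texture-only, object-only, PCA) to reproduce the 36-scenario grid.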

3. Results and Discussion

The study examined 160 images across four geological groups, with 20 real and 20 synthetic images for each group. A total of 70 parameters were tested, encompassing texture feature analysis, frequency-domain characteristics, and object-based features. Tests assessing the informativeness of the parameters yielded varying results. In the authors’ opinion, both the Random Forest test and the Kruskal-Wallis test proved to be relatively practical tools in the feature identification process in this study, as they enabled automatic ranking that was relatively consistent with the empirical structure of the data. Figure 4 presents feature rankings obtained using the Kruskal-Wallis (KW) test and the Random Forest (RF) method.
It is worth noting that in both the Kruskal-Wallis test and the Random Forest method, the features with high statistical significance aligned with the intuition of geological experts, highlighting differences in grain size and texture correlation. It is essential to note that, when analyzing object-based features, the uniform segmentation algorithm frequently returned single-pixel objects or failed to accurately separate structures. This is particularly relevant, as geological images represent a challenging case for segmentation, usually requiring dedicated, specialized segmentation methods tailored to different rock types. Introducing a universal and robust segmentation algorithm into the pipeline would likely improve the accuracy of the discriminative models. In the present study, this step was omitted in favor of a simplified approach: the aim was to analyze the size of coherent color clusters as a potentially distinguishing indicator between synthetic and real data. Features based on the statistics of these clusters often achieved high informational value; however, it should be emphasized that they were also sensitive to missing data, particularly in cases where segmentation failed to identify any coherent objects. This necessitated decisions regarding the elimination of certain features or samples. It was also observed that some features varied in their ability to distinguish between real and generated images. A consistent trend across all analyzed groups was the noticeable dispersion of feature values in real images. In contrast, generated images exhibited greater uniformity, with their values tightly clustered around the mean.
Interesting results were observed during the empirical analysis of the parameters. A particularly noteworthy finding was the relatively high separability of the DFT Center To Edge Ratio, C/E, calculated in the frequency domain. This feature was computed as the ratio of central to peripheral spectral energy, expressed as a function of the radius r (Equations (1) and (2)). It distinguished real from synthetic images across multiple geological groups, likely due to the higher regularity and symmetry characteristic of generated images. As the radius increases, more low- and mid-frequency components are included in the central region, reducing the share of peripheral energy.
C/E(r) = \frac{E_{\text{center}}(r)}{E_{\text{edge}}(r) + \varepsilon} \qquad (1)
E_{\text{edge}}(r) = E_{\text{total}} - E_{\text{center}}(r) \qquad (2)
This parameter was calculated within a fixed analysis window, defined as a circle with a radius of 80 pixels (Figure 5). This setup enabled a direct comparison of energy between the center and the periphery of the Fourier spectrum within the same image.
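The C/E computation can be sketched in NumPy as follows. This is an illustrative reading of Equations (1) and (2); the use of the squared spectrum magnitude as the energy measure is an assumption.

```python
import numpy as np

def center_to_edge_ratio(gray, radius=80, eps=1e-9):
    """Ratio of central to peripheral spectral energy, C/E(r), on the
    shifted power spectrum, with a circular mask of the given radius."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    power = np.abs(spectrum) ** 2
    h, w = power.shape
    yy, xx = np.ogrid[:h, :w]
    # Distance of every frequency bin from the spectrum centre
    dist = np.hypot(yy - h / 2, xx - w / 2)
    e_center = power[dist <= radius].sum()
    e_edge = power.sum() - e_center      # E_edge = E_total - E_center
    return e_center / (e_edge + eps)     # eps guards against division by zero
```

Smooth, low-frequency-dominated images yield large C/E values, whereas images rich in high-frequency content concentrate more energy at the periphery.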
Although features such as skewness and total spectral energy remained relatively similar between the two classes, the contrast between central and peripheral spectral energy (i.e., low vs. high frequencies) proved to be significant. Figure 6 also illustrates example images in the frequency domain, with white circular masks marking the central energy region for radii of 100 and 399 pixels.
A higher concentration of high-frequency components is observed within the same masked area for generated images compared to real ones. With an increasing radius, the edge energy difference diminishes (Figure 7a), while the ratio of central-to-edge energy (Figure 7b) increases. The overall trend is similar for both real and generated images; however, the energy values are consistently higher for the generated images, making the differences distinguishable.
Both parameters can support feature engineering and may be incorporated into a heuristic indicator. For example, a clear relationship was observed in which the product of the C/E coefficient and the skewness of eccentricity effectively separated the analyzed image set. On average, this combined metric showed negative values for real images and positive values for generated ones (Figure 8). This suggests the possibility of defining a relatively dimensionless heuristic coefficient that separates images not only by image type but also by rock type. A heuristic coefficient of this kind could prove particularly useful in practice, as it offers a dimensionless, easily computable indicator for rapid preliminary detection, potentially reducing the need for complex, data-specific models such as deep learning architectures. While the metric should be further developed, evaluated, and validated, for instance by exploring the universality of the segmentation methods used to extract the eccentricity parameter, its practical value as a lightweight screening tool (e.g., in automated image analysis pipelines, particularly in scenarios where computational resources or annotated data are limited) already seems apparent.
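The proposed heuristic reduces to a one-line combination of the two quantities. The helper below is hypothetical (the exact formulation is not specified beyond "product of C/E and the skewness of eccentricity"); on the studied dataset, negative values suggested real images and positive values generated ones.

```python
import numpy as np
from scipy.stats import skew

def heuristic_coefficient(ce_ratio, eccentricities):
    """Hypothetical sketch of the proposed screening indicator:
    the product of the spectral C/E ratio and the skewness of the
    per-object eccentricity distribution."""
    return ce_ratio * skew(np.asarray(eccentricities, dtype=float))
```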
There are additional features that can help distinguish between real and generated images. In the authors’ opinion, although these features may not demonstrate as clear a separation as the dependence of edge energy on the radius of the central mask in the frequency domain (DFT), they still represent a valuable source of information and can be used to develop custom classifiers. Box plots of the normalized values of selected features are shown in Figure 9 (texture) and Figure 10 (objects).
Another interesting case involves using a disproportionality coefficient based on Benford’s Law (Figure 9b, Equations (3) and (4)), which describes the probability distribution of the first digits in numerical data. For the analyzed dataset, Benford Deviation values were slightly but consistently higher for generated data, suggesting subtle differences in numerical structure between natural and synthetic images. This may be an intriguing starting point for further data authenticity or provenance analyses. The following steps were performed to calculate this feature: the grayscale image was converted into a vector, the absolute values of pixel intensities were extracted, and each value was rounded down to the nearest integer. Zeros were removed, since zero cannot appear as the first digit and is not covered by Benford’s Law. The frequency of occurrence of digits 1 to 9 as the leading digits was then computed. Finally, the sum of differences between the observed distribution and the theoretical Benford distribution [29] was calculated (Equations (3) and (4)).
P_{\text{exp}}(i) = \log_{10}\!\left(1 + \frac{1}{i}\right), \quad i = 1, \ldots, 9 \qquad (3)
\text{Benford Deviation} = \sum_{i=1}^{9} \left| P_{\text{obs}}(i) - P_{\text{exp}}(i) \right| \qquad (4)
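The steps listed above translate into a short NumPy routine; this is an illustrative sketch of Equations (3) and (4) applied to pixel intensities, not the authors' exact implementation.

```python
import numpy as np

def benford_deviation(gray):
    """Sum of absolute deviations between the leading-digit distribution
    of pixel intensities and the theoretical Benford distribution."""
    # Vectorise, take absolute values, round down to integers
    values = np.floor(np.abs(gray).ravel()).astype(np.int64)
    values = values[values > 0]          # zero has no leading digit
    # Leading decimal digit of each remaining value
    first = values // 10 ** np.floor(np.log10(values)).astype(np.int64)
    observed = np.array([(first == d).mean() for d in range(1, 10)])
    expected = np.log10(1 + 1 / np.arange(1, 10))   # Benford's law
    return np.abs(observed - expected).sum()
```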
In analyzing object-based features (Figure 10) derived from early-stage color clusters, several parameters enabled the distinction between real and generated images. One particularly relevant feature was Eccentricity, especially its skewness and mean. Another interesting finding concerned differences in the mean area metric (average object area), where synthetic data exhibited a lower median and reduced variability compared to real data. Although the absolute values varied across rock types, the key observation was the consistency of the overall trend rather than the specific results for individual samples. A similar pattern was observed for the number of objects metric, interpreted as the number of detected homogeneous clusters. In all analyzed samples, generated images consistently exhibited lower values for this metric than real images. This may indicate lower structural diversity in synthetic data, simplified morphology, and a tendency of the generative model to produce more homogeneous regions without subtle textural transitions. The most pronounced discrepancies were observed in the Schist and Dolomite II samples, suggesting that generative algorithms may struggle to replicate complex geological structures, consistent with expert observations. The absence of the higher object counts typical of real images may reflect the model’s limitations in capturing local morphological nuances and its tendency to smooth out content, leading to cluster merging and a reduced cluster count.
To evaluate the effectiveness of selected features in classifying images as “real” or “generated,” a comparative experiment was conducted using six different classifiers. The dataset was split into training and test sets in an 80/20 ratio, and the models were trained and evaluated. Figure 11 presents the confusion matrices obtained from classification using all available features, without remediation, after the prior removal of non-informative features containing unmeasurable values (e.g., NaN). The results obtained were satisfactory: only a few individual images were misclassified, and the Random Forest classifier correctly classified all images without confusion. The presence of NaN values is directly linked to segmentation challenges, particularly when using a universal (i.e., identical) segmentation method across all image types and rock categories. In geological practice, segmentation methods are often adapted to specific rock types due to the need to detect different geological characteristics, such as anisotropy, grain size distribution, or tortuosity. Therefore, applying a single segmentation strategy may lead to suboptimal feature extraction in some cases, resulting in missing values (NaNs). The following (Figure 12) shows the classification results after applying a remediation strategy, which involved imputing missing NaN values with the mean value calculated within the same group (defined by rock type and image type). Further improvements would likely require either tailored segmentation strategies that better reflect the geological diversity of the samples or the development of a highly universal method, both of which remain significant challenges and represent intriguing, albeit niche, scientific problems.
As shown in Figure 12, the application of remediation, i.e., retaining all features and samples in the dataset and replacing missing values with typical values for the corresponding group, significantly improved classification results. The findings demonstrate that even a simple imputation method, when appropriately tailored to the data context, can substantially enhance model performance and reduce information loss caused by the prior exclusion of features or samples containing missing values.
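The group-wise imputation described above can be expressed compactly with pandas. The column names (`rock_type`, `image_type`) are hypothetical placeholders for the grouping variables defined in the text.

```python
import pandas as pd

def impute_group_means(df, group_cols=("rock_type", "image_type")):
    """Replace NaNs in each numeric feature with the mean of the same
    (rock type, image type) group, as in the remediation step (a sketch)."""
    feature_cols = df.columns.difference(list(group_cols))
    df = df.copy()
    df[feature_cols] = (
        df.groupby(list(group_cols))[feature_cols]
          .transform(lambda s: s.fillna(s.mean()))
    )
    return df
```

Because the fill value is computed within the group rather than over the whole dataset, the imputation preserves between-group differences instead of pulling all samples toward a global mean.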
The highest accuracy (>95%) on the validation set was achieved in configurations such as texture-only features, all features, and features selected using the Random Forest method. High performance was also achieved using Random Forest, decision tree, Naive Bayes, and neural network models. In these cases, precision and recall exceeded 90%, indicating strong effectiveness in distinguishing between generated and real images. On the other hand, SVM and KNN models yielded significantly lower results in many configurations, particularly when using PCA, the complete feature set, or features selected via the Kruskal-Wallis method, often falling below 60%. This suggests that these models were more sensitive to data representation or did not perform well with certain feature sets. In the case of feature selection using the Kruskal-Wallis method, outstanding results were obtained with Random Forest and decision tree models. Although KNN performed well overall, it showed sensitivity to feature selection methods, such as PCA or Kruskal-Wallis, where performance occasionally dropped below 60%. In contrast, feature selection using the Random Forest method enabled almost all models to achieve strong results, confirming the effectiveness of this approach for selecting features for classification. Random Forest, both as a model and as a feature selection method, proved to be the most stable solution. High precision, recall, and F1-score values in most configurations indicate that the evaluated classifiers effectively distinguished between generated and real images, assuming a consistent generative algorithm and supervised learning framework. The limited size of the dataset used in this study introduces an inherent risk of overfitting, particularly in deep learning applications.
While this work represents a preliminary exploration, the main objective was to demonstrate a potential processing pipeline and to explore direct representations of features, as well as the heuristic parameters, which in this case showed promising discriminative potential. Model outputs were primarily assessed through manual inspection of results. However, in real-world analytical scenarios, the risk of overfitting must be addressed more rigorously. Overfitting may manifest through excellent performance on training data but poor generalization to new or unseen samples, potentially leading models to capture dataset-specific noise rather than meaningful geological structures. Future studies should implement appropriate validation strategies, such as cross-validation, regularization, and testing on independent datasets, to ensure reliable performance and avoid misleading interpretations, especially when synthetic data are incorporated into geological analysis pipelines.

4. Conclusions

This study addressed the problem of authenticity detection in microscopic images of rocks, focusing on distinguishing synthetic data generated using a diffusion model from real data. Various features were analyzed, encompassing textural, geometric, and frequency-domain parameters, to identify effective indicators for classification. The results indicate that frequency-domain features, particularly the central-to-peripheral spectral energy ratio, are especially informative. Additional valuable insights were provided by the geometric features of initial color clusters (e.g., eccentricity, cluster size distribution, and count), as well as textural metrics such as gradient-based gray-level co-occurrence matrix (GDLCM) correlation and deviation from Benford’s Law. Relatively high classification performance was achieved, with accuracy exceeding 90% in specific configurations, although these results naturally depended on the selected feature set and classifier configuration. The best-performing setup achieved an accuracy above 95%, with correspondingly high precision, recall, and F1-score, particularly when Random Forest was used for both feature selection and classification. These findings highlight the discriminative potential of carefully chosen descriptors across multiple feature domains.
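The central-to-peripheral spectral energy ratio (C/E) highlighted above can be reconstructed in a few lines. The sketch below assumes a fixed circular mask on the centered power spectrum (cf. Figures 5 and 6); the exact normalization used in the paper may differ, and the input here is a random placeholder for a grayscale thin-section image.

```python
# Minimal sketch of the C/E coefficient: energy inside a central circle of
# the shifted DFT power spectrum, divided by the energy outside it.
import numpy as np

def center_to_edge_ratio(gray: np.ndarray, r: int = 80) -> float:
    spec = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = spec.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h / 2) ** 2 + (xx - w / 2) ** 2 <= r ** 2
    center = spec[mask].sum()      # low-frequency energy
    edge = spec[~mask].sum()       # high-frequency (peripheral) energy
    return center / edge if edge > 0 else np.inf

rng = np.random.default_rng(2)
img = rng.random((400, 400))       # placeholder grayscale image
print(center_to_edge_ratio(img, r=80))
```

Smooth, low-frequency-dominated images concentrate energy in the central disk and yield a higher ratio than noisy, high-frequency-rich ones, which is the behavior the discriminative analysis relies on.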
This study is among the first systematic efforts to identify synthetic elements in microscopic geological images using handcrafted features and traditional machine learning models. The authors consider this research topic forward-looking, even though the issue is not yet widely recognized within geology. Fields more closely tied to everyday life, where deepfakes threaten individual privacy or enable direct personal harm, are more immediately exposed. The risks of artificially generated images in geology are fundamentally different from those on social media: geological images do not pose immediate direct threats such as bullying or personal harm. The authors see great potential in applying GAI techniques to geology and expect their use to grow in the future. However, if synthetic images are inadvertently mixed with real datasets, especially without proper labeling or validation, there is a risk of introducing biases or misleading signals into downstream models. This could compromise the reliability of geological interpretations, particularly in tasks such as grain-size analysis, phase classification, or structural anisotropy measurements. The authors have therefore initiated research on this subject, presenting preliminary findings based on a simplified and relatively small dataset to stimulate scientific discussion. In the authors’ opinion, the use of artificial images is expected to increase. Such functionality could be particularly valuable, for example, in automating stereological measurements, where synthetic models can augment datasets and thus improve machine learning models when limited real data are available.
Practically, the suggested approaches could enhance automated quality control within digital petrography workflows, helping researchers and analysts confirm the validity of microscopy datasets used in scientific modeling or resource evaluation. The proposed workflow may be especially valuable for researchers and practitioners working with two-dimensional images of rock thin sections, such as in early-stage geological studies or where sample collection is logistically challenging or resource-intensive. The proposed solution serves not primarily as a security mechanism, but rather as a quality control tool supporting validation workflows, especially in automated image-based geological analyses. This includes fields such as sedimentary petrology, mineral characterization, and automated stereological analysis. The authors emphasize that while generative artificial intelligence presents numerous valuable prospects, it also introduces risks associated with the generation and use of counterfeit data. Future studies could explore the behavior of key parameters, such as the spectral energy ratio, in greater depth and expand the analysis to a significantly larger dataset of both real and synthetic images. Such work could confirm the current findings across larger and more diverse datasets and test the framework against outputs from a variety of generative models; incorporating GAN- or VAE-based generators, for example, would allow a more detailed investigation of their impact on classification performance and generalizability. While the current findings rely on a controlled dataset, the presented pipeline and methods are applicable in a broader context.
Ultimately, the techniques introduced in this study may serve as a robust basis for future research and the development of reliable methodologies for distinguishing synthetic from authentic images in geoscientific analysis, a capability that is likely to become increasingly critical in the coming years, given the rapid advancement and broad accessibility of artificial intelligence technologies.

Author Contributions

Conceptualization, M.H.; methodology, M.H. and M.D.; software, M.H.; validation, M.D.; formal analysis, M.H. and M.D.; investigation, M.H. and M.D.; resources, M.D.; data curation, M.H.; writing—original draft preparation, M.H.; writing—review and editing, M.H. and M.D.; visualization, M.H.; supervision, M.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The datasets generated and analyzed during the current study are not publicly available due to their acquisition as part of individual research efforts and their ownership by AGH University of Science and Technology.

Acknowledgments

This work was financed within the framework of the statutory research of the AGH University of Krakow, Faculty of Geology, Geophysics, and Environmental Protection.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, Z.; Zhang, J.; Zhang, X.; Mai, W. A comprehensive overview of Generative AI (GAI): Technologies, applications, and challenges. Neurocomputing 2025, 632, 129645. [Google Scholar] [CrossRef]
  2. Mishra, D.A.; Ram, B.K. A review on evaluation of microstructural parameters to estimate the strength of virtually isotropic rock materials. Geotech. Geol. Eng. 2024, 42, 4627–4649. [Google Scholar] [CrossRef]
  3. Long, T.; Zhou, Z.; Hancke, G.; Bai, Y.; Gao, Q. A review of artificial intelligence technologies in mineral identification: Classification and visualization. J. Sens. Actuator Netw. 2022, 11, 50. [Google Scholar] [CrossRef]
  4. Santoro, L.; Lezzerini, M.; Aquino, A.; Domenighini, G.; Pagnotta, S. A Novel Method for Evaluation of Ore Minerals Based on Optical Microscopy and Image Analysis: Preliminary Results. Minerals 2022, 12, 1348. [Google Scholar] [CrossRef]
  5. Skiba, M.; Młynarczuk, M. Identification of macerals of the inertinite group using neural classifiers, based on selected textural features. Arch. Min. Sci. 2018, 63, 827–837. [Google Scholar]
  6. Habrat, M.; Młynarczuk, M. Granulation-Based Reverse Image Retrieval for Microscopic Rock Images. In International Conference on Computational Science; Springer International Publishing: Cham, Switzerland, 2020; pp. 74–86. [Google Scholar]
  7. Ładniak, M.; Młynarczuk, M. Search of visually similar microscopic rock images. Comput. Geosci. 2015, 19, 127–136. [Google Scholar] [CrossRef]
  8. Wang, C.; Li, P.; Long, Q.; Chen, H.; Wang, P.; Meng, Z.; Zhou, Y. Deep learning for refined lithology identification of sandstone microscopic images. Minerals 2024, 14, 275. [Google Scholar] [CrossRef]
  9. Zhang, Y.; Li, M.; Han, S.; Ren, Q.; Shi, J. Intelligent identification for rock-mineral microscopic images using ensemble machine learning algorithms. Sensors 2019, 19, 3914. [Google Scholar] [CrossRef]
  10. Ma, W.; Han, T.; Xu, Z.; Lin, P. Feature fusion of single and orthogonal polarized rock images for intelligent lithology identification. AI Civ. Eng. 2025, 4, 5. [Google Scholar] [CrossRef]
  11. Xu, Z.; Shi, H.; Lin, P.; Liu, T. Integrated lithology identification based on images and elemental data from rocks. J. Pet. Sci. Eng. 2021, 205, 108853. [Google Scholar] [CrossRef]
  12. Abdellatif, A.; Elsheikh, A.H.; Busby, D.; Berthet, P. Generation of non-stationary stochastic fields using Generative Adversarial Networks. arXiv 2022, arXiv:2205.05469. [Google Scholar] [CrossRef]
  13. Hadid, A.; Chakraborty, T.; Busby, D. When geoscience meets generative AI and large language models: Foundations, trends, and future challenges. Expert Syst. 2024, 41, e13654. [Google Scholar] [CrossRef]
  14. Pierdicca, R.; Paolanti, M. GeoAI: A review of artificial intelligence approaches for the interpretation of complex geomatics data. Geosci. Instrum. Methods Data Syst. Discuss. 2022, 11, 195–218. [Google Scholar] [CrossRef]
  15. Zhang, W.; Gu, X.; Tang, L.; Yin, Y.; Liu, D.; Zhang, Y. Application of machine learning, deep learning and optimization algorithms in geoengineering and geoscience: Comprehensive review and future challenge. Gondwana Res. 2022, 109, 1–17. [Google Scholar] [CrossRef]
  16. Ferreira, I.; Ochoa, L.; Koeshidayatullah, A. On the generation of realistic synthetic petrographic datasets using a style-based GAN. Sci. Rep. 2022, 12, 12845. [Google Scholar] [CrossRef]
  17. Nathanail, A. Geo Fossils-I: A synthetic dataset of 2D fossil images for computer vision applications on geology. Data Brief 2023, 48, 109188. [Google Scholar] [CrossRef]
  18. Arlovic, M.; Damjanovic, D.; Hrzic, F.; Balen, J. Synthetic Dataset Generation Methods for Computer Vision Application. In Proceedings of the 2024 International Conference on Smart Systems and Technologies (SST), Osijek, Croatia, 16–18 October 2024; pp. 69–74. [Google Scholar]
  19. Saif, A.; Alnagi, E.; Ahmad, A. Texture-Based Classification of Geo-Fossils. In International Conference on Information Integration and Web Intelligence; Springer: Cham, Switzerland, 2025; pp. 226–236. [Google Scholar]
  20. Qaderi, S.; Maghsoudi, A.; Pour, A.B.; Rajabi, A.; Yousefi, M. DCGAN-Based Feature Augmentation: A Novel Approach for Efficient Mineralization Prediction Through Data Generation. Minerals 2025, 15, 71. [Google Scholar] [CrossRef]
  21. Bengesi, S.; El-Sayed, H.; Sarker, M.K.; Houkpati, Y.; Irungu, J.; Oladunni, T. Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers. IEEE Access 2024, 12, 69812–69837. [Google Scholar] [CrossRef]
  22. Corvi, R.; Cozzolino, D.; Zingarini, G.; Poggi, G.; Nagano, K.; Verdoliva, L. On the detection of synthetic images generated by diffusion models. In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5. [Google Scholar]
  23. Vayadande, K.; Sawant, A.; Pawar, A.; Rajurkar, S.; Shendre, R.; Waghmode, J. Enhanced Detection of Deep Fakes: Exploring Advanced Approaches. In Proceedings of the 2024 International Conference on Sustainable Communication Networks and Application (ICSCNA), Theni, India, 11–13 December 2024; pp. 1613–1620. [Google Scholar]
  24. Cao, Y.; Chen, J.; Huang, L.; Huang, T.; Ye, F. Three-classification face manipulation detection using attention-based feature decomposition. Comput. Secur. 2023, 125, 103024. [Google Scholar] [CrossRef]
  25. Chauhan, R.; Popli, R.; Kansal, I. A comprehensive review on fake images/videos detection techniques. In Proceedings of the 2022 10th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 13–14 October 2022; pp. 1–6. [Google Scholar]
  26. Abady, L.; Cannas, E.D.; Bestagini, P.; Tondi, B.; Tubaro, S.; Barni, M. An overview on the generation and detection of synthetic and manipulated satellite images. APSIPA Trans. Signal Inf. Process. 2022, 11, 1–56. [Google Scholar] [CrossRef]
  27. Młynarczuk, M.; Habrat, M. Generating microscopic images of rocks using generative artificial intelligence (GenAI). Earth Sci. Inform. 2025, 18, 1–16. [Google Scholar] [CrossRef]
  28. Diab, M.; Herrera, J.; Chernow, B.; Mao, C. Stable Diffusion Prompt Book, 2022, OpenArt. Available online: https://openart.ai/blog (accessed on 21 July 2025).
  29. Sambridge, M.; Tkalčić, H.; Jackson, A. Benford’s law in the natural sciences. Geophys. Res. Lett. 2010, 37, L22301. [Google Scholar] [CrossRef]
Figure 1. An example of a real (a) and a generated (b, Seed: 1069438814, Steps: 28, Weight: 1.5) microscopic image of Dolomite.
Figure 2. Examples of real images from selected geological groups and their corresponding synthetic equivalents generated using a diffusion-based model [28].
Figure 3. Examples of the best and worst image generation: (a) best-generated image, misidentified by a geologist as real; (b) worst-generated image, clearly recognized by a geologist as synthetic.
Figure 4. Feature importance ranking scatter for the Kruskal-Wallis test and Random Forest classifier.
Figure 5. Example of frequency spectrum images with a fixed circular mask (white dashed circle, r = 80 px) for a real (a) and a generated (b) image, using a Dolomite sample.
Figure 6. Example of frequency spectrum images with a fixed circular mask (white dashed circle, r = 100, 399) for a real (a,c) and a generated (b,d) image, using a Dolomite sample.
Figure 7. Dependence of edge energy (a) and C/E (b) on radius of central mask.
Figure 8. The product of the C/E coefficient and the skewness of eccentricity of generated and real images.
Figure 9. Examples of box plots for selected texture-based features.
Figure 10. Examples of box plots for selected object-based features.
Figure 11. Confusion matrices for various classification models—all features included, with automatic NaN removal.
Figure 12. Metrics (y-axis, legend) for selected classification models (x-axis) using NaN remediation (via group-wise moving averages), sorted in descending order by F1-score.
Table 1. A grouped list of selected global feature categories, along with a description of the calculation method.
1. Global Entropy: Image-wide entropy (computed on an 8-bit image)—a measure of brightness disorder.
2. Histogram Entropy: Luminance histogram entropy—analysis of the distribution of grayscale levels.
3. Benford Deviation: Deviation from Benford’s law for the first digits of pixel values—a statistical anomaly indicator.
4. Colorfulness: Measure of color saturation and diversity—based on the color axis derived from opponent color channels.
5. Hue Histogram Peaks: Number of peaks in the hue histogram (H in HSV)—color complexity.
6. Mean Local Entropy: Mean of local entropies measured in 10 × 10 windows—local information variability.
7. Z-score Deviation: Local homogeneity, estimated via the standard deviation of Z-scores across 5 × 5 pixel neighborhoods.
8, 9. Noise Estimate: Noise estimation based on the mean standard deviation of pixel intensity differences in the vertical and horizontal directions, normalized to the global image contrast.
10. LBP Mean: Mean value of the Local Binary Pattern (LBP) histogram.
11. DFT CenterToEdgeRatio: Ratio of energy in the center of the DFT spectrum to peripheral energy.
12. DFT Entropy: Entropy of the frequency spectrum (DFT) as a measure of energy dispersion.
13. DFT Skewness: Skewness of the frequency spectrum as a measure of frequency asymmetry.
14. Wavelet ApproxVar: Variance of approximation coefficients (low-frequency) from the Daubechies-1 (db1) wavelet.
15–17. Wavelet Horiz., Vert., Diag.: Variance of horizontal, vertical, and diagonal detail coefficients.
18–21. Norm Wavelet Horiz., Vert., Diag.: Normalized versions of wavelet variances (relative to global variance).
22, 23. Laplacian Variance: Laplacian filter variability—edge and detail detection; normalized as Laplacian variance divided by the global image variance.
24. Histogram Peak Count: Number of local maxima in the histogram—image tone structure.
25. GLCM Contrast: Pixel co-occurrence indicators. Mean GLCM contrast from 4 directions—local gray-level variation.
26. GLCM Correlation: Mean GLCM correlation—relationship between neighboring pixels.
27. GLCM Energy: Mean of GLCM energy—texture uniformity.
28. GLCM Homogeneity: Mean of GLCM homogeneity—higher values indicate more homogeneous textures.
29–32. GDLCM: As above (contrast, correlation, energy, homogeneity), but computed on the gradient map (GDLCM), offering increased sensitivity to edges and fine details.
33–35. Tamura: Mean values of standard Tamura parameters: coarseness (image structure granularity), contrast (brightness complexity), and directionality (number of dominant gradient directions).
36. Norm Colorfulness: Colorfulness normalized to mean brightness.
37. Saturation Deviation: Standard deviation of the S component in HSV—saturation variability.
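As an illustration of how a feature of this kind can be computed, the following sketch approximates feature 3 (Benford Deviation). The paper does not specify the exact deviation metric, so the sum of absolute differences between observed and Benford-expected first-digit frequencies is assumed here; the input is a random placeholder for an 8-bit image.

```python
# Hedged sketch of the Benford Deviation feature: compare the first-digit
# distribution of pixel values against Benford's law (the exact metric in
# the paper is unspecified; L1 distance is assumed here).
import numpy as np

def benford_deviation(gray: np.ndarray) -> float:
    vals = gray.ravel()
    vals = vals[vals > 0]                # first digit undefined for zero
    # leading decimal digit of each pixel value
    first = (vals / 10 ** np.floor(np.log10(vals))).astype(int)
    observed = np.bincount(first, minlength=10)[1:10] / first.size
    expected = np.log10(1 + 1 / np.arange(1, 10))  # Benford probabilities
    return float(np.abs(observed - expected).sum())

rng = np.random.default_rng(3)
img = rng.integers(1, 256, size=(100, 100))        # placeholder 8-bit image
print(benford_deviation(img))
```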
Table 2. An overview of the applied classifiers.
Random Forest (Tree Bagger): A model consisting of 100 decision trees with Out-of-Bag prediction. It performs well on data with complex structures and enables analysis of feature importance.
Deep decision tree: A classical decision tree with the maximum number of splits equal to the number of training samples. It is characterized by high sensitivity but is prone to overfitting.
SVM: Support Vector Machine with a non-linear RBF kernel, automatic scaling, and a low BoxConstraint parameter (0.01), which increases the model’s sensitivity to subtle differences.
K-Nearest Neighbors (KNN): A model with a single neighbor (K = 1, i.e., NN) and a Euclidean distance metric. Effective when classes are well separated, but sensitive to noise.
Naive Bayes: A probabilistic model based on kernel density estimation, assuming feature independence.
Neural Network (MLP): A single-layer neural network with 50 neurons in the hidden layer, trained without regularization. It uses labels transformed into one-hot encoded vectors.
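The parameter names in Table 2 (Tree Bagger, BoxConstraint) suggest a MATLAB implementation. As a hedged translation, approximate scikit-learn equivalents of these configurations might look like the following; GaussianNB stands in for MATLAB's kernel-density Naive Bayes, so this is a rough correspondence, not the original setup.

```python
# Approximate scikit-learn equivalents of the classifiers in Table 2
# (a hedged translation of what appears to be a MATLAB configuration).
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

classifiers = {
    # 100 trees with out-of-bag scoring
    "RandomForest": RandomForestClassifier(n_estimators=100, oob_score=True,
                                           bootstrap=True, random_state=0),
    # deep tree: no depth limit, may split down to single samples
    "DeepTree": DecisionTreeClassifier(max_depth=None, random_state=0),
    # RBF kernel; C plays the role of MATLAB's BoxConstraint
    "SVM": SVC(kernel="rbf", C=0.01, gamma="scale"),
    # single nearest neighbor with Euclidean metric
    "KNN": KNeighborsClassifier(n_neighbors=1, metric="euclidean"),
    # Gaussian stand-in for a kernel-density Naive Bayes
    "NaiveBayes": GaussianNB(),
    # one hidden layer of 50 neurons, L2 regularization effectively off
    "MLP": MLPClassifier(hidden_layer_sizes=(50,), alpha=0.0,
                         max_iter=500, random_state=0),
}
for name, clf in classifiers.items():
    print(name, type(clf).__name__)
```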
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Habrat, M.; Dwornik, M. Identification of AI-Generated Rock Thin-Section Images by Feature Analysis Under Data Scarcity. Appl. Sci. 2025, 15, 8314. https://doi.org/10.3390/app15158314
