Semantic Segmentation of Some Rock-Forming Mineral Thin Sections Using Deep Learning Algorithms: A Case Study from the Nikeiba Area, South Eastern Desert, Egypt

Abstract: Image semantic segmentation using deep learning algorithms plays a vital role in identifying different rock-forming minerals. In this paper, we employ the U-net model, whose architecture supports precise localization and efficient use of limited data. We apply this deep learning model to two distinct datasets: (1) a dataset from the Alex Strekeisen website, and (2) a dataset from the Gabal Nikeiba area, South Eastern Desert of Egypt. The model performs well on both datasets, with average precision of 0.89 and 0.83, recall of 0.80 and 0.78, and F1 score of 0.82 and 0.79, respectively, which helps in identifying and detecting rock-forming minerals in thin-section images. The model’s best performance is on eleven different basement rock-forming minerals, with average precision up to 0.89, recall of 0.80, and F1 score of 0.82. This study is significant because it offers a route to identifying and detecting minerals in thin sections of rock samples from Egypt and the Arabian–Nubian Shield as a whole. By reducing analysis time and improving accuracy compared to manual methods, it can benefit geological research and resource exploration in the region.


Introduction
On a thin-section image, mineral identification and textural description require microscopic investigation, with interpretation and counting data obtained using a light microscope [1][2][3][4][5][6][7][8][9]. Petrography is a specialized discipline that requires extensive knowledge to identify and classify minerals and textural relationships within a rock [3]. Depending on the mineral composition of the rock, thin-section photos display color, grain size, shape, internal cleavage, structure, and other features [6,10]. These attributes offer valuable information on the petrographic properties of rocks. Some minerals in a rock thin section are difficult to interpret, even for an experienced petrologist, because they are small, dark, or opaque. To identify minerals, other techniques may be used in addition to light microscopy, such as scanning electron microscopy (SEM) and electron probe microanalysis (EPMA), or bulk measurements, such as X-ray diffraction (XRD) [3].
Point-counting or grid-based statistical workflows are used to deal with the massive amount of data present in a single slide and to accelerate the collection of representative quantitative data [3,6,11,12]. Although reliable and widely used, this approach skips over much of the information in each thin-section image, or even entire thin sections, in order to save time. The manual process still requires a significant amount of time, money, and effort. There are several approaches to categorizing information in an image using computer vision algorithms [3,13,14]. In classification learning, an algorithm that has seen a training set of images is used to predict the class of a new observation. Semantic segmentation, a subfield of computer vision, aims to assign a class label to each pixel in an image. Popular applications of semantic segmentation include autonomous vehicles [3,15] and satellite image classification [3,14,16,17]. These applications transform pixels into meaningful output labels (e.g., boat, plane, car, road, person, tree, sky, etc.) as a result of extensive effort. Instance segmentation goes a step further and seeks to group associated pixels into a region with a boundary, e.g., one quartz grain is distinct and separate from another. This study is not intended to address instance segmentation. However, a "watershed" algorithm can still be suggested as a post-processing step on an output image [3,18].
In the field of petrology, semantic segmentation is still in its infancy, while nontransferable networks and small classification schemes for sandstones examine distinct objects from the organic petrology discipline [18][19][20]. Mlynarczuk and Skiba [21] demonstrate how this field is developing. Segmentation cannot be performed without a pixel-by-pixel label set [12]. With the purpose of labeling 2-dimensional RGB images captured by digital cameras linked to light microscope equipment, Shell Research developed Computer Aided Petrology (CAP) in the early to mid-2000s. Through this method, a subject matter expert manually obtains an image.
Image segmentation plays a pivotal role in discriminating the mineral composition of rock units, particularly in the field of petrographic studies [22][23][24][25][26][27][28][29][30]. This process involves partitioning a digital image of a rock sample into multiple segments, each representing different minerals or textures. By applying machine learning algorithms, such as convolutional neural networks (CNNs), researchers can automate the identification and classification of these segments with high accuracy. These models are trained on datasets containing labeled images, enabling them to learn the unique features and patterns associated with various minerals [23][24][25][26]. Once trained, the machine learning models can analyze new rock images, segmenting and identifying mineral compositions quickly and efficiently. This approach not only enhances the precision of mineralogical analysis but also significantly reduces the time and effort required compared to traditional manual methods, thereby advancing the capabilities and scope of petrographic studies [23].
This study shows that the semantic segmentation of thin-section images from two datasets (dataset 1 with 10 images and dataset 2 with 5 images) can provide comprehensive scene understanding within a classification framework. Additionally, it can compute basic attributes much faster than an expert petrologist. Experts may also be able to gain some knowledge of all the thin sections while refining their interpretations of the primary, secondary, and accessory minerals, all of which are essential for refocusing their efforts on microscopy methods other than light microscopy.
Let us further assume that thin-section image analysis may yield additional quantitative geometric parameters (e.g., contact length, number of nearest neighbors, and preferred orientation). If so, these data can directly feed a greater range of assessment models for applications in geomechanics and rock physics. To achieve this, we train semantic segmentation models to extract classification information from 2D RGB images taken on thin sections of basement rocks using the deep neural encoder-decoder architecture of U-net. A carefully hand-labeled collection of training images is used to train this model. The paper is organized as follows: First, we discuss the training dataset, the semantic segmentation networks, and the general effectiveness of the trained model in predicting mineralogy from thin-section images. Next, we discuss the impact of the hyperparameters of the preferred model. Finally, we present a thorough statistical and petrological review of the preferred model, discuss limitations and opportunities, and present our conclusions.
One of the objectives of this study is to implement a novel deep learning image processing technique for the automatic identification of rock-forming mineral systems that help in creating detailed geological maps of the study area with high precision, as well as to investigate thin-section approaches that could help identify the minerals in the study area and their extension.

Geologic Setting and Petrography of the Study Area
The Nikeiba area belongs to the North Arabian-Nubian Shield (ANS) and is located in the South Eastern Desert of Egypt between latitudes 23°49′ N and 23°53′ N and longitudes 34°18′ E and 34°24′ E, with an area of ~416 km² (Figure 1). In this area, different rock types are exposed, including mafic metavolcanics, a metagabbro-diorite complex, granodiorites, syenogranites, alkali feldspar granites, and quartz syenites (Figure 2a-c). Many quartz veins (Figure 2d) and microgranite dykes dissect the above-mentioned units. These rock units are well discriminated using a Sentinel-2 decorrelation stretch image (b12, b8, and b3 as RGB) (Figure 1b). Mafic metavolcanics and the metagabbro-diorite show intrusive contacts with granitic rocks. On the other hand, syenogranites show gradational contact with alkali feldspar granites. Many faults, mainly NW-SE, N-S, and E-W, dissect the area [31]. The methods used in this paper are shown in Figure 3. We used two datasets (Figures 4 and 5), which are described in detail in the following sections.

Image Datasets
In general, there are only a few label sets available for petrology. In order to create these sets and to compare several methods for automatic mineral identification in thin-section images, we produced a dataset with pixel-level segmentation masks for images containing various minerals. These datasets are divided into subgroups representing various mineralogical compositions. The datasets were gathered from two distinct sources: (1) the first dataset comprises high-resolution photographs from Alex Strekeisen's website (https://www.alexstrekeisen.it); (2) the second dataset includes different granitic samples collected in the Nikeiba area, South Eastern Desert of Egypt. We prepared 24 thin sections from the specimens gathered in the Nikeiba area of Egypt's Eastern Desert for microscopic analysis at the Geology Department of Kafrelsheikh University using a Kemet Geoform thin sectioning machine (Kemet International Ltd, Parkwood Trading Estate, Maidstone, United Kingdom). These sections were polished after being cut into billets, impregnated with white epoxy resin, mounted on glass slides, and ground to a final thickness of 30 µm. The process involved the careful polishing of the surface. The materials used included white epoxy (Resin A&B) and silicon carbide powder in various grit sizes: 220, 400, 600, 800, 1000, and 1200.
The petrographic analysis of the second dataset was conducted using an Olympus TH4-200 standard polarizing microscope equipped with a UC-30 digital camera [33,34]. Images for labeling were taken using optical microscopes with 4× or 10× magnification, resulting in pixel resolutions ranging from 0.84 to 2.08 pixels/µm. The first dataset of Alex Strekeisen consists of 10 main minerals (Table 1). Our second dataset from the granites of the Nikeiba area comprises ten different types of rock-forming minerals (Table 1). It is essential to emphasize that although an experienced geologist can investigate many more minerals in a thin section, we only include 16 minerals in our work because gathering a large collection of thin sections with hand annotation of all the minerals used is resource-expensive.

Preparing Thin-Section Samples for Deep Learning Analysis
The workflow of the methodology employed in this study is illustrated in Figure 3. Preparing thin-section training samples for deep learning in petrography requires a meticulous and technical approach. This begins with capturing images under two distinct lighting conditions: cross-polarized light (XPL) and plane-polarized light (PPL) (Figures 4 and 5). These imaging techniques are essential in petrography as they highlight different aspects of the mineral components in the thin sections. XPL and PPL images provide complementary information about the rock samples' mineral composition, texture, and structure (Figures 4 and 5). Obtaining high-quality and representative images under both lighting conditions is vital for the subsequent steps in the data preparation process.
Once the XPL and PPL images of the thin sections are obtained, the next critical step is image registration. This process involves adjusting and aligning the XPL and PPL images to match each other perfectly. Image registration is crucial as it ensures that the features observed in the XPL and PPL images correspond to the same areas of the thin section. Accurate alignment is essential for effectively combining data from both imaging modalities. This alignment allows for a more comprehensive analysis of the mineral components, combining the information obtained from both polarized light conditions.
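The registration step can be illustrated with a minimal sketch. It assumes the misalignment between the two images is a pure translation and uses phase correlation on single-band (grayscale) versions of the images; the study does not state which registration method was used, and the function names `estimate_shift` and `register` are hypothetical.

```python
import numpy as np

def estimate_shift(reference, moving, eps=1e-12):
    """Return (dy, dx) such that rolling `moving` by (dy, dx) best matches `reference`.

    Assumption: the two 2D arrays differ only by a (circular) translation.
    """
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(moving))
    cross_power /= np.abs(cross_power) + eps       # keep the phase, drop the magnitude
    correlation = np.fft.ifft2(cross_power).real   # peak location encodes the shift
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks beyond the midpoint correspond to negative (wrapped-around) shifts.
    return tuple(int(p) - n if p > n // 2 else int(p)
                 for p, n in zip(peak, correlation.shape))

def register(reference, moving):
    """Shift `moving` so that it lines up with `reference`."""
    dy, dx = estimate_shift(reference, moving)
    return np.roll(moving, (dy, dx), axis=(0, 1))

# Synthetic check: a random image stands in for a grayscale PPL photo.
rng = np.random.default_rng(0)
ppl = rng.random((64, 64))
xpl = np.roll(ppl, (3, -5), axis=(0, 1))
print(estimate_shift(xpl, ppl))
```

Real XPL and PPL photographs differ in brightness and color as well as position, and may need rotation or scaling, so a production workflow would likely rely on a dedicated registration tool.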
After aligning the XPL and PPL images, they are stacked to form a single composite image. This composite image consists of six bands: three from the XPL image and three from the PPL image. Arranging the images in this way forms a multi-dimensional dataset that captures the optical information from both types of polarized light. The next crucial step involves segmenting this stacked image (Figures 4 and 5). This segmentation process isolates different regions within the image, facilitating the subsequent labeling process. By merging different segments under their respective labels, the process effectively categorizes various mineral components and other features in the thin sections. This step is fundamental in preparing the data for more detailed analysis and classification.
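As a minimal sketch of the six-band composite, assuming both images are already registered and stored as height × width × 3 arrays (the function name `stack_xpl_ppl` and the band order are illustrative choices, not taken from the study):

```python
import numpy as np

def stack_xpl_ppl(xpl_rgb, ppl_rgb):
    """Concatenate two registered RGB images into one six-band array.

    Bands 0-2 hold the XPL image and bands 3-5 the PPL image (assumed order).
    """
    if xpl_rgb.shape != ppl_rgb.shape or xpl_rgb.shape[-1] != 3:
        raise ValueError("expected two registered H x W x 3 images of equal size")
    return np.concatenate([xpl_rgb, ppl_rgb], axis=-1)

# Placeholder arrays with the 775 x 518 pixel image size reported in the paper.
xpl = np.zeros((518, 775, 3), dtype=np.uint8)
ppl = np.ones((518, 775, 3), dtype=np.uint8)
composite = stack_xpl_ppl(xpl, ppl)
print(composite.shape)
```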
The final step in preparing the training samples involves two key activities: building a vector layer for labeling and tiling the stacked images (Figures 4 and 5). The vector layer is meticulously constructed to label different mineral components within the sections. This layer includes major minerals and, importantly, minor labels for secondary components (Figures 4 and 5). This detailed labeling is vital for training deep learning algorithms to recognize and classify different minerals accurately. The last step is converting the stacked images and their corresponding vector layers into fixed-size square tiles. The tiles and their associated masks are input into a deep learning model such as the U-net algorithm, which is commonly used for image segmentation. Dividing the images into tiles and applying masks is essential for effectively training deep learning models, because it enables them to recognize and learn mineralogical characteristics in thin-section samples accurately.
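The tiling step can be sketched as follows. The 128-pixel tile size matches the chip size reported in the experiment setup; the non-overlapping stride and the choice to skip partial tiles at the image edges are assumptions made for this illustration, and `tile_pairs` is a hypothetical helper.

```python
import numpy as np

def tile_pairs(image, mask, tile=128):
    """Yield (image_tile, mask_tile) pairs of size tile x tile.

    Assumption: tiles do not overlap and partial tiles at the right and
    bottom edges are discarded.
    """
    height, width = mask.shape
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            yield (image[top:top + tile, left:left + tile],
                   mask[top:top + tile, left:left + tile])

# A six-band composite and its label mask at the reported 775 x 518 size.
composite = np.zeros((518, 775, 6), dtype=np.uint8)
labels = np.zeros((518, 775), dtype=np.uint8)
tiles = list(tile_pairs(composite, labels))
print(len(tiles), tiles[0][0].shape, tiles[0][1].shape)
```

With these settings, each 775 × 518 image yields a 6 × 4 grid of tiles, which is one way tiling increases the number of training samples without downsampling.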

Train, Validation, and Testing Datasets
In the labeled dataset, the images were randomly assigned to three subgroups: 960 images for training, 194 images for validation, and 234 images for testing. Every RGB image and its label are sized at 775 × 518 pixels. In this study, the model was initially fit using a training dataset with an optimized set of hyperparameters (Table 1) as the learning method [34]. Predicting the responses to the observations in the validation dataset was then accomplished using the fitted model. In the next phase, we tuned hyperparameters (Figures 4 and 5) based on model performance against the validation dataset.
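A random split with the subgroup sizes reported above could look like the following sketch. The fixed seed and the function name `split_dataset` are assumptions added for reproducibility of the illustration.

```python
import random

def split_dataset(items, n_train=960, n_val=194, n_test=234, seed=0):
    """Randomly partition items into training, validation, and test subsets."""
    if len(items) != n_train + n_val + n_test:
        raise ValueError("subset sizes must add up to the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Image identifiers stand in for the 1388 labeled images.
train, val, test = split_dataset(list(range(1388)))
print(len(train), len(val), len(test))
```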
To perform semantic segmentation of thin-section images, we used a convolutional neural network based on a popular U-Net architecture (Figure 6) with batch normalization layers [35,36]. Adding residual connections inside convolution blocks increases learning speed and overcomes the vanishing gradient problem [37]. Using typical patch-based methodologies to train a neural network with uneven input will produce poor results. To address the issue of data imbalance, we applied a modified version of the specific data balancing strategy suggested by Kochkarev et al. [38]. Specifically, during the generation step, the patch probability maps that were based on the distance to the nearest class were replaced with the area of a certain class in the patch. Optimal hyperparameters for models based on Resnet-34 and the U-net network are listed in the right column.
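One possible reading of the area-based balancing idea is sketched below: a target class is drawn uniformly, and a patch is then drawn with probability proportional to the area (pixel count) of that class in the patch. This is an illustrative interpretation under stated assumptions, not the exact procedure of Kochkarev et al. [38] or of this study; both function names are hypothetical.

```python
import random

def class_area_weights(patch_masks, target_class):
    """Weight of each patch = number of pixels labeled with target_class."""
    return [sum(row.count(target_class) for row in mask) for mask in patch_masks]

def sample_balanced_patches(patch_masks, classes, n_samples, seed=0):
    """Draw patch indices so that rare classes are seen as often as common ones.

    Draws whose target class is absent from every patch are skipped, so the
    result can be shorter than n_samples.
    """
    rng = random.Random(seed)
    chosen = []
    for _ in range(n_samples):
        target = rng.choice(classes)                  # each class equally likely
        weights = class_area_weights(patch_masks, target)
        if sum(weights) == 0:
            continue
        index = rng.choices(range(len(patch_masks)), weights=weights, k=1)[0]
        chosen.append(index)
    return chosen

# Three tiny 2 x 2 label masks; class 1 covers 0, 1, and 4 pixels respectively.
masks = [[[0, 0], [0, 0]], [[0, 1], [0, 0]], [[1, 1], [1, 1]]]
print(class_area_weights(masks, 1))
```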

CNN Architectures and Models
We applied semantic segmentation convolutional neural networks (CNNs) to classify each pixel in a thin-section image. This produces an image segmented according to the petrological class defined by the labeling schemes (16 classes). There are multiple types of networks for semantic segmentation, including ResNet34 and U-Net. All of these networks are effective and trainable, and they make use of the limited GPU resources available in the MATLAB libraries. Briefly, ResNet34 and U-Net include different network layers and have convolutional kernels of size 3 × 3, max pooling size 2 × 2, and a stride of two (Figure 6). ResNet34 utilizes exactly the same 3 × 3 convolutional kernels, located at the beginning and end of the network. It makes connections between each of the two convolutional layers.
Both base architectures include a softmax (probability) layer to get the highest probability pixel outcome from the classification procedure (Figure 6).

Exploring the U-Net Architecture: An Overview of its Structure for Efficient Image Segmentation
U-Net is a convolutional neural network architecture developed primarily for biomedical image segmentation. This architecture is notable for its efficiency and accuracy in segmenting images, even with limited data. The structure of U-Net can be described in four main sections:

1. Downsampling (Contracting Path): The first part of U-Net is the contracting path, which is a typical convolutional network. This path consists of the repeated application of two 3 × 3 convolutions (unpadded convolutions), each followed by a rectified linear unit (ReLU), and a 2 × 2 max pooling operation with stride 2 for downsampling. At each downsampling step, the number of feature channels is doubled. This part of the network captures the context in the input image, which is essential for accurate segmentation.

2. Bottleneck: After several layers of downsampling, the U-Net reaches its bottleneck. This is the lowest level of the network, where the feature maps have the smallest spatial dimension. In the bottleneck, two 3 × 3 convolutions are applied, each followed by a ReLU. This section is crucial as it allows the network to process features at the lowest resolution, capturing the most abstract representations of the input data.

3. Upsampling (Expanding Path): Following the bottleneck, the network transitions into the expansive path, which includes a sequence of upsampling and convolution operations. The upsampling of the feature map is followed by a 2 × 2 convolution ("up-convolution") that halves the number of feature channels. Then, a concatenation is performed with the correspondingly cropped feature map from the contracting path. This step is crucial for the network to learn precise localization, a critical aspect of accurate segmentation.

4. Final Layer: The final layer of the network is a 1 × 1 convolution that maps each feature vector to the desired number of classes in the output segmentation map. This layer produces the final segmentation map, where each pixel in the image is classified into a specific class. The architecture's use of expansive paths and concatenation with high-resolution features from the contracting path allows for precise localization and detailed segmentation.
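The four sections above can be made concrete with a small piece of arithmetic that traces the spatial size and channel count through an unpadded U-Net. This is a sketch of the generic architecture as described, not of the trained model in this study: the depth of 4, the 64 base channels, and the 572-pixel input are illustrative assumptions, and `unet_feature_sizes` is a hypothetical helper.

```python
def unet_feature_sizes(input_size, depth=4, base_channels=64):
    """Trace (stage, spatial size, channels) through an unpadded U-Net.

    Each pair of unpadded 3 x 3 convolutions removes 4 pixels per side length,
    2 x 2 max pooling halves the size, and each up-convolution doubles it.
    """
    size, channels = input_size, base_channels
    trace = []
    for _ in range(depth):                       # contracting path
        size -= 4                                # two unpadded 3 x 3 convolutions
        if size <= 0 or size % 2:
            raise ValueError(f"input size {input_size} does not fit depth {depth}")
        trace.append(("down", size, channels))
        size //= 2                               # 2 x 2 max pooling, stride 2
        channels *= 2                            # channels double at each step
    size -= 4                                    # bottleneck convolutions
    if size <= 0:
        raise ValueError(f"input size {input_size} does not fit depth {depth}")
    trace.append(("bottleneck", size, channels))
    for _ in range(depth):                       # expansive path
        size = size * 2 - 4                      # up-convolution, then two 3 x 3 convs
        channels //= 2                           # up-convolution halves the channels
        trace.append(("up", size, channels))
    return trace

for stage, size, channels in unet_feature_sizes(572):
    print(f"{stage:<10} {size:>4} x {size:<4} {channels:>5} channels")
```

Because unpadded convolutions shrink the feature maps, the skip-connection maps from the contracting path must be cropped before concatenation, which is the cropping mentioned in section 3. Implementations that pad their convolutions keep the output the same size as the input tile and avoid this cropping.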

Experiments Setup
In this paper, we conducted a series of experiments focusing on a subset of data, including image training datasets for three different types of mineral thin sections. These initial experiments were crucial for fine-tuning the various parameters we used in our study. To carry out these experiments, we employed the U-net deep learning model, which is renowned for its efficiency and accuracy in image segmentation tasks. The model was configured with specific parameters: a maximum of 20 epochs, a batch size of 8, and a learning rate tailored for optimal performance. Additional model arguments included settings for class balancing, mix-up, focal loss, and ignore classes, with a chip size set at 128 and the monitoring of the validation loss.
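Of the model arguments listed above, focal loss is the one that most directly targets class imbalance. The sketch below shows the standard per-pixel focal loss, which scales the cross-entropy term by (1 - p_t)^γ so that confidently classified pixels contribute less. The focusing parameter γ = 2 is an assumed value for illustration; the paper does not report the setting used.

```python
import math

def focal_loss(probabilities, target_index, gamma=2.0, eps=1e-12):
    """Focal loss for one pixel given its predicted class probabilities.

    With gamma = 0 this reduces to the ordinary cross-entropy loss.
    """
    p_t = max(probabilities[target_index], eps)   # probability of the true class
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

# A confident and a less confident prediction for the true class (index 0).
easy = focal_loss([0.9, 0.1], 0)
hard = focal_loss([0.6, 0.4], 0)
print(easy < hard)
```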
Further technical details of our experimental setup include the choice of RESNET34 as the backbone model, which is highly regarded for its performance in image classification tasks. We also utilized a pre-trained model to leverage previously learned patterns and features, enhancing the model's effectiveness. The validation data comprised 10% of the total dataset, ensuring a robust evaluation of the model's performance. An essential feature of our model training was the "STOP_TRAINING" functionality, which automatically halted training when no further improvements were observed, thereby optimizing the training process and preventing overfitting.
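The behavior described for "STOP_TRAINING" resembles a generic early-stopping rule on the monitored validation loss. The sketch below illustrates that rule only; it is not the ArcGIS Pro implementation, and the patience of 3 epochs and the loss values are hypothetical.

```python
def train_with_early_stopping(val_losses, max_epochs=20, patience=3):
    """Return (epochs_run, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs, or when max_epochs is reached.
    """
    best_loss = float("inf")
    best_epoch = 0
    stale = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch
    return min(len(val_losses), max_epochs), best_epoch

# Hypothetical validation losses: the best value occurs at epoch 3.
print(train_with_early_stopping([1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.6]))
```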
The entire configuration and adjustment of these specifications were carried out within the ArcGIS Pro version 3.0 environment, utilizing its deep learning toolbox. This environment provided a stable and efficient platform for manipulating the deep learning parameters and models, ensuring precise and reliable outcomes in our experiments. The integration of these advanced tools and settings in the ArcGIS Pro version 3.0 environment was crucial for achieving the high level of accuracy and efficiency required in our study of the mineral thin sections.
In our recent experiments, we focused on training data analysis using a balanced dataset, where 50% was allocated for validation and the other 50% for training [39]. This methodology was applied across 50 epochs to ensure comprehensive learning and evaluation. The results of these experiments are presented in Figures 7-9.

Comparing the Two Datasets
A detailed overview of the model's performance across various classes, with a focus on the precision, recall, and F1 score metrics, for the two datasets is shown in Table 2.
• Quartz: The first dataset shows strong precision (0.81) and recall (0.85), leading to a balanced F1 score of 0.83, whereas the second dataset of the Gabal Nikeiba area displays excellent precision (0.91) and recall (0.79) (Table 2), leading to a very high F1 score. This indicates a reliable performance in correctly identifying quartz and effectively reducing false negatives (Figures 8 and 9).

• Plagioclase and Biotite: These minerals demonstrate high precision in the two datasets, ranging from 0.77 to 0.90, and recall from 0.77 to 0.98 (Table 2), suggesting the model's strong capability in accurately identifying these minerals and consistently detecting their instances (Figures 8 and 9).
• K-feldspar: In the Nikeiba area, it has strong precision (0.91) and recall (0.94), resulting in a very high F1 score (0.91) (Table 2). This result indicates that the model is very accurate in the detection of K-feldspar (Figure 9).

• Riebeckite and Arfvedsonite: These minerals belong to the alkali amphibole group and are present in the three different types of the Gabal Nikeiba granites (syenogranite, alkali feldspar granites, and quartz syenites) (Figure 5). They show a very high precision range from 0.84 to 0.95 and a recall range from 0.77 to 0.90, leading to strong F1 scores of 0.80 and 0.92 (Table 2). This result demonstrates the accuracy of the model in identifying and detecting these minerals (Figure 9).

• Muscovite: It shows moderate precision (0.79) and moderate recall (0.62), resulting in a moderate F1 score of 0.69 in the Nikeiba area, which implies moderate performance of the model in both identifying and detecting muscovite (Figure 9d).

• Chlorite, Olivine, and Serpentine: These minerals occur only in the first dataset, and they exhibit excellent precision ranging from 0.87 to 0.89 and recall from 0.80 to 0.98, resulting in high F1 scores of 0.84, 0.91, and 0.93, respectively (Table 2), indicating that the model shows exceptional performance in both accurately identifying and consistently detecting chlorite, olivine, and serpentine (Figure 8d-h).

• Titanite and Talc: These classes belong to the first dataset, and they have high precision (0.78 and 0.94, respectively; Table 2) but moderate recall (0.77 and 0.76, respectively; Table 2), indicating the model's effectiveness in correctly identifying them, though with some missed instances (Figure 8g-i).

• Tourmaline: It also occurs in the first dataset and has perfect precision (1.00), but its recall is significantly lower at 0.62, leading to an F1 score of 0.77 (Table 2). This suggests that the model is accurate whenever it labels a pixel as tourmaline, but it misses a considerable number of instances (Figure 8j).

• Zircon: In the Nikeiba area, it is a prevalent accessory mineral (Figures 4e and 8e). It has high precision (0.76) and strong recall (0.89), leading to a strong F1 score of 0.82 (Table 2). This indicates that the model accurately identifies and consistently detects zircon (Figure 9e).

• Apatite: It is an accessory mineral in all the granitic phases of the Nikeiba area (Figure 4a). It has excellent precision (0.92) and low recall (0.51), leading to an F1 score of 0.66 (Table 2). This implies that the model is accurate when it labels apatite but misses many instances (Figure 9a).

• Background: Notably, the model achieves moderate to high precision (0.68 to 0.92) in identifying the background in the two datasets but has low recall (0.26 to 0.46), leading to a lower F1 score range of 0.40 to 0.55 (Table 2). This implies that pixels the model labels as background are usually correct, but the model fails to detect many background pixels.
The overall average scores across all the classes of the first dataset (precision = 0.89, recall = 0.80, and F1 = 0.82) are slightly higher than those of the second dataset of the Nikeiba area (precision = 0.83, recall = 0.78, and F1 = 0.79) (Table 2). This reflects an effective model with solid capabilities in accurately identifying and consistently detecting various classes, though with some variation across different classes in both datasets. The lower recall for background suggests an area for potential improvement, highlighting the need for enhanced detection in this aspect.
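For reference, the three metrics used throughout this comparison follow the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is the harmonic mean of the two. The short sketch below recomputes F1 from the precision and recall values reported above for tourmaline, apatite, and zircon; the pixel counts in the first call are made up for illustration, and both function names are hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0

def precision_recall_f1(tp, fp, fn):
    """Per-class metrics from true-positive, false-positive, and false-negative pixel counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)

# Made-up pixel counts for one class.
print(precision_recall_f1(tp=8, fp=2, fn=2))

# Reported precision and recall pairs from the text above.
for mineral, p, r in [("tourmaline", 1.00, 0.62), ("apatite", 0.92, 0.51), ("zircon", 0.76, 0.89)]:
    print(mineral, round(f1_from(p, r), 2))
```

The high-precision, low-recall pattern seen for tourmaline, apatite, and background corresponds to few false positives but many false negatives for those classes.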
Finally, we have successfully developed a model capable of detecting more distinct classes of petrographic features, namely 11 classes, significantly surpassing the previous methodologies [23][24][25][26][27][40]. This improvement is primarily attributed to the utilization of the ResNet-34 architecture, which strikes a good balance between depth and computational efficiency. Unlike the more cumbersome ResNet-134, ResNet-34 offers a streamlined structure that enhances both processing speed and model performance. This architectural choice has enabled us to achieve a more refined and accurate classification of the petrographic samples, facilitating more comprehensive analysis and interpretation in geological studies.
Furthermore, our method leverages the full resolution of the petrographic images by employing a tiling strategy. This approach involves dividing high-resolution images into smaller, manageable tiles, thereby increasing the effective size of the training dataset without compromising the detail and quality of the data. In contrast, the previous methods relied on downsampling to create lower-resolution images, which often resulted in the loss of critical textural and structural information. By maintaining full resolution, our model is better equipped to capture the intricate features necessary for accurate classification, leading to a more precise and reliable petrographic analysis.
A comparative analysis of mineral percentages identified in thin sections using an automated mineralogy method versus traditional, manual point counting ("ground truth") is illustrated in Table 3. Each row represents a separate thin section analyzed, with its predicted mineralogy based on the automated method and the actual mineralogy determined by point counting. The percentage of each mineral identified is listed, allowing for a direct comparison of the accuracy of the automated method. The "Difference" column highlights the discrepancies between the automated and manual methods for each mineral and thin section. Positive values indicate an overestimation by the automated method, while negative values represent an underestimation. This column is crucial for assessing the reliability and potential biases of the automated approach. For instance, the thin-section image (Figure 8c; Table 3) shows a substantial difference in the biotite percentage, suggesting the automated method may struggle to differentiate biotite from the other minerals in certain contexts. This is attributed to human error in labeling the biotite crystals, which have variable textures (subhedral plates and flakes) and colors (pale greenish-brown, pale green, brown, and dark red-brown). Additionally, some biotite crystals may be affected by hydrothermal alteration and converted to several minerals such as chlorite, muscovite, and clay minerals [41]. Overall, this table provides a valuable snapshot of the performance of the automated mineralogy technique, paving the way for further investigation into its strengths, limitations, and potential refinements.
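The comparison described above can be sketched as follows: mineral percentages are computed from a predicted label mask and subtracted from point-counted percentages, so that positive differences indicate overestimation by the automated method. The tiny mask, the percentages, and the decision to exclude background pixels are all illustrative assumptions, not values from Table 3.

```python
from collections import Counter

def modal_percentages(mask, ignore=("background",)):
    """Area percentage of each mineral class in a label mask (rows of class names)."""
    counts = Counter(label for row in mask for label in row if label not in ignore)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}

def differences(predicted, point_counted):
    """Predicted minus point-counted percentage for every mineral in either set."""
    minerals = sorted(set(predicted) | set(point_counted))
    return {m: predicted.get(m, 0.0) - point_counted.get(m, 0.0) for m in minerals}

# Hypothetical 2 x 2 predicted mask and hypothetical point-count results.
mask = [["quartz", "quartz"], ["biotite", "background"]]
print(modal_percentages(mask))
print(differences({"quartz": 60.0, "biotite": 40.0}, {"quartz": 50.0, "biotite": 50.0}))
```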

Conclusions
In this study, we applied semantic segmentation deep learning techniques to petrological imaging, focusing on automatic mineral detection in rock thin-section images. Our approach is based on the U-Net framework with ResNet34 as a backbone. We evaluated the performance of the U-Net deep learning algorithm using two distinct datasets. The first dataset from Alex Strekeisen encompasses a diverse range of classes with high average scores, demonstrating robust model capabilities. The second dataset, specific to the Nikeiba area, serves as a comparative benchmark to assess the model's generalizability. By comparing the results from these datasets, we aim to identify the areas of strength and potential improvement in the model's segmentation performance. The overall average scores across all the classes of the first dataset (precision = 0.89, recall = 0.80, and F1 = 0.82) are slightly higher than those of the second dataset of the Nikeiba area, which has a precision of 0.83, a recall of 0.78, and an F1 score of 0.79. These metrics indicate that the model performs effectively in accurately identifying and consistently detecting various classes, although there is some variation in performance across different classes in both datasets. The differences in these scores underscore the model's robustness and capability, yet they also reveal areas where further refinement might be beneficial. One noteworthy area for potential improvement is the recall for the background class, which is lower than for other classes. This suggests that the model may miss instances of the background class, indicating a need for enhanced detection strategies in this specific aspect. By addressing this shortcoming, the overall performance of the model can be further optimized, ensuring more comprehensive and reliable detection across all the classes. Moreover, the findings from this study can be applied to enhance thin-section mineral identification across the Arabian-Nubian Shield, demonstrating the broader applicability of modern deep learning models in petrographic studies.

Figure 2. Field photographs of the Gabal Nikeiba granites. (a,b) A close-up view of syenogranites and alkali feldspar granites, respectively. (c) A panorama view of quartz syenites. (d) Quartz vein at the periphery of alkali feldspar granites.

Figure 3. Workflow diagram for the proposed method.

Figure 4. The datasets used from Alex Strekeisen (https://www.alexstrekeisen.it). (a) The cross-polarized light (XPL), plane-polarized light (PPL), segmentation results of the stacked images of (XPL) and (PPL), and vector images, respectively, of biotite, quartz, and plagioclase crystals in granites. Vector image as a result of converting a segmented raster image to a polygon in ArcGIS Pro version 3.0. Every pixel in an image is tagged with a semantic segmentation class. (b) Biotite and quartz crystals in granites. (c) Biotite, quartz, and plagioclase in granites. (d,e) Chlorite, quartz, and plagioclase in granites. (f) Olivine and plagioclase crystals in troctolite. The fractures are due to the increase in volume caused by serpentinization. (g,h) Talc veins in a serpentinite. (i) Wedge-shaped titanite crystals in the interstitial space of quartz are associated with biotite in granites. (j) Tourmaline crystals in the interstitial space of large quartz crystals. XPL and PPL images, 2× (field of view = 7 mm).

Figure 5. The datasets used from the Gabal Nikeiba area. (a) The cross-polarized light (XPL), plane-polarized light (PPL), segmentation results of the stacked images of (XPL) and (PPL), and vector images, respectively, of biotite, quartz, and plagioclase crystals in granites. Vector image as a result of converting a segmented raster image to a polygon in ArcGIS Pro version 3.0. Every pixel in an image is tagged with a semantic segmentation class. (b) Biotite and quartz crystals in granites. (c) Biotite, quartz, and plagioclase in granites. (d,e) Chlorite, quartz, and plagioclase in granites.


Figure 6.The whole network architecture of the deep learning U-net method.
Figure 7 offers a detailed look at the training and validation loss graph. This visual representation provides insights into the model's performance over the course of the training process, highlighting the loss metrics during each epoch for both the training and validation phases.

Figure 7. Training and validation loss graph of the dataset.

Figure 8. (a-j) The results of the ground truth and prediction masked images of the first dataset.

Figure 9. (a-e) The results of the ground truth and prediction masked images of the second dataset of the Gabal Nikeiba area, South Eastern Desert, Egypt.

Table 1. Hyperparameters tested for all base networks.

Table 2. Evaluation matrix results for each class and the total average in the two datasets used.

Table 3. Evaluation of the input thin section and predicted images in the two datasets.