3D Carbonate Digital Rock Reconstruction by Self-Attention Network and GAN Structure

Abstract: Simulating geological and mineralogical samples in the digital domain has attracted considerable attention and has become central to modern earth science and petrological research. This manuscript explores the use of state-of-the-art generative models for reconstructing digital rock core samples. Central to the investigation is the incorporation of the self-attention mechanism, a novel step in the domain of digital rock core studies. With this model, we aim to produce samples that reflect the geological and mineralogical attributes of real rock specimens. The generative architecture, strengthened by the self-attention mechanism, reproduced key rock features, ranging from porosity and granular texture to contiguous core sequences. The idiosyncrasies of carbonate rocks, such as dissolution features, were also captured. Empirical evaluations based on statistical analyses showed that the model generates outputs that closely resemble genuine samples. These results broaden the potential applications of the proposed model in geoscience and mark a step forward for digital rock physics by combining computational models with geological insight.


Introduction
Energy exploration and extraction have progressively shifted towards complex reservoirs, including carbonates and tight sandstones, which are characterized by intricate and non-homogeneous pore structures. These complex formations are pivotal in addressing escalating global energy demands, prompting advancements in rock physics and related empirical and theoretical models.
The study of the interplay between microstructural geometries and macroscopic percolation properties of rocks has historically relied on experimental measurements and numerical simulations [1][2][3][4]. Traditional digital rock reconstruction methods fall into two groups: physical experiment methods such as CT and SEM, and numerical reconstruction methods that leverage statistical information [5][6][7]. While physical methods offer precision, they are often hampered by high costs and specialized equipment needs. Conversely, numerical methods, though cost-effective, sometimes compromise on the connectivity and complexity of the reconstructed rocks [8,9].
Deep learning, and particularly the rise of GANs, has significantly eased these challenges, enabling the efficient generation of detailed and complex 3D digital rock models [10][11][12][13]. These methods not only enhance reconstruction quality but also open avenues for exploring latent spaces and generating more continuous and controllable images [14,15].

Discussion and Novelty
Research in digital rock reconstruction has focused extensively on sandstone reservoirs. This has led to the development of numerous methodologies and models that are specifically tuned to the attributes of sandstones. However, this emphasis has inadvertently created a knowledge gap in understanding other complex rock formations, notably carbonate rocks.
Sandstones are typically granular, comprising sand grains bonded by various cementing materials. Their porosity and permeability are largely influenced by factors such as grain size, shape, arrangement, and the type of cement. In contrast, carbonate rocks present a more complex scenario. They exhibit a broad spectrum of pore sizes and configurations, primarily due to the dissolution of the primary rock matrix. This dissolution results in the formation of vugs, fractures, and complex pore networks, contributing to the carbonate rocks' secondary porosity. Notably, the heterogeneity in pore distribution within carbonate rocks is more pronounced and variable than in sandstones.
In response to this research gap, our investigation aims to:
(1) Refocus on Carbonate Rocks: Pivot the research lens to comprehensively address carbonate formations.
(2) Adapt Methodologies: Customize the reconstruction paradigms to reflect the multifaceted nature of carbonate structures.
(3) Incorporate the Self-attention Mechanism: This mechanism sharpens the reconstruction process by homing in on subtle patterns and features characteristic of carbonate rocks, thus improving the model's precision and representational accuracy.
Our approach marks a departure from traditional methodologies, examining the intricacies of carbonate rock structures through the self-attention mechanism. This strategy not only addresses the existing research gap but also brings to light the complex geological aspects of carbonate formations, thereby advancing the field of digital rock reconstruction.
By adopting this refined approach, our objective is to deepen the understanding of carbonate rocks and provide a more comprehensive perspective on digital rock modeling. This endeavor reflects the diverse range of rock types present on Earth and our aim of enhancing the scope and accuracy of digital rock physics.

Sequential Execution of Digital Rock Reconstruction
Digital rock reconstruction provides a nuanced understanding of intricate rock structures and features. With the incorporation of Generative Adversarial Networks (GANs), we can emulate these structures with high precision. The process is delineated step by step in Figure 1.
(1) Training Phase: The upper section of Figure 1 delineates the training phase, which is pivotal to our approach. The training phase of our Self-Attention Enhanced Generative Adversarial Network (SAE-GAN) specifically addresses the complex characteristics of carbonate rocks. By incorporating both traditional and fully convolutional self-attention mechanisms, the SAE-GAN is equipped to process the intricate structures of these rocks. This dual approach allows the model to capture the heterogeneity and variability in pore distribution, which are critical aspects of carbonate rocks. These mechanisms and their specific roles in enhancing the model's capabilities are discussed in detail in Section 2.3. This integration underpins the model's ability to generate digital rock samples that reflect the nuanced geological features of carbonate formations.
(2) Testing Phase: Once training is complete, the model transitions to the testing phase, depicted at the bottom of Figure 1.
Input Noise: For the generation of new digital rock samples, noise is introduced. This random stimulus ensures variability and uniqueness in each generated rock sample.
SAE-GAN-Generated Output: The previously trained SAE-GAN uses the input noise to generate digital rock samples. The red arrows in Figure 1 mark the transition of the model from active training to a fully trained state in which it is deployed without further training or modification, and they trace the flow from the input noise to the generated digital rock samples. Specifically, the final red arrow represents the generation process, where the trained model, upon receiving input noise, produces digital rock samples. The output generated during this phase mirrors the characteristics learned during the training phase, yielding realistic rock structures. It demonstrates the practical application of the SAE-GAN in generating a multitude of digital rock samples.

Role of the Discriminator in Model Training
Understanding the role of the discriminator in a Generative Adversarial Network (GAN) setup is essential to digital rock reconstruction. This section describes this component in detail.
As illustrated in Figure 2, the GAN architecture hinges on two primary modules: the generator and the discriminator. While the generator's function is to produce synthetic rock images, the discriminator's responsibility lies in distinguishing these generated structures from real rock samples. This interplay forms the foundation of our methodology.

Functioning of the Discriminator
Input Assimilation: The discriminator concurrently receives both real rock images and the synthetic outputs generated by the model.
Differentiation: Tasked with distinguishing between the real and the synthesized, the discriminator evaluates the authenticity of each image. This assessment becomes the primary feedback mechanism for the model, guiding the generator's subsequent iterations.
Feedback Mechanism: Based on the evaluation, the discriminator relays its findings to the generator. If the generated images deviate from real samples, the feedback steers the generator towards areas of improvement.
Optimization Goal: The continuous feedback loop aims to refine the generator's output with each iteration. The ultimate objective is to reach a point where the discriminator, despite its adeptness, struggles to differentiate between genuine and generated rock structures. This indicates that the synthesized images have attained a level of realism comparable to actual rock structures.
In essence, the discriminator serves as the model's quality control mechanism. By constantly challenging the generator's outputs, it pushes the end results towards greater realism and fidelity. This internal rivalry within the GAN setup accelerates the learning process and supports the production of high-quality digital rock reconstructions.

Model Architecture
In our approach to digital rock reconstruction, the model architecture is designed for precision and efficiency. The process is organized as a series of components, each specialized for its task.
The generator, illustrated in Figure 3, is the central component of the model and is tasked with synthesizing the digital rock images. Starting from a noise input, the generator consists of a series of convolutional layers, complemented by Batch Normalization (BatchNorm3d) and activation functions such as ReLU. Two core components of the generator are the DoubleConvBlock and the ConvAttentionBlock. A noteworthy inclusion in our model is the fully convolutional self-attention network. By executing the self-attention mechanism across the entire input image, the model can capture long-range dependencies between different positions within the image. This gives the model a better grasp of the image's content and contextual information, which improves the quality of the generated images. Through the self-attention mechanism, the model learns which areas of the image merit greater focus and weighting. This selective attention reduces ambiguous or implausible details and improves the accuracy and realism of the produced images. The generator's final output passes through a tanh activation so that the generated image values are normalized.
The discriminator, presented in Figure 4, evaluates the genuineness of the generated rock images. It is designed to differentiate between real and synthesized images. With its convolutional layers, LeakyReLU activations, and Batch Normalization (BatchNorm3d), it classifies images as 'Real' or 'Fake', guiding the generator during training.
The DoubleConvBlock module, visualized in Figure 5, involves two sequential convolution operations. With Group Normalization (GroupNorm3d) and ReLU activation sandwiched between them, this block aids in extracting more nuanced features from the input while retaining spatial hierarchies, supporting better texture and detail in the generated rock structures.
The ConvAttentionBlock, depicted in Figure 6, integrates an attention mechanism into the convolutional process. This block assigns distinct attention weights to different regions of the input, enhancing the model's ability to focus on the essential features in the data. This attention-centric approach helps the generated rock samples retain the intricate details of real-world structures.
The traditional self-attention mechanism, also known as SA, is used to process sequence data. Its basic principle is to derive three representations from the vector at each position in the input sequence: query, key, and value. Then, by calculating the correlation score between each position and the others in the sequence, a weight vector is obtained, and the values are weighted and summed according to this weight vector to obtain the representation of each position.
Specifically, given an input sequence $x_1, x_2, \ldots, x_n$, the output of each position $i$ can be calculated as follows:
(1) Use the vector representation $x_i$ of the current position $i$ as query, key, and value, respectively, and calculate its similarity scores with the other positions.
(2) Based on these scores, compute the weight vector $W_i$ between the current position $i$ and the other positions.
(3) Weight the values according to the weight vector and sum them to obtain the output vector $Y_i$ of the current position $i$.
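The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used in the study: it assumes scaled dot-product scores and a softmax to turn scores into weights, and it uses each $x_i$ directly as query, key, and value with no learned projections. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(xs):
    """Self-attention in which each x_i serves as query, key, and value.

    xs: list of n vectors (lists of floats) of equal dimension d.
    Returns (outputs, weights): outputs[i] is Y_i and weights[i] is W_i.
    """
    d = len(xs[0])
    outputs, weights = [], []
    for q in xs:
        # (1) similarity of position i with every position j
        #     (scaled dot product; the scaling is an assumption)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in xs]
        # (2) weight vector W_i (softmax is an assumption)
        w = softmax(scores)
        # (3) weighted sum of the values gives Y_i
        y = [sum(w[j] * xs[j][c] for j in range(len(xs))) for c in range(d)]
        outputs.append(y)
        weights.append(w)
    return outputs, weights

xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys, ws = self_attention(xs)
```

Each row of `ws` sums to one, so every output `ys[i]` is a convex combination of the input vectors.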
When calculating the similarity scores in the traditional self-attention mechanism, the dot product or a multi-head attention mechanism is commonly used, which can effectively capture long-range dependencies between sequence elements.
In contrast, the fully convolutional self-attention mechanism is designed specifically for high-dimensional image tasks. This mechanism combines convolution operations with the self-attention mechanism, using a convolution layer with a kernel size of 3 to replace the original dot product operation. Specifically, the mechanism divides the input image into query, key, and value parts, and applies convolution operations to each part to obtain local contextual information. Then, by calculating the correlation scores between each position and the others, a weight vector is obtained, and the values are weighted and summed according to this weight vector to obtain the representation of each position.
In the fully convolutional self-attention mechanism, convolution operations help the model better capture local structural information in the image while reducing computational complexity. Additionally, the mechanism retains the advantage of self-attention in capturing long-range dependencies between elements, improving the performance of the model. In short, the traditional and the fully convolutional self-attention mechanisms adopt different strategies for different types of data, but they share the same basic principle: capture dependencies between data elements by calculating correlation scores, and use these scores to weight and sum the data to obtain the representation of each position.
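The following toy sketch shows one possible reading of the convolutional variant on a 1D signal. It is an assumption-laden illustration, not the architecture of the study: query, key, and value are each produced by a kernel-size-3 convolution with zero padding, scores are products of query and key entries, and a softmax yields the weights. The kernels and function names are hypothetical, and the real model operates on 3D volumes with learned kernels.

```python
import math

def conv1d_k3(signal, kernel):
    """Kernel-size-3 convolution with zero padding (output length equals input length)."""
    padded = [0.0] + list(signal) + [0.0]
    return [sum(kernel[t] * padded[i + t] for t in range(3))
            for i in range(len(signal))]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def conv_self_attention(signal, kq, kk, kv):
    """Self-attention whose query, key, and value come from size-3 convolutions."""
    q = conv1d_k3(signal, kq)  # local context for queries
    k = conv1d_k3(signal, kk)  # local context for keys
    v = conv1d_k3(signal, kv)  # local context for values
    out = []
    for i in range(len(signal)):
        # correlation score between position i and every position j
        w = softmax([q[i] * k[j] for j in range(len(signal))])
        # weighted sum of the values gives the new representation of position i
        out.append(sum(w[j] * v[j] for j in range(len(signal))))
    return out

signal = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0]
smooth = [1 / 3, 1 / 3, 1 / 3]   # hypothetical averaging kernel
identity = [0.0, 1.0, 0.0]       # hypothetical pass-through kernel
out = conv_self_attention(signal, smooth, smooth, identity)
uniform = conv_self_attention(signal, [0.0] * 3, [0.0] * 3, identity)
```

With all-zero query and key kernels every score is zero, so the weights are uniform and each entry of `uniform` equals the mean of the signal.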
In summary, our model architecture blends conventional GAN structures with new design choices. By integrating specialized blocks such as the DoubleConvBlock and the ConvAttentionBlock, our model achieves high fidelity in digital rock generation.

Evaluation Metrics
To quantitatively evaluate and compare the real and the generated digital rock samples, several statistical metrics are used. The metrics adopted in this study include:
Density Distribution Curve: This curve represents the probability distribution of data points in both real and generated digital rock samples. It is essential for visualizing the distribution of pixel values and checking that generated digital rocks resemble real rocks. Kernel density estimation (KDE) is often used for this purpose, given by:
$$\hat{f}(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{x - x_i}{h}\right) \quad (1)$$
where $K$ is the kernel function, $h$ is the bandwidth, and $N$ is the number of data points.
Box Plot: The box plot provides a visual summary of the data's central tendency, variability, and outliers. The central rectangle spans the interquartile range (IQR), the segment inside the rectangle shows the median, and "whiskers" above and below the box show the range outside the middle 50% of the data. The calculation for IQR is:
$$\mathrm{IQR} = Q_3 - Q_1 \quad (2)$$
where $Q_3$ is the third quartile and $Q_1$ is the first quartile. Outliers can be determined using:
$$x < Q_1 - 1.5\,\mathrm{IQR} \quad \text{or} \quad x > Q_3 + 1.5\,\mathrm{IQR} \quad (3)$$
Mean: Represents the average value of the digital rock samples. It is calculated as:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i \quad (4)$$
where $N$ is the total number of data points, and $x_i$ is the value of the $i$th data point.
Standard Deviation: Measures the amount of variation or dispersion of the data points. It provides insights into the variability of the data. The formula is:
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2} \quad (5)$$
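As a rough illustration of how these metrics might be computed, the sketch below uses only the Python standard library. It assumes a Gaussian kernel for the KDE, the conventional 1.5 × IQR rule for outliers, and the population form of the standard deviation. The sample values and function names are made up for the example and are not data from the study.

```python
import math
import statistics

def gaussian_kde(data, x, h):
    """Kernel density estimate at x with a Gaussian kernel K and bandwidth h."""
    n = len(data)
    def kernel(u):
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return sum(kernel((x - xi) / h) for xi in data) / (n * h)

def summary(data):
    """Mean, population standard deviation, IQR, and 1.5*IQR outliers."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.fmean(data),
        "std": statistics.pstdev(data),
        "iqr": iqr,
        "outliers": [v for v in data if v < lo or v > hi],
    }

# Hypothetical per-sample values (for example, porosity of six sub-volumes)
values = [0.10, 0.12, 0.11, 0.13, 0.12, 0.45]
stats = summary(values)
density = gaussian_kde(values, 0.12, h=0.02)
```

Running the same functions on real and generated samples gives directly comparable numbers for each metric.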

Data Sources
In our study, we concentrated on the Estaillades Carbonate images from the Digital Rocks Super Resolution Dataset 1 (DRSRD1) [19]. Estaillades carbonate is a mono-mineralic, calcitic rock that features two types of pores: larger intergranular macropores and smaller intragranular micropores. Our analysis used high-resolution micro-CT images of a 7 mm diameter Estaillades sample. These images prominently capture the intergranular macropores, providing a clear view of the macroscopic pore structures within the carbonate.
The micro-CT images in our dataset have resolutions of 3.8 and 3.1 microns, allowing us to examine the complex pore structures of the Estaillades carbonate in great detail. This dataset also includes a three-phase segmentation of the rock sample, distinguishing pore voxels, voxels with unresolved porosity, and solid voxels. Such detailed segmentation is crucial for our analysis, particularly for evaluating the porosity and permeability characteristics of the carbonate rock.
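To make the role of the three-phase segmentation concrete, the sketch below computes the volume fraction of each phase in a small labeled volume. The label values (0 for pore, 1 for unresolved porosity, 2 for solid) and the tiny example volume are hypothetical; the actual encoding in the dataset may differ.

```python
from collections import Counter

PORE, UNRESOLVED, SOLID = 0, 1, 2  # hypothetical label values

def phase_fractions(volume):
    """Return the volume fraction of each phase in a 3D nested-list volume."""
    counts = Counter(v for plane in volume for row in plane for v in row)
    total = sum(counts.values())
    return {label: counts.get(label, 0) / total
            for label in (PORE, UNRESOLVED, SOLID)}

# A 2 x 2 x 2 toy volume of segmented voxels
volume = [
    [[SOLID, SOLID], [PORE, UNRESOLVED]],
    [[SOLID, PORE], [SOLID, SOLID]],
]
fractions = phase_fractions(volume)
resolved_porosity = fractions[PORE]
```

Here the resolved porosity is simply the fraction of pore voxels; how the unresolved phase contributes to total porosity is a modeling choice that this sketch leaves open.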

Training Analysis
Our experiment uses a training approach built on WGAN, with modifications to suit our specific requirements. The WGAN, or Wasserstein Generative Adversarial Network, is an extension of the traditional GAN paradigm. It seeks to address the unstable training and the inconsistent quality of generated samples that are common in conventional GANs. The innovation lies in the adoption of the Wasserstein distance, given in Equations (6) and (7), as the loss function, coupled with refinements that keep GAN training stable.
Here, D denotes the discriminator, G denotes the generator, x denotes real samples, and z denotes random noise. Our experimental training unfolds in two distinct stages, depicted in Figure 7, which shows the loss trajectory across both stages. The first stage begins with the initialization of parameters for both G and D, either through random seeding or by loading weights from pre-trained models. This is followed by an iterative training regimen in which the generator is updated every five iterations and the discriminator is updated every iteration. This schedule ensures that the discriminator first attains robust capabilities, which in turn enables more efficient and effective guidance for the generator's training.
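For reference, the standard WGAN objectives can be written with the symbols defined above. The expressions below are the commonly used form and are assumed, not confirmed, to correspond to the loss functions referred to in the text:

```latex
% Discriminator (critic) loss: minimized with respect to D
L_D = \mathbb{E}_{z}\left[ D(G(z)) \right] - \mathbb{E}_{x}\left[ D(x) \right]

% Generator loss: minimized with respect to G
L_G = -\,\mathbb{E}_{z}\left[ D(G(z)) \right]
```

Minimizing $L_D$ widens the gap between the scores D assigns to real and generated samples, while minimizing $L_G$ pushes the generated samples towards higher scores.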

During each generator update, a batch of samples is produced from the input random noise vectors and passed to the discriminator, D. The output of D for these generated samples is used to compute the associated loss, with the Wasserstein distance serving as the measure of discrepancy between real and generated samples. The aim is to reduce this generated-sample loss through backpropagation and adjustment of the generator parameters. A cornerstone of the WGAN approach is the constraint imposed on D's parameters: in our experiments, we clip the model parameters to the range −0.1 to 0.1. This guards against pitfalls such as an overpowered discriminator or gradient anomalies during training. These steps repeat until a stipulated number of training cycles is reached or the loss function shows signs of convergence.
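The update schedule and weight clipping of the first training stage can be illustrated with a deliberately tiny, one-dimensional toy problem. This is a sketch under strong simplifying assumptions, not the training code of the study: the critic is a single weight `w` with D(x) = w·x, the generator is a single offset `b` with G(z) = z + b, gradients are written by hand, and the learning rate, batch size, and data are hypothetical. Only the schedule (critic every iteration, generator every fifth iteration) and the clipping range of ±0.1 follow the text.

```python
import random

def clip(value, limit=0.1):
    """Clamp a critic parameter to [-limit, limit] (weight clipping)."""
    return max(-limit, min(limit, value))

def train_toy_wgan(real_mean=3.0, iters=1000, gen_every=5, lr=0.05, seed=0):
    rng = random.Random(seed)
    w = 0.0  # critic parameter: D(x) = w * x
    b = 0.0  # generator parameter: G(z) = z + b
    gen_updates = 0
    for it in range(1, iters + 1):
        real = [rng.gauss(real_mean, 1.0) for _ in range(32)]
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(32)]
        # Critic loss is mean D(fake) - mean D(real); this is its gradient in w.
        grad_w = sum(fake) / len(fake) - sum(real) / len(real)
        w = clip(w - lr * grad_w)  # critic step every iteration, then clip
        if it % gen_every == 0:
            # Generator loss is -mean D(G(z)); its gradient in b is -w.
            b -= lr * (-w)
            gen_updates += 1
    return w, b, gen_updates

w, b, gen_updates = train_toy_wgan()
```

In this toy setting the clipped critic saturates at the clipping bound, and the generator offset drifts steadily towards the mean of the real data.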
Adopting the Wasserstein distance as the loss function has several benefits: it gives a more nuanced measure of the disparity between data distributions, which improves the quality of generated samples while stabilizing GAN training. The second training stage begins by restoring the parameters of G and D obtained in the first stage, followed by simultaneous updates of both in each iteration. In all other respects, the procedure mirrors that of the first stage. One of the inherent difficulties with conventional GANs is the imbalance in training dynamics between the generator and the discriminator. This imbalance often leads to one model overpowering the other prematurely, which destabilizes training. By updating the generator only every fifth iteration, we give the discriminator more opportunity to update and learn. This adjustment balances the learning trajectories of the generator and the discriminator and improves the overall stability of the training process.
Only after the discriminator has reached a proficiency level comparable to the generator do we switch to simultaneous updates for both. This shift improves the balance during training and also raises overall training efficiency.
In summary, our adaptation of the WGAN framework, supplemented by this staged training strategy, addresses the conventional challenges posed by GANs. The loss curves in Figure 7, along with our quantitative analysis, support the efficacy and robustness of the approach. Through these model adjustments, our research contributes to digital core generation and paves the way for more reliable and higher-quality results in future work.
The main training parameters employed in this study are listed in Table 1. Both Phase 1 and Phase 2 of the training process used identical parameters, and the dataset of 1000 carbonate samples was augmented through rotation, resulting in a fourfold increase in dataset size.
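A fourfold increase through rotation is consistent with using the four 90-degree rotations of each volume about one axis. The sketch below illustrates that interpretation. The choice of axis and of 90-degree steps is an assumption, and the function names are hypothetical.

```python
def rot90_plane(plane):
    """Rotate a 2D list of rows by 90 degrees clockwise."""
    return [list(row) for row in zip(*plane[::-1])]

def rotations_about_axis(volume):
    """Return the 0, 90, 180, and 270 degree rotations of a 3D volume about its first axis."""
    out = [volume]
    for _ in range(3):
        # rotate every plane of the previous result by a further 90 degrees
        out.append([rot90_plane(plane) for plane in out[-1]])
    return out

volume = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
augmented = rotations_about_axis(volume)
```

Applied to each of the 1000 samples, this yields 4000 training volumes, the original plus three rotated copies per sample.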

Test Result Analysis
To understand and replicate the intricate details of carbonate rock structures with our model, we first analyzed real carbonate rock digital core samples. These samples serve as our ground truth, providing valuable insights into their geological features.
The micro-CT images of the Estaillades carbonate samples, as shown in Figure 8, exhibit a range of pore sizes and distributions. These images show the heterogeneity inherent within the rock matrix, characterized by varying degrees of porosity and pore connectivity. The contrasts observed in grayscale intensities across the samples indicate differences in material density and pore structure within the mono-mineralic rock, reflecting the natural variability of its calcitic composition. Note that these variations do not imply distinct depositional environments; rather, they represent the intrinsic heterogeneity of the rock sample.

Generated Carbonate Rock Digital Core Analysis
Generated Sample 1: The digitally recreated specimen in Figure 9a successfully replicates typical carbonate features, portraying a mixture of microporosity and larger interconnected channels suggestive of vuggy porosity. Its cut-through view in Figure 9b further emphasizes the dissolution patterns reminiscent of karstic environments, suggesting that the generator has learned to capture complex secondary porosity traits often seen in carbonates.
Generated Sample 2: Sample 2 in Figure 9c demonstrates the model's ability to capture a relatively homogeneous matrix, which likely simulates a carbonate mudstone or a wackestone. The consistent grain and pore distribution in Figure 9d supports this assessment, showcasing a rock that would likely have lower permeability but possibly higher porosity due to its finer grain size.
Generated Sample 3: This sample, presented in Figure 9e, appears to simulate a grainstone facies, given the well-defined intergranular porosity and consistent grain size. Figure 9f underscores the clarity of the porosity channels, hinting at efficient fluid flow. The quality of the digitally generated structure, in terms of both mineralogy and porosity patterns, is commendably accurate.
Generated Sample 4: In Figure 9g, the simulated rock core reveals a heterogeneity characteristic of packstones. The darker regions, possibly indicative of organic matter or micrite envelopes, punctuate the matrix, suggesting a dynamic sedimentary environment. The cross-section in Figure 9h highlights potential areas of diagenetic alteration or mineral replacement, indicative of the intricate diagenetic history often inherent in carbonates.
Generated Sample 5: Sample 5 in Figure 9i portrays characteristics similar to dolomitic carbonates, with a more consistent and crystalline texture. The cleavage patterns and grain boundaries, as observed in Figure 9j, mirror those commonly seen in dolomitized rocks, implying that the generative model has captured these mineralogical textures.
Generated Sample 6: The core in Figure 9k is characteristic of a rudstone or floatstone, with larger grains or clasts seemingly set in a finer matrix. This suggests a high-energy depositional environment with significant clast transport. The mid-section view in Figure 9l reaffirms the presence of these larger components, indicating a digital replication capturing a diverse set of sedimentary structures.

Pore Segmentation Analysis
The transformation from grayscale images to binary representations plays a pivotal role in the precise analysis of the pore structures in 3D digital rock core images. This binary segmentation facilitates a more detailed examination of pore spaces and their distribution, laying the groundwork for subsequent analyses of porosity and pore volumes. To achieve this, we employed the level set method for segmentation and binarization of two-dimensional slices of the digital rock cores, as illustrated in Figure 10.
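As a rough illustration of level set segmentation on a single slice, the sketch below evolves a level set function under a simplified two-phase, Chan-Vese-style energy. The paper does not specify which level set formulation was used, so this is an assumption: the external term pulls each pixel toward the region whose mean intensity it matches, and a plain Laplacian stands in for the curvature-based smoothness term. The function name `level_set_binarize` and all parameter values are hypothetical.

```python
import numpy as np

def level_set_binarize(image, n_iter=200, dt=0.5, mu=0.2):
    """Binarize a 2D grayscale slice with a simplified level set evolution.

    Returns a boolean mask that is True for the darker phase (taken as pores).
    """
    img = np.asarray(image, dtype=float)
    img = (img - img.min()) / (img.max() - img.min() + 1e-12)  # scale to [0, 1]
    phi = img - img.mean()  # initial level set: positive on brighter pixels
    for _ in range(n_iter):
        inside = phi > 0
        c_in = img[inside].mean() if inside.any() else 0.0
        c_out = img[~inside].mean() if (~inside).any() else 0.0
        # External energy term: positive where a pixel resembles the inside mean.
        external = (img - c_out) ** 2 - (img - c_in) ** 2
        # Internal (smoothness) term: discrete Laplacian with periodic borders.
        internal = (np.roll(phi, 1, 0) + np.roll(phi, -1, 0)
                    + np.roll(phi, 1, 1) + np.roll(phi, -1, 1) - 4.0 * phi)
        phi = phi + dt * (external + mu * internal)
    return phi <= 0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    yy, xx = np.mgrid[:64, :64]
    true_pore = (yy - 32) ** 2 + (xx - 32) ** 2 < 12 ** 2  # synthetic dark pore
    slice_2d = np.where(true_pore, 0.2, 0.8) + rng.normal(0.0, 0.05, (64, 64))
    pores = level_set_binarize(slice_2d)
    print("fraction of pixels labelled as pore:", pores.mean())
```

The weight `mu` plays the role of the balance between internal and external energies: larger values give smoother contours at the cost of losing small pores.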

The level set method operates by evolving a contour within the image, driven by an energy function that guides the contour's evolution towards the boundaries of interest. This energy function is a composite of internal and external energies. The internal energy ensures the smoothness of the contour, maintaining minimal variance in pixel values within it. In contrast, the external energy attracts the contour towards the target regions. By judiciously balancing the weights of internal and external energies, accurate segmentation of the rock core slices is achieved.
Throughout the iterative process of contour evolution, the position of the contour is continually updated based on the definition of the energy function, until it converges. Once the contour or region stabilizes, the image can be binarized based on the outcome of this evolution, as shown in Figure 10c,f,i.
This approach to segmentation using the level set method not only allows for accurate delineation of pore spaces but also sets the stage for in-depth analyses of porosity and pore volume in the segmented digital rock cores. The binary images resulting from this process are crucial for quantitatively assessing the porous characteristics of the rocks, thus providing valuable insights into their physical properties.

Grayscale Value Distribution Analysis
To assess the efficacy of our generated digital cores, a quantitative analysis is required. Here, we provide a comparative study of the distribution patterns observed in both real and generated digital rock images. The metrics used for this comparison, including their computational methods, are described in Section 2.4.
Figure 11a illustrates the distribution density of values from both real and generated images. Notably, the profiles of these distributions are strikingly similar, suggesting that the generated digital cores bear a close resemblance to their real counterparts in terms of value distribution. As expressed in Equation (1), the bandwidth used in Figure 11a is determined using Scott's rule, which is calculated as $n^{-1/5} \times \mathrm{sd}$, where $n$ represents the sample size of 1000 and $\mathrm{sd}$ represents the standard deviation of the data. The kernel function used is the Gaussian kernel, $K(u) = \frac{1}{\sqrt{2\pi}} e^{-u^{2}/2}$.
The summary statistics for both categories are presented in Table 2. Table 2 underscores the close resemblance in terms of mean, standard deviation, and extremes between the real and generated images. Minor deviations appear, especially in the minimum values of the generated images, but they are small. Such consistency reaffirms the robustness of the image generation model in replicating the nuances of real digital cores.
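The density estimate described for Figure 11a can be sketched in a few lines of NumPy. This is a minimal example under stated assumptions: the bandwidth follows the rule quoted in the text (n to the power -1/5 times the standard deviation), the standard deviation is taken as the sample standard deviation (`ddof=1`), and the synthetic "real" and "generated" values are placeholders rather than data from the paper. The function names are illustrative.

```python
import numpy as np

def scott_bandwidth(samples):
    """Bandwidth h = n**(-1/5) * sd, as quoted in the text for 1D data."""
    samples = np.asarray(samples, dtype=float)
    return samples.size ** (-1.0 / 5.0) * samples.std(ddof=1)  # ddof=1 is an assumption

def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    h = scott_bandwidth(samples) if bandwidth is None else float(bandwidth)
    u = (grid[:, None] - samples[None, :]) / h        # scaled distances, shape (grid, n)
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (samples.size * h)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder grayscale statistics; not values from the paper.
    real = rng.normal(120.0, 20.0, 1000)
    generated = rng.normal(121.0, 21.0, 1000)
    grid = np.linspace(0.0, 255.0, 256)
    gap = np.abs(gaussian_kde_1d(real, grid) - gaussian_kde_1d(generated, grid)).max()
    print("largest pointwise density gap:", gap)
```

Overlaying the two resulting curves on a common grid gives the kind of comparison shown in Figure 11a.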

Porosity Distribution Analysis
In this section, we extend the analysis to the porosity distribution within the digital rock samples. We synthesized 1000 digital rock samples and computed their porosity, comparing these findings with those derived from an equal number of original samples. The histogram in Figure 12 shows the comparative porosity distribution between the real and generated digital rock samples. The distribution of porosity is congruent across both sets of samples, with no instances of anomalously low or high porosity values detected within the generated subset. The histogram indicates that the synthetic generation process has high fidelity, capturing the intrinsic porosity characteristics with commendable accuracy. It is noteworthy that the porosity values of the generated samples did not deviate significantly from those of the natural samples, suggesting the absence of overfitting or underfitting in the generative model employed.
The distribution curves manifest a slight deviation in the tails, yet these differences are statistically insignificant. Such minor discrepancies do not detract from the overall conclusion that the generative model can reliably replicate the porosity distribution of natural digital rock samples. Thus, we posit that the generative model exhibits a high degree of validity, rendering it a potent tool for the simulation of digital rock porosity, which could be indispensable for predictive modeling and reservoir characterization.
In conclusion, the visual and quantitative assessments jointly affirm the quality and authenticity of our generated digital cores, marking a significant stride in the domain of digital rock physics.
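The porosity comparison can be illustrated with a short sketch. It assumes that each sample is available as a binary 3D array in which pore voxels are marked with 1 (the output of the segmentation step), that porosity is the fraction of pore voxels, and that both histograms share the same bin edges so they can be overlaid as in Figure 12. The function names and the random test volumes are hypothetical.

```python
import numpy as np

def porosity(binary_volume, pore_value=1):
    """Porosity = number of pore voxels divided by total number of voxels."""
    volume = np.asarray(binary_volume)
    return np.count_nonzero(volume == pore_value) / volume.size

def porosity_histograms(real_volumes, generated_volumes, bins=20):
    """Histogram real and generated porosities on shared bin edges."""
    real = np.array([porosity(v) for v in real_volumes])
    generated = np.array([porosity(v) for v in generated_volumes])
    edges = np.histogram_bin_edges(np.concatenate([real, generated]), bins=bins)
    real_counts, _ = np.histogram(real, bins=edges)
    generated_counts, _ = np.histogram(generated, bins=edges)
    return edges, real_counts, generated_counts

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder binary volumes with roughly 10% pore voxels; not paper data.
    real = [rng.random((16, 16, 16)) < 0.10 for _ in range(50)]
    generated = [rng.random((16, 16, 16)) < 0.11 for _ in range(50)]
    edges, real_counts, generated_counts = porosity_histograms(real, generated)
    print(real_counts, generated_counts)
```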

Comparative Analysis of Pore Volume Distribution
This subsection compares the pore volume distribution in carbonate rock samples, using two real and two generated specimens for a focused comparative study. The porosities of the real samples, designated Real1 and Real2, were determined to be 0.109 and 0.12, respectively, while the generated samples, labeled Generate1 and Generate2, exhibited porosities of 0.12 and 0.06, respectively.
Figure 13 shows the selected samples, presenting the original grayscale images alongside their binary transformations and porosity mappings. This three-part display allows for an intuitive understanding of the pore structures: despite variations in individual porosity, a similarity in the overall distribution patterns is discernible. The pore volume distribution graph in Figure 14 further confirms the likeness in distribution patterns between the real and generated samples. It is particularly noteworthy that despite the apparent disparity in porosity between Real2 and Generate2, the pore volume distribution is remarkably similar, attesting to the generative model's capability to capture the essential characteristics of pore distributions within carbonate rocks.
These visual and quantitative analyses substantiate the proficiency of the generative model in learning and reproducing the pore distribution patterns characteristic of carbonate rocks. Such fidelity not only validates the model's application for simulating realistic rock structures but also enhances the credibility of using these generated samples as reliable proxies in reservoir analysis and modeling. The implications are significant, promising improved accuracy of subsurface characterizations and better predictive insights in hydrocarbon exploration and recovery.
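One plausible way to obtain a pore volume distribution from a binary volume is to label connected pore bodies and record the voxel count of each. The paper does not define its procedure, so the sketch below is an assumption: it uses 6-connectivity, treats nonzero voxels as pore space, and reports an empirical cumulative distribution. The breadth-first search is written out for self-containment; library routines for connected-component labelling perform the same task more efficiently. All names are illustrative.

```python
from collections import deque

import numpy as np

def pore_volumes(binary_volume):
    """Return the sorted voxel counts of 6-connected pore bodies in a 3D volume."""
    pores = np.asarray(binary_volume).astype(bool)
    visited = np.zeros(pores.shape, dtype=bool)
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    sizes = []
    for start in zip(*np.nonzero(pores)):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        size = 0
        while queue:  # breadth-first flood fill of one pore body
            z, y, x = queue.popleft()
            size += 1
            for dz, dy, dx in offsets:
                nb = (z + dz, y + dy, x + dx)
                in_bounds = all(0 <= c < s for c, s in zip(nb, pores.shape))
                if in_bounds and pores[nb] and not visited[nb]:
                    visited[nb] = True
                    queue.append(nb)
        sizes.append(size)
    return np.array(sorted(sizes), dtype=int)

def cumulative_distribution(volumes):
    """Empirical CDF: sorted pore volumes and the fraction of pores at or below each."""
    volumes = np.sort(np.asarray(volumes))
    return volumes, np.arange(1, volumes.size + 1) / volumes.size

if __name__ == "__main__":
    demo = np.zeros((5, 5, 5), dtype=int)
    demo[0:2, 0:2, 0:2] = 1   # one pore body of 8 voxels
    demo[4, 4, 4] = 1         # one isolated pore voxel
    print(pore_volumes(demo))
```

Curves built this way for real and generated samples can be overlaid to compare distribution shapes even when the total porosities differ, which is the situation described for Real2 and Generate2.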

Conclusions
In this research, we explored the potential of advanced generative models for simulating digital rock core samples. Our approach was centered on a state-of-the-art model that was particularly adept at capturing the intricate geological and mineralogical features of real rock samples.
The synthesized images closely resembled real samples in terms of geological features, such as porosity. Through rigorous evaluations, the model proved its effectiveness in replicating essential geological attributes, underscoring its relevance for in-depth geological studies.
Quantitative analysis, supported by statistical data and distribution plots, confirmed a strong alignment between the generated and actual rock samples. This consistency highlights the model's capability to maintain the primary characteristics of rock samples while introducing subtle variations that reflect natural formations.
Significantly, our model's integration of the self-attention mechanism emerged as a key innovation. This feature allowed the model to focus on intricate patterns and relationships within the images, enhancing the quality and authenticity of the generated rock core slices. The ability to accurately represent unique carbonate rock features, like dissolution, further accentuated its utility and innovation.
In conclusion, our study offers a significant contribution to the field of digital rock physics. It emphasizes the synergy of advanced computational techniques with geology, setting a precedent for future interdisciplinary research in the domain.

Figure 1. Sequential process of digital rock reconstruction using GAN. Dataset Input: During this phase, actual rock samples are fed into the model. These samples serve as the foundation, providing real-world data that our model learns from. The green arrows in the figure represent the iterative process of training, where through each iteration, the model progressively refines its learning based on the input data. Self-Attention Enhanced Generative Adversarial Network (SAE-GAN) Training: With the real data samples as input, the model begins training. Utilizing the self-attention mechanism, the SAE-GAN learns the intrinsic features and distributions present in carbonate rocks, focusing particularly on minerals and pores.

Figure 3. Architecture of the generator with fully convolutional self-attention network.

Figure 7. Loss curve of (a) Phase 1 and (b) Phase 2. The legends for the first and second phases are identical.

Figure 8. Digital core images of representative carbonate rock samples. (a,c,e,g) are complete samples, while (b,d,f,h) represent mid-section views of the corresponding samples.

Figure 9. Digitally generated carbonate rock samples using GAN. (a,c,e,g,i,k) show complete digital cores, while (b,d,f,h,j,l) provide mid-section views of the corresponding samples.

Figure 10. Segmentation and binarization of our generated digital rock. (a,d,g) are two-dimensional slice images derived from the generated three-dimensional digital rock core images, while (b,e,h) are images obtained after segmenting the two-dimensional slice images using the level set algorithm. (c,f,i) are the binary images obtained after segmentation.

Figure 11. Quantitative comparison of real vs. generated digital rock cores: (a) density distribution curve; (b) box plot.

Figure 11b presents box plots that capture the dispersion and central tendency of values from real and generated images. The median and interquartile range (IQR) from both categories demonstrate close proximity, further underscoring the alignment of generated cores with their real counterparts. The summary statistics for both categories are presented in Table 2.

Figure 12. Overlay of porosity distribution histograms for real and generated digital rock samples.

Figure 14. Comparative pore volume distribution curves for selected real and generated rock samples.

Table 1. Training parameters for both phases of the experiment.

Table 2. Comparison of statistical metrics between real and generated digital rock samples.