Advanced Elastic and Reservoir Properties Prediction through Generative Adversarial Network

Abstract: The prediction of subsurface properties such as velocity, density, porosity, and water saturation has long been a main focus of petroleum geosciences. Advanced methods such as Full Waveform Inversion (FWI), Joint Migration Inversion (JMI), and ML-rock physics are able to produce better predictions than their predecessors, but they still require tedious manual interpretation that is prone to human error. Research on these methods remains open as they suffer from technical limitations. As computing resources become cheaper, the use of a single deep generative adversarial network becomes feasible for predicting all these properties in a completely data-driven manner. With our proposed method of multiscale pix2pix applied to the SEG SEAM salt data, we have managed to map from one input, post-stack seismic data, to several outputs of reservoir and elastic properties such as porosity, velocity, and density by using only one trained model and without having to manually interpret or pre-process the input data. With 90% accuracy on the synthetic data test, the method is worthy of exploration by the petroleum geoscience fraternity.


Introduction
Accurate prediction of elastic and reservoir properties is crucial for obtaining a more precise image of the subsurface. This leads to the correct placement of a well, reducing risks and uncertainties and thus increasing the chance of success in finding hydrocarbons. Velocity model building (VMB) as well as seismic inversion and rock physics studies have been used to achieve these objectives. Although significant advances have been made in these methods, they remain an open topic with a lot of ongoing research. There are several approaches to VMB [1]. The most conventional method is tomography, an iterative process involving several steps, such as sorting the data into common image points (CIP) and picking the reflections to flatten the image gathers prior to producing the final velocity model. Another widely used approach is Full Waveform Inversion (FWI) [2], which matches calculated data with observed data by considering amplitudes and traveltimes [3,4]. Traditional FWI is depth-limited as the diving wave does not penetrate deep enough into the subsurface; hence, reflection FWI (RFWI) was proposed [5]. RFWI suffers from a highly non-linear coupling of density and velocity. This problem can be solved by Joint Migration Inversion (JMI) [6,7], which manages to reduce the non-linearity of the inversion. However, JMI is based on a one-way wave equation, which is inferior to FWI. In 2018, a hybrid FWI-JMI was proposed [8]. Seismic inversion combined with a rock physics study is another method used in reservoir characterization. Unlike VMB, this method is much more focused on the reservoir itself than on the entire seismic volume [9,10]. In a multi-step inversion approach, elastic properties such as P-impedance, S-impedance, and density are produced by stochastic or deterministic inversion of seismic data. Next, at a well, a rock physics model is developed connecting these elastic properties to reservoir properties. Alternatively, high-resolution models of the reservoir properties are created in a single-loop manner before rock physics transformations are used to create the elastic property volumes. After that, calculated synthetic seismic traces are compared to actual seismic data. In order to combine these multi-step and single-loop approaches, Grana, D. [9] suggested using the output from the first approach as a priori information for the second approach. As was previously established, inversion and rock physics are reservoir-centric, and accuracy decreases as reservoir depth increases. A regional rock physics template (RPT), which is based on the rock physics inclusion model, has been proposed to address these issues [11].

Related Work
AI-Velocity Model. The application of neural networks to velocity model building dates back to the 1990s. Röth, G. et al. [12] used a neural network to predict 1D velocity from shot gathers. Similar work has recently been published [13] that maps the 1D vertical velocity profile from data cubes of the neighboring common midpoint (CMP) gathers. That work applied an advanced DL architecture, the Visual Geometry Group (VGG) network, and is able to accommodate lateral heterogeneity in the model, a problem faced by [14] in generalizing the neural network training, by creating four different sets of geologically inspired geometric training models with and without a background velocity gradient. The authors managed to produce plausible results with only one neural net (NN) and one fully connected (FC) layer as the network architecture. The earlier work of [15] has also shown the feasibility of the U-NET architecture in approximating the non-linear mapping from multi-shot gathers to the velocity model. The authors also suggested that a generative adversarial network (GAN) be used in future testing. Another example is given by [16] in a paper on DL tomography, in which the authors applied three dense layers to learn the tomography operator from the seismic data. Based on the examples here, most model-building techniques focus only on building the velocity model [13,14,16,17] and, in certain cases, the density model [18].
AI-Rock Physics. The application of machine learning or deep learning in rock physics has gained attention in recent years. One of the early applications used two simple 1D convolutional layers to solve the seismic inversion problem of predicting the elastic model of the subsurface from recorded seismic data [19]. The authors generated a numerical training set of P-impedance realizations with the corresponding seismic responses and were able to obtain a well-generalized network. In another example, convolutional neural networks (CNNs) were used to predict reservoir properties directly in the depth domain given time-domain pre-stack seismic data [20]. The feasibility of the method was shown by comparing two CNN networks, namely PetroNet (end-to-end CNN) and ElasticNet-ElasticPetroNet (cascaded CNN). In addition to deriving the reservoir and elastic properties directly from the seismic data, a neural net has also been tested in predicting the reservoir properties from the elastic properties [21]. Several inputs of different elastic properties were passed together into a three-layer neural net to produce the reservoir properties.
Generative Adversarial Networks. Deep learning is a subset of machine learning, and machine learning itself is a subset of artificial intelligence. A generative adversarial network, or GAN [22], is a two-network architecture that consists of a generator that produces the translated image and a discriminator that tries to classify whether its input is generated (fake) or coming from the label set (real). GAN has become one of the best deep learning networks due to its ability to produce remarkable photorealistic results in many computer vision tasks such as image generation [22], image transformation or translation [22,23], styling [24], and super-resolution [25,26]. Furthermore, many GAN networks have also been applied to medical image segmentation [27-33]. Despite the success of GAN in both domains, the application of GAN in geosciences is limited. Some of the applications in geosciences are compressive sensing [34], facies classification [35], and rock-type inference [36,37].
In this paper, we modified a generative deep learning (GAN) method called pix2pix, proposed by Isola, P. et al. [22], and applied it to synthetic post-stack seismic data. We verified the feasibility of the method in predicting or mapping the reservoir and elastic properties directly. First, the input and target data were prepared as required by the DL network. Then, we split the data into training/validation and testing sets and trained for a number of epochs. Next, we looked at the validation results and carried out testing on the unseen data. Finally, we discuss those results, the shortcomings, and the way forward for this method.
Overall, there are two main contributions of our work:

1. The groundwork of a new method, alongside velocity model building and rock physics, to predict subsurface properties that is entirely data-driven without any manual interaction.

2. The multiscale PatchGAN extracts features at different scales and thus improves the accuracy of the final predictions.

Pix2pix and Multiscale PatchGAN of Pix2Pix
Pix2pix is a generalized image-to-image translation network that is based on a conditional GAN [22]. It comprises a generator, G, and a discriminator, D. In the original pix2pix network, G is a U-NET while D is a PatchGAN. G takes the observed data x and the random noise vector z as its input and generates an output, ŷ:

ŷ = G(x, z). (1)

G is trained to produce plausible outputs that are indistinguishable from the label data (real), while D is trained to detect whether its incoming input is real (from the label data) or fake (generated by G). Both G and D compete in a min-max game in which G tries to minimize its loss by producing output as similar as possible to the true label (Figure 1), while D tries to maximize its "correctness" at detecting whether its incoming input is real, y, or fake, ŷ (Figure 2). The ability of pix2pix to transform different images is helped by its two objective functions: the reconstruction loss, l_L1(G) (Equation (2)), and the conditional GAN adversarial loss, l_cGAN(G, D) (Equation (3)). In the reconstruction loss, the difference between the label y and the generated output ŷ is calculated as an L1 distance, as L1 encourages less blurriness [22]:

l_L1(G) = E_{x,y,z}[ ||y − G(x, z)||_1 ]. (2)

The adversarial loss, on the other hand, is given by the sum of the likelihood of D correctly classifying the real data as real and the generated data as fake:

l_cGAN(G, D) = E_{x,y}[log D(x, y)] + E_{x,z}[log(1 − D(x, G(x, z)))]. (3)

The final objective function is

G* = arg min_G max_D l_cGAN(G, D) + λ l_L1(G), (4)

with λ being the regularization coefficient.
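The two losses and the combined generator objective above can be sketched numerically. The following is a minimal NumPy illustration (not the authors' code), assuming discriminator outputs are probabilities in (0, 1):

```python
import numpy as np

def l1_loss(y, y_hat):
    # Reconstruction loss (Equation (2)): mean absolute difference
    # between the label y and the generated output y_hat.
    return np.mean(np.abs(y - y_hat))

def cgan_adversarial_loss(d_real, d_fake, eps=1e-12):
    # Adversarial loss (Equation (3)) from the discriminator's side:
    # likelihood of classifying label data as real and generated data as fake,
    # written in negative-log (binary cross-entropy) form.
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_objective(y, y_hat, d_fake, lam=100.0, eps=1e-12):
    # Generator total loss: fool D (push d_fake toward 1) plus the
    # lambda-weighted L1 reconstruction term, as in Equation (4).
    adv = -np.mean(np.log(d_fake + eps))
    return adv + lam * l1_loss(y, y_hat)
```

A perfect generator output (y_hat equal to y, discriminator fully fooled) drives both terms to essentially zero.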

Proposed Network
The proposed network is an enhanced version of the original pix2pix.The general idea is to use the original pix2pix in two (2) runs.The first run (Figure 3), together with all the default hyperparameters, is used to obtain the base model.This base model is re-trained in the second run, but this time, the kernel size of the discriminator is changed from 4 × 4 to 3 × 3 (Figure 4).In theory, a smaller kernel size would mean high-resolution feature extraction, but it cannot be too small or there will not be enough information for the kernel can extract.In addition, the proposed solution is also based on StarGAN [23], an approach in which only a single network is trained for performing image-to-image translation of multiple domains.Such a unified model architecture allows for the simultaneous training of multiple datasets with different domains within a single network.We have adapted this idea of transforming the seismic data into four different reservoir and elastic properties by using only one model.The summary of our proposed network can be seen in Table 1.
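The effect of shrinking the discriminator kernel can be made concrete by computing the receptive field of one PatchGAN output unit. The sketch below assumes the standard five-layer pix2pix discriminator layout (three stride-2 convolutions followed by two stride-1 convolutions); under that assumption, moving from 4 × 4 to 3 × 3 kernels shrinks the patch each discriminator output "sees" from 70 × 70 to 47 × 47 pixels, i.e., the second run judges realism at a finer scale:

```python
def receptive_field(layers):
    # Receptive field of one discriminator output unit for a stack of
    # convolutions given as (kernel_size, stride) pairs.
    r, j = 1, 1  # current receptive field and cumulative stride ("jump")
    for k, s in layers:
        r += (k - 1) * j  # each layer widens the field by (k - 1) * jump
        j *= s
    return r

# Assumed five-layer PatchGAN layout of the original pix2pix discriminator:
base = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]   # 4 x 4 kernels -> 70 x 70 patch
small = [(3, 2), (3, 2), (3, 2), (3, 1), (3, 1)]  # 3 x 3 kernels -> 47 x 47 patch
```

The base model therefore scores coarse 70 × 70 patches while the re-trained model scores 47 × 47 patches, which is the "multiscale" aspect of MS-pix2pix.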
Appl. Sci. 2023, 13, x FOR PEER REVIEW

Figure 1. A diagram for generator training. The generator takes the input and is trained to produce output that is similar and indistinguishable from the label data. The prediction is generated by the generator and the target is the label. The reconstruction loss is calculated based on the difference between the generated output and the label. The generated output is also sent to the discriminator to "fool" it. The adversarial loss is given by the ability of the discriminator to correctly classify the generated output as fake and the label as real. The total loss for the generator training is the sum of the reconstruction loss and the adversarial loss.


Experiments
In this section, we validated our proposed method on the SEG SEAM salt dataset. The dataset is a 3D representation of a deepwater Gulf of Mexico salt domain, complete with fine-scale stratigraphy that includes oil and gas reservoirs. The model extends 35 km east-west, 40 km north-south, and 15 km in depth with a 10 m grid in all directions. The stratigraphic variation was created based on geostatistics. All model properties were derived from fundamental rock properties that follow typical compaction gradients below the water bottom. Hence, properties have subtle contrasts at microlayer boundaries, especially in the shallow section, generating very realistic synthetic data [38]. In our project, the datasets that we used were the post-stack seismic, velocity, porosity, density, and water saturation.

Figure 5 shows the complete workflow for the experiment.


A. Data preparation
Each of the datasets (except for porosity) was normalized to the 0-1 range based on the property's fixed values, i.e., 1500-5000 for velocity, −0.3-0.3 for seismic amplitude, and 1-5 for density, by using the min-max normalization method. We then selected 1000 inlines from each of the datasets and transformed them into patches sized 256 × 256. The total number of patches for each dataset was 25,000. We then inputted the seismic data as our x and velocity, porosity, density, and water saturation as our y (y1, y2, y3, y4) into the data loader. We then split the data loader into 90% training and 10% validation. The 90% of training data was then fed into the network in random order.
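The preparation step above can be sketched as follows. This is an illustrative NumPy version, not the authors' code; the exact patch layout is not specified in the paper, so non-overlapping tiling is assumed:

```python
import numpy as np

def minmax_normalize(data, lo, hi):
    # Map values from the fixed physical range [lo, hi] into [0, 1].
    return (data - lo) / (hi - lo)

def to_patches(section, size=256):
    # Split a 2D inline section into non-overlapping size x size patches
    # (assumed tiling; edge remainders are dropped).
    h, w = section.shape
    return [section[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

# Example: a synthetic 1024 x 1280 velocity section normalized with the
# fixed 1500-5000 range used in the paper, then cut into 256 x 256 patches.
vel = np.random.uniform(1500.0, 5000.0, (1024, 1280))
norm = minmax_normalize(vel, 1500.0, 5000.0)
patches = to_patches(norm)
```

The same call with the ranges −0.3-0.3 and 1-5 would cover the seismic amplitude and density volumes, respectively.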

B. Training/validation
We started the training with the default pix2pix settings for up to 2000 epochs. We implemented early stopping [39] manually by doing quality control (QC) on the validation results and model every 50 epochs. The advantage of implementing early stopping is that we do not have to wait until the training has completed all 2000 epochs. If we have a good intermediate model, we can stop the training early. We selected the model that gave the highest validation accuracy as our base model, which in this case was model 1000 (Figure 6). Next, we re-trained the base model using the same parameters except for the discriminator kernel size, which we changed from 4 to 3. We implemented the same early-stopping strategy. Here, the best model was model 0, as it gave the highest accuracy for the validation set (Figure 7).
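The manual early-stopping QC amounts to checkpoint selection over the validation log. A minimal sketch follows; `val_accuracy_by_epoch` is a hypothetical QC log for illustration, not the authors' data:

```python
def select_best_model(val_accuracy_by_epoch, interval=50):
    # Manual early stopping: inspect the validation accuracy every `interval`
    # epochs and keep the checkpoint with the highest score.
    best_epoch, best_acc = None, float("-inf")
    for epoch, acc in sorted(val_accuracy_by_epoch.items()):
        if epoch % interval != 0:
            continue  # QC is only performed at checkpoint epochs
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc
    return best_epoch, best_acc

# Hypothetical QC log in which accuracy peaks at epoch 1000,
# mirroring the first training run described above.
log = {50: 0.61, 500: 0.78, 1000: 0.90, 1500: 0.88, 2000: 0.87}
```

With this log, `select_best_model(log)` returns epoch 1000 as the base model, and training could have been stopped at that checkpoint.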

C. Testing
Now that we had our base model (pix2pix) and final model (MS-pix2pix, the proposed method), we conducted testing on 2 inlines that were not previously included during the training phase (Figure 8). Inline 1 was chosen to represent an area with simple geology, layer-cake strata with a small salt body, while Inline 2 represented an area with complex geology and a big salt body. The inline sizes were 1024 × 1024, and we applied normalization to them before they were inputted into both models. We compare the results of both models and of both inlines in the next section.


Results
This section is organized as follows. Each figure contains 3 images: the prediction from pix2pix, the prediction from MS-pix2pix (the proposed method), and the ground truth. The first 2 figures are the velocity predictions for Inline 1 and Inline 2. The next 2 figures are the porosity predictions, followed by density and lastly water saturation. The accuracy metrics used were correlation (Equation (5)) and the structural similarity index, SSIM (Equation (6)). The SSIM metric calculates the similarity between 2 images based on 3 key features: luminance, l; contrast, c; and structure, s [41]. Both metrics range between −1, which means the 2 images are very different, and +1, which means the 2 images are very similar. The results can be found in Table 2.

SSIM(y, y′) = [l(y, y′)]^α · [c(y, y′)]^β · [s(y, y′)]^γ (6)

Table 2. The prediction accuracy was calculated using cross-correlation and SSIM. MS-pix2pix produced higher-accuracy predictions for all properties and for all inlines.

MS-pix2pix managed to produce predictions of higher accuracy, both qualitatively and quantitatively, compared with the pix2pix network for all subsurface properties (Figures 9-16). Although the predictions were better, there are still some areas that require further attention. In Figures 9, 11, 13 and 15, for example, MS-pix2pix improved the predictions of pix2pix by a lot. However, there is a wrong prediction in which a salt body is predicted at a shallow depth. This could be due to the "dimmed" amplitude in the seismic data that resembles the salt body signature. In Figure 16, although the prediction from MS-pix2pix is better, it is over-predicted: there should not be a layer inside the salt body. As pix2pix is conditioned on the input image, the layer might be carried over from the input seismic. It is also important to note that although both accuracy metrics gave high accuracy for water saturation, the network actually failed to detect the water layer. The high accuracy was given by the salt and non-salt formations of the water saturation model. The salt and non-salt formations must be removed from the training for a more accurate water saturation prediction.
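The two metrics can be sketched as follows. The SSIM here is a single-window (global) variant with α = β = γ = 1 and the usual stabilizing constants, a simplification of the windowed SSIM normally reported; with those exponents the l · c · s product collapses to the familiar closed form:

```python
import numpy as np

def pearson_correlation(y, y_pred):
    # Cross-correlation between flattened images; ranges from -1 to +1.
    return float(np.corrcoef(y.ravel(), y_pred.ravel())[0, 1])

def global_ssim(y, y_pred, L=1.0):
    # Single-window SSIM with stabilizing constants C1 = (0.01 L)^2 and
    # C2 = (0.03 L)^2, where L is the data range (1.0 for normalized images).
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mu_y, mu_p = y.mean(), y_pred.mean()
    var_y, var_p = y.var(), y_pred.var()
    cov = ((y - mu_y) * (y_pred - mu_p)).mean()
    return float(((2 * mu_y * mu_p + c1) * (2 * cov + c2)) /
                 ((mu_y ** 2 + mu_p ** 2 + c1) * (var_y + var_p + c2)))
```

Identical images score +1 on both metrics, and a perfectly inverted image scores −1 on the correlation, matching the ranges quoted above.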

Application on Field Data
Applying deep learning to field data is a very challenging task for two main reasons: (1) the residual noise present in the seismic data and (2) the unavailability of true label data. To test the robustness of our approach on field data, we used seismic data from a Malaysian field with the labels generated by A. Fuad, M.I., et al. [42] via rock-physics-guided, deep-learning-based properties inversion (Figure 17). The same workflow was applied to the field data, such as normalization and transformation into 256 × 256 patches. The total number of patches was 512, of which 90% were used for training and the other 10% for validation. We repeated the same procedure for training. The best model from the first training was re-trained with a smaller kernel size for the discriminator. The best model from this run was then used for testing. The testing data (Figure 18) was a patch that was never used in training or validation.

Before examining the results in detail, let us look at the accuracy metric. Our objective here was to predict values as close as possible to the real values. We therefore established an accuracy metric (Figure 19) that compared the value of each element with a 5% margin of error. We counted the number of correct predictions and divided it by the total number of elements to get our final accuracy number.
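A sketch of this accuracy metric follows. The paper does not state whether the 5% margin is absolute or relative, so a relative margin is assumed here:

```python
import numpy as np

def margin_accuracy(y_true, y_pred, margin=0.05):
    # Element-wise comparison: a prediction counts as correct when it lies
    # within `margin` (5% by default, assumed relative) of the true value.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    correct = np.abs(y_pred - y_true) <= margin * np.abs(y_true)
    # Number of correct predictions divided by the total number of elements.
    return float(correct.mean())
```

For example, with true values of 100 throughout, predictions of 104, 106, 95, and 100 yield three elements inside the 5% margin and an accuracy of 0.75.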
Table 3 shows the results of the field data test. In general, MS-pix2pix produced predictions of higher accuracy than the predictions coming from the pix2pix network (Figure 20). MS-pix2pix (Figure 21) managed to predict the velocity and density accurately. This would help in producing better-migrated images of the subsurface, thus resulting in a better interpretation of the geological features of the area. The porosity prediction was also improved with MS-pix2pix, which can lead to a better delineation of good reservoir rocks. However, the prediction of water saturation remains a challenge.

Conclusions and Discussion
In this paper, we lay the groundwork for implementing a deep learning model to predict subsurface properties. We propose multi-scale pix2pix (MS-pix2pix), which is based on pix2pix and a conditional generative adversarial network, as an alternative for addressing the non-linearity and non-uniqueness problems encountered in geosciences. We devised an experiment to test the method's ability to transform post-stack seismic data into velocity, porosity, density, and water saturation using just one single network. The results of the synthetic data test were encouraging: MS-pix2pix produced better results than pix2pix, and although there were mispredictions, the general prediction matched the ground truth. The results of the field data test were less accurate, even though MS-pix2pix still produced better results than pix2pix. This was expected, as the real data contained many unknowns and much noise that were not present in the synthetic data. The accuracy of the field data test varied with the predicted property: velocity and density predictions recorded high accuracy, whereas porosity and water saturation predictions recorded low accuracy. In both tests, synthetic and field, the prediction of water saturation remained challenging. So why were we able to predict velocity, porosity, and density, but failed to predict water saturation? One way to look at this is that the three properties have continuous values that work well with our regression network, MS-pix2pix. Water saturation, on the other hand, is normally quantified as 0 or 1, which is better suited to a classification problem. In the synthetic test, the salt and background formation needed to be removed from the label to allow a fairer judgment of the water saturation prediction. Moving forward, the proposed MS-pix2pix model looks promising and could serve as a base for further research.

Figure 1.
Figure 1. A diagram for generator training. The generator takes the input and is trained to produce output that is similar and indistinguishable from the label data. The prediction is generated by the generator and the target is the label. The reconstruction loss is calculated from the difference between the generated output and the label. The generated output is also sent to the discriminator to "fool" it. The adversarial loss is given by the ability of the discriminator to correctly classify the generated output as fake and the label as real. The total loss for the generator training is the sum of the reconstruction loss and the adversarial loss.
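The loss structure described in Figure 1 can be written out concretely. The sketch below uses plain NumPy for clarity; the L1 reconstruction term and the λ = 100 weighting follow the original pix2pix formulation, while the function names and the probability-valued discriminator score `d_fake` are our assumptions.

```python
import numpy as np

def reconstruction_loss(fake, label):
    # L1 distance between the generated output and the label
    return np.mean(np.abs(fake - label))

def adversarial_loss(d_fake, eps=1e-8):
    # The generator is rewarded when the discriminator scores its
    # output as "real": binary cross-entropy against the real target.
    return -np.mean(np.log(d_fake + eps))

def generator_loss(fake, label, d_fake, lam=100.0):
    # Total generator loss: adversarial loss plus the reconstruction
    # loss, weighted by lambda (100 in the original pix2pix).
    return adversarial_loss(d_fake) + lam * reconstruction_loss(fake, label)
```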

Figure 2.
Figure 2. A diagram for discriminator training. The discriminator loss is the adversarial loss, which is the ability of the discriminator to correctly classify the generated output as fake and the label as real. It is worth noting that in a conditional GAN, the discriminator sees the input too.

Figure 3.
Figure 3. The original pix2pix network. The original pix2pix is used to obtain the base model.

Figure 4.
Figure 4. The architecture of the proposed method. Once the base model is obtained as in Figure 3, the model is re-trained, but this time a smaller kernel size is used for the discriminator.

Figure 5.
Figure 5. The complete workflow of the proposed method. The required data were post-stack seismic, velocity, porosity, density, and water saturation. All the data were normalized from 0 to 1 using min-max normalization. One thousand inlines were selected from these data and transformed into 256 × 256 patches. The patches were fed into the data loader and trained in random order using MS-pix2pix (a combination of two pix2pix networks with different kernel sizes). The final model was then selected and tested on unseen data.
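The normalization and patch-extraction steps in the workflow above can be sketched as follows. The non-overlapping tiling and the function names are our assumptions, since the caption does not specify how the patches were cut.

```python
import numpy as np

def minmax_normalize(section):
    """Scale a 2-D section to the range [0, 1] (min-max normalization)."""
    lo, hi = section.min(), section.max()
    return (section - lo) / (hi - lo)

def to_patches(section, size=256):
    """Cut a section into non-overlapping size x size patches;
    edge regions that do not fill a full patch are dropped."""
    rows, cols = section.shape
    return [section[r:r + size, c:c + size]
            for r in range(0, rows - size + 1, size)
            for c in range(0, cols - size + 1, size)]

# A 512 x 768 inline yields 2 x 3 = 6 patches of 256 x 256
patches = to_patches(minmax_normalize(np.random.rand(512, 768)))
```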

Figure 6.
Figure 6. Validation accuracy of the training of the original pix2pix. Model 1000 was selected as our base model, as it gave the highest validation accuracy. Note that we stopped the training at 1250 rather than 2000, as the accuracy trend was decreasing. As mentioned above, the generator computes two losses, which can lead to instability [40] in the training, hence the oscillation pattern seen in the accuracy curve.

Figure 7.
Figure 7. Validation accuracy of the second network training with different kernel sizes. Model 0 gave the highest accuracy. Oscillation was also seen here.

Figure 8.
Figure 8. (left) Inline 1: a simple geological section with a small salt body; (right) Inline 2: a complex geological section with a bigger salt body. The color bar is exaggerated to make the sections clearer.

Figure 9.
Figure 9. Velocity prediction results of Inline 1. (a): Prediction from pix2pix. (b): Prediction from MS-pix2pix. (c): The ground truth. Visually, MS-pix2pix showed better results. An artifact was detected at the shallow part of the velocity model.

Figure 11.
Figure 11. Porosity prediction results of Inline 1. (a): Prediction from pix2pix. (b): Prediction from MS-pix2pix. (c): The ground truth. The same shallow artifact was observed in the porosity model.

Figure 12.
Figure 12. Porosity prediction results of Inline 2. (a): Prediction from pix2pix. (b): Prediction from MS-pix2pix. (c): The ground truth. In general, MS-pix2pix produced a better porosity model than pix2pix. However, there was misprediction in the deep pre-salt formation.

Figure 13.
Figure 13. Density prediction results of Inline 1. (a): Prediction from pix2pix. (b): Prediction from MS-pix2pix. (c): The ground truth. The same shallow artifact can be seen here too.

Figure 14.
Figure 14. Density prediction results of Inline 2. (a): Prediction from pix2pix. (b): Prediction from MS-pix2pix. (c): The ground truth. MS-pix2pix showed a big improvement in the density model over pix2pix. However, there was misprediction inside the salt body.

Figure 15.
Figure 15. Water saturation prediction results of Inline 1. (a): Prediction from pix2pix. (b): Prediction from MS-pix2pix. (c): The ground truth. MS-pix2pix improved pix2pix's prediction except for the shallow part of the water saturation model.

Figure 16.
Figure 16. Water saturation prediction results of Inline 1. (a): Prediction from pix2pix. (b): Prediction from MS-pix2pix. (c): The ground truth. MS-pix2pix corrected the structure predicted using pix2pix. However, it failed to detect the water saturation layer.

Figure 17.
Figure 17. Seismic and subsurface properties data used in field data testing. The subsurface properties were generated via the rock physics method. The actual generated data contained only 160 samples. As pix2pix requires 256 samples, samples 161 to 256 were filled with samples 1 to 96.

Figure 18.
Figure 18. The seismic section used as the input data for testing.

Figure 19.
Figure 19. The accuracy metric. As we are interested in the right value at each location, we introduced an element-wise comparison with a 5% margin of error. The values at the same location of the two images were compared. If the difference in value was within 5% of the true value, it was considered "correct". The total number of "correct" predictions was then divided by the total number of elements to give the final accuracy.

Figure 20.
Figure 20. The pix2pix prediction of the field data test. The first row is the result for velocity, followed by the result for porosity in the second row, density in the third row, and water saturation in the last row. The first column shows the pix2pix result, followed by the ground truth in the second column and the difference in the third column. It can be seen that the velocity, porosity, and density predictions follow the same geological structure as the ground truth. The water saturation prediction is horizontal, cutting across the structure, and far from the ground truth.

Figure 21.
Figure 21. The MS-pix2pix prediction of the field data test. The first row is the result for velocity, followed by the result for porosity in the second row, density in the third row, and water saturation in the last row. The first column shows the MS-pix2pix result, followed by the ground truth in the second column and the difference in the third column. MS-pix2pix showed better prediction than pix2pix, as shown by the smaller range in the difference plot. Again, water saturation is the hardest to predict.

Table 1.
Summary of the changes made to the original pix2pix.

Table 3.
The prediction accuracy of the field data testing. MS-pix2pix managed to improve on the predictions of the pix2pix network.