Saturation Modeling of Gas Hydrate Using Machine Learning with X-Ray CT Images

Abstract: This study conducts saturation modeling in a gas hydrate (GH) sand sample with X-ray CT images using the following machine learning algorithms: random forest (RF), convolutional neural network (CNN), and support vector machine (SVM). Among the three methods, the RF yields the best prediction performance for the water, gas, and GH saturation of the samples. The CNN and SVM also exhibit sufficient performance under restricted conditions, but require improvements in reliability and overall prediction performance. Furthermore, the RF yields the lowest mean square error and the highest correlation coefficient between the original and predicted datasets. Although GH CT images help in approximately understanding how fluids act in a GH sample, it was difficult to accurately understand the behavior of GH in a sample during the experiments owing to limited physical conditions. Therefore, the proposed saturation modeling method can aid in understanding the behavior of GH in a sample in real time with the use of an appropriate machine learning method. Furthermore, the highly accurate descriptions of each saturation obtained from the proposed method lead to accurate resource evaluation and well-guided optimal depressurization for production from a target GH field.


Introduction
Previous studies in the field of petroleum engineering have performed core experiments to understand how multi-phase fluids flow through porous media and to measure important petrophysical properties, such as permeability and porosity [1][2][3][4][5]. Mass balance can be used to analyze the changes in saturation and pressure of each phase during a core experiment. This requires the entire core to act as a single system, under the assumption that the attributes of the target core are homogeneous. In addition, changes in the saturation, or its trend, can be indirectly analyzed from observations of the pressure, temperature, and fluid injection/production in conventional core flow experiments. Although the saturation represents an overall change throughout the entire core system based on the mass balance equation, there is a lack of understanding of the behavior at every single spot in a sample.
A syringe pump and a mass flow controller (model SLA5850S manufactured by Brooks Instrument) were used to inject water and gas, respectively. The fluid production component connects to the back pressure regulator (BPR) and a syringe pump to control the pressure of the BPR. Water and gas are drained at the same time during GH dissociation. Thus, the fluid production component consists of a separator for the water and methane, a mass balance meter, and a wet gas meter (model TG0.5 manufactured by Ritter). The control component manages the syringe pump and mass flow controller. The measurement component gauges the temperature, pressure, and rates of the water and gas in real time. The X-ray CT is taken with medical-grade CT equipment (model OPTIMA 660 manufactured by General Electric), which is characterized by a fast filming velocity (96 mm/s). The CT was used for the visualization and analysis of the phase behavior in the high-pressure cell in real time. The CT system was installed in a chamber that shields the radiation, such that we were able to control the temperature and humidity and obtain CT images in a stable environment.

The installed X-ray CT equipment for the GH visualization system at KIGAM [8,10].
The sample is pressurized with methane, followed by a decrease in the temperature until the GH forms under low-temperature, high-pressure conditions. These procedures are conducted throughout the five stages of the experiment. CT images are obtained in each stage to perform the CT normalization analysis and to build the training data for the machine learning methods.
Figure 1. A schematic diagram of the GH visualization system at KIGAM [6,8].
First, the sample is packed with sand in the high-pressure cell. The sample is a duplicate with the same attributes as the samples acquired from the Korean East Sea region, which is a promising area for GH resources. The sedimentary grains were charged using a vibrator to homogeneously distribute the particles in a dry state, and the homogeneity of the grains was verified in the CT images during the charging process. In this study, the GH experiment was conducted three times. Therefore, there are three sample sets, named A, B, and C, all of which have 35% porosity.
The packing procedure is the first of the five experiment stages (Figure 3a), where the sample is scanned by the CT to obtain the image data. The high-pressure cell was then coupled with the temperature and pressure sensors and the water and gas lines. The sample remained in a vacuum environment for more than six hours. The sample was then saturated with 100% water, followed by CT scanning (Figure 3b). We can calculate the porosity of the sample by measuring the amount of injected water. Methane gas was injected into the sample to set the initial water and gas saturation, denoted "IWS", which was also CT scanned (Figure 3c). From this sample condition, extra methane gas was injected to reach the high-pressure condition, followed by a decrease in the temperature to form the GH (CT scan for the GH stage shown in Figure 3d). Free gas in the pores was displaced via water injection to obtain conditions and an environment similar to the research target area, i.e., the Korean East Sea. This "GTW" stage was also scanned by CT (Figure 3e). The resolution of the acquired CT images was 512 × 512 × 96 for the three GH samples, A, B, and C, and the size of each pixel was ~100 and 660 µm in the horizontal and vertical directions, respectively. High-intensity CT value areas, such as the end-piece, which includes the temperature sensor, were excluded to obtain proper CT values for machine learning.
If we assume that there is no particle movement in the sample, we can compute the normalized CT values [4,6,8,26]. All CT values from the five stages were normalized as follows:

CT_norm = (CT_STAGE − CT_DRY) / (CT_SAT − CT_DRY) (1)

where CT_norm is the normalized CT value, CT_STAGE is the CT value from each stage, CT_DRY is the CT value of the empty space within the sample (i.e., the smallest CT value), and CT_SAT is the CT value of the water-saturated sample (i.e., the highest CT value).
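As a concrete illustration, Equation (1) can be applied voxel-wise with NumPy. This is only a sketch under our own naming; the toy CT numbers below are not real measurements, and the clipping of out-of-range values to [0, 1] mirrors the treatment used in this study.

```python
import numpy as np

def normalize_ct(ct_stage: np.ndarray, ct_dry: np.ndarray, ct_sat: np.ndarray) -> np.ndarray:
    """Voxel-wise CT normalization: 0 = gas-filled ('DRY'), 1 = water-filled ('SAT')."""
    ct_norm = (ct_stage - ct_dry) / (ct_sat - ct_dry)
    # Measurement error can push values outside [0, 1]; clip as in the study.
    return np.clip(ct_norm, 0.0, 1.0)

# Toy 4 x 4 "slices" standing in for the DRY, SAT, and a measurement stage.
dry = np.full((4, 4), 100.0)
sat = np.full((4, 4), 300.0)
stage = np.full((4, 4), 200.0)

out = normalize_ct(stage, dry, sat)
print(out[0, 0])  # → 0.5
```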

Data Acquisition and Pre-Process
The CT images from the GH experiment were modified and normalized, as shown in Figure 4. Figure 4a presents the ImageJ window, an open-source program developed at the National Institutes of Health and the Laboratory for Optical and Computational Instrumentation that provides numerous functions for editing images [27,28]. This study used ImageJ to modify and organize the given CT images for pre-processing. Figure 4b shows 64 slices of sample B in the initial water saturation stage (according to the definitions of the five stages provided in Section 2.1). Figure 4b contains a circle-shaped image surrounded by a white part, which indicates the cylinder containing the sample. This white region must be eliminated, as shown in Figure 4c, for every image slice of a sample. ImageJ recognizes the dotted-line circle areas and removes everything outside the circle area (right column in Figure 4c). The organized images constitute the "CT" term, from which the "DRY" stage image is subtracted; the result is divided by the difference between the "SAT" and "DRY" stages, as described in Figure 4d. The far-right image in Figure 4d displays the normalized CT image, which has a scale of 0-1 due to the normalization procedure. When this image has a value of 1, it is fully saturated with water, which indicates that the sample has the highest density with respect to the current experimental conditions. In contrast, a value of 0 indicates that the sample is saturated with air, which corresponds to the lowest density at the experimental conditions.
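The removal of the white cylinder wall in ImageJ is equivalent to keeping only a circular region of each slice. A minimal NumPy sketch of such a mask (the center and radius here are illustrative, not the actual sample geometry):

```python
import numpy as np

def circular_mask(shape: tuple, radius: float) -> np.ndarray:
    """Boolean mask that is True inside a centered circle of the given radius."""
    h, w = shape
    yy, xx = np.ogrid[:h, :w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2

img = np.ones((512, 512))                       # one CT slice (toy values)
mask = circular_mask(img.shape, radius=200.0)   # illustrative radius
sample_pixels = img[mask]                       # keep only the circle-shaped sample area
img[~mask] = np.nan                             # discard the cylinder wall and background
```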

In this study, we assume that the sample matrix is fixed in its original position, which indicates that there is only a change in the phases. The normalization procedure removes the effect of the sand sample structure, such that we can quantitatively consider only the behaviors of each phase in the porous media. Therefore, the GH CT data from different samples can be used together to model fluid behaviors because they reflect only changes in the internal fluids and not the sample frame. Figure 5 displays the raw and normalized sample images for the first slice of sample B. In Figure 5b, the "DRY" and "SAT" images have circle-shaped sample areas with values of 0 and 1, respectively. Theoretically, the samples should only have values between 0 and 1 after normalization. However, certain values fall below 0 or above 1 due to measurement error or other physical effects under the experimental conditions. In this study, such values were set to 0 or 1, respectively.
Saturation of the three phases (i.e., gas, water, and GH) must be assigned to the given GH CT images to construct the training data, which is the labeling of the normalized GH CT images. The first two of the five stages theoretically have clear saturation values: S_W = 0, S_GH = 0, and S_G = 1 for the "DRY" stage, and S_W = 1, S_GH = 0, and S_G = 0 for the "SAT" stage. In the other three stages, mass balance was used to compute each saturation, as listed in Table 1. The normalized CT values have a maximum of 1 and a minimum of 0, which correspond to 100% water ("SAT") and 100% gas ("DRY") saturation, respectively, because the water density is about 1 g/cc and the gas density is almost 0 compared with that of water. Therefore, the average density should correspond to the normalized CT value, as in Equation (2):

CT_norm_avg = S_W ρ_W + S_GH ρ_GH + S_G ρ_G (2)

where S_W,IWS is the saturation of water in the "IWS" stage; the first and second subscripts denote the fluid phase and the experimental stage, respectively. In Equation (3), the constant values 1 and 0.000678 g/cc are the densities of water and gas, respectively, at the conditions of the "IWS" stage. The gas density in "IWS" and the water density in "GTW" are assumed to be 0 and 1, respectively, to satisfy Equation (2) in this study. Equations (3)-(5) all satisfy the condition that the sum of the saturations equals 1 in each stage (Table 1).

CT_norm_avg in Equation (6) is the average of the normalized CT values over all n_c data points of the actual sample part (the circle-shaped area in Figure 5) in one image slice among the 64 slices:

CT_norm_avg = (1/n_c) Σ_(i=1)^(n_c) CT_norm,i (6)

Here, n_c is 47,992 under the experimental conditions used in this study.
In the "IWS", "GH", and "GTW" stages, we already know one of the three saturations and all of the densities under the given experimental conditions (see the first column of Table 1), which allows us to calculate the saturation values of each phase. Therefore, the collected CT images can be labeled with saturation values based on the given conditions and Equations (2)-(5). The GH saturation of the "GTW" stage should be identical to that of the "GH" stage, assuming that the same generated GH also exists in the "GTW" stage. The labeling was conducted for each slice such that one slice was paired with three saturation values: S_GH, S_W, and S_G.

Table 1. Calculations for the saturation labeling in the three experimental stages: "IWS", "GH", and "GTW".
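For a two-phase stage such as "IWS", the mass-balance labeling can be sketched as follows. This is our own illustrative reconstruction, not the paper's exact Equations (3)-(5): the slice-averaged normalized CT value is modeled as a density-weighted mixture and inverted for the water saturation.

```python
# Density constants taken from the text; the inversion itself is our sketch.
RHO_W = 1.0        # water density in g/cc
RHO_G = 0.000678   # gas density in g/cc at "IWS" conditions

def iws_saturations(ct_avg_norm: float):
    """Return (Sw, Sg) for the 'IWS' stage from a slice-averaged normalized CT value,
    assuming ct_avg_norm = Sw * RHO_W + Sg * RHO_G with Sw + Sg = 1."""
    sw = (ct_avg_norm - RHO_G) / (RHO_W - RHO_G)
    sw = min(max(sw, 0.0), 1.0)  # clip to the physical range
    return sw, 1.0 - sw

sw, sg = iws_saturations(0.40)
```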

Random Forest
The random forest (RF) has been used as a powerful machine learning method in numerous previous studies to address prediction problems for a target parameter based on given data. The RF consists of multiple decision trees, where each decision tree is constructed as shown in Figure 6. Figure 6a provides n data samples with d properties. Each property is an explanatory variable, which corresponds to one pixel of the normalized CT image slice in this study. The saturation of each phase is the dependent variable, represented by m_1 and m_2 in Figure 6a. For example, in Figure 6a, the empty and filled circles (m_1 and m_2) each represent one of the water, GH, or gas saturation values of the training samples, and they are arrayed in descending order of an assigned property. As described in the four columns of Figure 6a, the training data are organized in descending order according to a specific property, from the first property to the last. The RF finds the decision boundary that yields the minimum sum of the two MSVs (mean square variances) of the two divided groups. There are d minimum MSVs, one for each of the d properties, and the final decision boundary is selected according to the overall minimum MSV. In this study, the minimum MSV is selected considering all three phase saturations. The fixed decision boundary separates the training data into two groups, as described in Figure 6b. These procedures are repeated to partition the given training data until the decision tree reaches a maximum depth. More data partitioning results in improved data separation performance up to a certain level. However, excessive partitioning may cause poorer generalization, which is known as the overfitting problem. Pruning then follows to prevent or mitigate overfitting. Figure 7 illustrates the principle of prediction by the RF based on multiple decision trees. Figure 7a shows the entire given training data pool.
Different decision trees are trained from k different datasets, which are generated via bootstrapping (Figure 7b). The number of decision trees is k, corresponding to the selected k datasets. Each decision tree separately yields its own prediction result, and the results are combined via bootstrap aggregation (Figure 7c). Each result is summarized for majority voting, such that we can obtain the final results (Figure 7d,e). Multiple decision trees lead to a stochastically stable prediction of the target parameter. Thus, a larger number of decision trees yields improved RF stability and performance in most cases. Nevertheless, a proper number of decision trees must be selected due to the computational cost required to train many decision trees.
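The paper does not publish its code; a minimal sketch of multi-output RF regression on flattened slice pixels, with synthetic stand-in data and scikit-learn assumed, might look as follows. The max_features="sqrt" setting mirrors the square-root-of-features rule used for the bootstrap in this study; the tree count is our illustrative choice.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the training data: 864 "slices", each flattened to
# 200 pixel features (the real slices have tens of thousands of pixels).
X = rng.uniform(0.0, 1.0, size=(864, 200))
# Three targets per slice: (Sw, Sgh, Sg), normalized to sum to 1.
raw = rng.uniform(size=(864, 3))
y = raw / raw.sum(axis=1, keepdims=True)

rf = RandomForestRegressor(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X, y)                 # multi-output regression: one tree set, three targets

pred = rf.predict(X[:5])     # shape (5, 3): one saturation triple per slice
```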

Convolutional Neural Network
The convolutional neural network (CNN) is a neural network methodology that handles images without pre-processing, such as flattening, and extracts features from images via convolution and subsampling [29,30]. The CNN has been used in a variety of fields to obtain insights and proper solutions from large amounts of image data, e.g., handwriting recognition, human action recognition, and image classification [30][31][32][33]. Figure 8 shows a schematic of the CNN applied to the GH CT images in this study. Input images undergo convolution and pooling procedures for feature extraction. The network ends with a fully-connected layer, where the final three nodes are the three saturation values of the phases. The training performance of the CNN highly depends on hyper-parameters, such as the number and size of the convolutional and pooling layers. This study used AutoKeras, which efficiently and automatically searches for a proper neural architecture [34]. However, we also note that the structure of the input data can strongly affect the performance of the CNN. Figure 8a-d displays the different types of input data structures, which we tested to find the suitable input format for the GH CT images.
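The actual architecture was found by AutoKeras; for intuition only, the convolution and max-pooling operations can be sketched in plain NumPy (a toy filter, not the trained network):

```python
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Single-channel 'valid' convolution (strictly, cross-correlation),
    as used inside a CNN's convolutional layer."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap: np.ndarray, size: int = 2) -> np.ndarray:
    """Non-overlapping max pooling for subsampling a feature map."""
    h, w = fmap.shape[0] // size * size, fmap.shape[1] // size * size
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

slice_img = np.random.default_rng(1).uniform(0.0, 1.0, size=(64, 64))  # stand-in slice
edge_kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # toy horizontal-gradient filter
features = max_pool(conv2d_valid(slice_img, edge_kernel))
```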


Support Vector Machine
The support vector machine (SVM), one of the supervised machine learning algorithms, was proposed by Vapnik and Chervonenkis in 1963. If we assume that a set of data in a 2-D space consists of two classes, this dataset can be classified by several hyperplanes (Figure 9). The SVM determines the maximum-margin hyperplane that distinguishes the two groups, and the data on the margin are referred to as support vectors (Figure 9). Two additional techniques have been used with the SVM to solve more complex problems: the soft margin and the kernel trick. When there are difficulties in classifying the data with an exact hyperplane, the concept of the soft margin can be introduced into the SVM [35]. The kernel trick enables nonlinear classification by converting a problem that is not linearly separable in the given dimension into a high-dimensional feature space [36].
The SVM can also handle regression problems, as shown in Vapnik et al. [37]. When applied to a regression problem, the SVM attempts to fit the optimized hyperplane with as many data as possible within the epsilon margin, which is a reverse optimization compared with the classification problem. Support vector regression uses a slack variable that plays a role similar to that of the soft margin in the classification problem [38].

Figure 8. Four options for constructing the input data: (a) the first option, which contains the largest amount of data among the four options; (b) the second option, which uses only the inner area within the outer square; (c) the third option, a one-dimensional format that concatenates pixels in the horizontal direction; (d) the fourth option, a one-dimensional format constructed in the vertical direction.

Figure 9. A conceptual description of SVM classification.
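As an illustration of support vector regression with an RBF kernel and an epsilon tube, a scikit-learn sketch on synthetic 1-D data (not the study's CT data) might look as follows; the C and epsilon values are our own choices.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)

# Synthetic 1-D regression problem standing in for one saturation target.
X = rng.uniform(0.0, 1.0, size=(200, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.05 * rng.standard_normal(200)

# epsilon defines the tube within which errors are ignored; C weights the
# slack variables that allow points to fall outside the tube.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.05)
svr.fit(X, y)

pred = svr.predict(X)
residual = np.abs(pred - y)
```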

Results
These three machine learning methods were applied to the training data from three datasets: samples A, B, and C. The GH CT images of each GH sample were broken into 96 slices, and the first 64 slices from each of the five experiment stages were used to construct the training data. Therefore, 3 GH samples × 64 slices × 5 stages yields 960 samples, which constitute the data for machine learning. Figure 10 shows the normalized CT values for the first slice of sample A, represented as histograms. Figure 10b,c displays the histograms of the normalized CT values for the circled and outer areas of the images, respectively, marked with the yellow dotted line. The overall normalized CT values increase from the "IWS" to the "GTW" stage according to the changes in the stages. Table 2 lists the other detailed training information for the three methodologies. The size of one data sample depends on the machine learning method and the structure of the input data. The number of properties for the bootstrap of the RF (Figure 7b) is the square root of the number of data points in one sample. Training and test data are applied to each machine learning technique. For the CNN, the validation data are a part of the training data, and the validation results are presented together with those of the training data.
The RF and CNN were realized in the Python environment using a workstation with the following specifications: an Intel Xeon Gold 6136 central processing unit (3.00 and 2.99 GHz processors) and 128 GB of random-access memory. The SVM was run in MATLAB on the same workstation. The methods required approximately 28, 31, and 1 min for the RF, CNN, and SVM, respectively. Although the computation time of each machine learning methodology depends on its hyper-parameters, we expect the RF and SVM to require less time than the CNN, which has many parameters that must be fixed owing to its comparatively complicated structure. Thus, the RF and SVM are more advantageous than the CNN with respect to computational cost.


Table 2 lists the other detailed training information for the three methodologies. The size of one data sample depends on the machine learning method and the structure of the input data. The number of properties drawn for each bootstrap of the RF (Figure 7b) is the square root of the number of data points in one sample. Training and test data are applied to each machine learning technique. For the CNN, the validation data is a part of the training data, and the validation results are presented together with the training results.

Table 2. Conditions and settings for the training of the RF, CNN, and SVM.

Method            List           Condition
Common condition  Training data  864
                  Test data      96 (10% of the entire data)

The RF and CNN were implemented in the Python language environment using a workstation with the following specifications: an Intel Xeon Gold 6136 central processing unit with 3 and 2.99 GHz processors and 128 GB of random-access memory. The SVM was run in MATLAB on the same workstation. Training required approximately 28, 31, and 1 min for the RF, CNN, and SVM, respectively. Although the computation time for each machine learning methodology depends on its hyper-parameters, we expected the RF and SVM to require less time than the CNN, which has many parameters that must be fitted owing to its comparatively complicated structure. Thus, the RF and SVM are more advantageous than the CNN with respect to computational cost.
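The 864/96 split in Table 2 corresponds to holding out 10% of the 960 data samples as test data; a minimal sketch in plain NumPy (the random seed is arbitrary):

```python
import numpy as np

n_total, test_frac = 960, 0.10
n_test = int(n_total * test_frac)       # 96 test samples
rng = np.random.default_rng(seed=0)
idx = rng.permutation(n_total)          # shuffle indices once
test_idx, train_idx = idx[:n_test], idx[n_test:]
print(len(train_idx), len(test_idx))    # 864 96
```

Splitting by shuffled indices keeps the feature matrix and labels aligned, since the same index arrays are used to select both.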

Figure 11 shows the training and test results for the RF. The x- and y-axes represent the original data and the values predicted by the RF, respectively; thus, the closer the data lie to the 1:1 line, the better the matching performance. The number of scattered purple points is 864, and overlapping points in the same position appear darker. As shown in Figure 11a, most data points are positioned on the 1:1 line. Their correlation coefficient (R²) values are larger than 0.99, which indicates that the original and predicted values are strongly correlated and well fitted to each other. In Figure 11a, the gas saturation in particular shows a strongly correlated trend (R² = 0.99976) compared with SW and SGH. This is because the density of gas is significantly less than that of water or GH, which allows the gas phase to be distinguished more easily from the other two phases.

RF Results
Certain original data have zero values for SW and SGH in the first and second pictures of Figure 11a,b. The original SG values at or near zero were predicted sufficiently well in both the training and test data. In contrast, the SW and SGH values near zero are overestimated in the bottom-left corners of the first and second pictures in Figure 11a,b. Overall, the RF yields a decidedly good performance for both the training and test results.
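A minimal Python sketch of the RF regression discussed above, using scikit-learn's RandomForestRegressor with max_features="sqrt" to mirror the square-root feature sampling noted for the bootstrap; the feature matrix and saturation labels below are synthetic placeholders, not the CT data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.random((960, 100))          # placeholder features (e.g., flattened CT values)
y = rng.random((960, 3))
y /= y.sum(axis=1, keepdims=True)   # SW + SGH + SG = 1 for each slice

rf = RandomForestRegressor(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X[:864], y[:864])            # 864 training samples, multi-output targets
pred = rf.predict(X[864:])          # 96 test samples
print(pred.shape, r2_score(y[864:], pred))
```

On real data, the per-saturation fit can be inspected with r2_score(..., multioutput="raw_values"); on the random placeholders above, the R² value itself is not meaningful.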

Figures 12–15 present the training and test results from the CNN for the four input data structures (Figure 8a–d). The structure of the input data is a 2-D square shape in both Figures 12 and 13, where the R² values are less than 0.5, considerably lower than the RF results in Figure 11. We observe no specific diagonal trends in the scattered data in any of the images in Figures 12 and 13. In contrast, Figures 14 and 15 exhibit a diagonally scattered trend, and the R² values of the CNN results with the 1-D input data are larger than 0.9 for every result. We note that the overall results are similar between Figures 14 and 15 with respect to the trend of the scattered points and the R² values, which indicates that the orientation of the 1-D data does not lead to a substantial difference in this study.
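The four input structures compared here (Figure 8a–d) amount to different croppings and orderings of a single slice; the sketch below illustrates them, where the slice size and the inner-square bounds are illustrative assumptions.

```python
import numpy as np

slice_2d = np.random.rand(96, 96)    # one normalized CT slice (placeholder)

outer_2d = slice_2d                  # Fig. 8a: full square, includes the region outside the sample
inner_2d = slice_2d[16:80, 16:80]    # Fig. 8b: inner square restricted to the sand sample (assumed bounds)
horiz_1d = slice_2d.ravel()          # Fig. 8c: 1-D input, row-major (horizontal) ordering
vert_1d = slice_2d.ravel(order="F")  # Fig. 8d: 1-D input, column-major (vertical) ordering

print(outer_2d.shape, inner_2d.shape, horiz_1d.shape, vert_1d.shape)
```

The two 1-D variants contain exactly the same values in different orders, which is consistent with the observation that the direction of the 1-D data makes little difference.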

CNN Results
As previous studies have shown, the CNN should excel at extracting features from specific trends or patterns in images [39,40]; we therefore expected the CNN to perform best with image-structured input. However, the CNN did not behave in this manner, considering that Figures 14 and 15 depict an improved performance over Figures 12 and 13. The former two results derive from 2-D images of the CT data, which should reflect certain 2-D characteristics of the input (Figures 12 and 13), whereas the latter two results, from the CNN with 1-D input data, can only consider the relationships between neighboring grids. Overall, relating each grid's CT value to the saturation values is difficult because one entire slice has only three saturation values (SW, SGH, and SG), which correspond to averaged normalized CT values based on the labeling. In particular, the outer squared 2-D input image (Figure 8a) resulted in a low correlation and poor matching performance because it includes an unnecessary component, namely the region outside the sand sample, in the input image, whose values span the full 0–1 scale; this may have caused a loss of meaning and sensitivity with respect to the normalized CT values. In contrast, the inner squared 2-D input image (Figure 8b) showed improvement over the outer squared 2-D input case owing to the elimination of this unnecessary component, such that its value range became narrower than 0–1 (see Figure 10b,c). This appears to help enhance performance by preserving meaningful variations in the data.
Energies 2020, 13, x FOR PEER REVIEW

SVM Results
The SVM yielded outstanding performance on the training results, with R² values larger than 0.9. We note that the SVM mitigated the overestimation of near-zero values of the original SW and SGH compared with both the RF and CNN (bottom-left corner of Figure 16a). Despite the decent training performance, the test results were not reliable, especially for SW and SGH (the first and second images in Figure 16b). The test results for gas saturation show an overall good match between the original and predicted data, because the gas saturation has discriminating features relative to SW and SGH in terms of typical normalized CT values. Despite sufficient potential as a candidate among the machine learning methods, the SVM requires validation of its test performance in future studies.
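The SVM in this study was run in MATLAB; for consistency with the other sketches, an analogous Python version using scikit-learn's SVR, wrapped to handle the three saturation outputs, is shown below (the kernel and regularization values are illustrative, and the data are synthetic placeholders).

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((960, 50))                  # placeholder features
y = rng.random((960, 3))
y /= y.sum(axis=1, keepdims=True)          # saturations sum to 1

# One epsilon-SVR per output (SW, SGH, SG), since SVR itself is single-output
svm = MultiOutputRegressor(SVR(kernel="rbf", C=1.0, epsilon=0.01))
svm.fit(X[:864], y[:864])
pred = svm.predict(X[864:])
print(pred.shape)  # (96, 3)
```

Fitting one regressor per saturation is the standard way to obtain multi-output predictions from an inherently single-output SVM formulation.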
Figure 15. The training (a) and test (b) results from the CNN with the 1-D vertical input data (Figure 8d).

The prediction performance was quantified using the mean square error between the original and predicted saturations:

$$\mathrm{MSE}_{\mathrm{train}} = \frac{1}{n_{\mathrm{train}}} \sum_{i=1}^{n_{\mathrm{train}}} \left(\mathrm{Orig}_i - \mathrm{Pred}_i\right)^2, \qquad \mathrm{MSE}_{\mathrm{test}} = \frac{1}{n_{\mathrm{test}}} \sum_{i=1}^{n_{\mathrm{test}}} \left(\mathrm{Orig}_i - \mathrm{Pred}_i\right)^2,$$

where n_train and n_test are the numbers of training and test data, respectively, Orig_i is the ith original datum, and Pred_i represents the predicted value for the ith original datum.
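The mean square error defined above can be computed directly; a small sketch with illustrative numbers:

```python
import numpy as np

def mse(orig, pred):
    """Mean square error between original and predicted values."""
    orig, pred = np.asarray(orig, float), np.asarray(pred, float)
    return float(np.mean((orig - pred) ** 2))

orig = np.array([0.20, 0.50, 0.30])   # illustrative original (SW, SGH, SG)
pred = np.array([0.25, 0.45, 0.30])   # illustrative predictions
print(mse(orig, pred))                # ≈ 0.00167
```

The squared Pearson correlation between original and predicted data can likewise be obtained as np.corrcoef(orig, pred)[0, 1] ** 2.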