Data-Driven Three-Phase Saturation Identification from X-ray CT Images with Critical Gas Hydrate Saturation

Abstract: This study proposes three-phase saturation identification using X-ray computerized tomography (CT) images of gas hydrate (GH) experiments, considering critical GH saturation (S_GH,C), based on the random forest machine-learning method. Eight GH samples were categorized into a low GH saturation (S_GH) group of three samples and a high S_GH group of five samples. The mean square errors of the test results in the low and high groups decreased by 37% and 33%, respectively, compared to that of all eight samples. Additionally, a universal test set was configured from all eight samples and tested with the two machines trained for the low and high GH groups. The results revealed a boundary at ~50% S_GH that separates different saturation identification performance, and this ~50% was estimated as S_GH,C in this study. The machines trained for the low and high S_GH groups performed worse on values larger and smaller than S_GH,C, respectively. These findings indicate that obtained training data, such as GH CT images, can be advantageously separated under the criterion of S_GH,C. Moreover, the proposed data-driven method not only serves as a real-time saturation identification method for GH samples but also provides a guideline for deciding data acquisition priorities.


Introduction
Naturally reserved gas hydrate (GH) has high uncertainty regarding its kinetic behavior, geomechanical stability, and economic feasibility. For these reasons, multiple countries such as South Korea, Japan, the United States, and India [1-6] encounter difficulties in pursuing research and development (R&D) or holding test production in fields, despite having conducted related R&D. In particular, one location containing reserved GH, the East Sea in South Korea, is a challenging field for producing GH owing to its sparse GH distribution, well stability problems, uncertain GH dissociation, and a current energy ecosystem that is incompatible with GH [7].
Concerning these issues, many investigations into the behaviors of GH and into reducing the uncertainty of its general characteristics have been carried out [8-12]. One method, X-ray computerized tomography (CT), involves scanning a target GH sample during experiments to infer how the inner fluids behave in porous media, which addresses the difficulty of directly observing what happens inside a GH sample [13-15]. In the GH experimental environment, production rates are difficult to measure accurately because of the dead volume between the measurement equipment and the GH sample, or flow delay in the GH sample or a flow line. By addressing these issues, X-ray CT scanning could be an appropriate method to investigate fluid workings in a GH sample and infer its approximate trend.
KIGAM introduced a visualization system to experiment with different GH production options and a variety of production-related parameters. The visualization system consists of four main parts: GH sample installation, fluid injection, fluid production, and X-ray CT equipment, as presented in Figure 1. X-ray CT scanning is continuously performed during all experimental procedures. The high-pressure cell is composed of a glass fiber and aluminum complex to minimize X-ray diffraction, with a diameter and length of 1 and 2 inches, respectively (Figure 1a). The X-ray CT scanning equipment (General Electric OPTIMA 660) can scan at a rate of up to 96 mm/s and can also conduct quantitative analysis. Other details about the equipment and its specifications are explained in Kim et al. [16].
Figure 1. (b) The installed X-ray computerized tomography (CT) equipment for the system [31].
Energies 2020, 13, 5844
In this study, GH generation proceeds through GH sample charging, initial water saturation setting, and pressurization by CH4. In particular, the pressurization is performed by the excess gas method, which forms GH showing a grain-coating or cementing-type habit [32,33]. The excess gas method increases pressure and decreases temperature by injecting methane gas under a partial water saturation condition. As presented in Figure 2, the experiment consists of five stages before the GH depressurization stage: "DRY", "SAT", "IWS", "GH", and "GTW", described as follows. X-ray CT scanning was performed at every stage, so the scanned CT images correspond to the five stages. One scan is sufficient to determine the internal state of a target GH sample in each of the five stages because there is no critical change within the same stage. In the DRY stage, CT images are scanned immediately after charging with sand to check whether the distribution of sand particles is regular; these CT values are the smallest of the entire experiment because the sand sample is saturated only with air. Before the SAT stage, the entire experimental system is inspected to ensure that it operates stably and that each compartment is tightly connected through valves. In the SAT stage, the GH sample is 100% saturated with water, which gives the largest CT values in this experiment. Therefore, the smallest and largest CT values are utilized to both normalize and quantitatively analyze the CT images, as follows:
$$CT_{norm} = \frac{CT_{STAGE} - CT_{DRY}}{CT_{SAT} - CT_{DRY}} \quad (1)$$
where CT_norm is the normalized CT value; CT_STAGE is the CT value from the IWS, GH, or GTW stages; and CT_DRY and CT_SAT are the CT values from the DRY and SAT stages, respectively. Further experimental details can be found in [16].
In the IWS stage, the sample holds its initial water saturation together with CH4, after which the GH sample is pressurized to form GH (GH stage of Figure 2d). Methane gas is injected with air cooling until the sample reaches the high-pressure and low-temperature condition suitable for GH formation. Then, the remaining gas in the GH sample is replaced by 3% salinity brine to build the environment most similar to that of naturally reserved GH (GTW stage of Figure 2e). According to Equation (1), the normalized CT values are 0 and 1 for the images from the DRY and SAT stages, respectively. For the IWS, GH, and GTW stages, the normalized CT values lie between 0 and 1.
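The normalization described by Equation (1) can be illustrated with a minimal Python sketch. It assumes the normalization is applied pixel by pixel, using the DRY and SAT scans of the same sample as the lower and upper references; the function name `normalize_ct` and the toy CT values are hypothetical and are not taken from the experiments.

```python
def normalize_ct(ct_stage, ct_dry, ct_sat):
    """Min-max normalize CT values between the DRY and SAT reference scans.

    All three arguments are flat lists holding the CT values of the same
    pixels at the stage of interest, the DRY stage, and the SAT stage.
    """
    normalized = []
    for stage, dry, sat in zip(ct_stage, ct_dry, ct_sat):
        span = sat - dry
        if span == 0:
            raise ValueError("DRY and SAT CT values must differ for each pixel")
        normalized.append((stage - dry) / span)
    return normalized


# Toy values (assumed): two pixels with DRY = -1000 and SAT = 1000.
dry = [-1000.0, -1000.0]
sat = [1000.0, 1000.0]
print(normalize_ct(dry, dry, sat))           # DRY stage maps to 0
print(normalize_ct(sat, dry, sat))           # SAT stage maps to 1
print(normalize_ct([0.0, 500.0], dry, sat))  # intermediate stages fall in between
```

With this form, any stage whose CT values lie between the DRY and SAT references is mapped into the interval from 0 to 1, which matches the behavior stated for the IWS, GH, and GTW stages.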

Data Acquisition and Preprocessing
The CT image resolution is 512 × 512 × 96, and the size of each pixel is ~660 µm and 100 µm in the vertical and horizontal directions, respectively. High CT values caused by the end-piece, including the temperature sensor, were removed in order to acquire proper CT values for training the machine-learning models. Thus, although a total of 96 slices were obtained, only the front 64 slices were utilized as training data. The given CT values were normalized according to Equation (1), and the related procedures are shown in Figure 3. Raw CT images were obtained by CT scanning (Figure 3a) and the unneeded parts were discarded (Figure 3b). Target images for normalization were put into the CT slot as described in Figure 3c, a procedure that standardizes the image data for versatility across other GH samples.
Saturation values of water, GH, and gas were generated corresponding to the normalized GH CT images, a process known as "labeling" that builds training data for supervised learning [34]. CT images of the DRY and SAT stages technically have fixed values, S_G = 1 and S_W = 1, respectively. The IWS and GH stages clearly show S_GH = 0 and S_W = 0, respectively. In addition, S_GH of the GTW stage is the same as that of the GH stage. The three phase saturations can be calculated from those conditions, given the experimental environment in Table 1, using Equations (2) and (3), where S and d indicate saturation and density, respectively. The subscripts W, GH, and G of S denote water, gas hydrate, and gas, respectively. The second subscript, STAGE, represents the given experimental stage (IWS, GH, or GTW). CT_norm^avg is the average normalized CT value over all data pixels (n_c) of the actual GH sample in one slice image (i.e., the circle-shaped area in the right column of Figure 3b). Here, n_c is 47,992 in the experimental environment of this study, and n_s is the number of slices of a GH sample, 64 in this experimental setting. Consequently, one CT slice image is paired with three saturation values, becoming one sample of the training data.
Table 1. Saturation and density conditions for labeling in the three experimental stages.
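How one slice becomes one training sample can be sketched as follows. This is only an illustration under stated assumptions: the circular mask geometry, the function names (`circular_mask`, `slice_average`, `make_training_sample`), and the toy 5 × 5 slice are hypothetical, and the check that the three saturations sum to 1 reflects the usual definition of saturation rather than a step described in the text. In the experiments, the masked area contains n_c = 47,992 pixels.

```python
def circular_mask(size, radius):
    """Return (row, col) indices of the pixels inside a centered circle."""
    center = (size - 1) / 2.0
    return [
        (row, col)
        for row in range(size)
        for col in range(size)
        if (row - center) ** 2 + (col - center) ** 2 <= radius ** 2
    ]


def slice_average(slice_2d, mask):
    """Average normalized CT value over the masked pixels of one slice."""
    return sum(slice_2d[row][col] for row, col in mask) / len(mask)


def make_training_sample(slice_2d, mask, s_w, s_gh, s_g):
    """Pair the masked pixels of one slice with its three saturation labels."""
    if abs(s_w + s_gh + s_g - 1.0) > 1e-9:
        raise ValueError("saturations are expected to sum to 1")
    features = [slice_2d[row][col] for row, col in mask]
    return features, (s_w, s_gh, s_g)


# Toy 5 x 5 slice (assumed values) with a mask of radius 1 pixel.
toy_slice = [[row * 5 + col for col in range(5)] for row in range(5)]
mask = circular_mask(5, 1)
print(len(mask), slice_average(toy_slice, mask))
features, labels = make_training_sample(toy_slice, mask, 0.2, 0.5, 0.3)
print(features, labels)
```

Keeping only the masked pixels as features mirrors the statement that the histogram and the average cover only the circle-shaped sample area.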

Machine-Learning Methodology: Random Forest (RF)
As an ensemble machine-learning method, RF consists of multiple decision trees. Figure 4 describes how a decision tree is constructed when it has n sample data of the property d. First, the data partitioning procedure divides the given data into m_1 and m_2, locating the single decision boundary that minimizes the average of the two mean square variances, MSV_1 and MSV_2 (Figure 4a). The decided boundary then becomes the standard for dividing the data into two branches (Figure 4b). This procedure is repeated until a specific criterion is reached, which is usually set by the user. After that, pruning is conducted to prevent the final decision tree from overfitting (Figure 4c).
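The partitioning step above can be sketched for a single property as follows. This is a minimal illustration, not the implementation used in the study: the function names `variance` and `best_split` and the toy data are hypothetical, and the score is the plain average of the two variances as worded in the text, whereas common library implementations weight each side by its number of samples.

```python
def variance(values):
    """Mean square deviation of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(xs, ys):
    """Find the boundary on xs that minimizes the average variance of ys.

    Returns (boundary, score), or None if no split is possible.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical property values cannot be separated
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (variance(left) + variance(right)) / 2.0
        boundary = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        if best is None or score < best[1]:
            best = (boundary, score)
    return best


# Toy data (assumed): a normalized CT value as the property, S_GH as the target.
xs = [0.1, 0.2, 0.3, 0.8, 0.9]
ys = [0.0, 0.0, 0.0, 0.5, 0.5]
print(best_split(xs, ys))
```

Applying `best_split` recursively to each resulting branch, until a user-set stopping criterion is met, yields the tree structure described for Figure 4b.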

Under the given computing power (Intel Xeon Gold 6136 central processing unit with 3 and 2.99 GHz processors and 128 GB of random-access memory), a couple of hundred trees is sufficiently affordable. In addition, the computational cost is outside the scope of this study and is therefore not handled in detail.
Figure 6 shows an example of normalized CT values and their distribution in the five experimental stages. In the DRY and SAT stages, all normalized CT values are exactly 0 and 1, respectively, according to Equation (1) and Figure 3 [16]. In contrast, the IWS, GH, and GTW stages each have their own distribution, shown in histogram form in Figure 6b. The GH sample is marked with yellow dotted circles in these three stages, and each histogram covers only the circled area. From IWS to GTW, the averaged normalized CT value of each stage is larger than that of the previous stage because the overall density increases according to the experimental design of the five stages (Figure 2). The normalized CT values approach 0.9-1, consistent with the similar densities of GH and water (Table 1).
There was no difficulty with the separation into these two groups because of the obvious difference in S_GH values, which were around 40% and 50%, respectively. Figure 7b presents the distributions of normalized CT values for the first slice of the eight GH samples, with black dots indicating the mean of each distribution. Generally, the low GH saturation group has low averages of the normalized CT values and the high GH saturation group gives higher averages. We can divide the S_GH values into the forty- and fifty-percent groups, which are taken as the conventional and critical values, respectively, because S_GH,C was expected to be ~50% for the target GH sample in this study. For values close to S_GH,C, the production efficiency of GH can be drastically reduced due to slow pressure propagation in porous media.
The eight GH samples are randomly indexed after they are categorized into the two groups. In this study, the eight GH samples are arranged into four cases (Cases 1-4) for the analysis of data construction and machine-learning performance. Case 1 comprises all eight GH samples, and Cases 2 and 3 are the low and high S_GH groups, respectively. Case 4 consists of the odd-numbered GH samples (1, 3, 5, and 7), a randomly constructed data collection that contains low and high S_GH evenly. Case 4 is set to have four selected GH samples in order to produce the fairest comparison with Cases 2 and 3 by matching the amount of training data.
Table 2 shows the averaged GH saturation and CT values of each GH sample and of Cases 1-4. Orange-colored cells indicate the GH samples utilized in each of the four cases. The means of Cases 1 and 4 are similar to each other because both are composed of GH samples combined from the low and high S_GH groups. The difference between the averaged S_GH of Cases 2 and 3 is ~11%, which could give a significant contrast in GH behaviors.
Figure 8 and Table 3 explain how the training and test data are divided and constructed for Cases 1-4. Figure 8 illustrates that four test sets are randomly selected from each data pool, following the conventional machine-learning procedure. Case 1 uses all eight GH samples, and its test set consists of a random 10% of these data. This test set of Case 1 is set as the universal test set, which is utilized as the common standard to compare all four cases with each other. The test sets for Cases 2, 3, and 4 are random 10% sets from each entire pool, represented by green, orange, and blue lines, respectively, in the figure. Training of RF is conducted for all four cases, and the number of training data used is presented in Table 3. Each GH sample has 320 data points, the product of 5 stages and 64 slices (Figures 2 and 3). Case 1 has 2560 data points, the product of 8 GH samples and 320 data points. The number of training data is then 2304, which is 2560 minus the 256 test data, and the same procedure is carried out for the other three cases. The detailed training conditions of RF are shown in Table 3. Regarding the number of properties, 219 is determined by the square root of the total number of CT values, 47,996.
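The data counts above follow from simple arithmetic, which the short Python sketch below reproduces. The constant and function names are hypothetical; the numbers of GH samples per case (8, 3, 5, and 4) and the 10% test fraction are taken from the text.

```python
import math

STAGES_PER_SAMPLE = 5   # DRY, SAT, IWS, GH, GTW
SLICES_PER_STAGE = 64   # front 64 of the 96 scanned slices
TEST_PERCENT = 10       # random 10% held out as test data


def split_counts(n_gh_samples):
    """Return (total, training, test) data counts for a case."""
    total = n_gh_samples * STAGES_PER_SAMPLE * SLICES_PER_STAGE
    n_test = total * TEST_PERCENT // 100
    return total, total - n_test, n_test


# Number of GH samples in each case, as described in the text.
CASES = {"Case 1": 8, "Case 2": 3, "Case 3": 5, "Case 4": 4}

for name, n_gh_samples in CASES.items():
    print(name, split_counts(n_gh_samples))

# Number of properties from the square root of the number of CT values.
print(round(math.sqrt(47996)))
```

The training counts obtained this way for Cases 2, 3, and 4 are of similar magnitude, which is what makes those three cases comparable in terms of the amount of training data.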

Figure 8. Two test sets: a random 10% test set from each of the four cases and the universal test set from the eight GH samples.
In Figures 9-12, the (a) panels show the training set and the (b) panels show the test set, while each column lists S_W, S_GH, and S_G in order. In both the (a) and (b) panels, the first row is the scatter plot and the second row displays the same results as a histogram. In the first row, the X-axis indicates the original saturation values and the Y-axis indicates the predicted (modeled) values. Blue dots indicate individual data samples, and a darker color indicates that more data points are concentrated at that position. Therefore, although some of the data seem to deviate from the diagonal line, the coefficient of determination can still be high (S_W of Figure 9b). The coefficient of determination is calculated and presented at the top of each chart. In the histograms of the second row, the blue dotted box represents the predicted saturation values and the red solid-line box represents the original saturations.
Figure 9. (a) Training and (b) test results of Case 1. Each column represents one saturation (water, gas hydrate, and gas) with its coefficient of determination. In the scatter plots, the X and Y axes indicate the given original data and the value predicted by random forest, respectively. In the histograms, the X and Y axes indicate each saturation and its frequency, respectively.
Figure 10. (a) Training and (b) test results of Case 2. Each column represents one saturation (water, gas hydrate, and gas) with its coefficient of determination. In the scatter plots, the X and Y axes indicate the given original data and the value predicted by random forest, respectively. In the histograms, the X and Y axes indicate each saturation and its frequency, respectively.
Figure 9 illustrates the largest number and widest range of data because all eight GH samples are included. The coefficient of determination (R^2) of the training data is ~0.99 for all of the variables: S_W, S_GH, and S_G. In particular, S_G shows a considerably better fit between the original and the predicted values than S_W and S_GH. This is because the densities of water and GH are relatively similar, which leads to little difference in X-ray CT images and normalized CT values. However, the density of gas is much lower than that of water or GH, so the presence of gas causes a distinct change in CT values. This clear density contrast between gas and the other phases makes the prediction of gas saturation easier than the prediction of water and GH saturations. This phenomenon was identified in our previous study [16].
Figure 11. (a) Training and (b) test results of Case 3. Each column represents one saturation (water, gas hydrate, and gas) with its coefficient of determination. In the scatter plots, the X and Y axes indicate the given original data and the value predicted by random forest, respectively. In the histograms, the X and Y axes indicate each saturation and its frequency, respectively.
Figures 10 and 11 contrast with each other in terms of S_GH. These charts were expected to show different distributions of S_GH because the two cases are separated by the S_GH criterion. In both Cases 2 and 3, the overall machine-learning performance is suitable, considering that all R^2 values are greater than 0.99 and the scattered dots are positioned along the diagonal line in an orderly fashion. However, in terms of S_GH, Case 2 has a relatively wide range of 0-0.6, whereas Case 3 mostly shows either 0 or ~0.5. In Figure 11, the scattered dots deviate from the diagonal line especially near 0.4, which is a comparably low S_GH. Case 3 was trained on larger S_GH values than Case 2, which explains why Case 3 behaves this way given its data composition.
Figure 12. (a) Training and (b) test results of Case 4. Each column represents one saturation (water, gas hydrate, and gas) with its coefficient of determination. In the scatter plots, the X and Y axes indicate the given original data and the value predicted by random forest, respectively. In the histograms, the X and Y axes indicate each saturation and its frequency, respectively.
Figure 12 shows overall decent performance except for the test set of S_GH, whose R^2 of 0.89 is the lowest value. Case 1 has its smallest R^2 value, 0.92, in the test set for S_GH. In Cases 2 and 3, S_GH also shows the smallest R^2, seemingly indicating that the most challenging component of the process is the identification of S_GH for the test sets.
The density of water is maintained at approximately 1 g/cc regardless of the experimental stage. However, the density of GH highly depends on the given pressure and temperature of the three experimental stages, IWS, GH, and GTW (i.e., the last column of Table 1). Therefore, S_GH affects the normalized CT values differently according to the experimental stage. This could lead to more complex relationships between S_GH and the normalized CT values and, consequently, to greater difficulty in machine-learning training. For this reason, the methodology introduced in this study should continue to be investigated.
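For reference, the coefficient of determination shown on the charts can be sketched as below. This assumes the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares; the text does not state which formula was used, and the function name `r_squared` and the toy values are hypothetical.

```python
def r_squared(original, predicted):
    """Coefficient of determination between original and predicted values."""
    if len(original) != len(predicted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_orig = sum(original) / len(original)
    ss_res = sum((o - p) ** 2 for o, p in zip(original, predicted))
    ss_tot = sum((o - mean_orig) ** 2 for o in original)
    if ss_tot == 0:
        raise ValueError("original values must not all be identical")
    return 1.0 - ss_res / ss_tot


# Toy saturations (assumed values).
original = [0.0, 0.5, 1.0]
print(r_squared(original, original))         # perfect prediction
print(r_squared(original, [0.5, 0.5, 0.5]))  # always predicting the mean
```

Under this definition, a value near 1 means the dots lie close to the diagonal line of the scatter plots, while a model that always predicts the mean scores 0.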
In Figures 13 and 14, the four trained RF models corresponding to the four cases are tested using the universal test set. The test results shown in Figures 13a and 14a are identical to those of Figure 9b. In most machine-learning studies, trained performance is mainly evaluated with errors and correlation coefficients between the original and predicted data in the test data set. In this study, the four trained RF models are tested together with the universal test set for a consistent analysis. According to one previous study, S_GH,C was estimated to be approximately 50-60% [31].
Figure 13. Scatter plots of the universal test results for the four trained RF models; (d) Case 4, composed of the two low S_GH and the two high S_GH samples. Each column represents one saturation (water, gas hydrate, and gas) with its coefficient of determination. The X and Y axes indicate the given original data and the value predicted by random forest, respectively. The red dotted line is drawn at 50% S_GH to clarify the data trend.
Figure 14 (caption fragment): [...] of the two low S GH and the two high S GH samples; each column represents one saturation (water, gas hydrate, and gas) with its coefficient of determination; the X axis is the saturation value of each phase and the Y axis indicates the frequency corresponding to the saturation values. The red solid-line box is the original data, and the blue dotted box is the predicted result.
Interestingly, according to the S GH results shown in Figure 13b,c, an obvious boundary appears at ~50% S GH, where the red dotted lines are drawn to check whether there is any trend related to S GH,C. In Figure 13b, the S GH values over 50% are poorly matched compared to Figure 13a,c,d. On the other hand, to the left of the red line in Figure 13b, the S GH values less than 50% are fitted comparatively well. The boundary is further emphasized when viewing the left of the red line in the S GH results of Figure 13c.
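The visual boundary described above could also be quantified by computing the error separately on each side of the red line. The following is a minimal sketch, not the authors' procedure: the function name `mse_by_side`, the default 50% threshold, and the sample values are all hypothetical and chosen only for illustration.

```python
def mse_by_side(orig_sgh, pred_sgh, threshold=50.0):
    """Return the MSE of S_GH predictions below and at-or-above `threshold` (in %).

    A side that contains no data is reported as None.
    """
    if len(orig_sgh) != len(pred_sgh):
        raise ValueError("orig_sgh and pred_sgh must have the same length")

    # Collect squared errors according to the side of the ORIGINAL value.
    squared_errors = {"below": [], "above": []}
    for orig, pred in zip(orig_sgh, pred_sgh):
        side = "below" if orig < threshold else "above"
        squared_errors[side].append((orig - pred) ** 2)

    return {
        side: (sum(errs) / len(errs) if errs else None)
        for side, errs in squared_errors.items()
    }


# Made-up values for illustration only; these are not data from this study.
orig = [20.0, 40.0, 60.0, 80.0]
pred = [22.0, 38.0, 50.0, 60.0]
result = mse_by_side(orig, pred)
print(result)
```

A model trained only on the low S GH group would be expected to show a much larger value on the "above" side than on the "below" side, which is the pattern the text reads off Figure 13b.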
Furthermore, it should be noted that Case 2 shows deviated results above 50% S GH (Figures 13b and 14b), even though some of its training data lie near 50% S GH (Figure 10a). This indicates an additional effect beyond the range of the training data values, namely the "critical gas hydrate saturation". Thus, we can expect GH samples to behave distinctly depending on the S GH,C setting. Although one specific GH sample may contain some S GH values near 40% or 50%, the trend of GH behavior would depend strongly on the S GH,C chosen as an experimental condition. On that basis, we can infer a marked change of GH behavior at S GH,C, which is evaluated as ~50% in this study. Cases 2, 3, and 4 are comparable in terms of the absolute number of training data (864, 1440, and 1152, respectively; Table 3), all of which are close to 1000. Considering this, Case 4 has less-biased S GH results than Cases 2 and 3 (Figure 13d). On both the left and right sides of the red line, the data generally follow the diagonal line.
Table 4 organizes the mean square error (MSE) results of Cases 1-4 for the training set, the random 10% test set, and the universal test set. The MSEs are computed as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{Orig}_i - \mathrm{Pred}_i\right)^2$$
where $n$ is the number of training or test data, $\mathrm{Orig}_i$ is the $i$th original data, and $\mathrm{Pred}_i$ is the predicted value for the $i$th original data.
Table 4 presents the MSEs for each data set, fluid phase, and case, and also shows the averaged MSEs for an overall comparison of the training data, the random 10% test, and the universal test. In terms of each fluid phase, S G has the lowest errors among the three phase saturations. As shown in Figures 9-13, S W and S GH have the larger MSEs. As is typical in machine learning, the MSEs of the training data are lower than those of the two test sets.
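As a minimal sketch of how the per-phase and averaged MSEs of the kind reported in Table 4 could be computed, the snippet below applies the MSE definition to each saturation. It assumes that the averaged MSE is the unweighted mean over the three phases, which the text does not state explicitly; the function names and the numbers are hypothetical.

```python
def mse(orig, pred):
    """Mean square error between paired original and predicted values."""
    if len(orig) != len(pred) or len(orig) == 0:
        raise ValueError("orig and pred must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(orig, pred)) / len(orig)


def phase_mse_table(orig_by_phase, pred_by_phase):
    """Return the MSE of each phase plus their unweighted mean under 'average'.

    ASSUMPTION: averaging over phases without weights is an illustrative choice.
    """
    per_phase = {
        phase: mse(orig_by_phase[phase], pred_by_phase[phase])
        for phase in orig_by_phase
    }
    table = dict(per_phase)
    table["average"] = sum(per_phase.values()) / len(per_phase)
    return table


# Made-up saturations in % for two data points; not data from this study.
orig = {"S_W": [30.0, 40.0], "S_GH": [50.0, 40.0], "S_G": [20.0, 20.0]}
pred = {"S_W": [32.0, 44.0], "S_GH": [44.0, 40.0], "S_G": [21.0, 19.0]}
table = phase_mse_table(orig, pred)
print(table)
```

The same function can be called once per data set (training, random 10% test, universal test) to fill one row of such a table per case.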
Meaningful lessons can be drawn from the comparison of the four cases. First, the absolute number of data substantially affects machine-learning performance (Cases 1 and 4). Typically, a higher number of training data would be expected to give better performance, as long as other conditions such as features and algorithms are sufficiently appropriate [35,36]. However, although Cases 1 and 4 differ by a factor of two in the number of data, their MSEs are on nearly the same scale without a critical difference (39 and 31.5 in the universal test). The results of Cases 1 and 4 therefore suggest that the data distributions of the two cases are similar to each other, such that they also produce MSEs of similar scale. This indicates that how the data are constructed is as important as the absolute number of data for machine-learning performance. Second, if only a limited number of data can be obtained, it would be ideal to focus on a specific S GH range. Cases 2 and 3 show lower MSEs than Cases 1 and 4 in the random 10% test with a similar number of training data, which means that specialized, targeted data construction can be strategically advantageous for machine-learning performance. Third, the ratio of the MSEs of Cases 2 and 3 is about 5:3, which is similar to the ratio of the numbers of GH samples left out of training in each case: Case 2 is trained on only three of the eight GH samples, leaving five, whereas Case 3 leaves three. The less related the data, the larger the MSE.
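The data counts quoted in the text can be cross-checked with simple arithmetic. The sketch below assumes that every GH sample contributes the same number of data and that a random 10% of each sample is held out for testing; under those assumptions, the Case 1 count is inferred rather than quoted, and it is labeled as such in the code.

```python
# Training-data counts quoted in the text (Table 3) and the number of GH
# samples each case is built from (3 low, 5 high, 2 low + 2 high).
cases = {"Case 2": (864, 3), "Case 3": (1440, 5), "Case 4": (1152, 4)}

# Training data per sample, assuming every sample contributes equally.
per_sample = {}
for name, (count, n_samples) in cases.items():
    if count % n_samples != 0:
        raise ValueError(f"{name}: count is not divisible by the number of samples")
    per_sample[name] = count // n_samples

# The Conclusions quote 1600 data for five additional samples; holding out a
# random 10% of each sample for testing would leave 90% for training.
data_per_sample = 1600 // 5
training_after_holdout = data_per_sample * 9 // 10

# ASSUMPTION: Case 1 (all eight samples) uses the same per-sample count.
implied_case1 = 8 * training_after_holdout
ratio_case1_to_case4 = implied_case1 / cases["Case 4"][0]

print(per_sample, training_after_holdout, implied_case1, ratio_case1_to_case4)
```

Under these assumptions each sample supplies 288 training data in every case, and the inferred Case 1 count is twice that of Case 4, which agrees with the factor of two stated in the text.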

Conclusions
This paper proposed saturation identification of the water, GH, and gas phases in GH samples based on a machine-learning method that considers S GH,C. Moreover, the effects of training data quantity and quality on RF utilization were analyzed in the four cases. Compared to our previous related study [16], this study utilized five additional GH samples, which contributed 1600 data. Owing to these extra data, we could categorize the samples into low and high S GH groups and determine how the number of data and S GH,C affect the overall machine-learning performance.
This study validated the significant influence of S GH,C in cases where the training data consist of low and high S GH groups. The difference in the average MSE of the random 10% test (Table 4) between Cases 1 and 4 (10.9) was larger than that between Cases 2 and 3 (1.2), indicating that S GH can be a highly important standard for saturation identification in GH formation and dissociation experiments. In particular, S GH,C can be an important criterion for dividing training data when any machine-learning technique is applied to CT images from a GH experiment (refer to Cases 2 and 3). Thus, separating CT images according to S GH,C can be an appropriate option for constructing training data, leading to reliably specialized machine-learning models.
In conclusion, it is important to acquire a sufficiently high number of data in order to apply machine learning in a trustworthy manner; however, proper data construction should also be considered. It was expected that one specific standard for data building could be identified from the essential factors of the behaviors of interest based on domain knowledge, and this study verified that standard to be S GH,C. Therefore, if data acquisition were restricted to some specific type or quantity, the first step would be to select which GH experiment to conduct first according to its value of S GH. After that, GH experiments could be performed intensively to preferentially obtain CT images and saturations based on the target field condition of S GH,C. Accordingly, S GH,C would be the optimal guideline for building training data.
In future studies, additionally acquired GH CT images would be assigned to either the low or the high S GH group, with S GH,C as the criterion. Two machines could then be trained on the two categorized data sets, respectively, so as to produce two customized machine-learning models. After reliable machine-learning models are constructed from qualitatively and quantitatively sufficient data, those models could be utilized to identify saturation values during the dissociation stage of GH sample experiments with depressurization. Saturation identification of GH samples in real time is expected to be a powerful tool to help determine general GH behaviors and to conduct a variety of experiments for optimizing the parameters of GH production by depressurization.
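The grouping step described above can be sketched as a simple threshold rule. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the sample names, the default S GH,C of 50%, and the choice to place a sample exactly at S GH,C in the high group are all hypothetical.

```python
def assign_group(s_gh, s_gh_c=50.0):
    """Return 'low' or 'high' for a sample-level S_GH value (in %).

    ASSUMPTION: a sample exactly at S_GH,C is assigned to the high group.
    """
    return "low" if s_gh < s_gh_c else "high"


def group_samples(sample_sgh, s_gh_c=50.0):
    """Split a {sample name: S_GH} mapping into low and high S_GH groups."""
    groups = {"low": [], "high": []}
    for name, s_gh in sample_sgh.items():
        groups[assign_group(s_gh, s_gh_c)].append(name)
    return groups


# Made-up sample names and saturations; not data from this study.
new_samples = {
    "sample_A": 32.0,
    "sample_B": 47.5,
    "sample_C": 58.0,
    "sample_D": 71.0,
}
groups = group_samples(new_samples)
print(groups)
```

Each resulting group would then serve as the training data for its own model, and a new sample could be routed to the matching model with the same `assign_group` rule.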