1. Introduction
Many hydrocarbon reservoirs contain considerable amounts of H2S and, potentially, CO2, which have to be separated at gas plants by isolating the acid content from the hydrocarbons in amine units. This way, a natural “sweet” gas product is formed with specifications appropriate for transport to a variety of end users or for on-site consumption for operation energy demands [1,2]. Acid gas is composed mainly of H2S and/or CO2, water vapor arriving from the sour gas sweetening process, and contaminants such as small amounts of methane and heavier hydrocarbons [3]. Typically, the resulting acid gas waste stream is processed in sulfur recovery units (SRUs), such as the Claus unit, where H2S is converted to elemental sulfur [2,4]. However, SRUs are not a major revenue generator due to the economically unattractive sulfur market price, whereas air emission standards and regulatory authorities are becoming increasingly strict. As a result, oil and gas operators face an expanding economic burden and are in search of environment-friendly and cost-effective alternative methods for dealing with acid gases produced in association with sour natural resources [1,5].
One such alternative is acid gas (re-)injection (AGI) into suitable subsurface reservoirs, which combines several benefits: cutting operating costs through the deactivation of the Claus unit, reducing sulfur emissions into the atmosphere and increasing oil recovery [2,5]. In a basic AGI scheme (Figure 1), the produced reservoir gas undergoes a one- or two-stage absorption process where it contacts an amine solution. The water-saturated acid gas mixture is then separated in the amine unit at low pressures (35 to 70 kPa) and at relatively high temperatures and is typically compressed in three or four stages to reach a sufficiently high pressure for its injection into the subsurface formation [6,7]. The high-pressure acid gas flows through pipelines to the well site, arrives at the wellbore in a dense-fluid (liquid or supercritical) state, and is finally injected into the reservoir through the well tubing [7]. Depending on the composition and the specifications set by the operator, it may also be necessary to dehydrate the acid gas to moderate corrosion [8].
Gas hydrates are white, solid, ice-like structures of hydrogen-bonded water molecules, in which gas molecules (typically smaller than 0.9 nm), H2S and CO2 in the acid gas treatment context, are encapsulated in the cavities of the hydrate’s crystallized cage-like lattice (Figure 2) under low temperature and elevated pressure conditions. Their structural stability depends on the van der Waals and London forces developed by the interactions between the host (water molecules) and the guest (gas molecules) components [9]. The required conditions for their formation are sufficient gas and water supply under suitable pressure and temperature conditions [10,11]. These conditions are indicated by the p-T phase envelope, also known as the hydrate equilibrium or hydrate dissociation curve, which defines the boundary line below which (i.e., at lower temperatures and/or at higher pressures) hydrates might form [5].
Hydrates are considered a promising energy source for the future as the amount of methane trapped in subsurface formations is enormous. However, they are also a long-standing challenge faced by the industry when it comes to flow in pipelines and related equipment, as they are responsible for severe flow assurance issues, such as pipeline plugging, thus posing economic, safety, health and environmental risks [5,12]. When it comes to subsea or permafrost pipelines, the extreme temperature conditions prevailing in such environments severely increase the risk of hydrate formation, leading to new challenges in production operations and imposing the need for additional safety control procedures [13,14]. The risk of hydrate formation can be mitigated by continuous injection of thermodynamic inhibitors, such as monoethylene glycol (MEG) [15], which shifts the hydrate equilibrium curve (phase envelope) to lower temperatures and higher pressures [16]. As this technique imposes significant additional operational costs, the accurate determination of the pressure and temperature conditions of hydrate formation within complex computational fluid dynamics (CFD) simulations becomes necessary.
Figure 2. Molecular structure of gas hydrate [17].
In sour/acid gas re-injection, the hydrate formation risk is high due to the nature and prevailing conditions of this operation. Hydrates occur mostly in transportation pipelines, at restrictions (chokes or valves) due to the Joule–Thomson effect, as well as at plant restart following shut-in operations where transient conditions are observed [18]. Indeed, after the acid gas is compressed, it is directed to subsurface pipelines where the inevitably low prevailing temperatures often lead to the formation of hydrates that can potentially block them. The best possible scenario would be for the acid gas to be thermally controlled sufficiently above the hydrate formation temperature and, based on the pipeline length and the seawater temperature, to arrive at the injection point (wellhead) before its temperature drops to the hydrate formation one. Nevertheless, in cases where long pipelines must be constructed due to a large distance between the compression and the injection point, the acid gas temperature can potentially reach the temperature of the subsurface environment which, especially in deepwater and permafrost areas, can be lower than the hydrate formation one. The course of action in such cases is usually decided based on the selected surface injection design and can include a mandatory dehydration system or, if dehydration is not considered feasible due to economic, design or operating constraints, the insulation of the pipeline and its heating using electric resistances, or the injection of an inhibitor (e.g., methanol) [19].
Clearly, engineers must thoroughly investigate the possibility of hydrate appearance by accurately predicting the hydrate formation conditions in all liable operating systems and parts to better secure the safety of the operations and, most importantly, mitigate safety, health and environmental risks. Such predictions are typically based on complex thermodynamic calculations which check for hydrate formation during the course of CFD simulations, at each segment of a pipeline network and for each timestep, for any varying composition, pressure and temperature. Millions of such calculations are needed, and the load becomes even heavier when transient, rather than steady-state, conditions are anticipated. The complexity is attributed to the use of the van der Waals–Platteeuw theory in conjunction with the Langmuir adsorption theory, further combined with quite complex fluid models, such as the cubic plus association (CPA) one [20], to account for the polarity and electrolyte properties of the fluid phases (i.e., water and inhibitors). Much more complex calculations are required to identify the exact hydrate type formed (sI or sII) [21].
To significantly accelerate such calculations, machine learning (ML) methods have been proposed. The concept of ML entails a set of techniques that allow the generation of computational models representing physical problems without the need to mathematically express first-principle laws. These models are developed (trained) entirely by the use of data gathered through the observation of a system and they offer explicit computational tools to understand and control the system under study [22]. Among other applications, ML methods are used to solve regression and classification problems. In the former, the model’s input is mapped to one or more continuous quantitative variables, whereas in the latter the purpose is to assign the input to one out of a number of qualitative/discrete categories (labels) [23]. In their basic form, binary classification problems assign the input to classes such as yes/no, 0/1, etc., although multiclass problems can be handled as well. During their training, classifiers learn each class’s decision boundary using ML algorithms that try to minimize the misclassification rate [24], that is, the number of data points for which a wrong class is assigned. The model development is performed using training data, which consists of several input variable samples as well as the desired output, represented by a class for each sample (thus rendering the learning strategy a supervised one).
As the machine learning (ML) community progressively expands, the number of ML-based projects has significantly increased, as evidenced by the successful implementation of several methods for a variety of engineering problems, such as EOR-related production optimization [25,26], history matching [27,28], field development planning [29], phase behavior predictions [22,30], waterflooding processes [31], gas lift optimization [32], etc. More specifically, in the context of hydrate formation prediction and related subjects, Yu and Tian [33] developed Random Forest, Naïve Bayes and Support Vector Regression models to determine the formation conditions of natural gas hydrates. Similarly, Qasim and Lal [34] presented four different case studies involving the use of ML methods for gas hydrate prediction purposes. Suresh et al. [35] developed three ML algorithms based on Artificial Neural Networks (ANNs), the Least Square version of Support Vector Machines (LSSVM), and Extremely Randomized Trees. They evaluated their accuracy in predicting gas hydrate formation conditions by using natural gas composition, pressure and inhibitor concentration as input to predict the hydrate formation temperature. Finally, Kumari et al. [36] examined LSSVM and ANN models in conjunction with Genetic Programming and Genetic Algorithms to predict the stability conditions of natural gas hydrates.
In this work, a set of classifiers is proposed to handle the hydrates formation question in acid gas flow simulations. The ML models answer whether the prevailing conditions at any pipeline segment and at any timestep during the simulation lie to the left (i.e., where hydrates are formed) or to the right-hand side of the phase envelope of the fluid’s composition where hydrates formation is not favorable. Eventually, the classifiers will provide a clear answer to the hydrates formation question directly, in a non-iterative fashion, for any possible composition and prevailing conditions, in a tiny fraction of the time required by the conventional iterative, complex and CPU-demanding process. Moreover, they can account for arbitrarily high operating pressures and acid gas compositions containing various amounts of impurities and inhibitors. A large set of hydrate formation “yes/no” test points are generated offline, using the conventional, rigorous approach. Subsequently, the test data is introduced to various classifying ML architectures which are trained to provide rapidly the correct hydrate formation answer. The prediction capability and the CPU time gain of the developed tool are demonstrated by simulating the flow conditions along large pipeline networks and for a variety of acid gas compositions. The developed model is directly applicable to any acid gas pipeline problem and for any prevailing conditions to drastically reduce the CPU time spent for multiphase equilibrium calculations during heavy-duty CFD flow simulations.
The rest of the paper is structured as follows: Section 2 discusses the existing rigorous methods to determine hydrate formation conditions. In Section 3, the classification techniques tested in this study are presented, including Decision Trees, Random Forests, Support Vector Classifiers and classification Neural Networks. Additionally, this section elaborates on the methodology used for acquiring the training data. Section 4 presents a thorough analysis of the obtained results followed by an acid gas reinjection design case study. Section 6 concludes the paper with the final findings.
2. Determination of Hydrates Formation Conditions
2.1. Thermodynamic Approach
The study of hydrate phase equilibrium has a long history spanning several decades. In the 1940s, Wilcox et al. [37] developed a semi-empirical model that employed the equilibrium constants (k-values) method and relied on the theory of solid phase equilibrium to create a corresponding hydrate phase equilibrium chart. Later, in 1989, Mann et al. [38] introduced an updated chart for CO2, H2S and nitrogen gas hydrates to enhance the accuracy. In 1988, Holder et al. [39] introduced the first empirical correlations for single-component gas hydrate phase equilibrium. Makogon [40] and Kobayashi et al. [41] expanded these empirical correlations to account for multicomponent natural gas and developed correlations based on gas gravity.
These charts and empirical correlations were extensively used in early hydrate phase equilibrium prediction but have become largely outdated with the advent of more precise, rigorous thermodynamic models. Currently, the existing thermodynamic models for hydrate phase equilibrium are founded on the base model proposed by van der Waals and Platteeuw [42], as discussed below. Classic stability analysis is typically used to examine whether some specific fluid phase is formed in a mixture at given pressure and temperature conditions. When it comes to hydrates, there is no simple stability algorithm to provide a single binary answer (hydrates forming/not forming) and the presence of hydrates can only be examined by comparing the prevailing pressure (temperature) to the hydrate formation one at the current temperature (pressure). This approach requires that the hydrate formation conditions are estimated using rigorous thermodynamic methods.
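The comparison described above can be sketched in a few lines once a formation curve is available. The snippet below is a minimal illustration only: the tabulated curve `CURVE` and the helper names are hypothetical placeholders, not data or code from this work, and a rigorous model would supply the actual formation pressures.

```python
from bisect import bisect_left

# Hypothetical tabulated formation curve: (temperature in °C, formation pressure in bar).
# The numbers are illustrative placeholders, not results from this work.
CURVE = [(0.0, 10.0), (5.0, 18.0), (10.0, 35.0), (15.0, 70.0), (20.0, 150.0)]


def formation_pressure(temperature_c, curve=CURVE):
    """Linearly interpolate the hydrate formation pressure at a given temperature."""
    temps = [t for t, _ in curve]
    if not temps[0] <= temperature_c <= temps[-1]:
        raise ValueError("temperature outside the tabulated range")
    # Index of the right-hand node of the bracketing interval (at least 1).
    i = max(1, bisect_left(temps, temperature_c))
    (t0, p0), (t1, p1) = curve[i - 1], curve[i]
    return p0 + (p1 - p0) * (temperature_c - t0) / (t1 - t0)


def hydrates_expected(pressure_bar, temperature_c):
    """Hydrates are favored when the prevailing pressure reaches the formation pressure."""
    return pressure_bar >= formation_pressure(temperature_c)
```

In a flow simulation the expensive part is not this comparison but obtaining the formation pressure itself, which is what the rigorous model has to compute for every composition.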
Figure 3 depicts an example hydrate formation phase diagram for a gas mixture of C2H6 and C3H8. Various phase boundaries can be observed depending on the presence of hydrates and their specific structure type (sI or sII).
Estimating the hydrate formation pressure or temperature has been a hot topic in thermodynamics since 1960, when van der Waals and Platteeuw presented the first rigorous approach. The idea is that, at hydrate formation conditions, all phases present exhibit the same water fugacity and that hydrates appear at an infinitesimal quantity, that is

$$ f_w^{H} = f_w^{L} = f_w^{G} = f_w^{I} $$

where superscripts $H$, $L$, $G$ and $I$ denote hydrates, liquid, gas and ice (if applicable) phases. Therefore, estimating the formation conditions requires the solution of a multiphase phase split problem to determine the amount and composition of each phase present [44]. The computational cost to obtain the fugacity of water in the fluid phases is moderate as complex Equation of State (EoS) models, such as the CPA one [20], need to be incorporated, the complexity of which is significantly higher than that of simple cubic EoS models [45]. However, the estimation of water fugacity in the hydrate phase is very cumbersome as acid gas molecules are assumed to be trapped in the water molecules’ cage through adsorption. Langmuir’s theory is utilized to describe the thermodynamics of adsorption, incorporating complex potential functions such as those proposed by Kihara [46].
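For orientation, the commonly quoted textbook form of the van der Waals–Platteeuw model is sketched below. The notation is an assumption made here for illustration and is not taken from this work: $\beta$ denotes the hypothetical empty hydrate lattice, $\nu_i$ the number of type-$i$ cavities per water molecule, $\theta_{ij}$ the fractional occupancy of cavity type $i$ by guest $j$, $C_{ij}$ the corresponding Langmuir constant and $f_j$ the fugacity of guest $j$.

```latex
% Chemical potential difference between the empty lattice and the filled hydrate
\Delta\mu_w^{\beta-H} = \mu_w^{\beta} - \mu_w^{H}
  = -RT \sum_i \nu_i \ln\Bigl(1 - \sum_j \theta_{ij}\Bigr)

% Langmuir-type cavity occupancy
\theta_{ij} = \frac{C_{ij}\, f_j}{1 + \sum_k C_{ik}\, f_k}

% Resulting water fugacity in the hydrate phase
f_w^{H} = f_w^{\beta} \exp\!\left(-\frac{\Delta\mu_w^{\beta-H}}{RT}\right)
```

The Langmuir constants $C_{ij}$ are the costly part, since each one is typically obtained by integrating a guest-cavity potential, such as the Kihara one, over the cavity volume.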
Acid gases, specifically H2S, play a significant role in altering the phase equilibrium of gas hydrates during acid gas injection operations since their presence affects both the stability and formation of hydrates. When H2S and CO2 are present in an injection system, they participate in hydrate formation along with other components and they can form hydrates at higher temperatures and lower pressures compared to methane (the dominant constituent of natural gas), thereby expanding the hydrate stability zone. In other words, the presence of these acid gases relaxes the temperature and pressure conditions required for hydrate formation.
Figure 4 illustrates the calculated hydrate formation curves for three different gases: a sweet gas and the two sour gases obtained by enriching the sweet one with 20 mole % CO2 and with 20 mole % H2S, respectively. As demonstrated, the impact of H2S is significantly more pronounced than that of CO2. While CO2 slightly depresses (shifts to the left) the hydrate formation condition, H2S considerably promotes hydrate formation [47].
Wu and Carroll [48] performed an experimental procedure to test the hydrate formation of four sour gas mixtures with increasing H2S content (8.3, 8.4, 11.68 and 28.8%), concluding with the following remarks:
The hydrate formation temperature rises with an elevated H2S content and, specifically, when it exceeds 10%, there is a notable increase in the temperature of hydrate formation. For gas containing less than 10% H2S, the increase in hydrate formation temperature is relatively small, but not insignificant.
When the gas contains more than 30% H2S, the hydrate formation temperature becomes comparable to that of pure H2S.
The hydrate formation temperature shows a rapid increase with pressure changes at lower pressures. However, at higher pressures, the hydrate formation temperature changes more gradually, indicating that it is more sensitive to pressure variations at lower pressure levels.
The thermodynamics of hydrate formation is also influenced by the molecular size and shape of the gas. Acid gases (CO2 and H2S) have larger molecular sizes compared to methane and thus prefer to occupy larger cavities in the hydrate structure. This may lead to a change in the distribution of hydrate structures, potentially causing the hydrate to shift from structure I (sI) to structure II (sII) depending on the composition. Furthermore, acid gases affect the stability of the hydrate. CO2 hydrates are less stable than methane hydrates, implying that, in a mixed system, the presence of CO2 could destabilize the hydrate. However, H2S forms more stable hydrates than methane, so it can stabilize hydrates in a mixed system.
Strictly speaking, crossing the hydrates formation boundary does not necessarily mean that blockage will immediately take place. In fact, hydrates are spread in the flowing acid gas phase which is continuously enriched in solid particles, forming a “slurry”. Although the deposition, which will eventually lead to pipeline blockage, starts later, the initiation of the process acts as a warning to engineers who need to handle this issue as soon as possible.
It should be noted that this work focuses on hydrate generation rather than on hydrate blockage which requires real-time information and detailed knowledge of the exact conditions along the pipelines. Nevertheless, anticipating the blockage problem using a comparison to the phase envelope is the way engineers and related software go when first dealing with a network design problem. Steady-state network simulations are used to roughly evaluate whether hydrates endanger the acid gas flow as well as to estimate the inhibitors’ dosage, if needed. Getting further with a fully detailed analysis is not a common task, especially for small players in the market, as it requires plenty of monitoring data which will be utilized to tune and run a representative model of the flowing conditions and predict in detail hydrates blockage.
The commercial software most widely available to the petroleum industry, such as Pipesim by SLB, Prosper by Petroleum Experts, HYSYS by Aspen and UniSim by Honeywell, only handles steady-state flow conditions, where time derivatives are equal to zero. As a result, treating hydrate formation at a fully detailed level, where subcooling and eventually aggregation take place, requires transient analysis, for which OLGA by Schlumberger and HYSYS Dynamics by Aspen are suitable products. Furthermore, according to the industry’s experience, engineers’ expertise in a corporate environment is usually focused on steady-state solutions rather than transient ones, as the latter are much more complex to handle. To satisfy that need of the industry, software developers have focused mostly on steady-state products. Therefore, this work aims at providing an easy-to-embed methodology to improve the speed of exactly this kind of simulation, where the only criterion that can be applied is the comparison of running conditions against the thermodynamically defined hydrate formation ones.
2.2. Classification Approach
Estimating the formation pressure (temperature) at a given temperature (pressure) is a complex task that requires multiple iterative calculations and consumes a significant part of the total acid gas flow simulation time, as it involves the computation of numerous intermediate values such as components’ fugacity, Langmuir constants and cell potentials. On the other hand, the stability question eventually only requires a binary yes/no answer. Since rigorous thermodynamics cannot provide that answer directly, ML methods can be utilized instead.
Let 𝐳, p and T correspond to the flowing stream composition and the prevailing pressure and temperature conditions, respectively. Let also f(𝐳, p, T) denote a function that exhibits
Positive values when the prevailing conditions do favor hydrates formation;
Negative values when the prevailing conditions do not favor hydrates formation;
Zero value exactly at the hydrates formation phase envelope.
Function f is known as a “discriminating function”, the sign of which suffices to clearly determine the existence or not of hydrates since

$$
\text{hydrates}
\begin{cases}
\text{form}, & f(\mathbf{z}, p, T) > 0 \\
\text{do not form}, & f(\mathbf{z}, p, T) < 0
\end{cases}
$$
To generate such a model, classification methods from the ML field, such as Decision Trees or Support Vector Machines can be utilized. The training dataset can be generated by picking random input points within the expected operating space and then running offline the rigorous hydrates stability algorithm to obtain the corresponding class label, i.e., whether hydrates are formed or not. Subsequently, the training dataset is forwarded to the training algorithm which generates the discriminating function form employing an iterative procedure known as the “training” of the classifier.
As a simple example, consider a fixed acid gas composition case and a set of training points, i.e., combinations of potential pressure and temperature values. In Figure 5, points in the green-colored area correspond to conditions where hydrate formation is favorable, as opposed to those in the blue background where pressure is too low or temperature is too high to allow hydrates to form. The classifier training aims at developing an explicit expression to evaluate the red line which separates the points in the two areas without allowing for misclassifications. Clearly, the more training data points, the more densely each area is populated and, hence, the closer the red discriminating line lies to the exact, thermodynamically rigorous phase boundary.
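The labeling procedure for a fixed composition can be sketched as follows. This is a toy illustration under loud assumptions: the log-linear boundary with constants `A` and `B` is a made-up stand-in for the rigorous offline calculation, and the function names are hypothetical. Only the sampling ranges follow the ones quoted in this paper.

```python
import math
import random

# Hypothetical stand-in for the rigorous model: ln(p_form) = A + B * T.
# A and B are illustrative constants, not fitted to any real acid gas.
A, B = 2.3, 0.13


def discriminating_function(p_bar, t_c):
    """Positive where hydrates are favored, negative otherwise, zero on the boundary."""
    return math.log(p_bar) - (A + B * t_c)


def label(p_bar, t_c):
    """Class label: 1 if hydrates are expected to form, 0 otherwise."""
    return 1 if discriminating_function(p_bar, t_c) > 0 else 0


def make_dataset(n_points, seed=0):
    """Draw uniform (p, T) points in [1, 300] bar x [-20, 40] °C and label them."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_points):
        p = rng.uniform(1.0, 300.0)
        t = rng.uniform(-20.0, 40.0)
        rows.append((p, t, label(p, t)))
    return rows
```

A trained classifier replaces `discriminating_function` with a learned expression, so that the label is obtained without calling the rigorous model at simulation time.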
3. Classification Models Development
3.1. Classification Models
Four popular ML classification techniques are evaluated in this work, namely Decision Trees (DTs), Random Forests (RFs), Support Vector Classifiers (SVCs) and classification Neural Networks (NNs). It must be emphasized that training time is not an issue in the present case, as training is performed once, offline and prior to any fluid flow simulation run. What really matters is the time required to evaluate the sign of the obtained discriminating function, i.e., obtain the label during the simulation, once the training has been completed. A very complex function expression may handicap the anticipated CPU time gain; hence, it should be dropped and replaced by another classification technique that leads to a simpler expression.
DTs construct a flowchart-like structure (Figure 6), where internal nodes represent input features such as pressure and temperature, and branches represent decision rules which split data points based on a chosen feature and threshold value (e.g., p < 10 bar, T > 8 °C). Finally, the ending leaf nodes represent predicted class labels (True or False). Training the DT aims at defining the appropriate order of splits, i.e., selecting the feature to be used and its splitting value, which minimizes or even zeros the misclassifications over the training population while ensuring optimal generalization capability.
To train a DT, the Gini impurity measure is utilized to select optimal splits in the variable space as a criterion that quantifies the impurity or disorder within a set of class labels. It is defined as the probability of misclassifying a randomly chosen data point based on the distribution of class labels. By selecting the splits that minimize the Gini impurity, the decision tree aims to segregate the data points into pure or nearly pure subsets, optimizing the classification accuracy. This allows the DT to effectively partition the input space based on the training data, enabling accurate predictions for new, unseen data points.
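A minimal sketch of the split selection for binary labels is given below. It assumes an exhaustive search over midpoints between consecutive feature values, which is one common choice and not necessarily the implementation used in this study; the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 1 - p0^2 - p1^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_split(rows, n_features):
    """Find the split with the lowest weighted Gini impurity.

    rows is a list of (features, label) tuples.
    Returns (weighted_gini, feature_index, threshold), or None if no split exists.
    """
    best = None
    for j in range(n_features):
        values = sorted({x[j] for x, _ in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)
            left = [y for x, y in rows if x[j] <= thr]
            right = [y for x, y in rows if x[j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, j, thr)
    return best
```

A full tree applies `best_split` recursively to the left and right subsets until the leaves are pure or a depth limit is reached.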
RF is an ensemble learning method that combines multiple simple DTs to create a more robust and accurate prediction model. Each tree is trained on a different subset of the data, drawn by randomly selecting data points with replacement, a procedure known as bootstrap aggregating or “bagging”. Additionally, at each split in a DT, only a random subset of the input features is considered. This randomization reduces overfitting and increases the diversity among the individual DTs. The final prediction of the RF is determined by aggregating the predictions of all the individual trees through majority voting. By combining the predictions of multiple DTs, RF improves the generalization performance and provides robustness, scalability and better overall predictive accuracy compared to a single DT.
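The two ingredients, bootstrap sampling and majority voting, can be illustrated with a deliberately simplified ensemble. The sketch below swaps full decision trees for one-split "stumps" and picks one random feature per stump rather than per split, so it is a toy stand-in and not a complete Random Forest; all names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) points with replacement (the bagging step)."""
    return [rng.choice(rows) for _ in rows]


def train_stump(rows, feature):
    """One-split classifier: threshold halfway between the class means of one feature."""
    ones = [x[feature] for x, y in rows if y == 1]
    zeros = [x[feature] for x, y in rows if y == 0]
    if not ones or not zeros:
        # The bootstrap sample contains a single class: predict it everywhere.
        constant = 1 if ones else 0
        return lambda x: constant
    mean1, mean0 = sum(ones) / len(ones), sum(zeros) / len(zeros)
    thr = 0.5 * (mean1 + mean0)
    sign = 1 if mean1 > mean0 else -1
    return lambda x: 1 if sign * (x[feature] - thr) > 0 else 0


def train_ensemble(rows, n_trees, n_features, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        feature = rng.randrange(n_features)
        ensemble.append(train_stump(sample, feature))
    return ensemble


def predict(ensemble, x):
    """Majority vote over the individual predictions."""
    votes = Counter(tree(x) for tree in ensemble)
    return votes.most_common(1)[0][0]
```

Using an odd number of trees avoids ties in the binary vote.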
SVCs are powerful supervised ML algorithms that aim to identify an optimal hyperplane for separating data points of different classes. Unlike traditional classification algorithms that focus on minimizing misclassification error, SVMs seek to maximize the margin, which is the distance between the class discriminating hyperplane and the closest training data points, known as support vectors. The optimization task of SVMs involves finding the optimal hyperplane that maximizes the margin while correctly classifying the training data. This is achieved by solving a quadratic optimization problem subject to linear inequality constraints. To handle non-linear relationships, SVMs utilize kernel functions, such as polynomial or a Radial Basis Function (RBF), to implicitly map the input data into a higher-dimensional feature space where the data becomes linearly separable. The choice of the kernel function depends on the specific problem and the underlying characteristics of the data. By employing SVMs with appropriate kernel functions, complex decision boundaries can be captured, allowing for the effective classification of the data.
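Once trained, a kernel classifier of this kind is evaluated as a weighted sum of kernel values over the support vectors, so its cost at simulation time grows with their number. The sketch below evaluates an RBF decision function; the support vectors, multipliers `alphas`, bias `b` and `gamma` are hypothetical inputs that a training algorithm would normally provide.

```python
import math


def rbf_kernel(u, v, gamma):
    """Radial Basis Function kernel exp(-gamma * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def decision_function(x, support_vectors, labels, alphas, b, gamma):
    """Signed score: sum_i alpha_i * y_i * K(x_i, x) + b, with labels y_i in {-1, +1}."""
    return b + sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )


def classify(x, support_vectors, labels, alphas, b, gamma):
    """Return 1 when the score is positive, 0 otherwise."""
    score = decision_function(x, support_vectors, labels, alphas, b, gamma)
    return 1 if score > 0 else 0
```

In practice the inputs would be scaled before applying the kernel, since pressure and temperature span very different numerical ranges.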
NNs are a class of ML models inspired by the structure and functioning of the human brain. They are composed of interconnected layers of artificial neurons, which are organized in an input layer, one or more hidden layers and an output one. Each neuron processes the information and passes it to the next layer through weighted connections. During the training process, NNs learn to adjust the weights of these connections based on a given objective, typically to minimize the error between the predicted and actual output. This is done through an optimization algorithm, such as gradient descent, which iteratively updates the weights to improve the network’s performance. Classification NNs obtain a class label from the output of the neural network by applying a threshold to the value produced by the logistic function at the single output neuron. The logistic function produces a value between 0 and 1, which can be interpreted as the probability of the input belonging to a particular class. By setting a threshold, typically 0.5, the input data can be assigned to class 1 if the output value is greater than or equal to the threshold, and class 0 otherwise. Once trained, NNs can classify input points quickly by propagating the input through the network and producing an output at the final layer.
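The forward pass and thresholding step can be sketched for a network with one hidden layer. The tanh hidden activation and the parameter layout are assumptions made for illustration; the weights are hypothetical and would come from training.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Propagate x through one tanh hidden layer and a logistic output neuron."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)


def classify(x, params, threshold=0.5):
    """Class 1 if the output probability reaches the threshold, class 0 otherwise."""
    return 1 if forward(x, *params) >= threshold else 0
```

Evaluation involves only a fixed number of multiplications and activation calls, which is why the cost per point is small and independent of the training set size.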
3.2. Classification Models Input
The data required by a thermodynamically rigorous approach to determine hydrate formation conditions are the composition vector 𝐳, the prevailing pressure and temperature values as well as the component properties. The same input needs to be incorporated into the ML models, apart from the component properties which are constant and do not bear any information. Based on that, the data used to train the ML models consisted of a large number of (𝐱, 𝐲) pairs, where the input vector 𝐱 comprises the composition vector 𝐳 containing the concentration of all four components typically found in acid gas mixtures, that is, CO2, H2S, C1 and C2. To honor the condition that the composition vector 𝐳 lies in the 3D simplex, since valid composition mole fractions sum up to unity, and to avoid linear dependence of the inputs, only 3 independent component concentrations were introduced, thus reducing the input vector size to 5. The pressure and temperature values for each composition combination, p and T, respectively, were uniformly drawn and the Prosper software by PetEx was used to construct the hydrate dissociation curves that, ultimately, generate the corresponding output vector 𝐲 which contains the assigned label that designates whether hydrates will form or not (1 or 0, respectively). The generated dataset can be arbitrarily large and the data itself is noise-free as it is generated by a thermodynamically consistent method, namely the solids thermodynamic model implemented in the HydraFLASH software that is integrated into Prosper. For a given acid gas mixture, HydraFLASH uses the multiphase equilibrium algorithm by Michelsen [50], the van der Waals and Platteeuw theory and the CPA EoS [51] to predict the hydrate dissociation curve.
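The construction of the five-element input vector can be sketched as below. Which of the four components is dropped is an assumption here (the last one, C2), as are the function and constant names.

```python
COMPONENTS = ("CO2", "H2S", "C1", "C2")


def build_input(z, p_bar, t_c, tol=1e-9):
    """Return [z_CO2, z_H2S, z_C1, p, T] from a full composition dictionary.

    The last mole fraction (C2 in this sketch) is omitted because it is fixed
    by the others through the sum-to-unity constraint.
    """
    if set(z) != set(COMPONENTS):
        raise ValueError("composition must contain exactly CO2, H2S, C1 and C2")
    if any(value < 0.0 for value in z.values()):
        raise ValueError("mole fractions must be non-negative")
    if abs(sum(z.values()) - 1.0) > tol:
        raise ValueError("mole fractions must sum to unity")
    independent = [z[name] for name in COMPONENTS[:-1]]
    return independent + [p_bar, t_c]
```

For example, `build_input({"CO2": 0.3, "H2S": 0.6, "C1": 0.08, "C2": 0.02}, 50.0, 4.0)` yields a list of three mole fractions followed by the pressure and the temperature.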
Compositions were randomly generated from the uniform distribution shown in Table 1 to densely cover the expected range of reservoir and surface conditions of the acid gas re-injection system. To account for the anticipated acid gas re-injection conditions, the pressure and temperature range of interest was determined based on the hydrate formation conditions in acid gas streams (Figure 7) as studied by Wu and Carroll [52]. This chart provides useful insight into the hydrate formation curves of three acid gas mixtures of varying H2S and CO2 content, namely 75%/25%, 50%/50% and 25%/75%, with the rightmost and the leftmost lines corresponding to pure H2S and CO2, respectively. The three test acid gas mixtures lie between the two bounding curves, with decreasing maximum formation temperature as the H2S content decreases [5].
Based on
Figure 7, the temperature range selected is [−20, 40] °C to account for the minimum possible subsea temperature (−20 °C) that may be encountered in colder regions across the world, as well as for the maximum possible upper hydrate formation temperature limit (approximately 35 °C) when a safe margin of 5 °C is added to secure a safety window (40 °C). As far as the pressure range is concerned, the lower pressure limit was dictated by the acid gas output pressure from the AU, which is close to the atmospheric one. However, the selection of the upper-pressure limit is a trickier procedure since it depends on the operation under consideration. For the case of acid/sour gas injection in shallow formations, high-pressure compressors can reach high-pressure values, up until 1000 psi (70 bar), as is the case of the high-pressurized sour gas compressor in the Tengiz field in Kazakhstan [
53]. In the case of the pipeline distribution system, pressures can be as high as 2800 psi (or 190 bar) [
54]. Finally, the injection pressure at the wellhead, which is determined during the process design phase and depends on the reservoir properties, can reach values as high as 4000 psi (275 bar) [
5]. Consequently, the pressure range of [1, 300] bar was selected for the present study.
3.3. Classification Models Dataset Generation
To build the training dataset, hydrate formation curves were generated for several acid gas mixtures which span the entire composition spectrum, using the Prosper software by IPM [
55]. Subsequently, for each composition, thousands of pressure–temperature pairs were randomly generated. To properly classify the position of each test point relative to the curve (left or right), an algorithm known as the winding number algorithm was used. The winding number of a given reference point, in this case a randomly generated pressure–temperature point, is an integer that represents the total number of times that a closed curve travels counterclockwise around the point under consideration. The algorithm is implemented by first casting a horizontal ray that starts at the reference point and extends to positive infinity. It then iterates over each edge of the curve and checks whether the ray intersects the edge. If the ray intersects the edge from below and the slope of the edge is greater than the slope of the ray, the winding number is incremented; otherwise, it is decremented.
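The labeling step can be illustrated with a minimal sketch, under stated assumptions. The code below uses a common formulation of the winding number test, in which a signed side test replaces the slope comparison; it is not necessarily the exact variant used in this work. The hydrate curve is assumed to be closed into a polygon by extending it along the top and left edges of the operating window, and the curve points, window limits and function names are all hypothetical.

```python
def is_left(a, b, pt):
    """Positive if pt lies to the left of the directed edge a->b, negative if right."""
    return (b[0] - a[0]) * (pt[1] - a[1]) - (pt[0] - a[0]) * (b[1] - a[1])


def winding_number(pt, polygon):
    """Signed number of times the closed polygon winds counterclockwise around pt."""
    wn = 0
    n = len(polygon)
    for i in range(n):
        a, b = polygon[i], polygon[(i + 1) % n]
        if a[1] <= pt[1]:
            # Upward edge crossing the horizontal ray, with pt on its left
            if b[1] > pt[1] and is_left(a, b, pt) > 0:
                wn += 1
        else:
            # Downward edge crossing the horizontal ray, with pt on its right
            if b[1] <= pt[1] and is_left(a, b, pt) < 0:
                wn -= 1
    return wn


# Hypothetical hydrate curve as (T in degC, P in bar) points, ordered by pressure.
CURVE = [(-5.0, 1.0), (5.0, 20.0), (15.0, 100.0), (25.0, 300.0)]
T_MIN, P_MIN, P_MAX = -20.0, 1.0, 300.0


def hydrate_label(temperature, pressure, curve=CURVE):
    """Return 1 if the point lies in the (assumed) hydrate region, else 0."""
    # Close the curve along the top and left edges of the operating window.
    polygon = list(curve) + [(T_MIN, P_MAX), (T_MIN, P_MIN)]
    return 1 if winding_number((temperature, pressure), polygon) != 0 else 0
```

With this toy curve, a point at 0 °C and 100 bar lies to the left of the curve and is labeled 1, whereas a point at 30 °C and 100 bar is labeled 0.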
As the above-mentioned procedure produces samples drawn uniformly over the operating space, a tweak was used to bias the sampling. Indeed, the training dataset should include more points “close” to the phase envelope, as these points provide more detailed information than the ones lying comfortably far to the left or right. To achieve such a non-uniform distribution of training points, each point was assigned a “keep or discard” probability as soon as it was randomly generated, which depended on its Euclidean distance to the phase boundary in the Min–Max scaled pressure–temperature space and takes the form
The lower the distance, the higher the probability that the uniformly drawn data point will eventually be included in the training population, thus densifying points close to the phase boundary.
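The keep-or-discard idea can be sketched as follows. The exact probability expression is not reproduced in the text above, so the exponential decay used here, together with its `length_scale` parameter, is purely an assumption chosen for illustration; the curve points and window limits are likewise hypothetical.

```python
import math
import random


def scale(point, lo, hi):
    """Min-Max scale a (T, P) point to the unit square."""
    return tuple((v - l) / (h - l) for v, l, h in zip(point, lo, hi))


def dist_to_segment(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def dist_to_curve(p, curve):
    """Distance from p to a polyline given as a list of vertices."""
    return min(dist_to_segment(p, a, b) for a, b in zip(curve, curve[1:]))


def biased_sample(n_draws, curve, lo, hi, length_scale=0.1, seed=0):
    """Draw uniform points and keep each one with a distance-dependent probability."""
    rng = random.Random(seed)
    scaled_curve = [scale(c, lo, hi) for c in curve]
    kept, all_distances = [], []
    for _ in range(n_draws):
        pt = tuple(rng.uniform(l, h) for l, h in zip(lo, hi))
        d = dist_to_curve(scale(pt, lo, hi), scaled_curve)
        all_distances.append(d)
        # ASSUMED acceptance rule: probability decays exponentially with distance.
        if rng.random() < math.exp(-d / length_scale):
            kept.append((pt, d))
    return kept, all_distances
```

Any monotonically decreasing function of the distance would produce the same qualitative effect; a smaller `length_scale` concentrates the retained points more tightly around the boundary.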
Ultimately, the dataset used to train the selected ML models consisted of approximately 40,000 pressure and temperature data points. Of these, 60% were used for training and another 30% for testing once the training procedure was over. The remaining 10% of the dataset was retained for validation purposes to confirm the efficiency of the trained models.
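A 60/30/10 partition of this kind can be sketched as below. The shuffling, the seed and the function name are assumptions made for illustration; the text does not state how the subsets were actually drawn.

```python
import random


def split_dataset(samples, fractions=(0.6, 0.3, 0.1), seed=0):
    """Shuffle the samples and split them into train, test and validation subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(round(fractions[0] * n))
    n_test = int(round(fractions[1] * n))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains (about 10%)
    return train, test, validation
```

For a dataset of 40,000 points, this yields 24,000 training, 12,000 testing and 4,000 validation points.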
5. Case Study
Figure 16 depicts the initially proposed design of a CO
2-rich acid gas re-injection scheme. Firstly, the compressors receive the acid gas at the AU outlet and increase its pressure at a compression ratio of 2.5. Throughout each compression stage, strict temperature control is maintained to ensure that the temperature of the gas exiting the compressor does not exceed 150 °C (due to the Joule–Thomson effect). Each compression stage is followed by a cooling stage (implemented using chillers), which cools the heated fluid at constant pressure. The whole process begins at atmospheric conditions and gradually reaches a pressure of 100 bar, thus requiring a five-stage process that involves both compression and cooling of the undesired acid gas mixture produced by the reservoir. After the last cooling stage, where the acid gas turns from a supercritical fluid to a liquid, the stream is driven to the wellhead location through a subsea pipeline lying at a minimum temperature of 9 °C according to the local weather data. Along the pipeline, the acid gas undergoes a slight pressure drop (due to friction) and a significant temperature drop due to the low temperature and large heat capacity of the seawater. Finally, the high pressure required for re-injection into the reservoir (250 bar) necessitates the inclusion of a pump.
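The five-stage figure can be checked with a short calculation. The sketch below assumes an inlet pressure of exactly 1 bar, a constant compression ratio and no pressure losses in the coolers; the function name is hypothetical.

```python
def stage_pressures(p_inlet_bar, ratio, n_stages):
    """Discharge pressure after each stage for a constant compression ratio."""
    pressures = []
    p = p_inlet_bar
    for _ in range(n_stages):
        p *= ratio
        pressures.append(p)
    return pressures


# Assumed inlet of 1 bar and the compression ratio of 2.5 quoted in the text.
print(stage_pressures(1.0, 2.5, 5))
# [2.5, 6.25, 15.625, 39.0625, 97.65625]
```

Under these assumptions, four stages reach only about 39 bar, while the fifth stage brings the stream to roughly 98 bar, which is consistent with the quoted discharge pressure of about 100 bar.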
The proposed design effectively avoids the two-phase region of the CO2-rich acid gas by developing a pressure (100 bar) higher than the acid gas cricondenbar (76 bar). However, a more thorough examination of the hydrate formation curve of the acid gas mixture (generated using the trained NN model and a bisection method) verifies the flow simulation results, which indicate the formation of hydrates halfway along the pipeline and all along the pump path.
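Tracing the hydrate curve from a binary classifier with bisection can be sketched as follows. The trained NN model is not available here, so `toy_classifier` is a hypothetical stand-in with an arbitrary, non-physical boundary; the temperature bracket and tolerance are also assumptions.

```python
import math


def toy_classifier(temperature, pressure):
    """Stand-in for the trained model: 1 if hydrates form, 0 otherwise.
    The boundary T = 8 * log10(P) is arbitrary and NOT a physical correlation."""
    return 1 if temperature < 8.0 * math.log10(pressure) else 0


def hydrate_temperature(classifier, pressure, t_lo=-20.0, t_hi=40.0, tol=0.01):
    """Locate the temperature at which the predicted label flips, at fixed pressure."""
    # Bisection needs a bracket: hydrates at t_lo and no hydrates at t_hi.
    if classifier(t_lo, pressure) != 1 or classifier(t_hi, pressure) != 0:
        return None
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if classifier(t_mid, pressure) == 1:
            t_lo = t_mid  # still inside the hydrate region, move right
        else:
            t_hi = t_mid  # outside the hydrate region, move left
    return 0.5 * (t_lo + t_hi)


# Trace the curve at a few pressures (bar) within the study range.
curve = [(p, hydrate_temperature(toy_classifier, p)) for p in (10.0, 100.0, 250.0)]
```

Repeating the search over a grid of pressures yields the full curve, against which the pipeline and pump pressure–temperature path can then be compared.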
To remedy this situation, pipeline insulation is proposed, thus achieving a similar pressure drop but reducing the temperature drop and keeping the acid gas conditions safely away from the hydrate formation conditions (
Figure 17). It should be noted that the required insulation quality and thickness also depend on the flow rate: the lower the flow rate, the larger the temperature drop, since the fluid has more time to exchange heat with the seawater.
According to the operator’s field development plans, it is expected that additional H
2S-rich sections of the reservoir will soon be brought into production, resulting in an increased concentration of H
2S in the acid gas. As a result, the design shown in
Figure 17 is no longer suitable, although the prevailing conditions during compression still lie far away from the two-phase region. The reason is that the hydrate formation curve shifts to the right by approximately 8 °C owing to the increased H
2S content of the acid gas stream, as shown in
Figure 18. Two precautionary approaches can be considered. Firstly, the pipeline insulation can be reinforced to provide extra protection against hydrate formation by further reducing the temperature drop. Alternatively, the compressed gas can be mixed, at a considerable cost, with a suitable additive that inhibits hydrate formation, or electric heating elements can be installed. By implementing these precautionary measures, the formation of hydrates can be successfully prevented, as demonstrated in
Figure 19.