Predicting Gas Separation Efficiency of a Downhole Separator Using Machine Learning

Sharma, Ashutosh; Osorio Ojeda, Laura Camila; Yuan, Na; Burak, Tunc; Gupta, Ishank; Konate, Nabe; Karami, Hamidreza

doi:10.3390/en17112655


by Ashutosh Sharma ¹,*, Laura Camila Osorio Ojeda ¹, Na Yuan ¹,*, Tunc Burak ¹, Ishank Gupta ², Nabe Konate ¹ and Hamidreza Karami ¹

¹ Mewbourne School of Petroleum and Geological Engineering, University of Oklahoma, Norman, OK 73019, USA

² H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

* Authors to whom correspondence should be addressed.

Energies 2024, 17(11), 2655; https://doi.org/10.3390/en17112655

Submission received: 2 May 2024 / Revised: 15 May 2024 / Accepted: 28 May 2024 / Published: 30 May 2024

(This article belongs to the Special Issue Recent Advances in Oil and Gas Recovery and Production Optimisation)


Abstract

Artificial lift systems, such as electrical submersible pumps and sucker rod pumps, frequently encounter operational challenges due to high gas–oil ratios, leading to premature tool failure and increased downtime. Effective upstream gas separation is critical to maintain continuous operation. This study aims to predict the efficiency of a downhole gas separator using machine learning models trained on data from a centrifugal separator and tested on data from a gravity separator (blind test). A comprehensive experimental setup included a multiphase flow system with horizontal (31 ft. (9.4 m)) and vertical (27 ft. (8.2 m)) sections to facilitate the tests. Seven regression models (multilinear regression, random forest, support vector machine, ridge, lasso, k-nearest neighbor, and XGBoost) were evaluated using performance metrics such as RMSE, MAPE, and R-squared. In-depth exploratory data analysis and data preprocessing identified inlet liquid and gas volume flows as key predictors for gas volume flow per minute at the outlet (GVFO). Among the models, random forest was the most effective, exhibiting an R-squared of 96% and an RMSE of 112. This model, followed by KNN, showed great promise in accurately predicting gas separation efficiency, aided by rigorous hyperparameter tuning and cross-validation to prevent overfitting. This research offers a robust machine learning workflow for predicting gas separation efficiency across different types of downhole gas separators, providing valuable insights for optimizing the performance of artificial lift systems.

Keywords: downhole separator; machine learning; efficiency; multiphase flow

1. Introduction

At a certain stage in the production life, artificial lift or tertiary recovery systems become inevitable for oil and gas wells worldwide. The main target for every operating company, from the exploration to the development phase, is to optimize the drilling and production of a well. When the natural reservoir pressure declines, artificial lift equipment is installed within the primary recovery stage to extract hydrocarbons from the reservoir. Artificial lift systems play a pivotal role in ensuring the continued extraction of hydrocarbons from reservoirs that have undergone pressure depletion. Various forms of artificial lift equipment, such as sucker rod pumps (SRPs) or beam pumps, electrical submersible pumps (ESPs), plunger lifts, and progressive cavity pumps (PCPs), are selected based on specific operating conditions. The performance of these technologies is significantly impacted by the presence of gas in the production fluid, which can diminish the efficiency of the equipment and lead to increased non-productive time (NPT). A downhole gas separator is a device designed to separate gas from the liquid–gas mixture stream within the subsurface environment. It effectively channels gas into the casing–tubing annulus to be brought to the surface, while directing the liquid into the pump and up through the tubing string. By segregating gas from liquids, the separator enhances the effectiveness of artificial lift systems, boosts the productivity of the wellbore, and minimizes non-productive time [1,2].

Gas separator technology initially started as a water management tool and, with time, advanced significantly. Ref. [3] mentioned that gas separators, also known as gas anchors, were used to separate oil and gas from the produced water downhole, discharging the oil and gas mixture into the fluid stream and injecting produced water into the disposal zones. Hydrocyclone and gravity processes were used to segregate fluids in these separators. Also, if separating gas in the downhole is challenging, less efficient methods like hydraulic pump and gas lift methods could be used, rather than electrical submersible pump (ESP) of high flowrate capacity. Ref. [4] mentioned that gas separators expand the operating window for ESP, especially in high gas–oil ratio (GOR) wells. Natural gas separators, poor boy separators, modified poor boy separators, packer-type separators, and special separators are some of the most common downhole separators used in oil and gas wells. The oldest and most efficient separators are natural gas separators, and further research was conducted by modifying their shape and evaluating their separation efficiency [5,6,7,8]. Almost all types of separators work on the principle of gravity. A major disadvantage of gravitational separators is the time consumed during the separation. Modifying the design of separators, for instance by adding a centrifugal section in the separator that uses centrifugal force, enhances and accelerates the process. Stokes’ law governs the physical mechanism of gas–liquid separation [9]. This law describes how a particle (or bubble) travels through a fluid, depending on its size, viscosity, and density. Centrifugal separators force an inflowing stream into a swirling motion to create a vortex-type phenomenon. This movement can be obtained by abruptly changing the direction of the fluid tangentially or by using helical elements or designs such as augers [9].
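For reference, the form of Stokes' law usually quoted for a small spherical bubble rising through a liquid is sketched below. The symbols are assumptions introduced here for illustration and do not appear in the original text: $v_t$ is the terminal rise velocity, $d$ the bubble diameter, $\rho_l$ and $\rho_g$ the liquid and gas densities, $\mu_l$ the liquid viscosity, $g$ the gravitational acceleration, and $a_c$ the centripetal acceleration of a swirling flow with tangential velocity $v_\theta$ at radius $r$. The expression holds only in the creeping-flow regime (small bubble Reynolds number) and treats the bubble as a rigid sphere.

```latex
\[
v_t = \frac{(\rho_l - \rho_g)\, g\, d^{2}}{18\, \mu_l},
\qquad
a_c = \frac{v_\theta^{2}}{r}
\]
```

Replacing $g$ with $a_c$ gives the corresponding radial drift velocity in a swirling flow, which is one way to see why a centrifugal section can shorten the separation time relative to gravity alone.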

In the past, different types of downhole separators have been experimentally investigated. Ref. [10] presented a detailed laboratory investigation into the efficiency of different downhole gas separator designs, challenging common industry assumptions with experimental data. The study highlights the importance of separator entry port placement, dip tube length, and the innovative use of centrifugal forces in enhancing gas–liquid separation efficiency [10]. Similarly, Refs. [1,2] conducted a study on the effectiveness of a prototype centrifugal packer-type downhole separator, examining its performance under different liquid and gas flow rates. Through laboratory experiments using a multiphase flow setup, it was discovered that the separator’s average gas separation efficiency was notably high, particularly at medium liquid flow rates [1,2]. Ref. [11] conducted multiple tests to evaluate a newly designed centrifugal packer-type downhole separator for artificial lift systems in oil and gas wells. The research found the separator consistently achieved over 90% liquid separation efficiency across various conditions, although efficiency slightly dropped at higher liquid rates. This highlights the separator’s potential to significantly reduce gas-related issues in pumping systems, marking a notable advancement in artificial lift technologies [11]. Ref. [12] conducted a comparative analysis between centrifugal and gravitational downhole separators, evaluating the performance of each through over 150 tests for centrifugal separators and 55 tests for gravitational separators. Their findings indicated that centrifugal separators outperformed gravitational separators in terms of separation efficiency. The research aimed to assess the impact of flow rates on the performance and stability of a newly developed packer-type centrifugal separator, contrasting it with that of a gravitational separator [12]. Ref. [13] developed and validated a numerical model to evaluate a hybrid separator combined with a piston pump, utilizing two-phase turbulent CFD analysis. The study examines the impact of piston speed and gas/liquid ratio on the separator’s performance, highlighting that higher piston speeds decrease efficiency due to greater pressure drops. It also introduces a correlation for predicting separator efficiency, providing key insights for optimizing downhole separator design and operation [13]. Ref. [14] reviewed advancements, limitations, and applications of downhole liquid–gas separators in oil and gas operations, emphasizing their importance in enhancing artificial lift systems. The study provides insights into optimizing separator designs for improved performance in unconventional wells, underlining the need for further research to address current challenges and enhance separator efficiency [14].

Machine learning (ML) has previously been applied to drilling, reservoir, and production data to enhance the drilling and production efficiency of oil and gas wells. However, only two past works have studied the gas separation efficiency of downhole separators using ML. Ref. [15] noted that Stokes’ law outlines the behavior of a particle (or bubble) moving through a fluid, influenced by its size, viscosity, and density, and that centrifugal separators utilize this principle by inducing a swirling motion in an incoming flow to create a vortex effect. This action is achieved by either abruptly altering the fluid’s direction tangentially or employing helical structures [15]. Operating on a basis similar to gravitational separators, centrifugal separators leverage the difference in densities between gas and liquid. The centrifugal force propels the denser liquid toward the vortex’s outer edge, while the gas remains closer to the center [15]. The study in Ref. [16] advances the understanding of downhole centrifugal separators in oil and gas extraction by leveraging ML and dimensional analysis to pinpoint critical efficiency factors. The research, grounded on a comprehensive database from prior studies, shows that separators can reach over 79% efficiency under certain conditions, though this declines with higher fluid rates. It introduces a predictive model emphasizing the importance of the Weber and gas Reynolds numbers, offering valuable insights for enhancing artificial lift system designs [15,17].

In the past, machine learning has been applied to predict the separation efficiency of downhole gas separators in a limited number of studies using multiple independent variables. However, a supervised machine learning prediction of downhole gas separator efficiency that uses only two independent variables and is validated with a blind test has not previously been reported. In this study, a model was trained using experimental test data from a centrifugal-type downhole separator and then tested on experimental test data from a gravity-based downhole gas separator, in what is referred to as a blind test. To quantify the separators’ efficiency, the gas volume flow per minute at the outlet was predicted based on the liquid and gas volume flows at the inlet, from which the gas separation efficiency was subsequently calculated. The formula to calculate gas separation efficiency is shown below.

$$\text{Measured gas separation efficiency (fraction)} = \frac{\text{Gas volume flow per minute at outlet}}{\text{Gas volume flow per minute at inlet}} \tag{1}$$
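Equation (1) is a simple ratio, so it can be evaluated directly once the outlet and inlet gas volume flows are known. The sketch below is illustrative only: the function name and the example readings are hypothetical and are not taken from the experimental dataset.

```python
def gas_separation_efficiency(gvfo: float, gvfi: float) -> float:
    """Equation (1): outlet gas volume flow divided by inlet gas volume flow.

    Both arguments must be in the same volume-per-minute units.
    """
    if gvfi <= 0:
        raise ValueError("inlet gas volume flow must be positive")
    return gvfo / gvfi


# Hypothetical readings (not from the paper's dataset).
example_efficiency = gas_separation_efficiency(gvfo=850.0, gvfi=1000.0)
```

In the workflow described here, the outlet flow passed to such a function would be the model's predicted GVFO rather than a measured value.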

2. Data Collection

This study utilizes data from experiments conducted to test the separation efficiency of both centrifugal and gravity-based downhole gas separators [1,2,11,12,18]. Refs. [1,2,12] conducted experiments on the centrifugal gas separator, while Ref. [11] conducted experiments on the gravity-based downhole gas separator. In total, 260 tests were performed to assess the separation efficiency of the centrifugal separator, and 55 tests were conducted for the gravity-based separator, using the multiphase flow facility described in the next section. Consequently, the 260 centrifugal separator tests were used for model training, while the 55 gravity-based separator tests were used for model testing and analysis. The main distinction between centrifugal and gravity-based separators lies in the spiral section present in the centrifugal separator. Additionally, the gravity-based separator measured 10 inches in length, compared to the 15-inch length of the centrifugal separator.

3. Facility Design to Test Downhole Separator

A comprehensive multiphase flow system was constructed for this purpose, extending 31 feet in length horizontally and rising 27 feet vertically. The centrifugal downhole gas separator was provided by Echometer Company, located in Texas, United States. As depicted in Figure 1, the system encompasses the gas and water inlet lines, horizontal and vertical sections, and both tubing and casing return lines, comprising the full multiphase flow setup. Instrumentation within the configuration includes five Coriolis flowmeters, five control valves, and eight pressure transducers, supplemented by various manual pressure gauges and two differential pressure gauges. These instruments are all connected to a central data acquisition system (DAQ) for real-time monitoring and control of the flow parameters, with the DAQ cards being operated via Visual Basic software. In each experiment, the setup, featuring over 20 pieces of equipment, captures more than 30 different variables, such as pressure, temperature, density, and flow rates. As detailed in Figure 1, the tubing return line (TRL) connects to the tubing flow path, while the casing return line (CRL) links to the casing–tubing annulus flow path. Through the forces of gravity and centrifugal motion, the downhole gas separator segregates the liquid and gas within the vertical section, directing the liquid or water through the tubing to the TRL and the gas through the annulus to the CRL. Consequently, the flowmeter at the top of the CRL measures the gas, and the flowmeter at the bottom of the TRL measures the liquid or water. The flowmeter installed at the top of TRL measures gas flow, if there is any gas flow from the tubing. The gas volume flow per minute at the outlet (GVFO), which is the CRL, is utilized to forecast the efficiency of the downhole separator, calculated based on the liquid and gas volume flow per minute at the inlet (LVFI and GVFI) located at the T-section.

The gas separator provided by Echometer Company is shown in Figure 2, showcasing a design concept for a static centrifugal packer-type downhole separator. This separator is particularly beneficial in wells with high gas-to-oil ratios, effectively addressing issues such as diminished artificial lift performance and gas lock. A key advantage of this gas separator is its lack of moving parts (static), which significantly reduces maintenance requirements and can help operating companies minimize non-productive time. The separator’s design allows for the entry of the gas–liquid mixture through two ports, depicted in green and red in Figure 2. The rotational motion generated by the spiral section, coupled with the centrifugal force at the outlet, forces liquid droplets to impinge against the casing’s inner wall, thereby enhancing liquid and gas separation at the outlet. The separated liquid, under gravity, descends into the casing–tubing annulus and is channeled into the separator’s shroud, subsequently ascending through the tubing-shroud annulus into the two intake ports.

4. Methodology

This section outlines the development and application of various ML algorithms, including multiple linear regression (MLR), k-nearest neighbors (KNN), random forest regression (RFR), support vector machine (SVM), ridge, lasso, and XGBoost, with the objective of forecasting the gas separation efficiency in downhole separators. The workflow is visually represented in Figure 3, which illustrates the comprehensive steps involved: data collection, data processing, model training and testing, followed by efficient model selection. The comparative analysis of all regression models is performed based on the R-squared and error metrics as mentioned in [19,20,21] for efficient model selection.

The experimental setup, comprising over 20 different pieces of apparatus, records more than 30 distinct variables such as pressure, temperature, and flow rates in each test run. Given the extensive nature of this dataset, it is crucial to judiciously select independent variables that predict the separator’s efficiency. This selection process relies on methods such as scatter and correlation plotting, alongside multicollinearity analysis. Further elaboration on input parameter selection and multicollinearity assessment is provided in subsequent sections. Prior to being included in model training and testing, the selected independent variables undergo scaling and outlier removal. The dataset used for training and testing the ML models comprises 260 rows of data collected from experimental testing of centrifugal separator efficiency, and 55 rows of data collected from experimental testing of gravity-based downhole separator efficiency, respectively.

In this research, the effectiveness of the models was compared and assessed using metrics such as the coefficient of determination (R-squared), root mean square error (RMSE), and mean absolute percentage error (MAPE). Models with low R-squared values and high error rates were eliminated. The model demonstrating the highest R-squared and the lowest error rates was ultimately selected as the most reliable for predicting the efficiency of the downhole separator. This approach not only ensures accuracy but also contributes to optimizing the separator’s performance in practical scenarios.
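The three metrics named above can be written out in a few lines. This is a minimal sketch using the common textbook definitions; the paper does not state which exact formulas or library routines were used (for example, R-squared is taken here as one minus the ratio of residual to total sum of squares, which is an assumption), and the sample values are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, in the units of the target variable."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted GVFO values.
measured = [100.0, 200.0, 300.0, 400.0]
estimated = [110.0, 190.0, 310.0, 390.0]
metrics = (rmse(measured, estimated), mape(measured, estimated), r_squared(measured, estimated))
```

RMSE is scale-dependent while MAPE and R-squared are not, which is why the three are reported together when comparing models.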

Data Preprocessing and Feature Engineering

In this study, two critical independent variables were identified for predicting the gas volume flow per minute at the outlet (GVFO): the gas volume flow per minute (GVFI) and the liquid volume flow per minute (LVFI), both measured at the inlet. A thorough analysis of the experimental data was conducted, including examination for multicollinearity, as discussed later in this section. This analysis determined that GVFI and LVFI were the most impactful predictors, ensuring that the final model was not adversely affected by collinearity between variables.

To construct the most accurate model for predicting GVFO, seven regression algorithms were incorporated for analysis: KNN, lasso, MLR, RFR, ridge, SVM, and XGBoost regression. We conducted hyperparameter tuning for each model using a grid search method combined with cross-validation. Specifically, for the random forest model, we optimized parameters such as the number of trees (ntree = 500), the “mtry” parameter (the number of variables randomly sampled as candidates at each split), through which the maximum depth is controlled indirectly, the minimum samples split, and the minimum samples leaf. The tuning process involved a five-fold cross-validation on the training dataset to identify the optimal hyperparameters that minimize prediction error. Each model’s performance was meticulously evaluated based on its R-squared value and error metrics. The selection process was data-driven, with a clear focus on minimizing prediction errors. The optimal model, as depicted in Figure 4, was chosen for its superior performance, marked by the lowest error values among the contenders; MLR is considered the base case in the comparative analysis. This rigorous selection methodology ensures that the implemented model for downhole separator efficiency analysis is robust and reliable.
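The grid search with five-fold cross-validation described above can be sketched in a few dozen lines. To keep the example dependency-free, it tunes the neighbor count of a simple KNN regressor rather than the random forest parameters named in the text; the data are synthetic, and every function name, grid value, and seed is an assumption made for illustration, not the authors' implementation.

```python
import random


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k nearest training points (Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return sum(train_y[i] for i in order[:k]) / k


def cv_rmse(x, y, k, n_folds=5, seed=0):
    """RMSE pooled over all held-out points of an n_folds cross-validation."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    squared_error = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        for i in held_out:
            squared_error += (knn_predict(train_x, train_y, x[i], k) - y[i]) ** 2
    return (squared_error / len(x)) ** 0.5


def grid_search(x, y, k_grid):
    """Score every candidate k and return the one with the lowest CV error."""
    scores = {k: cv_rmse(x, y, k) for k in k_grid}
    return min(scores, key=scores.get), scores


# Synthetic two-feature data standing in for scaled (GVFI, LVFI) inputs.
rng = random.Random(42)
features = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(60)]
targets = [3.0 * a + 0.5 * b + rng.gauss(0, 0.05) for a, b in features]
best_k, cv_scores = grid_search(features, targets, k_grid=[1, 3, 5, 7, 9])
```

The same loop structure applies to any model: only the set of candidate hyperparameters and the fit/predict step change.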

The preprocessing of the data, including the selection of relevant parameters, involved both univariate and bivariate analyses. Box plots were used for each parameter to detect outliers, and the analysis concluded that the experimental data did not contain any outlier values. Similarly, bivariate analysis was conducted to understand the relationship between the dependent variable, GVFO, and the available independent parameters, with the findings shown in Figure 5. According to Figure 5, there is an observable trend where GVFO tends to increase in conjunction with rises in gas–liquid ratio at inlet (IGLR), control valve opening percentage installed at gas inlet line (CVGIL), and casing control valve (CCV), while it does not change much with increases in average inlet gas flow rate (AIGF) and average inlet liquid flow rate (AILF). To refine the model, further exploration is necessary to examine the correlations between the response variable and independent parameters, which will aid in selecting the optimal set of independent variables for accurate GVFO prediction.
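A box plot flags as outliers the points lying beyond its whiskers. The sketch below applies the conventional 1.5 × IQR whisker rule; the paper does not state which whisker length or quartile method was used, so both are assumptions here, and the sample values are hypothetical.

```python
import statistics


def iqr_outliers(values, whisker=1.5):
    """Return the values lying outside the box-plot whiskers (1.5 x IQR by default)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical readings for a single parameter; 50 is far from the rest.
flagged = iqr_outliers([10, 12, 11, 13, 12, 11, 50])
```

An empty result for every parameter corresponds to the finding reported above that the experimental data contained no outliers.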

Figure 6 presents a correlation matrix that visually maps out the relationships between the dependent variable, gas volume flow at the outlet (GVFO), and various independent variables, alongside the intercorrelations among the independent variables themselves. The visualization reveals a high correlation of GVFO with parameters such as CVGIL, gas volume flow per minute at inlet (GVFI), and gas volume flow at tubing return line (GVFTRL), indicating strong predictive relationships. Conversely, GVFO is moderately associated with CCV, IGLR, and gas–liquid ratio at outlet (OGLR), while sharing a low degree of correlation with AILF, AIGF, liquid volume flow per minute at inlet (LVFI), and liquid volume flow at tubing return line (LVFTRL). This graphical representation also indicates significant correlations between several independent variables, an observation that warrants a more detailed analysis. To ensure the integrity of the regression model and to mitigate the risk of multicollinearity, a subsequent application of the variance inflation factor (VIF) is recommended for the meticulous selection of independent variables. By doing so, the analysis will eliminate redundant variables, ensuring that the model’s predictive accuracy is not compromised by highly interdependent predictors.

Examining Figure 6 underscores the importance of assessing multicollinearity to understand the interdependencies among chosen predictor variables. The variance inflation factor (VIF) was utilized to evaluate the degree of correlation between the independent variables within the regression analysis. The VIF essentially quantifies the extent to which multicollinearity has amplified the variance of an estimated regression coefficient. Referring to Table 1, it is evident that the VIF scores for the regression model are consistent, reflecting the uniformity of input variables used in the estimation of GVFO. A VIF of 1 implies the absence of correlation, values ranging from 1 to 5 denote a moderate level of correlation, and values exceeding 5 reveal a strong correlation amongst the predictors. The two selected independent variables have moderate VIF values, indicating a moderate level of correlation.
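For a predictor j, the VIF is 1 / (1 − R_j²), where R_j² comes from regressing that predictor on all the others. With only two predictors, R_j² reduces to the squared Pearson correlation between them, so both predictors share the same VIF. The sketch below uses that two-predictor shortcut; the function names and sample values are hypothetical and are not the values in Table 1.

```python
def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def vif_two_predictors(a, b):
    """VIF shared by both predictors in a two-predictor regression: 1 / (1 - r^2)."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r * r)


# Hypothetical inlet gas and liquid volume flows with correlation r = 0.8.
example_vif = vif_two_predictors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

With more than two predictors, each R_j² would instead come from a full auxiliary regression, which is what standard VIF routines compute.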

Following a detailed examination of scatter plots and correlation matrices, along with the implementation of the stepAIC function, the variables GVFI and LVFI were chosen as the input parameters. To facilitate a faster convergence and enhance the efficiency of the machine learning algorithms, these variables underwent a scaling process to ensure uniformity in their range. Subsequently, the dataset from experimental testing of centrifugal separator was used for model training and the dataset from experimental testing of gravity-based separator (data not used in model training) was used for model testing. This blind test was essential to check how well the model predicts and how it performs on unseen data.
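In a blind test, the scaling parameters are normally derived from the training data alone and then reused unchanged on the unseen data. The paper does not say which scaling method was applied; the sketch below assumes min-max scaling, and the function names and sample rows are hypothetical.

```python
def fit_min_max(rows):
    """Learn per-column (min, max) bounds from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows, bounds):
    """Scale each column with previously learned bounds (constant columns map to 0)."""
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, bounds))
        for row in rows
    ]


# Hypothetical (GVFI, LVFI) rows: training set, then an unseen blind-test row.
train_rows = [(100.0, 10.0), (300.0, 30.0), (200.0, 20.0)]
blind_rows = [(150.0, 40.0)]

bounds = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, bounds)
blind_scaled = apply_min_max(blind_rows, bounds)
```

Scaled blind-test values may fall outside the 0 to 1 range when the second separator was operated beyond the training envelope, which is worth checking before interpreting predictions.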

5. Results

The research conducts seven separate regression analyses using a suite of methods that include k-nearest neighbors (KNN), lasso, multiple linear regression (MLR), random forest regression (RFR), ridge, support vector machine (SVM), and XGBoost. These techniques aim to forecast the dependent variable, GVFO. The regression models are trained using centrifugal separator data and tested using gravity-based separator experimental data. The selection of the superior model from the seven was informed by its low error rates and high R-squared value, as detailed in Table 2. This table presents the R-squared, root mean square error (RMSE), and mean absolute percentage error (MAPE) for each model for the prediction of GVFO using the testing dataset (unseen dataset). R-squared is employed to evaluate how well the regression models explain the variance in GVFO, with a range from 0 to 100% providing a direct measure of the correlation between dependent and independent variables. Meanwhile, RMSE offers insight into the models’ predictive precision, with lower values indicating predictions more closely reflecting actual outcomes. The analysis also involves a qualitative review, examining scatterplots that show predicted values against actual data, as will be elaborated in subsequent sections. These plots reveal the explanatory strength of each model, with a tighter clustering of points around the regression line indicative of lower variance. Additionally, MAPE is utilized to gauge the accuracy of predictions, with smaller MAPE values pointing to more precise models. MLR serves as the benchmark for comparison, illustrating the relationship between predicted and observed data, though it is not graphically represented in the forthcoming sections.

5.1. KNN Regression Model

Figure 7 offers a visual assessment of the predictive capabilities of the KNN model by showing predicted values against actual measurements of the gas volume flow at the annulus outlet (GVFO). The scatterplot (Figure 7) demonstrates the correlation between the predicted and actual GVFO, with the x-axis denoting the actual measurements and the y-axis depicting the predicted values. The KNN model’s predictive strength is highlighted by its R-squared value of 96.6%, indicating that a substantial portion of GVFO variability is captured by the model. With RMSE and MAPE scores standing at 130 and 8, respectively, the KNN model’s predictions exhibit notable reliability, specifically when compared to the MLR, SVM, and XGBoost models. Figure 7 further indicates the KNN model’s robustness; the proximity of the predicted points to the regression line suggests a commendable degree of accuracy, surpassing that of the MLR, lasso, ridge, and SVM methods. While the KNN model demonstrates a strong predictive performance, it is the second most effective model among all those evaluated, suggesting there is room for improvement or potential advantages to be gained from other models in certain aspects of prediction for GVFO.

5.2. Lasso Regression Model

Figure 8 visually represents the lasso model’s predictive capability in estimating GVFO, showcasing the plotted predicted values against actual measurements. This scatterplot illustrates the relationship between predicted and measured GVFO values, with the x-axis indicating measured values and the y-axis displaying predicted ones. The model’s accuracy is highlighted by an R-squared value of 92.1%, indicating its ability to capture a reasonable amount of variability in GVFO, albeit lower than some other models. However, its precision is somewhat compromised, as reflected by RMSE and MAPE values of 199 and 7, respectively, suggesting a relatively lower level of reliability compared to other models considered. Based on these metrics, the lasso model appears to be less effective compared to the other models incorporated for predicting GVFO.

5.3. Random Forest Model

Figure 9 demonstrates the predictive capability of the RFR model in estimating GVFO by contrasting its predictions with actual data. The scatterplot illustrates the relationship between predicted and measured GVFO values, with actual values represented on the x-axis and predictions on the y-axis. The RFR model excels in predicting GVFO, as evidenced by its R-squared value of 95.9%, indicating its ability to explain a significant portion of the variability in GVFO. The model demonstrates reliability, with an RMSE of 112, the lowest among all incorporated models, and a MAPE of 8. Moreover, Figure 9 illustrates that the RFR model’s predictions closely align with the line of best fit, showcasing its high prediction accuracy compared to all other models. Overall, the RFR model ranks highest in its ability to predict GVFO among all seven regression models.

5.4. Ridge Regression Model

In Figure 10, the predicted data are visually compared against the observed measurements for the ridge regression model, focusing on the GVFO response variable. The model demonstrates an R-squared value of 94.1%, indicating that it performs slightly better than the MLR, lasso, and XGBoost models in capturing the variance within the GVFO data. However, the model’s reliability is compromised, as evidenced by an RMSE of 175 and a MAPE of 10, which are less favorable compared to the outcomes from other models. Based on these error metrics and the quality of predictions, the ridge regression model does not rank among the top predictive models evaluated in this study.

5.5. Support Vector Machine Model

Figure 11 displays the comparison between predicted and actual GVFO measurements using the SVM regression model. The scatterplot showcases the model’s accuracy in explaining GVFO variability, evidenced by an R-squared value of 92.4%. This value indicates that the SVM model performs more effectively than the MLR model in capturing the changes in GVFO data. With an RMSE of 140 and a MAPE of 10, the SVM regression model’s results demonstrate moderate reliability compared to other models evaluated in the study. While the SVM model shows improvement over the MLR model, it does not outperform the random forest and KNN regression models.

5.6. XGBoost Regression Model

Figure 12 presents the predicted versus actual data for the GVFO, as estimated by the XGBoost regression model. This model’s ability to account for fluctuations in GVFO is demonstrated by its R-squared value of 93.9%. The data in Figure 12 imply that XGBoost regression is more effective in modeling GVFO variation compared to the MLR and SVM models. The model shows a moderate level of dependability, evidenced by an RMSE of 175 and a MAPE of 13. Although the XGBoost model exhibits enhancements in prediction accuracy and error rates over MLR and lasso, it does not surpass the other models evaluated in this research.

6. Discussion

The results above demonstrate that machine learning techniques can forecast the gas volume flow per minute at the outlet (GVFO), a key performance indicator for the liquid–gas separation efficiency of downhole centrifugal separators. Table 2 compares all seven models on R-squared, RMSE, and MAPE. The random forest regression (RFR) and KNN models are the two strongest, with R-squared values of 95.9% and 96.6%, respectively, while multilinear regression registers the lowest R-squared (90.2%) and the highest RMSE (201.3). Figure 13 shows the selection criteria for the optimal model: the lowest error together with the highest R-squared for GVFO prediction. For this figure, the R-squared and RMSE values were each normalized by their maximum value across the models. The RFR model achieves the lowest RMSE on the testing dataset (blind test), 112.1, with an R-squared within one percentage point of the highest value, and is therefore identified as the most precise model. Its performance is attributed to its ability to handle complex interactions between variables and to mitigate overfitting through bagging. It is noteworthy that the support vector machine algorithm, often associated with classification tasks, predicts GVFO with a higher R-squared and a lower RMSE than the multilinear and lasso regression models. Considering the metrics collectively, the random forest regression model is the preferred choice for GVFO prediction: its high R-squared and minimal RMSE indicate that it can predict GVFO from experimental data with small error, which supports its use in the analysis of downhole separator efficiency.
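The normalization used for the model comparison can be reproduced from the Table 2 values, as in the short sketch below. The dictionary layout and variable names are illustrative choices, not the authors' code; only the numbers come from Table 2.

```python
# (R-squared in %, RMSE) per model, copied from Table 2.
table2 = {
    "KNN": (96.6, 130.1),
    "Lasso": (92.1, 199.2),
    "Multilinear": (90.2, 201.3),
    "Random Forest": (95.9, 112.1),
    "Ridge": (94.1, 174.6),
    "Support Vector Machine": (92.4, 140.4),
    "XGBoost": (93.9, 175.2),
}

# Normalize each metric by its maximum across the seven models.
max_r2 = max(r2 for r2, _ in table2.values())
max_rmse = max(rmse for _, rmse in table2.values())
normalized = {
    name: (r2 / max_r2, rmse / max_rmse) for name, (r2, rmse) in table2.items()
}

# Apply the two selection criteria separately.
best_r2 = max(table2, key=lambda name: table2[name][0])
best_rmse = min(table2, key=lambda name: table2[name][1])

# Print the models ordered by normalized RMSE, lowest first.
for name, (r2_n, rmse_n) in sorted(normalized.items(), key=lambda item: item[1][1]):
    print(f"{name:<24}{r2_n:6.3f}{rmse_n:7.3f}")
print("Highest R-squared:", best_r2)
print("Lowest RMSE:", best_rmse)
```

On these values the random forest model has the lowest RMSE and the KNN model the marginally higher R-squared, so the two criteria single out the same two models that the text identifies as the strongest.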

7. Conclusions

This investigation aimed to forecast the gas volume flow per minute at the outlet (GVFO) for gravity-based separators using experimental test data that measured the efficiency of centrifugal downhole separators. Seven regression techniques were developed and evaluated: k-nearest neighbors, lasso, multilinear, random forest, ridge, support vector machine, and XGBoost. The models used gas volume flow per minute at the inlet (GVFI) and liquid volume flow per minute at the inlet (LVFI) as input parameters, selected after data preprocessing and feature engineering. Their performance was quantified by comparing R-squared and error statistics. The R-squared values for the k-nearest neighbors, lasso, multilinear, random forest, ridge, support vector machine, and XGBoost models were 96.6%, 92.1%, 90.2%, 95.9%, 94.1%, 92.4%, and 93.9%, respectively, with corresponding RMSE scores of 130, 199, 201, 112, 175, 140, and 175. The random forest regression model was the most precise in predicting GVFO, with the lowest RMSE, closely followed by the KNN regression model. The findings support the use of random forest regression for predicting the efficiency of downhole gas separators: the model accounts for 95.9% of the variability in the gas efficiency data, with a prediction error (MAPE) of 7.5%. The random forest model’s robustness to overfitting, ability to handle non-linear relationships, and effectiveness in variable importance assessment are key factors contributing to its performance. The primary difference between centrifugal and gravity-based separators is the spiral section found exclusively in the centrifugal separator. In addition, the gravity-based separator is 10 inches long, whereas the centrifugal separator is 15 inches long.
The seven regression models were developed using experimental test data collected to evaluate the separation efficiency of centrifugal downhole separators, while blind testing used experimental data collected to evaluate the separation efficiency of gravity-based separators. The model testing indicated that changes in the shape and size of the separator did not significantly affect the prediction quality of the models. These insights from machine learning will aid in optimizing production using artificial lift methods.
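The train-on-one-separator, blind-test-on-another workflow can be illustrated with a small, dependency-free sketch. It uses a hand-written k-nearest-neighbors regressor only because that model fits in a few lines; it is not the authors' implementation. The data are synthetic, and the relation in `toy_gvfo`, the 5% offset for the second separator, and all names are assumptions made purely for illustration.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    nearest = sorted(
        zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query)
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)

def toy_gvfo(gvfi, lvfi):
    """Hypothetical GVFO relation used only to generate synthetic data."""
    return 0.9 * gvfi - 20.0 * lvfi

# Synthetic "training separator" data: (GVFI, LVFI) pairs and their GVFO.
train_X = [(float(g), l / 10) for g in range(100, 1100, 100) for l in range(1, 6)]
train_y = [toy_gvfo(g, l) for g, l in train_X]

# Synthetic "blind test separator" data: same inputs, slightly different response.
test_X = [(150.0, 0.25), (450.0, 0.35), (850.0, 0.15)]
test_y = [0.95 * toy_gvfo(g, l) for g, l in test_X]

# The model never sees the blind-test targets during prediction.
preds = [knn_predict(train_X, train_y, x, k=4) for x in test_X]
blind_rmse = math.sqrt(
    sum((p - t) ** 2 for p, t in zip(preds, test_y)) / len(test_y)
)
print(f"Blind-test RMSE on synthetic data: {blind_rmse:.1f}")
```

In practice the inputs would usually be scaled before computing distances, because GVFI and LVFI are measured in different units; that step is omitted here to keep the sketch short.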

Author Contributions

Conceptualization, A.S., I.G., and T.B.; data curation, A.S.; methodology, A.S., I.G., and T.B.; formal analysis, A.S.; investigation, A.S.; project administration, H.K.; software, A.S.; supervision, H.K.; visualization, A.S., L.C.O.O., N.Y., I.G., T.B., and N.K.; writing—original draft, A.S.; writing—review and editing, L.C.O.O., N.Y., I.G., T.B., N.K., and H.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data will be made available on specific user request.

Acknowledgments

The authors would like to thank Echometer Company for providing the downhole separator for testing. The authors would also like to thank Michael Olubode for sharing his experimental data.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AIGF	Average inlet gas flowrate (scf/min)
AILF	Average inlet liquid flowrate (bbl/min)
CCV	Casing control valve opening percentage (%)
CRL	Casing return line
CVGIL	Control valve opening percentage installed at gas inlet line (%)
DAQ	Data acquisition
ESP	Electrical submersible pump
GVFI	Gas volume flow per minute at inlet (ft³)
GVFO	Gas volume flow per minute at outlet (ft³)
GVFTRL	Gas volume flow at tubing return line (ft³)
IGLR	Gas–liquid ratio at inlet (scf/stb)
KNN	K-nearest neighbors regression
LVFI	Liquid volume flow per minute at inlet (bbl)
LVFTRL	Liquid volume flow per minute at tubing return line (bbl)
MAPE	Mean absolute percentage error
ML	Machine learning
MLR	Multilinear regression
MSE	Mean square error
OGLR	Gas–liquid ratio at outlet (scf/stb)
PCP	Progressive cavity pump
R-squared	Coefficient of determination (%)
RFR	Random forest regression
RMSE	Root mean square error
SRP	Sucker rod pump
stepAIC	Stepwise Akaike information criterion
SVM	Support vector machine regression
TRL	Tubing return line
VIF	Variance inflation factor

References

Sharma, A. Experimental Evaluation of a Centrifugal Packer-Type Downhole Separator; University of Oklahoma: Norman, OK, USA, 2019; Available online: https://hdl.handle.net/11244/323223 (accessed on 15 May 2024).
Sharma, A.; Iradukunda, P.; Karami, H.; McCoy, J.N.; Podio, A.L.; Teodoriu, C. Experimental Evaluation of a Prototype Centrifugal Packer-Type Downhole Separator. In Proceedings of the SPE Artificial Lift Conference and Exhibition—Americas 2020, Virtual, 10–12 November 2020.
Gao, C.; Rivero, M.; Nakagawa, E.; Sanchez, G. Downhole Separation Technology—Past, Present and Future. APPEA J. 2007, 47, 283.
Kobylinski, L.S.; Taylor, F.T.; Brienan, J.W. Development and Field Test Results of an Efficient Downhole Centrifugal Gas Separator. JPT J. Pet. Technol. 1985, 37, 1295–1304.
McCoy, J.N.; Podio, A.L. An Improved Downhole Gas Separator. In Proceedings of the Annual Technical Meeting 1999, ATM 1999, Monterey, CA, USA, 6–11 June 1999.
McCoy, J.M.; Patterson, J.; Podio, A.L. Downhole Gas Separators—A Laboratory and Field Study. J. Can. Pet. Technol. 2007, 46, 48–54.
McCoy, J.N.; Podio, A.L.; Rowlan, O.L.; Becker, D. Evaluation and Performance of Packer-Type Downhole Gas Separators. SPE Prod. Oper. 2015, 30, 236–242.
McCoy, J.N.; Rowlan, O.L.; Company, D.B.E.; Podio, A.L. Optimizing Downhole Packer-Type Separators. In Proceedings of the 2013 Southwestern Petroleum Short Course, Lubbock, TX, USA, 20–21 April 2011.
Ogunsina, O.O.; Wiggins, M.L. A Review of Downhole Separation Technology. In Proceedings of the Society of Petroleum Engineers—SPE Production Operations Symposium 2005, POS 2005, Oklahoma City, OK, USA, 16–19 April 2005; pp. 1–8.
Bohorquez, R.; Ananaba, V.; Alabi, O.; Podio, A.L.; Lisigurski, O.; Guzman, M. Laboratory Testing of Downhole Gas Separators. SPE Prod. Oper. 2009, 24, 499–509.
Olubode, M.; Osorio, L.; Karami, H.; McCoy, J.; Podio, T. Experimental Comparison of Two Downhole Separators in Boosting Artificial Lift Performance. In Proceedings of the Society of Petroleum Engineers—SPE Artificial Lift Conference and Exhibition—Americas 2022, ALCE 2022, Galveston, TX, USA, 23–25 August 2022.
Olubode, M.O.; Iradukunda, P.; Karami, H.; Podio, T.; McCoy, J.N. Experimental Analysis of Centrifugal Downhole Separators in Boosting Artificial Lift Performance. J. Nat. Gas Sci. Eng. 2022, 99, 104408.
Dastyar, Z.; Rabieh, M.M.; Hajidavalloo, E. Proposing a Method for Performance Evaluation of a Designed Two-Phase Vertical Separator and a Piston Pump Using Computational Fluid Dynamics. SPE J. 2023, 28, 2642–2659.
Al Munif, E.H.; Alhamad, L.; Ejim, C.E.; Banjar, H.M. Review of Downhole Gas Liquid Separators In Unconventional Reservoirs. In Proceedings of the SPE Annual Technical Conference and Exhibition 2023, San Antonio, TX, USA, 16–18 October 2023; pp. 16–18.
Osorio Ojeda, L.C.; Olubode, M.; Karami, H.; Podio, T. Application of Machine Learning to Evaluate the Performances of Various Downhole Centrifugal Separator Types in Oil and Gas Production Systems. In Proceedings of the SPE Oklahoma City Oil and Gas Symposium, Oklahoma City, OK, USA, 17–19 April 2023.
Ojeda, L.C.O. Application of Machine Learning and Dimensional Analysis to Evaluate the Performances of Various Downhole Centrifugal Separator Types. In Proceedings of the SPE Annual Technical Conference and Exhibition, San Antonio, TX, USA, 16–18 October 2023.
Ojeda, L.C.O. A Simulation and Analytical Study on the Performance of Gas-liquid Centrifugal Downhole Separator; University of Oklahoma: Norman, OK, USA, 2023; Available online: https://hdl.handle.net/11244/340045 (accessed on 15 May 2024).
Olubode, M. Experimental Analysis of Centrifugal Downhole Separators in Boosting Artificial Lift Performance; University of Oklahoma: Norman, OK, USA, 2021; Available online: https://hdl.handle.net/11244/332419 (accessed on 15 May 2024).
Sharma, A.; Burak, T.; Nygaard, R.; Hellvik, S.; Hoel, E.; Welmer, M. Projection of Logging While Drilling Data at the Bit by Implementing Supervised Machine Learning Algorithm. In Proceedings of the SPE Oklahoma City Oil and Gas Symposium, Oklahoma City, OK, USA, 17–19 April 2023.
Sharma, A.; Gupta, I.; Phi, T.; Ashesh, S.; Kumar, R.; Borgogno, F.G. Utilizing Machine Learning to Improve Reserves Estimation and Production Forecasting Accuracy. In Proceedings of the SPE/AAPG/SEG Latin America Unconventional Resources Technology Conference 2023, Buenos Aires, Argentina, 4–6 December 2023.
Sharma, A.; Burak, T.; Nygaard, R.; Hoel, E.; Kristiansen, T.; Hellvik, S.; Welmer, M. Projecting Petrophysical Logs at the Bit through Multi-Well Data Analysis with Machine Learning. In Proceedings of the SPE Offshore Europe Conference & Exhibition 2023, Aberdeen, Scotland, UK, 5–8 September 2023.

Figure 1. Multiphase flow setup constructed to measure the efficiency of the downhole gas separator installed in the vertical section.

Figure 2. Downhole gas separator schematic with flow directions.

Figure 3. Workflow for supervised learning model selection for prediction of downhole separator’s efficiency.

Figure 4. Selection of the efficient regression model for prediction of gas volume flow per minute at outlet (GVFO) (response variable) using gas volume flow per minute at inlet (GVFI) and liquid volume flow per minute at inlet (LVFI) as independent parameters.

Figure 5. Scatterplots of gas volume flow per minute at outlet (GVFO, in ft³) against the independent variables IGLR, CVGIL, and CCV (top, left to right) and AILF and AIGF (bottom, left to right). The figure shows how GVFO varies with each independent variable.

Figure 6. Correlation plot showing the correlation coefficients between the dependent variable (GVFO) and the independent variables (GVFI and LVFI), and among the independent variables themselves, which indicates multicollinearity. Dark blue (correlation coefficient = 1) means perfect correlation, dark red (correlation coefficient = −1) means perfect inverse correlation, and 0 means no correlation. GVFO is highly correlated with CVGIL, GVFI, and GVFTRL; moderately correlated with CCV, IGLR, and OGLR; and weakly correlated with AILF, AIGF, LVFI, and LVFTRL.

Figure 7. Scatterplot of measured vs. predicted GVFO using testing dataset for KNN regression model.

Figure 8. Scatterplot of measured vs. predicted GVFO using testing dataset for lasso regression model.

Figure 9. Scatterplot of measured vs. predicted GVFO using testing dataset for random forest regression model.

Figure 10. Scatterplot of measured vs. predicted GVFO using testing dataset for ridge regression model.

Figure 11. Scatterplot of measured vs. predicted GVFO using testing dataset for SVM regression model.

Figure 12. Scatterplot of measured vs. predicted GVFO using testing dataset for XGBoost regression model.

Figure 13. Selection of the best regression model for predicting GVFO, based on the lowest RMSE and the highest R-squared. The RFR model combines an R-squared of 96% with the lowest RMSE, 112, and is therefore selected for prediction of GVFO.

Table 1. VIF for each variable indicating multicollinearity between independent variables. A VIF value ranging from 1 to 5 suggests a moderate level of correlation between the two independent variables considered in the analysis.

Parameter	VIF
GVFI	1.01
LVFI	1.12
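As a reminder of how such values arise, the VIF of a predictor is 1/(1 − R²), where R² is obtained by regressing that predictor on the remaining ones. The sketch below covers only the two-predictor case, in which R² reduces to the squared Pearson correlation between the two predictors; with more predictors, each VIF requires its own regression. The function names and the sample values are hypothetical and are not taken from the experimental data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Made-up predictor samples, for illustration only.
gvfi_sample = [1.0, 2.0, 3.0, 4.0]
lvfi_sample = [1.0, 3.0, 2.0, 4.0]

print(f"r = {pearson_r(gvfi_sample, lvfi_sample):.2f}")
print(f"VIF = {vif_two_predictors(gvfi_sample, lvfi_sample):.2f}")
```

A VIF close to 1 corresponds to nearly uncorrelated predictors, and the value grows without bound as the correlation approaches ±1.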

Table 2. Regression methods used to predict GVFO, with R-squared, RMSE, and MAPE on the test data. The random forest model has the lowest RMSE (112.1), the KNN model has the highest R-squared (96.6%), and the lasso model has the lowest MAPE (7.3).

Regression Model	R-Squared (%)	RMSE	MAPE (%)
KNN	96.6	130.1	7.7
Lasso	92.1	199.2	7.3
Multilinear	90.2	201.3	7.5
Random Forest	95.9	112.1	7.5
Ridge	94.1	174.6	10.3
Support Vector Machine	92.4	140.4	9.8
XGBoost	93.9	175.2	12.5

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
